Print Page - Criteria of Identification of hydrogen bonds

Questions and answers => RNA structures (DSSR) => Topic started by: xiaoj12 on June 09, 2016, 12:13:56 pm

Title: Criteria of Identification of hydrogen bonds
Post by: xiaoj12 on June 09, 2016, 12:13:56 pm

Hi Xiang-Jun,

Could you elaborate on the heuristic criteria DSSR uses to identify hydrogen bonds? I understand the approach is geometry-based, but I couldn't find any details beyond the distance cutoff in the DSSR paper in NAR.
Thanks!

JJ

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiangjun on June 09, 2016, 01:06:42 pm

Hi JJ,

There are far more technical details on H-bond identification than the short paragraph in the DSSR paper can handle. The basic idea is the mutual best match between any two N/O atom pairs under a distance cutoff. The algorithm was first implemented in early versions of 3DNA (~2000), and has been continuously refined over the years. Generally speaking, the H-bond finding functionality in 3DNA/DSSR does its job, e.g., detecting 3 H-bonds for a G-C pair, and 2 for A-T as expected.
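
To make the "mutual best match" idea concrete, here is a minimal Python sketch. It is not the 3DNA/DSSR implementation, whose details are not given in this thread; the function name mutual_best_pairs, the input layout, and the 4.0 Å default cutoff are all assumptions made for illustration. The sketch uses distance only: an N/O atom pair is kept when each atom is the other's nearest candidate within the cutoff.

```python
import math

def mutual_best_pairs(atoms_a, atoms_b, cutoff=4.0):
    """Return (name_a, name_b, distance) for mutually nearest atom pairs.

    atoms_a, atoms_b: lists of (name, (x, y, z)) for the N/O atoms of two residues.
    cutoff: maximum distance in angstroms (placeholder value, an assumption).
    """
    def nearest(source, targets):
        # Map each source index to the index of its closest target within cutoff.
        best = {}
        for i, (_, p) in enumerate(source):
            scored = [(math.dist(p, q), j) for j, (_, q) in enumerate(targets)]
            in_range = [(d, j) for d, j in scored if d <= cutoff]
            if in_range:
                best[i] = min(in_range)[1]
        return best

    a_to_b = nearest(atoms_a, atoms_b)
    b_to_a = nearest(atoms_b, atoms_a)

    pairs = []
    for i, j in sorted(a_to_b.items()):
        # Keep the pair only if the preference is reciprocal.
        if b_to_a.get(j) == i:
            d = math.dist(atoms_a[i][1], atoms_b[j][1])
            pairs.append((atoms_a[i][0], atoms_b[j][0], d))
    return pairs

# Made-up coordinates (not from any PDB entry), purely to exercise the function.
res1 = [("N1", (0.0, 0.0, 0.0)), ("N2", (0.0, 3.0, 0.0))]
res2 = [("O6", (2.9, 0.0, 0.0)), ("N7", (2.9, 3.0, 0.0)), ("O2", (3.5, 0.5, 0.0))]
for name_a, name_b, d in mutual_best_pairs(res1, res2):
    print(f"{name_a} - {name_b}: {d:.3f}")
```

In this toy input, O2 lies within the cutoff of N1, but N1 is closer to O6, so the N1/O2 pair is not reciprocal and is dropped. That is how a mutual-best rule can suppress some redundant contacts without any angle term.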

The --get-hbond option in DSSR may serve your purpose. Or you may check HBPLUS (http://www.ebi.ac.uk/thornton-srv/software/HBPLUS/) or HBexplore (http://bioinformatics.oxfordjournals.org/content/12/4/281.full.pdf), both of which are dedicated to finding H-bonds.

Xiang-Jun

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiaoj12 on June 09, 2016, 01:27:00 pm

Xiang-Jun,

Thanks for your fast reply. What does "mutual best match" mean exactly? Does it take the angle into consideration? How does the algorithm avoid spurious hydrogen bonds?

JJ

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiangjun on June 09, 2016, 02:01:28 pm

Good questions. However, they are unlikely to be answerable with the simple text descriptions we are exchanging here. By and large, I developed the H-bond finding algorithms in 3DNA/DSSR mainly for a specific purpose, i.e., to make DSSR self-contained. This functionality was initially not intended to be exposed to the outside world. I added the DSSR --get-hbond option upon a user's request, hoping that the H-bond output might be helpful to other people as well.

Now, it is a nice surprise for me to know that someone like you seems to be interested in knowing the details of the 3DNA/DSSR H-bond identification algorithm. I am planning to release the source code of 3DNA v2.3 in the near future, where you will be able to get to the bottom of it.

Presumably, the H-bond output from DSSR fits your needs. May I know a bit more about your use case? Why do you bother with 3DNA/DSSR for H-bonding instead of the well-established HBPLUS or HBexplore?

Xiang-Jun

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiaoj12 on June 09, 2016, 03:35:56 pm

Xiang-Jun,

It's good to hear that you plan to release the source code. To answer your question: 1) from the description in your paper, it appears to me that 3DNA/DSSR has been optimized for identifying H-bonds, which is what led me here with the questions above. 2) I'm working on a project on the folding/unfolding of nucleic acids, where H-bond statistics can help quantify potentially interesting structures (e.g., G-quadruplexes). In such cases, HBPLUS or HBexplore might not be as well tuned as your program, though I haven't tested them. What do you think, based on your experience? Do you think your algorithm has particular merits for nucleic acids compared with others that simply consider the donor-acceptor distance and angles?

As you can probably tell, I would feel clearer and more comfortable picking DSSR over the others if I understood what specific optimizations your algorithm makes.

JJ

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiangjun on June 09, 2016, 03:54:42 pm

It is always good to have some background information. As mentioned in my previous replies, I developed the H-bonding identification algorithm for 3DNA/DSSR solely to make my programs self-contained (e.g., without unnecessary third-party dependency). I honestly do not know if my algorithm is 'optimized' for finding H-bonds in nucleic acid structures. Furthermore, I have never made any direct comparisons against HBPLUS or HBexplore. All I know is that it works for my purpose, and it may serve as a convenient alternative for others. You're the first who has asked for the technical details.

An updated 3DNA v2.3 release with source code will be available toward the end of this month or early July. Stay tuned.

Xiang-Jun

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiaoj12 on June 09, 2016, 05:04:45 pm

Xiang-Jun,

I appreciate your honest reply. Now I'd like to bother you with a test case I just ran. The H-bond result from DSSR is:

Code:

 # H-bonds in '4DII_aptamer_autopsf.pdb' identified by DSSR, Xiang-Jun Lu (xiangjun@x3dna.org)
18
   16   189  #1     p    3.070 N:N N2@D.GUA1 N7@D.GUA6
   21   187  #2     p    2.941 N:O N1@D.GUA1 O6@D.GUA6
   24   481  #3     p    2.781 O:N O6@D.GUA1 N1@D.GUA15
   26   476  #4     p    2.816 N:N N7@D.GUA1 N2@D.GUA15
   49   449  #5     p    2.789 N:N N2@D.GUA2 N7@D.GUA14
   54   447  #6     p    2.970 N:O N1@D.GUA2 O6@D.GUA14
   57   151  #7     p    2.640 O:N O6@D.GUA2 N1@D.GUA5
   59   146  #8     p    2.737 N:N N7@D.GUA2 N2@D.GUA5
  154   347  #9     p    3.001 O:N O6@D.GUA5 N1@D.GUA11
  156   342  #10    p    2.987 N:N N7@D.GUA5 N2@D.GUA11
  179   319  #11    p    2.907 N:N N2@D.GUA6 N7@D.GUA10
  184   317  #12    p    2.849 N:O N1@D.GUA6 O6@D.GUA10
  244   291  #13    p    3.965 N:O N2@D.GUA8 O3'@D.THY9
  244   302  #14    p    3.073 N:O N2@D.GUA8 OP2@D.GUA10
  309   486  #15    p    3.002 N:N N2@D.GUA10 N7@D.GUA15
  314   484  #16    p    2.894 N:O N1@D.GUA10 O6@D.GUA15
  350   444  #17    p    2.574 O:N O6@D.GUA11 N1@D.GUA14
  352   439  #18    p    2.762 N:N N7@D.GUA11 N2@D.GUA14

However, to my understanding, this ssDNA should have 2 G-tetrads with 16 H-bonds, but DSSR returned 18. I suspect this is because it counted the H-bonds involving the phosphate group. However, another test PDB shows more complicated results. In particular, the same acceptor atoms are involved in p-type and x-type H-bonds simultaneously. I was wondering if you could take a look at the attached PDBs to check whether the results from DSSR are reasonable.
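
The 16-versus-18 count can be checked by separating base-base H-bonds from those that touch the backbone. The Python sketch below is a hedged illustration: the eight-column layout is inferred only from the listing in this post, and the names parse_hbonds, is_base_atom, and BACKBONE_ATOMS are hypothetical helpers, not part of DSSR. Sugar atoms are recognized by a trailing prime in the atom name and phosphate atoms by a small set of common names.

```python
from typing import NamedTuple

# Phosphate atom names (an assumed, non-exhaustive set).
BACKBONE_ATOMS = {"P", "OP1", "OP2", "OP3", "O1P", "O2P", "O3P"}

class HBond(NamedTuple):
    index: int       # serial number, e.g. 13 for "#13"
    kind: str        # one-letter type symbol: p, o, x or ?
    distance: float  # donor-acceptor distance in angstroms
    atom1: str       # e.g. "N2@D.GUA8"
    atom2: str       # e.g. "OP2@D.GUA10"

def parse_hbonds(text):
    """Parse H-bond records; header and count lines are skipped."""
    bonds = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or not fields[2].startswith("#"):
            continue
        bonds.append(HBond(int(fields[2][1:]), fields[3], float(fields[4]),
                           fields[6], fields[7]))
    return bonds

def is_base_atom(atom_id):
    """True unless the atom is a sugar (primed) or phosphate atom."""
    name = atom_id.split("@", 1)[0]
    return not name.endswith("'") and name not in BACKBONE_ATOMS

# Three records copied from the listing above.
sample = """
  184   317  #12    p    2.849 N:O N1@D.GUA6 O6@D.GUA10
  244   291  #13    p    3.965 N:O N2@D.GUA8 O3'@D.THY9
  244   302  #14    p    3.073 N:O N2@D.GUA8 OP2@D.GUA10
"""
bonds = parse_hbonds(sample)
base_only = [b for b in bonds if is_base_atom(b.atom1) and is_base_atom(b.atom2)]
print(len(bonds), "parsed;", len(base_only), "base-base")
```

Applied to the full 18-line listing, the same filter would set aside #13 and #14, the only two records with a sugar or phosphate oxygen, leaving the 16 H-bonds between guanine bases.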

Thanks again.

JJ

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiangjun on June 09, 2016, 11:32:00 pm

Hi JJ,

Thanks for providing concrete examples to illustrate clearly what your concerns are with regard to the H-bonds detected by DSSR. As you already noticed, for 4DII_aptamer_autopsf.pdb, the two extra H-bonds are associated with the phosphate group. So here DSSR is doing its job.

Code:

  244   291  #13    p    3.965 N:O N2@D.GUA8 O3'@D.THY9
  244   302  #14    p    3.073 N:O N2@D.GUA8 OP2@D.GUA10

Your supplied Frames1.pdb "shows more complicated results. Particularly, the same acceptor atoms involve in p-type hbond and x-type simultaneously." However, as illustrated in the attached image, here again DSSR is working as designed.

The --get-hbond option is documented in the DSSR User Manual, specifically:

Quote

A one-letter symbol showing the atom-pair type (p) of the H-bond. It is ‘p’ for a donor-acceptor atom pair; ‘o’ for a donor/acceptor (such as the 2′-hydroxyl oxygen) with any other atom; ‘x’ for a donor-donor or acceptor-acceptor pair (as in #17, line 19 in the listing); ‘?’ if the donor/acceptor status of any H-bond atom is unknown.

So the 'x' type is for the unusual donor-donor or acceptor-acceptor atom pair. Normally, such H-bonds should not be possible or allowed. However, as shown below for one of the G-tetrads in Frames1.pdb, the 'spurious' H-bonds are otherwise perfectly reasonable from a geometric point of view. Thus DSSR also reports such special cases, but with an 'x' symbol to draw the user's attention. I take this as a feature instead of a bug in the H-bonding identification algorithm of 3DNA/DSSR. For example, such an H-bond is presumed to exist in the C+C i-motif (See Figure S2E (http://forum.x3dna.org/dssr-nar-paper/supplementary-figure-2-three-similar-base-pairs-in-trna-and-its-mimic/) of the DSSR paper).
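
The four type symbols quoted from the manual can be pictured as a small lookup. The Python sketch below is only an illustration of that description, not DSSR's code: the ROLES table (guanine base atoms plus the ribose O2'), the function name hbond_type, and the order in which the rules are tested are all assumptions.

```python
# Hypothetical donor/acceptor roles for a few atoms; a real table would be
# residue-specific (e.g. N1 is a donor in guanine but an acceptor in adenine).
ROLES = {
    "N1": "donor",      # guanine imino N-H
    "N2": "donor",      # guanine amino group
    "O6": "acceptor",   # guanine carbonyl oxygen
    "N7": "acceptor",
    "N3": "acceptor",
    "O2'": "both",      # 2'-hydroxyl can donate or accept
}

def hbond_type(atom1, atom2, roles=ROLES):
    """Return 'p', 'o', 'x' or '?' for a pair of atom names."""
    r1, r2 = roles.get(atom1), roles.get(atom2)
    if r1 is None or r2 is None:
        return "?"   # status of at least one atom is unknown
    if "both" in (r1, r2):
        return "o"   # a donor/acceptor atom paired with any other atom
    if r1 == r2:
        return "x"   # donor-donor or acceptor-acceptor
    return "p"       # ordinary donor-acceptor pair

for pair in [("N2", "N7"), ("O6", "O6"), ("O2'", "N7"), ("N2", "XX")]:
    print(pair, hbond_type(*pair))
```

Testing for an unknown atom first means a pair with one unassigned atom is reported as '?' even if the other atom is a hydroxyl; whether DSSR applies the same precedence is not stated in this thread.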

Could you run other H-bonding detection programs (including HBPLUS and HBexplore) on these two example PDB structures, and report back how they behave? After all, the 3DNA/DSSR H-bond finding algorithm may not be as sophisticated as these dedicated tools.

HTH,

Xiang-Jun

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiaoj12 on June 10, 2016, 02:12:22 pm

Hi Xiang-Jun,

Thank you very much for your reply. I don't have access to HBPLUS or HBexplore yet, though I've contacted the HBPLUS development team for the installation file. If you know an easy way to get access to them, just let me know. I'd love to do further tests and comparisons.

As you pointed out, could you explain types 'o' and '?' a little further? What are the other atoms in "‘o’ for a donor/acceptor (such as the 2′-hydroxyl oxygen) with any other atom", and why could there be an H-bond even though "the donor/acceptor status of any H-bond atom is unknown"? Examples would help.

Thanks,

JJ

Title: Re: Criteria of Identification of hydrogen bonds
Post by: xiangjun on June 10, 2016, 11:03:13 pm

Quote

If you know an easy way to access to them, just let me know.

Contacting the authors of HBPLUS or HBexplore is the right approach. Both HBPLUS and HBexplore are dedicated to H-bonding detection, and should be more sophisticated or 'standard' than DSSR's simple strategy. You may well end up using one of them instead of DSSR for your purpose.

Quote

could you explain type 'o' and '?' a little further?

For 'o' type H-bond, the most common case is related to the 2'-OH in RNA ribose sugar. In the DSSR User Manual, the sample output for 1msy contains quite a few H-bonds of the 'o' type, all involving O2' atoms.

The '?' type is reserved for cases where the donor/acceptor status of an N/O atom with potential for H-bond cannot be determined. Note that DSSR's H-bond identification algorithm applies not just to standard bases (A, C, G, T, U), but also the 'unexpected' in a rudimentary way. As to:

Quote

why there could be a hbond even though the donor/acceptor status of any H-bond atom is unknown?

For example, some ligands may participate in H-bonding interactions with DNA/RNA. However, DSSR does not (and cannot) have an exhaustive assignment of donor/acceptor status for N/O atoms with H-bonding potential. If an H-bond involving such an 'unspecified' N/O atom is detected by DSSR's geometric approach, it is assigned the '?' type (symbol).

Unfortunately, I do not have a concrete example at hand right now. However, this '?' type H-bond may not be relevant to your application at all. If you are still interested in using DSSR and come across such a '?' case at any time, I'd be more than happy to give you an explanation.

HTH,

Xiang-Jun

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University

3DNA Forum
