
Author Topic: A-DNA definition  (Read 105156 times)

Offline arnab

  • with-posts
  • *
  • Posts: 5
    • View Profile
A-DNA definition
« on: March 31, 2012, 03:12:48 pm »
Dear Lu and Members,

 What is the exact definition (set of parameters and ranges) used by 3DNA to classify a base-pair step as A-form, B-form or TA-form? If this has already been discussed, I would be happy to be pointed to the relevant thread.

Thank you.

Arnab Mukherjee
IISER, Pune

Offline xiangjun

  • Administrator
  • with-posts
  • *****
  • Posts: 1708
    • View Profile
    • 3DNA homepage
Re: A-DNA definition
« Reply #1 on: April 01, 2012, 10:31:26 am »
The scheme of classifying a dinucleotide step into A-, B- or TA-DNA form is described in the 2003 NAR paper. More specifically, it is based on Zp and Zp(h); see Figure 5(c) linked below. For example, if Zp > 1.5 Å, then it is taken as A-DNA.



Per your request, listed below is the exact definition for A-, B- and TA-DNA, as excerpted from 3DNA source code. Note the "sanity check" at the beginning; the empirical criteria try to ensure a right-handed duplex consisting of Watson-Crick bps and with reasonable geometry. Also bear in mind that the classification is intended to be indicative rather than conclusive.

Code:
if (dval_in_range(mtwist, 10.0, 60.0)  /* over-all twist average */
    && WC_info[i] && WC_info[i + 1]  /* WC geometry */
    && dval_in_range(twist_rise[i][1], 10.0, 60.0)  /* right-handed */
    && dval_in_range(twist_rise[i][2], 2.5, 5.5)  /* Rise in range */
    && dval_in_range(aveS[i][1], -5.0, -0.5)  /* Xp */
    && dval_in_range(aveS[i][2], 7.5, 10.0)  /* Yp */
    && dval_in_range(aveS[i][3], -2.0, 3.5)  /* Zp */
    && dval_in_range(aveH[i][1], -11.5, 2.5)  /* XpH */
    && dval_in_range(aveH[i][2], 1.5, 10.0)  /* YpH */
    && dval_in_range(aveH[i][3], -3.0, 9.0)) {  /* ZpH */
    if (aveS[i][3] >= 1.5)  /* A-form */
        strABT[i] = 1;
    else if (aveH[i][3] >= 4.0)  /* TA-form */
        strABT[i] = 3;
    else if (aveS[i][3] <= 0.5 && aveH[i][1] < 0.5)  /* B-form */
        strABT[i] = 2;  /* aveS[i][3] < 0.5 for C-DNA #47 */
}
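
For readers who want to experiment with these thresholds outside 3DNA, the excerpt can be wrapped in a small self-contained function. This is only a sketch under stated assumptions: the `StepParams` struct, `in_range` and `classify_step_raw` are hypothetical names that are not part of 3DNA, and `in_range` assumes that `dval_in_range` (not shown in the excerpt) is an inclusive range test. The caller supplies the twist, rise and Watson-Crick flags.

```c
#include <stdbool.h>

/* Form codes matching the strABT values in the excerpt. */
enum { FORM_NONE = 0, FORM_A = 1, FORM_B = 2, FORM_TA = 3 };

/* Hypothetical container; 3DNA itself keeps these values in separate arrays. */
typedef struct {
    double mtwist;        /* overall average twist of the duplex */
    double twist, rise;   /* twist_rise[i][1], twist_rise[i][2] */
    bool wc1, wc2;        /* WC_info[i], WC_info[i + 1] */
    double xp, yp, zp;    /* aveS[i][1..3] */
    double xph, yph, zph; /* aveH[i][1..3] */
} StepParams;

/* Assumption: the range test is inclusive at both ends. */
static bool in_range(double v, double lo, double hi)
{
    return v >= lo && v <= hi;
}

/* Per-step classification only; no information from neighbouring steps. */
static int classify_step_raw(const StepParams *s)
{
    bool sane = in_range(s->mtwist, 10.0, 60.0)
        && s->wc1 && s->wc2
        && in_range(s->twist, 10.0, 60.0)
        && in_range(s->rise, 2.5, 5.5)
        && in_range(s->xp, -5.0, -0.5)
        && in_range(s->yp, 7.5, 10.0)
        && in_range(s->zp, -2.0, 3.5)
        && in_range(s->xph, -11.5, 2.5)
        && in_range(s->yph, 1.5, 10.0)
        && in_range(s->zph, -3.0, 9.0);

    if (!sane)
        return FORM_NONE;
    if (s->zp >= 1.5)
        return FORM_A;
    if (s->zph >= 4.0)
        return FORM_TA;
    if (s->zp <= 0.5 && s->xph < 0.5)
        return FORM_B;
    return FORM_NONE;
}
```

The order of the tests matters: a step with Zp >= 1.5 is labelled A before Zp(h) is examined, and a step that passes the sanity check but matches none of the three branches stays unclassified.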

HTH,

Xiang-Jun
« Last Edit: April 01, 2012, 11:15:36 am by xiangjun »

Offline arnab

Re: A-DNA definition
« Reply #2 on: April 01, 2012, 03:02:47 pm »
Dear Xiang-Jun,

  As usual, you are extremely prompt and precise in your answer. Your answer and the code snippet clearly explained the definition. However, my doubt arose from my own observation which is explained in detail below.

  During a perturbation of B-DNA, I observed a higher Zp of 1.93 (str221.pdb attached) for step 6 CG/CG (3DNA output attached as str221.out). However, this step was not assigned as A-form.

 In comparison, another structure observed 5 ps later (str226.pdb and str226.out) was assigned A-form for the same step, which has Zp = 2.09.

 Therefore, I assume there must be more to the story. I will appreciate your help in solving this puzzle.

 Thanks again.

Best regards
Arnab

Offline arnab

Re: A-DNA definition
« Reply #3 on: April 01, 2012, 03:24:08 pm »
To add to my previous post, the sanity check clears step 6 CG/CG of str221.pdb, which has Zp 1.93 but no assigned type. However, I don't know what the check corresponding to "WC_info[i] && WC_info[i + 1]  /* WC geometry */" involves. Therefore, this may be part of the reason the step does not get the expected form.

Offline xiangjun

Re: A-DNA definition
« Reply #4 on: April 01, 2012, 05:24:57 pm »
Hi Arnab,

Thanks for your follow-up. I am impressed by your attention to the little "details" -- oftentimes, the small parts count a lot.

Quote from: Arnab
Therefore, I assume there must be more to the story.
You are absolutely right -- see below for details.

Quote
To add to my previous post, the sanity check clears step 6 CG/CG of str221.pdb, which has Zp 1.93 but no assigned type. However, I don't know what the check corresponding to "WC_info[i] && WC_info[i + 1]  /* WC geometry */" involves. Therefore, this may be part of the reason the step does not get the expected form.
From the two PDB files you attached, it is easy to verify that all bps are of standard Watson-Crick type. So there is no issue with sanity check on WC_info(i) and WC_info(i + 1).

The real underlying reason for your observed discrepancy between str221 and str226 is as follows: to be on the safe side, 3DNA performs an additional check before assigning a dinucleotide step into A-, B- or TA-DNA form: there must be at least two consecutive dinucleotide steps of the same type to avoid any single isolated (mostly spurious) "transition" step.

Take your str221.pdb as an example,
    step       Xp      Yp      Zp     XpH     YpH     ZpH    Form
   1 GC/GC   -4.05    9.12   -1.51   -1.02    8.24   -4.15
   2 CG/CG   -2.24    8.83    0.29   -1.71    8.83    0.32     B
   3 GC/GC   -4.06    9.10   -0.65   -6.93    8.67    2.77     B
   4 CA/TG   -3.22    9.18    0.30   -7.45    8.28    4.02
   5 AC/GT   -3.17    9.21    1.03   -5.69    9.14    1.49
   6 CG/CG   -3.38    8.33    1.93   -7.49    8.45    1.33
   7 GT/AC   -3.58    8.97    0.80   -7.40    8.81    1.83
   8 TG/CA   -4.26    8.76   -0.67   -9.08    6.51    5.89
   9 GC/GC   -2.30    8.70   -0.08   -0.73    8.70    0.21     B
  10 CG/CG   -3.74    9.22   -0.75   -7.66    8.74    2.97     B
  11 GC/GC   -2.90    9.03    0.30   -5.61    8.55    2.92     B

According to the criteria detailed in my previous reply, step 6 CG/CG on its own is indeed classified as A-DNA, since its Zp (1.93) > 1.5 Å. However, each of its neighbors -- 5 AC/GT and 7 GT/AC -- has a Zp < 1.5 Å, so neither is in A-form. Thus, 6 CG/CG is downgraded to unclassified. Note that without this additional check, step 4 CA/TG would have been taken as TA-DNA [Zp(h) = 4.02 > 4.0 Å].

With the above note, one can see easily why step 6 CG/CG in str226 is classified as A-DNA -- it's simply because its neighbor 5 AC/GT is also A-DNA.
    step       Xp      Yp      Zp     XpH     YpH     ZpH    Form
   1 GC/GC   -4.37    8.77   -1.05   -4.95    8.83   -0.22     B
   2 CG/CG   -2.56    8.63    0.13   -2.38    8.49    1.55     B
   3 GC/GC   -3.74    9.12   -0.62   -4.46    9.06   -1.34     B
   4 CA/TG   -2.64    9.32    0.91   -6.75    7.91    5.00
   5 AC/GT   -3.16    9.32    1.56   -7.39    9.41    0.91     A
   6 CG/CG   -3.07    8.71    2.09   -7.45    8.73    1.93     A
   7 GT/AC   -3.49    9.02    0.43   -6.64    8.97    1.09     B
   8 TG/CA   -3.83    9.27   -0.68   -7.64    9.09    1.94     B
   9 GC/GC   -2.69    8.96    0.34   -2.36    8.95    0.55     B
  10 CG/CG   -4.15    8.96   -0.57   -7.70    8.62    2.56     B
  11 GC/GC   -3.43    8.71    1.58   -6.69    8.70    1.66
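
The additional check described above can be sketched as a second pass over the per-step types. This is a minimal illustration, not the 3DNA implementation, which is not shown in this thread. It assumes that a step keeps its type only when at least one adjacent step has the same per-step type; the function name `drop_isolated_steps` and the form codes are hypothetical.

```c
#include <stddef.h>

/* Form codes: 0 = unclassified, 1 = A, 2 = B, 3 = TA. */
enum { FORM_NONE = 0, FORM_A = 1, FORM_B = 2, FORM_TA = 3 };

/*
 * raw[] holds the per-step types before the check; out[] receives the
 * final types. A step keeps its type only if the previous or the next
 * step has the same raw type (assumed reading of "at least two
 * consecutive dinucleotide steps of the same type").
 */
static void drop_isolated_steps(const int *raw, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int same_prev = (i > 0) && (raw[i - 1] == raw[i]);
        int same_next = (i + 1 < n) && (raw[i + 1] == raw[i]);

        if (raw[i] != FORM_NONE && (same_prev || same_next))
            out[i] = raw[i];
        else
            out[i] = FORM_NONE;
    }
}
```

Assuming every step passes the twist, rise and Watson-Crick checks (those values are not listed in the tables), the per-step types for str221 would be none, B, B, TA, none, A, none, TA, B, B, B. Under this rule, steps 4, 6 and 8 are then reset to unclassified, which agrees with the Form column above. For str226 the same rule keeps steps 5 and 6 as A and drops the isolated step 4 (TA) and step 11 (A).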

I may refine the criteria used for dinucleotide classification in a future release of 3DNA, and I welcome your feedback. For your analysis of MD simulation trajectories, I'd suggest that you directly check the raw data (Xp, Yp, Zp, XpH, YpH, ZpH etc.).
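
To work with the raw values rather than the Form label, one option is to parse each row of such a table into numbers. The sketch below assumes rows laid out like the ones shown in this thread (step index, step name, then six numeric columns); the `StepRow` struct and `parse_step_row` function are hypothetical helpers, not part of 3DNA, and the layout of a real output file may differ.

```c
#include <stdbool.h>
#include <stdio.h>

/* One parsed row: index, step name, and the six raw parameters. */
typedef struct {
    int index;
    char step[16];
    double xp, yp, zp, xph, yph, zph;
} StepRow;

/*
 * Returns true only if all eight fields were read. Header lines, blank
 * lines and rows with non-numeric placeholders are rejected. A trailing
 * Form label, if present, is ignored.
 */
static bool parse_step_row(const char *line, StepRow *row)
{
    int n = sscanf(line, "%d %15s %lf %lf %lf %lf %lf %lf",
                   &row->index, row->step,
                   &row->xp, &row->yp, &row->zp,
                   &row->xph, &row->yph, &row->zph);
    return n == 8;
}
```

With the values in hand, Zp and ZpH can be followed frame by frame along a trajectory instead of relying on the A/B/TA label alone.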

HTH,

Xiang-Jun
 
« Last Edit: April 01, 2012, 09:22:54 pm by xiangjun »

Offline arnab

Re: A-DNA definition
« Reply #5 on: April 02, 2012, 12:38:07 am »
Dear Xiang-Jun,

  Thank you again for clearing up the doubt I had been carrying for some time. I wish I had contacted you earlier.

  I have a request regarding future development of 3DNA. To speed up the analysis of DNA structures along an MD trajectory, it would be very useful if 3DNA could be modified slightly to read a trajectory. CURVES (now CURVES+) can read a trajectory. Right now, I convert my GROMACS trajectory to PDB files, run 3DNA on them, and then parse the output with a script, which takes time if the trajectory is tens to hundreds of nanoseconds long.

 If you don't have time and need an additional pair of hands, I can create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information.

 Best regards
Arnab

Offline xiangjun

Re: A-DNA definition
« Reply #6 on: April 02, 2012, 11:43:41 am »
Hi Arnab,

I'm glad our conversation in this thread helped clear up your doubt. It has been at least several years since I looked at the details of the classification of dinucleotide steps in 3DNA. Your questions refreshed my memory on this topic.

From the output files you attached, I know you are using 3DNA v2.0. Did you know that as of v2.1, 3DNA provides a Ruby script x3dna_ensemble for the analysis of MD simulation trajectories? The help info is as below:

Code:
x3dna_ensemble -h
------------------------------------------------------------------------
Utilities for the analysis and visualization of an ensemble
    Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
    where sub-command must be one of:
        analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
        block_image -- generate a base block schematic image
        extract -- extract structural parameters after running 'analyze'
        reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Note that the script starts with a MODEL/ENDMDL delineated ensemble or a collection of individual entries in the standard PDB format. For an example, see the directory $X3DNA/examples/ensemble/md, and run the following to see the possibilities:

Code:
x3dna_ensemble analyze -h
x3dna_ensemble extract -h

Quote from: Arnab
If you don't have time and need an additional pair of hands, I can create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information.
To make 3DNA better serve the community, I really need help from responsive and enthusiastic 3DNA users like you! The script x3dna_ensemble currently does not directly read a third-party specific trajectory file, so I have been planning to add a convert sub-command with options for GROMACS, Amber, CHARMM etc. Your contribution to "create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information" is certainly welcome.

To consolidate our efforts, could you please do the following:
  • Download and install 3DNA v2.1beta, and try out the examples mentioned above to see how the new facilities help your workflow.
  • Check to see how much can be provided from GROMACS. In a recent thread titled "How to properly use x3dna_ensemble?", I became aware of the fact that "Gromacs can actually divide an xtc-trajectory file into separate pdb files in MODEL/ENDMDL format."
  • After checking the above two points, we can focus on what is still missing or inconvenient.
Best regards,

Xiang-Jun
« Last Edit: April 02, 2012, 11:47:38 am by xiangjun »

Offline arnab

Re: A-DNA definition
« Reply #7 on: April 02, 2012, 02:00:40 pm »
Dear Xiang-Jun,

  I will get back to you after checking the steps you suggested. However, what I really meant is to supply the coordinates from the xtc file directly to 3DNA, bypassing the PDB step, so that the analysis is extremely fast. I have done the same with Curves (in house), since the source code is available.

  However, this can also be done by generating PDB trajectories, at the cost of disk space and a slight loss of speed.

Best regards
Arnab

Offline xiangjun

Re: A-DNA definition
« Reply #8 on: April 02, 2012, 03:12:55 pm »
Hi Arnab,

Quote
However, what I really meant is to supply the coordinates from the xtc file directly to 3DNA, bypassing the PDB step, so that the analysis is extremely fast.

I see your point. By adhering to the standard MODEL/ENDMDL delineated PDB format, however, x3dna_ensemble can directly handle an NMR ensemble. Via a purpose-specific format adaptor, the script should be applicable to the analysis of simulation trajectories from any third-party MD package. Intuitively, I feel this approach is simple, flexible, and practical. Of course, only 3DNA users, especially MD practitioners, can judge whether x3dna_ensemble is able to meet real-world challenges. Please share your experience.

Xiang-Jun
 
« Last Edit: April 02, 2012, 10:35:29 pm by xiangjun »

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University