Messages - xiangjun

1301
MD simulations / Re: Analysis of PDB file
« on: April 07, 2012, 09:21:16 am »
Hi,

The simple Perl script manalyze was introduced around v1.5 for the analysis of "multiple" structures. Over the years, I've not been aware of its usage: your question is the first one. As of v2.1beta, I am migrating from Perl to Ruby as the scripting language for 3DNA. Now manalyze and most other rarely used Perl scripts have been moved out of the $X3DNA/bin/ directory into $X3DNA/perl_scripts/ -- they are obsolete, but kept there for the record.

As of 3DNA v2.1, the Ruby script "x3dna_ensemble" should be used for the analysis of NMR ensembles or MD simulation trajectories. Type -h for detailed info, and run the examples to get familiar with its usage/functionality.

Your attached PDB file contains only one model, so you can use the find_pair/analyze combination to calculate 3DNA parameters. Note, however, your structure has poor geometry, as shown in the image below. As a rule, one should always perform a "sanity" check to ensure sensible results.

HTH,

Xiang-Jun

1302
No, I'm not saying or intending to imply that ssDNA and ssRNA are energetically equivalent. However, from a practical perspective, first it'd be better to have something than nothing. Second, a prediction is just a prediction; it is no guarantee that the "predicted" structure is "real" or even "meaningful": just think about how many predictions you can have from the same program with different settings, not to mention the different software tools available. Third, the recent article "RNA-Puzzles: A CASP-like evaluation of RNA three-dimensional structure prediction" [April 2012 [18 (4)] issue of the RNA journal] serves as a proof of my point. In theory (and in practice), software can easily "predict" many more protein/DNA/RNA structures than the PDB has accumulated over the past 40 years.

Aside from all the arguments above, 3DNA does not deal with "energetics" at all (in its current version, at least) -- it is purely geometrical.

Xiang-Jun

1303
MD simulations / Re: How to properly use x3dna_ensemble?
« on: April 05, 2012, 01:29:28 pm »
Quote
BUT, when I do "x3dna_ensamble analyze" for multiple snapshots, I do not see helical axis vectors in the output file.
The information you need, "Position (Px, Py, Pz) and local helical axis vector (Hx, Hy, Hz) for each dinucleotide step", is not parsed in the current version of x3dna_ensemble. I will get it added soon and keep you updated.

Xiang-Jun
 

1304
See the thread "changing secondary structure to tertiary structure of ssDNA".

Quote
For RNA we have some software such as Rosetta to predict 3D structure
Does it make sense to start from the same secondary structure you have, but simply change Ts to Us, and predict a 3D RNA structure using Rosetta etc.? Then you can delete O2' atoms, and mutate Us to Ts with mutate_bases, which will preserve backbone conformation and base-pairing geometry.

Xiang-Jun

1305
See my previous reply. In short, 3DNA is not directly up to your purpose yet, even though some of its components may be useful in certain parts of your workflow. The recent "RNA-Puzzles" paper in the RNA journal is the state-of-the-art on predicting DNA/RNA tertiary structures.

Good luck with your project.

Xiang-Jun

1306
Hi Vandana,

I am a bit surprised that you cannot run the basic 'pwd' (print working directory) command on your MinGW/MSYS -- Windows 7 system. Anyway, I'm glad to know that you've got 3DNA up and running; now you can start to play around with it. If you have any questions, do not hesitate to post back on the forum!

Xiang-Jun

1307
Hi Vandana,

Sorry to hear about your problem in setting up 3DNA on Windows 7 using MinGW/MSYS. To help identify where the problem is, please do the following:
  • First, change into the 3DNA bin/ directory
  • Then type: pwd, what is the output?
  • What's the output of ruby -v?
  • If the above step runs successfully, what's the output of ruby x3dna_setup?

Xiang-Jun

1308
General discussions (Q&As) / Re: A-DNA definition
« on: April 02, 2012, 03:12:55 pm »
Hi Arnab,

Quote
However, what I really meant is to supply the coordinate directly from xtc to 3DNA directly bypassing the pdb so that the analysis is extremely fast.

I see your point. By adhering to the standard MODEL/ENDMDL delineated PDB format, however, x3dna_ensemble can handle directly an NMR ensemble. Via a purpose-specific format adaptor, the script should be applicable to the analysis of simulation trajectories from any third-party MD package. Intuitively, I feel this approach is simple, flexible, and practical. Of course, only 3DNA users, especially MD practitioners, can judge if x3dna_ensemble is able to meet real-world challenges. Please share your experience.
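
As a rough illustration of the "format adaptor" idea, the Python sketch below wraps per-frame PDB coordinate records into one MODEL/ENDMDL delineated text and splits such a text back into models. The function names write_ensemble and split_ensemble are hypothetical and not part of 3DNA; reading the native trajectory (e.g., xtc) is assumed to be done by the MD package itself.

```python
def write_ensemble(frames):
    """Join frames (each a list of PDB ATOM/HETATM lines) into one
    MODEL/ENDMDL delineated string. Model numbers start at 1."""
    out = []
    for num, frame in enumerate(frames, start=1):
        out.append(f"MODEL     {num:4d}")  # serial number in columns 11-14
        out.extend(line.rstrip("\n") for line in frame)
        out.append("ENDMDL")
    out.append("END")
    return "\n".join(out) + "\n"


def split_ensemble(text):
    """Return {model_number: [coordinate lines]} from an ensemble string."""
    models, current, num = {}, None, None
    for line in text.splitlines():
        if line.startswith("MODEL"):
            num, current = int(line[5:].split()[0]), []
        elif line.startswith("ENDMDL"):
            models[num] = current
            current = None
        elif current is not None:
            current.append(line)
    return models
```

Keeping the adaptor separate from the analysis means the same ensemble file can come from an NMR entry or from any MD package.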

Xiang-Jun
 

1309
General discussions (Q&As) / Re: A-DNA definition
« on: April 02, 2012, 11:43:41 am »
Hi Arnab,

I'm glad our conversation in this thread helped clear your doubt. It has been at least several years since I looked at the details about the classification of dinucleotide steps in 3DNA. Your questions refreshed my memory on this topic.

From the output files you attached, I know you are using 3DNA v2.0. Did you know that as of v2.1, 3DNA provides a Ruby script x3dna_ensemble for the analysis of MD simulation trajectories? The help info is as below:

Code: [Select]
x3dna_ensemble -h
------------------------------------------------------------------------
Utilities for the analysis and visualization of an ensemble
    Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
    where sub-command must be one of:
        analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
        block_image -- generate a base block schematic image
        extract -- extract structural parameters after running 'analyze'
        reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Note that the script starts with a MODEL/ENDMDL delineated ensemble or a collection of individual entries in the standard PDB format. For an example, see the directory $X3DNA/examples/ensemble/md, and run the following to see the possibilities:

Code: [Select]
x3dna_ensemble analyze -h
x3dna_ensemble extract -h

Quote from: Arnab
If you don't have time and need additional pair of hands, I can create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information.
To make 3DNA better serve the community, I really need help from responsive and enthusiastic 3DNA users like you! The script x3dna_ensemble currently does not directly read a third-party specific trajectory file, so I have been planning to add a convert sub-command with options for GROMACS, Amber, CHARMM etc. Your contribution to "create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information" is certainly welcome.

To consolidate our efforts, could you please do the following:
  • Download and install 3DNA v2.1beta, and try out the examples mentioned above to see how the new facilities help your workflow.
  • Check to see how much can be provided from GROMACS. In a recent thread titled "How to properly use x3dna_ensemble?", I became aware of the fact that "Gromacs can actually devide an xtc-trajectory file into separate pdb files in MODEL/ENDML format."
  • After checking the above two points, we can focus on what is still missing or inconvenient.
Best regards,

Xiang-Jun

1310
General discussions (Q&As) / Re: A-DNA definition
« on: April 01, 2012, 05:24:57 pm »
Hi Arnab,

Thanks for your follow up. I am impressed by your attention to the little "details" -- oftentimes, the small part counts a lot.

Quote from: Arnab
Therefore, I assume there must be more to the story.
You are absolutely right -- see below for details.

Quote
To add to my previous post,  the sanity check clears out str221.pdb for 6CG/CG step which has Zp 1.93 but unassigned type. However, I don't know the check corresponding to "WC_info && WC_info[i + 1]  /* WC geometry */". Therefore, this may be a part of the issue of not getting the right form.
From the two PDB files you attached, it is easy to verify that all bps are of standard Watson-Crick type. So there is no issue with sanity check on WC_info(i) and WC_info(i + 1).

The real underlying reason for your observed discrepancy between str221 and str226 is as follows: to be on the safe side, 3DNA performs an additional check before assigning a dinucleotide step into A-, B- or TA-DNA form: there must be at least two consecutive dinucleotide steps of the same type to avoid any single isolated (mostly spurious) "transition" step.

Take your str221.pdb as an example,
    step       Xp      Yp      Zp     XpH     YpH     ZpH    Form
   1 GC/GC   -4.05    9.12   -1.51   -1.02    8.24   -4.15
   2 CG/CG   -2.24    8.83    0.29   -1.71    8.83    0.32     B
   3 GC/GC   -4.06    9.10   -0.65   -6.93    8.67    2.77     B
   4 CA/TG   -3.22    9.18    0.30   -7.45    8.28    4.02
   5 AC/GT   -3.17    9.21    1.03   -5.69    9.14    1.49
   6 CG/CG   -3.38    8.33    1.93   -7.49    8.45    1.33
   7 GT/AC   -3.58    8.97    0.80   -7.40    8.81    1.83
   8 TG/CA   -4.26    8.76   -0.67   -9.08    6.51    5.89
   9 GC/GC   -2.30    8.70   -0.08   -0.73    8.70    0.21     B
  10 CG/CG   -3.74    9.22   -0.75   -7.66    8.74    2.97     B
  11 GC/GC   -2.90    9.03    0.30   -5.61    8.55    2.92     B

According to the criteria detailed in my previous reply, step 6 CG/CG is indeed classified as A-DNA, since its Zp (1.93) > 1.5 Å. However, each of its neighbors -- 5 AC/GT and 7 GT/AC -- has a Zp < 1.5 Å, so neither is in A-form. Thus, 6 CG/CG is downgraded to unclassified. Note that without this additional check, step 4 CA/TG would have been taken as TA-DNA [Zp(h) = 4.02 > 4.0 Å].

With the above note, one can see easily why step 6 CG/CG in str226 is classified as A-DNA -- it's simply because its neighbor 5 AC/GT is also A-DNA.
    step       Xp      Yp      Zp     XpH     YpH     ZpH    Form
   1 GC/GC   -4.37    8.77   -1.05   -4.95    8.83   -0.22     B
   2 CG/CG   -2.56    8.63    0.13   -2.38    8.49    1.55     B
   3 GC/GC   -3.74    9.12   -0.62   -4.46    9.06   -1.34     B
   4 CA/TG   -2.64    9.32    0.91   -6.75    7.91    5.00
   5 AC/GT   -3.16    9.32    1.56   -7.39    9.41    0.91     A
   6 CG/CG   -3.07    8.71    2.09   -7.45    8.73    1.93     A
   7 GT/AC   -3.49    9.02    0.43   -6.64    8.97    1.09     B
   8 TG/CA   -3.83    9.27   -0.68   -7.64    9.09    1.94     B
   9 GC/GC   -2.69    8.96    0.34   -2.36    8.95    0.55     B
  10 CG/CG   -4.15    8.96   -0.57   -7.70    8.62    2.56     B
  11 GC/GC   -3.43    8.71    1.58   -6.69    8.70    1.66
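
The two-stage logic (per-step thresholds, then removal of isolated assignments) can be sketched in Python as follows. This is an illustration, not the 3DNA source: the limits are copied from the C excerpt quoted elsewhere in this thread and are assumed to be inclusive, the twist, rise and Watson-Crick checks are assumed to pass, and the names raw_form and drop_isolated are made up for this example.

```python
RANGES = {  # assumed inclusive limits, taken from the quoted C excerpt
    "Xp": (-5.0, -0.5), "Yp": (7.5, 10.0), "Zp": (-2.0, 3.5),
    "XpH": (-11.5, 2.5), "YpH": (1.5, 10.0), "ZpH": (-3.0, 9.0),
}


def raw_form(step):
    """Classify one step from its Xp..ZpH values; '' if unassigned."""
    if not all(lo <= step[k] <= hi for k, (lo, hi) in RANGES.items()):
        return ""
    if step["Zp"] >= 1.5:
        return "A"
    if step["ZpH"] >= 4.0:
        return "TA"
    if step["Zp"] <= 0.5 and step["XpH"] < 0.5:
        return "B"
    return ""


def drop_isolated(forms):
    """Keep a form only if a neighbouring step has the same form."""
    kept = []
    for i, form in enumerate(forms):
        prev_same = i > 0 and forms[i - 1] == form
        next_same = i + 1 < len(forms) and forms[i + 1] == form
        kept.append(form if form and (prev_same or next_same) else "")
    return kept


KEYS = ("Xp", "Yp", "Zp", "XpH", "YpH", "ZpH")
str221 = [  # values from the str221.pdb table above
    (-4.05, 9.12, -1.51, -1.02, 8.24, -4.15),
    (-2.24, 8.83, 0.29, -1.71, 8.83, 0.32),
    (-4.06, 9.10, -0.65, -6.93, 8.67, 2.77),
    (-3.22, 9.18, 0.30, -7.45, 8.28, 4.02),
    (-3.17, 9.21, 1.03, -5.69, 9.14, 1.49),
    (-3.38, 8.33, 1.93, -7.49, 8.45, 1.33),
    (-3.58, 8.97, 0.80, -7.40, 8.81, 1.83),
    (-4.26, 8.76, -0.67, -9.08, 6.51, 5.89),
    (-2.30, 8.70, -0.08, -0.73, 8.70, 0.21),
    (-3.74, 9.22, -0.75, -7.66, 8.74, 2.97),
    (-2.90, 9.03, 0.30, -5.61, 8.55, 2.92),
]
steps = [dict(zip(KEYS, row)) for row in str221]
forms = drop_isolated([raw_form(s) for s in steps])
```

With the str221 values, step 6 comes out of raw_form as "A" and is then blanked by drop_isolated, which is the behaviour described above.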

I may refine the criteria used for dinucleotide classification in a future release of 3DNA, and I welcome your feedback. For your analysis of MD simulation trajectories, I'd suggest that you check directly the raw data (Xp, Yp, Zp, XpH, YpH, ZpH etc).

HTH,

Xiang-Jun
 

1311
General discussions (Q&As) / Re: A-DNA definition
« on: April 01, 2012, 10:31:26 am »
The scheme of classifying a dinucleotide step into A-, B- or TA-DNA form is described in the 2003 NAR paper. More specifically, it is based on Zp and Zp(h); see Figure 5(c) linked below. For example, if Zp > 1.5 Å, then it is taken as A-DNA.



Per your request, listed below is the exact definition for A-, B- and TA-DNA, as excerpted from 3DNA source code. Note the "sanity check" at the beginning; the empirical criteria try to ensure a right-handed duplex consisting of Watson-Crick bps and with reasonable geometry. Also bear in mind that the classification is intended to be indicative rather than conclusive.

Code: [Select]
if (dval_in_range(mtwist, 10.0, 60.0)  /* over-all twist average */
    && WC_info[i] && WC_info[i + 1]  /* WC geometry */
    && dval_in_range(twist_rise[i][1], 10.0, 60.0)  /* right-handed */
    && dval_in_range(twist_rise[i][2], 2.5, 5.5)  /* Rise in range */
    && dval_in_range(aveS[i][1], -5.0, -0.5)  /* Xp */
    && dval_in_range(aveS[i][2], 7.5, 10.0)  /* Yp */
    && dval_in_range(aveS[i][3], -2.0, 3.5)  /* Zp */
    && dval_in_range(aveH[i][1], -11.5, 2.5)  /* XpH */
    && dval_in_range(aveH[i][2], 1.5, 10.0)  /* YpH */
    && dval_in_range(aveH[i][3], -3.0, 9.0)) {  /* ZpH */
    if (aveS[i][3] >= 1.5)  /* A-form */
        strABT[i] = 1;
    else if (aveH[i][3] >= 4.0)  /* TA-form */
        strABT[i] = 3;
    else if (aveS[i][3] <= 0.5 && aveH[i][1] < 0.5)  /* B-form */
        strABT[i] = 2;  /* aveS[i][3] < 0.5 for C-DNA #47 */
}

HTH,

Xiang-Jun

1312
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 27, 2012, 01:15:31 pm »
Glad to know that you've made some progress. However, the message as shown below still bothers me:
Quote
ruby 5276 child_info_fork::abort: address space needed by 'etc.so' (0x370000) is already occupied
A quick Google search turns up quite a few hits concerning Ruby and Cygwin on Windows. Will following Cygwin FAQ #4.44 "How do I fix fork() failures?" solve your problem? Specifically, the following sentence seems to address this problem:
Quote
Read the 'rebase' package README in /usr/share/doc/rebase/, and follow the instructions there to run 'rebaseall'

Additionally, what version of Ruby are you using?
Code: [Select]
ruby -v
I am switching from Perl to Ruby as the scripting language for 3DNA; hopefully this won't cause practical issues for Windows users.

Xiang-Jun

1313
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 27, 2012, 11:02:16 am »
Hi Nikolay,

The first two steps are fine. The third step is weird -- are you using the MinGW or the Cygwin version? This is a problem not necessarily specific to 3DNA, but obviously I'd like to see a solution to it.

Xiang-Jun

1314
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 27, 2012, 10:11:29 am »
Hi Nikolay,

Glad to hear that you've made progress. However, from my understanding of what you described, it seems something is still not quite right. The fixed-name file bp_step.par contains only the parameters for a single structure (snapshot), not the whole ensemble. It is a bit more difficult to explain the details in text, so I would suggest you repeat the examples in the directory $X3DNA/examples/ensemble/md/:
Code: [Select]
cd $X3DNA/examples/ensemble/md/
x3dna_ensemble analyze -h
x3dna_ensemble extract -h
Once you understand how the examples work, you should be able to apply the same idea to the analysis of your MD trajectories. Of course, if you have any questions, please do not hesitate to post back at the forum.

Xiang-Jun

PS: command-line help
The help page for x3dna_ensemble analyze
Code: [Select]
------------------------------------------------------------------------
Analyze a MODEL/ENDMDL delineated ensemble of NMR structures or MD
trajectories. All models must correspond to different conformations of
the same molecule. A template base-pair input file, generated with
'find_pair' and corrected manually as necessary, must be provided.

Usage:
        x3dna_ensemble analyze options
Examples:
        x3dna_ensemble analyze -b bpfile.dat -e sample_md0.pdb
             # 21 models (0-20); output (default): 'ensemble_example.out'
             # also generate 'model_list.dat', see example below
        x3dna_ensemble analyze -b bpfile.dat -m model_list.dat -o ensemble_example2.out
             # diff ensemble_example.out ensemble_example2.out

        x3dna_ensemble analyze -b bpfile.dat -p 'pdbdir/model_*.pdb' -o ensemble_example3.out
             # note to quote the -p option; 20 models (1-20)
             # also generate 'pdb_list.dat', see example below
        x3dna_ensemble analyze -b bpfile.dat -l pdb_list.dat -o ensemble_example4.out
             # diff ensemble_example3.out ensemble_example4.out
             # note the order of the models: 1, 10..19, 2, 20, 3..9
Options:
------------------------------------------------------------------------
    --bpfile, -b <s>:   Name of file containing base-pairing info
   --outfile, -o <s>:   Output file (default: ensemble_example.out)
  --ensemble, -e <s>:   Ensemble delineated with MODEL/ENDMDL pairs
    --models, -m <s>:   File containing an explicit list of model numbers
   --pattern, -p <s>:   Pattern of model files to process (e.g., *.pdb)
      --list, -l <s>:   File containing an explicit list of models
          --info, -i:   Show only model info in the ensemble [with -e]
          --help, -h:   Show this message
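
The model order noted in the examples above (1, 10..19, 2, 20, 3..9) is what plain string sorting of the file names produces. The short Python sketch below reproduces it and shows one way to obtain numeric order instead; natural_key is a hypothetical helper, and whether a list passed via -l is processed in the order given is an assumption not confirmed by the help text.

```python
import re

names = [f"model_{i}.pdb" for i in range(1, 21)]

# Plain string sorting compares character by character,
# so "model_10.pdb" comes before "model_2.pdb".
lexical = sorted(names)


def natural_key(name):
    """Split digits from text so that numbers compare numerically."""
    return [int(tok) if tok.isdigit() else tok
            for tok in re.split(r"(\d+)", name)]


numeric = sorted(names, key=natural_key)
```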

The help page for x3dna_ensemble extract
Code: [Select]
------------------------------------------------------------------------
Extract 3DNA structural parameters of an ensemble of NMR structures or
MD trajectories, after running 'x3dna_ensemble analyze'. The extracted
parameters are intended to be exported into Excel, Matlab and R etc for
further data analysis/visualization.

Usage:
        x3dna_ensemble extract options
Examples:
        x3dna_ensemble extract -l
             # to see a list of all parameters
        x3dna_ensemble extract -p prop
             # for propeller, no need to specify full: -p pr suffices
             # -p 36 also fine (see above); use 'ensemble_example.out'
        x3dna_ensemble extract -p slide -s , -f ensemble_example3.out
             # comma separated, from file 'ensemble_example3.out'
        x3dna_ensemble extract -p roll -s ' ' -n -o roll.dat
             # space separated, no row-label, to file 'roll.dat'
        x3dna_ensemble extract -e 1 -p chi1
             # extract the chi torsion angle of strand I, but exclude
             # those from the two terminal base pairs. For comparison,
             # run also: x3dna_ensemble extract -p chi1
        x3dna_ensemble extract -a
             # extract all parameters, each in a separate file
Options:
------------------------------------------------------------------------
  --separator, -s <s>:   Separator for fields [\t] (default: )
   --par-name, -p <s>:   Name of parameter to extract
   --fromfile, -f <s>:   Parameters file (default: ensemble_example.out)
    --outfile, -o <s>:   File of selected parameter (default: stdout)
   --end-bps, -e <i+>:   Number of end pairs to ignore (default: 0, 0)
            --all, -a:   Extract all parameters into separate files
          --clean, -c:   Clean up parameter files by the -a option
           --list, -l:   List all parameters
        --no-1col, -n:   Delete the first (label) column
           --help, -h:   Show this message


1315
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 26, 2012, 10:29:30 am »
Hi Nikolay,

I am glad to hear that Gromacs provides the facility to output an ensemble of multiple models delineated by MODEL/ENDMDL. As of 3DNA v2.1beta (currently distributed), the x3dna_ensemble Ruby script can handle multiple structures in a PDB MODEL/ENDMDL ensemble:

Code: [Select]
x3dna_ensemble -h
------------------------------------------------------------------------                                                       
Utilities for the analysis and visualization of an ensemble
    Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
    where sub-command must be one of:
        analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
        block_image -- generate a base block schematic image
        extract -- extract structural parameters after running 'analyze'
        reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Check the $X3DNA/examples/ensemble/ directory for examples, and report back if you have any problem. Note that I still need to add detailed documentation of the new x3dna_ensemble utility. However, it should be straightforward to play with, and I am always quicker in responding to users' questions than writing doc ...

Xiang-Jun

1316
Hi Hugh,

Thank you so much for contributing back your Perl script that solves your problem, and providing new sample Gaussian-Babel-generated PDB data files. As it turns out, the three PDB files you attached -- AT.pdb, AU.pdb, and GC.pdb -- are all fine with 3DNA. You can verify this point by running find_pair with the -s option:
Code: [Select]
find_pair -s GC.pdb stdout
# and it will output the following:
GC.pdb
GC.outs
    1      # single helix
    2      # number of bases
    1    1 # explicit bp numbering/hetero atoms
    1      # ....>A:...1_:[..G]G
    2      # ....>B:...1_:[..C]C
However, none of the three PDB files contains a base pair, per the default parameters -- check using a molecular graphics viewer like Jmol or PyMOL.

The atom naming issue related to PDB files from computational chemistry packages (e.g. Gaussian) and Babel has appeared a few times in the 3DNA forum. As far as 3DNA is concerned, your effort has led to the first solution to this problem that I am aware of. Your question has prompted me to read the article "Open Babel: An open chemical toolbox" and download the latest Open Babel v2.3.1.

Best regards,

Xiang-Jun

1317
Thanks for using 3DNA and your elaborate post -- your attached PDB file helped in uncovering where the problem is.

Quote
1. Is the attached file properly formatted for use with 3DNA?
No, it is not. Specifically, the atom names do not conform to the PDB convention. Using one of the U residues as an example, see the following two images:
[Images: Gaussian-Babel PDB (left) and standard PDB (right)]
On the left is the U based on the Gaussian-Babel generated PDB file, and on the right is the one based on the standard PDB file. Notice how the standard PDB file has names like " N1 " instead of " N  ", and " O2 " instead of " O  " etc. Proper atom names are important for 3DNA to identify which atom is which.
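
One quick way to spot this kind of naming problem is to look for atom names that repeat within a single residue, since element-only names such as " N  " cannot distinguish N1 from N3. The Python sketch below is a hypothetical diagnostic (duplicate_atom_names is not a 3DNA utility); it assumes fixed-column PDB records, with the atom name in columns 13-16 and the residue identifier in columns 18-27.

```python
from collections import Counter


def duplicate_atom_names(pdb_lines):
    """Return {residue_id: [atom names occurring more than once]}."""
    per_residue = {}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16]    # atom name, columns 13-16
        res_id = line[17:27]  # residue name, chain, number, insertion code
        per_residue.setdefault(res_id, Counter())[name] += 1
    dups = {}
    for res, cnt in per_residue.items():
        repeated = sorted(n for n, c in cnt.items() if c > 1)
        if repeated:
            dups[res] = repeated
    return dups
```

An empty result does not prove the names are standard; it only rules out the most obvious symptom.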

Quote
2. If find_pair does not find a base pair, will it still output the base pair geometry parameters that were calculated?
The problem is not that find_pair misses a pair due to parameter cutoffs, but that the residues are not taken as nucleotides at all. Your best bet is simply to make your PDB file standard-compliant; then both problems will be gone.

In your attached test.pdb file, there are two uracils, which follow the same atom ordering and naming convention. Could you provide me with example files containing A, C, G, and T? It may be worthwhile to have a utility program in 3DNA that can convert Gaussian-Babel generated PDB files to the standard format.

Xiang-Jun


1318
Hi Kumutha,

Thanks for posting the DNA sequence and its beautiful secondary structure predicted with mfold. 3DNA per se, and the 3DNA server based on it, does not have the magic (yet) to directly convert such a secondary structure into tertiary structure. However, I do believe 3DNA has functionality to help solve some components of the puzzle. In your case, the long double-helical stem can be approximated with a fiber B-DNA model; then you need to model the loop part at the top and the two terminal nucleotides at the bottom end. You need to assemble the three parts together, and possibly perform some energy minimization to achieve good stereochemistry.

You may find the article "RNA-Puzzles: A CASP-like evaluation of RNA three-dimensional structure prediction" published in the April 2012 [18 (4)] issue of the RNA journal helpful.

Xiang-Jun

1319
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 22, 2012, 08:31:51 am »
Hi Nikolay,

Thanks for using 3DNA, especially trying the new Ruby x3dna_ensemble utility. Sorry to hear that you are "completely lost". Regarding the convert sub-command, you are ahead of time:
Code: [Select]
x3dna_ensemble convert -h
...to be added for Amber/Gromacs/CHARMM etc
  --package, -p <s>:   Name of MD simulation package (default: amber)
         --help, -h:   Show this message
i.e., this functionality is not implemented yet. However, I'd be interested in adding such a converter for GROMACS if you provide me with an example (shortened) trajectory file, and let me know the detailed description of the xtc/trr file formats. On the other hand, I believe (or guess) GROMACS should have a facility to convert its native trajectory file to the standard PDB MODEL/ENDMDL format.

Please let me know.

Xiang-Jun


1320
FAQs / How to calculate DNA bending angle?
« on: March 21, 2012, 02:10:48 pm »
DNA bending angle is a frequently used parameter in the literature, often associated with DNA-protein complexes. Nevertheless, 3DNA does not provide a direct measure of the "bending angle" in its output file of structural parameters. The topic is more subtle and complicated than it appears.

On its face, an angle is defined by two vectors; let's call them a and b, and if each is normalized, then the angle (in degrees) between them is: acos(dot(a, b)) * 180/pi. Geometrically, after moving the tails of the two vectors into the same position (e.g., origin), the heads would normally define a plane, unless a and b are strictly parallel (0°) or anti-parallel (180°).
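
The formula above can be written as a small Python function. This is only a sketch: angle_deg is an illustrative name, the vectors are normalized inside the function, and the cosine is clamped to [-1, 1] to guard against round-off before calling acos.

```python
import math


def angle_deg(a, b):
    """Angle in degrees between two 3D vectors (need not be normalized)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / (na * nb)))  # guard against round-off
    return math.degrees(math.acos(cosine))
```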

DNA structures are three-dimensional, normally far more complicated than a single number can quantify. The concept of DNA bending angle, as I understand it, is only applicable to DNA structures with two relatively straight fragments (as in CAP-DNA complexes). In such situations, one can fit a least-squares (LS) linear helical axis to each of the two fragments, and calculate the angle between them. Towards this end, 3DNA outputs the following section when it judges that the input structure is not strongly curved. Using 355d/bdl084, which is distributed with 3DNA, as an example:
Code: [Select]
Global linear helical axis defined by equivalent C1' and RN9/YN1 atom pairs
Deviation from regular linear helix: 3.30(0.52)
Helix:    -0.127  -0.275  -0.953
HETATM 9998  XS    X X 999      17.536  25.713  25.665
HETATM 9999  XE    X X 999      12.911  15.677  -9.080
Average and standard deviation of helix radius:
P: 9.42(0.82), O4': 6.37(0.85),  C1': 5.85(0.86)

Here the Helix: line gives the normalized vector along the "best-fit" helical axis. The two HETATM records provide the two end points of the helix, and they are directly related to the Helix: line by a simple equation. Following the above example, we have (Octave/Matlab code):

Code: [Select]
XE = [12.911  15.677  -9.080];
XS = [17.536  25.713  25.665];

dd = XE - XS
%   -4.6250  -10.0360  -34.7450

Helix = dd / norm(dd)
%  -0.12685  -0.27526  -0.95296  ==> [-0.127  -0.275  -0.953]

With the two HETATM records, one can easily add them into the original PDB file to display the helical axis using a molecular graphics program (e.g., RasMol, Jmol or PyMOL). Moreover, the two helix vectors can be used to reorient the original PDB structure into a view so that one helical fragment lies along the x-axis, and the other in the xy-plane. As documented in detail in recipe #4 on "Automatic identification of double-helical regions in a DNA–RNA junction" of the 2008 3DNA Nature Protocols paper, "The chosen view allows for easy visualization and protractor measurement of the overall bending angle between the two relatively straight helices."

The following points are well worth noting:
  • The LS fitting procedure used in 3DNA follows SCHNAaP, which was based on the algorithm in the well-known NewHelix program, maintained by Dr. Richard Dickerson up to the 1990s. While fitting a global linear helical axis to strongly curved DNA structures makes no sense with derived parameters (NewHelix itself has been replaced by FreeHelix, also from Dickerson), I do believe it is meaningful to fit a linear helix to a relatively straight DNA fragment. That's why I have kept this functionality in SCHNAaP and 3DNA; the 3DNA bending angle calculation serves as an example illustrating the point – it provides an "intuitive" way for biologists to understand how the bending angle is calculated; it can actually be measured directly.
  • Instead of directly LS-fitting a linear helical axis with 3DNA, one can alternatively superimpose a regular fiber model onto the DNA fragment, and then derive the straight helical axis from the fitted coordinates. The two approaches normally give slightly different numerical values, as would be expected.
  • Overall, bending angle is (at most) an approximate measure of DNA curvature. In my opinion, the concept is only applicable for comparing a set of structures, each with two relatively straight helical fragments. Even in such cases, the relative spatial relationship between two segments is more complicated than a simple (bending) angle could quantify. Be watchful – do not exaggerate the significance of small variations in bending angle.

1321
General discussions (Q&As) / Re: Re: 3DNA download
« on: March 21, 2012, 09:03:05 am »
Check the documentation in the $X3DNA/doc directory; work through the recipes of the 2008 3DNA Nature Protocols paper; ask questions in the forum. I will update the brief tutorial shortly.

Xiang-Jun

1322
Yes, it is possible to install 3DNA on Windows XP -- the distributed v2.1beta-cygwin-win and v2.1beta-mingw-win were actually compiled on Windows XP. As to MinGW vs Cygwin, it is up to you: in my limited experience playing with 3DNA on Windows, Cygwin seems more Linux-like, whilst MinGW is more Windows-like. If you are an experienced Windows user, you may try MinGW first.

Of course, if you have any technical problems, please post them here.

Xiang-Jun

1323
FAQs / How to handle modified (uncommon) bases?
« on: March 20, 2012, 09:40:19 pm »
In 3DNA, modified bases are mapped to their standard counterparts, e.g. 5‐iodouracil (5IU) to uracil (U) and 1‐methyladenine (1MA) to adenine (A), and are designated with lower case letters (as u and a respectively for the examples cited above). Technically, the mapping is stored in file $X3DNA/config/baselist.dat, and looks like this:
Code: [Select]
  A     A
 DA     A
ADE     A
....
5IU     u      # I connected to C5
....
1MA     a      # C connected to N1

Each mapped one-letter base (X = A/C/G/T/U for the standard nucleotides and x = a/c/g/t/u for the modified ones) has a corresponding Atomic_X.pdb (or Atomic.x.pdb) file oriented in the standard base reference frame. By default, the two sets (X and x) are identical, i.e., Atomic_A.pdb has the same content as Atomic.a.pdb. The mapping information is used in a least-squares fitting procedure to define the base reference frame for each nucleotide in a PDB file, and allows for easy analysis of unusual DNA and RNA structures.
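
To make the mapping concrete, here is a minimal Python sketch that reads lines in the format shown above into a dictionary. It is based only on the excerpt (residue name, one-letter code, optional # comment); the full baselist.dat syntax may be richer, and parse_baselist and is_modified are illustrative names rather than 3DNA functions.

```python
def parse_baselist(lines):
    """Map residue name -> one-letter code from baselist.dat style lines."""
    mapping = {}
    for line in lines:
        fields = line.split("#", 1)[0].split()  # drop trailing comment
        if len(fields) == 2 and len(fields[1]) == 1:
            mapping[fields[0]] = fields[1]
    return mapping


def is_modified(code):
    """Lower-case codes mark modified bases, upper-case the standard ones."""
    return code.islower()
```

Lines that do not consist of exactly a name and a one-letter code, such as blank lines, are skipped.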

As of v2.1, when encountering a new modified base, 3DNA will automatically perform the mapping, and outputs the following message (using a contrived example):
Code: [Select]
Match '2MG' to 'g' for residue 2MG   10  on chain A [#1]
    check it & consider to add line '2MG     g' to file <baselist.dat>

Simply add a line containing 2MG     g to file baselist.dat and the above info message will be gone. This is a contrived example because I deliberately deleted that line from baselist.dat for this illustration.

I implemented this auto-mapping as an experimental feature at least as far back as v1.5, but did not document it for public use. My experience over the years has shown that the auto-mapping is functioning as designed. Now with this feature set by default, processing of large datasets can be fully automated. Moreover, using find_pair, it is easy to get a complete list of modified bases in a dataset, e.g., in all the NDB entries.
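
As one possible way to compile such a list, the Python sketch below scans captured find_pair messages for the "Match ... to ..." lines and returns the unique candidate entries. The regular expression is based solely on the single message shown above, so the exact format is an assumption, as is however the messages were captured; suggested_lines is a hypothetical helper.

```python
import re

# Pattern assumed from the one example message shown above.
MATCH_RE = re.compile(r"Match '(?P<res>[^']+)' to '(?P<code>[^']+)' for residue")


def suggested_lines(log_text):
    """Return sorted unique 'RES     code' lines from find_pair messages."""
    pairs = {(m["res"], m["code"]) for m in MATCH_RE.finditer(log_text)}
    return [f"{res:<3s}     {code}" for res, code in sorted(pairs)]
```

Each returned line should still be checked by eye before it is added to baselist.dat.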

1324
Structural analysis of nucleic acids used to be a rather tedious process, especially for irregular, complicated RNA structures and nucleic acid-protein complexes (e.g., the large ribosomal subunit 1jj2/rr0033). Without valid base-pairing information as input, the various analysis programs will produce meaningless results. The program find_pair was originally created to solve this specific problem, by generating input files for the 3DNA analysis routines (analyze/cehs) directly from a PDB file.

At its core, find_pair uses a purely geometric approach to identify all possible pairs (Watson-Crick or non-canonical) that actually exist in a structure, along with their H-bonding patterns and helix context. Specifically, the major criteria used are as follows:
  • The distance between the origins of the two bases (as defined by their standard reference frames) must be less than a certain limit (15.0 Å by default) - otherwise, they would be too far away to be called a pair.
  • The vertical separation (i.e., stagger) between the two base planes must be less than a certain limit (2.5 Å by default) - otherwise, they would be stacking instead of pairing.
  • The angle between the two base z-axes (i.e., their normal vectors) must be less than a cut-off (65.0° by default).
  • There must be at least one pair of nitrogen/oxygen base atoms within an H-bonding cut-off distance (4.0 Å by default).
If two bases fulfill these geometric requirements, they are defined to be a pair, without regard to their chemical constituents. Thus our method allows for identification of unconventional pairs as easily as the canonical ones. The program then checks for possible H-bonding patterns, whether the normal donor-acceptor (noted by '-' as in O6 - N4 for a G·C pair) or the unusual donor-donor, acceptor-acceptor (noted by '*' as in O2 * N3 for a C·C pair in urx057). The non-canonical pairs, especially those with unusual H-bonding patterns, should be checked more carefully - they could be due to errors in structure determination, or they could have some special meaning/significance unnoticed previously.
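The four criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual find_pair code: the function name is_candidate_pair is hypothetical, the stagger is taken as the projection of the origin-to-origin vector onto the mean base normal, the z-axis angle is folded into 0-90° (since the two normals point in opposite directions in an antiparallel pair), and the shortest N/O...N/O distance is passed in precomputed.

```python
import math


def is_candidate_pair(o1, z1, o2, z2, min_no_dist,
                      max_origin=15.0, max_stagger=2.5,
                      max_angle=65.0, max_hbond=4.0):
    """Apply the four geometric tests to two base reference frames.

    o1, o2: base origins; z1, z2: base normals (any non-zero length);
    min_no_dist: shortest distance between N/O base atoms of the two bases.
    """
    d = [b - a for a, b in zip(o1, o2)]
    if math.sqrt(sum(c * c for c in d)) >= max_origin:
        return False  # origins too far apart

    n1 = math.sqrt(sum(c * c for c in z1))
    n2 = math.sqrt(sum(c * c for c in z2))
    cos = sum(a * b for a, b in zip(z1, z2)) / (n1 * n2)
    # Assumption: fold the angle into 0-90 degrees.
    angle = math.degrees(math.acos(min(1.0, abs(cos))))
    if angle >= max_angle:
        return False  # base planes too far from parallel

    # Mean normal: flip z2 when it points against z1, then average.
    sign = 1.0 if cos >= 0 else -1.0
    zm = [a / n1 + sign * b / n2 for a, b in zip(z1, z2)]
    nm = math.sqrt(sum(c * c for c in zm))
    stagger = abs(sum(a * b for a, b in zip(d, zm))) / nm
    if stagger >= max_stagger:
        return False  # stacked rather than paired

    return min_no_dist <= max_hbond  # at least one N/O contact in range


# Two roughly coplanar, antiparallel bases with a 2.9 Å N...O contact.
print(is_candidate_pair((0, 0, 0), (0, 0, 1), (8, 0, 0.5), (0, 0, -1), 2.9))
# Two stacked bases 3.4 Å apart along the normal.
print(is_candidate_pair((0, 0, 0), (0, 0, 1), (0, 0, 3.4), (0, 0, 1), 2.9))
```

Note that relaxing a single limit changes the outcome for borderline cases: a shortest N/O distance of 4.2 Å fails the default 4.0 Å cut-off but passes with max_hbond=5.0.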

The default criteria mentioned above are based on a survey of the NDB structures. Generally speaking, they are pretty generous and work quite well in the most common cases we've encountered. However, we are aware of special cases where some of them might be too restrictive or too generous, thus leading find_pair to miss base pairs or produce superfluous ones. The default settings are stored in a text file named misc_3dna.par under the directory $X3DNA/config/, which users can modify as they see fit. Changes in that directory will have a global effect - wherever you run find_pair on your system, the modified values will be used. Alternatively, users can copy misc_3dna.par to their current working directory and change it there for local effect. Note that the local setting takes precedence over the global one.

As an example, find_pair will miss the 127th base-pair I:..53_:[.DT]T-----A[.DA]:.-53_:J in structure 1kx5/pd0287 with its default settings. This is because the H-bonding distance between T:N3 - A:N1 is 4.20 Å and that for T:O4 - A:N6 is 4.85 Å; both of them are larger than the default 4.0 Å cut-off. Increasing the H-bonding criterion in file misc_3dna.par from 4.0 Å to 5.0 Å will solve this problem. Please note that in 3DNA, users can start directly from an uncompressed PDB file, without having to extract the DNA fragment first:
  • find_pair 1kx5.pdb 1kx5.inp to get input file for analyze
  • analyze 1kx5.inp to get detailed structural parameters in file 1kx5.out
  • The above two steps can be combined into one: find_pair 1kx5.pdb stdout | analyze stdin
In addition to (or instead of) manipulating parameters in misc_3dna.par, it is often preferable to manually edit the find_pair-generated base-pair file before feeding it into analyze/cehs. This allows for maximum flexibility as to which pairs to consider in calculating 3DNA structural parameters.

Also worth noting is the -p option of find_pair: without this option, find_pair locates base pairs in double-helical regions; thus the Watson-Crick pairs take precedence over the wobble and other non-canonical pairs. With -p, all pairs and higher-order base associations (i.e., triplets and above) are detected.

1325
The easiest way to build a nucleic acid structure with the sugar-phosphate backbone, other than using the predefined fiber models, is to use the rebuild program. The backbone building scheme uses exactly the same protocol as the default base-only model. The user needs to add the -atomic option to rebuild, and to choose the desired rigid sugar-phosphate backbone to be attached to the standard base geometry.

The four types of currently available backbone conformations are listed in the directory $X3DNA/config/atomic. To use any of these backbones, it is necessary to copy the standard nucleotide files associated with each type of backbone to $X3DNA/config or your current working directory, and to name each nucleotide as follows: Atomic_X.pdb (where X = A, C, G, T, U; or Atomic.x.pdb where x = a, c, g, t, u for modified bases). The default Atomic_X.pdb files contain only the C1' backbone atom, and the base geometry is independent of the backbone conformation.

To build a DNA structure with B-DNA backbone conformation, for example, one uses the BDNA_X.pdb set to replace Atomic_X.pdb. There is a sub-command cp_std of the Ruby utility program x3dna_utils to help with this: x3dna_utils cp_std BDNA. This will copy BDNA_X.pdb to the current working directory and rename it Atomic_X.pdb. Please note that rebuild searches for Atomic_X.pdb files first in the current working directory, and then in $X3DNA/config.

To make the above description clear, here is an example. Go to the directory $X3DNA/examples/analyze_rebuild, and try to reproduce the following:
  • use the command, x3dna_utils cp_std BDNA, so that you will have Atomic_X.pdb files
  • use find_pair bdl084.pdb | analyze, to analyze the structure bdl084 (355d) and to generate a file named bp_step.par
  • use rebuild -atomic bp_step.par bdl084_3dna.pdb, to generate the PDB file bdl084_3dna.pdb with a standard B-backbone
The RMSD between all atoms of the original bdl084.pdb file and the generated bdl084_3dna.pdb file is only 0.73 Å. Please note that in the rebuilt bdl084_3dna.pdb file, some O3'(i-1) to P(i) linkages can be quite long (broken). This structure, however, serves well as a starting point for further energy minimization. See post "Restraint optimization of DNA backbone geometry using PHENIX" for how to regularize the overlong bonds.
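To spot the overlong O3'(i-1) to P(i) linkages mentioned above, a small Python sketch like the following can scan a PDB file. It is not part of 3DNA: the function names long_o3p_links and atom_line are hypothetical, and the 2.0 Å threshold is an arbitrary choice of mine (a normal O3'-P bond is about 1.6 Å). It assumes standard fixed-column ATOM records, residues listed in chain order, and the atom name O3' (older files may use O3* instead). It needs Python 3.8+ for math.dist.

```python
import math


def long_o3p_links(pdb_text, max_len=2.0):
    """Return (chain, res_i_minus_1, res_i, distance) for overlong links."""
    order = []   # residue keys in file order
    atoms = {}   # (chain, resSeq) -> {atom name: (x, y, z)}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        if name not in ("O3'", "P"):
            continue
        key = (line[21], int(line[22:26]))
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if key not in atoms:
            atoms[key] = {}
            order.append(key)
        atoms[key][name] = xyz

    flagged = []
    for prev, cur in zip(order, order[1:]):
        if prev[0] != cur[0]:
            continue  # do not link across chains
        if "O3'" in atoms[prev] and "P" in atoms[cur]:
            d = math.dist(atoms[prev]["O3'"], atoms[cur]["P"])
            if d > max_len:
                flagged.append((cur[0], prev[1], cur[1], round(d, 2)))
    return flagged


def atom_line(serial, name, res, chain, seq, x, y, z):
    """Build a fixed-column ATOM record (used for the demo only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up coordinates: the 1->2 link is normal, the 2->3 link is too long.
demo = "\n".join([
    atom_line(1, "O3'", "DC", "A", 1, 0.0, 0.0, 0.0),
    atom_line(2, "P", "DG", "A", 2, 1.6, 0.0, 0.0),
    atom_line(3, "O3'", "DG", "A", 2, 5.0, 0.0, 0.0),
    atom_line(4, "P", "DC", "A", 3, 8.0, 0.0, 0.0),
])
print(long_o3p_links(demo))
```

To check a rebuilt model, read the file contents (e.g., bdl084_3dna.pdb) into a string and pass it to long_o3p_links; the flagged steps are the ones that need regularization.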

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University