-
Dear 3DNA users,
I am using Amber for MD simulations. I generated hundreds of PDB files from the trajectories and analyzed the DNA with the 3DNA programs. It would be helpful if someone has a script for extracting information from the 3DNA output files. For example, I want to extract the Twist value for each base-pair step of the dodecamer DNA and average it over the number of frames.
Thanks in advance
Sincerely
Aneesh
-
Hi Aneesh,
Over the years, I have written a few posts related to the topic of applying 3DNA to the analysis of molecular dynamics (MD) simulations, including:
- 3DNA for the analysis of molecular dynamics simulations (http://xiang-jun.blogspot.com/2010/07/3dna-for-analysis-of-molecular-dynamics.html)
- 3DNA in molecular dynamics simulations (http://xiang-jun.blogspot.com/2009/10/3dna-in-molecular-dynamics-simulations.html)
- Curves+ vs 3DNA (http://xiang-jun.blogspot.com/2009/08/curves-vs-3dna.html)
Also, I contacted a couple of practitioners in the MD field, looking for a possible collaborator to make 3DNA more straightforward to use for this growing user community. For various reasons, nothing significant has come of this effort. I am hoping that users who have successfully applied 3DNA in MD analysis will contribute their scripts so others can benefit from and build upon them. In the meantime, if you could post your MD analysis procedure and the problems you faced, others (myself included) may be able to help you more concretely.
Xiang-Jun
-
Dear Dr. Xiang-Jun,
Thanks for the reply. In my case, I am using a dodecamer B-DNA and have generated 100 snapshots (PDBs) from the simulation trajectories. I then ran the 3DNA analysis on each snapshot and got 100 3DNA output files. Now I want to extract different parameters from the output files. For example, I want to get the average 'Twist' value for each base-pair step. For that I have to extract the Twist value for each base-pair step from the 100 3DNA output files, average it, and calculate the standard error.
Hope I made it clear now.
Waiting for your valuable reply.
Thanks in advance
Sincerely
Aneesh
-
Welcome back. It is certainly clearer than before. However, it would be far more helpful if you could be even more specific, i.e., by providing an example. For example, I am not sure what the "100 snapshots (PDBs) from the simulation trajectories" look like. Are the 100 snapshots stored in 100 separate PDB files, or all in one? If the latter, how are the snapshots separated? By MODEL/ENDMDL as in NMR structures? What would be an appropriate output format for the extracted parameters? In addition to the mean values of some parameters, e.g., Twist, how about their standard deviations and other related simple statistics? All such details need to be considered to come up with a script that is more generally applicable.
Thus, to help others help you more effectively, try to come up with a (minimum) concrete example, including all necessary input data files and your expected results (in numbers). Moreover, if you have already written some scripts, attach them with your post.
Alternatively, as mentioned in my blog post "Curves+ vs 3DNA (http://xiang-jun.blogspot.com/2009/08/curves-vs-3dna.html)", Curves+ has built-in support for the analysis of MD simulation trajectories, and it may well serve your needs.
HTH,
Xiang-Jun
-
Thanks for the immediate reply.
>Are the 100 snapshots stored in 100 separate PDB files, or all in one?
They are stored in 100 separate PDBs.
> What would be an appropriate output format for the extracted parameters?
For example, below is part of a 3DNA output file.
****************************************************************************
Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 CG/CG     -0.47      0.25      2.95     -3.89      5.88     26.83
   2 GC/GC     -0.38      1.12      3.63     -1.47    -11.72     51.06
   3 CG/CG     -0.59     -0.23      3.19     -6.17      9.46     22.37
   4 GA/TC      1.02     -0.13      3.56      4.26     16.61     29.18
   5 AA/TT     -1.86      0.42      3.30     -0.02      1.06     40.25
   6 AT/AT      0.09     -0.55      3.44     -9.90     -5.12     33.95
   7 TT/AA     -0.04     -1.23      3.62     -5.80      3.39     34.13
   8 TC/GA     -0.79     -1.74      3.13      3.33     -7.64     31.44
   9 CG/CG      1.87      0.46      3.57      3.93     15.07     30.64
  10 GC/GC     -0.98      0.47      3.21     -6.57     -5.86     40.09
  11 CG/CG      1.38     -0.30      2.83      4.67      1.96     33.37
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ave.     -0.07     -0.13      3.31     -1.60      2.10     33.94
      s.d.      1.10      0.82      0.28      5.18      9.23      7.71
****************************************************************************
Here, I want to extract a certain parameter, say Twist, for each base-pair step. The output file would contain a single column of Twist values for one particular base-pair step (say the 4th step, GA/TC), extracted from the 100 3DNA outputs (i.e., a single column with 100 lines). This should be repeated for all the base-pair steps (I don't have a better idea for how to do this) :(
> In addition to the mean values of some parameters, e.g., Twist, how about their standard deviations and other related simple statistics?
The end of each column would contain the mean value and standard deviation for that particular base-pair step.
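For readers following along, the extraction described above could be sketched in Python roughly as follows. This is only a minimal sketch, not a script from this thread: the function name `parse_step_parameters` and the `SAMPLE` text are hypothetical, and it assumes the column header directly follows the section title and that every data row is fully numeric, as in the excerpt above. Rows with non-numeric entries would need extra handling.

```python
def parse_step_parameters(text, title="Local base-pair step parameters"):
    """Return [(step_label, {column: value}), ...] for one 3DNA output text."""
    lines = text.splitlines()
    # Find the section title; the column header is assumed to follow it directly.
    start = next(i for i, ln in enumerate(lines) if ln.strip() == title)
    columns = lines[start + 1].split()[1:]  # drop the leading "step" label
    rows = []
    for ln in lines[start + 2:]:
        fields = ln.split()
        # A data row looks like "4 GA/TC 1.02 -0.13 ..."; anything else
        # (the "~~~~" rule, "ave.", "s.d.") ends the block.
        if len(fields) != len(columns) + 2 or not fields[0].isdigit():
            break
        values = [float(v) for v in fields[2:]]
        # Prefix the step number, since labels such as CG/CG repeat.
        rows.append((f"{fields[0]}_{fields[1]}", dict(zip(columns, values))))
    return rows


# Hypothetical two-row excerpt in the same layout as the output above.
SAMPLE = """\
****************************************************************************
Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   3 CG/CG     -0.59     -0.23      3.19     -6.17      9.46     22.37
   4 GA/TC      1.02     -0.13      3.56      4.26     16.61     29.18
          ~~~~~~~~~~
      ave.     -0.07     -0.13      3.31     -1.60      2.10     33.94
"""

twist = [(label, row["Twist"]) for label, row in parse_step_parameters(SAMPLE)]
print(twist)
```

Calling the function once per output file and collecting the `Twist` entries step by step would give the per-step columns described above.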
Hope this will help
Sincerely
Aneesh
-
> They are stored in 100 separate PDBs.
Is this the norm? I cannot imagine that an MD simulation with thousands of snapshots ends up with thousands of PDB files. Anyway, are your 100 PDBs all stored in one directory? Do the PDB file names share a specific pattern?
Additionally, are you familiar with R or Matlab/Octave? In my mind, the script to extract 3DNA output parameters would ideally write a tab-delimited data table that can be fed directly into commonly available tools for calculating the mean, standard deviation, and other statistics.
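To make the idea of a tab-delimited table concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not an existing 3DNA tool: the function name `write_series_table`, the column labels, and the Twist values are all hypothetical, and the standard deviation is taken as the sample value (n - 1 in the denominator).

```python
import csv
import io
import math
import statistics


def write_series_table(labels, frames, out):
    """Write one row per frame, one column per step, then summary rows."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["frame"] + list(labels))
    for number, values in enumerate(frames, start=1):
        writer.writerow([number] + [f"{v:.2f}" for v in values])
    columns = list(zip(*frames))  # transpose: one tuple of values per step
    n = len(frames)
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]  # sample s.d. (n - 1)
    sems = [sd / math.sqrt(n) for sd in sds]      # standard error of the mean
    for name, stats in (("mean", means), ("sd", sds), ("sem", sems)):
        writer.writerow([name] + [f"{s:.2f}" for s in stats])


# Hypothetical Twist values for two steps over three frames.
labels = ["4_GA/TC", "5_AA/TT"]
frames = [[29.18, 40.25], [31.18, 38.25], [30.18, 39.25]]
buffer = io.StringIO()
write_series_table(labels, frames, buffer)
print(buffer.getvalue())
```

Passing an open file instead of the `io.StringIO` buffer would produce a table that R, Octave, or a spreadsheet can read directly; the summary rows could be dropped if those tools are used for the statistics.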
I am pretty occupied with my job right now, but I will try to "spare" some time to come up with a preliminary script to get you started (hopefully by the end of next week).
Xiang-Jun
-
Dear Xiang-Jun,
I can convert each snapshot into a PDB file. All the PDBs are stored in the same directory and share a specific naming pattern, like 3dna_1.pdb, 3dna_2.pdb, etc.
Do you have any other way in mind to analyze the large number of 3DNA outputs generated for MD trajectories?
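One practical detail with names like 3dna_1.pdb, 3dna_2.pdb: a plain alphabetical sort puts 3dna_10.pdb before 3dna_2.pdb, which would scramble the time order of the frames. A minimal Python sketch for listing the files in frame order is shown below; the helper names `frame_number` and `ordered_frames` are hypothetical, and the code assumes the frame number is the last underscore-separated field before the extension.

```python
import re
from pathlib import Path


def frame_number(path):
    """Extract the trailing integer from a name such as '3dna_12.pdb'."""
    match = re.search(r"_(\d+)\.[^.]+$", Path(path).name)
    if match is None:
        raise ValueError(f"no frame number in file name: {path}")
    return int(match.group(1))


def ordered_frames(directory, pattern="3dna_*.pdb"):
    """Return the matching files sorted by frame number, not alphabetically."""
    return sorted(Path(directory).glob(pattern), key=frame_number)


names = ["3dna_10.pdb", "3dna_2.pdb", "3dna_1.pdb"]
print(sorted(names, key=frame_number))
```

The same key function would work for the corresponding 3DNA output files, provided they follow the same numbering scheme.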
Sincerely
Aneesh
-
Hello Aneesh,
As Xiang-Jun suggested, you can definitely use Curves+ (which is free). Curves+ (and its analysis program Canal) has built-in support for Amber MD trajectories.
Using Canal, you can get time-series files for each base-pair or helical parameter, such as twist, slide, shift, etc. The output files are arranged in columns corresponding to individual base-pair steps.
Or you can use a simple Python script that I have posted here for MD trajectory analysis. My script only extracts the time series of the parameters.
Search the forum and you should be able to find my post.
HTH,
Alpay
-
Hi Aneesh,
In addition to Alpay's reply above, have you also noticed his most recent post, "A modified 3DNA parser for MD trajectory analysis (http://3dna.rutgers.edu:8080/forum/viewtopic.php?f=4&t=193)", in the Users' contributions section? Please give it a try and report back on your experience.
I will try to come up with a script that hopefully streamlines the process of extracting 3DNA output from MD trajectory analysis. Alpay's parser may well serve as a starting point.
HTH,
Xiang-Jun
-
Dear Alpay and Xiang-Jun,
Thanks for your reply. I will definitely go through the suggestions you made and will update you about it.
Thanks once again
Sincerely
Aneesh
-
Hi Aneesh,
Finally, I've come up with something to present :D See my post "Ruby scripts for the analysis of MD simulation trajectories (http://3dna.rutgers.edu:8080/forum/viewtopic.php?f=11&t=195&p=538)". Please post your follow-ups there.
Cheers!
Xiang-Jun
Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids
Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University