
Topic: script for extracting data from 3DNA output file

aneeshcna
script for extracting data from 3DNA output file
« on: December 24, 2010, 02:07:40 am »
Dear 3DNA users,
               I am using Amber for MD simulations. I generated hundreds of PDB files from the trajectories and analyzed the DNA with the 3DNA program. It would be helpful if someone has a script for extracting information from 3DNA output files. For example, I want to extract the Twist value for each base-pair step of the dodecamer DNA and average it over the frames.

Thanks in advance

Sincerely
Aneesh

xiangjun (Administrator)
Re: script for extracting data from 3DNA output file
« Reply #1 on: December 25, 2010, 10:44:26 pm »
Hi Aneesh,

Over the years, I have written a few posts on applying 3DNA to the analysis of molecular dynamics (MD) simulations.

Also, I contacted a couple of practitioners in the MD field, looking for a possible collaborator to make 3DNA more straightforward to use for this growing user community. For various reasons, nothing significant has come of this effort. I am hoping that users who have successfully applied 3DNA to MD analysis will contribute their scripts so that others can benefit from and build upon them. In the meantime, if you could post your MD analysis procedure and the problems you faced, others (myself included) may be able to help you more concretely.

Xiang-Jun

aneeshcna
Re: script for extracting data from 3DNA output file
« Reply #2 on: January 04, 2011, 12:51:40 am »
Dear Dr. Xiang-Jun,

           Thanks for the reply. In my case, I am using a dodecamer B-DNA and have generated 100 snapshots (PDBs) from the simulation trajectories. I then ran the 3DNA analysis on each snapshot and got 100 3DNA output files. Now I want to extract different parameters from the output files. For example, I want to get the average Twist value for each base-pair step. For that, I have to extract the Twist value for each base-pair step from the 100 3DNA output files, average it, and calculate the standard error.

Hope I made it clear now.

Waiting for your valuable reply.

Thanks in advance

Sincerely
Aneesh

xiangjun
Re: script for extracting data from 3DNA output file
« Reply #3 on: January 04, 2011, 09:47:06 pm »
Welcome back. Your description is certainly clearer than before. However, it would be far more helpful if you could be even more specific, i.e., by providing an example. For instance, I am not sure what the "100 snapshots (PDBs) from the simulation trajectories" look like. Are the 100 snapshots stored in 100 separate PDB files, or all in one? If the latter, how are the snapshots separated? By MODEL/ENDMDL records, as in NMR structures? What would be an appropriate output format for the extracted parameters? In addition to the mean values of some parameters, e.g., Twist, how about their standard deviations and other related simple statistics? All such details need to be considered to come up with a script that is more generally applicable.

Thus, to help others help you more effectively, try to come up with a (minimum) concrete example, including all necessary input data files and your expected results (in numbers). Moreover, if you have already written some scripts, attach them with your post.

Alternatively, as mentioned in my blog post "Curves+ vs 3DNA", Curves+ has built-in support for the analysis of MD simulation trajectories, and it may well serve your needs.

HTH,

Xiang-Jun

aneeshcna
Re: script for extracting data from 3DNA output file
« Reply #4 on: January 06, 2011, 04:17:39 am »
Thanks for the immediate reply.

>Are the 100 snapshots stored in 100 separate PDB files, or all in one?
           They are stored in 100 separate PDB files.
> What would be an appropriate output format for the extracted parameters?
       For example, below is part of a 3DNA output file.
****************************************************************************
Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 CG/CG     -0.47      0.25      2.95     -3.89      5.88     26.83
   2 GC/GC     -0.38      1.12      3.63     -1.47    -11.72     51.06
   3 CG/CG     -0.59     -0.23      3.19     -6.17      9.46     22.37
   4 GA/TC      1.02     -0.13      3.56      4.26     16.61     29.18
   5 AA/TT     -1.86      0.42      3.30     -0.02      1.06     40.25
   6 AT/AT      0.09     -0.55      3.44     -9.90     -5.12     33.95
   7 TT/AA     -0.04     -1.23      3.62     -5.80      3.39     34.13
   8 TC/GA     -0.79     -1.74      3.13      3.33     -7.64     31.44
   9 CG/CG      1.87      0.46      3.57      3.93     15.07     30.64
  10 GC/GC     -0.98      0.47      3.21     -6.57     -5.86     40.09
  11 CG/CG      1.38     -0.30      2.83      4.67      1.96     33.37
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ave.     -0.07     -0.13      3.31     -1.60      2.10     33.94
      s.d.      1.10      0.82      0.28      5.18      9.23      7.71
****************************************************************************
        Here, I want to extract a certain parameter, say Twist, for each base-pair step. The output file would contain a single column of Twist values for one particular base-pair step (say the 4th step, GA/TC), extracted from the 100 3DNA outputs (i.e., a single column with 100 lines). This should be repeated for all the base-pair steps (I don't have a better idea for how to do this)  :(
 

> In addition to the mean values of some parameters, e.g., Twist, how about their standard deviations and other related simple statistics?

     The end of each column would contain the mean value and standard deviation for that particular base-pair step.
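
The extraction described above can be sketched in Python roughly as follows. This is only an illustration, not an official 3DNA tool: the function names (parse_step_parameters, collect_series, summarize) are made up here, and the parser assumes the block layout shown in the excerpt above, with every value printed as a plain decimal number. It reports the sample standard deviation and the standard error of the mean for each step.

```python
import math
import re

HEADER = "Local base-pair step parameters"
COLUMNS = ["Shift", "Slide", "Rise", "Tilt", "Roll", "Twist"]
# A data row looks like "   4 GA/TC      1.02     -0.13 ...":
# step index, step label, then six decimal numbers (assumed layout).
ROW = re.compile(r"^\s*(\d+)\s+(\S+/\S+)((?:\s+-?\d+\.\d+){6})\s*$")


def parse_step_parameters(text):
    """Return [(step_label, {column: value}), ...] for one 3DNA output text."""
    rows = []
    in_block = False
    for line in text.splitlines():
        if line.strip() == HEADER:
            in_block = True
            continue
        if not in_block:
            continue
        match = ROW.match(line)
        if match:
            values = [float(v) for v in match.group(3).split()]
            rows.append((match.group(2), dict(zip(COLUMNS, values))))
        elif rows:
            break  # first non-data line after the rows ends the block
    return rows


def collect_series(texts, column="Twist"):
    """Map each (step index, step label) to its values across all frames."""
    series = {}
    for text in texts:
        rows = parse_step_parameters(text)
        for index, (label, params) in enumerate(rows, start=1):
            series.setdefault((index, label), []).append(params[column])
    return series


def summarize(values):
    """Return mean, sample standard deviation, and standard error of the mean."""
    n = len(values)
    mean = sum(values) / n
    if n > 1:
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    else:
        sd = 0.0
    return mean, sd, sd / math.sqrt(n)
```

To apply it to real data, read each output file into a string (for example with pathlib's Path(name).read_text()), pass the list of strings to collect_series, and call summarize on each resulting list. Rows that contain anything other than six plain numbers are skipped by this sketch, so such cases would need separate handling.
<imports>
</imports>
<test>
sample = """Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 CG/CG     -0.47      0.25      2.95     -3.89      5.88     26.83
   2 GC/GC     -0.38      1.12      3.63     -1.47    -11.72     51.06
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ave.     -0.07     -0.13      3.31     -1.60      2.10     33.94
      s.d.      1.10      0.82      0.28      5.18      9.23      7.71
"""
rows = parse_step_parameters(sample)
assert len(rows) == 2
assert rows[0][0] == "CG/CG"
assert rows[1][1]["Twist"] == 51.06
assert rows[1][1]["Roll"] == -11.72
series = collect_series([sample, sample])
assert series[(2, "GC/GC")] == [51.06, 51.06]
mean, sd, sem = summarize([1.0, 2.0, 3.0])
assert mean == 2.0
assert abs(sd - 1.0) < 1e-12
assert abs(sem - 1.0 / math.sqrt(3)) < 1e-12
assert summarize([5.0]) == (5.0, 0.0, 0.0)
</test>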


Hope this will help

Sincerely
Aneesh

xiangjun
Re: script for extracting data from 3DNA output file
« Reply #5 on: January 06, 2011, 10:12:29 pm »
Quote
They are stored in 100 separate PDBs.
Is this the norm? I cannot imagine that an MD simulation with thousands of snapshots ends up with thousands of PDB files. Anyway, are your 100 PDBs all stored in one directory? Do the PDB file names share a specific pattern?

Additionally, are you familiar with R or Matlab/Octave? In my mind, the script to extract 3DNA output parameters would ideally write a tab-delimited data table that can be easily fed into commonly available tools to calculate statistics such as the mean and standard deviation.
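
As a rough idea of what such a table could look like, here is a small Python sketch using the standard csv module. The function name write_table, the "frame" column, and the column labels are hypothetical choices, and the second data row holds made-up placeholder numbers purely to show the layout: one row per snapshot, one column per base-pair step.

```python
import csv
import io


def write_table(series, step_labels, out):
    """Write one row per frame and one column per base-pair step, tab-delimited.

    series: list of per-frame lists, each holding one value per step.
    step_labels: column names, one per step (hypothetical naming scheme).
    out: any writable text stream (an open file or io.StringIO).
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["frame"] + list(step_labels))
    for frame, values in enumerate(series, start=1):
        writer.writerow([frame] + [f"{v:.2f}" for v in values])


# Frame 1 uses the Twist values of the first three steps from the excerpt
# earlier in this thread; frame 2 is a placeholder with invented numbers.
buffer = io.StringIO()
write_table(
    [[26.83, 51.06, 22.37], [27.10, 49.80, 23.05]],
    ["1_CG/CG", "2_GC/GC", "3_CG/CG"],
    buffer,
)
table = buffer.getvalue()
print(table, end="")
```

Writing to a real file only requires passing open("twist.tsv", "w", newline="") instead of the StringIO buffer (the file name is again just an example). A table in this shape keeps the per-step mean and standard deviation a one-line column operation in R or Octave.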

I am pretty occupied with my job right now, but I will try to "spare" some time to come up with a preliminary script to get you started (hopefully by the end of next week).

Xiang-Jun

aneeshcna
Re: script for extracting data from 3DNA output file
« Reply #6 on: January 07, 2011, 04:18:58 am »
Dear Xiang-Jun,
             I can convert each snapshot into a PDB file. All the PDBs are stored in the same directory, and their names share a specific pattern, like 3dna_1.pdb, 3dna_2.pdb, etc.
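
One detail with names following this pattern: a plain alphabetical sort would place 3dna_10.pdb before 3dna_2.pdb, so the frames need to be ordered by their number to get a proper time series. A small Python sketch, assuming exactly the naming pattern above; the function name frames_in_order is hypothetical:

```python
import re
from pathlib import Path

# Matches names such as 3dna_1.pdb or 3dna_42.pdb and captures the number.
FRAME = re.compile(r"^3dna_(\d+)\.pdb$")


def frames_in_order(directory):
    """Return (frame_number, path) pairs sorted numerically, not alphabetically."""
    found = []
    for path in Path(directory).iterdir():
        match = FRAME.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort on the integer frame number only; other files are ignored above.
    return sorted(found, key=lambda pair: pair[0])
```

Calling frames_in_order(".") in the directory holding the snapshots yields the files in frame order, ready to be processed one by one.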

            Is there any other way you have in mind to analyze the large number of 3DNA outputs generated for MD trajectories?

Sincerely
Aneesh

temizna
Re: script for extracting data from 3DNA output file
« Reply #7 on: January 07, 2011, 01:46:58 pm »
Hello Aneesh,
As Xiang-Jun suggested, you can definitely use Curves+ (which is free). Curves+ (and its analysis program canal) has built-in support for Amber MD trajectories.
Using canal, you can get time series files for each base-pair or helical parameter, such as twist, slide, and shift. The output files are arranged in columns corresponding to individual base-pair steps.

Or you can use a simple Python script that I have posted here for MD trajectory analysis. My script only extracts the time series of the parameters.

Search the forum, you should be able to find my post.


HTH,
Alpay

xiangjun
Re: script for extracting data from 3DNA output file
« Reply #8 on: January 07, 2011, 09:26:28 pm »
Hi Aneesh,

In addition to Alpay's reply above, did you also notice his most recent post, "A modified 3DNA parser for MD trajectory analysis", in the Users' contributions section? Please give it a try and report back on your experience.

I will try to come up with a script that hopefully streamlines the process of extracting 3DNA output from MD trajectory analysis. Alpay's parser may well serve as a starting point.

HTH,

Xiang-Jun

aneeshcna
Re: script for extracting data from 3DNA output file
« Reply #9 on: January 12, 2011, 12:31:47 am »
Dear Alpay and Xiang-Jun,
                     Thanks for your reply. I will definitely go through the suggestions you made and will update you about it.

Thanks once again

Sincerely
Aneesh

xiangjun
Re: script for extracting data from 3DNA output file
« Reply #10 on: January 19, 2011, 12:16:10 am »
Hi Aneesh,

Finally, I've come up with something to present  :D  See my post "Ruby scripts for the analysis of MD simulation trajectories". Please post your follow-ups there.

Cheers!

Xiang-Jun

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University