Messages - temizna

1
MD simulations / Re: Ruby scripts / where is output file?
« on: April 06, 2011, 11:04:54 am »
You may have a path issue. Try running like this:
term> ruby x3dna_md.rb -b bpfile.dat -e sample_md.pdb -o my.out

Make sure you are calling the correct ruby from the correct place and that the X3DNA environment variable is set.
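A minimal sketch of the two checks mentioned above (a ruby on the PATH and the X3DNA variable). The function name `check_3dna_setup` and its messages are illustrative and not part of 3DNA.

```python
import os
import shutil


def check_3dna_setup(env=None, which=shutil.which):
    """Return a list of problems found; an empty list means the basics look fine.

    `env` and `which` are injectable only to make the check easy to test;
    by default the real environment and the real PATH lookup are used.
    """
    env = os.environ if env is None else env
    problems = []

    x3dna = env.get("X3DNA")
    if not x3dna:
        problems.append("X3DNA environment variable is not set")
    elif not os.path.isdir(x3dna):
        problems.append(f"X3DNA points to a missing directory: {x3dna}")

    # shutil.which returns None when no executable of that name is on PATH.
    if which("ruby") is None:
        problems.append("no 'ruby' executable found on PATH")

    return problems
```

Calling `check_3dna_setup()` before launching the script and printing whatever it returns should be enough to rule out the most common path problems.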
HTH,
Alpay

2
Hi Xiang-Jun

After my initial problems, the scripts are working nicely. I have only one problem/request. I noticed that the x3dna_md script dumps the results only after it has finished processing all the snapshots/models, which means it keeps a lot in memory. MD simulations tend to have thousands of snapshots that need to be processed. I can of course skip snapshots and process, for example, every 10th one, but I really do not want to throw away any information. Is it possible for the script to append to the output file after processing each model?
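To illustrate the request, here is a rough Python sketch of the append-per-model pattern. It is not how x3dna_md.rb is written; `analyze_one` is a hypothetical stand-in for whatever computes the parameters of one model.

```python
def append_model_result(out_path, model_number, values):
    """Append one tab-separated line for a single model, then close the file."""
    with open(out_path, "a") as out:
        fields = [str(model_number)] + [f"{v:.2f}" for v in values]
        out.write("\t".join(fields) + "\n")


def process_trajectory(models, analyze_one, out_path):
    """Stream over the models and write each result as soon as it is computed.

    `models` can be any iterable (ideally a generator), so only one model
    has to be held in memory at a time.
    """
    count = 0
    for model_number, model in enumerate(models, start=1):
        append_model_result(out_path, model_number, analyze_one(model))
        count += 1
    return count
```

Because the file is opened in append mode for every model, a run that is interrupted halfway still leaves the results of all completed models on disk.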

Thanks
Alpay

3
The problem is the temp_model.out file. I removed everything from the tar file except the scripts, bpfile.dat and sample_md.pdb, then ran the script. It gave me an error. Then I ran analyze -c temp_model.inp temp_model.pdb, which created temp_model.out.
Running the ensemble option again then processed the whole trajectory. I tested this on my own trajectory with 50 snapshots and now things seem to work.


HTH,
Alpay

4
Thanks for the reply Xiang-Jun.

I am trying to process a big trajectory file. 50000 models. When I run the script, I get the following error:
rb x3dna_md.rb -b ../cf/run1/cf.dat -e ../cf/run1/cf.1.tr.pdb
        ../cf/run1/cf.1.tr.pdb: with model numbers <= 0
x3dna_md.rb:96:in `block (2 levels) in parse_base_pair_parameters': undefined method `[]' for nil:NilClass (NoMethodError)
        from x3dna_md.rb:96:in `collect'
        from x3dna_md.rb:96:in `block in parse_base_pair_parameters'
        from x3dna_md.rb:95:in `each'
        from x3dna_md.rb:95:in `each_with_index'
        from x3dna_md.rb:95:in `parse_base_pair_parameters'
        from x3dna_md.rb:203:in `block in parse_3dna_output'
        from x3dna_md.rb:200:in `open'
        from x3dna_md.rb:200:in `parse_3dna_output'
        from x3dna_md.rb:281:in `block in process_ensemble_models'
        from x3dna_md.rb:272:in `each'
        from x3dna_md.rb:272:in `process_ensemble_models'
        from x3dna_md.rb:79:in `main'
        from x3dna_md.rb:497:in `<main>'
Process model #0 / 50000
-------------------------

It seems the script is getting the correct number of models as evidenced from the presence of a model_list.dat

I will try shorter trajectories to see if the same error exists.
Alpay

5
MD simulations / Snapshots must be separated by MODEL/ENDMDL
« on: March 10, 2011, 10:50:17 am »
Dear Xiang-Jun,
I have just noticed another potential bug. Your Ruby script looks only for MODEL/ENDMDL pairs to separate the snapshots. Although this works for the sample trajectory PDB you use (which was created with GROMACS), not every visualization/simulation package creates the files the same way. For example, VMD does not use MODEL to start each snapshot and ends each one with END only. Some other programs use TER instead of END. Maybe we should start matching "ATOM     1 ", assuming there is always a first atom in each snapshot of the simulation trajectory.
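A minimal Python sketch of a more tolerant splitter, written for illustration and not taken from x3dna_md.rb. It assumes a snapshot ends at ENDMDL or END, or where a new MODEL record begins. Treating TER as a snapshot terminator is left optional because TER also separates chains within a single model.

```python
def split_snapshots(lines, ter_ends_snapshot=False):
    """Yield each snapshot as a list of its ATOM/HETATM lines.

    Works with MODEL/ENDMDL-delimited files as well as files that only put
    END (or, optionally, TER) after each snapshot.
    """
    terminators = {"ENDMDL", "END"}
    if ter_ends_snapshot:
        terminators.add("TER")

    current = []
    for line in lines:
        record = line[:6].strip()  # PDB record name lives in columns 1-6
        if record in ("ATOM", "HETATM"):
            current.append(line)
        elif record in terminators or record == "MODEL":
            # Close the pending snapshot, if any; empty ones are skipped so
            # an ENDMDL followed by END does not produce a blank snapshot.
            if current:
                yield current
                current = []
    if current:  # file ended without a terminator
        yield current
```

Passing an open file handle as `lines` keeps memory use to one snapshot at a time.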

HTH,
Alpay

6
Thanks for the changes, Xiang-Jun. I just ran the test and it looks good. Now I will move on to a real (and long) analysis run of a cruciform. I will let you know how it turns out.
Regarding BI/BII: for my most recent work (I am preparing the manuscript now), I used R to do most of the analysis after generating the parameters with Curves+. I calculated BI/BII myself using a variety of e-z (epsilon minus zeta) cut-offs. I know most people use 0, but a 1993 paper by Hartmann and Lavery suggested 70. When I looked at the distribution of e-z from my simulations, I saw that 50 gave a better separation. I ended up using 0 to be consistent with previous publications. I just wanted to let you know how BI/BII turned out for me.
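As an illustration of the cut-off idea (in Python here, not the R code used for the analysis above), a backbone step can be labelled BII when epsilon minus zeta exceeds the chosen cut-off and BI otherwise. Wrapping the difference into [-180, 180) is an assumption of this sketch, needed because torsions may be reported on either a 0..360 or a -180..180 scale.

```python
def wrap_angle(deg):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def classify_backbone(epsilon, zeta, cutoff=0.0):
    """Return 'BII' if the wrapped epsilon - zeta exceeds cutoff, else 'BI'."""
    diff = wrap_angle(epsilon - zeta)
    return "BII" if diff > cutoff else "BI"


def bii_fraction(pairs, cutoff=0.0):
    """Fraction of (epsilon, zeta) pairs classified as BII."""
    states = [classify_backbone(e, z, cutoff) for e, z in pairs]
    return states.count("BII") / len(states)
```

Running `bii_fraction` over the same trajectory with cut-offs of 0, 50 and 70 shows directly how sensitive the reported BII population is to that choice.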

Alpay

7
MD simulations / Re: average values from MD simulations
« on: March 09, 2011, 10:24:27 am »
Hi Ara

As Xiang-Jun suggested, you can use R and load your twist/roll data into R as matrices. Then, simply by using the colMeans function, you can get the simulation mean for each step as a vector and plot it.
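For readers who prefer Python to R, the same per-step average can be sketched with the standard library alone. This assumes one row per snapshot and one column per base-pair step, in a purely numeric CSV file with no header row. The function names are illustrative.

```python
import csv
from statistics import mean


def read_numeric_csv(path):
    """Read a header-less CSV file into a list of rows of floats."""
    with open(path, newline="") as handle:
        return [[float(x) for x in row] for row in csv.reader(handle) if row]


def column_means(rows):
    """Mean of every column, i.e. what colMeans returns for a matrix in R."""
    return [mean(column) for column in zip(*rows)]
```

`column_means(read_numeric_csv("twist.csv"))` would then give one mean twist per step, where `twist.csv` is a placeholder file name.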

HTH,
Alpay

8
Dear Xiang-Jun,

After I submitted my reply, I changed the line in your script so that the snapshots are labelled with numbers only. :) Still, I think it is better to have the numbers as the default. Maybe in your next version?

By distribution calculation I meant that, in addition to the means and standard deviations you already supply, it would be nice to have BI/BII distributions averaged over the trajectory and histograms of each parameter (per base pair/base-pair step and over the whole structure). I know there is no perfect analysis script/program out there, but the more the initial analysis does, the better for the end user. It also encourages people to use it more!
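A small Python sketch of the kind of histogram meant here, using fixed-width bins. The function name and the bin width are illustrative choices and not something the 3DNA scripts provide.

```python
import math
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))
```

Applied to one column of a time series, this gives the per-step distribution. Applied to all columns pooled together, it gives the distribution over the whole structure.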

One more suggestion: simulations always show end effects. A general practice among MD people is to exclude the first and last base pairs to reduce these effects in the subsequent analysis. It would be great to have this available as an option in the script.
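A sketch of such an option in Python, assuming the parameters are stored with one row per snapshot and one column per base pair. `n_trim` is a hypothetical parameter name.

```python
def trim_terminal_pairs(rows, n_trim=1):
    """Drop the first and last n_trim columns (terminal base pairs) from every row."""
    if n_trim < 0:
        raise ValueError("n_trim must be non-negative")
    return [row[n_trim:len(row) - n_trim] for row in rows]
```

With `n_trim=0` the data pass through unchanged, so the trimming can stay off by default.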

Cheers!
Alpay

9
Dear Xiang-Jun,

Thank you for your scripts! I just tested the sample set and they seem to work fine. I am glad you coded an -all option. I personally like to run these scripts once and have all the information available at the same time. In addition, this is very similar to what Curves does as well. Next step would be to incorporate some distribution calculation.

I have one small comment. Instead of printing out model_1, model_2, etc., we should just print the snapshot numbers, since an MD trajectory tends to have thousands of them.

I look forward to using these scripts in my analysis.

Cheers!
Alpay

10
MD simulations / A modified 3DNA parser for MD trajectory analysis
« on: January 07, 2011, 06:50:10 pm »
I am attaching a Python script that may be of use for trajectory analysis. I based this script on the code of Yurong's X3DNA parser.

Recipe:
First run the nmr_strs program on your trajectory (save the trajectory as one PDB file).
The Python program parses the .out files generated by nmr_strs.
> nmr_strs --pdbfile test.pdb
> csh mv.txt
     mv.txt is a script to rename files:
     you will need to rename the *.out files to "file.NUM.out", where NUM is the snapshot number from the nmr_strs output.
     Example:
     mv testNUM.out test.NUM.out

> python parse_3dna_out.py file NUMofSNAPSHOTS NUMofBP
> python parse_3dna_out.py test 20 12
usage is: python parse3dna.py file NUMofSNAPSHOTS NUMofBP
The test file contains 100 ps snapshots from a 2 nanosecond simulation trajectory of a DNA hairpin. (20 snapshots)
NUMofSNAPSHOTS is the total number of snapshots read in nmr_strs program
NUMofBP is the number of base pairs identified for the trajectory.

This will create csv files along the trajectory for each base pair (as columns excluding the end base pair where the values are 0 or not calculated)

You can open the CSV files with Excel or any other program.
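The renaming step in the recipe can also be sketched in Python instead of a csh script. This assumes the nmr_strs output files are named exactly like testNUM.out (prefix followed directly by digits), as in the example above. The function names are illustrative.

```python
import re
from pathlib import Path


def plan_renames(filenames, prefix):
    """Return (old, new) name pairs turning prefixNUM.out into prefix.NUM.out."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.out")
    plan = []
    for name in filenames:
        match = pattern.fullmatch(name)
        if match:  # files already renamed, or unrelated files, are left alone
            plan.append((name, f"{prefix}.{match.group(1)}.out"))
    return plan


def rename_outputs(directory, prefix):
    """Apply the renames inside `directory` and return what was renamed."""
    directory = Path(directory)
    names = sorted(p.name for p in directory.iterdir())
    plan = plan_renames(names, prefix)
    for old, new in plan:
        (directory / old).rename(directory / new)
    return plan
```

Printing the result of `plan_renames` first is a cheap dry run before touching any files.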

Hope this helps,
Alpay

11
Hello Aneesh,
As Xiang-Jun suggested, you can definitely use Curves+ (which is free). Curves+ (and its analysis program canal) has built-in support for AMBER MD trajectories.
Using canal, you can get time series files for each base-pair or helical parameter, such as twist, slide, shift, etc. The output files are arranged in columns corresponding to individual base-pair steps.

Or you can use a simple python script that I have posted here for MD trajectory analysis. My script only extracts the time series of the parameters.

Search the forum, you should be able to find my post.


HTH,
Alpay

12
General discussions (Q&As) / Re: Missing Groove Measurement
« on: December 13, 2010, 01:49:26 pm »
Then you need to parse the .out files to get the time series info for each parameter you want to analyze further. 3DNA does not output time series data. Maybe in the future! :)

13
Assuming your 20mer and 30mer structures resemble an average B-DNA conformation, you can build the 30mer using 3DNA. Then superimpose the 20mer NMR structure on the 30mer using VMD or UCSF Chimera. You can use the sugar-phosphate backbone for the RMSD fit.

HTH,
Have fun!

Alpay

14
General discussions (Q&As) / Re: triplex building
« on: September 13, 2010, 09:18:52 am »
Thanks for the suggestion. I will definitely try that this week.

Alpay

15
General discussions (Q&As) / Re: single-stranded base stacking
« on: September 10, 2010, 02:17:31 pm »
Hi Liu,

If you go to the web3DNA site and enter a single-stranded DNA/RNA PDB file such as 1S40 (a single-stranded DNA/protein complex), you will see that 3DNA can analyze a single-stranded nucleic acid structure. Also check Lu's Nature Methods paper for an example of how to analyze/make figures of base stacking.

Have fun,
Alpay

16
General discussions (Q&As) / Re: triplex building
« on: September 09, 2010, 11:05:47 am »
You're welcome. The script is a simple one that reads the output of your nmr_strs script. I also have a version of nmr_strs where I use anyhelix instead of analyze, just to see if there is any difference in the outputs.

Going back to the triplex: by regular G.G and A.A Hoogsteen bps, I meant idealized H-bonding with the bases lying in the same plane, as in the schematic representation on the wiki page for Hoogsteen base pairs.
My sequence is basically a stretch of GAG repeats (10 of them) where half the repeats loop around and form the antiparallel triplex like this:
3'-CTCCTCCTCCTCCTC
5'-GAGGAGGAGGAGGAG-G
       3'-GAGGAGGAGGAG-GA

where the repeat between the dashes represent a loop.

The 135D and 136D PDB structures do have the GGC triplex with a GG base-pair step similar to my repeat. Maybe that is a good place to start.

A simple way I can think of is: build a B-DNA helix of 5 GAG repeats, copy the file, separate the strands in one copy, and then dock the single strand onto the double helix. I am going to be working this angle as well (as soon as I get my hands on decent docking software :) ).

Thanks again for your help. I appreciate it.
Alpay

17
General discussions (Q&As) / Re: triplex building
« on: September 08, 2010, 09:58:49 am »
Hi Xiang-Jun,

Thanks for the fast reply. Unfortunately, I only have sequence info that is predicted to form a triplex, and no structure information at all! I am assuming that the triplex will form regular W-C bps in the B-DNA double strand and regular Hoogsteen bps with the single strand. I am planning to perform simulations to judge the stability of the triplex and the effects of mutations on it. So, it is model building from scratch. Unfortunately, NAB is not straightforward, and even the examples do not seem to work.

Anyway, thanks again for your help and thanks for a really nice software package.
Alpay

18
General discussions (Q&As) / Re: analysis along the md trajectories
« on: September 01, 2010, 05:47:28 pm »
I am attaching a Python script that may be of use. I based this script on the code of Yurong's X3DNA parser.


First run the nmr_strs program on your trajectory (save the trajectory as one PDB file).
The Python program parses the .out files generated by nmr_strs.

usage is: python parse3dna.py file NUMofSNAPSHOTS-1 NUMofBP
you will need to rename the *.out files to "file.NUM.out" where NUM is the snapshot number from nmr_strs output.
NUMofSNAPSHOTS is the total number of snapshots read in nmr_strs program
NUMofBP is the number of base pairs identified for the trajectory.

This will create csv files along the trajectory for each base pair.

Hope it helps.

Alpay

19
General discussions (Q&As) / triplex building
« on: September 01, 2010, 05:23:15 pm »
Hello everyone!

I am trying to build a triplex DNA structure in the form of Purine-Purine-Pyrimidine (PPY) repeats (i.e. GGC and AAT, with GC and AT Watson-Crick and GG and AA Hoogsteen base pairs).
The fiber option in 3DNA gives YYP triplexes. I am willing to construct new parameter files; I just need to know how to go about it. What kind of files do I need to create, and how do I integrate them with 3DNA?

Thanks

Alpay


Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University