Author Topic: issues with x3dna_ensemble  (Read 61224 times)

Offline shynomat

issues with x3dna_ensemble
« on: May 30, 2013, 10:48:32 am »
Hi all,
I need some help. I am trying to use x3dna_ensemble to calculate the inter-base-pair step parameters, following the steps described by Xiang-Jun in this post: http://forum.x3dna.org/md-simulations/how-to-properly-use-x3dna_ensemble/
This is what I have done so far:
1. I went into the directory
     X3DNA/x3dna-v2.1beta/examples/ensemble/md
2. Issued the command
     x3dna_ensemble analyze -h
but it gives me the following message:
/usr/bin/env: ruby: No such file or directory
I also tried typing:
~/X3DNA/x3dna-v2.1beta/bin/x3dna_ensemble analyze -h
but I still got the same message.

In my .bashrc file, I added the following two lines when I first installed 3DNA (a few months ago):
export X3DNA=/home/shyno/X3DNA/x3dna-v2.1beta
export PATH=$PATH:$X3DNA/bin

thanks,
Shyno

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #1 on: May 30, 2013, 02:26:18 pm »
Quote
/usr/bin/env: ruby: No such file or directory
It seems you do not have Ruby installed on your machine. What happens if you issue the following command:

Code:
ruby
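
For anyone hitting the same error: the message comes from /usr/bin/env failing to find a ruby executable on the PATH. As a rough cross-check, the same lookup can be sketched in Python with the standard-library shutil.which. The function names below are made up for illustration and are not part of 3DNA.

```python
import shutil


def find_on_path(program):
    """Return the full path of `program` if it is on PATH, else None."""
    return shutil.which(program)


def diagnose(program="ruby"):
    """Build a one-line report about whether `program` can be located."""
    location = find_on_path(program)
    if location is None:
        return (f"{program}: not found on PATH, so scripts launched via "
                f"'/usr/bin/env {program}' cannot start")
    return f"{program}: found at {location}"


if __name__ == "__main__":
    print(diagnose("ruby"))
```

If the report says "not found", installing Ruby (or adding its directory to PATH) should be the first thing to try.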
Xiang-Jun

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #2 on: June 03, 2013, 10:19:41 am »
Hello Xiang-Jun,
Thanks so much for your reply. You are right, I don't have Ruby installed on my system. I am trying to install it, but I am running into some issues.

thanks
Shyno

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #3 on: June 03, 2013, 11:39:33 am »
Hello Xiang-Jun,
I was able to install Ruby. I am trying to understand the example. It would be great if you could answer the following questions:
1. When I run the command "find_pair sample_md0.pdb bpfile.dat", the output file bpfile.dat is different from the one given initially. For example, this file has 13 base pairs as opposed to 12 base pairs in the initial file. I noticed you said this is OK in your post, which I referred to in the first message.
2. I am assuming sample_md0.pdb is generated from the initial frame of the trajectory. For example, I have a trajectory consisting of 5000 frames and I am analysing all frames except the first 999. So in this case, the file equivalent to sample_md0.pdb will be the PDB file corresponding to frame 1000.
3. Now for the next step,
x3dna_ensemble analyze -b bpfile.dat -p 'pdbdir/model_*.pdb' -o ensemble.out
Are all the model_*.pdb files in pdbdir PDB files created for different frames (or timesteps)?
I am assuming this step also works for just one frame. Since my system contains water and other ions, I am writing a PDB file for each frame, selecting just the DNA.

thanks in advance for your help,

best
Shyno

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #4 on: June 03, 2013, 12:32:07 pm »
Glad to see you have Ruby installed.

Quote
1. When I run the command "find_pair sample_md0.pdb bpfile.dat", the output file bpfile.dat is different from the one given initially. For example, this file has 13 base pairs as opposed to 12 base pairs in the initial file. I noticed you said this is OK in your post, which I referred to in the first message.
That's understandable. The point of creating a bpfile explicitly is to provide flexibility as to which bps to analyze. If you run find_pair on each frame of an MD trajectory, the identified bp info may differ from frame to frame.

Quote
2. I am assuming sample_md0.pdb is generated from the initial frame of the trajectory. For example, I have a trajectory consisting of 5000 frames and I am analysing all frames except the first 999. So in this case, the file equivalent to sample_md0.pdb will be the PDB file corresponding to frame 1000.
The file 'sample_md0.pdb' is provided by a 3DNA user, and I've used it as a test case in developing the x3dna_ensemble script. The ensemble file contains 21 models (numbered 0,1,..20) delineated with MODEL/ENDMDL in PDB format. The content of the file is entirely up to you. The x3dna_ensemble script will analyze each model sequentially.
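
To make the MODEL/ENDMDL layout concrete, here is a minimal Python sketch that splits such an ensemble into per-model blocks. It only illustrates the file layout and is not how x3dna_ensemble works internally; the function name split_models and the two-model sample text are hypothetical.

```python
def split_models(pdb_text):
    """Split PDB-format text into (model_number, lines) blocks.

    Blocks are delimited by MODEL/ENDMDL records; lines outside any
    MODEL ... ENDMDL pair are ignored.
    """
    models = []
    number, lines = None, None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            fields = line[6:].split()
            # Fall back to a running count if the serial number is absent.
            number = int(fields[0]) if fields else len(models)
            lines = []
        elif record == "ENDMDL":
            if lines is not None:
                models.append((number, lines))
            number, lines = None, None
        elif lines is not None:
            lines.append(line)
    return models


sample = """\
MODEL        0
ATOM      1  P     G A   1       0.000   0.000   0.000
ENDMDL
MODEL        1
ATOM      1  P     G A   1       1.000   0.000   0.000
ENDMDL
"""

if __name__ == "__main__":
    for number, lines in split_models(sample):
        print(f"model {number}: {len(lines)} coordinate line(s)")
```

Each block could then be written to its own file if one PDB file per frame is preferred.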

Quote
3. Now for the next step,
x3dna_ensemble analyze -b bpfile.dat -p 'pdbdir/model_*.pdb' -o ensemble.out
Are all the model_*.pdb files in pdbdir PDB files created for different frames (or timesteps)?
I am assuming this step also works for just one frame. Since my system contains water and other ions, I am writing a PDB file for each frame, selecting just the DNA.
You can put anything there. The post "Using Glob with Directories" may be helpful to you.
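
As a rough illustration of how a wildcard pattern selects files, the sketch below applies Python's fnmatch to a made-up list of file names. x3dna_ensemble is a Ruby script, so its own matching may differ in details; this assumes ordinary shell-style '*' behaviour, and every file name here is invented.

```python
import fnmatch


def matching_files(pattern, names):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


# Hypothetical directory listing, for illustration only.
names = [
    "pdbdir/model_0001.pdb",
    "pdbdir/model_0002.pdb",
    "pdbdir/full_system_0001.pdb",
    "pdbdir/model_0001.pdb.bak",
    "sel.pdb",
]

if __name__ == "__main__":
    # A wildcard pattern picks up every per-frame file with that prefix.
    print(matching_files("pdbdir/model_*.pdb", names))
    # A pattern without wildcards matches at most one file.
    print(matching_files("sel.pdb", names))
```

Quoting the pattern on the command line, as in the example command, keeps the shell from expanding it before the script sees it.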

HTH

Xiang-Jun

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #5 on: June 03, 2013, 06:11:51 pm »
Hello Xiang-Jun,
Thanks so much for your detailed message. This is very helpful. I have a few more questions.
1. I am attaching the bpfile.dat. I notice that the methylated cytosine (M) is recognized as just C. Is there a way to change this? Should I modify any of the files in x3dna-v2.1beta/?
For example, some time back I added this line to baselist.dat:
M C
I was also using Curves+ to do the analysis. The Curves+ output file 'sel.lis' keeps the residue name as M.
I assume this doesn't make any difference to the values of the step parameters.
2. For step parameters, the values listed in ensemble.out are taken from bp_step.par, correct?
I ask because I would like to extract the base-pair steps along with the values. I didn't find the base pairs listed explicitly in ensemble.out.
When some of the bp steps are missing, it is unclear which value is associated with which bp step.
3. Also, what is the difference between major_gw_pp and major_gw_refined, and between the corresponding files for the minor groove? I am interested in extracting the minor and major groove widths.
Please see the attached files.

thanks again,
Shyno

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #6 on: June 03, 2013, 09:57:51 pm »
  • The mapping between bases and their corresponding reference frame files is controlled by $X3DNA/config/baselist.dat, with modified bases in lower case. By default, the reference frames of a canonical base and its modified counterparts are identical, so putting 'M C' or 'M c' (preferred) in baselist.dat won't make a numerical difference in the calculated parameters.

    Note that in the 3DNA analyze output, the top section provides the full details needed to uniquely identify each nucleotide, so later sections use only a one-character shorthand.

    As noted in my post "Curves+ vs 3DNA", Curves+ has unique features unavailable from 3DNA. So users are encouraged to pick their favorite one or try both -- see "Building a bridge between Curves+ and 3DNA".
  • The step parameters in ensemble.out are not taken from bp_step.par, but they have strict correspondence; the serial number is used in ensemble.out for simplicity. Here again, you may find Curves+ more convenient.
  • For the definition of groove widths used in 3DNA, please see: M. A. El Hassan and C. R. Calladine (1998). "Two Distinct Modes of Protein-induced Bending in DNA." J. Mol. Biol., v282, pp331-343.
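
To illustrate the first point, here is a small Python sketch of the upper-case/lower-case convention. It assumes each mapping line has just two whitespace-separated columns (residue name, one-letter code), as in the 'M c' example; the real baselist.dat may carry more than that, and treating '#' as a comment marker is my assumption. The function names are hypothetical.

```python
def parse_base_map(text):
    """Parse 'RESIDUE CODE' lines into a dict, skipping blanks and comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed comment syntax
        fields = line.split()
        if len(fields) == 2:
            residue, code = fields
            mapping[residue] = code
    return mapping


def describe(residue, mapping):
    """Say whether a residue maps to a canonical or a modified base."""
    code = mapping.get(residue)
    if code is None:
        return f"{residue}: not listed"
    kind = "modified" if code.islower() else "canonical"
    return f"{residue} -> {code} ({kind} {code.upper()})"


if __name__ == "__main__":
    base_map = parse_base_map("C C\nM c\n")
    print(describe("C", base_map))
    print(describe("M", base_map))
```

With 'M c', the residue is still treated as a cytosine for the geometry, but the lower-case letter flags it as modified in the output.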

If you are interested in knowing 3DNA better, please try to reproduce the recipes reported in the 2008 3DNA Nature Protocols paper. Then the output of x3dna_ensemble would make more sense.

HTH,

Xiang-Jun
« Last Edit: June 03, 2013, 10:02:07 pm by xiangjun »

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #7 on: June 04, 2013, 11:00:50 am »
Dear Dr. Xiang-Jun,
Thanks again for your detailed message.
Regarding point #2, when you say "the serial number is used in ensemble.out for simplicity", I am a little confused, in that I don't find a serial number corresponding to the bp steps. For example, if I am interested in the shift parameter, the section that has this info in ensemble.out is:

<shift>   # with 10 data columns
sel 1.29  -1.11 0.78  -0.44 0.00  0.68  0.40  -0.64 -0.36 0.93
</shift>
Since my system is 12 bp long, it should have 11 inter-base-pair step values. But from the above information I cannot tell which bp step is missing. Is there a way to identify this?
When I look at the file cf_7methods.par, I see that the above shift values are the same as those given under the title "Curves base-pair step parameters".
Does this mean these values are calculated by Curves? Can I extract the bp steps from this file?

thanks again,
Shyno

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #8 on: June 04, 2013, 11:49:44 am »
Thanks for your feedback. There are quite a few options in x3dna_ensemble analyze to make it flexible. Depending on your specific run, the output file may contain slightly different info. In your example for the shift parameter, it appears you have 10 structures (models/frames) corresponding to the 10 data columns.

Regarding the file cf_7methods.par, it contains 3DNA's implementation of seven analysis methods using the same base reference frame. The Curves part may not be identical to the output of Curves+, due to subtle differences in reference frames. But the two should be directly comparable for Watson-Crick base pairs and steps. You can safely ignore cf_7methods.par and either go with Curves+ or use the 3DNA defaults.

As always, please provide a reproducible example to make your point unambiguous.

Xiang-Jun

 

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #9 on: June 04, 2013, 01:19:55 pm »
Thanks again for your reply. But I am only considering one frame. Please see the attached PDB file, bpfile.out and ensemble.out.
I would like to know if there is a way to extract the bp steps along with the values for shift using 'x3dna_ensemble extract'.
I apologize if the previous question wasn't clear.
The following are the steps I used to create bpfile.dat and ensemble.out:
1. find_pair sel.pdb bpfile.dat
2. x3dna_ensemble analyze -b bpfile.dat -p sel.pdb -o ensemble.out
thanks,
Shyno

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #10 on: June 04, 2013, 04:49:17 pm »
Hi Shyno,

Thanks for providing a detailed example with corresponding data files. Now things are clear.

If you run:
Code:
find_pair sel.pdb bpfile.dat
analyze bpfile.dat
x3dna_ensemble analyze -b bpfile.dat -p sel.pdb -o ensemble.out
x3dna_ensemble extract -p shift

The output file 'sel.out' contains the following section:
****************************************************************************
Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 GC/GC      1.29      0.44      2.62      6.32     -3.86     31.01
   2 CG/CG     -1.11     -0.04      3.86     -6.40     14.10     35.74
   3 GA/TC      0.78     -0.30      3.57     -0.97      2.92     36.20
   4 AA/TT     -0.44     -0.77      3.48     -0.74      2.14     35.20
   5 Ac/GT      0.00      0.04      3.09      2.14      0.76     39.12
   6 cG/CG      0.68      0.41      3.21     -2.61     17.38     29.80
   7 GC/GC      0.40     -0.35      3.28     -0.47      0.89     32.94
   8 CG/CG     -0.64      0.39      3.04      3.76      9.39     29.46
   9 GC/GC     -0.36     -0.89      3.40     -6.53      1.69     37.52
  10 CG/CG      0.93     -0.64      3.45      2.40     10.67     32.12
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ave.      0.15     -0.17      3.30     -0.31      5.61     33.91
      s.d.      0.78      0.50      0.34      4.16      6.84      3.33

The shift parameters (extracted by command: x3dna_ensemble extract -p shift) are:
Code:
sel 1.29 -1.11 0.78 -0.44 0.00 0.68 0.40 -0.64 -0.36 0.93

Here the first column ('sel') is the name of the PDB file (without extension), and the next 10 columns are the shift values for the 10 dinucleotide steps. Since your pattern is very specific, it matches only one PDB file ('sel.pdb').
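
If step labels are wanted next to the extracted values, one possible approach is to read the labels from the "Local base-pair step parameters" table in sel.out and pair them, in order, with the columns of the extracted row. The Python sketch below does that for the data shown above. It is not a 3DNA feature, the function names are hypothetical, and the parsing assumes the table layout shown in this post (serial, step label, six numbers).

```python
def parse_step_labels(section_text):
    """Return [(serial, step_label), ...] from the step-parameter table."""
    rows = []
    for line in section_text.splitlines():
        fields = line.split()
        # Data rows: serial number, label such as GC/GC, then six values.
        if len(fields) == 8 and fields[0].isdigit() and "/" in fields[1]:
            rows.append((int(fields[0]), fields[1]))
    return rows


def label_extracted_row(step_labels, extracted_row):
    """Pair one extracted row (name followed by values) with step labels."""
    name, *values = extracted_row.split()
    if len(values) != len(step_labels):
        raise ValueError("number of values does not match number of steps")
    return name, [(serial, label, float(v))
                  for (serial, label), v in zip(step_labels, values)]


table = """\
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 GC/GC      1.29      0.44      2.62      6.32     -3.86     31.01
   2 CG/CG     -1.11     -0.04      3.86     -6.40     14.10     35.74
   3 GA/TC      0.78     -0.30      3.57     -0.97      2.92     36.20
   4 AA/TT     -0.44     -0.77      3.48     -0.74      2.14     35.20
   5 Ac/GT      0.00      0.04      3.09      2.14      0.76     39.12
   6 cG/CG      0.68      0.41      3.21     -2.61     17.38     29.80
   7 GC/GC      0.40     -0.35      3.28     -0.47      0.89     32.94
   8 CG/CG     -0.64      0.39      3.04      3.76      9.39     29.46
   9 GC/GC     -0.36     -0.89      3.40     -6.53      1.69     37.52
  10 CG/CG      0.93     -0.64      3.45      2.40     10.67     32.12
      ave.      0.15     -0.17      3.30     -0.31      5.61     33.91
"""
row = "sel 1.29 -1.11 0.78 -0.44 0.00 0.68 0.40 -0.64 -0.36 0.93"

if __name__ == "__main__":
    name, labelled = label_extracted_row(parse_step_labels(table), row)
    for serial, label, shift in labelled:
        print(f"{name} {serial:2d} {label} {shift:6.2f}")
```

This only works when the same bpfile.dat was used for both runs, so that the serial numbers in the two outputs refer to the same steps.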

HTH,

Xiang-Jun
« Last Edit: June 04, 2013, 04:51:41 pm by xiangjun »

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #11 on: June 04, 2013, 10:52:47 pm »
Hello Xiang-Jun,
Thank you very much for the detailed message. I did these steps again and got the same results. So in effect I only need to do the first two steps:
find_pair sel.pdb bpfile.dat
analyze bpfile.dat

since sel.out contains all the information I need, correct? Just to make sure.

thanks,
Shyno

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #12 on: June 04, 2013, 11:47:45 pm »
Yes, for one frame, you do not need to run x3dna_ensemble at all. The find_pair/analyze command combination is sufficient.

Xiang-Jun

Offline shynomat

Re: issues with x3dna_ensemble
« Reply #13 on: June 06, 2013, 11:58:35 am »
Hello Dr. Xiang-Jun,
Thanks again for your reply. I have another question related to this post.
As shown in the sel.out file, the first bp step 'CG' is clearly missing. I was wondering if there is a way to print this step in sel.out even though it has no values associated with it. It would be very helpful for me to have all the bp steps listed every time, even if they have no values.
For example, the section for the inter-base-pair step parameters would look something like this:

   Local base-pair step parameters
    step            Shift     Slide      Rise      Tilt      Roll     Twist
   1. CG/CG  ---       
   1 GC/GC      1.29      0.44      2.62      6.32     -3.86     31.01
   2 CG/CG     -1.11     -0.04      3.86     -6.40     14.10     35.74
   3 GA/TC      0.78     -0.30      3.57     -0.97      2.92     36.20
   4 AA/TT     -0.44     -0.77      3.48     -0.74      2.14     35.20
   5 AC/GT      0.00      0.04      3.09      2.14      0.76     39.12
   6 CG/CG      0.68      0.41      3.21     -2.61     17.38     29.80
   7 GC/GC      0.40     -0.35      3.28     -0.47      0.89     32.94
   8 CG/CG     -0.64      0.39      3.04      3.76      9.39     29.46
   9 GC/GC     -0.36     -0.89      3.40     -6.53      1.69     37.52
  10 CG/CG      0.93     -0.64      3.45      2.40     10.67     32.12
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ave.      0.15     -0.17      3.30     -0.31      5.61     33.91
      s.d.      0.78      0.50      0.34      4.16      6.84      3.33


Offline shynomat

Re: issues with x3dna_ensemble
« Reply #14 on: June 06, 2013, 12:08:40 pm »
Sorry, the previous message got posted accidentally. Since my system has 12 base pairs, it should have 11 inter-base-pair steps.
In case it is not clear, this is what I would like to see in sel.out, if possible:
Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 CG/CG       ---       ---       ---       ---       ---       ---
   2 GC/GC      1.29      0.44      2.62      6.32     -3.86     31.01
   3 CG/CG     -1.11     -0.04      3.86     -6.40     14.10     35.74
   4 GA/TC      0.78     -0.30      3.57     -0.97      2.92     36.20
   5 AA/TT     -0.44     -0.77      3.48     -0.74      2.14     35.20
   6 AC/GT      0.00      0.04      3.09      2.14      0.76     39.12
   7 CG/CG      0.68      0.41      3.21     -2.61     17.38     29.80
   8 GC/GC      0.40     -0.35      3.28     -0.47      0.89     32.94
   9 CG/CG     -0.64      0.39      3.04      3.76      9.39     29.46
  10 GC/GC     -0.36     -0.89      3.40     -6.53      1.69     37.52
  11 CG/CG      0.93     -0.64      3.45      2.40     10.67     32.12
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ave.      0.15     -0.17      3.30     -0.31      5.61     33.91
      s.d.      0.78      0.50      0.34      4.16      6.84      3.33

Offline xiangjun

  • Administrator
Re: issues with x3dna_ensemble
« Reply #15 on: June 06, 2013, 01:10:27 pm »
Hi,

By default, find_pair finds only 11 base pairs in your sel.pdb structure. The nucleotides ....>A:...1_:[..C]C and ....>B:..24_:[..G]G do not form a pair, as is evident when viewing the structure in Jmol or PyMOL. If you insist on getting 3DNA output involving this bp, you can manually modify the find_pair-generated bp file as below:

sel.pdb
sel-all.out
    2         # duplex
   12         # number of base-pairs
    1    1    # explicit bp numbering/hetero atoms
    1   24
    2   23  0 #    1 | ....>A:...2_:[..G]G-----C[..C]:..23_:B<....  0.51  0.33  9.29  9.08 -3.37
    3   22  0 #    2 | ....>A:...3_:[..C]C-----G[..G]:..22_:B<....  1.01  0.90 38.28  8.76 -0.28
    4   21  0 #    3 | ....>A:...4_:[..G]G-----C[..C]:..21_:B<....  0.62  0.42  8.68  9.05 -3.12
    5   20  0 #    4 | ....>A:...5_:[..A]A-----T[..T]:..20_:B<....  0.34  0.07 24.63  9.03 -1.29
    6   19  0 #    5 | ....>A:...6_:[..A]A-----T[..T]:..19_:B<....  0.44  0.14 27.81  9.19 -2.88
    7   18  0 #    6 | ....>A:...7_:[..M]c-----G[..G]:..18_:B<....  0.60  0.39 16.09  8.90 -2.81
    8   17  0 #    7 | ....>A:...8_:[..G]G-----C[..C]:..17_:B<....  0.19  0.02 12.30  9.07 -4.15
    9   16  0 #    8 | ....>A:...9_:[..C]C-----G[..G]:..16_:B<....  0.50  0.35 15.49  9.12 -3.03
   10   15  0 #    9 | ....>A:..10_:[..G]G-----C[..C]:..15_:B<....  0.13  0.09 12.22  9.02 -4.07
   11   14  0 #   10 | ....>A:..11_:[..C]C-----G[..G]:..14_:B<....  0.49  0.40 12.81  9.03 -3.07
   12   13  0 #   11 | ....>A:..12_:[..G]G-----C[..C]:..13_:B<....  0.44  0.06 25.76  8.98 -3.15

Note the changes (shown in red in the original post): the output file name, the number of base pairs, and the added '1   24' line.
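
For repeated use, the manual edit could be scripted. The Python sketch below prepends one pair to a find_pair bp file and bumps the base-pair count. It assumes the layout shown in the example (count on line 4, first pair on line 6) and that the new pair belongs before all existing ones; the function name and its parameters are hypothetical, and choosing the right residue numbers remains up to the user.

```python
def prepend_base_pair(bp_text, index1, index2, output_name=None):
    """Insert a pair before the first listed pair and update the count."""
    lines = bp_text.splitlines()
    # Line 4: '<count>   # number of base-pairs' (assumed layout).
    count_part, _, comment = lines[3].partition("#")
    new_count = int(count_part) + 1
    lines[3] = f"{new_count:5d}         #{comment}" if comment else f"{new_count:5d}"
    # Line 2 names the analyze output file; optionally rename it.
    if output_name is not None:
        lines[1] = output_name
    # Pair lines start on line 6, after the numbering/hetero-atom flags.
    lines.insert(5, f"{index1:5d}{index2:5d}")
    return "\n".join(lines) + "\n"


original = """\
sel.pdb
sel.out
    2         # duplex
   11         # number of base-pairs
    1    1    # explicit bp numbering/hetero atoms
    2   23  0 #    1 | ....>A:...2_:[..G]G-----C[..C]:..23_:B<....
    3   22  0 #    2 | ....>A:...3_:[..C]C-----G[..G]:..22_:B<....
"""

if __name__ == "__main__":
    print(prepend_base_pair(original, 1, 24, output_name="sel-all.out"), end="")
```

The sample text lists only the first two pairs for brevity, so its count is not meant to be consistent; a real file would contain all of them. Running analyze on the modified file should then report the extra pair and the extra step.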

HTH,

Xiang-Jun

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University