Messages - xiangjun

1226
MD simulations / Re: Analysis of PDB file:query
« on: April 19, 2012, 12:18:08 pm »
Quote
now got 14 nos. of base pairs out of 15
There must be something special with the missing base pair. If you visualize the structure graphically using Jmol/PyMOL etc, I believe you won't take it as a "base pair" either. Checking the details why it is "missed" by find_pair would be an interesting exercise for those who want to understand 3DNA better.

Xiang-Jun

1227
From the information you provided, 3DNA is clearly not properly set up on your Windows system. I am puzzled as to how you could run find_pair etc.

You said you used the compiled version of 3DNA for MinGW/MSYS, so why not launch a MinGW Shell?
C:\Users\triindia>cd C:\X3DNA\3DNA bin
You have a directory named 3DNA bin? A space between 3DNA and bin?

In my MinGW Shell, this is the output from 3DNA v2.1beta setup script.
$ x3dna-v2.1beta/bin/x3dna_setup
Unknown shell: not-set -- you've to set X3DNA and PATH manually:
             o set up the X3DNA environment variable
             o add $X3DNA/bin to your command search path

Setting up 3DNA should be a straightforward process. If you have
technical problems, ask a local expert for help, or post them at
the 3DNA forum.

I installed 3DNA under the directory $HOME/x3dna-v2.1beta. So I have created a text file (named set-me-up), with the $X3DNA environment variable and the $X3DNA/bin path set explicitly as below:
Code: [Select]
export X3DNA=$HOME/x3dna-v2.1beta
export PATH=$X3DNA/bin:$PATH

Then I can run the following (note it is a dot at the beginning):
Code: [Select]
. set-me-up
and you will see the following when you run the "blocview -h" command:
~ [237] blocview -h
===========================================================================
SYNOPSIS
    blocview [OPTION]... PDBFILE
DESCRIPTION
    Generates a schematic image which combines base block representation
    with protein ribbon. The image has informative color coding for the
    nucleic acid part and is set in the "best-view" by default. Users need
    to have MolScript, Raster3D and ImageMagick properly installed on their
    system.
        -o   use original coordinates contained in the PDB data file
        -j   output image in JPG format (default to PNG)
        -t[=]RESOLUTION   create PyMOL ray-traced image at RESOLUTION
        -d   display the generated image using "display" of ImageMagick
        -b   ball and stick model with filled base ring
        -c   clean up temporary common files
        -r   only backbone P atoms + base Ring atoms of nucleic acids
        -p   set the best view based on Protein atoms
        -a   set the best view based on All atoms
        -s[=]NUM    set scale factor for the image
        -i[=]IMAGE  set image file name (default to t.png)
        -x|y|z=ANGLE  rotation around x, y, or z-axis by ANGLE degrees
        PDBFILE   a PDB data file name (other than 't.pdb')
EXAMPLES
    blocview -d -i=bdl084.png bdl084.pdb
AUTHOR
    3DNA v2.0 [June 8, 2008] (by Dr. Xiang-Jun Lu; 3dna.lu@gmail.com)
    Check URL: http://x3dna.org/ for the latest
===========================================================================
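
The setup above comes down to two conditions: X3DNA points at the installation directory, and $X3DNA/bin is on the command search path. Below is a minimal Python sketch that checks both. The function name check_x3dna_env is hypothetical and not part of 3DNA, and the warning about a space in the path only reflects the question raised above about a directory named "3DNA bin"; it is an assumption, not a documented restriction.

```python
import os


def check_x3dna_env(env, pathsep=os.pathsep):
    """Return a list of human-readable problems with the 3DNA environment.

    `env` is a mapping such as os.environ; an empty list means both
    conditions from the setup message are met.
    """
    problems = []
    x3dna = env.get("X3DNA", "")
    if not x3dna:
        problems.append("X3DNA is not set")
        return problems
    if " " in x3dna:
        # Assumption: a space in the path is suspicious, not necessarily fatal.
        problems.append("X3DNA path contains a space: %r" % x3dna)
    # Normalize separators so /a/b/bin and /a/b/bin/ compare equal.
    bin_dir = x3dna.replace("\\", "/").rstrip("/") + "/bin"
    entries = [p.replace("\\", "/").rstrip("/")
               for p in env.get("PATH", "").split(pathsep)]
    if bin_dir not in entries:
        problems.append("%s is not on PATH" % bin_dir)
    return problems
```

Calling check_x3dna_env(os.environ) from the same shell in which find_pair or blocview fails should show which of the two settings is missing.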

If the instructions still do not make sense to you, your best bet may be to consult a local expert for setting up 3DNA. Alternatively, you can try w3DNA: the web-interface to commonly used functionality of 3DNA.

Xiang-Jun

1228
MD simulations / Re: Analysis of PDB file:query
« on: April 18, 2012, 09:33:32 am »
Okay, let's check step-by-step how find_pair is working for your two attached structures.
  • The output for 30ns@mod_2.pdb is as you expected.
    find_pair 30ns@mod_2.pdb 30ns@mod_2.bps
    more 30ns@mod_2.bps
    # the output is as below
    30ns@mod_2.pdb
    30ns@mod_2.out
        2         # duplex
       14         # number of base-pairs
        1    1    # explicit bp numbering/hetero atoms
        1   31  0 #    1 | ....>-:...1_:[DA5]a-**--t[TPN]:..31_:-<....  4.71  1.82 45.14  7.30  7.61
        2   30  0 #    2 | ....>-:...2_:[.DA]A-----t[TPN]:..30_:-<....  0.51  0.30 13.22  9.70 -3.24
        3   29  0 #    3 | ....>-:...3_:[.DT]T-----a[APN]:..29_:-<....  0.73  0.73  8.66  9.13 -2.38
        4   28  0 #    4 | ....>-:...4_:[.DT]T-----a[APN]:..28_:-<....  0.19  0.09 13.10  9.21 -3.98
        5   27  0 #    5 | ....>-:...5_:[.DT]T-----a[APN]:..27_:-<....  1.03  1.03 17.68  8.93  0.96
        6   26  0 #    6 | ....>-:...6_:[.DT]T-----a[APN]:..26_:-<....  0.23  0.22 16.23  9.13 -3.52
        7   25  0 #    7 | ....>-:...7_:[.DT]T-----a[APN]:..25_:-<....  0.37  0.36 10.91  8.99 -3.36
        8   24  0 #    8 | ....>-:...8_:[.DT]T-----a[APN]:..24_:-<....  0.30  0.01  9.28  9.21 -4.21
        9   23  0 #    9 | ....>-:...9_:[.DT]T-----a[APN]:..23_:-<....  0.49  0.12 10.87  9.24 -3.73
       10   22  0 #   10 | ....>-:..10_:[.DT]T-----a[APN]:..22_:-<....  0.18  0.05 14.36  8.97 -4.00
       11   21  0 #   11 | ....>-:..11_:[.DA]A-----t[TPN]:..21_:-<....  0.47  0.21  1.26  9.28 -4.05
       12   20  0 #   12 | ....>-:..12_:[.DT]T-----a[APN]:..20_:-<....  0.34  0.30  5.05  9.10 -3.80
       13   19  0 #   13 | ....>-:..13_:[.DT]T-----a[APN]:..19_:-<....  0.46  0.45 22.22  8.98 -2.53
       14   18  0 #   14 | ....>-:..14_:[.DT]T-----a[APN]:..18_:-<....  0.47  0.28 25.44  9.34 -2.71
    ##### Base-pair criteria used:     4.00     0.00    15.00     2.50    65.00     4.50     7.50 [ O N]
    ##### 1 non-Watson-Crick base-pair, and 1 helix (0 isolated bps)
    ##### Helix #1 (14): 1 - 14  ***broken O3' to P[i+1] linkage***

    Using Jmol/PyMOL/RasMol to visualize the structure, one can easily verify that find_pair is behaving properly.
  • Now repeat the procedure for 30ns_nsp_mod.pdb, and you'd get the following result, which you considered "improper":
    find_pair 30ns_nsp_mod.pdb 30ns_nsp_mod.bps
    more 30ns_nsp_mod.bps
    # the output is as below
    30ns_nsp_mod.pdb
    30ns_nsp_mod.out
        2         # duplex
       10         # number of base-pairs
        1    1    # explicit bp numbering/hetero atoms
        1   31  0 #    1 | ....>-:...1_:[..A]A-**--T[..T]:..31_:-<....  6.27  0.03 22.42  9.06  6.46
        2   30  0 #    2 | ....>-:...2_:[..A]A-**--T[..T]:..30_:-<....  4.14  0.42 23.12  9.56  5.13
        3   29  0 #    3 | ....>-:...3_:[..T]T-----A[ADE]:..29_:-<....  0.13  0.01 20.45  9.45 -3.83
        4   28  0 #    4 | ....>-:...4_:[..T]T-----A[..A]:..28_:-<....  0.28  0.28 19.08  9.01 -3.21
        5   27  0 #    5 | ....>-:...5_:[..T]T-----A[..A]:..27_:-<....  0.40  0.11 18.15  9.35 -3.47
        6   26  0 #    6 | ....>-:...6_:[..T]T-----A[ADE]:..26_:-<....  0.64  0.51 27.99  9.05 -1.94
        7   25  9 #    7 x ....>-:...7_:[..T]T-----A[..A]:..25_:-<....  0.62  0.56 25.16  9.15 -2.00
       10   22  1 #    8 + ....>-:..10_:[..T]T-----A[..A]:..22_:-<....  0.64  0.53 15.83  9.01 -2.50
       13   19  0 #    9 | ....>-:..13_:[..T]T-----A[ADE]:..19_:-<....  0.48  0.19 18.51  5.17 -3.21
       14   18  0 #   10 | ....>-:..14_:[..T]T-----A[..A]:..18_:-<....  0.27  0.26 30.07  4.91 -2.71
    ##### Base-pair criteria used:     4.00     0.00    15.00     2.50    65.00     4.50     7.50 [ O N]
    ##### 2 non-Watson-Crick base-pairs, and 3 helices (1 isolated bp)
    ##### Helix #1 (7): 1 - 7  ***broken O3' to P[i+1] linkage***
    ##### Helix #2 (1): 8
    ##### Helix #3 (2): 9 - 10  ***broken O3' to P[i+1] linkage***

    As shown in the image below, find_pair is again behaving as it should for this case. Is this structure itself what you'd expect?



In my experience, whenever a user suspects that the output from find_pair is "improper", it is more than likely that the structure itself is "weird" -- if in doubt, always check your structure using a molecular graphics visualization program (Jmol/PyMOL/RasMol etc). Of course, I am constantly on the lookout for ways to refine find_pair, especially for edge cases.
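
To see at a glance which residues dropped out between the two runs, the .bps listings can be compared programmatically. The following is a minimal Python sketch; parse_bps and unpaired are hypothetical helper names, and the file layout (two file names, helix type, base-pair count, one flags line, then one line per pair) is assumed from the listings above, not taken from a format specification.

```python
def parse_bps(text):
    """Return the (strand I, strand II) residue indices listed in a .bps file.

    Layout assumed from the listings above: two file names, the helix-type
    line, the base-pair count, one flags line, then one line per base pair.
    """
    lines = text.splitlines()
    n_bp = int(lines[3].split()[0])
    pairs = []
    for line in lines[5:5 + n_bp]:
        first, second = line.split()[:2]
        pairs.append((int(first), int(second)))
    return pairs


def unpaired(pairs, first, last):
    """Strand I indices in first..last that have no partner."""
    paired = {i for i, _ in pairs}
    return [i for i in range(first, last + 1) if i not in paired]


# Abbreviated copy of 30ns_nsp_mod.bps (trailing annotations trimmed).
SAMPLE = """30ns_nsp_mod.pdb
30ns_nsp_mod.out
    2         # duplex
   10         # number of base-pairs
    1    1    # explicit bp numbering/hetero atoms
    1   31  0 #    1 | A-**--T
    2   30  0 #    2 | A-**--T
    3   29  0 #    3 | T-----A
    4   28  0 #    4 | T-----A
    5   27  0 #    5 | T-----A
    6   26  0 #    6 | T-----A
    7   25  9 #    7 x T-----A
   10   22  1 #    8 + T-----A
   13   19  0 #    9 | T-----A
   14   18  0 #   10 | T-----A
##### 2 non-Watson-Crick base-pairs, and 3 helices (1 isolated bp)
"""

pairs = parse_bps(SAMPLE)
print(len(pairs), unpaired(pairs, 1, 14))  # 10 [8, 9, 11, 12]
```

Residues 8, 9, 11 and 12 are the ones to inspect first in a molecular graphics program.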

Xiang-Jun

1229
Thanks for providing your system info (MinGW/MSYS). The error message means either you do not have 3DNA installed properly, or the script blocview is not executable. Which version of 3DNA are you using?

Do the following and report back the output verbatim:

Code: [Select]
cd $X3DNA/bin
ls -al

Xiang-Jun

1230
MD simulations / Re: Analysis of PDB file:query
« on: April 17, 2012, 11:04:14 am »
Quote
I found the output files were not proper
What do you mean "not proper"? What would you expect the output to be? I tried your attached PDB files, and found 3DNA is doing its job.

Xiang-Jun

1231
MD simulations / Re: How to properly use x3dna_ensemble?
« on: April 16, 2012, 05:08:57 pm »
Hi Nikolay,

Check the updated version dated 2012apr16 on the download page, and let me know if you have any problem.

Xiang-Jun

1232
Thanks for using 3DNA and for posting your question on the forum.

It is a bit complicated to set up the environment to run blocview (especially on Windows), for two reasons:
You may be able to run blocview using Cygwin on Windows, but I've never tried it myself. Any comments from other users? In my experience, it is straightforward to use blocview on Linux or Mac OS X where such third-party components are readily available. In future releases of 3DNA (presumably starting from v2.2), blocview would alternatively take advantage of PyMOL.

Now back to your question, are you using MinGW/MSYS or Cygwin? Please give details about the error message, using copy-and-paste.

Xiang-Jun

1233
MD simulations / Re: Analysis of PDB file
« on: April 07, 2012, 09:21:16 am »
Hi,

The simple Perl script manalyze was introduced around v1.5 for the analysis of "multiple" structures. Over the years, I've not been aware of its usage: your question is the first one. As of v2.1beta, I am migrating from Perl to Ruby as the scripting language for 3DNA. Now manalyze and most other not widely used Perl scripts have been moved out of the $X3DNA/bin/ directory into $X3DNA/perl_scripts/ -- they are obsolete, but kept there for the record.

As of 3DNA v2.1, the Ruby script "x3dna_ensemble" should be used for the analysis of NMR ensembles or MD simulation trajectories. Type -h for detailed info, and run the examples to get familiar with its usage/functionality.

Your attached PDB file contains only one model, so you can use the find_pair/analyze combination to calculate 3DNA parameters. Note, however, that your structure has poor geometry, as shown in the image below. As a rule, one should always perform a "sanity" check to ensure sensible results.

HTH,

Xiang-Jun

1234
No, I'm not saying or intending to imply that ssDNA and ssRNA are energetically equivalent. However, from a practical perspective, first, it'd be better to have something than nothing. Second, a prediction is just a prediction; there is no guarantee that the "predicted" structure is "real" or even "meaningful": just think about how many predictions you can get from the same program with different settings, not to mention the different software tools available. Third, the recent article "RNA-Puzzles: A CASP-like evaluation of RNA three-dimensional structure prediction" [April 2012 [18 (4)] issue of the RNA journal] serves as proof of my point. In theory (and in practice), software can easily "predict" many more protein/DNA/RNA structures than the PDB has accumulated over the past 40 years.

Aside from all the arguments above, 3DNA does not deal with "energetics" at all (in its current version, at least) -- it is purely geometrical.

Xiang-Jun

1235
MD simulations / Re: How to properly use x3dna_ensemble?
« on: April 05, 2012, 01:29:28 pm »
Quote
BUT, when I do "x3dna_ensamble analyze" for multiple snapshots, I do not see helical axis vectors in the output file.
The information you need, "Position (Px, Py, Pz) and local helical axis vector (Hx, Hy, Hz) for each dinucleotide step",  is not parsed in the current version of x3dna_ensemble. I will get it added soon and keep you updated.

Xiang-Jun
 

1236
See the thread "changing secondary structure to tertiary structure of ssDNA".

Quote
For RNA we have some software such as Rosetta to predict 3D structure
Would it make sense to start from the same secondary structure you have, but simply change Ts to Us, and predict a 3D RNA structure using Rosetta etc.? Then you can delete the O2' atoms, and mutate Us to Ts with mutate_bases, which preserves the backbone conformation and base-pairing geometry.
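
The two bookkeeping steps in this suggestion (T to U before prediction, removal of O2' atoms afterwards) are simple text operations. Here is a minimal Python sketch; both function names are hypothetical, the atom name is read from columns 13-16 of ATOM/HETATM records per the PDB format, and the older "O2*" spelling is accepted as well. The U to T mutation itself is left to mutate_bases and is not shown.

```python
def dna_to_rna_sequence(seq):
    """Replace T with U so an RNA predictor accepts the sequence."""
    return seq.upper().replace("T", "U")


def strip_o2prime(pdb_text):
    """Drop ATOM/HETATM records whose atom name (columns 13-16) is O2' or O2*."""
    kept = []
    for line in pdb_text.splitlines():
        is_atom = line.startswith(("ATOM", "HETATM"))
        # The base atom O2 (no prime) is a different atom and is kept.
        if is_atom and line[12:16].strip() in ("O2'", "O2*"):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

After stripping, the residues still carry RNA names and uracil bases, so the result is only an intermediate file for the mutation step.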

Xiang-Jun

1237
See my previous reply. In short, 3DNA is not directly up to your purpose yet, even though some of its components may be useful in certain parts of your workflow. The recent "RNA-Puzzles" paper in the RNA journal is the state-of-the-art on predicting DNA/RNA tertiary structures.

Good luck with your project.

Xiang-Jun

1238
Hi Vandana,

I am a bit surprised that you cannot run the basic 'pwd' (print working directory) command on your MinGW/MSYS -- Windows 7 system. Anyway, I'm glad to know that you've got 3DNA up and running; now you can start to play around with it. If you have any questions, do not hesitate to post back on the forum!

Xiang-Jun

1239
Hi Vandana,

Sorry to hear about your problem in setting up 3DNA on Windows 7 using MinGW/MSYS. To help identify where the problem is, please do the following:
  • First, change into the 3DNA bin/ directory
  • Then type: pwd, what is the output?
  • What's the output of ruby -v?
  • If the above step runs successfully, what's the output of ruby x3dna_setup?

Xiang-Jun

1240
General discussions (Q&As) / Re: A-DNA definition
« on: April 02, 2012, 03:12:55 pm »
Hi Arnab,

Quote
However, what I really meant is to supply the coordinate directly from xtc to 3DNA directly bypassing the pdb so that the analysis is extremely fast.

I see your point. By adhering to the standard MODEL/ENDMDL delineated PDB format, however, x3dna_ensemble can handle directly an NMR ensemble. Via a purpose-specific format adaptor, the script should be applicable to the analysis of simulation trajectories from any third-party MD package. Intuitively, I feel this approach is simple, flexible, and practical. Of course, only 3DNA users, especially MD practitioners, can judge if x3dna_ensemble is able to meet real-world challenges. Please share your experience.
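
As an illustration of what such a format adaptor has to produce, here is a minimal Python sketch that joins single-structure PDB texts into one MODEL/ENDMDL ensemble. The function name frames_to_ensemble is hypothetical, and numbering the models from 1 is an assumption; reading the frames out of a specific MD trajectory format is the package-specific part and is not shown.

```python
def frames_to_ensemble(frames):
    """Join single-structure PDB texts into one MODEL/ENDMDL ensemble."""
    out = []
    for number, frame in enumerate(frames, start=1):
        # MODEL record: serial number right-justified in columns 11-14.
        out.append("MODEL     %4d" % number)
        for line in frame.splitlines():
            record = line[:6].strip()
            # Skip per-file terminators and any stray model markers.
            if record in ("END", "MODEL", "ENDMDL"):
                continue
            out.append(line)
        out.append("ENDMDL")
    out.append("END")
    return "\n".join(out) + "\n"
```

The resulting text can be written to a file and passed to x3dna_ensemble analyze with the -e option.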

Xiang-Jun
 

1241
General discussions (Q&As) / Re: A-DNA definition
« on: April 02, 2012, 11:43:41 am »
Hi Arnab,

I'm glad our conversation in this thread helped clear your doubt. It has been at least several years since I looked at the details about the classification of dinucleotide steps in 3DNA. Your questions refreshed my memory on this topic.

From the output files you attached, I know you are using 3DNA v2.0. Did you know that as of v2.1, 3DNA provides a Ruby script x3dna_ensemble for the analysis of MD simulation trajectories? The help info is as below:

Code: [Select]
x3dna_ensemble -h
------------------------------------------------------------------------
Utilities for the analysis and visualization of an ensemble
    Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
    where sub-command must be one of:
        analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
        block_image -- generate a base block schematic image
        extract -- extract structural parameters after running 'analyze'
        reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Note that the script starts with a MODEL/ENDMDL delineated ensemble or a collection of individual entries in the standard PDB format. For an example, see the directory $X3DNA/examples/ensemble/md, and run the following to see the possibilities:

Code: [Select]
x3dna_ensemble analyze -h
x3dna_ensemble extract -h

Quote from: Arnab
If you don't have time and need additional pair of hands, I can create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information.
To make 3DNA better serve the community, I really need help from responsive and enthusiastic 3DNA users like you! The script x3dna_ensemble currently does not directly read a third-party specific trajectory file, so I have been planning to add a convert sub-command with options for GROMACS, Amber, CHARMM etc. Your contribution to "create a patch between GROMACS and 3DNA where a GROMACS trajectory can be analyzed for all 3DNA related information" is certainly welcome.

To consolidate our efforts, could you please do the following:
  • Download and install 3DNA v2.1beta, and try out the examples mentioned above to see how the new facilities help your workflow.
  • Check to see how much can be provided by GROMACS. In a recent thread titled "How to properly use x3dna_ensemble?", I became aware of the fact that "Gromacs can actually devide an xtc-trajectory file into separate pdb files in MODEL/ENDML format."
  • After checking the above two points, we can focus on what is still missing or inconvenient.
Best regards,

Xiang-Jun

1242
General discussions (Q&As) / Re: A-DNA definition
« on: April 01, 2012, 05:24:57 pm »
Hi Arnab,

Thanks for your follow up. I am impressed by your attention to the little "details" -- oftentimes, the small part counts a lot.

Quote from: Arnab
Therefore, I assume there must be more to the story.
You are absolutely right -- see below for details.

Quote
To add to my previous post,  the sanity check clears out str221.pdb for 6CG/CG step which has Zp 1.93 but unassigned type. However, I don't know the check corresponding to "WC_info && WC_info[i + 1]  /* WC geometry */". Therefore, this may be a part of the issue of not getting the right form.
From the two PDB files you attached, it is easy to verify that all bps are of standard Watson-Crick type. So there is no issue with sanity check on WC_info(i) and WC_info(i + 1).

The real underlying reason for your observed discrepancy between str221 and str226 is as follows: to be on the safe side, 3DNA performs an additional check before assigning a dinucleotide step into A-, B- or TA-DNA form: there must be at least two consecutive dinucleotide steps of the same type to avoid any single isolated (mostly spurious) "transition" step.

Take your str221.pdb as an example,
    step       Xp      Yp      Zp     XpH     YpH     ZpH    Form
   1 GC/GC   -4.05    9.12   -1.51   -1.02    8.24   -4.15
   2 CG/CG   -2.24    8.83    0.29   -1.71    8.83    0.32     B
   3 GC/GC   -4.06    9.10   -0.65   -6.93    8.67    2.77     B
   4 CA/TG   -3.22    9.18    0.30   -7.45    8.28    4.02
   5 AC/GT   -3.17    9.21    1.03   -5.69    9.14    1.49
   6 CG/CG   -3.38    8.33    1.93   -7.49    8.45    1.33
   7 GT/AC   -3.58    8.97    0.80   -7.40    8.81    1.83
   8 TG/CA   -4.26    8.76   -0.67   -9.08    6.51    5.89
   9 GC/GC   -2.30    8.70   -0.08   -0.73    8.70    0.21     B
  10 CG/CG   -3.74    9.22   -0.75   -7.66    8.74    2.97     B
  11 GC/GC   -2.90    9.03    0.30   -5.61    8.55    2.92     B

According to the criteria detailed in my previous reply, step 6 CG/CG is indeed classified as A-DNA, since its Zp (1.93) > 1.5 Å. However, each of its neighbors -- 5 AC/GT and 7 GT/AC -- has a Zp < 1.5 Å, so neither is in A-form. Thus, 6 CG/CG is downgraded to unclassified. Note that without this additional check, step 4 CA/TG would have been taken as TA-DNA [Zp(h) = 4.02 > 4.0 Å].

With the above note, one can see easily why step 6 CG/CG in str226 is classified as A-DNA -- it's simply because its neighbor 5 AC/GT is also A-DNA.
    step       Xp      Yp      Zp     XpH     YpH     ZpH    Form
   1 GC/GC   -4.37    8.77   -1.05   -4.95    8.83   -0.22     B
   2 CG/CG   -2.56    8.63    0.13   -2.38    8.49    1.55     B
   3 GC/GC   -3.74    9.12   -0.62   -4.46    9.06   -1.34     B
   4 CA/TG   -2.64    9.32    0.91   -6.75    7.91    5.00
   5 AC/GT   -3.16    9.32    1.56   -7.39    9.41    0.91     A
   6 CG/CG   -3.07    8.71    2.09   -7.45    8.73    1.93     A
   7 GT/AC   -3.49    9.02    0.43   -6.64    8.97    1.09     B
   8 TG/CA   -3.83    9.27   -0.68   -7.64    9.09    1.94     B
   9 GC/GC   -2.69    8.96    0.34   -2.36    8.95    0.55     B
  10 CG/CG   -4.15    8.96   -0.57   -7.70    8.62    2.56     B
  11 GC/GC   -3.43    8.71    1.58   -6.69    8.70    1.66
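
The two-stage logic described above can be reproduced on the str221 table with a short Python sketch. The thresholds and allowed ranges are those in the source excerpt quoted in the A-DNA definition thread; the checks on twist, rise and Watson-Crick geometry are left out because the table does not list those values, the range test is assumed to be inclusive, and the function names raw_form and drop_isolated are hypothetical.

```python
# (Xp, Yp, Zp, XpH, YpH, ZpH) per step, transcribed from the str221 table.
STR221 = [
    (-4.05, 9.12, -1.51, -1.02, 8.24, -4.15),
    (-2.24, 8.83, 0.29, -1.71, 8.83, 0.32),
    (-4.06, 9.10, -0.65, -6.93, 8.67, 2.77),
    (-3.22, 9.18, 0.30, -7.45, 8.28, 4.02),
    (-3.17, 9.21, 1.03, -5.69, 9.14, 1.49),
    (-3.38, 8.33, 1.93, -7.49, 8.45, 1.33),
    (-3.58, 8.97, 0.80, -7.40, 8.81, 1.83),
    (-4.26, 8.76, -0.67, -9.08, 6.51, 5.89),
    (-2.30, 8.70, -0.08, -0.73, 8.70, 0.21),
    (-3.74, 9.22, -0.75, -7.66, 8.74, 2.97),
    (-2.90, 9.03, 0.30, -5.61, 8.55, 2.92),
]

# Allowed ranges for the six P-based values (inclusive bounds assumed).
RANGES = [(-5.0, -0.5), (7.5, 10.0), (-2.0, 3.5),
          (-11.5, 2.5), (1.5, 10.0), (-3.0, 9.0)]


def raw_form(step):
    """Classify one step from its P-based values alone ('' = unclassified)."""
    if not all(lo <= v <= hi for v, (lo, hi) in zip(step, RANGES)):
        return ""
    _, _, zp, xph, _, zph = step
    if zp >= 1.5:
        return "A"
    if zph >= 4.0:
        return "TA"
    if zp <= 0.5 and xph < 0.5:
        return "B"
    return ""


def drop_isolated(forms):
    """Blank out any step whose form matches neither neighbor."""
    kept = []
    for i, form in enumerate(forms):
        same_prev = i > 0 and forms[i - 1] == form
        same_next = i + 1 < len(forms) and forms[i + 1] == form
        kept.append(form if form and (same_prev or same_next) else "")
    return kept


raw = [raw_form(step) for step in STR221]
final = drop_isolated(raw)
print(raw)
print(final)
```

Under these assumptions, steps 4, 6 and 8 receive a raw label (TA, A, TA) and lose it in the second stage, so the final list matches the Form column of the str221 table.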

I may refine the criteria used for dinucleotide classification in a future release of 3DNA, and I welcome your feedback. For your analysis of MD simulation trajectories, I'd suggest that you check directly the raw data (Xp, Yp, Zp, XpH, YpH, ZpH etc).

HTH,

Xiang-Jun
 

1243
General discussions (Q&As) / Re: A-DNA definition
« on: April 01, 2012, 10:31:26 am »
The scheme of classifying a dinucleotide step into A-, B- or TA-DNA form is described in the 2003 NAR paper. More specifically, it is based on Zp and Zp(h); see Figure 5(c) linked below. For example, if Zp > 1.5 Å, then it is taken as A-DNA.



Per your request, listed below is the exact definition for A-, B- and TA-DNA, as excerpted from 3DNA source code. Note the "sanity check" at the beginning; the empirical criteria try to ensure a right-handed duplex consisting of Watson-Crick bps and with reasonable geometry. Also bear in mind that the classification is intended to be indicative rather than conclusive.

Code: [Select]
if (dval_in_range(mtwist, 10.0, 60.0)  /* over-all twist average */
    && WC_info[i] && WC_info[i + 1]  /* WC geometry */
    && dval_in_range(twist_rise[i][1], 10.0, 60.0)  /* right-handed */
    && dval_in_range(twist_rise[i][2], 2.5, 5.5)  /* Rise in range */
    && dval_in_range(aveS[i][1], -5.0, -0.5)  /* Xp */
    && dval_in_range(aveS[i][2], 7.5, 10.0)  /* Yp */
    && dval_in_range(aveS[i][3], -2.0, 3.5)  /* Zp */
    && dval_in_range(aveH[i][1], -11.5, 2.5)  /* XpH */
    && dval_in_range(aveH[i][2], 1.5, 10.0)  /* YpH */
    && dval_in_range(aveH[i][3], -3.0, 9.0)) {  /* ZpH */
    if (aveS[i][3] >= 1.5)  /* A-form */
        strABT[i] = 1;
    else if (aveH[i][3] >= 4.0)  /* TA-form */
        strABT[i] = 3;
    else if (aveS[i][3] <= 0.5 && aveH[i][1] < 0.5)  /* B-form */
        strABT[i] = 2;  /* aveS[i][3] < 0.5 for C-DNA #47 */
}

HTH,

Xiang-Jun

1244
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 27, 2012, 01:15:31 pm »
Glad to know that you've made some progress. However, the message as shown below still bothers me:
Quote
ruby 5276 child_info_fork::abort: address space needed by 'etc.so' (0x370000) is already occupied
A quick Google search turns up quite a few hits concerning Ruby and Cygwin on Windows. Will following Cygwin FAQ #4.44 "How do I fix fork() failures?" solve your problem? Specifically, the following sentence seems to address this problem:
Quote
Read the 'rebase' package README in /usr/share/doc/rebase/, and follow the instructions there to run 'rebaseall'

Additionally, what version of Ruby are you using?
Code: [Select]
ruby -v
I am switching from Perl to Ruby as the scripting language for 3DNA; hopefully this won't cause practical issues for Windows users.

Xiang-Jun

1245
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 27, 2012, 11:02:16 am »
Hi Nikolay,

The first two steps are fine. The third step is weird -- are you using the MinGW or Cygwin version? This is a problem not necessarily specific to 3DNA, but obviously I'd like to see a solution to it.

Xiang-Jun

1246
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 27, 2012, 10:11:29 am »
Hi Nikolay,

Glad to hear that you've made progress. However, from my understanding of what you described, it seems something is still not quite right. The fixed-name file bp_step.par contains only the parameters for a single structure (snapshot), not the whole ensemble. It is a bit more difficult to explain the details in text, so I would suggest you repeat the examples in the directory $X3DNA/examples/ensemble/md/:
Code: [Select]
cd $X3DNA/examples/ensemble/md/
x3dna_ensemble analyze -h
x3dna_ensemble extract -h
Once you understand how the examples work, you should be able to apply the same idea to the analysis of your MD trajectories. Of course, if you have any questions, please do not hesitate to post back at the forum.

Xiang-Jun

PS: command-line help
The help page for x3dna_ensemble analyze
Code: [Select]
------------------------------------------------------------------------
Analyze a MODEL/ENDMDL delineated ensemble of NMR structures or MD
trajectories. All models must correspond to different conformations of
the same molecule. A template base-pair input file, generated with
'find_pair' and corrected manually as necessary, must be provided.

Usage:
        x3dna_ensemble analyze options
Examples:
        x3dna_ensemble analyze -b bpfile.dat -e sample_md0.pdb
             # 21 models (0-20); output (default): 'ensemble_example.out'
             # also generate 'model_list.dat', see example below
        x3dna_ensemble analyze -b bpfile.dat -m model_list.dat -o ensemble_example2.out
             # diff ensemble_example.out ensemble_example2.out

        x3dna_ensemble analyze -b bpfile.dat -p 'pdbdir/model_*.pdb' -o ensemble_example3.out
             # note to quote the -p option; 20 models (1-20)
             # also generate 'pdb_list.dat', see example below
        x3dna_ensemble analyze -b bpfile.dat -l pdb_list.dat -o ensemble_example4.out
             # diff ensemble_example3.out ensemble_example4.out
             # note the order of the models: 1, 10..19, 2, 20, 3..9
Options:
------------------------------------------------------------------------
    --bpfile, -b <s>:   Name of file containing base-pairing info
   --outfile, -o <s>:   Output file (default: ensemble_example.out)
  --ensemble, -e <s>:   Ensemble delineated with MODEL/ENDMDL pairs
    --models, -m <s>:   File containing an explicit list of model numbers
   --pattern, -p <s>:   Pattern of model files to process (e.g., *.pdb)
      --list, -l <s>:   File containing an explicit list of models
          --info, -i:   Show only model info in the ensemble [with -e]
          --help, -h:   Show this message
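
The model order noted in the examples (1, 10..19, 2, 20, 3..9) is what plain alphabetical sorting of the file names gives. If numeric order is wanted for an explicit list, a natural sort key does the job. This is a minimal Python sketch; natural_key is a hypothetical helper, and the exact layout expected in the list file is not shown in the help text, so check the generated pdb_list.dat before writing your own.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so 2 sorts before 10."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]


names = ["pdbdir/model_%d.pdb" % n for n in range(1, 21)]
plain = sorted(names)                      # 1, 10..19, 2, 20, 3..9
natural = sorted(names, key=natural_key)   # 1, 2, 3, ..., 20
```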

The help page for x3dna_ensemble extract
Code: [Select]
------------------------------------------------------------------------
Extract 3DNA structural parameters of an ensemble of NMR structures or
MD trajectories, after running 'x3dna_ensemble analyze'. The extracted
parameters are intended to be exported into Excel, Matlab and R etc for
further data analysis/visualization.

Usage:
        x3dna_ensemble extract options
Examples:
        x3dna_ensemble extract -l
             # to see a list of all parameters
        x3dna_ensemble extract -p prop
             # for propeller, no need to specify full: -p pr suffices
             # -p 36 also fine (see above); use 'ensemble_example.out'
        x3dna_ensemble extract -p slide -s , -f ensemble_example3.out
             # comma separated, from file 'ensemble_example3.out'
        x3dna_ensemble extract -p roll -s ' ' -n -o roll.dat
             # space separated, no row-label, to file 'roll.dat'
        x3dna_ensemble extract -e 1 -p chi1
             # extract the chi torsion angle of strand I, but exclude
             # those from the two terminal base pairs. For comparison,
             # run also: x3dna_ensemble extract -p chi1
        x3dna_ensemble extract -a
             # extract all parameters, each in a separate file
Options:
------------------------------------------------------------------------
  --separator, -s <s>:   Separator for fields [\t] (default: )
   --par-name, -p <s>:   Name of parameter to extract
   --fromfile, -f <s>:   Parameters file (default: ensemble_example.out)
    --outfile, -o <s>:   File of selected parameter (default: stdout)
   --end-bps, -e <i+>:   Number of end pairs to ignore (default: 0, 0)
            --all, -a:   Extract all parameters into separate files
          --clean, -c:   Clean up parameter files by the -a option
           --list, -l:   List all parameters
        --no-1col, -n:   Delete the first (label) column
           --help, -h:   Show this message
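
For scripted analysis of many trajectories, the two sub-commands can be driven from Python. The sketch below only assembles the argument lists, using the option letters from the help pages above; the function names analyze_cmd and extract_cmd are hypothetical.

```python
def analyze_cmd(bpfile, ensemble, outfile=None):
    """Argument list for 'x3dna_ensemble analyze' on a MODEL/ENDMDL file."""
    cmd = ["x3dna_ensemble", "analyze", "-b", bpfile, "-e", ensemble]
    if outfile is not None:
        cmd += ["-o", outfile]
    return cmd


def extract_cmd(parameter, fromfile=None, separator=None, outfile=None):
    """Argument list for 'x3dna_ensemble extract' of one named parameter."""
    cmd = ["x3dna_ensemble", "extract", "-p", parameter]
    if separator is not None:
        cmd += ["-s", separator]
    if fromfile is not None:
        cmd += ["-f", fromfile]
    if outfile is not None:
        cmd += ["-o", outfile]
    return cmd
```

Each list can be handed to subprocess.run(cmd, check=True) once 3DNA is set up and x3dna_ensemble is on the search path.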


1247
MD simulations / Re: How to properly use x3dna_ensemble?
« on: March 26, 2012, 10:29:30 am »
Hi Nikolay,

I am glad to hear that Gromacs provides the facility to output an ensemble of multiple models delineated by MODEL/ENDMDL. As of 3DNA v2.1beta (currently distributed), the x3dna_ensemble Ruby script can handle multiple structures in a PDB MODEL/ENDMDL ensemble:

Code: [Select]
x3dna_ensemble -h
------------------------------------------------------------------------
Utilities for the analysis and visualization of an ensemble
    Usage: x3dna_ensemble [-h|-v] sub-command [-h] [options]
    where sub-command must be one of:
        analyze -- analyze MODEL/ENDMDL delineated ensemble (NMR or MD)
        block_image -- generate a base block schematic image
        extract -- extract structural parameters after running 'analyze'
        reorient -- reorient models to a particular frame/orientation
------------------------------------------------------------------------
  --version, -v:   Print version and exit
     --help, -h:   Show this message

Check the $X3DNA/examples/ensemble/ directory for examples, and report back if you have any problems. Note that I still need to add detailed documentation for the new x3dna_ensemble utility. However, it should be straightforward to play with, and I am always quicker in responding to users' questions than in writing docs ...

Xiang-Jun

1248
Hi Hugh,

Thank you so much for contributing back your Perl script that solves your problem, and for providing new sample Gaussian-Babel-generated PDB data files. As it turns out, the three PDB files you attached -- AT.pdb, AU.pdb, and GC.pdb -- are all fine with 3DNA. You can verify this point by running find_pair with the -s option:
Code: [Select]
find_pair -s GC.pdb stdout
# and it will output the following:
GC.pdb
GC.outs
    1      # single helix
    2      # number of bases
    1    1 # explicit bp numbering/hetero atoms
    1      # ....>A:...1_:[..G]G
    2      # ....>B:...1_:[..C]C
However, none of the three PDB files contains a base pair, per the default parameters -- check using a molecular graphics viewer like Jmol or PyMOL.

The atom naming issue related to PDB files from computational chemistry packages (e.g. Gaussian) and Babel has appeared a few times in the 3DNA forum. As far as 3DNA is concerned, your effort has led to the first solution I am aware of to this problem. Your question has prompted me to read the article "Open Babel: An open chemical toolbox" and download the latest Open Babel v2.3.1.

Best regards,

Xiang-Jun

1249
Thanks for using 3DNA and your elaborate post -- your attached PDB file helped in uncovering where the problem is.

Quote
1. Is the attached file properly formatted for use with 3DNA?
No, it is not. Specifically, the atom names do not conform to the PDB convention. Using one of the U residues as an example, see the following two images:
Gaussian-Babel PDB | Standard PDB
On the left is the U from the Gaussian-Babel generated PDB file, and on the right is the U from a standard PDB file. Notice how the standard PDB has names like " N1 " instead of " N  ", and " O2 " instead of " O  " etc. Proper atom names are important for 3DNA to identify which atom is which.

Quote
2. If find_pair does not find a base pair, will it still output the base pair geometry parameters that were calculated?
The problem is not that find_pair misses a pair due to parameter cutoffs, but that the residues are not taken as nucleotides at all. Your best bet is simply to make your PDB file standard-compliant; then both problems will be gone.
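
A quick way to spot this kind of file before running find_pair is to look for atoms named by element only. The following Python sketch is a heuristic, not what 3DNA does internally: the function name bare_atom_names is hypothetical, the atom name is read from columns 13-16 of ATOM/HETATM records, and the rule "every nucleotide atom name except P contains a digit" is an assumption that holds for standard nucleotides but would misfire on ions or protein atoms.

```python
def bare_atom_names(pdb_text):
    """Return (serial, name) for atoms named by element only, e.g. ' N  '."""
    flagged = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        # Phosphorus is legitimately named 'P'; other nucleotide atoms
        # carry a position digit (N1, C2, O4', ...).
        if name != "P" and not any(ch.isdigit() for ch in name):
            flagged.append((int(line[6:11]), name))
    return flagged
```

A non-empty result for a nucleic acid file suggests the atom names need to be converted to the PDB convention first.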

In your attached test.pdb file, there are two uracils, which follow the same atom ordering and naming convention. Could you provide me example files with A, C, G, and T? It may be worthwhile to have a utility program in 3DNA that can convert Gaussian-Babel generated PDB file to the standard format.

Xiang-Jun


1250
Hi Kumutha,

Thanks for posting the DNA sequence and its beautiful secondary structure predicted with mfold. 3DNA per se, and the 3DNA server based on it, does not have the magic (yet) to directly convert such a secondary structure into tertiary structure. However, I do believe 3DNA has functionality to help solve some components of the puzzle. In your case, the long double-helical stem can be approximated with a fiber B-DNA model; then you need to model the loop part at the top and the two terminal nucleotides at the bottom end. You need to assemble the three parts together, and possibly perform some energy minimization to achieve good stereochemistry.

You may find the article "RNA-Puzzles: A CASP-like evaluation of RNA three-dimensional structure prediction" published in the April 2012 [18 (4)] issue of the RNA journal helpful.

Xiang-Jun


Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.