Messages - xiangjun

1226

General discussions (Q&As) / Re: no matching entry for atom name [OP1 ] (OP..) in 'atomlist.dat'

« on: October 05, 2012, 03:54:19 pm »

Hi Asmita,

Thanks for joining the 3DNA-user community!

Posting your question on the forum is a good first step; attaching a specific example so that your problem can be reproduced is better still.

The problem is due to the non-standard format of your PDB file, as shown below:


# The following three lines are extracted from your attached "struct_1.pdb"
ATOM     30 P    G       2      -3.465  14.386  -4.840  0.00  0.00
ATOM     31 OP1  G       2      -4.272  15.372  -4.078  0.00  0.00
ATOM     32 OP2  G       2      -2.267  13.805  -4.173  0.00  0.00
# The following three lines are taken from PDB entry 355d, for comparison
ATOM     17  P    DG A   2      23.337  31.278  21.156  1.00 13.26           P  
ATOM     18  OP1  DG A   2      24.761  31.571  21.391  1.00 13.17           O  
ATOM     19  OP2  DG A   2      22.651  31.834  19.956  1.00 12.34           O

As you can see clearly, the atom name (and residue name) in your PDB file is shifted to the left by one column. So the atom name for OP1 is taken as

"OP1 " instead of the normal " OP1",

and that explains the message you saw:

Quote

no matching entry for atom name [OP1 ] (OP..) in 'atomlist.dat'

Adding an entry "OP.. O" (note the two dots in place of digit and space) to file 'atomlist.dat' will make the info message go away.

The most effective way to fix such problems is to ensure your PDB file is standard-compliant [see Coordinate File Description (PDB Format)]. In your case, ask the developers of your MD package to generate standard-compliant PDB files, or write a simple script to make the changes yourself. This is the first time I have become aware of such a problem; given enough interest, I will consider refining 3DNA to accommodate such non-standard cases.
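
As an illustration of the "simple script" route, here is a minimal Python sketch that moves left-shifted atom names back into columns 13-16 and right-justifies the residue name in columns 18-20. The function names are hypothetical, and the sketch assumes that every element has a one-letter symbol (as in plain nucleic acid structures); a file containing ions such as MG or FE would need extra care.

```python
def fix_pdb_line(line: str) -> str:
    """Re-align the atom and residue names of one ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")) or len(line) < 20:
        return line  # leave other records (and truncated lines) untouched

    atom_name = line[12:16]  # columns 13-16 of the PDB record
    res_name = line[17:20]   # columns 18-20 of the PDB record

    # "OP1 " -> " OP1": shift only names that start in column 13 and leave
    # column 16 empty. Assumption: all elements have one-letter symbols.
    if atom_name[0] != " " and atom_name[3] == " ":
        atom_name = " " + atom_name[:3]

    # "G  " -> "  G": right-justify the residue name.
    res_name = res_name.strip().rjust(3)

    return line[:12] + atom_name + line[16] + res_name + line[20:]


def fix_pdb_text(text: str) -> str:
    """Apply fix_pdb_line to every line of a PDB file's content."""
    return "\n".join(fix_pdb_line(line) for line in text.split("\n"))
```

Lines that already follow the standard layout pass through unchanged, so the script can be run safely on a mixed set of files.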

HTH,

Xiang-Jun

1227

MD simulations / Re: Using find_pair

« on: September 18, 2012, 11:33:25 pm »

Hi Johnny,

Thanks for your follow-up; it certainly helps clarify the situation:

The output of "uname -a" shows that your Ubuntu is 32-bit -- google "ubuntu 32 bit or 64 bit how to tell" for details. This explains why the 64-bit version of 3DNA "was not compatible" on your machine.

So if you insist on using Ubuntu instead of Mac OS X, a 64-bit version of Ubuntu seems the way to go.
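
For readers who prefer a scripted check, the following Python sketch reports the same kind of information. It is only an illustration: `describe_bitness` is a hypothetical helper name, `platform.machine()` returns the hardware name that "uname -m" prints (for example i686 or x86_64), and the pointer size reflects the Python interpreter itself, which may differ from the kernel.

```python
import platform
import struct


def describe_bitness() -> dict:
    """Collect the machine name and the pointer width of this interpreter."""
    return {
        "machine": platform.machine(),            # e.g. 'i686' or 'x86_64'
        "python_bits": struct.calcsize("P") * 8,  # 32 or 64
    }


if __name__ == "__main__":
    info = describe_bitness()
    print(f"machine: {info['machine']}, Python build: {info['python_bits']}-bit")
```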

Xiang-Jun

1228

General discussions (Q&As) / Data files for Table 3 of the standard base-reference frame article

« on: September 18, 2012, 10:32:15 pm »

Table 3 of the Olson et al. (2001) "standard base-reference frame article" lists the mean values and standard deviations of base geometric parameters for high resolution A-DNA and B-DNA crystal structures, as shown below.

The selection criteria of the A- and B-DNA datasets have recently been reported in the thread "Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper". For the sake of easy reference and completeness, here is the note again:

Quote

Selection Criteria:
   NDB ID: ad OR bd
   Classification: DNA
   Structure Description: Double Helix
   Conformation Type: A OR B
   No Drug, No Mismatch
   No Modifiers (Base/Sugar/Phosphate)
   Resolution better than 2.0 A
   =======================
   34 A-DNA and 27 B-DNA

For B-DNA, delete bd0012, bd0013 & bdf068 (following HMB)
   bd0001 bd0006_A
   bd0014: coordinates from PDB 463D
   bd0005 bd0016_A (with repeated atoms!)
   bd0018 bd0019 bdj017 bdj019 bdj025 bdj031 bdj036 bdj037 bdj051
   bdj052 bdj060 bdj061 
   bdj081 (Uses helix #1 with strands A and B. The other two are
           disordered)
   bdl001 bdl005 bdl020 bdl084
   bd0023_A  bd0029
   -------------------------- 27-3=24 structures

For A-DNA
   ad0002 ==> (ad0002_AB + ad0002_CD)
   ad0003 ad0004 adh008 adh010 adh0102 adh0103 adh0104 adh0105
   adh014 adh026 adh027 adh029 adh033 adh034 adh038 adh039 adh047
   adh070 adh078 adj0102 adj0103 adj0112 adj0113 adj022 adj049
   adj050 adj051 adj065 adj066 adj067 adj075 
   adl025 (suspicious! big Buckle, alternating Propeller)
   adl047 (with B-steps, not good either!)
   -------------------------- 34+1-2=33 structures

Outliers:
  A-DNA: ad0002_CD, steps 3-4,   bps 3-4-5
         ad0004,    steps 3-4-5, bps 3-4-5-6
  B-DNA: bdj025,    step 3,      bps 3-4
         bdj031,    step 3,      bps 3-4
         bdj037,    step 3,      bps 3-4

The six data files themselves are attached below; here the A- prefix is for A-DNA, and B- prefix for B-DNA:

'A-base-pair.dat' and 'B-base-pair.dat' contain the base-pair parameters in the order of Shear, Stretch, Stagger, Buckle, Propeller, and Opening.
'A-step-pars.dat' and 'B-step-pars.dat' contain the step parameters in the order of Shift, Slide, Rise, Tilt, Roll and Twist.
'A-heli-pars.dat' and 'B-heli-pars.dat' contain the helical parameters in the order of x-displacement, y-displacement, Helical rise, Inclination, Tip, and Helical twist.

While the Table content is derived from NDB entries with only Watson-Crick base pairs in A- and B-DNA duplexes, it serves as a reference for identifying/quantifying non-canonical (mismatched) pairs by taking advantage of the base-pair parameters. This approach is rigorous in its description of the relative base geometry in a pair, and is distinct from and complementary to the Leontis-Westhof classification scheme.
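
To recompute mean values and standard deviations from one of the attached data files, something along the following lines could be used. This is a sketch under stated assumptions: the exact file layout is not described in this post, so the code assumes whitespace-separated rows whose last six fields are the six parameters, and it skips any line that does not fit. The function name `column_stats` is hypothetical.

```python
import statistics


def column_stats(text: str, ncols: int = 6):
    """Return a (mean, sample standard deviation) pair for each column.

    Assumption: each data row ends with `ncols` numeric fields; header
    or comment lines that do not parse as numbers are ignored.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < ncols:
            continue
        try:
            rows.append([float(x) for x in fields[-ncols:]])
        except ValueError:
            continue  # not a data row
    if len(rows) < 2:
        raise ValueError("need at least two data rows")
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*rows)]
```

For 'A-step-pars.dat', the six returned pairs would correspond to Shift, Slide, Rise, Tilt, Roll and Twist, in that order; outlier steps may need to be excluded first to match the published table.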

1229

MD simulations / Re: Using find_pair

« on: September 18, 2012, 09:52:28 pm »

Thanks for using 3DNA and for posting your question on the forum.

Which version of 3DNA are you using: v1.5, v2.0 or v2.1beta? What's your Linux variant -- what's the output of running "uname -a"?

What specifically do you mean by saying that you could run the huge 2GB+ PDB file easily on your Mac? Did you use "find_pair" or other programs?

This is the first time I have come across such a question, so I have more questions for you before I can possibly provide a solution.

Xiang-Jun

1230

General discussions (Q&As) / Re: Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper

« on: September 07, 2012, 02:15:17 pm »

Thanks for your patience -- it took me quite some time to dig into the files I used for the 2003 3DNA paper! Luckily, I found them, and the time was well spent.

Here are the details -- the whole datasets and scripts can be downloaded by following the link: 3DNA-NAR03-Fig5.tar.gz. Figure 5(a)-(c) generated with the scripts and data files are attached.

Content of the README file:

This folder (3DNA-NAR03-Fig5) contains all the data files and scripts
to reproduce Figure 5 of the 2003 3DNA paper in Nucleic Acids Research
(NAR03). The contents are taken from the original materials I used to
create Figure 5 of NAR03, with slight editing. Specifically, I revised
the Matlab scripts to work in GNU Octave v3.2.4 for verification.

If you have any questions or comments, please do post them on the 3DNA
Forum.

2012-09-06 -- Xiang-Jun Lu (http://x3dna.org)

========================================================================

Data selections:
    'note-AB-datasets' -- datasets of selected A- and B-DNA structures
    'note-TA-dataset'  -- dataset of selected TA-DNA structures

Data files:
    'A-heli-pars.dat' -- six helical parameters 
    'A-step-pars.dat' -- six step parameters
    'A-zp-zph.dat'    -- Zp and ZpH parameters
        Selected parameters of the A-DNA dataset. Note that the order
        of the parameters is as in the .out file from running 'analyze'

    'B-heli-pars.dat',  'B-step-pars.dat',  'B-zp-zph.dat' for B-DNA
    'TA-heli-pars.dat', 'TA-step-pars.dat', 'TA-zp-zph.dat' for TA-DNA

Scripts:
    'incl_xdsp.m' -- script to generate Figure 5(a), Inclination vs x-displacement
                  'incl_xdsp.png' -- output file from running the script
    'roll_slide.m' -- script to generate Figure 5(b), Roll vs Slide
                  'roll_slide.png' -- output file from running the script
    'zph_zp.m' -- script to generate Figure 5(c), Zp(h) vs Zp
                  'zph_zp.png' -- output file from running the script
    'draw_ellipse.m', 'get_pars.m', 'open_file.m' -- supporting scripts

Content of file note-AB-datasets

Selection Criteria:
   NDB ID: ad OR bd
   Classification: DNA
   Structure Description: Double Helix
   Conformation Type: A OR B
   No Drug, No Mismatch
   No Modifiers (Base/Sugar/Phosphate)
   Resolution better than 2.0 A
   =======================
   34 A-DNA and 27 B-DNA

For B-DNA, delete bd0012, bd0013 & bdf068 (following HMB)
   bd0001 bd0006_A
   bd0014: coordinates from PDB 463D
   bd0005 bd0016_A (with repeated atoms!)
   bd0018 bd0019 bdj017 bdj019 bdj025 bdj031 bdj036 bdj037 bdj051
   bdj052 bdj060 bdj061 
   bdj081 (Uses helix #1 with strands A and B. The other two are
           disordered)
   bdl001 bdl005 bdl020 bdl084
   bd0023_A  bd0029
   -------------------------- 27-3=24 structures

For A-DNA
   ad0002 ==> (ad0002_AB + ad0002_CD)
   ad0003 ad0004 adh008 adh010 adh0102 adh0103 adh0104 adh0105
   adh014 adh026 adh027 adh029 adh033 adh034 adh038 adh039 adh047
   adh070 adh078 adj0102 adj0103 adj0112 adj0113 adj022 adj049
   adj050 adj051 adj065 adj066 adj067 adj075 
   adl025 (suspicious! big Buckle, alternating Propeller)
   adl047 (with B-steps, not good either!)
   -------------------------- 34+1-2=33 structures

Outliers:
  A-DNA: ad0002_CD, steps 3-4,   bps 3-4-5
         ad0004,    steps 3-4-5, bps 3-4-5-6
  B-DNA: bdj025,    step 3,      bps 3-4
         bdj031,    step 3,      bps 3-4
         bdj037,    step 3,      bps 3-4

Content of file note-TA-dataset

pd0070, pd0112, pd0154, pd0155, pd0156, pd0157, pd0158, pd0159, pd0160,
pd0161, pd0162, pd0163, pd0164, pdr031, pdt009, pdt012, pdt024, pdt025,
pdt032, pdt034, pdt036

This directory contains TATA box segments. It is normally 8-bp long, and
has the sequence: T-A-T-A-@-A-@-N. There are two kinks at the terminal
steps.

* means non-WC base-pair which is eliminated from further analysis

NDB ID  ##     Sequence      Res(A)  R-fac(%) chainID and residue range
--------------------------------------------------------------------
pd0070  01  T-T-T-A-A-A-T-A   2.4     20.0   C 1410 1417 D 1432 1439
                           
pd0112  02  T-A-T-A-A-A-A-G   2.65    23.1   K 8 15 L 105 112
        03  T-A-T-A-A-A-A-G                  C 8 15 D 105 112
        04  T-A-T-A-A-A-A-G                  G 8 15 H 105 112
        05  T-A-T-A-A-A-A-G                  O 8 15 P 105 112
        06  T-A-T-A-A-A-A-G                  S 8 15 T 105 112
                                      
pd0154  07  T-A-T-A-A-A-A-T   1.86    21.0   C 203 210 D 219 226
        08  T-A-T-A-A-A-A-T                  E 203 210 F 219 226
                                   
pd0155  09  T-A-T-A-A-G-A-G*  1.93    19.6   C 203 209 D 220 226
        10  T-A-T-A-A-G-A-G*                 E 203 209 F 220 226
   
pd0156  11  T-A-T-A-A-T-A-G*  2.1     19.3   C 203 209 D 220 226
        12  T-A-T-A-A-T-A-G*                 E 203 209 F 220 226
                                   
pd0157  13  T-A-T-A-T-A-A-G*  2.3     19.4   C 203 209 D 220 226
        14  T-A-T-A-T-A-A-G*                 E 203 209 F 220 226
                                   
pd0158  15  T-A-T-T-A-A-A-G*  2.1     19.4   C 203 209 D 220 226
        16  T-A-T-T-A-A-A-G*                 E 203 209 F 220 226
                                   
pd0159  17  T-A-C-A-A-A-A-G*  1.9     20.9   C 203 209 D 220 226
        18  T-A-C-A-A-A-A-G*                 E 203 209 F 220 226
   
pd0160  19  T-T-T-A-A-A-A-G*  1.8     19.3   C 203 209 D 220 226
        20  T-T-T-A-A-A-A-G*                 E 203 209 F 220 226
                                      
pd0161  21  T-A-T-A-A-A-T-G*  2.23    19.1   C 203 209 D 220 226
        22  T-A-T-A-A-A-T-G*                 E 203 209 F 220 226
                                      
pd0162  23  A-A-T-A-A-A-A-G*  2.3     18.2   C 203 209 D 220 226
        24  A-A-T-A-A-A-A-G*                 E 203 209 F 220 226
                                      
pd0163  25  T-A-T-A-A-A-A-G   1.9     19.7   C 203 210 D 219 226
        26  T-A-T-A-A-A-A-G                  E 203 210 F 219 226
                                      
pd0164  27  T-A-T-A-A-A-C*G*  1.95    19.9   C 203 208 D 221 226
        28  T-A-T-A-A-A-C*G*                 E 203 208 F 221 226
                                      
pdr031  29  T-T-T-t-t-A-A-A   2.1     21.2   C 1408 1415 E 1420 1427
                                      
pdt009  30  T-A-T-A-A-A-A-G   2.25    20.2   A 203 210 B 305 312
        31  T-A-T-A-A-A-A-G                  C 403 410 D 505 512
                                      
pdt012  32  T-A-T-A-T-A-A-A   1.8     20.1   C 2 9 C 21 28
        33  T-A-T-A-T-A-A-A                  D 2 9 D 21 28
                                      
pdt024  34  T-A-T-A-T-A-T-A   2.9     21.4   B 103 110 C 115 122
                                      
pdt025  35  T-A-T-A-A-A-A-G   1.9     19.4   C 203 210 D 219 226
        36  T-A-T-A-A-A-A-G                  E 303 310 F 319 326
                                      
pdt032  37  T-A-T-A-A-A-A-G   2.7     21.5   C 4 11 D 106 113
                                      
pdt034  38  T-A-T-A-A-A-A-G   1.9     18.9   B 5 12 C 105 112
                                      
pdt036  39  T-A-T-A-A-A-A-C   2.5     23.5   E 9 16 F 1 8

HTH,

Xiang-Jun

PS. As a matter of fact, the A- and B-DNA datasets are those used in Table 3 of the report on standard base reference.

1231

MD simulations / Re: About output files from x3dna_md.rb

« on: September 07, 2012, 12:31:11 pm »

Thanks for using 3DNA, and posting your questions on the forum.

Regarding your two questions:

1º) The missing atom messages are normal; they are for information only. "This structure has broken O3' to P[i+1] linkages" -- it means that there is an occurrence where nucleotides i and i+1 are not connected, as judged by an out-of-range distance from O3'(i) to P(i+1). I am glad that you pay attention to this little detail. It may not be a concern -- you need to check a sample structure to be sure. If you need help, please attach such a structure to your follow-up post.

2º) The two .out files are fine, and only 'RNA.out' is what you need. The example folder has four .out files corresponding to the four different options:

Code: [Select]

  --ensemble, -e <s>:   Ensemble delineated with MODEL/ENDMDL pairs
    --models, -m <s>:   File containing an explicit list of model numbers
   --pattern, -p <s>:   Pattern of model files to process (e.g., *.pdb)
      --list, -l <s>:   File containing an explicit list of models

I am glad to see details on how 3DNA ('x3dna_md.rb' script) is being used.

Quote

I work in Gromacs program with Amber force field. I converted my trajectory, in separeted pdbs files such as example folder.

As noted above, the script can be run in four different modes, so you don't need to convert your trajectory into separate PDB files; the snapshots can be put into a single MODEL/ENDMDL delineated ensemble, as NMR structures in the PDB.
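
If the snapshots already exist as separate PDB files, they can be merged into one MODEL/ENDMDL delineated ensemble with a few lines of Python. The sketch below is an illustration only: `build_ensemble` is a hypothetical helper, it takes the text of each snapshot as input, and it assumes that each snapshot holds a single structure whose own MODEL/ENDMDL/END records can simply be dropped.

```python
def build_ensemble(snapshots):
    """Wrap each snapshot (PDB text) in a numbered MODEL/ENDMDL pair."""
    out = []
    for number, text in enumerate(snapshots, start=1):
        # MODEL record: serial number right-justified in columns 11-14
        out.append(f"MODEL     {number:4d}")
        for line in text.splitlines():
            record = line[:6].strip()
            if record in ("MODEL", "ENDMDL", "END"):
                continue  # drop per-file delimiters; they are re-added here
            out.append(line)
        out.append("ENDMDL")
    out.append("END")
    return "\n".join(out) + "\n"
```

The resulting file is the kind of input the --ensemble (-e) option listed above is meant for.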

Note that in the current v2.1(beta), x3dna_md.rb has been replaced by x3dna_ensemble, which has much enhanced functionality. Please consider updating to the latest version.

HTH,

Xiang-Jun

1232

General discussions (Q&As) / Re: Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper

« on: August 31, 2012, 05:38:14 pm »

Quote

I'd like to ask about the DNA set used for the analysis that is presented in Fig 5. in the NAR 2003 paper. Are those structures previously classified as A, B and TA DNA by other means (?) before doing the Zp and Zp(h) calculations to confirm their differences? Where can I look for the structures which were used? (I guess it is somewhere in reference 81, Patikoglou,G.A. et al (1999))

Thanks for asking about the DNA datasets used in Figure 5 of the 3DNA 2003 NAR paper. Yes, the structures were previously assigned as A-, B- and TA-DNA by other means before we introduced Zp and Zp(h) to classify the three types of dinucleotide steps automatically. The A- and B-DNA assignments are based on conventional parameters (Slide/Roll, sugar puckers etc), as in the NDB, and the TA-DNA assignment is mainly inspired by the work of Guzikevich‐Guerstein and Shakked (ref. 80):

Quote from: 3DNA 2003 NAR paper

A detailed structural analysis of two early examples of the TATA‐box DNA bound to the TATA‐box binding protein (TBP) (10,79) led Guzikevich‐Guerstein and Shakked (80) to propose that the 8 bp TATA‐box adopts a novel TA‐DNA conformation, different from either A or B DNA. The structures of many more such complexes have since been determined (81) and, as shown in Table 2 and Figure 5, all TATA‐box regions share similar conformational features.

So the complete list was not taken directly from somewhere in ref. 81, but compiled specifically for the work. The actual structure list used in producing Figure 5 for the TA-DNA steps can be found in the thread "DNA standards/statistics using 3DNA", dated August 2006. For A-DNA and B-DNA structures used in Figure 5 of the 2003 paper, I need to locate my original record from (nearly) a decade ago -- I will write a post about my findings on the 3DNA homepage, possibly by next week.

HTH,

Xiang-Jun

1233

General discussions (Q&As) / How to identify triplets, quadruplets and higher-order base associations

« on: August 16, 2012, 11:48:16 pm »

The find_pair -p option can find all base pairs and higher-order base associations. I implemented this option early on in 3DNA v1.x; yet in the 2003 Nucleic Acids Research paper and the corresponding find_pair -h output for v1.5, I deliberately omitted mentioning this functionality. I was hoping to further refine the algorithm/implementation, and to write up a detailed method paper on find_pair, a core 3DNA component. After leaving Rutgers nearly a decade ago, I've continuously maintained and refined 3DNA. However, for various reasons, up to now I've not been able to finish the long overdue 'technical' manuscript.

Over the years, numerous RNA structural bioinformatics resources have taken advantage of the functionality provided by find_pair; RNAView & BPS, two Rutgers-based tools, are based directly on early versions of the program. It was only in the 3DNA 2008 Nature Protocols paper that I first illustrated the functionality of the find_pair -p option, in the protocol "identification of higher-order base associations in ribosomal RNA". This post provides further detailed examples so 3DNA users can take better advantage of this still underused functionality, useful in RNA structure related applications.

Let's create a new directory (folder), named 'find_pair-p-examples', and change to that directory. Now the directory is empty (check with ls).

Code: [Select]

mkdir find_pair-p-examples
cd find_pair-p-examples
ls

As an example, here we use the crystal structure of an RNA tetraplex (UGAGGU)₄ with A-tetrads, G-tetrads, U-tetrads and G-U octads: NDB id: ur0023; PDB id: 1j6s (see figure below). The structure was solved by Sundaralingam et al. at 1.4 Ångstroms resolution [Structure. 2003 Jul; 11(7):815-23]. Its asymmetric unit contains 4 single chains. The NDB/PDB provides 4 biological assemblies, each consisting of 4 identical chains from the asymmetric unit. Download biological assembly 1 from the NDB (or the PDB, if you prefer; but notice the case difference in PDB id):

Code: [Select]

wget ftp://ndbserver.rutgers.edu/NDB/coordinates/na-biol/1j6s.pdb1

Run find_pair -p on '1j6s.pdb1'. Note the -all_model option; by default, 3DNA programs (such as find_pair) handle only the first model (structure) in a given PDB data file.

Code: [Select]

find_pair -p -all_model 1j6s.pdb1 1j6s.mbp

At the end of output file '1j6s.mbp', one can see the following identified multiplets: one octad and three tetrads:

Code: [Select]

    1: #8 [1]...1>A:...1_:[BRU]u + [2]...1>A:...2_:[..G]G + [47]...2>A:...1_:[BRU]u + [48]...2>A:...2_:[..G]G + [93]...3>A:...1_:[BRU]u + [94]...3>A:...2_:[..G]G + [139]...4>A:...1_:[BRU]u + [140]...4>A:...2_:[..G]G
    2: #4 [3]...1>A:...3_:[..A]A + [49]...2>A:...3_:[..A]A + [95]...3>A:...3_:[..A]A + [141]...4>A:...3_:[..A]A
    3: #4 [4]...1>A:...4_:[..G]G + [50]...2>A:...4_:[..G]G + [96]...3>A:...4_:[..G]G + [142]...4>A:...4_:[..G]G
    4: #4 [5]...1>A:...5_:[..G]G + [51]...2>A:...5_:[..G]G + [97]...3>A:...5_:[..G]G + [143]...4>A:...5_:[..G]G
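
The multiplet lines shown above are easy to post-process. As a rough sketch (not part of 3DNA), the Python snippet below splits one such line into its serial number, the number of bases, and the list of base identifiers. The regular expression is an assumption derived only from the four lines shown here, and `parse_multiplet` is a hypothetical name.

```python
import re

# "<serial>: #<number of bases> <base> + <base> + ..."
MULTIPLET = re.compile(r"^\s*(\d+):\s+#(\d+)\s+(.*)$")


def parse_multiplet(line: str):
    """Return (serial, n_bases, [base ids]) or None if the line does not match."""
    match = MULTIPLET.match(line)
    if match is None:
        return None
    members = [item.strip() for item in match.group(3).split(" + ")]
    return int(match.group(1)), int(match.group(2)), members
```

Applied to the second line above, it gives serial number 2 and four members, starting with "[3]...1>A:...3_:[..A]A".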

Among other outputs, there is also a file named 'multiplets.pdb' which contains the atomic coordinates of the corresponding multiplets, each oriented in its most-extended view. The base multiplets can be extracted with ex_str and then converted to .r3d format for Raster3D or PyMOL rendering (see also post "What can 3DNA do for RNA structures?" for more examples).

Code: [Select]

ex_str -1 multiplets.pdb oct.pdb
r3d_atom -od -r=0.1 -b=0.2 oct.pdb stdout | render -png > oct.png

ex_str -2 multiplets.pdb A-tetrad.pdb
r3d_atom -od -r=0.1 -b=0.2 A-tetrad.pdb stdout | render -png > A-tetrad.png

ex_str -3 multiplets.pdb G-tetrad.pdb
r3d_atom -od -r=0.1 -b=0.2 G-tetrad.pdb stdout | render -png > G-tetrad.png

The three png images ('oct.png', 'A-tetrad.png' and 'G-tetrad.png') as generated directly above are attached below.

1234

Feature requests / Re: chain continuation character in analyze

« on: August 09, 2012, 07:43:01 am »

Hi Pascal,

I've implemented the -chain_markers option to analyze as of 3DNA v2.1beta dated 2012aug09. As an example, run the following commands,

Code: [Select]

find_pair -s 1egk.pdb stdout | analyze -chain_markers='+|x*' stdin

You will see the portion below in output file '1egk.outs':

   1   (0.013) ....>A:...1_:[..A]A     +    
   2   (0.020) ....>A:...2_:[..G]G     |    
   3   (0.019) ....>A:...3_:[..G]G     |    
   4   (0.014) ....>A:...4_:[..A]A     |    
   5   (0.014) ....>A:...5_:[..G]G     |    
   6   (0.016) ....>A:...6_:[..A]A     |    
   7   (0.020) ....>A:...7_:[..G]G     |    
   8   (0.015) ....>A:...8_:[..A]A     |    
   9   (0.028) ....>A:...9_:[..G]G     |    
  10   (0.015) ....>A:..10_:[..A]A     |    
  11   (0.015) ....>A:..11_:[..U]U     |
  12   (0.022) ....>A:..12_:[..G]G     |
  13   (0.015) ....>A:..13_:[..G]G     |
  14   (0.021) ....>A:..14_:[..G]G     |
  15   (0.025) ....>A:..15_:[..U]U     |
  16   (0.016) ....>A:..16_:[..G]G     |
  17   (0.016) ....>A:..17_:[..C]C     |
  18   (0.016) ....>A:..18_:[..G]G     |
  19   (0.012) ....>A:..19_:[..A]A     |
  20   (0.017) ....>A:..20_:[..G]G     x
  21   (0.010) ....>B:..21_:[..C]C     +
  22   (0.018) ....>B:..22_:[..T]T     |
  23   (0.007) ....>B:..23_:[..C]C     |
  24   (0.016) ....>B:..24_:[..G]G     |
  25   (0.011) ....>B:..25_:[..C]C     |
  26   (0.013) ....>B:..26_:[..A]A     |
  27   (0.011) ....>B:..27_:[..C]C     |
  28   (0.006) ....>B:..28_:[..C]C     |
  29   (0.010) ....>B:..29_:[..C]C     |

As always, check it out and report back if that fits the bill.

Xiang-Jun

1235

Feature requests / Re: chain continuation character in analyze

« on: August 08, 2012, 12:00:46 pm »

I will think about the issue, and try to find a consistent way to handle find_pair in default and with the -s option, and streamline the output style between find_pair and analyze. I will post back in this thread once it is done.

The -pdbv3 option is globally set to TRUE, so it also applies to o1p_o2p.

HTH

Xiang-Jun

1236

Feature requests / Re: chain continuation character in analyze

« on: August 08, 2012, 09:38:32 am »

The -chain_markers option works only for duplexes (default for find_pair), not yet with the -s option for single-stranded (ss) structures. 3DNA analyze checks the O3'(i) to P(i+1) distance with a cut-off of 2.5 Å for chain breaks; only two characters are used: 'x' for a break, and '-' for a covalent bond. I am a bit hesitant to make the current settings more complicated; you are the first 3DNA user to notice such a small detail at all. Do you have a solid use case to convince me?
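
The distance check described above can be sketched in a few lines of Python. This is an illustration of the idea, not the code in analyze: the function name and input layout are hypothetical, and treating a distance of exactly 2.5 Å as bonded is an assumption.

```python
import math

CUTOFF = 2.5  # Å, O3'(i) to P(i+1) cut-off quoted in this post


def linkage_markers(o3_coords, p_coords, cutoff=CUTOFF):
    """Return '-' (covalent bond) or 'x' (break) for each i -> i+1 linkage.

    o3_coords[i] and p_coords[i] are the (x, y, z) coordinates of the O3'
    and P atoms of nucleotide i, listed in chain order.
    """
    markers = []
    for i in range(len(o3_coords) - 1):
        distance = math.dist(o3_coords[i], p_coords[i + 1])
        markers.append("-" if distance <= cutoff else "x")
    return markers
```

Note that `math.dist` requires Python 3.8 or later.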

In current 3DNA v2.1beta, no need to set -pdbv3; it's the default.

Xiang-Jun

1237

Feature requests / Re: chain continuation character in analyze

« on: August 06, 2012, 01:04:12 pm »

Please download the updated aug06 release of 3DNA v2.1beta. Now you can specify helix begin/continuation/end and isolated bp characters through option -chain_markers. For example,

Code: [Select]

find_pair 1egk.pdb stdout
    #  default as before
find_pair -chain_markers='o|x+' 1egk.pdb stdout
    #  with helix beginning character assigned to 'o'

The same option can also be applied to 'analyze'.

Also, as noted previously, the same strand P...P and C1'...C1' distances are now output in two decimals.

Have a try and report back how it goes.

Xiang-Jun

1238

Feature requests / Re: BI/BII issue

« on: August 04, 2012, 03:13:30 pm »

As noted in the -tor output file,

Code: [Select]

          e-z: epsilon - zeta
              BI:  e-z = [-160, +20]
              BII: e-z = [+20, +200]

The criteria are based on "Nucleic acid backbone parameters" at the Jena website. Do you have another definition?

Also, the BI/BII classification is purely backbone-based (epsilon and zeta torsion angles), not specific to "B-DNA" whose definition may require characterization of the base pair geometry.
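
For reference, the criteria listed above translate into a short Python sketch. It is only an illustration of the stated ranges, not the code used by 3DNA: the function name is hypothetical, the difference is wrapped into [-160, +200) so that the two ranges cover the full circle, and assigning the shared boundary value of +20 to BII is an arbitrary choice made here.

```python
def classify_bi_bii(epsilon: float, zeta: float):
    """Classify a backbone conformation from epsilon and zeta (in degrees).

    Returns a (label, e_minus_z) pair, with e_minus_z wrapped into
    [-160, +200): BI for [-160, +20), BII for [+20, +200).
    """
    e_minus_z = (epsilon - zeta + 160.0) % 360.0 - 160.0
    label = "BI" if e_minus_z < 20.0 else "BII"
    return label, e_minus_z
```

For example, epsilon = -169 and zeta = -108 give e-z = -61 (BI), whereas epsilon = -114 and zeta = 174 give a raw difference of -288, which wraps to +72 (BII).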

Xiang-Jun

1239

Feature requests / Re: chain continuation character in analyze

« on: August 04, 2012, 03:01:52 pm »

Quote

In the analyze file, you insert the '-' character when a strand is not broken and 'x' when its broken.
Then a new chain starts. May be you could add a third '+' character for these residues ?

Could you provide an example?

Quote

Also, for the same strand P...P and C1'...C1' distances, could you add two decimals instead of one?

Done -- the distribution will be updated after clarification of the above point.

Xiang-Jun

1240

General discussions (Q&As) / Re: how to bend a big DNA soomthly?

« on: July 27, 2012, 10:29:11 am »

3DNA per se does not provide a prescription "to bend a big DNA smoothly". How to choose roll angles to fit a smooth curve is problem specific; 3DNA is mechanical and rigorous in that it constructs a structure corresponding to the parameters you fed into "rebuild", or (reversibly) it can "analyze" a given DNA structure to provide parameters that fully describe its base geometry.

As mentioned previously, and made clear in the 2008 3DNA Nature Protocols paper, the various prescriptions of roll-introduced DNA bending in Figure 3 are based on the classic work of Calladine and Drew. That protocol was intended to illustrate 3DNA's capability of building structures, in schematic representation, based on user-supplied parameters, not to show how to derive roll angles for any desired DNA bending.

That said, 3DNA is handy for constructing and visualizing a DNA structure in three dimensions to help verify whether a roll prescription fulfills one's assumptions -- seeing is believing as well as understanding. For example, by noticing that the bent structure had a zigzag shape, you immediately realized that your (sinusoidal) roll parameters were not chosen correctly.

There are publications in the literature on how to fit a smooth ribbon to curved DNA. Pubmed or Google Scholar is your friend; it would help if you could share your findings.

HTH,

Xiang-Jun

1241

General discussions (Q&As) / Re: Analysis_assumptions!

« on: July 24, 2012, 02:41:05 pm »

The 3DNA program mutate_bases does exactly what its name suggests, i.e., it mutates DNA/RNA bases, using a purely geometric approach. It is up to the user (with any other desirable tools) to make sense of "structural and binding differences" between mutated and native structures.

Xiang-Jun

1242

General discussions (Q&As) / Re: mutate_bases error

« on: July 24, 2012, 02:35:49 pm »

Quote

mutate_bases 'c=Y, s=14, m=DC' 3DFV.pdb 3DFV_mut.pdb

Change comma to space, as below:

Code: [Select]

mutate_bases 'c=Y s=14 m=DC' 3DFV.pdb 3DFV_mut.pdb

Comma or semicolon is used to separate multiple mutations. I am updating the command-line help message to make this point clearer.
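
When the mutation list is generated by a script, the rule above (spaces between fields, a semicolon between mutations) can be captured in a small helper. The Python sketch below is illustrative only: the helper names are hypothetical, the second mutation in the usage note is made up, and joining with a bare ';' (no surrounding spaces) is an assumption.

```python
def mutation_spec(mutations):
    """Build the mutate_bases specification string.

    `mutations` is a list of (chain, residue_number, new_base) tuples.
    Fields within one mutation are separated by spaces; mutations are
    separated by a semicolon.
    """
    return ";".join(f"c={chain} s={seqnum} m={base}" for chain, seqnum, base in mutations)


def mutate_bases_command(mutations, pdb_in, pdb_out):
    """Return the argument list for running mutate_bases (nothing is executed)."""
    return ["mutate_bases", mutation_spec(mutations), pdb_in, pdb_out]
```

For instance, `mutation_spec([("Y", 14, "DC")])` reproduces the corrected string above, and passing the list from `mutate_bases_command` to `subprocess.run` avoids shell quoting issues.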

Xiang-Jun

1243

General discussions (Q&As) / Re: Question on hairpin-loop in PDB entry 3uzn

« on: July 20, 2012, 09:54:50 am »

Well, with the excerpt from find_pair output, I can see what nucleotides (nts) you are referring to. As always, it's the details that count.

Quote

I also thought you know what hairpin loop is, i don't know why, I'm sorry.

I know what a hairpin loop is, but I certainly did not see what you were referring to in your previous post. Thanks for your Wikipedia quote and link on "stem-loop"; however, I still miss your point as to how it is related to the helix (see attached figure) formed by nts 149-155 with 171-177 on chain A of PDB entry 3uzn.

So at this point, I have no comment on your initial question: "Can you say something about it?"

Quote

Oh, and I was using 3DNA ver 1.5.

I am glad to know that 3DNA v1.5, which was compiled (nearly) a decade ago, is still in use. As noted in "Download instructions", that version is obsolete, and "no longer supported". So it's time to upgrade to v2.1 (beta).

Xiang-Jun

PS. Note that I have once again split the posts on hairpin-loop from the original thread 'Helices and Isolates in output of find_pair' to make each one focused on a specific topic, and not too lengthy.

1244

General discussions (Q&As) / Re: Question on hairpin-loop in PDB entry 3uzn

« on: July 20, 2012, 08:14:11 am »

Could you provide reproducible details on how you got this "hairpin loop of length zero (between 155 and 156)"? Which version of 3DNA are you using? I did not see nucleotides 156-162 (chain A) at all in PDB entry 3uzn downloaded from the current RCSB website.

Xiang-Jun

1245

General discussions (Q&As) / Re: Meaning of base-pair id string e.g. 'U-+-G'**

« on: July 18, 2012, 11:13:04 am »

Thanks for your new question. I have split this post from the original thread 'Helices and Isolates in output of find_pair' to make each one focused on a specific topic, and not too lengthy.

I am sure I've answered this question before, maybe through private emails. I'll write a post on this issue on the 3DNA homepage soon [link added on 2012-07-30]. Here is the short answer to your question:

The general pattern of a base-pair id string is M-XYZ-N for bases M and N. Only when XYZ equals --- would M and N be a possible Watson-Crick pair. If Z is '+', the z-axes of the two bases point in the same direction and thus have a positive dot product. See my post "Hoogsteen and reverse Hoogsteen base pairs".
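
The M-XYZ-N pattern can be decoded mechanically, as in the following Python sketch. It encodes only the two rules stated above (XYZ equal to --- for a possible Watson-Crick pair, and Z equal to '+' for z-axes pointing in the same direction); the function name and the returned dictionary keys are hypothetical, and the meanings of X and Y are deliberately left uninterpreted.

```python
import re

# One-letter base, '-', three symbols X, Y, Z, '-', one-letter base
BP_ID = re.compile(r"^(\S)-(.)(.)(.)-(\S)$")


def parse_bp_id(bp_id: str) -> dict:
    """Split a base-pair id string such as 'G-----C' into its parts."""
    match = BP_ID.match(bp_id)
    if match is None:
        raise ValueError(f"not an M-XYZ-N string: {bp_id!r}")
    base_m, x, y, z, base_n = match.groups()
    xyz = x + y + z
    return {
        "M": base_m,
        "XYZ": xyz,
        "N": base_n,
        "possible_watson_crick": xyz == "---",
        "z_axes_same_direction": z == "+",
    }
```

With this, 'G-----C' is flagged as a possible Watson-Crick pair, while 'G-**+-G' is not and has its two z-axes pointing in the same direction.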

HTH,

Xiang-Jun

1246

General discussions (Q&As) / Re: Helices and Isolates in output of find_pair

« on: July 16, 2012, 09:47:15 am »

Thanks for providing a PDB list where find_pair cannot properly assign certain helices. It'd be more helpful if you could provide specific problematic helices, for at least some PDB entries, as you did for 1z58.

Best regards,

Xiang-Jun

1247

General discussions (Q&As) / Re: Helices and Isolates in output of find_pair

« on: July 15, 2012, 09:36:07 am »

Thanks for providing detailed info about a mis-assigned helix by find_pair. As shown in the attached image, in PDB entry 1z58, nucleotides 336-338 and 346-348 should indeed be assigned into a helix. While the helix assignment algorithm of find_pair works elegantly for the majority of cases, it is clearly not sophisticated enough to properly handle complicated structures such as 1z58 (the large ribosomal subunit from the eubacterium Deinococcus radiodurans): if you pay close attention to the output from find_pair, you will see warning messages in such cases.

I'm interested in refining 3DNA on those complicated structures, and your reported example is a concrete case to start with. Do you have more examples? The more varied and detailed the cases I have, the easier it is to test find_pair against them, and the more 3DNA can work for you in the end.

Xiang-Jun

1248

General discussions (Q&As) / Re: Helices and Isolates in output of find_pair

« on: July 13, 2012, 05:28:10 pm »

For model #5 of PDB entry 1aju, let's store its coordinates in file 1aju-m5.pdb. Run

Code: [Select]

find_pair 1aju-m5.pdb 1aju-m5.bps

you get 1aju-m5.bps, with the following content:

1aju-m5.pdb
1aju-m5.out
    2         # duplex
   13         # number of base-pairs
    1    1    # explicit bp numbering/hetero atoms
    1   30  0 #    1 | ...5>A:..16_:[..G]G-----C[..C]:..46_:A<...5  1.26  0.86 43.33  8.60  0.14
    2   29  0 #    2 | ...5>A:..17_:[..G]G-----C[..C]:..45_:A<...5  1.05  0.14 21.15  8.79 -2.62
    3   28  0 #    3 | ...5>A:..18_:[..C]C-----G[..G]:..44_:A<...5  0.47  0.27 13.69  9.09 -3.30
    4   27  0 #    4 | ...5>A:..19_:[..C]C-----G[..G]:..43_:A<...5  0.55  0.43 20.29  9.06 -2.56
    5   26  0 #    5 | ...5>A:..20_:[..A]A-----U[..U]:..42_:A<...5  1.04  1.00 16.36  8.68 -1.15
    6   25  0 #    6 | ...5>A:..21_:[..G]G-----C[..C]:..41_:A<...5  1.01  0.44  9.82  8.78 -2.61
    7   24  0 #    7 | ...5>A:..22_:[..A]A-----U[..U]:..40_:A<...5  0.88  0.75 23.30  8.87 -1.45
   10   23  0 #    8 | ...5>A:..26_:[..G]G-----C[..C]:..39_:A<...5  1.16  0.44 20.37  8.79 -1.94
   11   22  0 #    9 | ...5>A:..27_:[..A]A-----U[..U]:..38_:A<...5  1.09  1.03 16.74  8.87 -1.00
   12   21  0 #   10 | ...5>A:..28_:[..G]G-----C[..C]:..37_:A<...5  0.94  0.83 22.27  8.99 -1.29
   13   20  9 #   11 x ...5>A:..29_:[..C]C-----G[..G]:..36_:A<...5  1.14  0.55 24.72  8.76 -1.53
   15   18  1 #   12 + ...5>A:..31_:[..U]U-**--G[..G]:..34_:A<...5  4.82  2.03 40.01  7.97  7.88
   16   17  1 #   13 + ...5>A:..32_:[..G]G-**+-G[..G]:..33_:A<...5  6.66  0.32 62.30  6.72  9.42
##### Base-pair criteria used:     4.00     0.00    15.00     2.50    65.00     4.50     7.50 [ O N]
##### 2 non-Watson-Crick base-pairs, and 3 helices (2 isolated bps)
##### Helix #1 (11): 1 - 11  ***broken O3'(i) to P(i+1) linkage***
##### Helix #2 (1): 12
##### Helix #3 (1): 13

The helix continues up to base pair (bp) "A:..29_:[..C]C-----G[..G]:..36_:A" (#11). The next two bps are isolated, i.e., not part of a continuous helix formed by bps from 1 to 11. Please see the attached blocview-image showing nucleotides 28 to 37, with green for G, cyan for U, red for A, and yellow for C.

Regarding your question,

Quote

When bp is represented with bigger number first (like 20-9) does it mean anything?

As shown above, the numerical values for bps (the left two columns) from find_pair are nucleotide sequential numbers as they appear in the input PDB file. What do you mean by "does it mean anything?" As always, please be specific, using an example to illustrate your point.
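
To make the distinction between sequential numbers and PDB residue numbers concrete, here is a minimal Python sketch that pulls both out of one base-pair line. The regular expression and the name `parse_bp_line` are my own assumptions, based only on the layout of the listing above; they are not part of 3DNA, and other versions may format the line differently.

```python
import re

# Assumed layout of one base-pair line in the .bps file (illustration only):
#   i  j  flag #  bp-number  marker  model>chain:resnum_:[res]M-XYZ-N[res]:resnum_:chain<model ...
BP_LINE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+\d+\s+#\s+(\d+)\s+[|x+]\s+"
    r"\S*?:\.*(-?\d+).:\[.{3}\].{7}\[.{3}\]:\.*(-?\d+).:"
)


def parse_bp_line(line):
    """Return sequential and residue numbers for one bp line, or None."""
    match = BP_LINE.match(line)
    if match is None:
        return None  # header, comment, or a layout this sketch does not know
    serial_i, serial_j, bp_number, resnum_i, resnum_j = map(int, match.groups())
    return {
        "serial_i": serial_i,    # left column: sequential number in the PDB file
        "serial_j": serial_j,    # second column: sequential number of the partner
        "bp_number": bp_number,  # running base-pair number after '#'
        "resnum_i": resnum_i,    # residue number as written in the PDB file
        "resnum_j": resnum_j,
    }


line = ("   13   20  9 #   11 x ...5>A:..29_:[..C]C-----G[..G]:..36_:A<...5"
        "  1.14  0.55 24.72  8.76 -1.53")
print(parse_bp_line(line))
```

For this line, the sequential numbers are 13 and 20, while the residue numbers in the PDB file are 29 and 36.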

HTH,

Xiang-Jun

1249

General discussions (Q&As) / Re: Helices and Isolates in output of find_pair

« on: July 13, 2012, 11:09:37 am »

Thanks for using 3DNA. In 'find_pair' output, '+' marks an isolated base pair (bp), i.e., a bp not in a helical context. '|' means the bp is part of a helix, and 'x' means the helix breaks at that bp.
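
A minimal Python sketch of how these markers could be read back into helices is given below. It is only my reading of the three symbols ('|' continues the current helix, 'x' is the last bp of the current helix, '+' stands alone), not the algorithm find_pair itself uses, and the name `group_helices` is hypothetical.

```python
def group_helices(markers):
    """Group 1-based base-pair numbers into helices from their markers.

    Assumed reading (illustration only): '|' continues the current helix,
    'x' is the last bp of the current helix, '+' is an isolated bp.
    """
    helices, current = [], []
    for number, marker in enumerate(markers, start=1):
        if marker == "+":
            if current:  # close any helix left open before the isolated bp
                helices.append(current)
                current = []
            helices.append([number])
        elif marker in ("|", "x"):
            current.append(number)
            if marker == "x":  # the helix breaks at this bp
                helices.append(current)
                current = []
        else:
            raise ValueError(f"unknown marker {marker!r} at bp {number}")
    if current:  # a helix that runs to the end of the list
        helices.append(current)
    return helices


# Ten '|' markers, one 'x', then two '+' markers.
markers = ["|"] * 10 + ["x", "+", "+"]
print(group_helices(markers))
```

With these thirteen markers the sketch yields one helix of bps 1 to 11 followed by two isolated bps, 12 and 13.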

It would help if you provided an example; our discussion would then be more specific.

Xiang-Jun

1250

General discussions (Q&As) / Re: How to bend DNA in a protein-DNA complex

« on: July 12, 2012, 07:03:50 pm »

Quote

is there any way to incorporate protein information in the bp_step.par file such that the whole complex undergoes bend.

No. The rebuilding process in 3DNA is purely geometric, and it does not handle proteins explicitly. If you have a DNA-protein complex to start with, bending the DNA will most likely cause steric clashes.

Xiang-Jun


Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University
