
Messages - xiangjun

1126
Quote
- The default O3'-P distance is 4.5A. From the ADIT server of the PDB, the expected value is 1.6A. What is in fact the range of the possible values?
The O3'-P covalent bond distance is ~1.6 A, as you noticed on the ADIT server of the PDB. The 3DNA default of 4.5 A is purely an empirical value used to decide whether to include the bond in a corresponding CONECT record in the rebuilt PDB file; it has no 'chemical' meaning. Have a look at the build.pdb file using a text editor, and you will see my point.

Quote
- Could you indicate some tools allowing to do energy minimization?
AMBER should help. Some other MD packages or Phenix should also do the trick. Yet, I have so far failed to find a command-line tool that can 'regularize' the backbone to a reasonable geometry while keeping the base atoms fixed.

HTH,

Xiang-Jun


1127
MD simulations / Re: analyzing longer DNA sequences
« on: November 21, 2012, 11:05:56 am »
Quote
I am not sure if this is too much to ask you.
No, it's not; I always welcome user questions such as this one, and I strive to be as helpful as I can.

Now I see the problem you are experiencing. Strictly speaking, and as I mentioned in my previous reply, it's not a 3DNA problem but one at the interface between 3DNA and Curves+. Since the purpose of providing the find_pair -c+ option is to build a bridge between the two commonly used software programs for analyzing nucleic acid structures, I'd like to dig into the issue further to see if anything can be done from 3DNA's perspective.

Your Curves+ input file curves.inp, as generated with find_pair -c+, has the following content:
&inp file=sel.pdb,
     lis=sel,
     fit=.t.,
     lib=./standard,
     isym=1,
&end
    2    1   -1    0    0
    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24
   47   46   45   44   43   42   41   40   39   38   37   36   35   34   33   32   31   30   29   28   27   26   25


which has 23 base pairs (note that bases 1 and 48 are not paired). Yet when the file is fed to Curves+, only the first 15 bps are recognized.
Quote
Combined strands have   15 levels ...

  Strand  1 has  15 bases (5'-3'): GTGTGAGCGTGGGCG
  Strand  2 has  15 bases (3'-5'): CACACTCGCACCCGC

To help solve the problem, could you try the following and report back (in detail) what you get?
  • Instead of 23 bps, shorten the list to < 15, say 10, as below:
        2    3    4    5    6    7    8    9   10   11
       47   46   45   44   43   42   41   40   39   38
    Run Curves+ on it again, do you get what you expect?
  • Since the nucleotide numbers are continuous, you can use the short-hand form to specify paired bases:
        2:24
       47:25
    Run Curves+, what do you get?
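
The two tests above only differ in how the strand lists are written. As a minimal sketch in Python (the helper name `strand_line` is hypothetical and is not part of 3DNA or Curves+), the snippet below writes a list of nucleotide numbers either in full or in the `start:end` short-hand when the numbers are continuous:

```python
def strand_line(numbers, shorthand=True):
    """Format the nucleotide numbers of one strand for curves.inp.

    Hypothetical helper: the 'start:end' short-hand is used only when the
    numbers form a continuous ascending or descending run.
    """
    steps = {b - a for a, b in zip(numbers, numbers[1:])}
    if shorthand and len(numbers) > 1 and steps in ({1}, {-1}):
        return f"{numbers[0]}:{numbers[-1]}"
    # Otherwise list every number in 5-character columns, as in the listing above
    return "".join(f"{n:5d}" for n in numbers)


strand1 = list(range(2, 25))        # 2, 3, ..., 24
strand2 = list(range(47, 24, -1))   # 47, 46, ..., 25
print(strand_line(strand1))         # 2:24
print(strand_line(strand2))         # 47:25
print(strand_line(strand1[:10], shorthand=False))
```

Passing `shorthand=False` with the first ten numbers reproduces the shortened list of the first test.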

Xiang-Jun

1128
Could you be more specific, preferably with an example (or reference), to illustrate exactly what you want to achieve?

Xiang-Jun

1129
Hi Damien,

The fact that "O3'-P distance too long after reconstruction" is to be expected; 3DNA-built structures are accurate for the bases, with only an approximate sugar-phosphate backbone. See the FAQ "How do I build nucleic acid structures with sugar-phosphate backbone?"

For cases where the O3'(i)--P(i+1) distance is longer than the default 4.5 A, the 3DNA rebuild program outputs an info message as below (using your example, and with a B-DNA backbone conformation):

Code:
O3' (#317) and P (#331) on chain A have distance 5.3 over 4.5: no linkage assigned
O3' (#768) and P (#782) on chain B have distance 5.2 over 4.5: no linkage assigned

This approximate structure may serve as a starting point for energy minimization.
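
To see which linkages are affected in any PDB file, the same check can be sketched outside of 3DNA. The Python function below is only an illustration under stated assumptions (fixed-column ATOM/HETATM records, residues listed in 5' to 3' order within each chain); the name `long_o3p_links` is made up, and this is not how rebuild itself is implemented:

```python
import math

CUTOFF = 4.5  # Angstrom; the empirical 3DNA default mentioned above


def long_o3p_links(pdb_lines, cutoff=CUTOFF):
    """Return (chain, residue_i, residue_next, distance) for each
    O3'(i)--P(i+1) distance that exceeds the cutoff."""
    last_o3 = {}  # chain ID -> (residue number, xyz) of the most recent O3'
    found = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()
        chain = line[21]
        resseq = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if name == "P" and chain in last_o3 and last_o3[chain][0] != resseq:
            prev_res, prev_xyz = last_o3[chain]
            dist = math.dist(prev_xyz, xyz)
            if dist > cutoff:
                found.append((chain, prev_res, resseq, round(dist, 1)))
        elif name == "O3'":
            last_o3[chain] = (resseq, xyz)
    return found
```

Feeding it the lines of a rebuilt file should list the same O3'/P pairs that rebuild reports, although here they are identified by residue number rather than by atom serial number.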

Xiang-Jun
 

1130
MD simulations / Re: analyzing longer DNA sequences
« on: November 20, 2012, 04:29:58 pm »
Hi Shyno,

Thanks for providing details of the commands you used and attaching three relevant files. However, I fail to see what is wrong here; as far as I understand, things are working as expected.

With the -c+ option, you get what's desired, as in your attached curves.inp file. Running find_pair in its default settings on your PDB file sel.pdb gives expected results:
find_pair sel.pdb stdout
sel.pdb
sel.out
    2         # duplex
   23         # number of base-pairs
    1    1    # explicit bp numbering/hetero atoms
    2   47  0 #    1 | ....>A:...2_:[..G]G-----C[..C]:..48_:B<....  0.77  0.76 28.01  8.85 -1.31
    3   46  0 #    2 | ....>A:...3_:[..T]T-----A[..A]:..47_:B<....  0.32  0.02 11.50  9.26 -4.07
    4   45  0 #    3 | ....>A:...4_:[..G]G-----C[..C]:..46_:B<....  0.83  0.64  4.42  9.11 -2.68
    5   44  0 #    4 | ....>A:...5_:[..T]T-----A[..A]:..45_:B<....  0.54  0.32 21.12  8.91 -2.76
    6   43  0 #    5 | ....>A:...6_:[..G]G-----C[..C]:..44_:B<....  0.45  0.27 27.61  8.88 -2.64
    7   42  0 #    6 | ....>A:...7_:[..A]A-**+-T[..T]:..43_:B<....  3.71  0.96 28.75  7.04  4.07
    8   41  0 #    7 | ....>A:...8_:[..G]G-----C[..C]:..42_:B<....  0.45  0.32  3.39  8.97 -3.75
    9   40  0 #    8 | ....>A:...9_:[..C]C-----G[..G]:..41_:B<....  1.05  0.47  8.17  8.82 -2.61
   10   39  0 #    9 | ....>A:..10_:[..G]G-----C[..C]:..40_:B<....  0.76  0.50 21.19  8.86 -2.17
   11   38  0 #   10 | ....>A:..11_:[..T]T-----A[..A]:..39_:B<....  0.34  0.27  5.44  9.01 -3.84
   12   37  0 #   11 | ....>A:..12_:[..G]G-----C[..C]:..38_:B<....  0.58  0.48 12.17  9.02 -2.85
   13   36  0 #   12 | ....>A:..13_:[..G]G-----C[..C]:..37_:B<....  0.24  0.03 16.54  9.08 -3.88
   14   35  0 #   13 | ....>A:..14_:[..G]G-----C[..C]:..36_:B<....  0.69  0.69 14.12  9.14 -2.22
   15   34  0 #   14 | ....>A:..15_:[..C]C-----G[..G]:..35_:B<....  0.41  0.09 14.35  9.02 -3.70
   16   33  0 #   15 | ....>A:..16_:[..G]G-----C[..C]:..34_:B<....  0.40  0.39 26.56  9.03 -2.49
   17   32  0 #   16 | ....>A:..17_:[..T]T-----A[..A]:..33_:B<....  0.70  0.62  8.01  9.49 -2.65
   18   31  0 #   17 | ....>A:..18_:[..A]A-----T[..T]:..32_:B<....  0.26  0.21 18.44  8.96 -3.41
   19   30  0 #   18 | ....>A:..19_:[..C]C-----G[..G]:..31_:B<....  0.26  0.09 25.81  9.27 -3.27
   20   29  0 #   19 | ....>A:..20_:[..A]A-----T[..T]:..30_:B<....  0.63  0.21 19.54  8.96 -2.98
   21   28  0 #   20 | ....>A:..21_:[..C]C-----G[..G]:..29_:B<....  0.57  0.33  6.64  9.06 -3.43
   22   27  0 #   21 | ....>A:..22_:[..A]A-----T[..T]:..28_:B<....  1.09  1.01 10.89  8.63 -1.34
   23   26  0 #   22 | ....>A:..23_:[..C]C-----G[..G]:..27_:B<....  0.66  0.37 28.93  9.04 -2.16
   24   25  0 #   23 | ....>A:..24_:[..A]A-----T[..T]:..26_:B<....  2.72  2.29 25.44  7.75  5.57
##### Base-pair criteria used:     4.00     0.00    15.00     2.50    65.00     4.50     7.50 [ O N]
##### 1 non-Watson-Crick base-pair, and 1 helix (0 isolated bps)
##### Helix #1 (23): 1 - 23
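
For anyone who wants to check the pairing list programmatically, here is a small Python sketch that pulls the paired nucleotide numbers out of output like the above. It assumes the layout shown here (two integers, a flag, then '#', a running index and a '|' separator); `parse_find_pair` is a hypothetical name, not a 3DNA utility:

```python
def parse_find_pair(text):
    """Return [(i, j), ...] of paired nucleotide numbers from find_pair output."""
    pairs = []
    for line in text.splitlines():
        # Base-pair lines are the only ones with both '#' and '|'
        if "#" not in line or "|" not in line or line.startswith("#####"):
            continue
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            pairs.append((int(fields[0]), int(fields[1])))
    return pairs


sample = """\
    2         # duplex
   23         # number of base-pairs
    1    1    # explicit bp numbering/hetero atoms
    2   47  0 #    1 | ....>A:...2_:[..G]G-----C[..C]:..48_:B<....  0.77  0.76 28.01  8.85 -1.31
    3   46  0 #    2 | ....>A:...3_:[..T]T-----A[..A]:..47_:B<....  0.32  0.02 11.50  9.26 -4.07
##### Helix #1 (23): 1 - 23
"""
print(parse_find_pair(sample))   # [(2, 47), (3, 46)]
```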

Certainly, find_pair is working properly, as designed. Is there anything I am missing here?

Xiang-Jun


1131
MD simulations / Re: analyzing longer DNA sequences
« on: November 20, 2012, 02:22:59 pm »
Hi Shyno,

DNAs with 62 base pairs should not be a problem for 3DNA to analyze; specifically, find_pair and other 3DNA components have no default bp limit other than your computer's memory.

As always, please be specific by providing a reproducible example; that will help solve your problem.

Xiang-Jun

1132
Thanks for posting your problem on installing 3DNA. Am I right to assume that your system is Windows with MinGW/MSYS? Could you be more specific about your Windows system -- XP, Vista or Windows 7? Is it 32 bit or 64 bit?

Xiang-Jun

1133
General discussions (Q&As) / Re: different geometries
« on: November 06, 2012, 10:18:50 am »
Could you be more specific by providing an example to put your question in context?

Xiang-Jun

1134
General discussions (Q&As) / Re: DNA elastic parameters from NDB database
« on: November 05, 2012, 11:32:27 am »
No, these new values are not included in 3DNA. You're advised to contact the authors of the 2009 Balasubramanian paper for further details.

For those who are interested in the original Olson et al. 1998 dataset, please check the post "Some details on PNAS98 DNA sequence-dependent deformability".

Xiang-Jun

1135
w3DNA -- web interface to 3DNA / Re: server down?
« on: November 03, 2012, 12:28:40 pm »
Hi Ludo,

Thanks for reporting the w3DNA server problem. It is indeed down right now, possibly due to Hurricane Sandy. I've informed those in charge at Rutgers, and hopefully w3DNA will be back soon.

Xiang-Jun

1136
General discussions (Q&As) / Re: different geometries
« on: October 29, 2012, 04:12:43 pm »
It's closer, but still not a proper PDB format -- see the thread "how the cartesian coordinates transform to PDB format" for details.

Given the fact that this issue has popped up quite a few times over the years, I may consider adding support in 3DNA to make such conversions more straightforward.

Xiang-Jun

1137
General discussions (Q&As) / Re: different geometries
« on: October 29, 2012, 01:06:52 pm »
Thanks for posting back an example molden file, which helps clarify the issue. No, that's not a format recognized by 3DNA programs. Currently, 3DNA only accepts files in PDB format (coordinate record descriptions) as documented on the RCSB website. That said, it's feasible to convert your molden file via a purpose-oriented script or third-party utility -- a Google search may help.
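
Whatever reads the molden file, the part that usually goes wrong is the fixed-column layout of the output. As a rough sketch of the writing side only (reading the molden file is left out on purpose, and `pdb_atom_line` is a hypothetical helper), one ATOM record can be formatted in Python like this; occupancy and B-factor are filled with the placeholder values 1.00 and 0.00:

```python
def pdb_atom_line(serial, name, resname, chain, resseq, x, y, z, element):
    """Format one ATOM record using the fixed columns of the PDB format."""
    # Atom names shorter than 4 characters with a one-letter element start in column 14
    if len(name) < 4 and len(element) == 1:
        name = " " + name
    return (
        f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain:1s}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {element:>2s}"
    )


print(pdb_atom_line(1, "C1'", "5CM", "A", 1, -2.477, 5.402, 0.0, "C"))
```

The residue name, chain ID and residue number have to come from your own knowledge of the system, since plain Cartesian coordinates do not carry them.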

Xiang-Jun

1138
As a follow up, as of 3DNA v2.1 2012oct26, your initial problem has been solved. Now you can have for example only Atomic_A.pdb in your current working directory, and 3DNA will use that file; for the Atomic*.pdb files not found in the current directory, 3DNA will use the default in $X3DNA/config. Overall, this revision allows 3DNA users greater flexibility in choosing the standard base-reference-frame files for analysis and rebuilding.
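
The lookup order described above can be pictured with a few lines of Python. This is only a sketch of the behaviour, not taken from the 3DNA source code, and the function name `locate_atomic_file` is made up for illustration:

```python
import os


def locate_atomic_file(basename, cwd=".", x3dna_home=None):
    """Return the path of a standard-base file, preferring the working directory.

    Illustrative only: each file is checked on its own, so a missing file
    in the working directory falls back to $X3DNA/config.
    """
    local = os.path.join(cwd, basename)
    if os.path.isfile(local):
        return local
    home = x3dna_home or os.environ.get("X3DNA", "")
    return os.path.join(home, "config", basename)
```

With only Atomic_A.pdb in the working directory, a call for "Atomic_A.pdb" returns the local copy, while a call for "Atomic.g.pdb" returns the one under the config directory.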

Xiang-Jun

1139
General discussions (Q&As) / Re: different geometries
« on: October 26, 2012, 01:37:31 pm »
Thanks for using 3DNA and posting your question on the forum!

Could you be more specific by providing a concrete example to show us what a molden file looks like?

Xiang-Jun

1140
FAQs / How can I mutate cytosine to 5-methylcytosine
« on: October 26, 2012, 01:23:14 pm »
Methylation of cytosines in DNA is a crucial epigenetic modification that regulates expression of many genes. Chemically, it is the addition of a methyl group to the 5 position of cytosine (C).

The mutate_bases program in 3DNA v2.x performs in silico base mutations given a nucleic acid structure in PDB format. It is not a problem to mutate any C to a 5-methylcytosine (5CM) provided that users supply a 5CM in its standard base reference frame. Given the importance of 5CM in epigenetics and the increasing number of simulation studies to understand its effects, I have included Atomic_5CM.pdb in the 3DNA v2.1 distribution as of 2012oct26.

According to PDB, the three-letter nucleotide name for 5-methylcytosine is 5CM instead of 5MC -- see for example PDB entries 4mht and 2uz4. The methyl carbon is named " C5A" instead of " C5M" or " C7 ". Thus, the content of the Atomic_5CM.pdb file is:

Code:
REMARK    3DNA by Dr. Xiang-Jun Lu [2012-10-26] (xiangjun@x3dna.org)
ATOM      1  C1' 5CM A   1      -2.477   5.402   0.000  1.00  0.00           C 
ATOM      2  N1  5CM A   1      -1.285   4.542   0.000  1.00  0.00           N 
ATOM      3  C2  5CM A   1      -1.472   3.158   0.000  1.00  0.00           C 
ATOM      4  O2  5CM A   1      -2.628   2.709   0.001  1.00  0.00           O 
ATOM      5  N3  5CM A   1      -0.391   2.344   0.000  1.00  0.00           N 
ATOM      6  C4  5CM A   1       0.837   2.868   0.000  1.00  0.00           C 
ATOM      7  N4  5CM A   1       1.875   2.027   0.001  1.00  0.00           N 
ATOM      8  C5  5CM A   1       1.056   4.275   0.000  1.00  0.00           C 
ATOM      9  C5A 5CM A   1       2.466   4.961   0.001  1.00  0.00           C 
ATOM     10  C6  5CM A   1      -0.023   5.068   0.000  1.00  0.00           C 
END

With this new addition, it is now very straightforward to mutate Cs to 5CMs with mutate_bases, as illustrated by the following two examples:

  • Mutate C1 on chain A and C23 on chain B of the Dickerson B-DNA dodecamer (PDB entry 355d) to 5CMs:
    mutate_bases 'chain=A snum=1 m=5CM; chain=B snum=23 m=5CM' 355d.pdb 355d_AC1BC23_5CM.pdb
  • Mutate C2 on chain A of the yeast phenylalanine tRNA (PDB entry 6tna) to 5CM:
    mutate_bases 'chain=A snum=2 name=C m=5CM' 6tna.pdb 6tna_C2_5CM.pdb
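
When many cytosines need to be mutated, typing the specification string by hand gets error-prone. Here is a minimal Python sketch that assembles it from a list, using only the `chain=`, `snum=` and `m=` keys that appear in the first example above; the helper name `mutation_spec` is hypothetical:

```python
def mutation_spec(mutations):
    """Join (chain, snum, new_base) tuples into one mutate_bases specification."""
    return "; ".join(f"chain={c} snum={n} m={m}" for c, n, m in mutations)


spec = mutation_spec([("A", 1, "5CM"), ("B", 23, "5CM")])
print(spec)  # chain=A snum=1 m=5CM; chain=B snum=23 m=5CM

# Argument list equivalent to the first command-line example above
command = ["mutate_bases", spec, "355d.pdb", "355d_AC1BC23_5CM.pdb"]
```

With 3DNA installed and on the PATH, `command` could then be passed to `subprocess.run(command, check=True)`.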

The mutated files 355d_AC1BC23_5CM.pdb and 6tna_C2_5CM.pdb are attached for your verification. For comparison, shown below are the original atomic coordinates of the above tRNA 6tna cytosine, followed by the coordinates of its 5CM mutant. Note that the coordinates of the backbone atoms are the same, and coordinates of common base atoms are very close.

  • The original atomic coordinates of a cytosine from PDB entry 6tna:
    --------------------------------------------------------------------------------
    ATOM     25  P     C A   2      31.659  20.469  70.978  1.00 10.00           P 
    ATOM     26  OP1   C A   2      32.973  21.044  71.364  1.00 10.00           O 
    ATOM     27  OP2   C A   2      30.973  21.143  69.849  1.00 10.00           O 
    ATOM     28  O5'   C A   2      31.815  18.912  70.652  1.00 10.00           O 
    ATOM     29  C5'   C A   2      30.629  18.184  70.293  1.00 10.00           C 
    ATOM     30  C4'   C A   2      30.507  16.914  71.139  1.00 10.00           C 
    ATOM     31  O4'   C A   2      29.293  17.051  71.947  1.00 10.00           O 
    ATOM     32  C3'   C A   2      30.455  15.607  70.367  1.00 10.00           C 
    ATOM     33  O3'   C A   2      31.724  14.971  70.316  1.00 10.00           O 
    ATOM     34  C2'   C A   2      29.411  14.815  71.146  1.00 10.00           C 
    ATOM     35  O2'   C A   2      29.987  14.227  72.301  1.00 10.00           O 
    ATOM     36  C1'   C A   2      28.473  15.927  71.630  1.00 10.00           C 
    ATOM     37  N1    C A   2      27.474  16.346  70.621  1.00 10.00           N 
    ATOM     38  C2    C A   2      26.658  15.368  70.068  1.00 10.00           C 
    ATOM     39  O2    C A   2      26.802  14.198  70.441  1.00 10.00           O 
    ATOM     40  N3    C A   2      25.726  15.730  69.143  1.00 10.00           N 
    ATOM     41  C4    C A   2      25.601  17.008  68.767  1.00 10.00           C 
    ATOM     42  N4    C A   2      24.682  17.314  67.872  1.00 10.00           N 
    ATOM     43  C5    C A   2      26.436  18.041  69.324  1.00 10.00           C 
    ATOM     44  C6    C A   2      27.351  17.658  70.243  1.00 10.00           C 
    --------------------------------------------------------------------------------
  • The coordinates of the mutant 5-methylcytosine generated by 'mutate_bases':
    REMARK    Mutation#1 A:...2@:[..C] to [5CM]
    ATOM     25  P   5CM A   2      31.659  20.469  70.978  1.00 10.00           P 
    ATOM     26  OP1 5CM A   2      32.973  21.044  71.364  1.00 10.00           O 
    ATOM     27  OP2 5CM A   2      30.973  21.143  69.849  1.00 10.00           O 
    ATOM     28  O5' 5CM A   2      31.815  18.912  70.652  1.00 10.00           O 
    ATOM     29  C5' 5CM A   2      30.629  18.184  70.293  1.00 10.00           C 
    ATOM     30  C4' 5CM A   2      30.507  16.914  71.139  1.00 10.00           C 
    ATOM     31  O4' 5CM A   2      29.293  17.051  71.947  1.00 10.00           O 
    ATOM     32  C3' 5CM A   2      30.455  15.607  70.367  1.00 10.00           C 
    ATOM     33  O3' 5CM A   2      31.724  14.971  70.316  1.00 10.00           O 
    ATOM     34  C2' 5CM A   2      29.411  14.815  71.146  1.00 10.00           C 
    ATOM     35  O2' 5CM A   2      29.987  14.227  72.301  1.00 10.00           O 
    ATOM     36  C1' 5CM A   2      28.473  15.927  71.630  1.00 10.00           C 
    ATOM     37  N1  5CM A   2      27.475  16.353  70.620  1.00  1.00           N 
    ATOM     38  C2  5CM A   2      26.651  15.372  70.062  1.00  1.00           C 
    ATOM     39  O2  5CM A   2      26.789  14.195  70.427  1.00  1.00           O 
    ATOM     40  N3  5CM A   2      25.726  15.730  69.141  1.00  1.00           N 
    ATOM     41  C4  5CM A   2      25.610  17.009  68.776  1.00  1.00           C 
    ATOM     42  N4  5CM A   2      24.685  17.316  67.863  1.00  1.00           N 
    ATOM     43  C5  5CM A   2      26.437  18.028  69.328  1.00  1.00           C 
    ATOM     44  C5A 5CM A   2      26.359  19.547  68.947  1.00  1.00           C 
    ATOM     45  C6  5CM A   2      27.347  17.660  70.238  1.00  1.00           C 
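
To put a number on "very close", the shifts of the eight common base atoms can be computed directly from the two listings above. The short Python sketch below just copies those coordinates and reports the largest displacement:

```python
import math

# Base atoms of C2 on chain A in 6tna (original listing)
original = {
    "N1": (27.474, 16.346, 70.621), "C2": (26.658, 15.368, 70.068),
    "O2": (26.802, 14.198, 70.441), "N3": (25.726, 15.730, 69.143),
    "C4": (25.601, 17.008, 68.767), "N4": (24.682, 17.314, 67.872),
    "C5": (26.436, 18.041, 69.324), "C6": (27.351, 17.658, 70.243),
}
# The same atoms in the 5CM mutant (mutate_bases listing)
mutant = {
    "N1": (27.475, 16.353, 70.620), "C2": (26.651, 15.372, 70.062),
    "O2": (26.789, 14.195, 70.427), "N3": (25.726, 15.730, 69.141),
    "C4": (25.610, 17.009, 68.776), "N4": (24.685, 17.316, 67.863),
    "C5": (26.437, 18.028, 69.328), "C6": (27.347, 17.660, 70.238),
}

deviations = {name: math.dist(original[name], mutant[name]) for name in original}
worst = max(deviations, key=deviations.get)
print(f"largest shift: {worst} {deviations[worst]:.3f} A")
```

All eight shifts come out well below 0.05 A.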

1141
Hi Mauricio,

Thanks for your report. I can reproduce the 'problem'; it has nothing to do with including $X3DNA/config in the environment settings.

The error message:
Code:
open_file <./Atomic.g.pdb> failed: No such file or directory
means 3DNA (find_pair) is trying to locate the file in your current working directory (CWD), instead of the default system setting $X3DNA/config where the file Atomic.g.pdb is available, among other Atomic*.pdb files.

If file Atomic_A.pdb exists in your CWD, 3DNA assumes all other Atomic*.pdb files there as well. That's normally true if you run, e.g.:
Code:
x3dna_utils cp_std bdna
In your case, you must have Atomic_A.pdb and possibly several other canonical base Atomic_[CGTU].pdb files in your CWD, but not the modified versions (in lower case, prefixed with a dot instead of an underscore). So simply running the following command in your CWD will do the trick:

Code:
cp -f $X3DNA/config/Atomic.*.pdb .
Alternatively, you can delete all the Atomic*.pdb files from your CWD, and find_pair will work as expected:

Code:
rm -f Atomic*.pdb
find_pair 1ehz.pdb stdout

Please have a try and report back how it goes.

I may consider refining 3DNA to check for each Atomic*.pdb file separately, but that would complicate the code. You are actually the first to notice this 'limitation'. Practically, knowing what's happening behind the scenes, you can easily work around it, as suggested above.

HTH,

Xiang-Jun


1142
Thanks for reading the "technical details" -- you are one of the very few users I am aware of who actually work through the examples.

Regarding your specific question, you are right in noticing that the x- and y-axes (the first two columns) of Rm are not the raw average of the corresponding columns of R1' and R2':
Quote
(-0.6616 -0.1982)/2 =-0.4299, which was different to the given value -0.4490
Here Rm, as a proper rotation matrix, is orthonormal; its x-axis and y-axis are normalized, which is precisely where the discrepancy comes from.
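
The effect is easy to reproduce with a toy case. In the Python sketch below, two unit x-axes lie in the same plane and differ by a hypothetical angle of 36 degrees (not the value of the example in the doc); their raw mean is shorter than 1 by a factor of cos(angle/2), and only after normalization does it qualify as an axis of a rotation matrix:

```python
import math

twist = math.radians(36.0)  # hypothetical angle between the two x-axes

# x-axes of two frames that share a z-axis and differ by a rotation of `twist` about it
x1 = (math.cos(-twist / 2), math.sin(-twist / 2), 0.0)
x2 = (math.cos(+twist / 2), math.sin(+twist / 2), 0.0)

raw_mean = tuple((a + b) / 2 for a, b in zip(x1, x2))
length = math.sqrt(sum(c * c for c in raw_mean))   # equals cos(twist / 2), which is < 1
unit_mean = tuple(c / length for c in raw_mean)    # unit length, as a rotation matrix requires

print(raw_mean, length)
print(unit_mean)
```

In the quoted numbers, -0.4299/-0.4490 is about 0.957, which is the factor one would expect if normalization accounts for the whole difference.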

HTH,

Xiang-Jun

PS. In going through the doc, I've also noticed a typo in step #2 of section 5.5, where the RollTilt angle should be in degrees instead of Angstroms. I am planning to update the technical details and put it on the web. If you notice anything that can be improved, please post back.

1143
Hi Christos,

To clarify, 3DNA's rebuild program can generate an 'arbitrary' DNA (or RNA) structure based on a set of base-pair (optional) and step (or helical) parameters. It is a mechanical process since the program does not check or care whether the structure is realistic or not. It is mathematically rigorous because you can 'analyze' the generated structure to get exactly the parameters you started with.

Now to your question -- yes, it is possible to "create DNA structures with sequence-dependent characteristics". As a practical example, see recipe #2 "Determination of DNA curvature associated with different roll distributions" in the 2008 3DNA Nature Protocols paper. As for building DNA structures with "minor groove compression in PURINEpPYRIMIDINE steps, compression of major groove in PYRIMIDINEpPURINE steps", it is up to you to derive the "correct" base sequence and base-pair parameters to feed into rebuild.

HTH,

Xiang-Jun


1144
Well, you have not yet downloaded the 2012oct06 updated version I just compiled to solve the naming issue you experienced (see What's new?). The file should be named: x3dna-v2.1-linux-64bit-2012oct06.tar.gz.

Xiang-Jun

1145
Okay, please download the 2012oct06 updated (beta) version of v2.1 and try again. The atom naming issues should have been resolved. Please report back whether reinstalling the revised version helps.

Xiang-Jun

1146
Hi Asmita,

Thanks for joining the 3DNA-user community!

Posting your question on the forum is a good first step; attaching a specific example so that your problem can be reproduced is better still.

The problem is due to the non-standard format of your PDB file, as shown below:

# The following is extracted from your attached "struct_1.pdb"
ATOM     30 P    G       2      -3.465  14.386  -4.840  0.00  0.00
ATOM     31 OP1  G       2      -4.272  15.372  -4.078  0.00  0.00
ATOM     32 OP2  G       2      -2.267  13.805  -4.173  0.00  0.00
# For comparison, the following is taken from PDB entry 355d
ATOM     17  P    DG A   2      23.337  31.278  21.156  1.00 13.26           P 
ATOM     18  OP1  DG A   2      24.761  31.571  21.391  1.00 13.17           O 
ATOM     19  OP2  DG A   2      22.651  31.834  19.956  1.00 12.34           O

 

As you can see clearly, the atom names (and residue names) in your PDB file are shifted to the left by one column. So the atom name for OP1 is taken as "OP1 " instead of the normal " OP1", and that explains the message you saw:
Quote
no matching entry for atom name [OP1 ] (OP..) in 'atomlist.dat'
Adding an entry "OP..   O" (note the two dots in place of digit and space) to file 'atomlist.dat' will make the info message go away.

The most effective way to fix such problems is to ensure your PDB file is standard-compliant [see Coordinate File Description (PDB Format)]. In your case, ask the developers of your MD package to generate standard-compliant PDB files, or you can write a simple script to make the changes. This is the first time I have become aware of such a problem; given enough interest, I will consider refining 3DNA to accommodate such non-standard cases.
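
As a starting point for such a script, here is a minimal Python sketch. It assumes that only the atom-name field (columns 13-16) and the residue-name field (columns 18-20) are misaligned and that all elements have one-letter symbols, which holds for ordinary nucleic acid atoms but not for ions such as MG; the function name `realign_atom_line` is hypothetical:

```python
def realign_atom_line(line):
    """Re-align the atom-name and residue-name fields of one ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")):
        return line
    line = line.rstrip("\n").ljust(80)
    name = line[12:16].strip()
    resname = line[17:20].strip()
    # Names shorter than 4 characters start in column 14 (one-letter element assumed)
    if len(name) < 4:
        name = " " + name
    # Residue names are right-justified in columns 18-20
    return (line[:12] + name.ljust(4) + line[16] + resname.rjust(3) + line[20:]).rstrip()


shifted = "ATOM     31 OP1  G       2      -4.272  15.372  -4.078  0.00  0.00"
print(realign_atom_line(shifted))
```

Records that are already standard-compliant pass through unchanged, so the function can be applied to every line of the file.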

HTH,

Xiang-Jun


1147
MD simulations / Re: Using find_pair
« on: September 18, 2012, 11:33:25 pm »
Hi Johnny,

Thanks for your follow-up; it certainly helps clarify the situation:

The output of "uname -a" means your Ubuntu is 32-bit -- google "ubuntu 32 bit or 64 bit how to tell" for details. This explains why the 3DNA 64-bit version "was not compatible" on your machine.

So if you insist on using Ubuntu instead of Mac OS X, a 64-bit version of Ubuntu seems the way to go.

Xiang-Jun

1148
Table 3 of the Olson et al. (2001) "standard base-reference frame article" lists the mean values and standard deviations of base geometric parameters for high resolution A-DNA and B-DNA crystal structures, as shown below.


The selection criteria of the A- and B-DNA datasets have recently been reported in the thread "Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper". For the sake of easy reference and completeness, here is the note again:

Quote
Selection Criteria:
   NDB ID: ad OR bd
   Classification: DNA
   Structure Description: Double Helix
   Conformation Type: A OR B
   No Drug, No Mismatch
   No Modifiers (Base/Sugar/Phosphate)
   Resolution better than 2.0 A
   =======================
   34 A-DNA and 27 B-DNA

For B-DNA, delete bd0012, bd0013 & bdf068 (following HMB)
   bd0001 bd0006_A
   bd0014: coordinates from PDB 463D
   bd0005 bd0016_A (with repeated atoms!)
   bd0018 bd0019 bdj017 bdj019 bdj025 bdj031 bdj036 bdj037 bdj051
   bdj052 bdj060 bdj061
   bdj081 (Uses helix #1 with strands A and B. The other two are
           disordered)
   bdl001 bdl005 bdl020 bdl084
   bd0023_A  bd0029
   -------------------------- 27-3=24 structures

For A-DNA
   ad0002 ==> (ad0002_AB + ad0002_CD)
   ad0003 ad0004 adh008 adh010 adh0102 adh0103 adh0104 adh0105
   adh014 adh026 adh027 adh029 adh033 adh034 adh038 adh039 adh047
   adh070 adh078 adj0102 adj0103 adj0112 adj0113 adj022 adj049
   adj050 adj051 adj065 adj066 adj067 adj075
   adl025 (suspicious! big Buckle, alternating Propeller)
   adl047 (with B-steps, not good either!)
   -------------------------- 34+1-2=33 structures

Outliers:
  A-DNA: ad0002_CD, steps 3-4,   bps 3-4-5
         ad0004,    steps 3-4-5, bps 3-4-5-6
  B-DNA: bdj025,    step 3,      bps 3-4
         bdj031,    step 3,      bps 3-4
         bdj037,    step 3,      bps 3-4

The six data files themselves are attached below; here the A- prefix is for A-DNA, and B- prefix for B-DNA:
  • 'A-base-pair.dat' and 'B-base-pair.dat' contain the base-pair parameters in the order of Shear, Stretch, Stagger, Buckle, Propeller, and Opening.
  • 'A-step-pars.dat' and 'B-step-pars.dat' contain the step parameters in the order of Shift, Slide, Rise, Tilt, Roll and Twist.
  • 'A-heli-pars.dat' and 'B-heli-pars.dat' contain the helical parameters in the order of x-displacement, y-displacement, Helical rise, Inclination, Tip, and Helical twist.
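
For those who want to recompute mean values and standard deviations from the attached files, a short Python sketch follows. It assumes that each data line holds exactly six whitespace-separated numbers in the order listed above and skips anything else; the actual layout of the files may differ, so please check before relying on it. The name `column_stats` is made up:

```python
import statistics

NAMES = ["Shear", "Stretch", "Stagger", "Buckle", "Propeller", "Opening"]


def column_stats(lines, names=NAMES):
    """Return {name: (mean, sample standard deviation)} for six-column data lines."""
    rows = []
    for line in lines:
        try:
            values = [float(v) for v in line.split()]
        except ValueError:
            continue  # skip header or comment lines
        if len(values) == len(names):
            rows.append(values)
    # zip(*rows) turns the list of rows into one tuple per column
    return {name: (statistics.mean(col), statistics.stdev(col))
            for name, col in zip(names, zip(*rows))}
```

At least two data lines are needed, since the sample standard deviation is undefined for a single value. For the step or helical files, pass the corresponding six names as `names`.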

While the Table content is derived from NDB entries with only Watson-Crick base pairs in A- and B-DNA duplexes, it serves as a reference for identifying/quantifying non-canonical (mismatched) pairs by taking advantage of the base-pair parameters. This approach is rigorous in its description of the relative base geometry in a pair, and is distinct from and complementary to the Leontis-Westhof classification scheme.

1149
MD simulations / Re: Using find_pair
« on: September 18, 2012, 09:52:28 pm »
Thanks for using 3DNA and for posting your question on the forum.

Which version of 3DNA are you using: v1.5, v2.0 or v2.1beta? Which Linux variant do you have -- what's the output of running "uname -a"?

What do you mean specifically that you could run the huge 2GB+ pdb file easily on your Mac? Using "find_pair" or other programs?

This is the first time I have come across such a question, so I have more questions for you before I can possibly provide a solution.

Xiang-Jun

1150
Thanks for your patience -- it took me quite some time to dig into my files used for the 2003 3DNA paper! Luckily, I found them, and the time has been well spent :D.

Here are the details -- the whole datasets and scripts can be downloaded by following the link: 3DNA-NAR03-Fig5.tar.gz. Figure 5(a)-(c) generated with the scripts and data files are attached.

  • Content of the README file:
    This folder (3DNA-NAR03-Fig5) contains all the data files and scripts
    to reproduce Figure 5 of the 2003 3DNA paper in Nucleic Acids Research
    (NAR03). The contents are taken from the original materials I used to
    create Figure 5 of NAR03, with slight editing. Specifically, I revised
    the Matlab scripts to work in GNU Octave v3.2.4 for verification.

    If you have any questions or comments, please do post them on the 3DNA
    Forum.

    2012-09-06 -- Xiang-Jun Lu (http://x3dna.org)

    ========================================================================

    Data selections:
        'note-AB-datasets' -- datasets of selected A- and B-DNA structures
        'note-TA-dataset'  -- dataset of selected TA-DNA structures

    Data files:
        'A-heli-pars.dat' -- six helical parameters
        'A-step-pars.dat' -- six step parameters
        'A-zp-zph.dat'    -- Zp and ZpH parameters
            Selected parameters of the A-DNA dataset. Note that the order of
            the parameters is as in the .out file from running 'analyze'

        'B-heli-pars.dat',  'B-step-pars.dat',  'B-zp-zph.dat' for B-DNA
        'TA-heli-pars.dat', 'TA-step-pars.dat', 'TA-zp-zph.dat' for TA-DNA

    Scripts:
        'incl_xdsp.m' -- script to generate Figure 5(a), Inclination vs Tip
                      'incl_xdsp.png' -- output file from running the script
        'roll_slide.m' -- script to generate Figure 5(b), Roll vs Slide
                      'roll_slide.png' -- output file from running the script
        'zph_zp.m' -- script to generate Figure 5(c), Zp(h) vs Zp
                      'zph_zp.png' -- output file from running the script
        'draw_ellipse.m', 'get_pars.m', 'open_file.m' -- supporting scripts
  • Content of file note-AB-datasets
    Selection Criteria:
       NDB ID: ad OR bd
       Classification: DNA
       Structure Description: Double Helix
       Conformation Type: A OR B
       No Drug, No Mismatch
       No Modifiers (Base/Sugar/Phosphate)
       Resolution better than 2.0 A
       =======================
       34 A-DNA and 27 B-DNA

    For B-DNA, delete bd0012, bd0013 & bdf068 (following HMB)
       bd0001 bd0006_A
       bd0014: coordinates from PDB 463D
       bd0005 bd0016_A (with repeated atoms!)
       bd0018 bd0019 bdj017 bdj019 bdj025 bdj031 bdj036 bdj037 bdj051
       bdj052 bdj060 bdj061
       bdj081 (Uses helix #1 with strands A and B. The other two are
               disordered)
       bdl001 bdl005 bdl020 bdl084
       bd0023_A  bd0029
       -------------------------- 27-3=24 structures

    For A-DNA
       ad0002 ==> (ad0002_AB + ad0002_CD)
       ad0003 ad0004 adh008 adh010 adh0102 adh0103 adh0104 adh0105
       adh014 adh026 adh027 adh029 adh033 adh034 adh038 adh039 adh047
       adh070 adh078 adj0102 adj0103 adj0112 adj0113 adj022 adj049
       adj050 adj051 adj065 adj066 adj067 adj075
       adl025 (suspicious! big Buckle, alternating Propeller)
       adl047 (with B-steps, not good either!)
       -------------------------- 34+1-2=33 structures

    Outliers:
      A-DNA: ad0002_CD, steps 3-4,   bps 3-4-5
             ad0004,    steps 3-4-5, bps 3-4-5-6
      B-DNA: bdj025,    step 3,      bps 3-4
             bdj031,    step 3,      bps 3-4
             bdj037,    step 3,      bps 3-4
  • Content of file note-TA-dataset
    pd0070, pd0112, pd0154, pd0155, pd0156 pd0157, pd0158, pd0159, pd0160,
    pd0161, pd0162, pd0163, pd0164, pdr031 pdt009, pdt012, pdt024, pdt025,
    pdt032, pdt034, pdt036

    This directory contains TATA box segments. It is normally 8-bp long, and
    has the sequence: T-A-T-A-@-A-@-N. There are two kinks at the terminal
    steps.

    * means non-WC base-pair which is eliminated from further analysis

    NDB ID  ##     Sequence      Res(A)  R-fac(%) chainID and residue range
    --------------------------------------------------------------------
    pd0070  01  T-T-T-A-A-A-T-A   2.4     20.0   C 1410 1417 D 1432 1439
                               
    pd0112  02  T-A-T-A-A-A-A-G   2.65    23.1   K 8 15 L 105 112
            03  T-A-T-A-A-A-A-G                  C 8 15 D 105 112
            04  T-A-T-A-A-A-A-G                  G 8 15 H 105 112
            05  T-A-T-A-A-A-A-G                  O 8 15 P 105 112
            06  T-A-T-A-A-A-A-G                  S 8 15 T 105 112
                                         
    pd0154  07  T-A-T-A-A-A-A-T   1.86    21.0   C 203 210 D 219 226
            08  T-A-T-A-A-A-A-T                  E 203 210 F 219 226
                                       
    pd0155  09  T-A-T-A-A-G-A-G*  1.93    19.6   C 203 209 D 220 226
            10  T-A-T-A-A-G-A-G*                 E 203 209 F 220 226
       
    pd0156  11  T-A-T-A-A-T-A-G*  2.1     19.3   C 203 209 D 220 226
            12  T-A-T-A-A-T-A-G*                 E 203 209 F 220 226
                                       
    pd0157  13  T-A-T-A-T-A-A-G*  2.3     19.4   C 203 209 D 220 226
            14  T-A-T-A-T-A-A-G*                 E 203 209 F 220 226
                                       
    pd0158  15  T-A-T-T-A-A-A-G*  2.1     19.4   C 203 209 D 220 226
            16  T-A-T-T-A-A-A-G*                 E 203 209 F 220 226
                                       
    pd0159  17  T-A-C-A-A-A-A-G*  1.9     20.9   C 203 209 D 220 226
            18  T-A-C-A-A-A-A-G*                 E 203 209 F 220 226
       
    pd0160  19  T-T-T-A-A-A-A-G*  1.8     19.3   C 203 209 D 220 226
            20  T-T-T-A-A-A-A-G*                 E 203 209 F 220 226
                                         
    pd0161  21  T-A-T-A-A-A-T-G*  2.23    19.1   C 203 209 D 220 226
            22  T-A-T-A-A-A-T-G*                 E 203 209 F 220 226
                                         
    pd0162  23  A-A-T-A-A-A-A-G*  2.3     18.2   C 203 209 D 220 226
            24  A-A-T-A-A-A-A-G*                 E 203 209 F 220 226
                                         
    pd0163  25  T-A-T-A-A-A-A-G   1.9     19.7   C 203 210 D 219 226
            26  T-A-T-A-A-A-A-G                  E 203 210 F 219 226
                                         
    pd0164  27  T-A-T-A-A-A-C*G*  1.95    19.9   C 203 208 D 221 226
            28  T-A-T-A-A-A-C*G*                 E 203 208 F 221 226
                                         
    pdr031  29  T-T-T-t-t-A-A-A   2.1     21.2   C 1408 1415 E 1420 1427
                                         
    pdt009  30  T-A-T-A-A-A-A-G   2.25    20.2   A 203 210 B 305 312
            31  T-A-T-A-A-A-A-G                  C 403 410 D 505 512
                                         
    pdt012  32  T-A-T-A-T-A-A-A   1.8     20.1   C 2 9 C 21 28
            33  T-A-T-A-T-A-A-A                  D 2 9 D 21 28
                                         
    pdt024  34  T-A-T-A-T-A-T-A   2.9     21.4   B 103 110 C 115 122
                                         
    pdt025  35  T-A-T-A-A-A-A-G   1.9     19.4   C 203 210 D 219 226
            36  T-A-T-A-A-A-A-G                  E 303 310 F 319 326
                                         
    pdt032  37  T-A-T-A-A-A-A-G   2.7     21.5   C 4 11 D 106 113
                                         
    pdt034  38  T-A-T-A-A-A-A-G   1.9     18.9   B 5 12 C 105 112
                                         
    pdt036  39  T-A-T-A-A-A-A-C   2.5     23.5   E 9 16 F 1 8

HTH,

Xiang-Jun


PS. As a matter of fact, the A- and B-DNA datasets are those used in Table 3 of the report on standard base reference.


Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.