Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.



Messages - xiangjun

1
General discussions (Q&As) / Re: How does 3DNA calculate rise
« on: May 21, 2019, 09:43:37 am »
Hi Pascal,

Have a look at $X3DNA/doc/tech-details.pdf, and/or read the source code.

HTH,

Xiang-Jun

2
General discussions (Q&As) / Re: design triple helix
« on: May 21, 2019, 09:39:55 am »
Hi,

Sima -- thanks for using 3DNA and for posting your questions on the Forum
Mauricio -- thanks for your insightful response to Sima's question. I already have your 2014 NAR paper on DNA triplex in my file and will have a look at the DNA-triplex review article by Frank-Kamenetskii and Mirkin.

In addition to the Pauling triplex model (no. 56) and model no. 31, the 3DNA 'fiber' command can also generate several other triplex models. The full list is shown below:

30   32.7   3.160  poly d(C) : poly d(I) : poly d(C)
31   30.0   3.260  poly d(T) : poly d(A) : poly d(T)
32   32.7   3.040  poly (U) : poly (A) : poly(U) (11-fold)
33   30.0   3.040  poly (U) : poly (A) : poly(U) (12-fold)
34   30.0   3.290  poly (I) : poly (A) : poly(I)
42   32.7   3.040  poly(U) : poly d(A) : poly(U) [cf. #32]
56  105.0   3.40   Pauling's triplex model (generic sequence: A, C, G, T, U)

Run 'fiber -m' or 'fiber -l' for a full list of regular helical models from 3DNA. See also "Table 4. Selected features of regular DNA and RNA helical models included in 3DNA" of the 2003 3DNA NAR paper.
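
As a rough cross-check of the list above, the number of residues per turn and the helical pitch follow directly from the two numeric columns. This is only a sketch: it assumes the second and third columns are the helical twist (degrees) and rise (Å) per residue, which is consistent with the 11-fold and 12-fold labels on models 32 and 33. The function name is hypothetical and not part of 3DNA.

```python
# Sketch only: derive residues per turn and pitch from twist/rise values.
# Assumption: columns 2 and 3 of the fiber list are twist (deg) and rise (A).

def helix_summary(twist_deg, rise_angstrom):
    """Return (residues per turn, pitch in Angstrom) for a regular helix."""
    residues_per_turn = 360.0 / twist_deg
    pitch = residues_per_turn * rise_angstrom
    return residues_per_turn, pitch

# Values copied from the triplex list in this post.
models = {
    31: (30.0, 3.260),  # poly d(T) : poly d(A) : poly d(T)
    32: (32.7, 3.040),  # poly (U) : poly (A) : poly(U) (11-fold)
}

for model_id, (twist, rise) in models.items():
    n, pitch = helix_summary(twist, rise)
    print(f"model {model_id}: {n:.2f} residues/turn, pitch {pitch:.2f} A")
```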

Best regards,

Xiang-Jun

3
Have a try, and report back any issues you experience. As a general rule, please remember to be specific with your questions.

Best regards,

Xiang-Jun

4
Hi Xingcheng,

Could you please provide a concrete example to illustrate what you mean by connecting two DNA molecules into one?

Thanks,

Xiang-Jun

5
Hi Erik,

Thanks for your confirmation that the proposed method works.

Quote
And to answer your question, my .xyz file was generated from a .cif file by VESTA. I have attached the .cif file below.

The so-called CIF file is nothing but a simple list of atom symbols and (fractional) xyz coordinates. As shown below, it is just like the .xyz file you attached at the beginning of the thread. If you're interested in generating a starting DNA or RNA structure of generic sequence, you may find 3DNA itself helpful. Try the new "Web 3DNA 2.0 for the analysis, visualization, and modeling of 3D nucleic acid structures" at http://web.x3dna.org, especially the "Fiber" module.

Best regards,

Xiang-Jun


_data_test

audit_creation_method generated by ABACUS

_cell_length_a 31.7506
_cell_length_b 31.7506
_cell_length_c 84.6683
_cell_angle_alpha 90
_cell_angle_beta 90
_cell_angle_gamma 90

_symmetry_space_group_name_H-M
_symmetry_Int_Tables_number

loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z

H 0.58245 0.237665 0.195505
H 0.597501 0.173458 0.172704
H 0.620449 0.178712 0.191924
H 0.675105 0.197142 0.174543
H 0.630339 0.276193 0.186432
H 0.684881 0.319228 0.173178
H 0.714824 0.272464 0.168714
H 0.672394 0.277871 0.145056
H 0.664234 0.435631 0.145526
H 0.567061 0.258673 0.157903
......
P 0.25159 0.656626 0.15821
P 0.395947 0.720755 0.126839
P 0.551859 0.703742 0.0949896
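
To make the relation between this listing and a plain .xyz file concrete, here is a minimal sketch that converts the fractional coordinates to Cartesian ones. It assumes an orthogonal cell (all three cell angles are 90 in the excerpt), so each axis is simply scaled by its cell length; a non-orthogonal cell would need the full cell matrix. The function names are hypothetical.

```python
# Sketch only: fractional -> Cartesian coordinates for an orthogonal cell
# (alpha = beta = gamma = 90), using the cell lengths from the excerpt.

CELL = (31.7506, 31.7506, 84.6683)  # _cell_length_a, _b, _c

def frac_to_cart(fx, fy, fz, cell=CELL):
    """Scale each fractional coordinate by the matching cell length."""
    a, b, c = cell
    return fx * a, fy * b, fz * c

def parse_atom_site(line):
    """Split one 'symbol fx fy fz' row of the atom_site loop."""
    symbol, fx, fy, fz = line.split()
    return symbol, float(fx), float(fy), float(fz)

# First hydrogen row of the excerpt.
symbol, fx, fy, fz = parse_atom_site("H 0.58245 0.237665 0.195505")
x, y, z = frac_to_cart(fx, fy, fz)
print(f"{symbol} {x:8.3f} {y:8.3f} {z:8.3f}")
```

For the first hydrogen row this gives roughly (18.493, 7.546, 16.553) Å, which agrees with the first ATOM record of the PDB file quoted further down this page.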

6
Hi Erik,

I dug a bit deeper into your attached PDB file and found it has a strange arrangement of the nucleotides. Instead of putting all atoms (base, sugar, and phosphate) of a nucleotide together, as is the case for a standard PDB entry (e.g., 355d), your PDB file has the atoms arranged by atom types: H, C, N, O, and P. An example is shown below, for A2 on chain A. As you can see clearly, it is separated into 5 segments.

The source of the issue may be the .xyz file, which has atoms ordered that way. How was your .xyz file generated? It is certainly the first time I have seen such a weird case for DNA structures.

ATOM     13  H2P   A A   2      21.335   7.601  19.955  1.00  0.00           H
ATOM     14 H5*1   A A   2      23.439   9.452  17.299  1.00  0.00           H
ATOM     15 H5*2   A A   2      23.472   9.989  19.020  1.00  0.00           H
ATOM     16  H4*   A A   2      24.470  11.663  17.626  1.00  0.00           H
ATOM     17  H3*   A A   2      21.726  12.489  18.712  1.00  0.00           H
ATOM     18 H2*1   A A   2      22.286  14.772  17.743  1.00  0.00           H
ATOM     19 H2*2   A A   2      23.960  14.206  17.436  1.00  0.00           H
ATOM     20  H1*   A A   2      22.953  13.709  15.302  1.00  0.00           H
ATOM     21  H2    A A   2      20.156  17.569  14.828  1.00  0.00           H
ATOM     22  H8    A A   2      20.392  11.254  16.304  1.00  0.00           H
ATOM     23  H61   A A   2      16.205  15.439  14.199  1.00  0.00           H
ATOM     24  H62   A A   2      16.535  13.725  14.562  1.00  0.00           H
ATOM    313  C5*   A A   2      23.026  10.171  18.027  1.00  0.00           C
ATOM    314  C4*   A A   2      23.361  11.581  17.596  1.00  0.00           C
ATOM    315  C3*   A A   2      22.779  12.705  18.457  1.00  0.00           C
ATOM    316  C2*   A A   2      22.902  13.896  17.502  1.00  0.00           C
ATOM    317  C1*   A A   2      22.479  13.239  16.178  1.00  0.00           C
ATOM    318  C4    A A   2      20.345  14.424  15.560  1.00  0.00           C
ATOM    319  C2    A A   2      19.876  16.522  14.986  1.00  0.00           C
ATOM    320  C6    A A   2      18.108  14.968  14.888  1.00  0.00           C
ATOM    321  C5    A A   2      19.025  13.992  15.353  1.00  0.00           C
ATOM    322  C8    A A   2      20.101  12.255  15.998  1.00  0.00           C

ATOM    548  N9    A A   2      21.031  13.291  15.971  1.00  0.00           N
ATOM    549  N3    A A   2      20.841  15.678  15.407  1.00  0.00           N
ATOM    550  N1    A A   2      18.575  16.241  14.721  1.00  0.00           N
ATOM    551  N7    A A   2      18.882  12.639  15.633  1.00  0.00           N
ATOM    552  N6    A A   2      16.809  14.704  14.628  1.00  0.00           N
ATOM    630  O1P   A A   2      19.542   8.422  18.155  1.00  0.00           O
ATOM    631  O2P   A A   2      21.886   7.642  19.130  1.00  0.00           O
ATOM    632  O5*   A A   2      21.577  10.009  18.124  1.00  0.00           O
ATOM    633  O4*   A A   2      22.891  11.854  16.254  1.00  0.00           O
ATOM    634  O3*   A A   2      23.577  12.788  19.672  1.00  0.00           O

ATOM    767  P     A A   2      21.033   8.490  18.045  1.00  0.00           P

Knowing the issue, one can find a workaround for this special case as below. Note your attached PDB file is named DNA_TypeA_12bp_1.pdb. The resultant file duplex.pdb can be handled by 3DNA.

Code: [Select]
grep ' A A ' DNA_TypeA_12bp_1.pdb | sort -k 6n > chainA.pdb
grep ' T B ' DNA_TypeA_12bp_1.pdb | sort -k 6n > chainB.pdb
cat chainA.pdb chainB.pdb > duplex.pdb
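
The same regrouping can be sketched in Python without hard-coding the residue and chain names. This is an illustration rather than a 3DNA feature: it relies only on the fixed PDB columns (chain ID in column 22, residue number in columns 23-26), and the `stub` helper and its serial numbers are made up for the demo.

```python
# Sketch only: regroup PDB records so each nucleotide's atoms are contiguous.
# Uses the fixed PDB columns: chain ID in column 22, residue number in 23-26.

def regroup_by_residue(lines):
    """Return ATOM/HETATM records sorted by (chain ID, residue number)."""
    atoms = [ln for ln in lines if ln.startswith(("ATOM", "HETATM"))]
    # sorted() is stable, so atoms keep their original order within a residue.
    return sorted(atoms, key=lambda ln: (ln[21], int(ln[22:26])))

def stub(serial, chain, res_seq):
    """Hypothetical stub record holding only the fields read above."""
    return f"ATOM  {serial:5d}{'':10}{chain}{res_seq:4d}"

# Atoms ordered by element type, as in the problematic file (serials made up).
records = [stub(1, "A", 1), stub(13, "A", 2), stub(301, "A", 1), stub(313, "A", 2)]
for ln in regroup_by_residue(records):
    print(ln)
```

On a real file, the lines would come from something like `open("DNA_TypeA_12bp_1.pdb")`, and the sorted records would be written back out before running 3DNA.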

Please have a try and report back how it goes.

Xiang-Jun

7
As a follow-up, I've updated DSSR to v1.9.1-2019apr06 and 3DNA to v2.4.3-2019apr06. Now the DSSR --analyze-cehs option and the 3DNA cehs program should give the same numerical values of the CEHS parameters, even for non-WC pairs and structures.

Best regards,

Xiang-Jun

8
Hi Erik,

Thanks for your follow-up. 3DNA only needs the ATOM/HETATM coordinate records for its analysis and is generally tolerant of some variants of the PDB format (e.g., C1* instead of C1'). As is often the case, please just try 3DNA (or other tools) and see what happens.

Glad to hear that OpenBabel can be of more help in your case.  The PDB file you attached, however, is still not in good condition. As an example, the first residue (designated as A) has the following info, with ONLY hydrogen atoms.

Code: [Select]
ATOM      1  H5*   A A   1      18.493   7.546  16.553  1.00  0.00           H
ATOM      2 H5*1   A A   1      18.971   5.507  14.623  1.00  0.00           H
ATOM      3 H5*2   A A   1      19.700   5.674  16.250  1.00  0.00           H
ATOM      4  H4*   A A   1      21.435   6.259  14.778  1.00  0.00           H
ATOM      5  H3*   A A   1      20.014   8.769  15.785  1.00  0.00           H
ATOM      6 H2*1   A A   1      21.745  10.136  14.663  1.00  0.00           H
ATOM      7 H2*2   A A   1      22.696   8.651  14.285  1.00  0.00           H
ATOM      8  H1*   A A   1      21.349   8.823  12.282  1.00  0.00           H
ATOM      9  H2    A A   1      21.090  13.832  12.321  1.00  0.00           H
ATOM     10  H8    A A   1      18.005   8.213  13.369  1.00  0.00           H
ATOM     11  H61   A A   1      16.614  14.046  11.537  1.00  0.00           H
ATOM     12  H62   A A   1      16.008  12.397  11.886  1.00  0.00           H

You need further work on this file before 3DNA can handle it.
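
One way to spot this kind of problem before running 3DNA is to count how many separate contiguous blocks each residue occupies in the file. The sketch below is a hypothetical pre-check, not part of 3DNA; it reads only the chain ID (column 22) and residue number (columns 23-26) of each ATOM/HETATM record, and the stub records in the demo are made up.

```python
from itertools import groupby

# Sketch only: flag residues whose records are split into several blocks.

def residue_runs(lines):
    """Count the contiguous runs in which each (chain, residue number) appears."""
    atoms = [ln for ln in lines if ln.startswith(("ATOM", "HETATM"))]
    counts = {}
    for key, _ in groupby(atoms, key=lambda ln: (ln[21], int(ln[22:26]))):
        counts[key] = counts.get(key, 0) + 1
    return counts

def fragmented(lines):
    """Return the residues that appear in more than one contiguous run."""
    return sorted(key for key, n in residue_runs(lines).items() if n > 1)

def stub(chain, res_seq):
    """Hypothetical stub record holding only the fields read above."""
    return f"{'ATOM':<21}{chain}{res_seq:4d}"

# Hydrogens of residues 1 and 2 first, then their carbons (element-type order).
records = [stub("A", 1), stub("A", 2), stub("A", 1), stub("A", 2)]
print(fragmented(records))
```

An empty list means every residue is stored as a single block; any residue reported here has its atoms scattered through the file and needs regrouping.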

Best regards,

Xiang-Jun


9
Hi Erik,

Thanks for using 3DNA and for posting your questions on the Forum.

There is no quick solution to your problem. 3DNA relies on proper naming of atoms in DNA and RNA as a prerequisite. For example, the names of the sugar atoms all end with a prime (e.g., C1'). With only C, O, N, P, H atomic symbols, as in the .xyz file you attached, it is practically impossible to know if an atom belongs to the base or the sugar moiety. The wwPDB is there to ensure the compliance of atom names with the standard nomenclature of biological macromolecular structures.

If VESTA or OpenBabel cannot do the job for you automatically, you'll have to come up with your own solution. Sorry for not being able to provide you with a more encouraging answer.

Best regards,

Xiang-Jun

10
Quote
Just to make sure I understood this properly, CEHS parameters should only be used to analyze conformational variations of WC structures.

Yes, the original CEHS parameters are designed for WC structures, as mentioned clearly in the abstract of the 1995 JMB paper.

"In this paper, we develop a new local Euler-angle-based scheme for assessing the internal kinematics or geometry of a general dinucleotide step in double-helical DNA. The geometry of a dinucleotide step is completely defined by: (1) the base-pair parameters that describe the relative position and orientation of one base with respect to the other in a standard Watson-Crick base-pair, and (2) the step parameters that describe the relative position and orientation of the two base-pairs. The key feature of our scheme is that it makes use of the concept of a mid-step reference frame..."

That said, the CEHS algorithm is generally applicable and it forms one of the cornerstones of 3DNA and DSSR, including my extension of the idea to a set of local helical parameters.

Quote
And since my structure has only parallel-stranded non-WC interactions, I should be using the base pair and step parameters outputted by x3dna-dssr --analyze?

Yes. For your parallel structure, the local base-pair parameters from DSSR and the 3DNA 'analyze' program make perfect sense. As shown below, the C+C and G+G symbols match the parallel nature of your structure. Moreover, the ~180° Opening means the glycosidic bonds are in opposite directions. Combined with the --more option, you will also see the DSSR classification of the pairs.

Code: [Select]
Local base-pair parameters
     bp      Shear    Stretch   Stagger    Buckle  Propeller  Opening
   1 C+C     -5.49      0.38     -0.26     -3.43     -0.76    177.10
   2 C+C     -5.30      0.03     -0.39     -6.39    125.84    174.80
   3 C+C     -5.39      0.10      0.29     -1.40   -128.26    178.13
   4 A-A     -4.24      4.91      0.84      8.66     13.21    -84.79
   5 G+G      5.16      3.66      0.04      4.43      0.39    -92.09
   6 G+G     -5.15     -3.66     -0.04     -4.42     -0.44     92.09
   7 A-A      4.24      4.91      0.84     -8.67     13.21    -84.83
   8 C+C      5.39     -0.09     -0.29      1.40    129.30   -178.12
   9 C+C      5.30     -0.03      0.39      6.40   -125.45   -174.78
  10 C+C      5.49     -0.38      0.25      3.41      1.72   -177.11

The local step and local helical parameters, however, do not make 'intuitive' sense. For example, the Rise values are negative or zero for some steps in your structure. Nevertheless, these parameters (together with the local base-pair parameters) can still be used to 'rebuild' your structure which is accurate at the base-pair level (i.e., no backbone).

You may have a look at the 'simple' parameters. For your parallel structure, the output (from your original attachment) is as below, which should look more reasonable. These 'simple' parameters, however, are for qualitative description only. They cannot be used for the rebuilding purpose.

Code: [Select]
Simple step parameters based on consecutive C1'-C1' vectors
     bp      Shift     Slide      Rise      Tilt      Roll     Twist
   1 C+C      0.16      0.12      3.28      0.27      1.05     46.50
   2 C+C      0.09      0.02      3.08     -1.15      2.21    -21.39
   3 C+C      2.56      0.77      6.67     -1.49     -1.18     13.68
   4 A-A     -1.49      0.62      3.10      0.04      7.07     77.70
   5 G+G      0.00      2.38      3.22     -0.01     -7.85     15.13
   6 G+G      1.49      0.62      3.10     -0.03      7.07     77.69
   7 A-A     -2.56      0.77      6.67      1.48     -1.18     13.69
   8 C+C     -0.09      0.02      3.08      1.16      2.21    -21.40
   9 C+C     -0.16      0.12      3.28     -0.28      1.06     46.51
  10 C+C      ----      ----      ----      ----      ----      ----

Best regards,

Xiang-Jun

11
Hi Betty,

Thanks for attaching three files which make your point clear (to me). It is indeed a confusing topic. Let me explain in more detail below.

  • x3dna-dssr --analyze -- this is an option recently introduced in DSSR v1.9.0-2019mar26. It is intended to replace the 3DNA 'analyze' program, so the output from this option should be identical to that from "find_pair some.pdb | analyze". It contains local base-pair, local step, and local helical parameters, plus a set of 'simple' parameters intended for a qualitative description of non-WC pairs. The local parameters, but not the 'simple' parameters, can be used as input to the 3DNA "rebuild" program. The local parameters are based on the standard base reference frames.
  • x3dna-dssr --analyze-cehs -- this option is intended to reproduce 'authentic' CEHS parameters as described in the 1995 El Hassan and Calladine JMB paper, and in SCHNAaP. The output does not contain helical parameters because these are not available in the original CEHS paper.
  • find_pair|cehs -- this option is intended to reproduce 'authentic' CEHS parameters (1995 JMB), plus a set of "SCHNAaP global helical parameters" I introduced in the 1997 JMB SCHNAaP paper.

Now the last two options, x3dna-dssr --analyze-cehs and find_pair|cehs, should give the same 'authentic' CEHS parameters. In the current case, they do not: the reason is that your structure contains only non-WC pairs, which affect the directionality of the reference frames chosen, and thus the final results. Try a DNA structure, e.g., 355D, and you'll see that they indeed give the same results. The original CEHS scheme was intended to study the conformational variations of WC structures.

I'll look into the issue so that in future releases, the two options will give exactly the same numerical values, even for non-WC pairs. Technically, it is not a big deal to fix. In reality, this can be very confusing, as illustrated clearly in your case.

Also, note that the x3dna-dssr --analyze option is a new addition. I mentioned the --analyze-cehs variant simply because we touched the CEHS topic. I'd otherwise not explicitly document this 'feature' in the DSSR manual.


At the very beginning, I thought you were interested in implementing the 1995 JMB-described CEHS scheme yourself. For that purpose, short of contacting the original authors themselves, I pointed you to the SCHNAaP program and the "cehs" program in the 3DNA v2.4 distribution. That's still my advice.

If you're interested in analyzing non-WC structures using DSSR, please start a new thread.

Best regards,

Xiang-Jun

12
Please provide a specific example to illustrate your points unambiguously.

Thanks,

Xiang-Jun

13
Hi Betty,

You've run into one of the subtleties on base-pair parameters between CEHS, 3DNA, and DSSR.

Simply put, the CEHS scheme ("The assessment of the geometry of dinucleotide steps in double-helical DNA; a new local calculation scheme.") was implemented in SCHNAaP/SCHNArP ["Structure and conformation of helical nucleic acids: analysis program (SCHNAaP)", and "Structure and conformation of helical nucleic acids: rebuilding program (SCHNArP)"], which laid the foundation for the 3DNA analyze/rebuild programs. The differences lie in the adopted reference frames, which can lead to significant discrepancies among the different analysis programs, even for WC pairs (see "Resolving the discrepancies among nucleic acid conformational analyses."). In particular, CEHS would give a Stretch of ~5.5 Å for a WC pair, roughly the distance between the centroids of the six-membered rings of A/G and T/U/C.

The 3DNA local parameters are based on the CEHS scheme but are calculated using the standard base reference frame. You can still get the authentic CEHS parameters from 3DNA 2.4 by using the cehs program as you noted. The numerical values would be the same as those from SCHNAaP. As of DSSR v1.9.0-2019mar26, you can also get the authentic CEHS parameters via the newly added --analyze-cehs option.

The 'simple' parameters were introduced for a more intuitive, qualitative description of non-WC pairs (e.g., the Hoogsteen pair). See my blogpost "Details on the simple base-pair parameters" and the links therein.

Quote
I have read through El Hassan and Callandine (1995), J. Mol. Biol, 251,648-664, but am still unclear about the mathematical basis for the parameters.
This is understandable. It is worth noting the sentence "We thank Dr C. A. Hunter for his many useful suggestions and comments, and Xian Jung [sic] Lu for checking data." in the Acknowledgements. There is actually a story behind this, and behind why I developed the SCHNAaP/SCHNArP programs. Anyway, there is a long way between a text description of an algorithm and its robust implementation, where many details must be properly taken care of. If you want to get to the bottom of CEHS, you could start with the cehs program distributed with 3DNA v2.4, or with SCHNAaP.

HTH,

Xiang-Jun

14
General discussions (Q&As) / Re: Circular DNA and Groove Information
« on: March 24, 2019, 01:01:18 pm »
I've 'officially' updated DSSR to v1.9.0-2019mar26 which contains the features you requested. Using your attached circular DNA as an example, an excerpt for the --analyze or --analyze-circular option is as below:

Code: [Select]
# --analyze, no circular specified
 148 A-T      0.00      0.00      3.40      1.38      1.97     36.00
 149 A-T      0.00      0.00      3.40      2.27      0.77     35.99
 150 A-T      ----      ----      ----      ----      ----      ----

Code: [Select]
# --analyze-circular
 148 A-T      0.00      0.00      3.40      1.38      1.97     36.00
 149 A-T      0.00      0.00      3.40      2.27      0.77     35.99
 150 A-T      0.00     -0.00      3.40      2.29     -0.71     36.00

So the step parameters listed with the 150th pair correspond to base pairs from #150 to #1.
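
The indexing convention can be sketched as follows. This only illustrates which base pairs each step connects, under the convention described in this post (the closing step is listed with the last pair); the function name is hypothetical and not a DSSR API.

```python
# Sketch only: which base pairs (1-based) each dinucleotide step connects.

def step_pairs(n_pairs, circular=False):
    """Return (i, j) base-pair indices for each step of an n_pairs duplex."""
    steps = [(i, i + 1) for i in range(1, n_pairs)]
    if circular and n_pairs > 1:
        # Closing step from the last pair back to the first one.
        steps.append((n_pairs, 1))
    return steps

linear = step_pairs(150)
closed = step_pairs(150, circular=True)
print(len(linear), linear[-1])
print(len(closed), closed[-1])
```

A linear duplex of N pairs has N-1 steps, whereas the circular case has N, with the extra step running from pair N back to pair 1.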

Hope this solves the problems.

Best regards,

Xiang-Jun

15
General discussions (Q&As) / Re: Circular DNA and Groove Information
« on: March 22, 2019, 02:28:35 pm »
Quote
Well, when I get my circular reference frame files, my last frame is numbered 1 instead of N+1. My last row in my circular parameter file is the first base-pair.
So, if anything, I would suggest the following:

  N-1 C-G     14.41     14.37     19.81     19.75
    N G-C     14.32     14.27     19.86     19.79
    1 A-T     14.32     14.27     19.86     19.79
Fair point. I'll see how I can accommodate it in DSSR in the near future.

In your previous post, you said the following:
Quote
It's been my experience that there doesn't need to be a new way of arranging the base pair nor the base pair step parameters except that the first base pair will need to be repeated at the end (N base pairs so N base pair steps) in order to collect the last base pair step data.

Xiang-Jun


Created and maintained by Dr. Xiang-Jun Lu [律祥俊], Principal Investigator of the NIH grant R01GM096889
Dr. Lu is currently affiliated with the Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.