Messages - xiangjun

1051
RNA structures (DSSR) / Further note on DSSR
« on: April 25, 2013, 01:13:56 pm »
Mainly prompted by questions from Pascal (who has contributed the most posts among 3DNA users), here is a further note on DSSR.

Quote
It [DSSR] looks like a combined version of find_pair and analyze. Is that correct ?
Of course it seems not possible to (re)construct NA structures with DSSR.
Yes, to a certain extent, you can think of DSSR as a combination of find_pair and analyze. The post "DSSR, what's it and why bother?" provides more background information. You are right, DSSR does not construct nucleic acid structures.

DSSR represents my (opinionated) view of what a program for the structural analysis of nucleic acids (RNA in particular) should/could be, based on my extensive experience in supporting 3DNA, an increased knowledge in RNA structures and refined skills in C programming.

Quote
So first, why calling it DSSR and not DSSNA since it works also for DNA ?
I think that one should avoid the RNA domination, it is possible to learn from both structures.
thus, does DSSR really work for DNA ?
Again, read carefully the post "DSSR, what's it and why bother?" for my rationale. You may also notice that I put the word secondary in parenthesis in the title of the software, "DSSR: Software for Defining the (Secondary) Structures of RNA". DSSR surely works for DNA, or DNA-protein complexes in the same way as it does for RNA. As mentioned in the release note, I tested DSSR against every nucleic-acid-containing structure in the PDB. Overall, the acronym DSSR captures the essential message I'd like to get across, it is short, and it parallels the well-respected DSSP program for proteins (among other things).

Quote
Then, as for formats,
I think that as I mentioned it somewhere earlier, and since I am processing the output files
for a large number of structures, I appreciate when there are spaces between fields (see).
Code:
      base_id            alpha    beta   gamma   delta  epsilon   zeta     e-z        chi            phase-angle   sugar-type     Zp      Dp
 1     A.C2649            ---    167.1    47.6    84.1  -146.6   -77.1    -69(BI)   -160.5(anti)    12.9(C3'-endo)  ~C3'-endo    4.41    4.66
 2     A.U2650           -64.2   164.2    60.3    79.8  -154.5   -73.1    -81(BI)   -167.2(anti)    21.3(C3'-endo)  ~C3'-endo    4.40    4.55
I see your point, but the purpose of the output file is mainly for visual examination by a non-expert user. The message appears to be succinct. Your parser should be flexible enough to handle the case. Also see my reply to your initial thread.
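
As a rough illustration of what a "flexible" parser could mean here, the Python sketch below splits a row of the table above on runs of whitespace rather than on fixed columns. The field names are copied from the header shown above; the function name and the returned dictionary layout are assumptions made for this example, not part of DSSR, and the format may change between beta releases.

```python
import re

# Column names taken from the header above (the serial number and base_id
# are handled separately).
FIELDS = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta",
          "e-z", "chi", "phase-angle", "sugar-type", "Zp", "Dp"]

# A number optionally followed by a parenthesized label, e.g. "-69(BI)".
VALUE = re.compile(r"^(-?\d+(?:\.\d+)?)(?:\((.*)\))?$")


def parse_torsion_row(line):
    """Split one data row on whitespace and convert each field."""
    tokens = line.split()
    if len(tokens) != len(FIELDS) + 2:
        raise ValueError(f"unexpected number of fields: {len(tokens)}")
    row = {"serial": int(tokens[0]), "base_id": tokens[1]}
    for name, tok in zip(FIELDS, tokens[2:]):
        m = VALUE.match(tok)
        if tok == "---":              # value not available
            row[name] = None
        elif m:                       # number, possibly with a label
            row[name] = float(m.group(1))
            if m.group(2):
                row[name + "_label"] = m.group(2)
        else:                         # plain text such as "~C3'-endo"
            row[name] = tok
    return row
```

Because the split ignores how many spaces separate the fields, the deliberate one-space offset of the sugar-type column does not affect the result.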

Quote
and is there a need for writing twice the sugar pucker in this file ?
From my experience, the phase angle and pucker classification are the most useful information for the sugar moiety. I repeated the sugar pucker together with commonly used backbone parameters for convenience; one can now easily see the backbone conformation at a glance.

Quote
you name this file torsion although there are sugar puckers in it.
Thus it might be called torsion_puckers.dat or something else.
I see your point, but the file also contains Zp and Dp, and pseudo torsion angles. I'd keep the name as is; it is just a convention to get used to.

Quote
For the non-pairing interactions that is just a great feature,

you had before two values for base overlap
one calculated by just using ring atoms the other by using all base atoms.

you could add this.
DSSR checks base-stacking interactions using all base atoms, and the output value of base-overlap-area is calculated the same way. I will consider adding overlap areas based on just ring atoms.

Quote
Why adding the name of the chemical groups (hydroxyl, amino, imino, ...)
again this complicates reading since some groups are named and others not like OP2 and so on.

I would appreciate another presentation here.
I added the names of chemical groups (hydroxyl/amino/imino) for the convenience of those who are not that familiar with the chemistry of H-bonds. I have first-hand experience with such people (mostly physics/mathematics/computer science turned bioinformaticians). I can add an option to turn the chemical group names off; but honestly, I really think you should revise your parser to handle them properly.

Take the following case as an example:
Code:
H-bonds[2]: "N3(imino)-N1[2.81]; O4(carbonyl)-N6(amino)[3.13]"
If your parser can extract the distances and the PDB atom names, it is only a small step further to check for () and strip out the names of the chemical groups.
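
A minimal Python sketch of such a parser is given here. It assumes the line layout shown in the example above; the regular expression and the function name are illustrative assumptions rather than anything DSSR provides. The separator between the two atoms shows up as either '-' or '*' in DSSR listings, and the sketch accepts both without interpreting the difference.

```python
import re

# One H-bond item: atom, optional "(group)", separator, atom, optional
# "(group)", then the distance in square brackets.
HBOND = re.compile(
    r"^(?P<atom1>[A-Za-z0-9']+)(?:\((?P<group1>[^)]*)\))?"
    r"(?P<sep>[-*])"
    r"(?P<atom2>[A-Za-z0-9']+)(?:\((?P<group2>[^)]*)\))?"
    r"\[(?P<dist>\d+(?:\.\d+)?)\]$"
)


def parse_hbonds(line):
    """Return a list of (atom1, atom2, distance) tuples from one H-bonds line."""
    # The H-bond items sit between the first and the last double quote.
    body = line[line.index('"') + 1:line.rindex('"')]
    bonds = []
    for item in body.split(";"):
        item = item.strip()
        if not item:                  # empty list, as in H-bonds[0]: ""
            continue
        m = HBOND.match(item)
        if m is None:
            raise ValueError(f"unrecognized H-bond item: {item!r}")
        # The chemical group names are matched but simply not returned.
        bonds.append((m["atom1"], m["atom2"], float(m["dist"])))
    return bonds


print(parse_hbonds('H-bonds[2]: "N3(imino)-N1[2.81]; O4(carbonyl)-N6(amino)[3.13]"'))
```

The group names are optional in the pattern, so the same code handles items with and without them, such as "N1(imino)-OP2[2.77]".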

Quote
I haven't really checked, but are your base pair numbering scheme coherent with the one
you use in find_pair ? It would be really nice to be the case.
What do you mean by "base pair numbering scheme"? The serial numbers should not matter; the base pair is specified by the two constituent nucleotides (chain id, residue name and number, etc).

Quote
Also, I wanted to ask you that but know it seems to be done. You add various names
to each base pair. Thats great. Just a hint to the various nomenclatures (Leontis-Westhof, Saenger...)
would be helpful in the *.out files.
Advice taken  :) -- I will add a note in DSSR-beta-r10 (coming soon).

Quote
is there a configuration file that would allow to precise hydrogen bond and other parameters like in 3DNA.
I would really appreciate that.
To make DSSR self-contained, I've eliminated the configuration file. Overall, DSSR has refined algorithms for finding H-bonds, base pairs, helices etc, and the defaults should work for the vast majority of cases. So regular users could take DSSR as a black box, and they can check the results based on their domain knowledge and application needs.

DSSR also accepts command-line options to alter the default behavior. For example, you can use --hbond_d2=3.6 to set the upper limit of H-bond length to 3.6 Å instead of the default 4.0 Å. I am working on a manuscript that describes details of the software.

HTH,

Xiang-Jun

1052
Hi Pascal,

I see your points, and agree with most of them. In the next DSSR-beta release (r10), there will be a section explaining specification of nucleotide id string, base pair classifications, helix vs stem definition etc. The goal is not necessarily maximum documentation in the output files, but a minimum that helps avoid confusion for DSSR's common usages. DSSR aims to be self-contained, self-explanatory, and simple to use for a typical non-expert user.

I will consider removing the ~ in the sugar classification into C2'-endo like or C3'-endo like. Nevertheless, I still want to keep the one-space offset between the two types, since it makes a C2'-endo like sugar conformation in a mostly C3'-endo like RNA structure stand out for visual inspection. This works for me, and that's why I decided on an extra space there. I believe this trick is helpful to other structural biologists/chemists as well. As noted above, I will add some explanation in r10 to make my intention explicit.

Regarding parsing the DSSR output files, I would suggest that you make your program/script flexible. At this stage at least, I want to reserve as much freedom as possible in making DSSR most logical/sensible to me. However, I do have it clearly in mind to allow for an easy web/database interface to DSSR. In the future (not necessarily that near), I will add a SQLite-based SQL output of the DSSR parameters that would make parsing straightforward. I am convinced that is the way to go.

As always, constructive interactions with users like you move 3DNA (DSSR) forward.

Xiang-Jun

1054
Hi Pascal,

Quote
sugar-type: sugar classification into C3'-like or C2'-like
I should have updated the phrase as "C3'-endo like or C2'-endo like"

Quote
Yet, I noted a misalignment at some places with the ~C2'-endo value.
The misalignment is on purpose, to make the two broad classes more visually separated. I chose '~' for similar to (close to), as in math.

Does that make sense?

Xiang-Jun

1055
General discussions (Q&As) / Re: possible rotate_mol labeling issue
« on: April 23, 2013, 12:34:05 pm »
Quote
Yes, but what would you do for terminal residues that lack phosphate groups. Why not simply rely on the pdb author labeling ?
We are considering just the default setting of the pdbv3 option. It's tricky to make it work for every situation, and that's why it is a command-line option to begin with. Users can always set pdbv3=false to keep the nucleotide names as is. I believe that's what you need to do.

If by "Why not simply rely on the pdb author labeling ?" you mean to set pdbv3=false as the default, then we are back to square one. Again, see how this switch of default setting in 3DNA got started by following the thread "O1P_O2P still needed ?".

When the PDB format switched from v2.x to v3.x, I do not know how many programs broke. I remember that at one time MolProbity had two versions, one for each PDB format. 3DNA tries to accommodate many variants of the so-called PDB format. Obviously, for certain cases, manual setting is needed from the user.

Quote
As for manuals, I understand your point. One way to solve this would be to include a manual for all options
available through a -help option. This manual would be updated along with the addition/modification of program options.
And all users could rely on this.
Good suggestion. The only issue is that the 'global' settings need to be repeated in each program. Maybe I can put them on the Forum, or find a way to avoid duplication. One principle of software design is DRY: "Don't repeat yourself".

Thanks,

Xiang-Jun

1056
General discussions (Q&As) / Re: possible rotate_mol labeling issue
« on: April 23, 2013, 11:10:34 am »
Hi Pascal,

Thanks for your thoughtful comments.

Quote
First, it would be nice to have manuals for them but I know you are working on them.
See the thread "O1P_O2P still needed ?" you started. It was at that time, I said: "Moreover, the -pdbv3 option is now on by default, so you do not have to bother with adding it in every 3DNA program. The previous behavior is still available by setting explicitly -pdbv3=no." (yes/no, true/false, on/off, 1/0 pairs are all okay for the setting).

Maybe I should have written a detailed manual about the various options, but experience has taught me it is not easy to keep things (commandline vs online vs hardcopy PDF) synchronized. Wrong documentation is worse than no documentation, and people do not like to read long manuals at all.

With DSSR, I have tried to make default options sensible for the most general use cases so that a regular user can get started right away [think of the 'deceptively' simple user interface of Google]. For expert users like you, I would not expect a 'static' manual to fill your needs :). So I am always quick in fixing bugs and in clarifying user confusion.

Quote
Then, you should may be have a forum topic for clarification issues (the bug report was my only possible choice - sorry).
I have already moved the thread to "General discussions (Q&As)". The current categories may not be comprehensive, and they already contain some overlaps.

Quote
For now, I tried the -pdbv3 option and it solves my issue, thanks.
I am glad the trick works.

Quote
On the other side, I think that you may be should reconsider the option of checking if an O2' is present since it lacks universality.
I was living with this program behavior for a while without realizing that I didn't get the expected results.
Just pay attention to your definition of backbone atoms.
Some files might contain C1' atoms and no other backbone atoms, so this seems a tricky issue.
I agree "this seems a tricky issue". In addition to setting -pdbv3=false (or no/off/0) explicitly, how about checking for both the existence of phosphorus (P) and the absence of O2' for DNA? Then your case won't qualify as a DNA fragment, and so Cs are not converted to DCs by default.
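
To make the proposed rule concrete, here is a small Python sketch. It is not 3DNA code: the function names, the use of PDB v3 atom names, and the idea of passing in the atom names of a whole fragment are all assumptions made purely for illustration.

```python
def looks_like_dna(atom_names):
    """Proposed rule: DNA only if P is present AND O2' is absent."""
    names = {name.strip() for name in atom_names}
    return "P" in names and "O2'" not in names


def output_base_name(base, atom_names, pdbv3=True):
    """Return the residue name to write, e.g. 'C' or 'DC' (hypothetical helper)."""
    if pdbv3 and looks_like_dna(atom_names):
        return "D" + base             # PDB v3.x style DNA name: DA/DC/DG/DT
    return base                       # leave the nucleotide name as is


# A fragment with bases and C1' atoms only: no P, so C stays C.
print(output_base_name("C", ["N1", "C2", "N3", "C4", "C1'"]))
# A fragment with phosphate but no O2': taken as DNA, so C becomes DC.
print(output_base_name("C", ["P", "OP1", "OP2", "C1'", "N1"]))
```

With O2' absence as the only criterion, the first fragment would have been taken as DNA; requiring P as well leaves it untouched.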

HTH,

Xiang-Jun


PS: Did you check the new feature "x3dna-dssr --non-pair" for non-pairing interactions (H-bonds or base-stacking) that you requested?

1057
General discussions (Q&As) / Re: possible rotate_mol labeling issue
« on: April 22, 2013, 03:09:53 pm »
Hi Pascal,

I won't take the issue you experienced as a 'bug' since rotate_mol works as expected. Here is how: as of 3DNA v2.1, the default output follows PDB format v3.x where DNA bases are named DA/DC/DG/DT etc. The program judges if a fragment is RNA/DNA based on the existence/absence of O2' atom. Since your example does not contain O2' atoms, it is taken as DNA. Thus the nucleotide Cs are converted to DCs.

You could use the -pdbv3=false option to get the desired behavior. Please have a try and post back if that works.

Since you bring up the issue, I may consider refining the code so that when backbone atoms do not exist, 3DNA leaves a nucleotide as is.

Xiang-Jun

1058
Hi Pascal,

I've just released DSSR beta-r09-on-20130421 which contains a new option -non-pair (or --non-pair) to detect/output non-pairing interactions, including H-bonds and base stacking. As an example, see below for PDB entry 1msy which contains a GNRA tetra-loop.

x3dna-dssr --non-pair -i=1msy -o=1msy.out

The output file '1msy.out' contains the following:

****************************************************************************
List of 12 non-pairing interaction(s)
   1 A.G2648          A.G2673         base-overlap-area=2.0   
       H-bonds[0]: ""
   2 A.U2650          A.G2671         base-overlap-area=0.1   
       H-bonds[0]: ""
   3 A.C2652          A.G2669         base-overlap-area=0.2   
       H-bonds[0]: ""
   4 A.A2654          A.U2656         base-overlap-area=3.7   
       H-bonds[1]: "O4'*O4'[3.05]"
   5 A.G2655          A.G2664         base-overlap-area=4.4   
       H-bonds[1]: "O2'(hydroxyl)-O6(carbonyl)[3.09]"
   6 A.G2655          A.A2665         base-overlap-area=0.0   
       H-bonds[3]: "N1(imino)-OP2[2.77]; N2(amino)-OP2[3.34]; N2(amino)-O5'[2.89]"
   7 A.U2656          A.G2664         base-overlap-area=0.0   
       H-bonds[2]: "OP2-N1(imino)[3.04]; OP2-N2(amino)[2.94]"
   8 A.A2657          A.A2665         base-overlap-area=3.7   
       H-bonds[0]: ""
   9 A.G2659          A.A2661         base-overlap-area=0.0   
       H-bonds[1]: "O2'(hydroxyl)-N7[2.60]"
  10 A.G2659          A.G2663         base-overlap-area=3.9   
       H-bonds[0]: ""
  11 A.U2660          A.A2661         base-overlap-area=7.5   
       H-bonds[0]: ""
  12 A.A2661          A.A2662         base-overlap-area=6.3   
       H-bonds[0]: ""
****************************************************************************


Please let me know how you'd like to revise the content/format. As always, concrete examples work the best.

Xiang-Jun

1059
Glad to hear that you've fixed the installation problems. Could you share what went wrong, and how you solved the issues? Some instructions in the post "How to install 3DNA on Linux and Windows?" are likely unclear and need to be improved.

Thanks for using 3DNA and posting on the forum.

Xiang-Jun

1060
General discussions (Q&As) / Re: BI and BII conformations
« on: April 12, 2013, 12:34:45 pm »
I am glad to see the happy :D. This thread helps to illustrate clearly that a concrete example is the most effective and unambiguous way to get a (technical) point across. So in the future, whenever you have a 3DNA-related question, please be specific, and do not hesitate to post on the forum.

Xiang-Jun

1061
General discussions (Q&As) / Re: BI and BII conformations
« on: April 12, 2013, 11:44:05 am »
Quote
if epsilon = 0 to +360 and zeta= 0 to +360 e-z = -360     0        +360, but how can get negative values in this way?

Well, I am a bit confused by your reply too. If epsilon=60, zeta=160, then e-z=60-160=-100, which is NEGATIVE, right?

That's why I asked you to provide some concrete examples to walk through. Okay, let's use PDB id 355d as an example. If you run the command:
Code:
analyze -tor=355d.tor 355d.pdb
you will have the following in file 355d.tor:

Code:
              base      chi A/S     alpha    beta   gamma   delta  epsilon   zeta     e-z BI/BII
------------------------------------------------------------------------------------------------
  10 A:..10_:[.DG]G   -83.6 anti    -60.3   163.2    39.5   143.2  -100.0   146.3   113.6  BII
  11 A:..11_:[.DC]C  -112.8 anti    -73.1   144.3    50.8   143.5  -164.4  -126.1   -38.3  BI

For nt C11, epsilon=-164.4, corresponding to -164.4+360=195.6; whilst zeta=-126.1, corresponding to -126.1+360=233.9; 195.6-233.9=-38.3 which is the value reported in 3DNA (see above). Since -38.3 < 0, 360 is added to it: -38.3+360=321.7, which is out of [20, 200], so it is assigned BI. Please work out the numbers for G10, and report back here.
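
The arithmetic above can be replayed with a few lines of Python. This is only a sketch of the steps as described in this thread (convert to [0, 360], subtract, add 360 to a negative difference for classification only), not the actual 3DNA source code; the function name is made up for the example.

```python
def classify_bi_bii(epsilon, zeta):
    """Return (raw e-z, 'BI' or 'BII') for torsions given in [-180, +180] degrees."""
    e = epsilon % 360.0               # map epsilon to [0, 360)
    z = zeta % 360.0                  # map zeta to [0, 360)
    d = e - z                         # raw difference; may be negative
    # Only for classification is a negative difference shifted by 360.
    d_pos = d + 360.0 if d < 0 else d
    label = "BII" if 20.0 <= d_pos <= 200.0 else "BI"
    return d, label


d, label = classify_bi_bii(-164.4, -126.1)    # nt C11 of 355d
print(f"{d:.1f} {label}")                     # prints: -38.3 BI
```

The raw value is returned unchanged, which is why a negative e-z can appear in the output even though the classification works on a value in [0, 360].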

Does this clarify your confusion?

Xiang-Jun

1062
General discussions (Q&As) / Re: BI and BII conformations
« on: April 11, 2013, 08:54:36 am »
Thanks for your followup. After checking the source code carefully and a sample 3DNA output, I noticed that my previous reply is incomplete (inaccurate)  :(. Thus the discrepancy: you found that e-z can be negative in analyze -tor output, while I said that it's in range [0, 360] which should always be positive.

The missing piece is that the 3DNA reported difference (d=e-z) is the raw value, which can be positive or negative. Only for classifying a BI/BII conformation is 360 added to a negative d value to make it positive; this simplifies the code, since the continuous range [20, 200] is checked for the BII conformation.

Does this solve your puzzle? If still not, please post some concrete examples so we can walk them through to get to the bottom of the issue.

Xiang-Jun

1063
General discussions (Q&As) / Re: BI and BII conformations
« on: April 10, 2013, 05:16:41 pm »
It is a good question; the short answer is that the range of e-z is in [0, 360].

In 3DNA, the backbone BI/BII classification is based on e-z and works as below:
  • It follows the definition as given by IMB Jena, "Nucleic acid backbone parameters". Specifically,
    BI:  (epsilon-zeta) = -160° ... +20°
    BII: (epsilon-zeta) = +20° ... +200°
  • In 3DNA, all torsion angles are given in the range of [-180, +180]. To calculate e-z, epsilon and zeta are first converted to [0, 360]. If the difference (d=e-z) is < 0, then d=d+360. So d should be in [0, 360]. If d is in [20, 200], it is classified as BII, otherwise, BI.
Does that make sense? If you know of any more elaborate definition of the BI/BII conformations, please let me know.

Xiang-Jun

1064
Hi Pascal,

Thanks for your request for adding non-pairing nucleotide-nucleotide contacts (nt-nt) to the 3DNA output. Currently, there is no such option in 3DNA, but I have had this topic in mind for quite a while. It won't be hard to implement this functionality in DSSR/3DNA, so I will probably be able to get something done next week.

While we are at it, in addition to non-pairing nt-nt interactions via H-bonds, how about those with only base-stacking? What output format would you prefer?

Xiang-Jun

1065
MD simulations / Re: Base stacking from x3dna_ensemble
« on: April 01, 2013, 12:06:03 pm »
Quote
theta  = arccos (dot(n1,n2)) where n1 and n2 are normal vectors from the b1n file

I wonder whether the angle I am getting is in radians?

You are getting the angle in radians instead of degrees. To verify, let n1 = [1, 0, 0]; n2 = [0, 0, 1], the angle should be 90 degrees.
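
For reference, the check can be written out in Python as below. The function name is hypothetical, and the normalization and clamping steps are added safeguards (assumptions on my part) in case the input vectors are not exactly unit length.

```python
import math


def angle_between_normals(n1, n2):
    """Angle between two base normal vectors, in degrees."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    # math.acos returns radians; convert explicitly to degrees.
    return math.degrees(math.acos(cos_theta))


print(round(angle_between_normals([1, 0, 0], [0, 0, 1]), 6))
```

Without the math.degrees() call, the same test vectors would give about 1.5708, which is pi/2 radians.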

Xiang-Jun

1066
General discussions (Q&As) / Re: zero or negative helical rise?
« on: March 28, 2013, 03:36:17 pm »
DSSR does not calculate helical/step parameters (at least not yet), as shown clearly in the enclosed file for 1xvk in my previous reply. See also my reply to "large deviations in output values".

In the 3DNA distribution, there is also a program called 'cehs' (as in SCHNAaP) which uses RC8--YC6 as the bp long axis, and the mean bp normal as the z-axis. In dinucleotide steps with non-canonical bps, 'cehs' normally provides more 'sensible' rise/twist values. Please have a try on 1xvk and report back how it goes.

Xiang-Jun

1067
General discussions (Q&As) / Re: large deviations in output values
« on: March 27, 2013, 12:54:49 pm »
Quote
There are large deviations from average values in some of the parameters. Are the values real? What is the best way to view them in structures if so ?
The short answer is yes, they are 'real', as would be expected from 3DNA. The large deviations of 3DNA parameters for 3sj2 (and many other PDB entries) are due to non-canonical base pairs (bp). In the case of 3sj2, they are the three G+G pairs.

3DNA adopts the standard base reference frame, which is based on what would be a perfectly planar Watson-Crick (WC) bp geometry. Thus, by definition, WC pairs have the six bp parameters (Shear, Stretch, Stagger, Buckle, Propeller, Opening) close to zero, with certain variations, mostly in Stagger, Buckle, and Propeller. Non-canonical bps can have numerous ways to deviate from a WC bp; however, whatever the case, they can be rigorously quantified by the six rigid-body bp parameters. See the 3DNA NAR03 paper for details.

A non-canonical bp not only has a set of characteristic bp parameters (see below for G+G bps in 3sj2), it could also greatly affect the (middle) bp reference frame used to derive the bp step and helical parameters. That's why you see large deviations of those parameters from normal WC steps (see below).

     bp        Shear    Stretch   Stagger    Buckle  Propeller  Opening
    3 G-U      -2.33     -0.47      0.13      2.30    -12.27      1.61
    4 C-G       0.22     -0.06     -0.00      3.70    -18.31      2.91
    5 C-G       0.22     -0.10      0.10      0.15     -4.59     -0.12
    6 G+G      -1.48     -3.61     -0.15     11.94      4.31     87.72

    step       Shift     Slide      Rise      Tilt      Roll     Twist
   4 CC/GG     -0.61     -1.97      3.29     -1.87      9.30     30.50
   5 CG/GG      0.04     -3.25     -1.34   -170.79     31.71    160.42
   6 GG/CG     -0.47     -3.69     -3.19    128.85   -110.61     97.02

3DNA derives a complete set of bp parameters to rigorously characterize the relative base geometry. For example, run the following 3DNA commands to see how 'analyze' and 'rebuild' complement each other to illustrate the point:
Code:
find_pair 3sj2.pdb stdout | analyze stdin
rebuild -atomic bp_step.par 3sj2-3dna.pdb
# superimpose '3sj2-3dna.pdb' onto '3sj2.pdb' using only base atoms, the rmsd would be ~0
find_pair 3sj2-3dna.pdb stdout | analyze stdin
# compare '3sj2-3dna.out' and '3sj2.out', the bp parameters are virtually identical

I see it as an advantage for 3DNA to report those 'weird' parameters, as it would (should) draw a user's attention. These bps and steps should be treated separately from the normal variations in structures (fragments) consisting of only WC bps. There could be ad hoc ways to make 'weird' values look normal, but they lack rigor and consistency. Again, see the 3DNA NAR03 paper for more info.

Note also that for a relatively straight duplex structure like 3sj2, 3DNA also outputs a set of "Global parameters based on C1'-C1' vectors:" as shown below, which you may find useful:
Code:
disp.: displacement of the middle C1'-C1' point from the helix
angle: inclination between C1'-C1' vector and helix (subtracted from 90)
twist: helical twist angle between consecutive C1'-C1' vectors
rise:  helical rise by projection of the vector connecting consecutive
       C1'-C1' middle points onto the helical axis

     bp       disp.    angle     twist      rise
   1 G-C      8.82     12.28     28.16      3.19
   2 G-C      8.09     11.46     32.47      2.62
   3 G-U      6.75     11.40     28.94      2.40
   4 C-G      7.06      8.02     31.16      2.87
   5 C-G      7.45      8.60     33.16      2.88
   6 G+G      7.07      7.95     29.95      2.72
   7 G-C      7.20      8.14     30.17      3.24
   8 C-G      6.86      9.94     34.81      2.83
   9 G+G      6.13     10.17     28.18      2.62
  10 G-C      6.56      6.75     33.11      2.91
  11 C-G      6.71      6.30     30.30      2.70
  12 G+G      6.05      9.99     33.37      2.80
  13 G-C      6.48     10.27     29.88      3.13
  14 G-C      6.82      7.48     27.46      2.97
  15 U-G      7.18      5.34     38.30      2.79
  16 C-G      7.16     10.13     26.27      2.90
  17 C-G      7.17     12.43      ---       ---

HTH,

Xiang-Jun

1068
RNA structures (DSSR) / Re: DSSR output - Base pair characteristics
« on: March 26, 2013, 01:33:06 pm »
Hi Jose,

Thanks for trying out DSSR and for your kind comment about the program. User feedback like yours is a great incentive for me to make DSSR a better tool to serve the RNA structure community.

To start with, I have tried hard to make DSSR easy to set up and play with. Based on my reading of literature in structural biology, I've made the DSSR output more intuitive (compared to previous 3DNA programs). For example, A.U2647 means U2647 on chain A. So far, I have not heard of any installation problem yet, and I am glad that you can make sense of most items in the DSSR output.

Your question regarding the meaning of the last column is well expected. It represents my own notation to specify a base pair, as elaborated below:
  • Each base has three edges: W for the Watson-Crick edge, M for the major groove edge, and m for the minor groove edge. M corresponds to the Hoogsteen (or C-H) edge of the Leontis-Westhof nomenclature, and for the majority of cases (where the glycosidic bond is anti) m agrees with the 'sugar' edge. Note that in DSSR, the edges are defined purely on the geometry of the base plane as would be in a Watson-Crick base pair, and it is not related to sugar. See my post "The chi (χ) torsion angle characterizes base/sugar relative orientation". The DSSR definition applies to RNA as well as DNA, with either syn or anti glycosidic bond.
  • In some boundary cases, the two bases in a pair may not be directly interacting edge-to-edge, where it is not straightforward to clearly designate which edge is involved. This is where the '.' comes in.
  • The DSSR notation contains 4 characters of the pattern: [ct][WMm.][+-][WMm.]. The third position is either '+' or '-', and it designates the relative orientation of the two bases (flipped or normal) as has been consistently used in 3DNA. For example, see the difference in A+U Hoogsteen pair vs. A-U Watson-Crick pair.
  • The first position is either 'c' for cis or 't' for trans of the two glycosidic bonds. It is defined by the 'virtual' torsion angle tor(N1-C1'-C1'-N9) reported in the DSSR output.
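
The four-character pattern described in the list above lends itself to a simple regular-expression check. The Python sketch below is illustrative only: the function name, the dictionary keys, and the wording used for the '.' case are assumptions, not DSSR terminology.

```python
import re

# [ct][WMm.][+-][WMm.], as described in the list above.
NOTATION = re.compile(r"^([ct])([WMm.])([+-])([WMm.])$")

EDGES = {
    "W": "Watson-Crick edge",
    "M": "major groove edge",
    "m": "minor groove edge",
    ".": "edge not clearly assigned",
}


def parse_bp_notation(code):
    """Decode a DSSR base-pair code such as 'cW-W' or 'cM+W'."""
    m = NOTATION.match(code)
    if m is None:
        raise ValueError(f"not a valid 4-character code: {code!r}")
    orientation, edge1, sign, edge2 = m.groups()
    return {
        "glycosidic": "cis" if orientation == "c" else "trans",
        "edge1": EDGES[edge1],
        "sign": sign,                 # '+' or '-', relative base orientation
        "edge2": EDGES[edge2],
    }


print(parse_bp_notation("cW-W"))      # a Watson-Crick pair
print(parse_bp_notation("cM+W"))
```

Note that 'M' and 'm' differ only in case, so the match has to stay case-sensitive.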

HTH,

Xiang-Jun

PS. Does "PDBID 3OXO" correspond to an RNA structure?

1069
General discussions (Q&As) / Re: zero or negative helical rise?
« on: March 25, 2013, 04:20:39 pm »
The issue you noticed with negative rise in 1xvk etc. is due to the extensive non-canonical base pairs. Under such circumstances, as is more commonly seen in RNA structures, the meaning of base-pair step and helical parameters may not make (much) intuitive sense. Yet, these parameters are required to rigorously characterize the structure. Note that the base-pair parameters (Shear, Stretch, Stagger, Buckle, Propeller and Opening) still have their normal interpretation.

I'd recommend that you use DSSR, which shows clearly the two Hoogsteen pairs, among other things, as shown below. Check also files dssr-torsions.dat, dssr-pairs.pdb etc.

HTH,

Xiang-Jun

****************************************************************************
    DSSR: Software for Defining the (Secondary) Structures of RNA
      by Xiang-Jun Lu (xiangjun@x3dna.org), beta-r08-on-20130323

   The program is currently under active development. As always, we
   greatly appreciate your feedback! Please report all DSSR-related
   issues on the 3DNA Forum (http://forum.x3dna.org/), and I strive
   to promptly respond to any questions posted there.
****************************************************************************
Date and time: Mon Mar 25 16:15:39 2013
File name: 1xvk.pdb1
    no. of DNA/RNA chains: 1 [A=16]
    no. of nucleotides:    16
    no. of waters:         112
    no. of metals:         2 [Mg=2]
****************************************************************************
List of 8 base pair(s)
   1 1:A.DG1          2:A.DC8          [G+C]              00-n/a    cHW cM+W
       74.1(syn) C2'-endo lambda=47.5; -100.4(anti) C2'-endo lambda=61.6
       d(C1'-C1')=8.31 d(N1-N9)=6.64 d(C6-C8)=6.16 tor(N1-C1'-C1'-N9)=-0.1
       H-bonds[2]: "N7*N3[2.69]; O6(carbonyl)-N4(amino)[2.85]"
       bp_pars: [0.46    -3.41   -0.35   3.71    -5.92   67.37]
   2 1:A.DC2          2:A.DG7          [C-G] WC           19-XIX    cWW cW-W
       -103.9(anti) C4'-exo  lambda=55.5; -105.3(anti) C1'-exo  lambda=52.5
       d(C1'-C1')=10.53 d(N1-N9)=8.80 d(C6-C8)=9.65 tor(N1-C1'-C1'-N9)=0.0
       H-bonds[3]: "O2(carbonyl)-N2(amino)[2.76]; N3-N1(imino)[2.89]; N4(amino)-O6(carbonyl)[2.74]"
       bp_pars: [0.27    -0.18   0.35    -22.28  3.73    -2.75]
   3 1:A.DG3          2:A.DC6          [G-C] WC           19-XIX    cWW cW-W
       -108.4(anti) C1'-exo  lambda=54.1; -107.8(anti) C4'-exo  lambda=54.2
       d(C1'-C1')=10.47 d(N1-N9)=8.77 d(C6-C8)=9.65 tor(N1-C1'-C1'-N9)=-2.6
       H-bonds[3]: "O6(carbonyl)-N4(amino)[2.87]; N1(imino)-N3[2.88]; N2(amino)-O2(carbonyl)[2.81]"
       bp_pars: [-0.38   -0.18   0.41    22.84   2.54    -2.67]
   4 1:A.DT4          2:A.DA5          [T+A] Hoogsteen    23-XXIII  cWH cW+M
       -95.4(anti) C2'-endo lambda=59.7; 68.3(syn) C1'-exo  lambda=53.5
       d(C1'-C1')=8.37 d(N1-N9)=6.77 d(C6-C8)=6.17 tor(N1-C1'-C1'-N9)=0.5
       H-bonds[2]: "N3(imino)-N7[2.86]; O4(carbonyl)-N6(amino)[2.80]"
       bp_pars: [-0.69   3.57    0.31    -3.48   7.41    -70.55]
   5 1:A.DA5          2:A.DT4          [A+T] Hoogsteen    23-XXIII  cHW cM+W
       68.3(syn) C1'-exo  lambda=53.5; -95.4(anti) C2'-endo lambda=59.7
       d(C1'-C1')=8.37 d(N1-N9)=6.77 d(C6-C8)=6.17 tor(N1-C1'-C1'-N9)=0.5
       H-bonds[2]: "N7-N3(imino)[2.86]; N6(amino)-O4(carbonyl)[2.80]"
       bp_pars: [0.69    -3.57   -0.31   3.48    -7.41   70.54]
   6 1:A.DC6          2:A.DG3          [C-G] WC           19-XIX    cWW cW-W
       -107.8(anti) C4'-exo  lambda=54.2; -108.3(anti) C1'-exo  lambda=54.1
       d(C1'-C1')=10.47 d(N1-N9)=8.77 d(C6-C8)=9.65 tor(N1-C1'-C1'-N9)=-2.6
       H-bonds[3]: "O2(carbonyl)-N2(amino)[2.81]; N3-N1(imino)[2.88]; N4(amino)-O6(carbonyl)[2.87]"
       bp_pars: [0.38    -0.18   0.41    -22.84  2.54    -2.67]
   7 1:A.DG7          2:A.DC2          [G-C] WC           19-XIX    cWW cW-W
       -105.3(anti) C1'-exo  lambda=52.5; -103.9(anti) C4'-exo  lambda=55.5
       d(C1'-C1')=10.53 d(N1-N9)=8.80 d(C6-C8)=9.65 tor(N1-C1'-C1'-N9)=0.0
       H-bonds[3]: "O6(carbonyl)-N4(amino)[2.74]; N1(imino)-N3[2.89]; N2(amino)-O2(carbonyl)[2.76]"
       bp_pars: [-0.27   -0.18   0.35    22.28   3.73    -2.75]
   8 1:A.DC8          2:A.DG1          [C+G]              00-n/a    cWH cW+M
       -100.4(anti) C2'-endo lambda=61.5; 74.1(syn) C2'-endo lambda=47.6
       d(C1'-C1')=8.31 d(N1-N9)=6.64 d(C6-C8)=6.16 tor(N1-C1'-C1'-N9)=-0.1
       H-bonds[2]: "N3*N7[2.69]; N4(amino)-O6(carbonyl)[2.86]"
       bp_pars: [-0.46   3.41    0.35    -3.71   5.93    -67.37]
****************************************************************************
List of 1 helix
  helix=1[2] bps=8
   1 1:A.DG1          2:A.DC8          [G+C]              00-n/a    cHW cM+W
   2 1:A.DC2          2:A.DG7          [C-G] WC           19-XIX    cWW cW-W
   3 1:A.DG3          2:A.DC6          [G-C] WC           19-XIX    cWW cW-W
   4 1:A.DT4          2:A.DA5          [T+A] Hoogsteen    23-XXIII  cWH cW+M
   5 1:A.DA5          2:A.DT4          [A+T] Hoogsteen    23-XXIII  cHW cM+W
   6 1:A.DC6          2:A.DG3          [C-G] WC           19-XIX    cWW cW-W
   7 1:A.DG7          2:A.DC2          [G-C] WC           19-XIX    cWW cW-W
   8 1:A.DC8          2:A.DG1          [C+G]              00-n/a    cWH cW+M
****************************************************************************
List of 2 stems
  stem=1[#1] bps=2
   1 1:A.DC2          2:A.DG7          [C-G] WC           19-XIX    cWW cW-W
   2 1:A.DG3          2:A.DC6          [G-C] WC           19-XIX    cWW cW-W

  stem=2[#1] bps=2
   1 1:A.DC6          2:A.DG3          [C-G] WC           19-XIX    cWW cW-W
   2 1:A.DG7          2:A.DC2          [G-C] WC           19-XIX    cWW cW-W
****************************************************************************
List of 1 coaxial stack(s)
   1 Helix#1 contains 2 stems: [#1, #2]
****************************************************************************
List of 1 internal loop(s)
   1 symmetric internal loop: 8 nts; [2x2]; linked by [#1, #2]
       1:A.DG3+1:A.DT4+1:A.DA5+1:A.DC6+2:A.DG3+2:A.DT4+2:A.DA5+2:A.DC6 [GTACGTAC]
****************************************************************************
>chain-A #1 DNA* with 16 nts
GCGTACGCGCGTACGC
.((..((..))..)).
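
For readers who want to post-process the dot-bracket line, here is a minimal Python sketch (not part of DSSR; the function name `dot_bracket_pairs` is made up for illustration). It assumes the string contains only `.`, `(` and `)`, and returns 1-based positions along the 16-nt sequence printed above it.

```python
def dot_bracket_pairs(dbn):
    """Return sorted 1-based (i, j) positions of matching parentheses."""
    stack, pairs = [], []
    for pos, symbol in enumerate(dbn, start=1):
        if symbol == "(":
            stack.append(pos)          # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Dot-bracket string copied from the output above.
pairs = dot_bracket_pairs(".((..((..))..)).")
print(pairs)
```

For this string the result is the four pairs 2-15, 3-14, 6-11 and 7-10, which line up with the four Watson-Crick pairs listed under the two stems if positions 9 to 16 are read as nucleotides 1 to 8 of model 2. The Hoogsteen and terminal G+C pairs are shown as dots.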

1070
Thanks for pointing out the PDB id (2kd4) of the structure you are interested in, and for attaching the corresponding 3DNA output files. As always, such information is useful because it makes our discussion concrete.

Browsing through the output file (2kd4.out) and looking at the structure with Jmol, it appears to me that 3DNA has no problem analyzing this structure.

First, the twist angles associated with the intercalated steps (those with a rise above 6 Å, i.e. steps 3 and 5) are smaller than normal A-DNA twist values, while those for the flanking steps (2, 4 and 6) are larger.

Local base-pair step parameters
    step       Shift     Slide      Rise      Tilt      Roll     Twist
   1 GC/GC      1.21      0.19      3.77      5.32    -16.06      9.14
   2 CC/GG      0.42     -1.82      4.09    -13.22      0.06     37.79
   3 CG/CG     -0.08     -1.17      6.68     -7.87    -11.95     20.28
   4 GC/GC      0.07     -0.19      3.40      0.22     -0.57     49.80
   5 CG/CG     -0.14     -1.18      6.62      3.50     -6.88     21.72
   6 GG/CC      0.16     -1.82      3.69     14.10     12.69     39.30
   7 GC/GC     -1.33      0.15      3.44     -6.43    -13.80      6.98
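
The comparison can be checked with a short Python sketch over the table above. Two values are my assumptions for illustration, not 3DNA settings: a rise cutoff of 5.0 Å to flag intercalated steps, and a reference A-form twist of 360/11 degrees (about 11 base pairs per turn).

```python
# Step parameters copied from the 2kd4 table above
# (distances in angstroms, angles in degrees).
STEP_TABLE = """\
1 GC/GC      1.21      0.19      3.77      5.32    -16.06      9.14
2 CC/GG      0.42     -1.82      4.09    -13.22      0.06     37.79
3 CG/CG     -0.08     -1.17      6.68     -7.87    -11.95     20.28
4 GC/GC      0.07     -0.19      3.40      0.22     -0.57     49.80
5 CG/CG     -0.14     -1.18      6.62      3.50     -6.88     21.72
6 GG/CC      0.16     -1.82      3.69     14.10     12.69     39.30
7 GC/GC     -1.33      0.15      3.44     -6.43    -13.80      6.98
"""

RISE_CUTOFF = 5.0           # assumption: larger rise marks an intercalated step
A_FORM_TWIST = 360.0 / 11   # assumption: about 11 bp per turn for A-form


def parse_steps(text):
    """Parse 'index name shift slide rise tilt roll twist' rows into dicts."""
    names = ("shift", "slide", "rise", "tilt", "roll", "twist")
    steps = []
    for line in text.strip().splitlines():
        fields = line.split()
        step = {"index": int(fields[0]), "name": fields[1]}
        step.update(zip(names, map(float, fields[2:8])))
        steps.append(step)
    return steps


steps = parse_steps(STEP_TABLE)
twist = {s["index"]: s["twist"] for s in steps}
intercalated = [s["index"] for s in steps if s["rise"] > RISE_CUTOFF]
# Flanking steps are the immediate neighbours of an intercalated step.
flanking = sorted({i for k in intercalated for i in (k - 1, k + 1)
                   if i in twist} - set(intercalated))

for label, group in (("intercalated", intercalated), ("flanking", flanking)):
    for i in group:
        print(f"{label:12s} step {i}: twist {twist[i]:6.2f} "
              f"(reference {A_FORM_TWIST:.2f})")
```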

Second, the output for the backbone torsion angles is also as expected (see below for strand I). Because of the 2'--5' backbone linkage, the angles alpha [O3'(i-1)-P-O5'-C5'], epsilon [C4'-C3'-O3'-P(i+1)] and zeta [C3'-O3'-P(i+1)-O5'(i+1)] are not defined in the conventional sense, so they are reported as "---".

Strand I
  base    alpha    beta   gamma   delta  epsilon   zeta    chi
   1 G     ---     ---   -135.8   105.5    ---     ---     18.3
   2 C     ---    145.3    62.2    86.8    ---     ---   -126.7
   3 C     ---    176.8   -35.2   134.4    ---     ---   -103.9
   4 G     ---    115.9    64.8    61.7    ---     ---   -137.3
   5 C     ---    163.8    48.4   113.5    ---     ---   -114.8
   6 G     ---    178.6    50.6    83.4    ---     ---   -120.9
   7 G     ---   -149.2    33.4   139.0    ---     ---    -83.2
   8 C     ---    154.9    20.2   120.1    ---     ---   -132.2
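
For reference, each angle in this table is a torsion over four consecutive atoms, as in the bracketed definitions above. Below is a generic Python sketch of that calculation. It is not 3DNA's code, the function name `torsion` is mine, and the coordinates in the usage lines are toy points rather than atoms from 2kd4. The sign follows the usual convention: positive when, looking along the central bond, the front bond must turn clockwise to eclipse the rear one.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)              # normal of the plane p1-p2-p3
    n2 = _cross(b2, b3)              # normal of the plane p2-p3-p4
    length = math.sqrt(_dot(b2, b2))
    if length == 0.0:
        raise ValueError("p2 and p3 coincide; torsion is undefined")
    unit_b2 = (b2[0] / length, b2[1] / length, b2[2] / length)
    y = _dot(unit_b2, _cross(n1, n2))  # proportional to sin(angle)
    x = _dot(n1, n2)                   # proportional to cos(angle)
    return math.degrees(math.atan2(y, x))


# Toy coordinates only: central bond along z, outer atoms in the x-y plane.
print(torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
print(torsion((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

With a 2'--5' linkage, O3'(i-1) is not bonded to P, so the conventional four atoms for alpha, epsilon and zeta do not form a connected chain.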


Note the following output from find_pair:
Code:
^^vv opposite bp direction: 1(8) 1(1)-2(2)
^^vv opposite bp direction: 1(8) 7(7)-8(8)

I have attached the stacking diagram of the first step, where one can clearly see that the two base pairs have opposite orientations. The same is true for the last step. Such base flapping occurs around B-Z junctions, as in 2acj. I do not know how reliable this part of the structure is, or how relevant it is.

You may also want to analyze this structure (or related ones) using Curves+. I noticed that Horowitz et al. used Curves to analyze 2kd4 in their JACS publication.

HTH,

Xiang-Jun

1071
Could you try 3DNA on the structure you linked to ("Solution Structure and Thermodynamics of 2′,5′ RNA Intercalation"), and report back any problems you encounter?

Xiang-Jun

PS: In posting a question, it'd be very helpful to attach a structure file, or provide a PDB id.

1072
RNA structures (DSSR) / Re: Bug report of DSSR beta
« on: March 19, 2013, 12:36:23 am »
Hi Marc,

I've just released DSSR beta-r06-on-20130319 which should have fixed the "segmentation fault" bug for PDB entry 2a64. I've also taken this opportunity to update the -h (--help) message.

Have a try and report back how it goes!

Xiang-Jun

1073
RNA structures (DSSR) / Re: Bug report of DSSR beta
« on: March 18, 2013, 01:33:30 pm »
Hi Marc,

Thanks for trying out DSSR and reporting a bug in the 64-bit Unix version of beta-r05-on-20130316 on PDB entry 2a64. I've verified the bug and will get it fixed ASAP.

I've also tried the same 2a64 entry with the Mac OS X version without the reported problem. Since the "Segmentation fault (core dumped)" occurs after the output of "total number of junctions: 4", it is likely due to the ribose zipper detection function recently added into DSSR.

In any event, keep testing and report back any further issues you encounter. Also stay tuned for the next release! :)

Xiang-Jun

1074
RNA structures (DSSR) / Re: DSSR - List of bases involved in hairpins?
« on: March 14, 2013, 03:29:27 pm »
I've just updated DSSR to beta-r04-on-20130314. Among other things, all nucleotides in hairpin loops are now explicitly listed, in addition to the information reported before. So for 1msy, the new output looks as follows:

Code:
List of 1 hairpin loop(s)
   1 nts=4 GUAA closed by pair {A.C2658+A.G2663 [CG], #-1}
       A.C2658+A.G2659+A.U2660+A.A2661+A.A2662+A.G2663 [CGUAAG]

I have some reasons to report hairpin loops differently from other types of loops (bulges, internal loops, junctions), one being to follow the convention. For example, at first glance, one would immediately see that 1msy contains a GUAA tetra-loop (of the most common GNRA type).
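
As an illustration of why the compact form is handy, here is a small Python sketch (not a DSSR feature; the function names and regular expressions are mine) that pulls the loop sequence out of the summary line shown above and tests it against the GNRA pattern: G, any nucleotide, a purine (A or G), then A. It assumes the beta output format shown above, which may change in later releases.

```python
import re

# Assumed layout of the hairpin summary line, e.g. "1 nts=4 GUAA closed by pair {...}".
HAIRPIN_LINE = re.compile(r"nts=(\d+)\s+(\S+)\s+closed by pair")
# GNRA: G, any of A/C/G/U, a purine (A or G), then A.
GNRA = re.compile(r"G[ACGU][AG]A")


def loop_sequence(line):
    """Return the loop sequence from a hairpin summary line, or None."""
    match = HAIRPIN_LINE.search(line)
    return match.group(2) if match else None


def is_gnra(sequence):
    """True if the sequence is a 4-nt loop of the GNRA type."""
    return GNRA.fullmatch(sequence.upper()) is not None


line = "   1 nts=4 GUAA closed by pair {A.C2658+A.G2663 [CG], #-1}"
seq = loop_sequence(line)
print(seq, is_gnra(seq))
```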

I am not aware of a consistent way to name other loops in the literature of RNA structures, so I've come up with my own convention. I'd like to hear what the community has to say when DSSR gains more popularity in the RNA structural world, and make adjustments accordingly.

HTH,

Xiang-Jun

1075
RNA structures (DSSR) / Re: DSSR - List of bases involved in hairpins?
« on: March 12, 2013, 06:40:11 pm »
Thanks for your feedback. I will get DSSR updated in a couple of days, with the requested info added to hairpin loops. In the meantime, please test DSSR more thoroughly and report back any thoughts that would make it a better tool from a user's perspective.

Best regards,

Xiang-Jun


Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.