Topics - xiangjun

DSSR-NAR paper / Supplementary Figure 1 -- four base triplets in yeast phenylalanine tRNA (1ehz)

« on: July 05, 2015, 08:55:54 am »

[Image: four base triplets in tRNA (1ehz)]

Quote

Figure S1: The four base triplets and associated hydrogen bonds (dashed lines) detected by DSSR in yeast tRNA^Phe (PDB id: 1ehz). (A) UAA (U8,A14,A21), with a reverse Hoogsteen U8–A14 pair. (B) AUA (A9,U12,A23). (C) gCG (2MG10,C25,G45). (D) CGg (C13,G22,7MG46). Here, the bases in each triplet are listed in sequential order, with the one-letter shorthand form followed by more detailed identifiers in parentheses. In (C) and (D), the lower case ‘g’ represents the shortened name for modified guanine nucleotides 2MG10 and 7MG46, respectively. Note that the triplets and hydrogen bonds match those originally reported by Quigley and Rich for yeast tRNA^Phe.

Starting from "1ehz.pdb", downloaded from the RCSB PDB, here is the complete script that generates each of the four base-triplet images in PNG format.

Code: Bash

x3dna-dssr -i=1ehz.pdb -o=1ehz.out --prefix=1ehz
 
ex_str -1 1ehz-multiplets.pdb 1ehz-m1.pdb
x3dna-dssr -i=1ehz-m1.pdb -o=1ehz-m1.pml --hbfile-pymol
pymol -qkc 1ehz-m1.pml
convert -trim +repage -border 10 -bordercolor white 1ehz-m1-pymol.png 1ehz-m1.png
 
ex_str -2 1ehz-multiplets.pdb 1ehz-m2.pdb
x3dna-dssr -i=1ehz-m2.pdb -o=1ehz-m2.pml --hbfile-pymol
pymol -qkc 1ehz-m2.pml
convert -trim +repage -border 10 -bordercolor white 1ehz-m2-pymol.png 1ehz-m2.png
 
ex_str -3 1ehz-multiplets.pdb 1ehz-m3.pdb
x3dna-dssr -i=1ehz-m3.pdb -o=1ehz-m3.pml --hbfile-pymol
pymol -qkc 1ehz-m3.pml
convert -trim +repage -border 10 -bordercolor white 1ehz-m3-pymol.png 1ehz-m3.png
 
ex_str -4 1ehz-multiplets.pdb 1ehz-m4.pdb
x3dna-dssr -i=1ehz-m4.pdb -o=1ehz-m4.pml --hbfile-pymol
pymol -qkc 1ehz-m4.pml
convert -trim +repage -border 10 -bordercolor white 1ehz-m4-pymol.png 1ehz-m4.png

Note:

The --prefix option gives the auxiliary files the specified prefix instead of the default "dssr". For example, "dssr-multiplets.pdb" becomes "1ehz-multiplets.pdb".
The ex_str utility program is from the 3DNA distribution. It is used to extract a specific model from a MODEL/ENDMDL ensemble.
The DSSR --hbfile-pymol option is used to generate a .pml file with all required settings for rendering in PyMOL.
The convert program, from ImageMagick, is used here to trim extra white borders.
The multiplet PNG images (here four triplets) were combined and annotated in Inkscape to produce the final illustration.
For completeness, here is the tarball file containing all the data files and the script ("tasks"): supp-fig1-1ehz-multiplets.tar.gz
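
The four per-triplet blocks in the script differ only in the model number, so they can be folded into one loop. The following is a minimal sketch, not the "tasks" script shipped in the tarball: the names run and render_multiplets and the DRY_RUN switch are made up for this example, while the commands themselves are the ones listed above.

```shell
# Sketch only: "run", "render_multiplets" and DRY_RUN are illustrative names,
# not part of DSSR or 3DNA. The commands are those of the script above.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# render_multiplets PREFIX COUNT
# Extract models 1..COUNT from PREFIX-multiplets.pdb and render each to PNG.
render_multiplets() {
    local prefix count n model
    prefix=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        model="${prefix}-m${n}"
        run ex_str "-${n}" "${prefix}-multiplets.pdb" "${model}.pdb" || return 1
        run x3dna-dssr "-i=${model}.pdb" "-o=${model}.pml" --hbfile-pymol || return 1
        run pymol -qkc "${model}.pml" || return 1
        run convert -trim +repage -border 10 -bordercolor white \
            "${model}-pymol.png" "${model}.png" || return 1
        n=$((n + 1))
    done
}
```

With DSSR, 3DNA, PyMOL and ImageMagick on the PATH, running "x3dna-dssr -i=1ehz.pdb -o=1ehz.out --prefix=1ehz" followed by "render_multiplets 1ehz 4" should be equivalent to the script above. Setting DRY_RUN=1 first prints the commands without running them, which is a cheap way to check the file names.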

Here is a sample image generated with the above script:

DSSR-NAR paper / Summary table

« on: July 04, 2015, 07:53:36 am »

The only table in the paper presents a summary of common structural features identified by DSSR (in default settings) for ten representative RNA molecules. The screen-captured image of the table shown below is from a DSSR poster, where the citations to the PDB entries have been removed for simplicity.

[Image: Summary of structural features identified by DSSR (in default settings) for ten representative RNA molecules]

This table aims to illustrate the applicability of DSSR to a broad range of typical RNA molecules, from the classic yeast phenylalanine tRNA (1ehz) and the new env22 twister ribozyme (4rge) to the whole E. coli 70S (5afi) and S. cerevisiae 80S (4u4o) ribosomes. In each case, the listed results are produced by running DSSR automatically, as shown below:

Code: [Select]

x3dna-dssr -i=1ehz.pdb -o=1ehz.out
x3dna-dssr -i=4u4o.cif -o=4u4o.out

Note that either the old .pdb or the new .cif (PDBx/mmCIF) format, as downloaded directly from the PDB, can be fed into DSSR. The time needed to finish each analysis depends on the hardware configuration. For example, one of the reviewers tested DSSR on a Windows machine and reported "08:20 min on the unprocessed 4U4O.cif file". That is why the 'time' column gives only a rough value. The point is that DSSR runs almost instantaneously on a contemporary laptop computer, except for the analyses of very large ribosomal RNA structures.
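
The 'time' column uses an mm:ss format with whole-second resolution, which is why the small structures show 00:00. How the timings in the table were actually collected is not described in this post; the sketch below is just one simple way to get comparable wall-clock numbers on your own machine. The helper names mmss and time_run are invented for this example.

```shell
# Sketch only: mmss and time_run are illustrative helpers, not DSSR options.

# mmss SECONDS: print a whole number of seconds as MM:SS.
mmss() {
    printf '%02d:%02d\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}

# time_run COMMAND [ARGS...]: run the command, then report the elapsed
# wall-clock time on stderr. The exit status of the command is passed through.
time_run() {
    local start end rc
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    printf 'elapsed %s: %s\n' "$(mmss $(( end - start )))" "$*" >&2
    return "$rc"
}
```

For example, "time_run x3dna-dssr -i=4u4o.cif -o=4u4o.out" runs the analysis as usual and then prints one "elapsed" line. Since date +%s counts whole seconds, anything that finishes in under a second is reported as 00:00.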

To ensure accuracy and for easy verification, the table itself was generated semi-automatically with a Ruby script, which produced a text file (named 'summary.txt') with content as follows:

Code: [Select]

|name|id|nt|pair|multiplet|helix|stem|hairpin|iloop|junction|pseudoknot|time|
|tRNA|1ehz|76(14)|34(21)|4|2|4|3|0(0)|1|1|00:00|
|tRNA mimicry|4p5j|84(1)|37(27)|4|2|5|5|0(0)|1|1|00:00|
|Twister ribozyme|4rge (chain A)|56(0)|31(20)|4|2|6|2|0(0)|1|2|00:00|
|SAM-I riboswitch|2gis|95(1)|47(30)|8|3|7|3|3(1)|2|1|00:00|
|Cas9-RNA-DNA|4oo8|117(0)|49(43)|0|5|6|4|1(1)|0|0|00:00|
|Group I intron|1gid (chain A)|158(0)|82(48)|14|4|10|3|4(4)|1|0|00:00|
|Group II intron|3bwp|349(0)|159(104)|12|10|23|6|9(1)|2|4|00:01|
|Large ribosomal subunit|1s72|2876(5)|1459(811)|242|86|179|68|67(36)|36|3|00:31|
|E. coli ribosome-EF-TU complex|5afi|4801(53)|2383(1332)|325|134|297|116|126(66)|54|2|02:45|
|Yeast 80S ribosome|4u4o|10398(0)|4927(2705)|572|317|636|231|348(139)|120|4|06:07|

In Org mode in Emacs, the above text file can be automatically expanded into a neatly formatted table:

Code: [Select]

| name                           | id             |       nt |       pair | multiplet | helix | stem | hairpin |    iloop | junction | pseudoknot |  time |
| tRNA                           | 1ehz           |   76(14) |     34(21) |         4 |     2 |    4 |       3 |     0(0) |        1 |          1 | 00:00 |
| tRNA mimicry                   | 4p5j           |    84(1) |     37(27) |         4 |     2 |    5 |       5 |     0(0) |        1 |          1 | 00:00 |
| Twister ribozyme               | 4rge (chain A) |    56(0) |     31(20) |         4 |     2 |    6 |       2 |     0(0) |        1 |          2 | 00:00 |
| SAM-I riboswitch               | 2gis           |    95(1) |     47(30) |         8 |     3 |    7 |       3 |     3(1) |        2 |          1 | 00:00 |
| Cas9-RNA-DNA                   | 4oo8           |   117(0) |     49(43) |         0 |     5 |    6 |       4 |     1(1) |        0 |          0 | 00:00 |
| Group I intron                 | 1gid (chain A) |   158(0) |     82(48) |        14 |     4 |   10 |       3 |     4(4) |        1 |          0 | 00:00 |
| Group II intron                | 3bwp           |   349(0) |   159(104) |        12 |    10 |   23 |       6 |     9(1) |        2 |          4 | 00:01 |
| Large ribosomal subunit        | 1s72           |  2876(5) |  1459(811) |       242 |    86 |  179 |      68 |   67(36) |       36 |          3 | 00:31 |
| E. coli ribosome-EF-TU complex | 5afi           | 4801(53) | 2383(1332) |       325 |   134 |  297 |     116 |  126(66) |       54 |          2 | 02:45 |
| Yeast 80S ribosome             | 4u4o           | 10398(0) | 4927(2705) |       572 |   317 |  636 |     231 | 348(139) |      120 |          4 | 06:07 |
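
Outside Emacs, the same expansion can be approximated on the command line. The sketch below is neither the Ruby script mentioned above nor Org mode's own alignment code. It is a small awk filter, wrapped in a shell function with the made-up name align_table, that pads each column to its widest cell. Columns whose data cells all look like numbers, such as 76(14) or 00:31, are right-aligned; the rest are left-aligned.

```shell
# Sketch only: align_table is an illustrative name; the numeric-column rule
# is a simplification chosen for this table, not the logic Org mode uses.
# Usage: align_table summary.txt    (or: align_table < summary.txt)
align_table() {
    awk -F'|' '
    NF >= 3 {
        rows++
        ncol = NF - 2                     # fields 1 and NF lie outside the outer pipes
        for (c = 1; c <= ncol; c++) {
            cell[rows, c] = $(c + 1)
            if (length($(c + 1)) > width[c]) width[c] = length($(c + 1))
            # Any non-numeric data cell (row 1 is the header) makes the column text.
            if (rows > 1 && $(c + 1) !~ /^[0-9][0-9():]*$/) text[c] = 1
        }
    }
    END {
        for (r = 1; r <= rows; r++) {
            line = "|"
            for (c = 1; c <= ncol; c++) {
                flag = text[c] ? "-" : ""  # "-" left-aligns in printf
                line = line sprintf(" %" flag width[c] "s |", cell[r, c])
            }
            print line
        }
    }' "$@"
}
```

The whole file is read before anything is printed because a column's width is only known once its last row has been seen. On the ten-row summary.txt above, this is expected to give the same layout as the Org mode table, but treat that as something to verify rather than a guarantee.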

Note again the time column: here it is based on an iMac, which is much faster than the 2011 MacBook Air used to produce the table in the paper.

DSSR-NAR paper / Reproducing results published in the DSSR-NAR paper

« on: July 03, 2015, 11:37:23 am »

I am pleased to announce that a paper on DSSR, an integrated software tool for dissecting the spatial structure of RNA, has recently been published in Nucleic Acids Research (NAR). Co-authored by Harmen Bussemaker, Wilma Olson and me (a team with a unique combination of complementary expertise), this DSSR paper represents another solid piece of work that I can be proud of. Moreover, in contrast to our previous GpU dinucleotide platform paper focusing on results, and the two major 3DNA papers concentrating on methods, the current NAR article describes significant scientific findings that are enabled by the novel analysis algorithms implemented in the DSSR software program. The abstract of the paper is quoted below:

Quote

Insight into the three-dimensional architecture of RNA is essential for understanding its cellular functions. However, even the classic transfer RNA structure contains features that are overlooked by existing bioinformatics tools. Here we present DSSR (Dissecting the Spatial Structure of RNA), an integrated and automated tool for analyzing and annotating RNA tertiary structures. The software identifies canonical and noncanonical base pairs, including those with modified nucleotides, in any tautomeric or protonation state. DSSR detects higher-order coplanar base associations, termed multiplets. It finds arrays of stacked pairs, classifies them by base-pair identity and backbone connectivity, and distinguishes a stem of covalently connected canonical pairs from a helix of stacked pairs of arbitrary type/linkage. DSSR identifies coaxial stacking of multiple stems within a single helix and lists isolated canonical pairs that lie outside of a stem. The program characterizes ‘closed’ loops of various types (hairpin, bulge, internal, and junction loops) and pseudoknots of arbitrary complexity. Notably, DSSR employs isolated pairs and the ends of stems, whether pseudoknotted or not, to define junction loops. This new, inclusive definition provides a novel perspective on the spatial organization of RNA. Tests on all nucleic acid structures in the Protein Data Bank confirm the efficiency and robustness of the software, and applications to representative RNA molecules illustrate its unique features. DSSR and related materials are freely available at http://x3dna.org/.

This section on the 3DNA Forum is dedicated to topics on reproducing the results reported in the DSSR article. In the following series of posts, I will provide the scripts and related data files where necessary so that any interested parties can rigorously reproduce our results, as presented (mostly) in the table and figures (including supplementary ones). I welcome any questions and comments you may have. Please post them here instead of (or in addition to) sending me emails.

Table 1: summary of structural features identified by DSSR (in default settings) for ten representative RNA molecules.

Six main figures

Figure 1 -- summary of methods to identify nucleic acid structural components
Figure 2 -- analysis of the yeast phenylalanine tRNA (1ehz)
Figure 3 -- analysis of the tRNA mimic (4p5j)
Figure 4 -- analysis of the env22 twister ribozyme (4rge)
Figure 5 -- analysis of the SAM-I riboswitch (2gis)
Figure 6 -- analysis of the CRISPR Cas9-sgRNA-DNA ternary complex (4oo8)

Supplementary data: nine figures and the main output of a sample DSSR run, combined into one PDF file (Lu-DSSR-supp.pdf)

Supplementary Figure 1 -- four base triplets in yeast phenylalanine tRNA (1ehz)
Supplementary Figure 2 -- three similar base pairs in tRNA and its mimic
Supplementary Figure 3 -- four base triplets in the tRNA mimic (4p5j)
Supplementary Figure 4 -- four multiplets in the env22 twister ribozyme (4rge)
Supplementary Figure 5 -- characterizing junction loops in twister ribozyme
Supplementary Figure 6 -- DSSR-identified k-turn in the SAM-I riboswitch
Supplementary Figure 7 -- eight base triplets in the SAM-I riboswitch (2gis)
Supplementary Figure 8 -- secondary structure diagram of CRISPR RNA-DNA hybrid (4oo8)
Supplementary Figure 9 -- comparison of diloops

Output of a sample DSSR run on yeast phenylalanine tRNA (1ehz)

Cartoon-block representations: four more sample schematic images created with PyMOL and DSSR (cartoon-block.tar.gz). Thomas Holder, the Principal Developer of PyMOL, has written a PyMOL plugin that implements the dssr_block command. Now users can create "block" shaped cartoons for nucleic acid bases and base pairs interactively in PyMOL.

Best regards,

Xiang-Jun

MD simulations / MOVED: modified RNA

« on: June 10, 2015, 12:33:54 pm »

This topic has been moved to RNA structures (DSSR).

http://forum.x3dna.org/index.php?topic=532.0

General discussions (Q&As) / MOVED: cif file compatibility?

« on: May 10, 2015, 05:40:23 pm »

This topic has been moved to RNA structures (DSSR).

http://forum.x3dna.org/index.php?topic=517.0

RNA structures (DSSR) / DSSR: Dissecting the Spatial Structure of RNA

« on: March 28, 2015, 12:42:38 pm »

As the number of experimentally solved RNA-containing structures grows, it is becoming increasingly important to characterize the geometric features of the molecules consistently and efficiently. Existing RNA bioinformatics tools are fragmented, and suffer in either scope or usability. DSSR, an integrated software tool for Dissecting the Spatial Structure of RNA, has been designed from the ground up to streamline the analyses of three-dimensional RNA structures. This new program consolidates, refines, and significantly extends the functionality of 3DNA for RNA structural analysis.

Starting from an RNA structure in PDB or PDBx/mmCIF format, DSSR employs a set of simple geometric criteria to identify all existent base pairs (bp): either canonical Watson-Crick and wobble pairs or non-canonical pairs with at least one hydrogen bond. The latter pairs may include normal or modified bases, regardless of tautomeric or protonation state. DSSR uses the six standard rigid-body bp parameters (shear, stretch, stagger, propeller, buckle, and opening) to rigorously quantify the spatial disposition of any two interacting bases. Where applicable, the program also denotes a bp by common names, the Saenger classification scheme of 28 H-bonding types, and the Leontis-Westhof nomenclature of 12 basic geometric classes.

DSSR detects multiplets (triplets or higher-order base associations) by searching horizontally in the plane of the associated bp for further H-bonding interactions. The program determines double-helical regions by exploring vertically in the neighborhood of selected bps for base-stacking interactions, regardless of backbone connection (e.g., coaxial stacking of helices). DSSR then identifies hairpin loops, bulges, internal loops, and multi-branch loops (junctions), and recognizes the existence of pseudoknots. The program outputs RNA secondary structure in dot-bracket notation (dbn), connectivity table (.ct) and CRW bpseq formats that can be fed directly into visualization tools (such as VARNA).
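
To make the dot-bracket notation concrete: each '(' opens a pair that is closed by the matching ')', so a simple stack recovers the paired positions. The sketch below is not DSSR code. dbn_pairs is a made-up helper, and it deliberately ignores every character other than round brackets, so pairs written with other bracket types (commonly used for pseudoknots) are skipped.

```shell
# Sketch only: dbn_pairs is an illustrative helper, not part of DSSR.
# dbn_pairs STRING: print "i j" (1-based positions) for every ( ... ) pair.
dbn_pairs() {
    printf '%s\n' "$1" | awk '
    {
        top = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            ch = substr($0, i, 1)
            if (ch == "(") {
                stack[++top] = i          # remember where the pair opens
            } else if (ch == ")") {
                if (top == 0) { bad = 1; exit }
                print stack[top--], i     # innermost open bracket closes first
            }
        }
        if (top > 0) bad = 1
    }
    END { if (bad) { print "unbalanced brackets" > "/dev/stderr"; exit 1 } }'
}
```

For the 1msy string listed in Example #1 below, .(((((.....(....)....)))))., the first line printed is "12 17". Position 1 is U2647, so this is the isolated C2658-G2663 pair that closes the GUAA hairpin loop; the five pairs of the stem follow, from the innermost outwards.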

DSSR classifies dinucleotide steps into the most common A-, B-, or Z-form double helices, calculates commonly used backbone torsion angles, and assigns the consensus RNA backbone suite names. The program also identifies A-minor interactions, ribose zippers, G quartets, kissing loops, U-turns, and kink-turns. Furthermore, it reports non-pairing interactions (H-bonding or base-stacking) between two nucleotides, and contacts involving phosphate groups.

A simple web interface and a comprehensive user manual are available. Supported by Dr. Robert Hanson, DSSR has recently been integrated into Jmol, a popular molecular graphics program. DSSR-related news and information can be found on the 3DNA homepage. Questions and suggestions are always welcome on the 3DNA forum.

Give DSSR a try, compare it with similar tools in terms of usability, functionality and support, and see the differences!

As of version 2.0, DSSR has been licensed by Columbia University.

List of users who have helped improve DSSR by reporting bugs, making comments/suggestions, etc.:

jyvdf3asdg2; kailsen; MarcParisien; jctoledo; Auffinger; febos; acolasanti; hansonr; cllawson; Sylverlin; cigdem; lvelve0901; meier74; jms89; chemikeris; Bernhard10; rcsb_pdb; Marcel Heinz; lijun; tctcab; brinda.vallat

-- Xiang-Jun

Note: please start a new topic with a more specific title; do not post directly below this announcement.

Here are some sample runs (see x3dna-dssr -h for more info):

Code: [Select]

x3dna-dssr -i=1msy.pdb -o=1msy.out  # 27 nts
x3dna-dssr --input=1msy.pdb --output=1msy.out  # same as above
x3dna-dssr -i=1msy.pdb --json -o=1msy-dssr.json  # parameters exported in JSON format
x3dna-dssr -i=1ehz.pdb -o=1ehz.out  # tRNA, 76 nts
x3dna-dssr -i=1jj2.pdb -o=1jj2.out  # rRNA, 2876 nts

Example #1: GUAA tetraloop mutant of Sarcin/Ricin domain from E. coli 23S rRNA (1msy)

Code: [Select]

Run: x3dna-dssr -i=1msy.pdb --time-stamp=off -o=1msy.out --non-pair --u-turn
****************************************************************************
              DSSR: an Integrated Software Tool for
             Dissecting the Spatial Structure of RNA
             v1.9.10-2020apr23, by xiangjun@x3dna.org

DSSR has been made possible by the NIH grant R01GM096889 (to X.J.Lu).
It is being actively maintained and developed. As always, I greatly
appreciate your feedback. Please report all DSSR-related issues on
the 3DNA Forum (forum.x3dna.org). I strive to respond promptly to any
questions posted there. DSSR is free of charge for NON-COMMERCIAL
purposes, and it comes with ABSOLUTELY NO WARRANTY.

****************************************************************************
Note: By default, each nucleotide is identified by chainId.name#. So a
      common case would be B.A1689, meaning adenosine #1689 on chain B.
      One-letter base names for modified nucleotides are put in lower
      case (e.g., 'c' for 5MC). For further information about the output
      notation, please refer to the DSSR User Manual.
    Questions and suggestions are *always* welcome on the 3DNA Forum.

Command: x3dna-dssr -i=1msy.pdb --u-turn --non-pair -o=1msy.out
File name: 1msy.pdb
    no. of DNA/RNA chains: 1 [A=27]
    no. of nucleotides:    27
    no. of atoms:          685
    no. of waters:         109
    no. of metals:         0

****************************************************************************
List of 13 base pairs
     nt1            nt2            bp  name        Saenger   LW   DSSR
   1 A.U2647        A.G2673        U-G --          n/a       cWW  cW-W
   2 A.G2648        A.U2672        G-U Wobble      28-XXVIII cWW  cW-W
   3 A.C2649        A.G2671        C-G WC          19-XIX    cWW  cW-W
   4 A.U2650        A.A2670        U-A WC          20-XX     cWW  cW-W
   5 A.C2651        A.G2669        C-G WC          19-XIX    cWW  cW-W
   6 A.C2652        A.G2668        C-G WC          19-XIX    cWW  cW-W
   7 A.U2653        A.C2667        U-C --          n/a       tW.  tW-.
   8 A.A2654        A.C2666        A+C --          n/a       tHH  tM+M
   9 A.G2655        A.U2656        G+U Platform    n/a       cSH  cm+M
  10 A.U2656        A.A2665        U-A rHoogsteen  24-XXIV   tWH  tW-M
  11 A.A2657        A.G2664        A-G Sheared     11-XI     tHS  tM-m
  12 A.C2658        A.G2663        C-G WC          19-XIX    cWW  cW-W
  13 A.G2659        A.A2662        G-A Sheared     11-XI     tSH  tm-M

****************************************************************************
List of 1 multiplet
   1 nts=3 GUA A.G2655,A.U2656,A.A2665

****************************************************************************
List of 1 helix
  Note: a helix is defined by base-stacking interactions, regardless of bp
        type and backbone connectivity, and may contain more than one stem.
      helix#number[stems-contained] bps=number-of-base-pairs in the helix
      bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
      helix-form: classification of a dinucleotide step comprising the bp
        above the given designation and the bp that follows it. Types
        include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
        '.' for an unclassified step, and 'x' for a step without a
        continuous backbone.
      --------------------------------------------------------------------
  helix#1[1] bps=12
      strand-1 5'-UGCUCCUAUACG-3'
       bp-type    .|||||....|.
      strand-2 3'-GUGAGGCCAGGA-5'
      helix-form  ..AAA..x...
   1 A.U2647        A.G2673        U-G --           n/a       cWW  cW-W
   2 A.G2648        A.U2672        G-U Wobble       28-XXVIII cWW  cW-W
   3 A.C2649        A.G2671        C-G WC           19-XIX    cWW  cW-W
   4 A.U2650        A.A2670        U-A WC           20-XX     cWW  cW-W
   5 A.C2651        A.G2669        C-G WC           19-XIX    cWW  cW-W
   6 A.C2652        A.G2668        C-G WC           19-XIX    cWW  cW-W
   7 A.U2653        A.C2667        U-C --           n/a       tW.  tW-.
   8 A.A2654        A.C2666        A+C --           n/a       tHH  tM+M
   9 A.U2656        A.A2665        U-A rHoogsteen   24-XXIV   tWH  tW-M
  10 A.A2657        A.G2664        A-G Sheared      11-XI     tHS  tM-m
  11 A.C2658        A.G2663        C-G WC           19-XIX    cWW  cW-W
  12 A.G2659        A.A2662        G-A Sheared      11-XI     tSH  tm-M

****************************************************************************
List of 1 stem
  Note: a stem is defined as a helix consisting of only canonical WC/wobble
        pairs, with a continuous backbone.
      stem#number[#helix-number containing this stem]
      Other terms are defined as in the above Helix section.
      --------------------------------------------------------------------
  stem#1[#1] bps=5
      strand-1 5'-GCUCC-3'
       bp-type    |||||
      strand-2 3'-UGAGG-5'
      helix-form  .AAA
   1 A.G2648        A.U2672        G-U Wobble       28-XXVIII cWW  cW-W
   2 A.C2649        A.G2671        C-G WC           19-XIX    cWW  cW-W
   3 A.U2650        A.A2670        U-A WC           20-XX     cWW  cW-W
   4 A.C2651        A.G2669        C-G WC           19-XIX    cWW  cW-W
   5 A.C2652        A.G2668        C-G WC           19-XIX    cWW  cW-W

****************************************************************************
List of 1 isolated WC/wobble pair
  Note: isolated WC/wobble pairs are assigned negative indices to
        differentiate them from the stem numbers, which are positive.
        --------------------------------------------------------------------
[#1]     -1 A.C2658        A.G2663        C-G WC           19-XIX    cWW  cW-W

****************************************************************************
List of 30 non-pairing interactions
   1 A.U2647   A.G2648   stacking: 1.0(0.5)--pm(>>,forward) interBase-angle=6 connected min_baseDist=3.26
   2 A.G2648   A.C2649   stacking: 7.3(4.6)--pm(>>,forward) interBase-angle=5 connected min_baseDist=3.30
   3 A.G2648   A.G2673   stacking: 2.0(0.2)--mm(<>,outward) interBase-angle=2 min_baseDist=3.28
   4 A.C2649   A.U2650   stacking: 2.8(1.1)--pm(>>,forward) interBase-angle=9 connected min_baseDist=3.09
   5 A.U2650   A.C2651   stacking: 0.6(0.0)--pm(>>,forward) interBase-angle=7 connected min_baseDist=3.30
   6 A.C2651   A.C2652   stacking: 0.5(0.1)--pm(>>,forward) interBase-angle=12 connected min_baseDist=3.30
   7 A.C2652   A.U2653   stacking: 5.2(2.6)--pm(>>,forward) interBase-angle=13 connected min_baseDist=3.43
   8 A.C2652   A.G2669   stacking: 0.2(0.0)--mm(<>,outward) interBase-angle=7 min_baseDist=3.22
   9 A.U2653   A.A2654   stacking: 3.3(2.0)--pp(><,inward) interBase-angle=13 H-bonds[1]: "OP2-O2'(hydroxyl)[2.62]" connected min_baseDist=3.23
  10 A.A2654   A.U2656   stacking: 3.7(1.1)--mm(<>,outward) interBase-angle=1 H-bonds[1]: "O4'*O4'[3.05]" min_baseDist=3.45
  11 A.G2655   A.G2664   stacking: 4.4(2.2)--pp(><,inward) interBase-angle=10 H-bonds[2]: "O2'(hydroxyl)-O6(carbonyl)[3.09],O2'(hydroxyl)-N1(imino)[3.34]" min_baseDist=3.37
  12 A.G2655   A.A2665   interBase-angle=21 H-bonds[2]: "N1(imino)-OP2[2.77],N2(amino)-O5'[2.89]" min_baseDist=4.79
  13 A.U2656   A.G2664   interBase-angle=7 H-bonds[2]: "OP2-N1(imino)[3.04],OP2-N2(amino)[2.94]" min_baseDist=3.36
  14 A.A2657   A.C2658   stacking: 6.7(2.6)--pm(>>,forward) interBase-angle=4 connected min_baseDist=3.46
  15 A.A2657   A.A2665   stacking: 3.7(3.3)--mm(<>,outward) interBase-angle=11 min_baseDist=3.29
  16 A.C2658   A.G2659   stacking: 0.4(0.1)--pm(>>,forward) interBase-angle=10 connected min_baseDist=3.34
  17 A.G2659   A.A2661   interBase-angle=31 H-bonds[2]: "O2'(hydroxyl)-N7[2.60],O2'(hydroxyl)-N6(amino)[3.26]" min_baseDist=3.97
  18 A.G2659   A.G2663   stacking: 3.9(1.2)--mm(<>,outward) interBase-angle=4 min_baseDist=3.35
  19 A.U2660   A.A2661   stacking: 7.5(4.2)--pm(>>,forward) interBase-angle=17 connected min_baseDist=3.26
  20 A.A2661   A.A2662   stacking: 6.3(4.4)--pm(>>,forward) interBase-angle=19 connected min_baseDist=3.38
  21 A.G2663   A.G2664   stacking: 2.7(0.6)--pm(>>,forward) interBase-angle=8 connected min_baseDist=3.38
  22 A.G2664   A.A2665   interBase-angle=14 H-bonds[1]: "O2'(hydroxyl)-O4'[2.75]" connected min_baseDist=5.83
  23 A.A2665   A.C2666   stacking: 1.6(1.1)--pm(>>,forward) interBase-angle=10 connected min_baseDist=3.18
  24 A.C2666   A.C2667   stacking: 4.3(2.1)--pm(>>,forward) interBase-angle=8 connected min_baseDist=3.35
  25 A.C2667   A.G2668   stacking: 3.1(1.0)--pm(>>,forward) interBase-angle=7 connected min_baseDist=3.38
  26 A.G2668   A.G2669   stacking: 4.3(3.0)--pm(>>,forward) interBase-angle=4 connected min_baseDist=3.28
  27 A.G2669   A.A2670   stacking: 4.3(2.9)--pm(>>,forward) interBase-angle=4 connected min_baseDist=3.29
  28 A.A2670   A.G2671   stacking: 1.5(1.5)--pm(>>,forward) interBase-angle=6 connected min_baseDist=3.24
  29 A.G2671   A.U2672   stacking: 7.4(4.0)--pm(>>,forward) interBase-angle=10 connected min_baseDist=3.22
  30 A.U2672   A.G2673   interBase-angle=11 H-bonds[1]: "O2'(hydroxyl)-O4'[3.37]" connected min_baseDist=3.61

****************************************************************************
List of 5 stacks
  Note: a stack is an ordered list of nucleotides assembled together via
        base-stacking interactions, regardless of backbone connectivity.
        Stacking interactions within a stem are *not* included.
   1 nts=2 GG A.G2648,A.G2673
   2 nts=3 UAA A.U2660,A.A2661,A.A2662
   3 nts=4 CUAU A.C2652,A.U2653,A.A2654,A.U2656
   4 nts=4 GGGG A.G2655,A.G2664,A.G2663,A.G2659
   5 nts=6 CAACCG A.C2658,A.A2657,A.A2665,A.C2666,A.C2667,A.G2668

****************************************************************************
List of 2 atom-base capping interactions
    dv: vertical distance of the atom above the nucleotide base
    -----------------------------------------------------------
     type       atom                 nt             dv
   1 phosphate  OP2@A.A2661          A.G2659        3.04
   2 sugar      O4'@A.G2664          A.G2663        3.48

****************************************************************************
Note: for the various types of loops listed below, numbers within the first
      set of brackets are the number of loop nts, and numbers in the second
      set of brackets are the identities of the stems (positive number) or
      isolated WC/wobble pairs (negative numbers) to which they are linked.

****************************************************************************
List of 1 hairpin loop
   1 hairpin loop: nts=6; [4]; linked by [#-1]
     summary: [1] 4 [A.2658 A.2663] 1
     nts=6 CGUAAG A.C2658,A.G2659,A.U2660,A.A2661,A.A2662,A.G2663
       nts=4 GUAA A.G2659,A.U2660,A.A2661,A.A2662

****************************************************************************
List of 1 internal loop
   1 asymmetric internal loop: nts=13; [5,4]; linked by [#1,#-1]
     summary: [2] 5 4 [A.2652 A.2668 A.2658 A.2663] 5 1
     nts=13 CUAGUACGGACCG A.C2652,A.U2653,A.A2654,A.G2655,A.U2656,A.A2657,A.C2658,A.G2663,A.G2664,A.A2665,A.C2666,A.C2667,A.G2668
       nts=5 UAGUA A.U2653,A.A2654,A.G2655,A.U2656,A.A2657
       nts=4 GACC A.G2664,A.A2665,A.C2666,A.C2667

****************************************************************************
List of 2 non-loop single-stranded segments
   1 nts=1 U A.U2647
   2 nts=1 G A.G2673

****************************************************************************
List of 1 U-turn
   1  A.G2659-A.A2662 H-bonds[2]: "N2(amino)-OP2[2.97],N2(amino)-N7[2.86]" nts=6 CGUAAG A.C2658,A.G2659,A.U2660,A.A2661,A.A2662,A.G2663

****************************************************************************
List of 1 splayed-apart dinucleotide
   1 A.G2659   A.U2660   angle=95     distance=13.2     ratio=0.74
----------------------------------------------------------------
Summary of 1 splayed-apart unit
   1 nts=2 GU A.G2659,A.U2660

****************************************************************************
Secondary structures in dot-bracket notation (dbn) as a whole and per chain
>1msy nts=27 [whole]
UGCUCCUAGUACGUAAGGACCGGAGUG
.(((((.....(....)....))))).
>1msy-A #1 nts=27 0.30(2.47) [chain] RNA
UGCUCCUAGUACGUAAGGACCGGAGUG
.(((((.....(....)....))))).

****************************************************************************
Summary of structural features of 27 nucleotides
  Note: the first five columns are: (1) serial number, (2) one-letter
    shorthand name, (3) dbn, (4) id string, (5) rmsd (~zero) of base
    ring atoms fitted against those in a standard base reference
    frame. The sixth (last) column contains a comma-separated list of
    features whose meanings are mostly self-explanatory, except for:
      turn: angle C1'(i-1)--C1'(i)--C1'(i+1) < 90 degrees
      break: no backbone linkage between O3'(i-1) and P(i)
   1  U . A.U2647   0.011  anti,~C3'-endo,non-canonical,non-pair-contact,helix-end,ss-non-loop
   2  G ( A.G2648   0.012  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end
   3  C ( A.C2649   0.019  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem
   4  U ( A.U2650   0.019  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem
   5  C ( A.C2651   0.024  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem
   6  C ( A.C2652   0.032  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,internal-loop
   7  U . A.U2653   0.019  anti,~C3'-endo,non-canonical,non-pair-contact,helix,internal-loop,phosphate
   8  A . A.A2654   0.019  anti,~C2'-endo,BII,non-canonical,non-pair-contact,helix,internal-loop
   9  G . A.G2655   0.022  turn,anti,~C2'-endo,non-canonical,non-pair-contact,multiplet,internal-loop
  10  U . A.U2656   0.020  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,multiplet,internal-loop,phosphate
  11  A . A.A2657   0.023  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,internal-loop
  12  C ( A.C2658   0.013  anti,~C3'-endo,BI,isolated-canonical,non-pair-contact,helix,hairpin-loop,internal-loop
  13  G . A.G2659   0.033  u-turn,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix-end,hairpin-loop,cap-acceptor,splayed-apart
  14  U . A.U2660   0.020  turn,u-turn,anti,~C3'-endo,non-pair-contact,hairpin-loop,splayed-apart
  15  A . A.A2661   0.015  u-turn,anti,~C3'-endo,BI,non-pair-contact,hairpin-loop,cap-donor,phosphate
  16  A . A.A2662   0.010  u-turn,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix-end,hairpin-loop,phosphate
  17  G ) A.G2663   0.019  anti,~C3'-endo,BI,isolated-canonical,non-pair-contact,helix,hairpin-loop,internal-loop,cap-acceptor
  18  G . A.G2664   0.014  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,internal-loop,cap-donor
  19  A . A.A2665   0.014  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,multiplet,internal-loop,phosphate
  20  C . A.C2666   0.016  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,internal-loop,phosphate
  21  C . A.C2667   0.029  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,internal-loop
  22  G ) A.G2668   0.012  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,internal-loop
  23  G ) A.G2669   0.020  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem
  24  A ) A.A2670   0.019  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem
  25  G ) A.G2671   0.023  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem
  26  U ) A.U2672   0.024  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end
  27  G . A.G2673   0.010  anti,~C3'-endo,non-canonical,non-pair-contact,helix-end,ss-non-loop

****************************************************************************
List of 14 additional files
   1 dssr-pairs.pdb -- an ensemble of base pairs
   2 dssr-multiplets.pdb -- an ensemble of multiplets
   3 dssr-stems.pdb -- an ensemble of stems
   4 dssr-helices.pdb -- an ensemble of helices (coaxial stacking)
   5 dssr-hairpins.pdb -- an ensemble of hairpin loops
   6 dssr-iloops.pdb -- an ensemble of internal loops
   7 dssr-2ndstrs.bpseq -- secondary structure in bpseq format
   8 dssr-2ndstrs.ct -- secondary structure in connectivity table format
   9 dssr-2ndstrs.dbn -- secondary structure in dot-bracket notation
  10 dssr-torsions.txt -- backbone torsion angles and suite names
  11 dssr-splays.pdb -- an ensemble of splayed-apart units
  12 dssr-Uturns.pdb -- an ensemble of U-turn motifs
  13 dssr-stacks.pdb -- an ensemble of stacks
  14 dssr-atom2bases.pdb -- an ensemble of atom-base stacking interactions

Example #2: The crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution (1ehz)

Code: [Select]

Run: x3dna-dssr -i=1ehz.pdb --time-stamp=off -o=1ehz.out --po4 --u-turn
****************************************************************************
              DSSR: an Integrated Software Tool for
             Dissecting the Spatial Structure of RNA
             v1.9.10-2020apr23, by xiangjun@x3dna.org

DSSR has been made possible by the NIH grant R01GM096889 (to X.J.Lu).
It is being actively maintained and developed. As always, I greatly
appreciate your feedback. Please report all DSSR-related issues on
the 3DNA Forum (forum.x3dna.org). I strive to respond promptly to any
questions posted there. DSSR is free of charge for NON-COMMERCIAL
purposes, and it comes with ABSOLUTELY NO WARRANTY.

****************************************************************************
Note: By default, each nucleotide is identified by chainId.name#. So a
      common case would be B.A1689, meaning adenosine #1689 on chain B.
      One-letter base names for modified nucleotides are put in lower
      case (e.g., 'c' for 5MC). For further information about the output
      notation, please refer to the DSSR User Manual.
    Questions and suggestions are *always* welcome on the 3DNA Forum.

Command: x3dna-dssr -i=1ehz.pdb --u-turn --po4 -o=1ehz.out
File name: 1ehz.pdb
    no. of DNA/RNA chains: 1 [A=76]
    no. of nucleotides:    76
    no. of atoms:          1821
    no. of waters:         160
    no. of metals:         9 [Mg=6,Mn=3]

****************************************************************************
List of 11 types of 14 modified nucleotides
      nt    count  list
   1 1MA-a    1    A.1MA58
   2 2MG-g    1    A.2MG10
   3 5MC-c    2    A.5MC40,A.5MC49
   4 5MU-t    1    A.5MU54
   5 7MG-g    1    A.7MG46
   6 H2U-u    2    A.H2U16,A.H2U17
   7 M2G-g    1    A.M2G26
   8 OMC-c    1    A.OMC32
   9 OMG-g    1    A.OMG34
  10 PSU-P    2    A.PSU39,A.PSU55
  11 YYG-g    1    A.YYG37

****************************************************************************
List of 34 base pairs
     nt1            nt2            bp  name        Saenger   LW   DSSR
   1 A.G1           A.C72          G-C WC          19-XIX    cWW  cW-W
   2 A.C2           A.G71          C-G WC          19-XIX    cWW  cW-W
   3 A.G3           A.C70          G-C WC          19-XIX    cWW  cW-W
   4 A.G4           A.U69          G-U Wobble      28-XXVIII cWW  cW-W
   5 A.A5           A.U68          A-U WC          20-XX     cWW  cW-W
   6 A.U6           A.A67          U-A WC          20-XX     cWW  cW-W
   7 A.U7           A.A66          U-A WC          20-XX     cWW  cW-W
   8 A.U8           A.A14          U-A rHoogsteen  24-XXIV   tWH  tW-M
   9 A.U8           A.A21          U+A --          n/a       tSW  tm+W
  10 A.A9           A.A23          A+A --          02-II     tHH  tM+M
  11 A.2MG10        A.C25          g-C WC          19-XIX    cWW  cW-W
  12 A.2MG10        A.G45          g+G --          n/a       cHS  cM+m
  13 A.C11          A.G24          C-G WC          19-XIX    cWW  cW-W
  14 A.U12          A.A23          U-A WC          20-XX     cWW  cW-W
  15 A.C13          A.G22          C-G WC          19-XIX    cWW  cW-W
  16 A.G15          A.C48          G+C rWC         22-XXII   tWW  tW+W
  17 A.H2U16        A.U59          u+U --          n/a       tSW  tm+W
  18 A.G18          A.PSU55        G+P --          n/a       tWS  tW+m
  19 A.G19          A.C56          G-C WC          19-XIX    cWW  cW-W
  20 A.G22          A.7MG46        G-g --          07-VII    tHW  tM-W
  21 A.M2G26        A.A44          g-A Imino       08-VIII   cWW  cW-W
  22 A.C27          A.G43          C-G WC          19-XIX    cWW  cW-W
  23 A.C28          A.G42          C-G WC          19-XIX    cWW  cW-W
  24 A.A29          A.U41          A-U WC          20-XX     cWW  cW-W
  25 A.G30          A.5MC40        G-c WC          19-XIX    cWW  cW-W
  26 A.A31          A.PSU39        A-P --          n/a       cWW  cW-W
  27 A.OMC32        A.A38          c-A --          n/a       c.W  c.-W
  28 A.U33          A.A36          U-A --          n/a       tSH  tm-M
  29 A.5MC49        A.G65          c-G WC          19-XIX    cWW  cW-W
  30 A.U50          A.A64          U-A WC          20-XX     cWW  cW-W
  31 A.G51          A.C63          G-C WC          19-XIX    cWW  cW-W
  32 A.U52          A.A62          U-A WC          20-XX     cWW  cW-W
  33 A.G53          A.C61          G-C WC          19-XIX    cWW  cW-W
  34 A.5MU54        A.1MA58        t-a rHoogsteen  24-XXIV   tWH  tW-M

****************************************************************************
List of 4 multiplets
   1 nts=3 UAA A.U8,A.A14,A.A21
   2 nts=3 AUA A.A9,A.U12,A.A23
   3 nts=3 gCG A.2MG10,A.C25,A.G45
   4 nts=3 CGg A.C13,A.G22,A.7MG46

****************************************************************************
List of 2 helices
  Note: a helix is defined by base-stacking interactions, regardless of bp
        type and backbone connectivity, and may contain more than one stem.
      helix#number[stems-contained] bps=number-of-base-pairs in the helix
      bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
      helix-form: classification of a dinucleotide step comprising the bp
        above the given designation and the bp that follows it. Types
        include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
        '.' for an unclassified step, and 'x' for a step without a
        continuous backbone.
      --------------------------------------------------------------------
  helix#1[2] bps=15
      strand-1 5'-GCGGAUUcUGUGtPC-3'
       bp-type    ||||||||||||..|
      strand-2 3'-CGCUUAAGACACaGG-5'
      helix-form  AA....xAAAAxx.
   1 A.G1           A.C72          G-C WC           19-XIX    cWW  cW-W
   2 A.C2           A.G71          C-G WC           19-XIX    cWW  cW-W
   3 A.G3           A.C70          G-C WC           19-XIX    cWW  cW-W
   4 A.G4           A.U69          G-U Wobble       28-XXVIII cWW  cW-W
   5 A.A5           A.U68          A-U WC           20-XX     cWW  cW-W
   6 A.U6           A.A67          U-A WC           20-XX     cWW  cW-W
   7 A.U7           A.A66          U-A WC           20-XX     cWW  cW-W
   8 A.5MC49        A.G65          c-G WC           19-XIX    cWW  cW-W
   9 A.U50          A.A64          U-A WC           20-XX     cWW  cW-W
  10 A.G51          A.C63          G-C WC           19-XIX    cWW  cW-W
  11 A.U52          A.A62          U-A WC           20-XX     cWW  cW-W
  12 A.G53          A.C61          G-C WC           19-XIX    cWW  cW-W
  13 A.5MU54        A.1MA58        t-a rHoogsteen   24-XXIV   tWH  tW-M
  14 A.PSU55        A.G18          P+G --           n/a       tSW  tm+W
  15 A.C56          A.G19          C-G WC           19-XIX    cWW  cW-W
  --------------------------------------------------------------------------
  helix#2[2] bps=15
      strand-1 5'-AAPcUGGAgCUCAGu-3'
       bp-type    ...||||.||||...
      strand-2 3'-UcAGACCgCGAGUCU-5'
      helix-form  x..AAAAxAA.xxx
   1 A.A36          A.U33          A-U --           n/a       tHS  tM-m
   2 A.A38          A.OMC32        A-c --           n/a       cW.  cW-.
   3 A.PSU39        A.A31          P-A --           n/a       cWW  cW-W
   4 A.5MC40        A.G30          c-G WC           19-XIX    cWW  cW-W
   5 A.U41          A.A29          U-A WC           20-XX     cWW  cW-W
   6 A.G42          A.C28          G-C WC           19-XIX    cWW  cW-W
   7 A.G43          A.C27          G-C WC           19-XIX    cWW  cW-W
   8 A.A44          A.M2G26        A-g Imino        08-VIII   cWW  cW-W
   9 A.2MG10        A.C25          g-C WC           19-XIX    cWW  cW-W
  10 A.C11          A.G24          C-G WC           19-XIX    cWW  cW-W
  11 A.U12          A.A23          U-A WC           20-XX     cWW  cW-W
  12 A.C13          A.G22          C-G WC           19-XIX    cWW  cW-W
  13 A.A14          A.U8           A-U rHoogsteen   24-XXIV   tHW  tM-W
  14 A.G15          A.C48          G+C rWC          22-XXII   tWW  tW+W
  15 A.H2U16        A.U59          u+U --           n/a       tSW  tm+W

****************************************************************************
List of 4 stems
  Note: a stem is defined as a helix consisting of only canonical WC/wobble
        pairs, with a continuous backbone.
      stem#number[#helix-number containing this stem]
      Other terms are defined as in the above Helix section.
      --------------------------------------------------------------------
  stem#1[#1] bps=7
      strand-1 5'-GCGGAUU-3'
       bp-type    |||||||
      strand-2 3'-CGCUUAA-5'
      helix-form  AA....
   1 A.G1           A.C72          G-C WC           19-XIX    cWW  cW-W
   2 A.C2           A.G71          C-G WC           19-XIX    cWW  cW-W
   3 A.G3           A.C70          G-C WC           19-XIX    cWW  cW-W
   4 A.G4           A.U69          G-U Wobble       28-XXVIII cWW  cW-W
   5 A.A5           A.U68          A-U WC           20-XX     cWW  cW-W
   6 A.U6           A.A67          U-A WC           20-XX     cWW  cW-W
   7 A.U7           A.A66          U-A WC           20-XX     cWW  cW-W
  --------------------------------------------------------------------------
  stem#2[#2] bps=4
      strand-1 5'-gCUC-3'
       bp-type    ||||
      strand-2 3'-CGAG-5'
      helix-form  AA.
   1 A.2MG10        A.C25          g-C WC           19-XIX    cWW  cW-W
   2 A.C11          A.G24          C-G WC           19-XIX    cWW  cW-W
   3 A.U12          A.A23          U-A WC           20-XX     cWW  cW-W
   4 A.C13          A.G22          C-G WC           19-XIX    cWW  cW-W
  --------------------------------------------------------------------------
  stem#3[#2] bps=4
      strand-1 5'-CCAG-3'
       bp-type    ||||
      strand-2 3'-GGUc-5'
      helix-form  AAA
   1 A.C27          A.G43          C-G WC           19-XIX    cWW  cW-W
   2 A.C28          A.G42          C-G WC           19-XIX    cWW  cW-W
   3 A.A29          A.U41          A-U WC           20-XX     cWW  cW-W
   4 A.G30          A.5MC40        G-c WC           19-XIX    cWW  cW-W
  --------------------------------------------------------------------------
  stem#4[#1] bps=5
      strand-1 5'-cUGUG-3'
       bp-type    |||||
      strand-2 3'-GACAC-5'
      helix-form  AAAA
   1 A.5MC49        A.G65          c-G WC           19-XIX    cWW  cW-W
   2 A.U50          A.A64          U-A WC           20-XX     cWW  cW-W
   3 A.G51          A.C63          G-C WC           19-XIX    cWW  cW-W
   4 A.U52          A.A62          U-A WC           20-XX     cWW  cW-W
   5 A.G53          A.C61          G-C WC           19-XIX    cWW  cW-W

****************************************************************************
List of 1 isolated WC/wobble pair
  Note: isolated WC/wobble pairs are assigned negative indices to
        differentiate them from the stem numbers, which are positive.
        --------------------------------------------------------------------
[#1]     -1 A.G19          A.C56          G-C WC           19-XIX    cWW  cW-W

****************************************************************************
List of 2 coaxial stacks
   1 Helix#1 contains 2 stems: [#1,#4]
   2 Helix#2 contains 2 stems: [#3,#2]

****************************************************************************
List of 11 stacks
  Note: a stack is an ordered list of nucleotides assembled together via
        base-stacking interactions, regardless of backbone connectivity.
        Stacking interactions within a stem are *not* included.
   1 nts=2 Uc A.U7,A.5MC49
   2 nts=2 UC A.U8,A.C13
   3 nts=2 GA A.G65,A.A66
   4 nts=3 CgC A.C25,A.M2G26,A.C27
   5 nts=3 gAC A.7MG46,A.A21,A.C48
   6 nts=3 GtP A.G53,A.5MU54,A.PSU55
   7 nts=4 GACC A.G1,A.A73,A.C74,A.C75
   8 nts=4 GAcU A.G30,A.A31,A.OMC32,A.U33
   9 nts=5 GGGaC A.G19,A.G57,A.G18,A.1MA58,A.C61
  10 nts=7 gAAgAPc A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39,A.5MC40
  11 nts=9 GAGAGAGUC A.G43,A.A44,A.G45,A.A9,A.G22,A.A14,A.G15,A.U59,A.C60

****************************************************************************
Nucleotides not involved in stacking interactions
     nts=4 uGUA A.H2U17,A.G20,A.U47,A.A76

****************************************************************************
List of 4 atom-base capping interactions
    dv: vertical distance of the atom above the nucleotide base
    -----------------------------------------------------------
     type       atom                 nt             dv
   1 phosphate  OP2@A.H2U16          A.H2U16        2.59
   2 phosphate  OP2@A.A35            A.U33          2.85
   3 sugar      O4'@A.U59            A.C48          3.10
   4 phosphate  OP2@A.G57            A.PSU55        2.90

****************************************************************************
Note: for the various types of loops listed below, numbers within the first
      set of brackets are the number of loop nts, and numbers in the second
      set of brackets are the identities of the stems (positive number) or
      isolated WC/wobble pairs (negative numbers) to which they are linked.

****************************************************************************
List of 3 hairpin loops
   1 hairpin loop: nts=10; [8]; linked by [#2]
     summary: [1] 8 [A.13 A.22] 4
     nts=10 CAGuuGGGAG A.C13,A.A14,A.G15,A.H2U16,A.H2U17,A.G18,A.G19,A.G20,A.A21,A.G22
       nts=8 AGuuGGGA A.A14,A.G15,A.H2U16,A.H2U17,A.G18,A.G19,A.G20,A.A21
   2 hairpin loop: nts=11; [9]; linked by [#3]
     summary: [1] 9 [A.30 A.40] 4
     nts=11 GAcUgAAgAPc A.G30,A.A31,A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39,A.5MC40
       nts=9 AcUgAAgAP A.A31,A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39
   3 hairpin loop: nts=9; [7]; linked by [#4]
     summary: [1] 7 [A.53 A.61] 5
     nts=9 GtPCGaUCC A.G53,A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59,A.C60,A.C61
       nts=7 tPCGaUC A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59,A.C60

****************************************************************************
List of 1 junction
   1 4-way junction: nts=16; [2,1,5,0]; linked by [#1,#2,#3,#4]
     summary: [4] 2 1 5 0 [A.7 A.66 A.10 A.25 A.27 A.43 A.49 A.65] 7 4 4 5
     nts=16 UUAgCgCGAGgUCcGA A.U7,A.U8,A.A9,A.2MG10,A.C25,A.M2G26,A.C27,A.G43,A.A44,A.G45,A.7MG46,A.U47,A.C48,A.5MC49,A.G65,A.A66
       nts=2 UA A.U8,A.A9
       nts=1 g A.M2G26
       nts=5 AGgUC A.A44,A.G45,A.7MG46,A.U47,A.C48
       nts=0

****************************************************************************
List of 1 non-loop single-stranded segment
   1 nts=4 ACCA A.A73,A.C74,A.C75,A.A76

****************************************************************************
List of 1 kissing loop interaction
   1 isolated-pair #-1 between hairpin loops #1 and #3

****************************************************************************
List of 2 U-turns
   1  A.U33-A.A36 H-bonds[1]: "N3(imino)-OP2[2.80]" nts=6 cUgAAg A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37
   2  A.PSU55-A.1MA58 H-bonds[1]: "N3(imino)-OP2[2.77]" nts=6 tPCGaU A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59

****************************************************************************
List of 18 phosphate interactions
   1 A.U7      OP1-hbonds[1]: "MG@A.MG580[2.60]"
   2 A.A9      OP2-hbonds[1]: "N4@A.C13[3.01]"
   3 A.A14     OP2-hbonds[1]: "MG@A.MG580[1.93]"
   4 A.H2U16   OP2-cap: "A.H2U16"
   5 A.G18     OP1-hbonds[1]: "O2'@A.H2U17[2.97]"
   6 A.G19     OP1-hbonds[2]: "N4@A.C60[3.27],MN@A.MN530[2.19]"
   7 A.G20     OP1-hbonds[1]: "MG@A.MG540[2.07]"
   8 A.A21     OP2-hbonds[1]: "MG@A.MG540[2.11]"
   9 A.A23     OP2-hbonds[1]: "N6@A.A9[3.12]"
  10 A.A35     OP2-cap: "A.U33"
  11 A.A36     OP2-hbonds[1]: "N3@A.U33[2.80]"
  12 A.YYG37   OP2-hbonds[1]: "MG@A.MG590[2.53]"
  13 A.C48     OP2-hbonds[1]: "O2'@A.7MG46[3.55]"
  14 A.5MC49   OP1-hbonds[1]: "O2'@A.C48[3.13]" OP2-hbonds[1]: "O2'@A.U7[2.68]"
  15 A.U50     OP1-hbonds[1]: "O2'@A.U47[2.71]"
  16 A.G57     OP2-cap: "A.PSU55"
  17 A.1MA58   OP2-hbonds[1]: "N3@A.PSU55[2.77]"
  18 A.C60     OP1-hbonds[1]: "N4@A.C61[3.12]" OP2-hbonds[1]: "O2'@A.1MA58[2.42]"

****************************************************************************
List of 9 splayed-apart dinucleotides
   1 A.U7     A.U8     angle=127    distance=17.2     ratio=0.90
   2 A.H2U16  A.H2U17  angle=146    distance=19.5     ratio=0.96
   3 A.H2U17  A.G18    angle=106    distance=15.8     ratio=0.80
   4 A.G19    A.G20    angle=130    distance=16.8     ratio=0.91
   5 A.7MG46  A.U47    angle=139    distance=19.0     ratio=0.94
   6 A.U47    A.C48    angle=157    distance=19.7     ratio=0.98
   7 A.C48    A.5MC49  angle=148    distance=17.5     ratio=0.96
   8 A.C60    A.C61    angle=91     distance=13.8     ratio=0.71
   9 A.C75    A.A76    angle=160    distance=18.7     ratio=0.98
----------------------------------------------------------------
Summary of 6 splayed-apart units
   1 nts=2 UU A.U7,A.U8
   2 nts=3 uuG A.H2U16,A.H2U17,A.G18
   3 nts=2 GG A.G19,A.G20
   4 nts=4 gUCc A.7MG46,A.U47,A.C48,A.5MC49
   5 nts=2 CC A.C60,A.C61
   6 nts=2 CA A.C75,A.A76

****************************************************************************
This structure contains 1-order pseudoknot
   o You may want to run DSSR again with the '--nested' option which removes
     pseudoknots to get a fully nested secondary structure representation.

****************************************************************************
Secondary structures in dot-bracket notation (dbn) as a whole and per chain
>1ehz nts=76 [whole]
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGtPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....
>1ehz-A #1 nts=76 0.09(2.86) [chain] RNA
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGtPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....

****************************************************************************
Summary of structural features of 76 nucleotides
  Note: the first five columns are: (1) serial number, (2) one-letter
    shorthand name, (3) dbn, (4) id string, (5) rmsd (~zero) of base
    ring atoms fitted against those in a standard base reference
    frame. The sixth (last) column contains a comma-separated list of
    features whose meanings are mostly self-explanatory, except for:
      turn: angle C1'(i-1)--C1'(i)--C1'(i+1) < 90 degrees
      break: no backbone linkage between O3'(i-1) and P(i)
   1  G ( A.G1     0.008  anti,~C3'-endo,BI,canonical,non-pair-contact,helix-end,stem-end,coaxial-stack
   2  C ( A.C2     0.006  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
   3  G ( A.G3     0.021  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
   4  G ( A.G4     0.018  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
   5  A ( A.A5     0.018  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
   6  U ( A.U6     0.011  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
   7  U ( A.U7     0.008  anti,~C2'-endo,canonical,non-pair-contact,helix,stem-end,coaxial-stack,junction-loop,phosphate,splayed-apart
   8  U . A.U8     0.016  anti,~C3'-endo,non-canonical,non-pair-contact,helix,multiplet,junction-loop,splayed-apart
   9  A . A.A9     0.032  anti,~C2'-endo,non-canonical,non-pair-contact,multiplet,junction-loop,phosphate
  10  g ( A.2MG10  0.018  modified,turn,anti,~C3'-endo,canonical,non-canonical,non-pair-contact,helix,stem-end,coaxial-stack,multiplet,junction-loop
  11  C ( A.C11    0.005  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  12  U ( A.U12    0.012  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack,multiplet
  13  C ( A.C13    0.013  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,multiplet,hairpin-loop,kissing-loop
  14  A . A.A14    0.025  anti,~C3'-endo,non-canonical,non-pair-contact,helix,multiplet,hairpin-loop,kissing-loop,phosphate
  15  G . A.G15    0.021  anti,~C3'-endo,non-canonical,non-pair-contact,helix,hairpin-loop,kissing-loop
  16  u . A.H2U16  0.188  modified,anti,~C3'-endo,non-canonical,non-pair-contact,helix-end,hairpin-loop,kissing-loop,cap-donor,cap-acceptor,phosphate,splayed-apart
  17  u . A.H2U17  0.201  modified,turn,anti,~C3'-endo,non-stack,non-pair-contact,hairpin-loop,kissing-loop,splayed-apart
  18  G . A.G18    0.028  anti,~C2'-endo,BII,non-canonical,non-pair-contact,helix,hairpin-loop,kissing-loop,phosphate,splayed-apart
  19  G [ A.G19    0.026  pseudoknotted,anti,~C2'-endo,isolated-canonical,non-pair-contact,helix-end,hairpin-loop,kissing-loop,phosphate,splayed-apart
  20  G . A.G20    0.018  anti,~C3'-endo,non-stack,non-pair-contact,hairpin-loop,kissing-loop,phosphate,splayed-apart
  21  A . A.A21    0.018  anti,~C3'-endo,BI,non-canonical,non-pair-contact,multiplet,hairpin-loop,kissing-loop,phosphate
  22  G ) A.G22    0.017  anti,~C3'-endo,BI,canonical,non-canonical,non-pair-contact,helix,stem-end,coaxial-stack,multiplet,hairpin-loop,kissing-loop
  23  A ) A.A23    0.030  anti,~C3'-endo,BI,canonical,non-canonical,non-pair-contact,helix,stem,coaxial-stack,multiplet,phosphate
  24  G ) A.G24    0.013  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  25  C ) A.C25    0.011  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,multiplet,junction-loop
  26  g . A.M2G26  0.013  modified,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,junction-loop
  27  C ( A.C27    0.011  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,junction-loop
  28  C ( A.C28    0.007  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  29  A ( A.A29    0.019  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  30  G ( A.G30    0.013  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,hairpin-loop
  31  A . A.A31    0.007  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,hairpin-loop
  32  c . A.OMC32  0.008  modified,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,hairpin-loop
  33  U . A.U33    0.014  u-turn,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix-end,hairpin-loop,cap-acceptor
  34  g . A.OMG34  0.021  modified,turn,u-turn,anti,~C3'-endo,BI,non-pair-contact,hairpin-loop
  35  A . A.A35    0.006  u-turn,anti,~C3'-endo,BI,non-pair-contact,hairpin-loop,cap-donor,phosphate
  36  A . A.A36    0.015  u-turn,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix-end,hairpin-loop,phosphate
  37  g . A.YYG37  0.015  modified,anti,~C3'-endo,BI,non-pair-contact,hairpin-loop,phosphate
  38  A . A.A38    0.008  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,hairpin-loop
  39  P . A.PSU39  0.004  modified,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,hairpin-loop
  40  c ) A.5MC40  0.005  modified,anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,hairpin-loop
  41  U ) A.U41    0.013  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  42  G ) A.G42    0.010  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  43  G ) A.G43    0.025  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,junction-loop
  44  A . A.A44    0.015  anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,junction-loop
  45  G . A.G45    0.026  anti,~C3'-endo,BI,non-canonical,non-pair-contact,multiplet,junction-loop
  46  g . A.7MG46  0.022  modified,anti,~C2'-endo,non-canonical,non-pair-contact,multiplet,junction-loop,splayed-apart
  47  U . A.U47    0.007  turn,anti,~C2'-endo,non-stack,non-pair-contact,junction-loop,splayed-apart
  48  C . A.C48    0.015  turn,anti,~C2'-endo,non-canonical,non-pair-contact,helix,junction-loop,cap-acceptor,phosphate,splayed-apart
  49  c ( A.5MC49  0.020  modified,anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,junction-loop,phosphate,splayed-apart
  50  U ( A.U50    0.024  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack,phosphate
  51  G ( A.G51    0.030  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  52  U ( A.U52    0.020  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  53  G ( A.G53    0.027  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,hairpin-loop,kissing-loop
  54  t . A.5MU54  0.010  modified,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,hairpin-loop,kissing-loop
  55  P . A.PSU55  0.024  modified,u-turn,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix,hairpin-loop,kissing-loop,cap-acceptor
  56  C ] A.C56    0.014  pseudoknotted,turn,u-turn,anti,~C3'-endo,BI,isolated-canonical,non-pair-contact,helix-end,hairpin-loop,kissing-loop
  57  G . A.G57    0.017  u-turn,anti,~C3'-endo,BI,non-pair-contact,hairpin-loop,kissing-loop,cap-donor,phosphate
  58  a . A.1MA58  0.012  modified,u-turn,anti,~C2'-endo,BII,non-canonical,non-pair-contact,helix,hairpin-loop,kissing-loop,phosphate
  59  U . A.U59    0.029  turn,anti,~C3'-endo,BI,non-canonical,non-pair-contact,helix-end,hairpin-loop,kissing-loop,cap-donor
  60  C . A.C60    0.016  anti,~C2'-endo,non-pair-contact,hairpin-loop,kissing-loop,phosphate,splayed-apart
  61  C ) A.C61    0.012  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,hairpin-loop,kissing-loop,splayed-apart
  62  A ) A.A62    0.016  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  63  C ) A.C63    0.016  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  64  A ) A.A64    0.014  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  65  G ) A.G65    0.017  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,junction-loop
  66  A ) A.A66    0.016  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem-end,coaxial-stack,junction-loop
  67  A ) A.A67    0.020  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  68  U ) A.U68    0.011  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  69  U ) A.U69    0.015  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  70  C ) A.C70    0.018  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  71  G ) A.G71    0.013  anti,~C3'-endo,BI,canonical,non-pair-contact,helix,stem,coaxial-stack
  72  C ) A.C72    0.006  anti,~C3'-endo,BI,canonical,non-pair-contact,helix-end,stem-end,coaxial-stack
  73  A . A.A73    0.010  anti,~C3'-endo,BI,non-pair-contact,ss-non-loop
  74  C . A.C74    0.007  anti,~C3'-endo,BI,non-pair-contact,ss-non-loop
  75  C . A.C75    0.007  anti,~C3'-endo,BII,non-pair-contact,ss-non-loop,splayed-apart
  76  A . A.A76    0.010  anti,~C2'-endo,non-stack,non-pair-contact,ss-non-loop,splayed-apart

****************************************************************************
List of 14 additional files
   1 dssr-pairs.pdb -- an ensemble of base pairs
   2 dssr-multiplets.pdb -- an ensemble of multiplets
   3 dssr-stems.pdb -- an ensemble of stems
   4 dssr-helices.pdb -- an ensemble of helices (coaxial stacking)
   5 dssr-hairpins.pdb -- an ensemble of hairpin loops
   6 dssr-junctions.pdb -- an ensemble of junctions (multi-branch)
   7 dssr-2ndstrs.bpseq -- secondary structure in bpseq format
   8 dssr-2ndstrs.ct -- secondary structure in connectivity table format
   9 dssr-2ndstrs.dbn -- secondary structure in dot-bracket notation
  10 dssr-torsions.txt -- backbone torsion angles and suite names
  11 dssr-splays.pdb -- an ensemble of splayed-apart units
  12 dssr-Uturns.pdb -- an ensemble of U-turn motifs
  13 dssr-stacks.pdb -- an ensemble of stacks
  14 dssr-atom2bases.pdb -- an ensemble of atom-base stacking interactions

Note: The 3-dimensional schematic images shown above (with rectangular base blocks) were created with the 3DNA blocview program, which generates .r3d-formatted files that were then ray-traced using PyMOL. The 2-dimensional diagrams were produced with VARNA (Visualization Applet for RNA), using the DSSR-derived base sequence and dot-bracket notation of secondary structure:

Code: [Select]

>1msy-A #1 RNA with 27 nts
UGCUCCUAGUACGUAAGGACCGGAGUG
.(((((.....(....)....))))).

>1ehz-A #1 RNA with 76 nts
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGuPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....
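For readers who want to sanity-check the dot-bracket strings above before passing them to a viewer, here is a minimal Python sketch (not part of DSSR or VARNA) that converts a dbn string into 1-based base-pair indices. The function name `parse_dbn` is hypothetical. It assumes that '.' marks an unpaired nucleotide and that each bracket type nests independently, which is how the '[' and ']' of the pseudoknotted G19-C56 pair in 1ehz are handled.

```python
def parse_dbn(dbn):
    """Return a sorted list of 1-based (i, j) pairs from a dot-bracket string."""
    closers = {")": "(", "]": "[", "}": "{", ">": "<"}
    # One stack per bracket type, so pseudoknot brackets nest independently.
    stacks = {opener: [] for opener in closers.values()}
    pairs = []
    for pos, symbol in enumerate(dbn, start=1):
        if symbol in stacks:
            stacks[symbol].append(pos)
        elif symbol in closers:
            opener = closers[symbol]
            if not stacks[opener]:
                raise ValueError(f"unmatched {symbol!r} at position {pos}")
            pairs.append((stacks[opener].pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    leftover = [pos for stack in stacks.values() for pos in stack]
    if leftover:
        raise ValueError(f"unclosed brackets at positions {sorted(leftover)}")
    return sorted(pairs)


# The two dbn strings listed above (1msy chain A and 1ehz chain A).
dbn_1msy = ".(((((.....(....)....)))))."
dbn_1ehz = (
    "(((((((..((((.....[..)))).((((.........))))"
    ".....(((((..]....))))))))))))...."
)

pairs_1msy = parse_dbn(dbn_1msy)
pairs_1ehz = parse_dbn(dbn_1ehz)
print(len(pairs_1msy), len(pairs_1ehz))
```

The 1ehz string decodes to the 20 pairs of the four stems plus the isolated, pseudoknotted pair between positions 19 and 56.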

Site announcements / The Biophysical Society (BPS) 59th annual meeting at Baltimore

« on: February 04, 2015, 11:31:37 pm »

I’m going to attend the Biophysical Society (BPS) 59th Annual Meeting, to be held February 7-11 in Baltimore, Maryland. At last year’s BPS annual meeting (San Francisco, California), I was delighted to come across a few 3DNA users at the poster sessions. I hope this post will help me connect with some DSSR/3DNA users at the coming meeting.

Want to meet up in Baltimore? Please drop me a message!

Xiang-Jun

RNA structures (DSSR) / Proposed changes in DSSR v1.2

« on: December 03, 2014, 02:20:58 pm »

The coming release of DSSR v1.2.0 includes proposed format changes in existing sections and a newly added section, as detailed below. I'd like to keep the community informed and hopefully hear your comments before the 'official' release. To have a look at the new features, try the web-interface to DSSR.

Non-pairing interactions (option --non-pair)

Currently, the format is like this, using 1msy as an example:

Code: [Select]

List of 30 non-pairing interactions
      nt1              nt2              base-stacking    H-bonding
   1 A.U2647          A.G2648             1.0(0.5)
   2 A.G2648          A.C2649             7.3(4.6)
   8 A.C2652          A.G2669             0.2(0.0)
   9 A.U2653          A.A2654             3.3(2.0)    H-bonds[1]: "OP2-O2'(hydroxyl)[2.62]"
  12 A.G2655          A.A2665             0.0(0.0)    H-bonds[3]: "N1(imino)-OP2[2.77],N2(amino)-OP2[3.34],N2(amino)-O5'[2.89]"

The corresponding section in v1.2.0 is:

Code: [Select]

List of 30 non-pairing interactions
   1 A.U2647        A.G2648        stacking: 1.0(0.5)--pm(>>,forward)
   2 A.G2648        A.C2649        stacking: 7.3(4.6)--pm(>>,forward)
   8 A.C2652        A.G2669        stacking: 0.2(0.0)--mm(<>,outward)
   9 A.U2653        A.A2654        stacking: 3.3(2.0)--pp(><,inward) H-bonds[1]: "OP2-O2'(hydroxyl)[2.62]"
  12 A.G2655        A.A2665        H-bonds[3]: "N1(imino)-OP2[2.77],N2(amino)-OP2[3.34],N2(amino)-O5'[2.89]"

The header line is removed. For each pair of nucleotides involved in non-pairing interactions, the stacking or H-bonding part is listed only if it is present. For example, #1 in the list (A.U2647 vs. A.G2648) has only a stacking interaction and no H-bonding; #12 (A.G2655 vs. A.A2665) has only H-bonding interactions and no base-stacking; #9 (A.U2653 vs. A.A2654) has both base-stacking and H-bonding interactions.
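Because either part may be absent, a script reading the v1.2.0 format has to treat both as optional. The following Python sketch shows one way to do that with a regular expression. It is only an illustration based on the sample lines above; the names `LINE_RE` and `parse_non_pair` are hypothetical, and the pattern has not been checked against every line DSSR can emit.

```python
import re

# Both the "stacking:" and the "H-bonds[n]:" parts are optional.
LINE_RE = re.compile(
    r'^\s*(?P<index>\d+)\s+(?P<nt1>\S+)\s+(?P<nt2>\S+)'
    r'(?:\s+stacking:\s+(?P<stacking>\S+))?'
    r'(?:\s+H-bonds\[(?P<count>\d+)\]:\s+"(?P<hbonds>[^"]*)")?\s*$'
)


def parse_non_pair(line):
    """Split one non-pairing line into its ids, stacking text, and H-bond list."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    hbonds = match["hbonds"].split(",") if match["hbonds"] else []
    return {
        "index": int(match["index"]),
        "nt1": match["nt1"],
        "nt2": match["nt2"],
        "stacking": match["stacking"],  # None when there is no stacking part
        "hbonds": hbonds,               # empty list when there is no H-bond part
    }


SAMPLE = """
   1 A.U2647        A.G2648        stacking: 1.0(0.5)--pm(>>,forward)
   9 A.U2653        A.A2654        stacking: 3.3(2.0)--pp(><,inward) H-bonds[1]: "OP2-O2'(hydroxyl)[2.62]"
  12 A.G2655        A.A2665        H-bonds[3]: "N1(imino)-OP2[2.77],N2(amino)-OP2[3.34],N2(amino)-O5'[2.89]"
"""

rows = [parse_non_pair(line) for line in SAMPLE.splitlines() if line.strip()]
for row in rows:
    print(row["index"], row["stacking"], len(row["hbonds"]))
```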

I've also introduced a classification of base-stacking interactions into four categories, depending on the relative orientations of the base faces as defined by the standard base reference frame: pm, mp, pp, and mm. Here 'p' is for plus, meaning the other base is on the +z-axis side of the referenced base; 'm' is for minus, meaning the other base is on the –z-axis side of the referenced base. The corresponding symbols used by Major et al. are put in parentheses: pm(>>, forward), mp(<<, backward), pp(><, inward), and mm(<>, outward). Note that I prefer to call pm(>>) forward instead of upward, and mp(<<) backward instead of downward.
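To make the p/m definition concrete, here is a small Python sketch of how the two letters could be derived from the origins and z-axes of two base reference frames. This is my reading of the definition above, not DSSR's actual code. The helper names are hypothetical, and it assumes that the first letter uses the first nucleotide as the reference and the second letter uses the second nucleotide.

```python
# Symbols and names as listed in the text above.
LABELS = {
    "pm": (">>", "forward"),
    "mp": ("<<", "backward"),
    "pp": ("><", "inward"),
    "mm": ("<>", "outward"),
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def side(origin, z_axis, other_origin):
    """'p' if the other base lies on the +z side of the reference base, else 'm'."""
    offset = [o - r for o, r in zip(other_origin, origin)]
    return "p" if dot(offset, z_axis) > 0 else "m"


def classify_stacking(origin1, z1, origin2, z2):
    """Return (code, symbol, name), e.g. ('pm', '>>', 'forward')."""
    code = side(origin1, z1, origin2) + side(origin2, z2, origin1)
    symbol, name = LABELS[code]
    return code, symbol, name


# Two idealized bases 3.4 Angstroms apart along z (made-up coordinates).
same_direction = classify_stacking((0, 0, 0), (0, 0, 1), (0, 0, 3.4), (0, 0, 1))
facing_each_other = classify_stacking((0, 0, 0), (0, 0, 1), (0, 0, 3.4), (0, 0, -1))
print(same_direction, facing_each_other)
```

With both z-axes pointing the same way the pair comes out as pm (forward); flipping the second base so that the two +z faces point at each other gives pp (inward).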

List of isolated canonical pairs

An extra column is added at the beginning to specify the helix that the isolated WC/wobble pair is part of. If it does not belong to any helix, the symbol n/a is used. Using 1msy as an example, the new output for this section would be as below.

List of 1 isolated WC/wobble pair
  Note: isolated WC/wobble pairs are assigned negative indices to
        differentiate them from the stem numbers, which are positive.
        --------------------------------------------------------------------
[#1]     -1 A.C2658        A.G2663        C-G WC           19-XIX    cWW  cW-W

Phosphate interactions (option --po4)

In addition to H-bonding and capping interactions, coordination with metal ions is now reported. To keep things simple, the OP1-hbonds or OP2-hbonds listing also includes the metals that the non-bridging oxygen coordinates with. Using yeast phenylalanine tRNA (1ehz) as an example, the new output looks like this:

Code: [Select]

List of 18 phosphate interactions
   1 A.U7            OP1-hbonds[1]: "MG@A.MG580[2.60]"
   ...........
   6 A.G19           OP1-hbonds[2]: "N4@A.C60[3.27],MN@A.MN530[2.19]"
   ...........
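
Since metals and H-bond partners now share one quoted list, a script may want to separate them again. The Python sketch below shows one possible way, based only on the two sample lines above. The name parse_po4_line, the regular expressions, and the ASSUMED_METALS set are all hypothetical and not part of DSSR; in particular, the set of atom names treated as metals is a hand-picked assumption that would need extending for real use.

```python
import re

# Assumed patterns, derived only from the two sample lines in this post.
GROUP_RE = re.compile(r'(OP[12])-hbonds\[(\d+)\]:\s*"([^"]*)"')
ENTRY_RE = re.compile(r"([^@,]+)@([^\[,]+)\[([\d.]+)\]")

# Assumption: partner atoms with these names are treated as metal ions.
ASSUMED_METALS = {"MG", "MN", "NA", "K", "CA", "ZN"}


def parse_po4_line(line):
    """Parse one phosphate-interaction line; return None for other lines."""
    fields = line.split(None, 2)
    if len(fields) < 3 or not fields[0].isdigit():
        return None  # title line or the "..........." filler
    record = {"index": int(fields[0]), "nt": fields[1], "contacts": []}
    for oxygen, _count, body in GROUP_RE.findall(fields[2]):
        for atom, partner, dist in ENTRY_RE.findall(body):
            record["contacts"].append({
                "oxygen": oxygen,            # OP1 or OP2
                "atom": atom,                # e.g. N4 or MN
                "partner": partner,          # e.g. A.C60 or A.MN530
                "distance": float(dist),
                "is_metal": atom in ASSUMED_METALS,
            })
    return record
```

Applied to entry 6 above, this yields two contacts for OP1 of A.G19: one H-bond to N4 of A.C60 and one metal contact to A.MN530.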

The new option --idstr to replace --long-idstr

The --long-idstr option has been replaced by --idstr, which takes the value long (i.e., --idstr=long) to reproduce the previous behavior. It also accepts short (i.e., --idstr=short) to list only the name and number of each nucleotide (e.g., A21).
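
As an illustration of why the choice of id string matters for scripts, here is a minimal Python sketch that splits a default-style id (chainId.name#, e.g. B.DA1689) or a short-style id (e.g. A21) into its parts. The function name split_id and the regular expression are assumptions made for this example, not DSSR code. Negative residue numbers and insertion codes are not handled, and the format produced by --idstr=long is not covered.

```python
import re

# Assumed layout: optional "chain." prefix, then a name, then trailing digits.
ID_RE = re.compile(r"^(?:(?P<chain>[^.]*)\.)?(?P<name>.*?)(?P<number>\d+)$")


def split_id(idstr):
    """Split an id such as 'B.DA1689' or 'A21' into (chain, name, number).

    The chain is None when the id has no 'chain.' prefix (short style).
    """
    m = ID_RE.match(idstr)
    if m is None:
        raise ValueError("unrecognized id string: %r" % idstr)
    return m.group("chain"), m.group("name"), int(m.group("number"))
```

Note that the split is a guess whenever a residue name itself ends in a digit: the sketch always assigns all trailing digits to the number. That kind of ambiguity is a reason to prefer the strictly delineated long form when parsing automatically.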

Xiang-Jun

DNA/RNA-protein interactions (SNAP) / SNAP: software for characterizing DNA-protein interactions

« on: May 05, 2014, 01:22:59 am »

DNA/RNA-protein interactions underpin fundamental biological processes such as transcription, splicing, and translation. The increasing number of experimentally determined 3D structures of nucleic acid-protein complexes provides unprecedented opportunities to decipher the underlying principles governing the process of DNA/RNA-protein recognition. Existing bioinformatics tools are fragmented, with limited scope or usability. We have developed SNAP, a new 3DNA program for the characterization of 3D Structures of Nucleic Acid-Protein complexes. SNAP consolidates, refines, and significantly extends 3DNA's functionality for DNA-protein structural analysis.

Starting from a structure of a DNA-protein complex in PDB or PDBx/mmCIF format, SNAP automatically detects double-helical regions consisting of either canonical or non-canonical base-pairs using DSSR, and categorizes protein into secondary structural units (alpha-helices, beta-sheets, turns, and loops) using DSSP. The program aims to characterize DNA/RNA-protein interactions by checking all combinations between the two components: major groove, minor groove, and backbone for DNA/RNA, versus each alpha-helix, beta-sheet, turn, and loop for protein. SNAP recognizes and outputs base-amino-acid H-bonding and stacking interactions. To quantify the relative spatial relationship between interacting amino acids and bases, SNAP defines a local amino-acid reference frame in the side chain, and takes advantage of the standard base reference frame (see figures below). SNAP calculates all six rigid-body parameters to allow for the analysis of large sets of DNA/RNA-protein complexes consistently and rigorously.

Implemented in ANSI C as a standalone command-line program, SNAP follows the same minimalist design as DSSR. It is tiny (executables are about 1MB) and self-contained, with zero runtime dependencies on third party libraries or configurations. The program is currently under active development, and your feedback will make a difference!

SNAP has been integrated into DSSR, and is available from the Columbia Technology Ventures (CTV) website.

Release history (in reverse chronological order)

List of users who have helped improve SNAP by reporting bugs, making comments/suggestions, etc.:

Auffinger; jdbrown444; ldfinger; miaozhichao; Phosphoserine; jms89

-- Xiang-Jun

Note: please start a new topic with a more specific title; do not post directly below this announcement.

Abbreviations used: AA: amino acid; BP: base-pair

Here is a sample run on 1oct (see x3dna-snap -h for more info), the crystal structure of the Oct-1 POU domain bound to an octamer site solved by Pabo et al.:

Code: [Select]

Run: x3dna-snap --time-stamp=off --interface -i=1oct.pdb -o=1oct.out
****************************************************************************
       SNAP: a software tool for the characterization of 3D
           Structures of Nucleic Acid-Protein complexes
              v1.0.7-2020sep09, by xiangjun@x3dna.org

SNAP has been made possible by the NIH grant R01GM096889 (to X.J.Lu).
It is being actively maintained and developed. As always, I greatly
appreciate your feedback. Please report all SNAP-related issues on
the 3DNA Forum (forum.x3dna.org). I strive to respond promptly to any
questions posted there. SNAP is free of charge for NON-COMMERCIAL
purposes, and it comes with ABSOLUTELY NO WARRANTY.

****************************************************************************
Note: By default, each nucleotide/amino-acid is identified by chainId.name#.
      So a common case would be B.DA1689, meaning adenosine #1689 on chain B.
      Use the --idstr=long option to get strictly delineated id strings.

Command: x3dna-snap -i=1oct.pdb -o=1oct.out --interface --type=either
File name: 1oct.pdb
    no. of peptide chains: 1 [C=131]
    no. of DNA/RNA chains: 2 [A=15,B=15]
    no. of amino acids:    131
    no. of nucleotides:    30
    no. of atoms:          1670
    no. of waters:         0
    no. of metals:         0

****************************************************************************
List of 1 helix
  Note: a helix is defined by base-stacking interactions, regardless of bp
        type and backbone connectivity, and may contain more than one stem.
      helix#number[stems-contained] bps=number-of-base-pairs in the helix
      bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
      helix-form: classification of a dinucleotide step comprising the bp
        above the given designation and the bp that follows it. Types
        include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
        '.' for an unclassified step, and 'x' for a step without a
        continuous backbone.
      --------------------------------------------------------------------
  helix#1[0] bps=14
      strand-1 5'-GTATGCAAATAAGG-3'
       bp-type    ||||||||||||||
      strand-2 3'-CATACGTTTATTCC-5'
      helix-form  BBBBBBBBBBBBB
   1 A.DG202        B.DC230        G-C WC           19-XIX    cWW  cW-W
   2 A.DT203        B.DA229        T-A WC           20-XX     cWW  cW-W
   3 A.DA204        B.DT228        A-T WC           20-XX     cWW  cW-W
   4 A.DT205        B.DA227        T-A WC           20-XX     cWW  cW-W
   5 A.DG206        B.DC226        G-C WC           19-XIX    cWW  cW-W
   6 A.DC207        B.DG225        C-G WC           19-XIX    cWW  cW-W
   7 A.DA208        B.DT224        A-T WC           20-XX     cWW  cW-W
   8 A.DA209        B.DT223        A-T WC           20-XX     cWW  cW-W
   9 A.DA210        B.DT222        A-T WC           20-XX     cWW  cW-W
  10 A.DT211        B.DA221        T-A WC           20-XX     cWW  cW-W
  11 A.DA212        B.DT220        A-T WC           20-XX     cWW  cW-W
  12 A.DA213        B.DT219        A-T WC           20-XX     cWW  cW-W
  13 A.DG214        B.DC218        G-C WC           19-XIX    cWW  cW-W
  14 A.DG215        B.DC217        G-C WC           19-XIX    cWW  cW-W

****************************************************************************
List of 77 nucleotide/amino-acid interactions
       id   nt-aa   nt           aa              Tdst    Rdst     Tx      Ty      Tz      Rx      Ry      Rz
   1  1oct  G-thr  A.DG202      C.THR26         12.73  -38.52   -6.47   10.19    4.03  -24.59   26.49  -13.54
   2  1oct  T-arg  A.DT203      C.ARG20         19.41 -111.23  -15.89    8.93    6.68  -43.44  -11.44 -104.66
   3  1oct  T-ile  A.DT203      C.ILE21        -17.63 -139.08  -13.56    5.93   -9.58  -15.64   62.29 -131.25
   4  1oct  T-thr  A.DT203      C.THR26        -13.42  -56.68   -2.74   13.12    0.78  -17.93   32.10  -43.91
   5  1oct  T-gln  A.DT203      C.GLN27         12.54 -132.49   -5.55    8.99    6.76   21.85  -81.09 -114.36
   6  1oct  T-gln  A.DT203      C.GLN44         10.67 -115.42   -4.72    9.57   -0.05   65.02   27.06  -98.34
   7  1oct  T-ser  A.DT203      C.SER48         11.31  -84.75   -8.12    6.44    4.54   29.42    7.18  -80.13
   8  1oct  A-lys  A.DA204      C.LYS17        -18.07  175.87   14.34   -7.08   -8.42    7.85  -46.15  175.49
   9  1oct  A-gln  A.DA204      C.GLN27        -11.98  163.47    5.68   -8.28   -6.55   20.04  -87.74  156.55
  10  1oct  A-gln  A.DA204      C.GLN44          8.90 -147.71   -1.25    8.28   -3.00   62.13   -4.03 -142.09
  11  1oct  A-ser  A.DA204      C.SER48         10.01 -126.61   -5.86    8.07    0.89   21.11   -7.37 -125.49
  12  1oct  A-glu  A.DA204      C.GLU51         14.87 -145.96  -11.08    8.90   -4.38   87.37   -5.57 -132.18
  13  1oct  T-thr  A.DT205      C.THR45          6.48 -148.19   -1.00    5.50    3.28  -35.77    2.63 -146.52
  14  1oct  T-ser  A.DT205      C.SER48        -10.33 -151.96   -5.14    8.73   -2.03   28.42  -17.90 -150.69
  15  1oct  G-thr  A.DG206      C.THR45         -7.46  178.53   -1.92   -7.19    0.56   34.46  -11.97  178.45
  16  1oct  G-arg  A.DG206      C.ARG49        -11.02  175.51    3.67   -9.94    3.04   54.67  -28.85  174.77
  17  1oct  C-arg  A.DC207      C.ARG49        -10.55  130.65   -1.73  -10.40    0.39   42.89  -25.07  125.22
  18  1oct  C-arg  A.DC207      C.ARG105        10.64   61.56   -9.91   -3.22    2.16    3.33   59.24   17.22
  19  1oct  A-arg  A.DA208      C.ARG105        10.98  -45.39  -10.37   -1.40   -3.30   18.22   38.36  -16.42
  20  1oct  A-arg  A.DA208      C.ARG113        17.53  -87.05  -14.54    6.53   -7.31   61.91   23.81  -59.97
  21  1oct  A-arg  A.DA209      C.ARG102        12.96  -64.64  -12.42   -3.69    0.50   19.39   38.44  -49.42
  22  1oct  A-arg  A.DA209      C.ARG105       -10.54  -77.58   -9.01    0.77   -5.43   32.27   40.40  -59.98
  23  1oct  A-thr  A.DA209      C.THR106       -12.45   76.08   -4.46   10.70    4.55  -24.22  -70.81   14.79
  24  1oct  A-ile  A.DA209      C.ILE108       -14.17  -36.15   -5.74   12.87   -1.43   -1.65  -27.04  -24.16
  25  1oct  A-arg  A.DA209      C.ARG113       -16.89 -120.22  -11.02    8.88   -9.21   63.56   10.27 -107.85
  26  1oct  A-trp  A.DA209      C.TRP148        11.47  103.87    6.57    5.65    7.52    2.91   44.45   96.44
  27  1oct  A-asn  A.DA209      C.ASN151         9.24 -165.14   -6.41    6.54   -1.24   44.17   60.06 -161.30
  28  1oct  A-lys  A.DA209      C.LYS155        12.02  116.35    8.46   -4.66    7.15    1.16   62.24  103.95
  29  1oct  A-arg  A.DA210      C.ARG102        13.00  -90.23  -12.62   -0.84   -3.03   31.67   30.70  -80.82
  30  1oct  A-lys  A.DA210      C.LYS103       -13.17  -49.41   -7.21    9.69    5.25  -32.94  -36.47   -5.27
  31  1oct  A-lys  A.DA210      C.LYS104       -14.22  106.94    5.65   12.26   -4.48   29.91   64.97   85.63
  32  1oct  A-arg  A.DA210      C.ARG105       -12.13 -102.76   -8.14    2.20   -8.71   44.67   29.25  -91.36
  33  1oct  A-thr  A.DA210      C.THR106       -13.69  -77.72   -3.34   13.06    2.39  -41.68  -63.09  -19.46
  34  1oct  A-val  A.DA210      C.VAL144        12.95   98.31    5.37    9.78    6.57    9.45   78.15   64.48
  35  1oct  A-val  A.DA210      C.VAL147         9.50  177.54    8.35   -4.40    1.13  -31.38  -26.50  177.38
  36  1oct  A-asn  A.DA210      C.ASN151         7.95  169.39    3.58   -6.02   -3.76  -61.24  -46.91  166.41
  37  1oct  T-arg  A.DT211      C.ARG102       -14.28 -123.19  -12.66    2.15   -6.23   35.07   24.54 -118.55
  38  1oct  T-lys  A.DT211      C.LYS103       -14.60  -59.01   -4.92   13.51    2.53  -42.55  -18.37  -37.59
  39  1oct  T-val  A.DT211      C.VAL147        -9.12  143.26    7.02   -5.17   -2.68  -31.13  -15.46  141.43
  40  1oct  C-ser  B.DC217      C.SER128       -15.00  -66.88   -9.78    8.86   -7.13   36.24   36.65  -44.16
  41  1oct  C-lys  B.DC217      C.LYS142        18.35  -89.91   -7.88   15.79   -5.07   36.13   69.78  -47.78
  42  1oct  C-arg  B.DC217      C.ARG146        14.28  -55.97   -8.39   11.34    2.24   15.64   47.13  -26.68
  43  1oct  C-lys  B.DC218      C.LYS125        15.78 -137.92  -15.08    2.61   -3.83   11.82   49.06 -133.24
  44  1oct  C-arg  B.DC218      C.ARG146        13.18  -86.99   -2.76   12.86   -0.85   32.48   47.52  -68.28
  45  1oct  C-cys  B.DC218      C.CYS150        11.54  -63.48   -5.57    8.38    5.65    2.43   35.14  -53.68
  46  1oct  C-arg  B.DC218      C.ARG153        14.95 -110.67  -12.78    7.73    0.61   48.20   62.91  -84.80
  47  1oct  T-lys  B.DT219      C.LYS125       -16.35 -155.82  -12.71    6.20   -8.22   28.65   34.22 -153.83
  48  1oct  T-cys  B.DT219      C.CYS150        10.66  -81.40   -2.86   10.10    1.89    8.36   20.66  -78.81
  49  1oct  T-arg  B.DT219      C.ARG153        13.68 -118.82   -9.41    9.25   -3.59   61.30   38.84 -101.70
  50  1oct  T-lys  B.DT220      C.LYS157        15.34 -128.23  -11.99    9.55    0.74   26.77    7.97 -126.53
  51  1oct  T-ser  B.DT223      C.SER56         13.43 -172.88  -11.81    5.85    2.61   78.96  -41.85 -169.98
  52  1oct  T-lys  B.DT223      C.LYS58        -15.63 -104.26  -10.08    8.56   -8.32   47.88   32.02  -91.06
  53  1oct  T-asn  B.DT223      C.ASN59         13.38  -79.77   -9.24    8.66    4.33  -40.87    3.90  -69.96
  54  1oct  T-lys  B.DT223      C.LYS62         16.93  -82.77  -14.55    8.20   -2.75   16.71   20.50  -79.16
  55  1oct  T-arg  B.DT223      C.ARG102        12.45   46.89  -11.36    4.59    2.22   27.19  -16.42   34.94
  56  1oct  T-phe  B.DT224      C.PHE42         13.51   37.27   -1.91   12.27    5.32   -1.35    5.84   36.80
  57  1oct  T-thr  B.DT224      C.THR46         10.40  140.67    5.66    1.06    8.65  -28.39   38.51  136.80
  58  1oct  T-arg  B.DT224      C.ARG49         12.06 -116.46   -5.79   10.20    2.79   24.96   60.01 -102.74
  59  1oct  T-leu  B.DT224      C.LEU55         11.64  144.35    9.48   -3.62   -5.72  -19.44  -72.43  134.61
  60  1oct  T-asn  B.DT224      C.ASN59        -13.74 -111.77   -6.97   11.83    0.55  -36.46   15.99 -106.76
  61  1oct  T-lys  B.DT224      C.LYS62        -17.62 -117.67  -12.18   11.04   -6.35   23.45   13.68 -115.67
  62  1oct  T-leu  B.DT224      C.LEU63         16.98 -113.68  -11.93   10.63    5.74  -60.86   16.39 -100.17
  63  1oct  T-arg  B.DT224      C.ARG102       -14.08  -32.87  -11.42    8.12   -1.33   20.82  -25.38   -1.63
  64  1oct  T-arg  B.DT224      C.ARG105        10.20   47.44   -9.49    1.43    3.47   32.81  -33.87    5.35
  65  1oct  G-asp  B.DG225      C.ASP41        -15.79 -162.72  -12.41    2.27    9.49  -41.38  -74.71 -156.40
  66  1oct  G-phe  B.DG225      C.PHE42         12.92   10.20    2.40   12.63    1.36   -9.66    3.27    0.08
  67  1oct  G-ser  B.DG225      C.SER43         10.76 -127.61   -8.48    6.55   -1.00  -33.95   77.50 -106.70
  68  1oct  G-thr  B.DG225      C.THR45          8.31 -141.05   -5.19    5.96    2.57   27.42   16.56 -139.41
  69  1oct  G-thr  B.DG225      C.THR46          9.41  109.20    7.44   -0.06    5.76  -23.33   51.56   97.72
  70  1oct  G-arg  B.DG225      C.ARG49         10.84 -140.82   -1.39   10.75   -0.00   40.74   39.22 -135.24
  71  1oct  G-arg  B.DG225      C.ARG102       -14.97  -49.97   -8.82   10.75   -5.55    3.36  -36.30  -34.77
  72  1oct  G-arg  B.DG225      C.ARG105       -10.31  -55.06   -9.54    3.91   -0.13   12.03  -47.33  -26.24
  73  1oct  C-thr  B.DC226      C.THR45         -7.82 -173.77   -2.54    7.16   -1.83   17.56   14.19 -173.64
  74  1oct  C-lys  B.DC226      C.LYS104       -14.16  152.22   13.16    2.51   -4.59   66.99   37.28  144.36
  75  1oct  C-arg  B.DC226      C.ARG105       -11.37  -65.61   -8.90    6.45   -2.94   -3.39  -34.34  -56.68
  76  1oct  A-thr  B.DA227      C.THR45         -6.83  150.54   -0.53   -5.80   -3.56  -38.91   -1.49  148.71
  77  1oct  A-ser  B.DA227      C.SER107       -13.93  120.59    3.63    1.86  -13.32   63.52  -38.15  103.24

****************************************************************************
List of 71 base-pair/amino-acid interactions
       id   bp-aa     nt1          nt2          aa              Tdst    Rdst    Tx      Ty      Tz      Rx      Ry      Rz
   1  1oct  AT-arg  B.DA229      A.DT203      C.ARG20        -19.52  117.19  -15.97   -9.33   -6.22  -41.53   15.52  111.52
   2  1oct  AT-arg  A.DA208      B.DT224      C.ARG49        -12.01  110.94   -6.48   -9.84   -2.30   21.51  -51.98  100.00
   3  1oct  AT-arg  A.DA208      B.DT224      C.ARG102        14.29  -31.63  -11.90   -7.65    1.98   13.43   28.49   -2.95
   4  1oct  AT-arg  A.DA208      B.DT224      C.ARG105       -10.59  -45.43   -9.93   -1.42   -3.39   25.50   36.06  -10.92
   5  1oct  AT-arg  A.DA208      B.DT224      C.ARG113        17.31  -90.85  -14.39    6.97   -6.64   70.13   25.78  -55.99
   6  1oct  AT-arg  A.DA209      B.DT223      C.ARG102        12.71  -54.74  -11.96   -4.19   -0.89   23.07   27.36  -42.12
   7  1oct  AT-arg  A.DA209      B.DT223      C.ARG105       -10.29  -68.58   -8.18    0.74   -6.21   36.66   30.50  -50.81
   8  1oct  AT-arg  A.DA209      B.DT223      C.ARG113       -16.87 -116.63  -10.50    9.30   -9.38   73.35    5.41  -98.06
   9  1oct  AT-arg  A.DA210      B.DT222      C.ARG102       -13.20  -84.60  -12.59   -1.35   -3.71   32.25   24.46  -75.94
  10  1oct  AT-arg  A.DA210      B.DT222      C.ARG105       -12.11  -97.17   -7.84    1.95   -9.02   45.82   23.66  -85.44
  11  1oct  AT-arg  B.DA221      A.DT211      C.ARG102        14.18  115.79  -11.93   -1.95    7.41   37.89  -13.80  111.03
  12  1oct  AT-arg  A.DA213      B.DT219      C.ARG153       -13.63  118.33   -9.60   -9.00    3.57   62.96  -35.71  101.15
  13  1oct  AT-asn  A.DA208      B.DT224      C.ASN59         13.52  112.55   -7.23  -11.42    0.14  -38.45   -7.03  107.81
  14  1oct  AT-asn  A.DA209      B.DT223      C.ASN59        -13.39   77.54   -9.96   -8.07   -3.88  -34.03    7.88   70.36
  15  1oct  AT-asn  A.DA209      B.DT223      C.ASN151         9.34 -156.07   -6.69    6.48   -0.66   53.59   59.89 -148.51
  16  1oct  AT-asn  A.DA210      B.DT222      C.ASN151         7.73  175.46    3.73   -5.97   -3.21  -66.31  -46.20  174.04
  17  1oct  AT-cys  A.DA213      B.DT219      C.CYS150       -10.45   82.10   -3.17   -9.77   -1.89    9.43  -17.73   80.04
  18  1oct  AT-gln  B.DA229      A.DT203      C.GLN27        -12.65  160.63  -11.13   -5.09    3.19  -44.88  -77.69  152.54
  19  1oct  AT-gln  B.DA229      A.DT203      C.GLN44        -10.82  115.12   -4.79   -9.70   -0.08   67.58  -19.25   98.01
  20  1oct  AT-gln  A.DA204      B.DT228      C.GLN27        -11.93  166.48    5.68   -8.49   -6.16   16.93  -84.96  161.38
  21  1oct  AT-gln  A.DA204      B.DT228      C.GLN44          8.85 -145.35   -1.20    8.29   -2.86   64.97   -7.53 -138.55
  22  1oct  AT-glu  A.DA204      B.DT228      C.GLU51         14.79  -90.29   -0.94   12.66    7.59  -40.17   80.28  -10.94
  23  1oct  AT-ile  B.DA229      A.DT203      C.ILE21         17.69  140.07  -13.19   -6.51    9.84   -9.96  -57.42  133.98
  24  1oct  AT-ile  A.DA209      B.DT223      C.ILE108       -14.05  -45.00   -5.64   12.74   -1.80    1.85  -39.75  -21.44
  25  1oct  AT-leu  A.DA208      B.DT224      C.LEU55        -11.39 -149.57    9.27    3.36    5.69  -23.35   66.87 -142.43
  26  1oct  AT-leu  A.DA208      B.DT224      C.LEU63         16.88  116.53  -12.54  -10.40   -4.43  -63.20   -7.65  103.50
  27  1oct  AT-lys  A.DA204      B.DT228      C.LYS17        -18.00  178.12   14.14   -7.20   -8.51    4.21  -42.94  177.98
  28  1oct  AT-lys  A.DA208      B.DT224      C.LYS62         17.46  113.44  -12.10  -10.36    7.14   21.84   -5.49  111.96
  29  1oct  AT-lys  A.DA209      B.DT223      C.LYS58         15.58   97.17   -9.91   -8.49    8.52   54.71  -24.55   80.41
  30  1oct  AT-lys  A.DA209      B.DT223      C.LYS62         16.98   76.57  -14.77   -7.60    3.52   23.46  -10.58   72.75
  31  1oct  AT-lys  A.DA209      B.DT223      C.LYS155        12.22  111.66    8.02   -4.08    8.28   -9.80   54.07  101.45
  32  1oct  AT-lys  A.DA210      B.DT222      C.LYS103       -13.29  -56.30   -7.88    9.58    4.78  -35.38  -43.65   -3.71
  33  1oct  AT-lys  A.DA210      B.DT222      C.LYS104       -14.28  103.49    5.38   12.13   -5.25   22.30   62.89   84.32
  34  1oct  AT-lys  B.DA221      A.DT211      C.LYS103        14.53   67.18   -5.13  -13.50   -1.59  -46.94   30.72   38.56
  35  1oct  AT-lys  A.DA212      B.DT220      C.LYS157       -15.24  126.42  -12.14   -9.18   -0.66   26.14   -7.61  124.74
  36  1oct  AT-lys  A.DA213      B.DT219      C.LYS125        16.19  156.20  -12.64   -6.01    8.15   31.37  -32.17  154.22
  37  1oct  AT-phe  A.DA208      B.DT224      C.PHE42        -13.70  -40.01   -2.53  -12.67   -4.56  -10.22   -4.94  -38.43
  38  1oct  AT-ser  B.DA229      A.DT203      C.SER48        -11.58   87.13   -8.41   -6.64   -4.39   30.60    0.15   82.61
  39  1oct  AT-ser  A.DA204      B.DT228      C.SER48          9.93 -125.38   -5.92    7.94    0.78   23.33  -11.93 -123.79
  40  1oct  AT-ser  B.DA227      A.DT205      C.SER48         10.37  152.19   -4.85   -8.92    2.09   26.76   18.96  150.98
  41  1oct  AT-ser  B.DA227      A.DT205      C.SER107       -13.72  120.33    3.45    2.22  -13.09   64.99  -38.86  101.87
  42  1oct  AT-ser  A.DA209      B.DT223      C.SER56        -13.35   90.10   -4.50  -12.57    0.15   -6.23  -78.04   48.75
  43  1oct  AT-thr  B.DA229      A.DT203      C.THR26         13.45   58.18   -2.51  -13.19   -0.70  -17.63  -26.42   49.39
  44  1oct  AT-thr  B.DA227      A.DT205      C.THR45         -6.65  149.36   -0.76   -5.65   -3.42  -37.34   -2.05  147.61
  45  1oct  AT-thr  A.DA208      B.DT224      C.THR46        -10.47 -143.48    5.83   -1.89   -8.49  -35.34  -44.18 -138.31
  46  1oct  AT-thr  A.DA209      B.DT223      C.THR106       -12.24   88.85   -4.96   10.54    3.77  -23.98  -84.29   16.44
  47  1oct  AT-thr  A.DA210      B.DT222      C.THR106       -13.61  -84.35   -3.84   12.88    2.19  -42.84  -70.72  -18.40
  48  1oct  AT-trp  A.DA209      B.DT223      C.TRP148        11.45  101.29    6.02    6.18    7.54   -7.55   35.76   96.20
  49  1oct  AT-val  A.DA210      B.DT222      C.VAL144        12.90   94.14    4.85   10.35    5.99    2.43   74.57   62.19
  50  1oct  AT-val  A.DA210      B.DT222      C.VAL147         9.44 -177.78   -8.53    3.83    1.33   37.06   24.02 -177.61
  51  1oct  AT-val  B.DA221      A.DT211      C.VAL147        -9.19 -149.81    7.12    5.31    2.37  -42.28   11.70 -147.39
  52  1oct  GC-arg  A.DG206      B.DC226      C.ARG49         11.14  174.67    3.34   -9.86    3.98   45.92  -30.78  173.98
  53  1oct  GC-arg  A.DG206      B.DC226      C.ARG105        11.24   73.06   -8.49   -6.69    3.11   -0.59   41.26   61.67
  54  1oct  GC-arg  B.DG225      A.DC207      C.ARG49         10.69 -135.66   -1.58   10.57   -0.18   41.82   32.16 -130.18
  55  1oct  GC-arg  B.DG225      A.DC207      C.ARG102       -15.17  -52.80   -8.81   10.28   -6.84   -0.85  -42.84  -31.61
  56  1oct  GC-arg  B.DG225      A.DC207      C.ARG105       -10.47  -57.77   -9.76    3.61   -1.15    7.66  -53.29  -21.80
  57  1oct  GC-arg  A.DG214      B.DC218      C.ARG146       -13.22   81.82   -3.10  -12.84    0.55   33.12  -42.77   63.89
  58  1oct  GC-arg  A.DG214      B.DC218      C.ARG153       -15.04  105.31  -13.12   -7.36   -0.34   48.96  -59.02   78.67
  59  1oct  GC-arg  A.DG215      B.DC217      C.ARG146       -14.12   54.56   -8.22  -11.26   -2.24   17.71  -44.22   27.42
  60  1oct  GC-asp  B.DG225      A.DC207      C.ASP41        -15.83  145.24    7.69    5.05  -12.89   14.51  -88.51  130.16
  61  1oct  GC-cys  A.DG214      B.DC218      C.CYS150       -11.64   58.75   -6.02   -8.19   -5.66    2.88  -29.51   51.30
  62  1oct  GC-lys  A.DG206      B.DC226      C.LYS104        14.00 -147.17   13.14   -1.87    4.47   60.50  -32.46 -139.97
  63  1oct  GC-lys  A.DG214      B.DC218      C.LYS125       -15.77  133.59  -15.05   -2.40    4.03   15.10  -45.64  128.88
  64  1oct  GC-lys  A.DG215      B.DC217      C.LYS142       -18.20   88.73   -7.54  -15.85    4.81   38.75  -67.02   47.26
  65  1oct  GC-phe  B.DG225      A.DC207      C.PHE42        -12.94   16.35    2.22   12.74    0.40  -16.24   -1.72    0.81
  66  1oct  GC-ser  B.DG225      A.DC207      C.SER43         10.70 -125.48   -8.19    6.60   -1.95  -34.20   69.17 -108.25
  67  1oct  GC-ser  A.DG215      B.DC217      C.SER128        14.83   66.84   -9.41   -8.96    7.15   38.53  -33.77   44.47
  68  1oct  GC-thr  A.DG202      B.DC230      C.THR26         12.81  -35.78   -6.69   10.25    3.78  -25.47   20.05  -15.35
  69  1oct  GC-thr  A.DG206      B.DC226      C.THR45         -7.64  176.19   -2.23   -7.21    1.19   26.03  -13.35  176.06
  70  1oct  GC-thr  B.DG225      A.DC207      C.THR45          8.13 -137.49   -5.46    5.69    1.99   29.38    9.20 -135.83
  71  1oct  GC-thr  B.DG225      A.DC207      C.THR46          9.34  109.47    7.63    0.45    5.38  -31.44   53.14   95.45

****************************************************************************
List of 25 phosphate/amino-acid H-bonds
       id    nt-atom         aa-atom            dist     type
   1  1oct  OP1@A.DT203      NH2@C.ARG20        2.74 po4:sidechain:salt-bridge
   2  1oct  OP2@A.DT203      NH2@C.ARG20        3.02 po4:sidechain:salt-bridge
   3  1oct  OP2@A.DT203      N@C.GLN27          3.19 po4:backbone
   4  1oct  OP2@A.DA204      NE2@C.GLN27        2.79 po4:sidechain
   5  1oct  OP2@A.DA204      OG@C.SER48         2.60 po4:sidechain
   6  1oct  OP1@A.DA208      NH2@C.ARG113       2.69 po4:sidechain:salt-bridge
   7  1oct  OP1@A.DA210      N@C.THR106         2.96 po4:backbone
   8  1oct  OP1@A.DA210      OG1@C.THR106       2.37 po4:sidechain
   9  1oct  OP1@A.DT211      N@C.LYS103         3.10 po4:backbone
  10  1oct  OP1@B.DC217      NZ@C.LYS142        3.64 po4:sidechain:salt-bridge
  11  1oct  O5'@B.DC217      OG@C.SER128        3.41 po4:sidechain
  12  1oct  OP2@B.DC218      NE@C.ARG146        2.90 po4:sidechain:salt-bridge
  13  1oct  OP1@B.DT219      NZ@C.LYS125        3.21 po4:sidechain:salt-bridge
  14  1oct  OP2@B.DT219      NE@C.ARG153        3.04 po4:sidechain:salt-bridge
  15  1oct  OP2@B.DT219      NH2@C.ARG153       2.58 po4:sidechain:salt-bridge
  16  1oct  O3'@B.DC218      NZ@C.LYS125        3.44 po4:sidechain
  17  1oct  O3'@B.DC218      NH2@C.ARG153       3.14 po4:sidechain
  18  1oct  OP2@B.DT223      OG@C.SER56         2.61 po4:sidechain
  19  1oct  OP1@B.DT224      NZ@C.LYS62         2.42 po4:sidechain:salt-bridge
  20  1oct  OP2@B.DT224      ND2@C.ASN59        2.83 po4:sidechain
  21  1oct  OP1@B.DG225      N@C.SER43          3.26 po4:backbone
  22  1oct  OP2@B.DG225      N@C.SER43          3.12 po4:backbone
  23  1oct  OP2@B.DG225      OG@C.SER43         2.74 po4:sidechain
  24  1oct  OP2@B.DG225      OG1@C.THR46        2.45 po4:sidechain
  25  1oct  OP1@B.DC226      N@C.ARG105         3.82 po4:backbone

****************************************************************************
List of 1 sugar/amino-acid H-bonds
       id    nt-atom         aa-atom            dist     type
   1  1oct  O4'@A.DA208      NH2@C.ARG105       3.40 sugar:sidechain

****************************************************************************
List of 12 base/amino-acid H-bonds
       id    nt-atom         aa-atom            dist     type
   1  1oct  N7@A.DA204       NE2@C.GLN44        3.16 base:sidechain
   2  1oct  N6@A.DA204       OE1@C.GLN44        3.37 base:sidechain
   3  1oct  O4@A.DT205       OG1@C.THR45        2.93 base:sidechain
   4  1oct  N7@A.DG206       NH2@C.ARG49        3.44 base:sidechain
   5  1oct  N3@A.DA208       NH1@C.ARG105       3.23 base:sidechain
   6  1oct  N7@A.DA210       ND2@C.ASN151       3.47 base:sidechain
   7  1oct  N6@A.DA210       OD1@C.ASN151       3.54 base:sidechain
   8  1oct  N3@A.DA210       NH2@C.ARG102       3.85 base:sidechain
   9  1oct  O2@B.DT223       NH2@C.ARG102       3.69 base:sidechain
  10  1oct  O6@B.DG225       NH1@C.ARG49        2.84 base:sidechain
  11  1oct  O6@B.DG225       NH2@C.ARG49        3.21 base:sidechain
  12  1oct  N4@B.DC226       OG1@C.THR45        3.22 base:sidechain

****************************************************************************
List of 7 base/amino-acid pairs
       id   nt-aa   nt           aa      vertical-distance   plane-angle
   1  1oct  A-gln  A.DA204      C.GLN44         0.05              9
   2  1oct  G-arg  A.DG206      C.ARG49         1.68             26
   3  1oct  A-arg  A.DA208      C.ARG105        0.33             44
   4  1oct  A-arg  A.DA210      C.ARG102        2.12             37
   5  1oct  A-asn  A.DA210      C.ASN151        0.70             27
   6  1oct  T-arg  B.DT223      C.ARG102        1.17             49
   7  1oct  G-arg  B.DG225      C.ARG49         0.44             32

****************************************************************************
List of 2 base/amino-acid stacks
       id   nt-aa   nt           aa      vertical-distance   plane-angle
   1  1oct  T-gln  A.DT203      C.GLN44         3.79              1
   2  1oct  G-arg  B.DG225      C.ARG105        3.31             37

****************************************************************************
List of 19 nucleotides interacting with amino acids
   1 A.DG202  d=3.25 C5'@A.DG202   CG2@C.THR26   aas=1 T C.THR26 ...
   2 A.DT203  d=2.74 OP1@A.DT203   NH2@C.ARG20   aas=6 RITQQS C.ARG20,C.ILE21,C.THR26,C.GLN27,C.GLN44,C.SER48 h..,...,...,h..,..s,...
   3 A.DA204  d=2.60 OP2@A.DA204   OG@C.SER48    aas=5 KQQSE C.LYS17,C.GLN27,C.GLN44,C.SER48,C.GLU51 ...,h..,hp.,h..,...
   4 A.DT205  d=2.93 O4@A.DT205    OG1@C.THR45   aas=2 TS C.THR45,C.SER48 h..,...
   5 A.DG206  d=3.44 N7@A.DG206    NH2@C.ARG49   aas=2 TR C.THR45,C.ARG49 ...,hp.
   6 A.DC207  d=3.53 N4@A.DC207    NH2@C.ARG49   aas=2 RR C.ARG49,C.ARG105 ...,...
   7 A.DA208  d=2.69 OP1@A.DA208   NH2@C.ARG113  aas=2 RR C.ARG105,C.ARG113 hp.,h..
   8 A.DA209  d=3.35 O4'@A.DA209   NH1@C.ARG105  aas=8 RRTIRWNK C.ARG102,C.ARG105,C.THR106,C.ILE108,C.ARG113,C.TRP148,C.ASN151,C.LYS155 ...,...,...,...,...,...,...,...
   9 A.DA210  d=2.37 OP1@A.DA210   OG1@C.THR106  aas=8 RKKRTVVN C.ARG102,C.LYS103,C.LYS104,C.ARG105,C.THR106,C.VAL144,C.VAL147,C.ASN151 hp.,...,...,...,h..,...,...,hp.
  10 A.DT211  d=3.10 OP1@A.DT211   N@C.LYS103    aas=3 RKV C.ARG102,C.LYS103,C.VAL147 ...,h..,...
  11 B.DC217  d=3.29 C5'@B.DC217   OG@C.SER128   aas=3 SKR C.SER128,C.LYS142,C.ARG146 h..,h..,...
  12 B.DC218  d=2.90 OP2@B.DC218   NE@C.ARG146   aas=4 KRCR C.LYS125,C.ARG146,C.CYS150,C.ARG153 h..,h..,...,h..
  13 B.DT219  d=2.58 OP2@B.DT219   NH2@C.ARG153  aas=3 KCR C.LYS125,C.CYS150,C.ARG153 h..,...,h..
  14 B.DT220  d=3.96 OP2@B.DT220   CE@C.LYS157   aas=1 K C.LYS157 ...
  15 B.DT223  d=2.61 OP2@B.DT223   OG@C.SER56    aas=5 SKNKR C.SER56,C.LYS58,C.ASN59,C.LYS62,C.ARG102 h..,...,...,...,hp.
  16 B.DT224  d=2.42 OP1@B.DT224   NZ@C.LYS62    aas=9 FTRLNKLRR C.PHE42,C.THR46,C.ARG49,C.LEU55,C.ASN59,C.LYS62,C.LEU63,C.ARG102,C.ARG105 ...,...,...,...,h..,h..,...,...,...
  17 B.DG225  d=2.45 OP2@B.DG225   OG1@C.THR46   aas=8 DFSTTRRR C.ASP41,C.PHE42,C.SER43,C.THR45,C.THR46,C.ARG49,C.ARG102,C.ARG105 ...,...,h..,...,h..,hp.,...,..s
  18 B.DC226  d=3.22 N4@B.DC226    OG1@C.THR45   aas=3 TKR C.THR45,C.LYS104,C.ARG105 h..,...,h..
  19 B.DA227  d=3.71 OP1@B.DA227   OG@C.SER107   aas=2 TS C.THR45,C.SER107 ...,...
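
The H-bond tables in the sample output above (phosphate, sugar, and base) share one row layout, so they can be read with a single small routine. The following Python sketch is one way to do that. The names parse_hbond_row and salt_bridges are hypothetical helpers, not part of SNAP, and the six-column layout is assumed from the 1oct listing only.

```python
def parse_hbond_row(line):
    """Parse a row such as
    '1  1oct  OP1@A.DT203  NH2@C.ARG20  2.74 po4:sidechain:salt-bridge'.

    Returns None for headers, separators, and rows of other tables.
    """
    f = line.split()
    if len(f) != 6 or not f[0].isdigit():
        return None
    if "@" not in f[2] or "@" not in f[3]:
        return None
    try:
        distance = float(f[4])
    except ValueError:
        return None
    nt_atom, nt = f[2].split("@", 1)
    aa_atom, aa = f[3].split("@", 1)
    return {
        "index": int(f[0]),
        "id": f[1],
        "nt_atom": nt_atom,
        "nt": nt,
        "aa_atom": aa_atom,
        "aa": aa,
        "distance": distance,
        "tags": f[5].split(":"),  # e.g. ['po4', 'sidechain', 'salt-bridge']
    }


def salt_bridges(lines):
    """Return the parsed rows whose type field includes 'salt-bridge'."""
    rows = (parse_hbond_row(line) for line in lines)
    return [r for r in rows if r is not None and "salt-bridge" in r["tags"]]
```

Feeding the lines of 1oct.out through salt_bridges would pick out rows such as the OP1@A.DT203 to NH2@C.ARG20 contact, while rows from the other tables are skipped because they do not have the same six columns.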

FAQs / How do I cite DSSR?

« on: January 22, 2014, 11:14:55 pm »

The following DSSR article, which recently appeared in Nucleic Acids Res., should be used as the primary citation:

X.J. Lu, H.J. Bussemaker & W.K. Olson (2015). "DSSR: an integrated software tool for dissecting the spatial structure of RNA." Nucleic Acids Res. 43(21), e142 (doi:10.1093/nar/gkv716).

Optionally, you may also want to mention one of the two 3DNA papers:

X.J. Lu & W.K. Olson (2008). "3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures." Nat. Protoc. 3(7), 1213-1227.
X.J. Lu & W.K. Olson (2003). "3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures." Nucleic Acids Res. 31(17), 5108-5121.

RNA structures (DSSR) / Request for comments on DSSR output before v1.0 release

« on: January 10, 2014, 03:40:26 pm »

Greetings, DSSR users!

It has been over ten months since DSSR beta-r01-on-20130303 was first made public. Over that time, thanks to the feedback of many enthusiastic users, DSSR has been continuously developed and improved. I believe the software is now solid enough for v1.0, to be released in the near future.

If you have any comments on DSSR in general, or on its output style in particular, please do let me know. I will try to accommodate as much of your feedback as practical in the DSSR v1.0 release.

Best regards,

Xiang-Jun

RNA structures (DSSR) / A bug with missing right-side type bulges

« on: December 25, 2013, 07:29:36 pm »

This thread replaces the original one titled "an issue with bulges" started by febos. Somehow, the URL of the original thread does not reflect the topic, and trying to fix the issue leads to a broken link when clicking directly on the topic subject. [Note added on 2014-01-10: the broken URL bug has been fixed.]

The original thread with three posts is still kept for record, and this new thread is a direct copy of the original posts.

Xiang-Jun

RNA structures (DSSR) / DSSR release history

« on: October 31, 2013, 11:24:21 am »

As the list is becoming quite long, for easy reference, I have split up the DSSR release notes from the main post "DSSR: Dissecting the Spatial Structure of RNA".

V2.x releases

DSSR may be licensed from the Columbia Technology Ventures (CTV) for academic or commercial use. Licensing revenue helps ensure the long-term sustainability of the DSSR project.
As of v2.2, DSSR has completely superseded 3DNA. Classic 3DNA features served via two dozen core and utility programs have been integrated into one DSSR program, all under an easy-to-use and consistent interface.

V1.x releases

v1.9.10-2020apr23 -- added the nt_type field (with values "DNA", "RNA", or "unknown") to the nts array of JSON output, plus extensive code refactoring and feature revisions on DSSR-PyMOL schematics.
v1.9.9-2020feb06 -- Added the --nt-mapping option so users can specify how modified nucleotides are mapped to their canonical counterparts. Please refer to the thread "modified nucleotides incorrect" initiated by tctcab on the 3DNA Forum. This update also contains many refinements at the DSSR-PyMOL interface for producing the characteristic block schematics. See http://skmatic.x3dna.org.
v1.9.8-2019oct16 -- Added the --g4-onz option for ONZ classifications of G-tetrads in intramolecular G-quadruplexes, plus minor code refinements.
v1.9.7-2019oct01 -- Fixed a bug in the identification of junction loops in special cases (as in PDB entry: 4wsm) -- thanks to lijun for reporting.
v1.9.6-2019sep16 -- Revised the --get-hbond option, plus miscellaneous code/manual refinements.
v1.9.5-2019aug01 -- Checked the compatibility of the --symm option with an NMR-ensemble-like input file where only the first model is handled; enhanced features on the analysis and output of G-quadruplexes; revised identification of duplex-G4 junctions (e.g., in 6r9k/6r9l).
v1.9.4-2019jul08 -- Revised criteria for Watson-Crick pairs and improved identification of loops in rare cases. Updated the DSSR User Manual, which is now at 108 pages. Miscellaneous code refinements.
v1.9.3-2019may25 -- Refined the code for command-line options; Added --block-opt (or --block-option) as an alternative for --block-file.
v1.9.2-2019may06 -- Revised the algorithm for the alignment of G-tetrads along a G-helix. Now the left-handed G-quadruplex 6fq2 has a Twist angle of -25.8 degrees; Plus other minor code refinements.
v1.9.1-2019apr06 -- Revised the --analyze-cehs option so that for non-WC structures it gives consistent results with the cehs program in 3DNA v2.4; added the output of a set of six "Simple helical parameters based on consecutive C1'-C1' vectors" with the --analyze option.
v1.9.0-2019mar26 -- Added the --analyze option to output a list of key structural parameters as those from the 3DNA v2.x ‘analyze’ program. This option can also be specified as --3dna-v2, and it contains variations to fit other potential needs.

v1.8.9-2019mar09 -- Revised algorithms for the identification of modified nucleotides, multiplets, and loops, in edge cases; expanded the definition of ~rHoogsteen pairs (e.g., G2-G12 in PDB entry 6ac7).
v1.8.8-2019feb18 -- Relaxed criteria for reverse Hoogsteen (rHoogsteen) pairs; revised the algorithm for base identification in special cases.
v1.8.7-2019feb11 -- Refined code and fixed a couple of bugs in rare occasions.
v1.8.6-2019feb03 -- Refined the algorithm for H-bonds, plus documentation improvements and code refactoring.
v1.8.5-2018nov29 -- Fixed the bug of not escaping backslash and double quote in DSSR JSON output.
v1.8.4-2018nov12 -- (1) Further refinements of G-tetrad reference frame, leading to slightly revised twist angle. (2) Updated examples in command-line help text. (3) A few other minor revisions.
v1.8.3-2018oct29 -- (1) Replaced raw C1' atomic coordinates with least-squares-fitted ones in the definition of G-tetrad reference frames. This switch ameliorates experimental uncertainty of C1' atomic coordinates, and is in line with the usage of the origins of G reference frames. As a result, helical parameters (Twist, Rise etc.) are slightly different from previously reported values. (2) Added a descriptive note for G4 stems that incorporates common names (including chair, basket, hybrid) and strand directionality, as in note=basket(2+2) for PDB id 2kf8. (3) Introduced a squared G-tetrad block of size 11.6 Å for visualization. (4) Added the --pair-only option to output just base-pairing info, which is 10 times faster than a default DSSR run. (5) Tightened the criteria of G-U Wobble pair. Now the U2586-G2592 pair in PDB id 1s72 is named ~Wobble. The pair has a Shear in the opposite direction of a normal Wobble pair, with two completely different H-bonds: N3(imino)*N2(amino)[3.05],O4(carbonyl)-N1(imino)[2.77]
v1.8.2-2018oct20 -- Added detection of V-loops in G-quadruplexes, plus miscellaneous improvements.
v1.8.1-2018oct09 -- Further refinements and expansions on the characterization of G-quadruplexes. (1) Added support of a consistent topological nomenclature for canonical G4 structures; (2) revised the algorithm for identifying G4 loops to allow for 0-nt propeller loop as in PDB entry 2m53; (3) recovered the missing G6+G10 pair in a distorted G-tetrad in PDB entry 148d; (4) implemented full JSON output of G4 structural features. DSSR is now full-fledged for the analysis and annotation of G-quadruplexes.
v1.8.0-2018sep18 -- Significantly improved the characterization of G-quadruplexes, (1) revised the algorithm for the calculation of G-tetrad step parameters (twist/rise); (2) new features for the assignment of groove widths (medium, narrow, or wide), classification of stacking interactions based on the two faces of G-tetrads, and categorization of higher-order associations (coaxial stacking). Other refinements related to the identification of base-pairs and multiplets.
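
As a side note on the v1.8.5 fix listed above: any tool that writes JSON strings must escape backslashes and double quotes. The snippet below is a minimal sketch of that idea in shell, not DSSR's actual code. The helper name json_escape is hypothetical, and it handles only the two characters named in the release note; control characters, which JSON also requires to be escaped, are not covered here.

```shell
# json_escape (hypothetical helper): escape backslash and double quote so
# that the text can be placed between the quotes of a JSON string.
# The backslash must be handled first; otherwise the backslashes added
# for the quotes would be doubled again.
json_escape() {
  printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Example: wrap an escaped value in a small JSON object.
printf '{"note": "%s"}\n' "$(json_escape 'say "hi" \ bye')"
```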

v1.7.9-2018sep06 -- sped up further the analysis of NMR ensembles or MD trajectories; revised algorithms for identifying base pairs and multiplets in special cases; improved the mmCIF parser; plus minor code/manual refinements.
v1.7.8-2018sep01 -- classified G-tetrads by different types of non-planarity; sped up analyses of large ensembles (--nmr) such as the trajectories from MD simulations; introduced the "Linker" G+A base-pair name/type; revised the algorithm for H-bonding identification, plus numerous other minor code refactoring and refinements.
v1.7.7-2018apr20 -- revised detection of multiplets in edge cases; miscellaneous minor refinements.
v1.7.6-2018mar22 -- significantly refactored code for running multiple instances of DSSR in parallel; introduced the --auxfile=no option to bypass the generation of the auxiliary files; added three FAQ entries and a section on DSSR integrations to the manual.
v1.7.5-2018mar19 -- revised code associated with the --blocview option for generating cartoon-block schematic images using PyMOL, in the most extended view; minor bug fixes and code improvements.
v1.7.4-2018jan30 -- revised the algorithm for H-bond detection, plus other minor code/manual improvements.
v1.7.3-2017dec26 -- revised the JSON output of model/chain keys for consistency; incorporated abasic sites into analysis by default; revised code to avoid warning messages with GCC v7.
v1.7.2-2017nov20 -- fixed a bug with abasic sites (as in PDB entry 4ifd, with --abasic), and a bug in the listing of modified nucleotides (as in PDB entry 2c4z in its biological assembly, with --symm); a variety of minor enhancements of source code and the documentation.
v1.7.1-2017nov01 -- fixed a bug in the analysis of G4-quadruplexes in rare cases; revised the characterization of i-motifs; checked this release on all nucleic-acid-containing structures in the PDB.
v1.7.0-2017oct19 -- added a module to automatically identify and fully characterize G-quadruplexes, plus numerous other internal code refinements.

v1.6.9-2017aug09 -- refined algorithms for identifying H-bonds and base pairs for boundary cases, plus miscellaneous code refactoring and revisions.
v1.6.8-2017mar28 -- fixed a bug in loop identification in edge cases (as in 4fe5); expanded standard bases to include A5 and A3 etc; fixed the 1-unit shift bug in the concatenated backbone suite string; fixed a bug in chain-specific DBN (as in 5pky); revised base-pair criteria to include C.U4--D.U4 pair in 1zh5; extended criteria for type=X A-minor-like motifs (U49 to U20--A76 in 4fe5); checked for output of non-stacked bases independently of stacked ones (as in 4rts); miscellaneous code/manual refinements.
v1.6.7-2017mar14 -- fixed a bug in derived dot-bracket notation (DBN) with pseudoknots in special cases; revised --json to work with --get-hbonds for a full list of all H-bonds (DSSR and SNAP); changed type=O to type=X for eXtended A-minor motifs to avoid confusion with the previously documented type 0; lowered the default angle for splayed-apart dinucleotides from 100 degrees to 85; miscellaneous code/documentation refinements. 
v1.6.6-2017feb20 -- extended A-minor motifs to include a miscellaneous type other than I and II; added groove widths to helices/stems and sequential number to nts of the JSON output; updated the user manual.
v1.6.5-2017jan22 -- revised detection of pairs and helices/stems in rare cases; miscellaneous minor refinements.
v1.6.4-2016nov19 -- refined detection of multiplets; added the characterization of terminal bases of helices/stems in the 'Summary' section; revised Jmol-DSSR web interface.
v1.6.3-2016oct19 -- added a new section of splayed-apart dinucleotides and larger units; plus miscellaneous code refinements and minor bug fixes.
v1.6.2-2016sep19 -- refined the algorithm for identifying kink-turns (K-turns), among many other internal improvements.
v1.6.1-2016aug22 -- added the identification and characterization of i-motifs (e.g., 1a83 and 2n89); refined algorithms for the identification of H-bonds and helices; miscellaneous code refactoring.
v1.6.0-2016aug06 -- added the --pair-list option to allow for user customizations of base pairs to be analyzed; added an analysis of the global curvature for each nucleic acid chain; plus various code refinements.

v1.5.9-2016jul22 -- further refinements of the algorithm for H-bonding detection.
v1.5.8-2016jul09 -- added a 'summary' line for each loop, plus miscellaneous code refinements.
v1.5.7-2016jun16 -- refined the algorithm for H-bonding detection in corner cases (such as G-tetrads with poor geometry).
v1.5.6-2016jun09 -- revised the summary line of DNA/RNA chains when multiple models are involved; consolidated/extended cartoon-block related functionality; internal code refinements and minor bug fixes.
v1.5.5-2016may25 -- added the --view option (and related variants) to reset a structure via the principal moments of inertia, as in rotate_mol of 3DNA. The output orientation is in the most extended form, vertically; the --blocview (or --block-view, --cartoon-block-view) option is also accepted to mimic the 3DNA blocview script; simplified the command-line help/example message, and revised the user manual accordingly.
v1.5.4-2016may16 -- significantly refined and extended the --frame option to reorient a structure based on a selected reference frame, including the middle frame of two base-pair frames, as in frame_mol of 3DNA; added output of the suite string in dssr-torsions.txt; minor bug fixes and refinements; updated user manual.
v1.5.3-2016apr11 -- derived a set of virtual torsion angles using the phosphorus atoms and base origins (see output file dssr-torsions.txt). This set of P-base virtual torsions was first implemented in analyze -torsion of 3DNA v2.1, released in early 2012. See my blogpost titled "Pseudo-torsions to simplify the representation of DNA/RNA backbone conformation" (dated 2012-04-22) for details. Moreover, functions related to the --block-color option have been refined.
v1.5.2-2016apr02 -- added the --block-color option to facilitate flexible color customizations of blocks/edges (e.g. minor groove); expanded the definition of junction loops to include the special case of a kissing loop motif mediated by an isolated canonical pair (e.g., 1ehz); various minor internal refinements.
v1.5.1-2016mar11 -- miscellaneous code refinements and function enhancements.
v1.5.0-2016feb12 -- removed the obsolete --jmol option since the DSSR-Jmol integration is now better served via JSON; added more styles in the cartoon-block representation.

v1.4.9-2016jan25 -- fixed inconsistency in the dot-bracket-notation (dbn) output section regarding chain names with more than 1-char (as for 1vy6 in .cif format) -- thanks to Eugene for reporting the bug; refined .r3d output of base blocks for PyMOL rendering, following feedback from Thomas Holder.
v1.4.8-2016jan16 -- refined the definition of extended base-pair names ("~Wobble", "~Hoogsteen", "~rHoogsteen", and "~Shear"); fixed a bug in the identification of G quartets in rare cases.
v1.4.7-2016jan06 -- extended definition of base-pair names, with "~Wobble", "~Hoogsteen", "~rHoogsteen", and "~Shear" for corresponding pairs with similar geometry but sequences other than G–U, A+U, A–U, G–A, respectively.
v1.4.6-2015dec16 -- refined detection of H-bonds, base-pairs, multiplets, and helices/stems for boundary cases.
v1.4.5-2015nov23 -- added the --nar-paper option for rigorously reproducing the results reported in the 2015 DSSR Nucleic Acids Research paper; miscellaneous code refinements.
v1.4.4-2015nov18 -- refined detection of base-pairs and multiplets in boundary cases; made the output of base-capping interactions by default (i.e., the --more option is no longer needed to be specified for its output).
v1.4.3-2015oct23 -- added detection of metallo-base pairs, such as T-Hg-T (4l24) and C-Ag-C (5ay2) from the work of Kondo et al.
v1.4.2-2015oct19 -- revised code for circular DNA or RNA molecules and unconventional glycosidic linkages (such as C1'–C1 for DY).
v1.4.1-2015oct12 -- checked for potential erroneous usage of option --symmetry with an NMR ensemble (which leads to DSSR effectively taking the models all together); revised output width of id strings.
v1.4.0-2015oct10 -- introduced the --nmr option to facilitate processing of NMR ensembles or trajectories of molecular dynamics simulations; added a new section summarizing structural features per nucleotide. Up to this point, DSSR contains all the fundamental features I have had in mind!

v1.3.9-2015oct08 -- simplified diagnostic message, and refactored code.
v1.3.8-2015oct02 -- added option --cartoon-block, plus a few minor refinements.
v1.3.7-2015oct01 -- added option --symmetry (short form: --symm) to take symmetry-related MODEL/ENDMDL ensemble as a whole. This option is useful for x-ray crystal structures where the asymmetric unit is 'half' of the biological unit (e.g., PDB id: 467d -- x3dna-dssr -i=467d.pdb1 --symm).
v1.3.6-2015sep18 -- revised JSON output for better DSSR-Jmol integration (thanks to Dr. Robert Hanson). Specifically, a "metadata" property is introduced to collect miscellaneous information, thus simplifying the top-level name space. Moreover, "ntParams" is renamed "nts" for consistency.
v1.3.5-2015sep09 -- bug fixes for edge cases in JSON output, based on tests against all nucleic acid structures in PDB; minor code refactoring. The DSSR JSON output has now reached a stable, useable state.
v1.3.4-2015sep06 -- fixed a bug in parsing .cif files as in PDB entry 5aj0 (thanks to Eugene for reporting the issue).
v1.3.3-2015sep03 -- added output of reference frames of bases and base-pairs in JSON, exposing more of DSSR's functionality to other third-party tools (e.g., for visualization).
v1.3.2-2015sep02 -- introduced a new set of "simple" base-pair (bp) parameters that are more intuitive for non-canonical bps. The simple parameters, including Shear, Stretch, Buckle and Propeller, are for structural description only, not suitable for model rebuilding. The non-planarity bp parameters, Buckle and Propeller in particular, have recently received attention in the RNA structure community. This simple set of bp parameters is provided to make DSSR more readily accessible to X-ray crystallographers or cryo-EM practitioners. The new parameters complement, but do not replace, the original six rigid-body bp parameters for rigorous description and exact rebuilding of nucleic acid structures.

When the --more option is specified, the new parameters are available in the main output file, taking an extra line for each bp. For --non-pair, the inter-base angle and minimum distance between base atoms are also listed. The new additions break backward compatibility of the main output file; use the --nar-paper option to stay with DSSR v1.2, as reported in the NAR article. Better yet, users are strongly encouraged to switch to the JSON output format for better connection with DSSR.
v1.3.1-2015aug29 -- revised tag names for the --json output based on feedback from Dr. Wilma Olson; along the line, changed the file name dssr-a2bases.pdb to dssr-atom2bases.pdb to make its meaning more explicit for atom over base capping interactions.
v1.3.0-2015aug27 -- added option --json to collect all DSSR-derived structural features in the standard JSON data exchange format. This single JSON file makes DSSR results easily parsable, allowing for its seamless integration with other RNA bioinformatics tools. Plus various other minor refinements.

v1.2.9-2015jul25 -- added a new section on Reproducing results published in the DSSR-NAR paper to the 3DNA Forum, and documented corresponding auxiliary options in the User Manual.
v1.2.8-2015jun15 -- added a new section titled "Additional options" to the User Manual; refined the algorithm for hydration identification.
v1.2.7-2015jun09 -- added documentation of two related options: --prefix to customize the prefix of DSSR auxiliary files, and --cleanup to remove those files; other minor changes.
v1.2.6-2015mar28 -- revised the interpretation of DSSR to "Dissecting the Spatial Structure of RNA"; added a new option --loop-isolated-pair to exclude isolated canonical pairs in delineating loops; updated the user manual (now 60 pages).
v1.2.5-2015mar19 -- revised the helix/stem detection algorithm for circular DNA/RNA structures.
v1.2.4-2015mar03 -- refined the algorithm for assigning dot-bracket notation (dbn) in rare cases, plus miscellaneous minor improvements.
v1.2.3-2015feb18 -- improved the identification of multiplets.
v1.2.2-2015feb06 -- refined the algorithm for assigning helices, plus several code refactoring and enhancements.
v1.2.1-2015feb01 -- added the functionality for the "characterization of H-type pseudoknots"; refined the underlying algorithms for identifying H-bonds and base pairs.
v1.2.0-2015jan01 -- numerous code refinements and refactoring; added classification of base-stacking interactions (option --non-pair); introduced the helix index that an isolated canonical pair is part of; included metal coordination bonds that phosphate OP1/OP2 atoms are involved in (option --po4); replaced option --long-idstr with --idstr; made option --nested explicit; added a new section on base stacks.

v1.1.10-2014nov04 -- refined the algorithm for identifying multiplets; expanded option --non-pair to include all non-pairing interactions; various minor improvements.
v1.1.9-2014oct22 -- refined H-bond identification and significantly fine-tuned base pair classification; updated the user manual.
v1.1.8-2014oct09 -- fixed a bug in assigning G+A pairs to Saenger type X (10); further refined algorithms for finding H-bonds and base pairs.
v1.1.7-2014sep24 -- added auxiliary files (where available) dssr-bulges.pdb, dssr-iloops.pdb and dssr-junctions.pdb to parallel dssr-hairpins.pdb; further refined the algorithm for H-bond detection; significantly improved the User Manual -- by switching to LaTeX, all the hyperlinks and cross references are active, and the excerpted output listings in the manual are auto-synced with the latest DSSR release via a Ruby script.
v1.1.6-2014sep09 -- refined algorithms for detecting H-bonding and base-stacking interactions; removed the 4-line header in .bpseq output; checked for chain breaks in pseudoknot report; various code polishing.
v1.1.5-2014aug28 -- added the section of atom-base stacking interactions (as in the case of the phosphate OP2 atom of G57 capping over the uracil ring of PSU55 in tRNA "1ehz"); significantly sped up the --non-pair and --phosphate options; miscellaneous code refinements.
v1.1.4-2014aug09 -- added the option --nest to remove pseudo-knots (if any), leaving only nested pairs; added a new section to list modified nucleotides; numerous minor refinements.
v1.1.3-2014jun18 -- refined the algorithm for deriving dot-bracket notation (.dbn) in RNA structures with higher-order pseudo knots (thanks to Jan Hajic); added secondary structure output in .bpseq format to parallel .ct and .dbn; miscellaneous code improvements.
v1.1.2-2014apr19 -- added the option --torsion360 to output (virtual) torsional angle in the range of [0, 360] instead of the default range [-180, +180], following Cathy Lawson's suggestion; renamed "00-n/a" to "n/a" for unclassified Saenger pairs, plus a few other refinements of the User Manual based on feedback from Pascal Auffinger; revised A-minor separator character from '/' to '|' (i.e., from 'A/G-C' to 'A|G-C') based on communications with Bob Hanson.
v1.1.1-2014apr11 -- added the option --get-hbonds to find and output all H-bonds in a structure; renamed file ‘dssr-torsions.dat’ to ‘dssr-torsions.txt’; updated the User Manual (50 pages)
v1.1.0-2014apr09 -- denoted unnamed bps as '--' for easy parsing (thanks to feedback from Dr. Robert Hanson); added helical radius info for helices/stems, and made the helical rise parameter explicit (thanks to Dr. Wilma Olson); changed '_pars' to '-pars' in the output file for consistency; upgraded DSSR to v1.1.0-2014apr09 due to format changes.
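
For readers who already have torsion values in the default range, the conversion that the --torsion360 option (v1.1.2 above) refers to is simple arithmetic: a negative angle gains 360 degrees, and everything else is unchanged. The snippet below is a minimal sketch of that mapping, not DSSR's own code; the helper name to360 is hypothetical, and it assumes the input already lies in [-180, +180].

```shell
# to360 (hypothetical helper): map an angle in degrees from the range
# [-180, +180] to the range [0, 360) by adding 360 to negative values.
to360() {
  awk -v a="$1" 'BEGIN { if (a < 0) a += 360; printf "%.1f\n", a }'
}

# Example with one negative and one positive torsion angle.
to360 -64.2
to360 164.2
```

Going the other way (subtracting 360 from values above 180) recovers the default representation, so no information is lost in either form.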

v1.0.6-2014apr04 -- revised the algorithm for detecting kissing loops, plus other minor refinements; updated the User Manual accordingly.
v1.0.5-2014mar24 -- removed the --note option which has become redundant with notes in the main output file and the DSSR User Manual; shortened output from the --help option by deleting the 'Summary' section; minor code refinements. Added the overlooked subsection "Orientation of helices/stems" and fixed a few typos and inconsistencies in the User Manual (48 pages).
v1.0.4-2014mar19 -- minor updates on notes in the main output file to synchronize with a significantly improved DSSR User Manual (46 pages) based on feedback from Dr. Wilma Olson.
v1.0.3-2014mar09 -- various improvements for consistency, and finally and most importantly, the DSSR User Manual (45 pages) is out!
v1.0.2-2014feb16 -- numerous minor refinements, mostly as a result of writing up the DSSR User Manual (coming soon!).
v1.0.1-2014jan31 -- Refined the algorithm for detecting multiplets [thanks to Eugene]; fixed a bug for handling circular DNA/RNA structures [thanks to Pascal]; plus consistency improvements.
v1.0-2014jan25 -- The program is robust and mature enough to warrant a v1.0 release. While DSSR will be continuously refined, top priority will be on bug fixes. Wherever practical, future DSSR v1.x releases will remain backward compatible.

Beta-testing releases

beta-r30-on-20140118 -- considerably improved annotation and consistency of the main DSSR output file with help from Dr. Wilma Olson, refined algorithms for detecting internal and junction loops, added output of secondary structures in the connect (.ct) format and the extended DBN notation to allow for multiple molecules or fragments, removed the --break-symbol option. This will be the last beta release, and shortly we will move to DSSR v1.0! Please give it a try and let me know anything you'd like to change!
beta-r29-on-20140106 -- significantly improved the algorithms for detecting various loops (hairpin, bulge, internal or junction loops), covering many corner cases. List of nucleotides in loops and single-stranded fragments are now presented consistently. Plus many code refinements.
beta-r28-on-20131225 -- fixed a bug for missing 0-by-N type (right-side) bulge (thanks to Eugene).
beta-r27-on-20131203 -- added the missing bracket ([) to delineate base-pair parameters in detailed output for helices/stems (thanks to Eugene).
beta-r26-on-20131128 -- code refinements and refactoring, plus minor bug fixes. Single-stranded fragments now refer to nucleotides not involved in loops or canonical base-pairs (Watson-Crick and G-U wobble); simplified idstr for the non-standard compliant yet commonly encountered PDB files (mostly from MD simulations) with no chain id specified, from e.g. _.G1 to G1.
beta-r25-on-20131119 -- added option --break-symbol to delineate chain breaks in dot-bracket notation; listed terminal single-stranded segments; plus code refinements/refactoring.
beta-r24-on-20131030 -- refined code for nucleotide characterization; annotated the Levitt pair.
beta-r23-on-20130918 -- bug fixes on U-turn identification and missing base pairs.
beta-r22-on-20130910 -- minor bug fixes and code refinements; released the DSSR web-interface.
beta-r21-on-20130903 -- fixed a rare bug reported by Pascal, and refined the mmCIF parsing code.
beta-r20-on-20130830 -- added option --u-turn to detect UNR- or GNRA-type U-turns, plus numerous code refinements.
beta-r19-on-20130819 -- added option --po4 (--phosphate) to list H-bonds involving phosphate groups; removed the segid info from nucleotide id-string by default; refined code internally and fixed minor bugs.
beta-r18-on-20130801 -- added support for the mmCIF format; refined code for parsing the PDB format.
beta-r17-on-20130723 -- assigned backbone suite names (in file "dssr-torsion.dat") following Richardson et al. "RNA backbone: consensus all-angle conformers and modular string nomenclature"; numerous code refinements and note revisions.
beta-r16-on-20130709 -- classified each dinucleotide step into A-, B- or Z-form conformation; simplified output by default. Users are advised to upgrade to this release.
beta-r15-on-20130703 -- added output of base morphology parameters for each identified helix/stem.
beta-r14-on-20130626 -- auto-detection of 'canonical' kink-turns (k-turns) and reverse k-turns (see my post "DSSR identifies kink-turns!"); numerous code refinements.
beta-r13-on-20130618 -- added the PDB segment identifier (segid) into nucleotide id string; refined the algorithm for finding A-minor motifs.
beta-r12-on-20130610 -- delineated the components of bulges, internal loops, and junctions, per user request.
beta-r11-on-20130603 -- refined the descriptive note with the help of Dr. Wilma Olson; added the --long-idstr option to explicitly delineate fields of a residue id string for easy machine parsing; added the --pucker option to output the sugar pucker following either Altona & Sundaralingam (1972) or Westhof & Sundaralingam (1983) -- see the post "Two slightly different definitions of sugar pucker".
beta-r10-on-20130430 -- added a brief descriptive note and a list of generated files to the main DSSR output; revised the command-line --help with more detailed usage info; improved output format, and refined code. Now DSSR is not only self-contained, but also (at least should be) self-explanatory.
beta-r09-on-20130421 -- added a least-squares fitted helical axis for each identified helix/stem; classified the backbone into BI/BII conformations and the sugar into C2'/C3'-endo like (see file 'dssr-torsions.dat'); checked for non-pairing interactions (H-bonds or base stacking) with option '-non-pair'; refined code and revised output format
beta-r08-on-20130323 -- refined algorithm for multiplet detection, revised the header section to output the numbers of DNA/RNA chains, nucleotides, waters, and metals
beta-r07-on-20130322 -- code refinements, minor bug fixes, and more extensive tests
beta-r06-on-20130319 -- fixed the "segmentation fault" bug reported by MarcParisien for PDB entry 2a64; revised -h message
beta-r05-on-20130316 -- detection of ribose zippers; revision of help message; code refactoring
beta-r04-on-20130314 -- detection of kissing loops; output format revisions, including an explicit listing of all nucleotides involved in hairpin loops; internal bug fixes and refinements
beta-r03-on-20130309 -- extensive tests against all RNA/DNA-containing structures in the PDB as of March 2013, bug fixes and refinements
beta-r02-on-20130306 -- bug fixes, and internal improvements
beta-r01-on-20130303 -- initial release

Bug reports / MOVED: dssr issue (for modified pdb files)

« on: September 03, 2013, 11:36:11 am »

This topic has been moved to RNA structures (DSSR).

http://forum.x3dna.org/index.php?topic=390.0

RNA structures (DSSR) / Further note on DSSR

« on: April 25, 2013, 01:13:56 pm »

Mainly prompted by questions from Pascal (who has contributed the most posts among 3DNA users), here is a further note on DSSR.

Quote

It [DSSR] looks like a combined version of find_pair and analyze. Is that correct ?
Of course it seems not possible to (re)construct NA structures with DSSR.

Yes, to a certain extent, you can think of DSSR as a combination of find_pair and analyze. The post "DSSR, what's it and why bother?" provides more background information. You are right, DSSR does not construct nucleic acid structures.

DSSR represents my (opinionated) view of what a program for the structural analysis of nucleic acids (RNA in particular) should/could be, based on my extensive experience in supporting 3DNA, an increased knowledge in RNA structures and refined skills in C programming.

Quote

So first, why calling it DSSR and not DSSNA since it works also for DNA ?
I think that one should avoid the RNA domination, it is possible to learn from both structures.
thus, does DSSR really work for DNA ?

Again, read carefully the post "DSSR, what's it and why bother?" for my rationale. You may also notice that I put the word secondary in parenthesis in the title of the software, "DSSR: Software for Defining the (Secondary) Structures of RNA". DSSR surely works for DNA, or DNA-protein complexes in the same way as it does for RNA. As mentioned in the release note, I tested DSSR against every nucleic-acid-containing structure in the PDB. Overall, the acronym DSSR captures the essential message I'd like to get across, it is short, and it parallels the well-respected DSSP program for proteins (among other things).

Quote

Then, as for formats,
I think that as I mentioned it somewhere earlier, and since I am processing the output files
for a large number of structures, I appreciate when there are spaces between fields (see).

Code: [Select]

      base_id            alpha    beta   gamma   delta  epsilon   zeta     e-z        chi            phase-angle   sugar-type     Zp      Dp
 1     A.C2649            ---    167.1    47.6    84.1  -146.6   -77.1    -69(BI)   -160.5(anti)    12.9(C3'-endo)  ~C3'-endo    4.41    4.66
 2     A.U2650           -64.2   164.2    60.3    79.8  -154.5   -73.1    -81(BI)   -167.2(anti)    21.3(C3'-endo)  ~C3'-endo    4.40    4.55

I see your point, but the purpose of the output file is mainly for visual examination by a non-expert user. The message appears to be succinct. Your parser should be flexible enough to handle the case. Also see my reply to your initial thread.
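For readers writing their own parser, here is a minimal Python sketch of one way to handle the torsion lines quoted above. It is not part of DSSR. The function name `parse_torsion_line` and the `COLUMNS` list are my own illustrative choices, and the sketch assumes that every data line carries the same twelve columns as in the excerpt, separated by whitespace, with an optional label in parentheses glued to the value.

```python
import re

# Column names copied from the header line quoted above (assumed fixed).
COLUMNS = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "e-z",
           "chi", "phase-angle", "sugar-type", "Zp", "Dp"]

# A field is either a bare token or "value(label)", e.g. "-69(BI)".
FIELD = re.compile(r"^(?P<value>[^()]+?)(?:\((?P<label>[^()]*)\))?$")


def parse_torsion_line(line):
    """Return a dict with the serial number, base id and (value, label) pairs."""
    tokens = line.split()
    record = {"serial": int(tokens[0]), "base_id": tokens[1]}
    for name, token in zip(COLUMNS, tokens[2:]):
        match = FIELD.match(token)
        if match is None:
            # Unexpected layout: keep the raw token rather than guess.
            record[name] = (token, None)
            continue
        value, label = match.group("value"), match.group("label")
        try:
            value = float(value)
        except ValueError:
            # "---" marks a missing angle; other text (e.g. "~C3'-endo") is kept.
            value = None if value == "---" else value
        record[name] = (value, label)
    return record
```

Splitting on whitespace first and peeling off the parenthesized label second means the missing space between a value and its label does not matter to the parser.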

Quote

and is there a need for writing twice the sugar pucker in this file ?

From my experience, the phase angle and pucker classification are the most useful information for the sugar moiety. I repeated the sugar pucker together with commonly used backbone parameters for convenience; one can now easily see the backbone conformation at a glance.

Quote

you name this file torsion although there are sugar puckers in it.
Thus it might be called torsion_puckers.dat or something else.

I see your point, but the file also contains Zp and Dp, and pseudo torsion angles. I'd keep the name as is; it is just a convention to get used to.

Quote

For the non-pairing interactions that is just a great feature,

you had before two values for base overlap
one calculated by just using ring atoms the other by using all base atoms.

you could add this.

DSSR checks base-stacking interactions using all base atoms, and the reported base-overlap area is calculated the same way. I will consider adding overlap areas based on just ring atoms.

Quote

Why adding the name of the chemical groups (hydroxyl, amino, imino, ...)
again this complicates reading since some groups are named and others not like OP2 and so on.

I would appreciate another presentation here.

I added the names of chemical groups (hydroxyl/amino/imino) for the convenience of those who are not that familiar with the chemistry of H-bonds. I have first-hand experience with such people (mostly physicists, mathematicians, and computer scientists turned bioinformaticians). I can add an option to turn the chemical group names off; but honestly, I really think you should revise your parser to handle them properly.

Take the following case as an example:

Code: [Select]

H-bonds[2]: "N3(imino)-N1[2.81]; O4(carbonyl)-N6(amino)[3.13]"

If your parser can extract the distance and the PDB atom names, it won't be that far to check for () and get rid of the names of the chemical groups.
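As a rough illustration of the point about parentheses, the following Python sketch pulls atom names and distances out of an H-bond string and simply skips any chemical group name. It is only a sketch, not DSSR code. The name `parse_hbonds` is hypothetical, and the pattern assumes what the examples in this thread show: atom names made of letters, digits and primes, a `-` or `*` between the two atoms, and a decimal distance in square brackets.

```python
import re

HBOND = re.compile(
    r"(?P<atom1>[A-Za-z0-9']+)(?:\([^)]*\))?"  # first atom, optional (group)
    r"[-*]"                                    # separator seen in the output
    r"(?P<atom2>[A-Za-z0-9']+)(?:\([^)]*\))?"  # second atom, optional (group)
    r"\[(?P<dist>\d+\.\d+)\]"                  # distance in Angstrom
)


def parse_hbonds(text):
    """Return a list of (atom1, atom2, distance) tuples from an H-bonds field."""
    # Only the part inside double quotes lists the H-bonds; the "H-bonds[2]:"
    # prefix is skipped so that it is not mistaken for an atom pair.
    quoted = re.search(r'"([^"]*)"', text)
    body = quoted.group(1) if quoted else text
    return [(m["atom1"], m["atom2"], float(m["dist"]))
            for m in HBOND.finditer(body)]
```

Because the pattern searches for each atom pair individually, it does not care whether the entries are separated by a semicolon, a comma, or spaces.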

Quote

I haven't really checked, but are your base pair numbering scheme coherent with the one
you use in find_pair ? It would be really nice to be the case.

What do you mean by "base pair numbering scheme"? The serial numbers should not matter; the base pair is specified by the two constituent nucleotides (chain id, residue name and number, etc).
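To make this concrete, here is a small Python sketch of how a script could key base pairs by their two nucleotides instead of by serial number. It is an illustration under assumptions, not DSSR's internal logic. The names `parse_nt` and `pair_key` are hypothetical, and the pattern assumes identifiers of the form model:chainId.name# (with the model part optional) where the residue number is the trailing run of digits. Insertion codes and residue names that themselves end in a digit are not handled.

```python
import re

# Optional "model:", then chain id, a dot, residue name, residue number.
NT_ID = re.compile(
    r"^(?:(?P<model>\d+):)?(?P<chain>[^.:]+)\.(?P<name>.+?)(?P<number>-?\d+)$"
)


def parse_nt(nt_id):
    """Split e.g. 'A.2MG10' into (model, chain, residue name, residue number)."""
    m = NT_ID.match(nt_id)
    if m is None:
        raise ValueError("unrecognized nucleotide id: %r" % nt_id)
    model = int(m["model"]) if m["model"] else None
    return (model, m["chain"], m["name"], int(m["number"]))


def pair_key(nt1, nt2):
    """Order-independent key, so that U8/A14 and A14/U8 compare equal."""
    return frozenset((parse_nt(nt1), parse_nt(nt2)))
```

With such a key, pairs from two different programs can be matched even when their serial numbers or the order of the two nucleotides differ.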

Quote

Also, I wanted to ask you that but know it seems to be done. You add various names
to each base pair. Thats great. Just a hint to the various nomenclatures (Leontis-Westhof, Saenger...)
would be helpful in the *.out files.

Advice taken. I will add a note in DSSR-beta-r10 (coming soon).

Quote

is there a configuration file that would allow to precise hydrogen bond and other parameters like in 3DNA.
I would really appreciate that.

To make DSSR self-contained, I've eliminated the configuration file. Overall, DSSR has refined algorithms for finding H-bonds, base pairs, helices etc, and the defaults should work for the vast majority of cases. So regular users could take DSSR as a black box, and they can check the results based on their domain knowledge and application needs.

DSSR also accepts command-line options to alter the default behavior. For example, you can use --hbond_d2=3.6 to set the upper limit of the H-bond length to 3.6 Å instead of the default 4.0 Å. I am working on a manuscript that describes the details of the software.

HTH,

Xiang-Jun

Feature requests / MOVED: list nucleotide/nucleotide contacts involving a phosphate group.

« on: April 24, 2013, 01:05:52 pm »

This topic has been moved to RNA structures (DSSR).

http://forum.x3dna.org/index.php?topic=363.0

RNA structures (DSSR) / DSSR: Software for Defining the (Secondary) Structures of RNA

« on: March 03, 2013, 10:59:17 pm »

Note added on March 28, 2015: Please visit "DSSR: Dissecting the Spatial Structure of RNA"

As the number of experimentally solved RNA-containing structures grows, it is becoming increasingly important to characterize the geometric features of the molecules consistently and efficiently. Existing RNA bioinformatics tools are fragmented, and suffer in either scope or usability. DSSR, a new 3DNA program for Defining the Secondary Structures of RNA from three-dimensional (3D) coordinates, is designed to streamline the analyses of 3D RNA structures. It consolidates, refines, and significantly extends the functionality of 3DNA for RNA structural analysis.

Starting from an RNA structure in PDB or PDBx/mmCIF format, DSSR employs a set of simple geometric criteria to identify all existent base pairs (bp): either canonical Watson-Crick and wobble pairs or non-canonical pairs with at least one hydrogen bond. The latter pairs may include normal or modified bases, regardless of tautomeric or protonation state. DSSR uses the six standard rigid-body bp parameters (shear, stretch, stagger, propeller, buckle, and opening) to rigorously quantify the spatial disposition of any two interacting bases. Where applicable, the program also denotes a bp by common names, the Saenger classification scheme of 28 H-bonding types, and the Leontis-Westhof nomenclature of 12 basic geometric classes.

DSSR detects multiplets (triplets or higher-order base associations) by searching horizontally in the plane of the associated bp for further H-bonding interactions. The program determines double-helical regions by exploring vertically in the neighborhood of selected bps for base-stacking interactions, regardless of backbone connection (e.g., coaxial stacking of helices). DSSR then identifies hairpin loops, bulges, internal loops, and multi-branch loops (junctions), and recognizes the existence of pseudo-knots. The program outputs RNA secondary structure in dot-bracket notation (dbn) and connect table (.ct) format that can be fed directly into visualization tools (such as VARNA).
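For those who want to post-process the dot-bracket output in a script, a minimal Python sketch of reading it back into a list of paired positions could look like the following. This is not DSSR code, and the function name `dbn_to_pairs` is hypothetical. Matching parentheses mark a pair and dots mark unpaired nucleotides. Treating `[]`, `{}` and `<>` as additional bracket levels for pseudo-knots is an assumption based on common dbn usage.

```python
def dbn_to_pairs(dbn):
    """Return sorted (i, j) pairs, 1-based, from a dot-bracket string."""
    openers = {")": "(", "]": "[", "}": "{", ">": "<"}
    stacks = {opener: [] for opener in openers.values()}
    pairs = []
    for pos, ch in enumerate(dbn, start=1):
        if ch in stacks:
            # Opening bracket: remember its position until the partner shows up.
            stacks[ch].append(pos)
        elif ch in openers:
            stack = stacks[openers[ch]]
            if not stack:
                raise ValueError("unbalanced %r at position %d" % (ch, pos))
            pairs.append((stack.pop(), pos))
        # Any other character (e.g. '.') is treated as unpaired.
    if any(stacks.values()):
        raise ValueError("unclosed opening bracket(s)")
    return sorted(pairs)
```

Applied to the 27-nt string `((((((.....(....)....))))))`, the function returns seven pairs, the outermost being positions 1 and 27. Each bracket type gets its own stack, so crossing pairs written with different brackets do not interfere with each other.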

DSSR classifies dinucleotide steps into the most common A-, B-, or Z-form double helices, calculates commonly used backbone torsion angles, and assigns the consensus RNA backbone suite names. The program also identifies A-minor interactions, ribose zippers, G quartets, kissing loops, U-turns, and kink-turns. Furthermore, it reports non-pairing interactions (H-bonding or base-stacking) between two nucleotides, and contacts involving phosphate groups.

Currently at version 1.2, DSSR is in a stable and mature state. A simple web interface and a comprehensive user manual are available. Supported by Dr. Robert Hanson, DSSR has recently been integrated into Jmol, a popular molecular graphics program. DSSR-related news and information can be found on the 3DNA homepage. Questions and suggestions are always welcome on the 3DNA forum.

Give DSSR a try, compare it with similar tools in terms of usability, functionality and support, and see the differences!

Current version: DSSR v1.2.5-2015mar19. Release history (in reverse chronological order)

List of users who have helped improve DSSR by reporting bugs, making comments/suggestions, etc.:

jyvdf3asdg2; kailsen; MarcParisien; jctoledo; Auffinger; febos; acolasanti; hansonr; cllawson; Sylverlin

-- Xiang-Jun

Note: please start a new topic with a more specific title; do not post directly below this announcement.

Here are some sample runs (see x3dna-dssr -h for more info),

Code: [Select]

x3dna-dssr -i=1msy.pdb -o=1msy.out  # 27 nts
x3dna-dssr --input=1msy.pdb --output=1msy.out # same as above
x3dna-dssr -i=1ehz.pdb -o=1ehz.out  # tRNA, 76 nts
x3dna-dssr -i=1jj2.pdb -o=1jj2.out  # rRNA, 2876 nts

Example #1: GUAA tetraloop mutant of Sarcin/Ricin domain from E. coli 23S rRNA (1msy)

Code: [Select]

Run: x3dna-dssr -i=1msy.pdb -o=1msy.out --non-pair --u-turn
****************************************************************************
         DSSR: a software program for Defining the Secondary
         Structures of RNA from three-dimensional coordinates
         v1.2.5-2015mar19, Xiang-Jun Lu (xiangjun@x3dna.org)

   This program is being actively maintained and developed. As always,
   I greatly appreciate your feedback! Please report all DSSR-related
   issues on the 3DNA Forum (forum.x3dna.org). I strive to respond
   *promptly* to *any questions* posted there.

****************************************************************************
Note: Each nucleotide is identified by model:chainId.name#, where the
      'model:' portion is omitted if no model number is available (as
      is often the case for x-ray crystal structures in the PDB). So a
      common example would be B.A1689, meaning adenosine #1689 on
      chain B. One-letter base names for modified nucleotides are put
      in lower case (e.g., 'c' for 5MC). For further information about
      the output notation, please refer to the DSSR User Manual.
      Questions and suggestions are always welcome on the 3DNA Forum.

Command: x3dna-dssr -i=1msy.pdb --u-turn --non-pair -o=1msy.out
Date and time: Thu Mar 19 16:17:25 2015
File name: 1msy.pdb
    no. of DNA/RNA chains: 1 [A=27]
    no. of nucleotides:    27
    no. of atoms:          685
    no. of waters:         109
    no. of metals:         0

****************************************************************************
List of 13 base pairs
      nt1            nt2           bp  name        Saenger    LW  DSSR
   1 A.U2647        A.G2673        U-G Wobble      28-XXVIII cWW  cW-W
   2 A.G2648        A.U2672        G-U Wobble      28-XXVIII cWW  cW-W
   3 A.C2649        A.G2671        C-G WC          19-XIX    cWW  cW-W
   4 A.U2650        A.A2670        U-A WC          20-XX     cWW  cW-W
   5 A.C2651        A.G2669        C-G WC          19-XIX    cWW  cW-W
   6 A.C2652        A.G2668        C-G WC          19-XIX    cWW  cW-W
   7 A.U2653        A.C2667        U-C --          n/a       tW.  tW-.
   8 A.A2654        A.C2666        A+C --          n/a       tHH  tM+M
   9 A.G2655        A.U2656        G+U Platform    n/a       cSH  cm+M
  10 A.U2656        A.A2665        U-A rHoogsteen  24-XXIV   tWH  tW-M
  11 A.A2657        A.G2664        A-G Sheared     11-XI     tHS  tM-m
  12 A.C2658        A.G2663        C-G WC          19-XIX    cWW  cW-W
  13 A.G2659        A.A2662        G-A Sheared     11-XI     tSH  tm-M

****************************************************************************
List of 1 multiplet
   1 nts=3 GUA A.G2655,A.U2656,A.A2665

****************************************************************************
List of 1 helix
  Note: a helix is defined by base-stacking interactions, regardless of bp
        type and backbone connectivity, and may contain more than one stem.
      helix#number[stems-contained] bps=number-of-base-pairs in the helix
      bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
      helix-form: classification of a dinucleotide step comprising the bp
        above the given designation and the bp that follows it. Types
        include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
        '.' for an unclassified step, and 'x' for a step without a
        continuous backbone.
      --------------------------------------------------------------------
  helix#1[1] bps=12
      strand-1 5'-UGCUCCUAUACG-3'
       bp-type    ||||||....|.
      strand-2 3'-GUGAGGCCAGGA-5'
      helix-form  ..AAA..x...
   1 A.U2647        A.G2673        U-G Wobble       28-XXVIII cWW  cW-W
   2 A.G2648        A.U2672        G-U Wobble       28-XXVIII cWW  cW-W
   3 A.C2649        A.G2671        C-G WC           19-XIX    cWW  cW-W
   4 A.U2650        A.A2670        U-A WC           20-XX     cWW  cW-W
   5 A.C2651        A.G2669        C-G WC           19-XIX    cWW  cW-W
   6 A.C2652        A.G2668        C-G WC           19-XIX    cWW  cW-W
   7 A.U2653        A.C2667        U-C --           n/a       tW.  tW-.
   8 A.A2654        A.C2666        A+C --           n/a       tHH  tM+M
   9 A.U2656        A.A2665        U-A rHoogsteen   24-XXIV   tWH  tW-M
  10 A.A2657        A.G2664        A-G Sheared      11-XI     tHS  tM-m
  11 A.C2658        A.G2663        C-G WC           19-XIX    cWW  cW-W
  12 A.G2659        A.A2662        G-A Sheared      11-XI     tSH  tm-M

****************************************************************************
List of 1 stem
  Note: a stem is defined as a helix consisting of only canonical WC/wobble
        pairs, with a continuous backbone.
      stem#number[#helix-number containing this stem]
      Other terms are defined as in the above Helix section.
      --------------------------------------------------------------------
  stem#1[#1] bps=6
      strand-1 5'-UGCUCC-3'
       bp-type    ||||||
      strand-2 3'-GUGAGG-5'
      helix-form  ..AAA
   1 A.U2647        A.G2673        U-G Wobble       28-XXVIII cWW  cW-W
   2 A.G2648        A.U2672        G-U Wobble       28-XXVIII cWW  cW-W
   3 A.C2649        A.G2671        C-G WC           19-XIX    cWW  cW-W
   4 A.U2650        A.A2670        U-A WC           20-XX     cWW  cW-W
   5 A.C2651        A.G2669        C-G WC           19-XIX    cWW  cW-W
   6 A.C2652        A.G2668        C-G WC           19-XIX    cWW  cW-W

****************************************************************************
List of 1 isolated WC/wobble pair
  Note: isolated WC/wobble pairs are assigned negative indices to
        differentiate them from the stem numbers, which are positive.
        --------------------------------------------------------------------
[#1]     -1 A.C2658        A.G2663        C-G WC           19-XIX    cWW  cW-W

****************************************************************************
List of 30 non-pairing interactions
   1 A.U2647        A.G2648        stacking: 1.0(0.5)--pm(>>,forward)
   2 A.G2648        A.C2649        stacking: 7.3(4.6)--pm(>>,forward)
   3 A.G2648        A.G2673        stacking: 2.0(0.2)--mm(<>,outward)
   4 A.C2649        A.U2650        stacking: 2.8(1.1)--pm(>>,forward)
   5 A.U2650        A.C2651        stacking: 0.6(0.0)--pm(>>,forward)
   6 A.C2651        A.C2652        stacking: 0.5(0.1)--pm(>>,forward)
   7 A.C2652        A.U2653        stacking: 5.2(2.6)--pm(>>,forward)
   8 A.C2652        A.G2669        stacking: 0.2(0.0)--mm(<>,outward)
   9 A.U2653        A.A2654        stacking: 3.3(2.0)--pp(><,inward) H-bonds[1]: "OP2-O2'(hydroxyl)[2.62]"
  10 A.A2654        A.U2656        stacking: 3.7(1.1)--mm(<>,outward) H-bonds[1]: "O4'*O4'[3.05]"
  11 A.G2655        A.G2664        stacking: 4.4(2.2)--pp(><,inward) H-bonds[1]: "O2'(hydroxyl)-O6(carbonyl)[3.09]"
  12 A.G2655        A.A2665        H-bonds[3]: "N1(imino)-OP2[2.77],N2(amino)-OP2[3.34],N2(amino)-O5'[2.89]"
  13 A.U2656        A.G2664        H-bonds[2]: "OP2-N1(imino)[3.04],OP2-N2(amino)[2.94]"
  14 A.A2657        A.C2658        stacking: 6.7(2.6)--pm(>>,forward)
  15 A.A2657        A.A2665        stacking: 3.7(3.3)--mm(<>,outward)
  16 A.C2658        A.G2659        stacking: 0.4(0.1)--pm(>>,forward)
  17 A.G2659        A.A2661        H-bonds[1]: "O2'(hydroxyl)-N7[2.60]"
  18 A.G2659        A.G2663        stacking: 3.9(1.2)--mm(<>,outward)
  19 A.U2660        A.A2661        stacking: 7.5(4.2)--pm(>>,forward)
  20 A.A2661        A.A2662        stacking: 6.3(4.4)--pm(>>,forward)
  21 A.G2663        A.G2664        stacking: 2.7(0.6)--pm(>>,forward)
  22 A.G2664        A.A2665        H-bonds[1]: "O2'(hydroxyl)-O4'[2.75]"
  23 A.A2665        A.C2666        stacking: 1.6(1.1)--pm(>>,forward)
  24 A.C2666        A.C2667        stacking: 4.3(2.1)--pm(>>,forward)
  25 A.C2667        A.G2668        stacking: 3.1(1.0)--pm(>>,forward)
  26 A.G2668        A.G2669        stacking: 4.3(3.0)--pm(>>,forward)
  27 A.G2669        A.A2670        stacking: 4.3(2.9)--pm(>>,forward)
  28 A.A2670        A.G2671        stacking: 1.5(1.5)--pm(>>,forward)
  29 A.G2671        A.U2672        stacking: 7.4(4.0)--pm(>>,forward)
  30 A.U2672        A.G2673        H-bonds[1]: "O2'(hydroxyl)-O4'[3.37]"

****************************************************************************
List of 4 stacks
  Note: a stack is an ordered list of nucleotides assembled together via
        base-stacking interactions, regardless of backbone connectivity.
        Stacking interactions within a stem are *not* included.
        --------------------------------------------------------------------
   1 nts=3 UAA A.U2660,A.A2661,A.A2662
   2 nts=4 CUAU A.C2652,A.U2653,A.A2654,A.U2656
   3 nts=4 GGGG A.G2655,A.G2664,A.G2663,A.G2659
   4 nts=6 CAACCG A.C2658,A.A2657,A.A2665,A.C2666,A.C2667,A.G2668

****************************************************************************
Note: for the various types of loops listed below, numbers within the first
      set of brackets are the number of loop nts, and numbers in the second
      set of brackets are the identities of the stems (positive number) or
      isolated WC/wobble pairs (negative numbers) to which they are linked.

****************************************************************************
List of 1 hairpin loop
   1 hairpin loop: nts=6; [4]; linked by [#-1]
     nts=6 CGUAAG A.C2658,A.G2659,A.U2660,A.A2661,A.A2662,A.G2663
       nts=4 GUAA A.G2659,A.U2660,A.A2661,A.A2662

****************************************************************************
List of 1 internal loop
   1 asymmetric internal loop: nts=13; [5,4]; linked by [#1,#-1]
     nts=13 CUAGUACGGACCG A.C2652,A.U2653,A.A2654,A.G2655,A.U2656,A.A2657,A.C2658,A.G2663,A.G2664,A.A2665,A.C2666,A.C2667,A.G2668
       nts=5 UAGUA A.U2653,A.A2654,A.G2655,A.U2656,A.A2657
       nts=4 GACC A.G2664,A.A2665,A.C2666,A.C2667

****************************************************************************
List of 1 U-turn
   1  A.G2659-A.A2662 H-bonds[2]: "N2(amino)-OP2[2.97],N2(amino)-N7[2.86]" nts=6 CGUAAG A.C2658,A.G2659,A.U2660,A.A2661,A.A2662,A.G2663

****************************************************************************
Secondary structures in dot-bracket notation (dbn) as a whole and per chain
>1msy nts=27 [whole]
UGCUCCUAGUACGUAAGGACCGGAGUG
((((((.....(....)....))))))
>1msy-A #1 nts=27 [chain] RNA
UGCUCCUAGUACGUAAGGACCGGAGUG
((((((.....(....)....))))))

****************************************************************************
List of 12 additional files
   1 dssr-stems.pdb -- an ensemble of stems
   2 dssr-helices.pdb -- an ensemble of helices (coaxial stacking)
   3 dssr-pairs.pdb -- an ensemble of base pairs
   4 dssr-multiplets.pdb -- an ensemble of multiplets
   5 dssr-hairpins.pdb -- an ensemble of hairpin loops
   6 dssr-iloops.pdb -- an ensemble of internal loops
   7 dssr-2ndstrs.bpseq -- secondary structure in bpseq format
   8 dssr-2ndstrs.ct -- secondary structure in connect table format
   9 dssr-2ndstrs.dbn -- secondary structure in dot-bracket notation
  10 dssr-torsions.txt -- backbone torsion angles and suite names
  11 dssr-Uturns.pdb -- an ensemble of U-turn motifs
  12 dssr-stacks.pdb -- an ensemble of stacks
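
The output above is organized into blocks separated by lines of asterisks, and most blocks open with a "List of N ..." title. As a rough sketch only (not something DSSR provides), a script could split the text of an .out file into such sections as follows. The function name `split_sections` is hypothetical, and the sketch assumes the layout shown in this example. Blocks that do not start with "List of", such as the banner and the notes, are skipped.

```python
import re

SEPARATOR = re.compile(r"^\*{20,}\s*$")
TITLE = re.compile(r"^List of (?P<count>\d+) (?P<what>.+?)\s*$")


def split_sections(text):
    """Map each 'List of N <what>' title to its count and non-blank body lines."""
    sections = {}
    current = None
    for line in text.splitlines():
        if SEPARATOR.match(line):
            current = None          # a row of asterisks closes the open section
            continue
        title = TITLE.match(line)
        if current is None and title:
            current = title["what"]
            sections[current] = {"count": int(title["count"]), "lines": []}
        elif current is not None and line.strip():
            sections[current]["lines"].append(line)
    return sections
```

Note that the key is taken verbatim from the title, so it is singular or plural depending on the count ("multiplet" here, "multiplets" for a structure with several).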

Example #2: The crystal structure of yeast phenylalanine tRNA at 1.93 Å resolution (1ehz)

Code: [Select]

Run: x3dna-dssr -i=1ehz.pdb -o=1ehz.out --po4 --u-turn
****************************************************************************
         DSSR: a software program for Defining the Secondary
         Structures of RNA from three-dimensional coordinates
         v1.2.5-2015mar19, Xiang-Jun Lu (xiangjun@x3dna.org)

   This program is being actively maintained and developed. As always,
   I greatly appreciate your feedback! Please report all DSSR-related
   issues on the 3DNA Forum (forum.x3dna.org). I strive to respond
   *promptly* to *any questions* posted there.

****************************************************************************
Note: Each nucleotide is identified by model:chainId.name#, where the
      'model:' portion is omitted if no model number is available (as
      is often the case for x-ray crystal structures in the PDB). So a
      common example would be B.A1689, meaning adenosine #1689 on
      chain B. One-letter base names for modified nucleotides are put
      in lower case (e.g., 'c' for 5MC). For further information about
      the output notation, please refer to the DSSR User Manual.
      Questions and suggestions are always welcome on the 3DNA Forum.

Command: x3dna-dssr -i=1ehz.pdb --u-turn --po4 -o=1ehz.out
Date and time: Thu Mar 19 16:17:25 2015
File name: 1ehz.pdb
    no. of DNA/RNA chains: 1 [A=76]
    no. of nucleotides:    76
    no. of atoms:          1821
    no. of waters:         160
    no. of metals:         9 [Mg=6,Mn=3]

****************************************************************************
List of 11 types of 14 modified nucleotides
      nt    count  list
   1 1MA-a    1    A.1MA58
   2 2MG-g    1    A.2MG10
   3 5MC-c    2    A.5MC40,A.5MC49
   4 5MU-t    1    A.5MU54
   5 7MG-g    1    A.7MG46
   6 H2U-u    2    A.H2U16,A.H2U17
   7 M2G-g    1    A.M2G26
   8 OMC-c    1    A.OMC32
   9 OMG-g    1    A.OMG34
  10 PSU-P    2    A.PSU39,A.PSU55
  11 YYG-g    1    A.YYG37

****************************************************************************
List of 34 base pairs
      nt1            nt2           bp  name        Saenger    LW  DSSR
   1 A.G1           A.C72          G-C WC          19-XIX    cWW  cW-W
   2 A.C2           A.G71          C-G WC          19-XIX    cWW  cW-W
   3 A.G3           A.C70          G-C WC          19-XIX    cWW  cW-W
   4 A.G4           A.U69          G-U Wobble      28-XXVIII cWW  cW-W
   5 A.A5           A.U68          A-U WC          20-XX     cWW  cW-W
   6 A.U6           A.A67          U-A WC          20-XX     cWW  cW-W
   7 A.U7           A.A66          U-A WC          20-XX     cWW  cW-W
   8 A.U8           A.A14          U-A rHoogsteen  24-XXIV   tWH  tW-M
   9 A.U8           A.A21          U+A --          n/a       tSW  tm+W
  10 A.A9           A.A23          A+A --          02-II     tHH  tM+M
  11 A.2MG10        A.C25          g-C WC          19-XIX    cWW  cW-W
  12 A.2MG10        A.G45          g+G --          n/a       cHS  cM+m
  13 A.C11          A.G24          C-G WC          19-XIX    cWW  cW-W
  14 A.U12          A.A23          U-A WC          20-XX     cWW  cW-W
  15 A.C13          A.G22          C-G WC          19-XIX    cWW  cW-W
  16 A.G15          A.C48          G+C rWC         22-XXII   tWW  tW+W
  17 A.H2U16        A.U59          u+U --          n/a       tSW  tm+W
  18 A.G18          A.PSU55        G+P --          n/a       tWS  tW+m
  19 A.G19          A.C56          G-C WC          19-XIX    cWW  cW-W
  20 A.G22          A.7MG46        G-g --          07-VII    tHW  tM-W
  21 A.M2G26        A.A44          g-A Imino       08-VIII   cWW  cW-W
  22 A.C27          A.G43          C-G WC          19-XIX    cWW  cW-W
  23 A.C28          A.G42          C-G WC          19-XIX    cWW  cW-W
  24 A.A29          A.U41          A-U WC          20-XX     cWW  cW-W
  25 A.G30          A.5MC40        G-c WC          19-XIX    cWW  cW-W
  26 A.A31          A.PSU39        A-P --          n/a       cWW  cW-W
  27 A.OMC32        A.A38          c-A --          n/a       c.W  c.-W
  28 A.U33          A.A36          U-A --          n/a       tSH  tm-M
  29 A.5MC49        A.G65          c-G WC          19-XIX    cWW  cW-W
  30 A.U50          A.A64          U-A WC          20-XX     cWW  cW-W
  31 A.G51          A.C63          G-C WC          19-XIX    cWW  cW-W
  32 A.U52          A.A62          U-A WC          20-XX     cWW  cW-W
  33 A.G53          A.C61          G-C WC          19-XIX    cWW  cW-W
  34 A.5MU54        A.1MA58        t-a rHoogsteen  24-XXIV   tWH  tW-M

****************************************************************************
List of 4 multiplets
   1 nts=3 UAA A.U8,A.A14,A.A21
   2 nts=3 AUA A.A9,A.U12,A.A23
   3 nts=3 gCG A.2MG10,A.C25,A.G45
   4 nts=3 CGg A.C13,A.G22,A.7MG46

****************************************************************************
List of 2 helices
  Note: a helix is defined by base-stacking interactions, regardless of bp
        type and backbone connectivity, and may contain more than one stem.
      helix#number[stems-contained] bps=number-of-base-pairs in the helix
      bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
      helix-form: classification of a dinucleotide step comprising the bp
        above the given designation and the bp that follows it. Types
        include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
        '.' for an unclassified step, and 'x' for a step without a
        continuous backbone.
      --------------------------------------------------------------------
  helix#1[2] bps=15
      strand-1 5'-GCGGAUUcUGUGtPC-3'
       bp-type    ||||||||||||..|
      strand-2 3'-CGCUUAAGACACaGG-5'
      helix-form  AA....xAAAAxx.
   1 A.G1           A.C72          G-C WC           19-XIX    cWW  cW-W
   2 A.C2           A.G71          C-G WC           19-XIX    cWW  cW-W
   3 A.G3           A.C70          G-C WC           19-XIX    cWW  cW-W
   4 A.G4           A.U69          G-U Wobble       28-XXVIII cWW  cW-W
   5 A.A5           A.U68          A-U WC           20-XX     cWW  cW-W
   6 A.U6           A.A67          U-A WC           20-XX     cWW  cW-W
   7 A.U7           A.A66          U-A WC           20-XX     cWW  cW-W
   8 A.5MC49        A.G65          c-G WC           19-XIX    cWW  cW-W
   9 A.U50          A.A64          U-A WC           20-XX     cWW  cW-W
  10 A.G51          A.C63          G-C WC           19-XIX    cWW  cW-W
  11 A.U52          A.A62          U-A WC           20-XX     cWW  cW-W
  12 A.G53          A.C61          G-C WC           19-XIX    cWW  cW-W
  13 A.5MU54        A.1MA58        t-a rHoogsteen   24-XXIV   tWH  tW-M
  14 A.PSU55        A.G18          P+G --           n/a       tSW  tm+W
  15 A.C56          A.G19          C-G WC           19-XIX    cWW  cW-W
  --------------------------------------------------------------------------
  helix#2[2] bps=15
      strand-1 5'-AAPcUGGAgCUCAGu-3'
       bp-type    ...||||.||||...
      strand-2 3'-UcAGACCgCGAGUCU-5'
      helix-form  x..AAAAxAA.xxx
   1 A.A36          A.U33          A-U --           n/a       tHS  tM-m
   2 A.A38          A.OMC32        A-c --           n/a       cW.  cW-.
   3 A.PSU39        A.A31          P-A --           n/a       cWW  cW-W
   4 A.5MC40        A.G30          c-G WC           19-XIX    cWW  cW-W
   5 A.U41          A.A29          U-A WC           20-XX     cWW  cW-W
   6 A.G42          A.C28          G-C WC           19-XIX    cWW  cW-W
   7 A.G43          A.C27          G-C WC           19-XIX    cWW  cW-W
   8 A.A44          A.M2G26        A-g Imino        08-VIII   cWW  cW-W
   9 A.2MG10        A.C25          g-C WC           19-XIX    cWW  cW-W
  10 A.C11          A.G24          C-G WC           19-XIX    cWW  cW-W
  11 A.U12          A.A23          U-A WC           20-XX     cWW  cW-W
  12 A.C13          A.G22          C-G WC           19-XIX    cWW  cW-W
  13 A.A14          A.U8           A-U rHoogsteen   24-XXIV   tHW  tM-W
  14 A.G15          A.C48          G+C rWC          22-XXII   tWW  tW+W
  15 A.H2U16        A.U59          u+U --           n/a       tSW  tm+W

****************************************************************************
List of 4 stems
  Note: a stem is defined as a helix consisting of only canonical WC/wobble
        pairs, with a continuous backbone.
      stem#number[#helix-number containing this stem]
      Other terms are defined as in the above Helix section.
      --------------------------------------------------------------------
  stem#1[#1] bps=7
      strand-1 5'-GCGGAUU-3'
       bp-type    |||||||
      strand-2 3'-CGCUUAA-5'
      helix-form  AA....
   1 A.G1           A.C72          G-C WC           19-XIX    cWW  cW-W
   2 A.C2           A.G71          C-G WC           19-XIX    cWW  cW-W
   3 A.G3           A.C70          G-C WC           19-XIX    cWW  cW-W
   4 A.G4           A.U69          G-U Wobble       28-XXVIII cWW  cW-W
   5 A.A5           A.U68          A-U WC           20-XX     cWW  cW-W
   6 A.U6           A.A67          U-A WC           20-XX     cWW  cW-W
   7 A.U7           A.A66          U-A WC           20-XX     cWW  cW-W
  --------------------------------------------------------------------------
  stem#2[#2] bps=4
      strand-1 5'-gCUC-3'
       bp-type    ||||
      strand-2 3'-CGAG-5'
      helix-form  AA.
   1 A.2MG10        A.C25          g-C WC           19-XIX    cWW  cW-W
   2 A.C11          A.G24          C-G WC           19-XIX    cWW  cW-W
   3 A.U12          A.A23          U-A WC           20-XX     cWW  cW-W
   4 A.C13          A.G22          C-G WC           19-XIX    cWW  cW-W
  --------------------------------------------------------------------------
  stem#3[#2] bps=4
      strand-1 5'-CCAG-3'
       bp-type    ||||
      strand-2 3'-GGUc-5'
      helix-form  AAA
   1 A.C27          A.G43          C-G WC           19-XIX    cWW  cW-W
   2 A.C28          A.G42          C-G WC           19-XIX    cWW  cW-W
   3 A.A29          A.U41          A-U WC           20-XX     cWW  cW-W
   4 A.G30          A.5MC40        G-c WC           19-XIX    cWW  cW-W
  --------------------------------------------------------------------------
  stem#4[#1] bps=5
      strand-1 5'-cUGUG-3'
       bp-type    |||||
      strand-2 3'-GACAC-5'
      helix-form  AAAA
   1 A.5MC49        A.G65          c-G WC           19-XIX    cWW  cW-W
   2 A.U50          A.A64          U-A WC           20-XX     cWW  cW-W
   3 A.G51          A.C63          G-C WC           19-XIX    cWW  cW-W
   4 A.U52          A.A62          U-A WC           20-XX     cWW  cW-W
   5 A.G53          A.C61          G-C WC           19-XIX    cWW  cW-W

****************************************************************************
List of 1 isolated WC/wobble pair
  Note: isolated WC/wobble pairs are assigned negative indices to
        differentiate them from the stem numbers, which are positive.
        --------------------------------------------------------------------
[#1]     -1 A.G19          A.C56          G-C WC           19-XIX    cWW  cW-W

****************************************************************************
List of 2 coaxial stacks
   1 Helix#1 contains 2 stems: [#1,#4]
   2 Helix#2 contains 2 stems: [#3,#2]

****************************************************************************
List of 11 stacks
  Note: a stack is an ordered list of nucleotides assembled together via
        base-stacking interactions, regardless of backbone connectivity.
        Stacking interactions within a stem are *not* included.
        --------------------------------------------------------------------
   1 nts=2 Uc A.U7,A.5MC49
   2 nts=2 UC A.U8,A.C13
   3 nts=2 GA A.G65,A.A66
   4 nts=3 CgC A.C25,A.M2G26,A.C27
   5 nts=3 gAC A.7MG46,A.A21,A.C48
   6 nts=3 GtP A.G53,A.5MU54,A.PSU55
   7 nts=4 GACC A.G1,A.A73,A.C74,A.C75
   8 nts=4 GAcU A.G30,A.A31,A.OMC32,A.U33
   9 nts=5 GGGaC A.G19,A.G57,A.G18,A.1MA58,A.C61
  10 nts=7 gAAgAPc A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39,A.5MC40
  11 nts=9 GAGAGAGUC A.G43,A.A44,A.G45,A.A9,A.G22,A.A14,A.G15,A.U59,A.C60
     -----------------------------------------------------------------------
  Nucleotides not involved in stacking interactions
     nts=4 uGUA A.H2U17,A.G20,A.U47,A.A76

****************************************************************************
Note: for the various types of loops listed below, numbers within the first
      set of brackets are the number of loop nts, and numbers in the second
      set of brackets are the identities of the stems (positive number) or
      isolated WC/wobble pairs (negative numbers) to which they are linked.

****************************************************************************
List of 3 hairpin loops
   1 hairpin loop: nts=10; [8]; linked by [#2]
     nts=10 CAGuuGGGAG A.C13,A.A14,A.G15,A.H2U16,A.H2U17,A.G18,A.G19,A.G20,A.A21,A.G22
       nts=8 AGuuGGGA A.A14,A.G15,A.H2U16,A.H2U17,A.G18,A.G19,A.G20,A.A21
   2 hairpin loop: nts=11; [9]; linked by [#3]
     nts=11 GAcUgAAgAPc A.G30,A.A31,A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39,A.5MC40
       nts=9 AcUgAAgAP A.A31,A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39
   3 hairpin loop: nts=9; [7]; linked by [#4]
     nts=9 GtPCGaUCC A.G53,A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59,A.C60,A.C61
       nts=7 tPCGaUC A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59,A.C60

****************************************************************************
List of 1 junction
   1 4-way junction: nts=16; [2,1,5,0]; linked by [#1,#2,#3,#4]
     nts=16 UUAgCgCGAGgUCcGA A.U7,A.U8,A.A9,A.2MG10,A.C25,A.M2G26,A.C27,A.G43,A.A44,A.G45,A.7MG46,A.U47,A.C48,A.5MC49,A.G65,A.A66
       nts=2 UA A.U8,A.A9
       nts=1 g A.M2G26
       nts=5 AGgUC A.A44,A.G45,A.7MG46,A.U47,A.C48
       nts=0

****************************************************************************
List of 1 non-loop single-stranded segment
   1 nts=4 ACCA A.A73,A.C74,A.C75,A.A76

****************************************************************************
List of 1 kissing loop interaction
   1 isolated-pair #-1 between hairpin loops #1 and #3

****************************************************************************
List of 2 U-turns
   1  A.U33-A.A36 H-bonds[1]: "N3(imino)-OP2[2.80]" nts=6 cUgAAg A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37
   2  A.PSU55-A.1MA58 H-bonds[1]: "N3-OP2[2.77]" nts=6 tPCGaU A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59

****************************************************************************
List of 18 phosphate interactions
   1 A.U7            OP1-hbonds[1]: "MG@A.MG580[2.60]"
   2 A.A9            OP2-hbonds[1]: "N4@A.C13[3.01]"
   3 A.A14           OP2-hbonds[1]: "MG@A.MG580[1.93]"
   4 A.H2U16         OP2-cap: "A.H2U16"
   5 A.G18           OP1-hbonds[1]: "O2'@A.H2U17[2.97]"
   6 A.G19           OP1-hbonds[2]: "N4@A.C60[3.27],MN@A.MN530[2.19]"
   7 A.G20           OP1-hbonds[1]: "MG@A.MG540[2.07]"
   8 A.A21           OP2-hbonds[1]: "MG@A.MG540[2.11]"
   9 A.A23           OP2-hbonds[1]: "N6@A.A9[3.12]"
  10 A.A35           OP2-cap: "A.U33"
  11 A.A36           OP2-hbonds[1]: "N3@A.U33[2.80]"
  12 A.YYG37         OP2-hbonds[1]: "MG@A.MG590[2.53]"
  13 A.C48           OP2-hbonds[1]: "O2'@A.7MG46[3.55]"
  14 A.5MC49         OP1-hbonds[1]: "O2'@A.C48[3.13]" OP2-hbonds[1]: "O2'@A.U7[2.68]"
  15 A.U50           OP1-hbonds[1]: "O2'@A.U47[2.71]"
  16 A.G57           OP2-cap: "A.PSU55"
  17 A.1MA58         OP2-hbonds[1]: "N3@A.PSU55[2.77]"
  18 A.C60           OP1-hbonds[1]: "N4@A.C61[3.12]" OP2-hbonds[1]: "O2'@A.1MA58[2.42]"

****************************************************************************
This structure contains 1-order pseudoknot
   o You may want to run DSSR again with the '--nested' option which removes
     pseudoknots to get a fully nested secondary structure representation.

****************************************************************************
Secondary structures in dot-bracket notation (dbn) as a whole and per chain
>1ehz nts=76 [whole]
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGtPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....
>1ehz-A #1 nts=76 [chain] RNA
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGtPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....

****************************************************************************
List of 12 additional files
   1 dssr-stems.pdb -- an ensemble of stems
   2 dssr-helices.pdb -- an ensemble of helices (coaxial stacking)
   3 dssr-pairs.pdb -- an ensemble of base pairs
   4 dssr-multiplets.pdb -- an ensemble of multiplets
   5 dssr-hairpins.pdb -- an ensemble of hairpin loops
   6 dssr-junctions.pdb -- an ensemble of junctions (multi-branch)
   7 dssr-2ndstrs.bpseq -- secondary structure in bpseq format
   8 dssr-2ndstrs.ct -- secondary structure in connect table format
   9 dssr-2ndstrs.dbn -- secondary structure in dot-bracket notation
  10 dssr-torsions.txt -- backbone torsion angles and suite names
  11 dssr-Uturns.pdb -- an ensemble of U-turn motifs
  12 dssr-stacks.pdb -- an ensemble of stacks

Code: [Select]

>1msy-A #1 RNA with 27 nts
UGCUCCUAGUACGUAAGGACCGGAGUG
.(((((.....(....)....))))).

>1ehz-A #1 RNA with 76 nts
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGuPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....
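The dot-bracket strings above encode base pairs as matching brackets, with a second bracket type (here '[' and ']') marking the pseudoknotted pair. As a minimal sketch, not part of DSSR, the following Python decodes such a string into 1-based base-pair indices. The function name `dbn_to_pairs` is hypothetical, and only the bracket types (), [], {} and <> are assumed.

```python
def dbn_to_pairs(dbn):
    """Return sorted (i, j) base-pair indices (1-based) from a dot-bracket string."""
    closers = {")": "(", "]": "[", "}": "{", ">": "<"}
    # One stack per bracket type, so pseudoknot brackets are matched independently.
    stacks = {opener: [] for opener in closers.values()}
    pairs = []
    for pos, ch in enumerate(dbn, start=1):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character '{ch}' at position {pos}")
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched '{opener}' at position {stack[-1]}")
    return sorted(pairs)


# Dot-bracket line of 1ehz chain A, copied from the listing above.
trna = "(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))...."
pairs = dbn_to_pairs(trna)
print(len(pairs), "pairs; pseudoknotted pair present:", (19, 56) in pairs)
```

The pair recovered from the square brackets corresponds to the isolated G19–C56 pair that links the two hairpin loops.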

FAQs / How can I mutate cytosine to 5-methylcytosine

« on: October 26, 2012, 01:23:14 pm »

Methylation of cytosines in DNA is a crucial epigenetic modification that regulates the expression of many genes. Chemically, it is the addition of a methyl group to the 5 position of cytosine (C).

The mutate_bases program in 3DNA v2.x performs in silico base mutations given a nucleic acid structure in PDB format. It is not a problem to mutate any C to a 5-methylcytosine (5CM), provided that users supply a 5CM in its standard base reference frame. Given the importance of 5CM in epigenetics and the growing number of simulation studies aimed at understanding its effects, I have included Atomic_5CM.pdb in the 3DNA v2.1 distribution as of 2012oct26.

According to PDB, the three-letter nucleotide name for 5-methylcytosine is 5CM instead of 5MC -- see for example PDB entries 4mht and 2uz4. The methyl carbon is named " C5A" instead of " C5M" or " C7 ". Thus, the content of the Atomic_5CM.pdb file is:

Code: [Select]

REMARK    3DNA by Dr. Xiang-Jun Lu [2012-10-26] (xiangjun@x3dna.org)
ATOM      1  C1' 5CM A   1      -2.477   5.402   0.000  1.00  0.00           C  
ATOM      2  N1  5CM A   1      -1.285   4.542   0.000  1.00  0.00           N  
ATOM      3  C2  5CM A   1      -1.472   3.158   0.000  1.00  0.00           C  
ATOM      4  O2  5CM A   1      -2.628   2.709   0.001  1.00  0.00           O  
ATOM      5  N3  5CM A   1      -0.391   2.344   0.000  1.00  0.00           N  
ATOM      6  C4  5CM A   1       0.837   2.868   0.000  1.00  0.00           C  
ATOM      7  N4  5CM A   1       1.875   2.027   0.001  1.00  0.00           N  
ATOM      8  C5  5CM A   1       1.056   4.275   0.000  1.00  0.00           C  
ATOM      9  C5A 5CM A   1       2.466   4.961   0.001  1.00  0.00           C  
ATOM     10  C6  5CM A   1      -0.023   5.068   0.000  1.00  0.00           C  
END

With this new addition, it is now very straightforward to mutate Cs to 5CMs with mutate_bases, as illustrated by the following two examples:

Mutate C1 on chain A and C23 on chain B of the Dickerson B-DNA dodecamer (PDB entry 355d) to 5CMs:
```
mutate_bases 'chain=A snum=1 m=5CM; chain=B snum=23 m=5CM' 355d.pdb 355d_AC1BC23_5CM.pdb
```
Mutate C2 on chain A of the yeast phenylalanine tRNA (PDB entry 6tna) to 5CM:
```
mutate_bases 'chain=A snum=2 name=C m=5CM' 6tna.pdb 6tna_C2_5CM.pdb
```

The mutated files 355d_AC1BC23_5CM.pdb and 6tna_C2_5CM.pdb are attached for your verification. For comparison, shown below are the original atomic coordinates of the above tRNA 6tna cytosine and coordinates of its 5CM mutant in red. Note that the coordinates of the backbone atoms are the same, and coordinates of common base atoms are very close.

The original atomic coordinates of a cytosine from PDB entry 6tna:
--------------------------------------------------------------------------------
ATOM     25  P     C A   2      31.659  20.469  70.978  1.00 10.00           P  
ATOM     26  OP1   C A   2      32.973  21.044  71.364  1.00 10.00           O  
ATOM     27  OP2   C A   2      30.973  21.143  69.849  1.00 10.00           O  
ATOM     28  O5'   C A   2      31.815  18.912  70.652  1.00 10.00           O  
ATOM     29  C5'   C A   2      30.629  18.184  70.293  1.00 10.00           C  
ATOM     30  C4'   C A   2      30.507  16.914  71.139  1.00 10.00           C  
ATOM     31  O4'   C A   2      29.293  17.051  71.947  1.00 10.00           O  
ATOM     32  C3'   C A   2      30.455  15.607  70.367  1.00 10.00           C  
ATOM     33  O3'   C A   2      31.724  14.971  70.316  1.00 10.00           O  
ATOM     34  C2'   C A   2      29.411  14.815  71.146  1.00 10.00           C  
ATOM     35  O2'   C A   2      29.987  14.227  72.301  1.00 10.00           O  
ATOM     36  C1'   C A   2      28.473  15.927  71.630  1.00 10.00           C  
ATOM     37  N1    C A   2      27.474  16.346  70.621  1.00 10.00           N  
ATOM     38  C2    C A   2      26.658  15.368  70.068  1.00 10.00           C  
ATOM     39  O2    C A   2      26.802  14.198  70.441  1.00 10.00           O  
ATOM     40  N3    C A   2      25.726  15.730  69.143  1.00 10.00           N  
ATOM     41  C4    C A   2      25.601  17.008  68.767  1.00 10.00           C  
ATOM     42  N4    C A   2      24.682  17.314  67.872  1.00 10.00           N  
ATOM     43  C5    C A   2      26.436  18.041  69.324  1.00 10.00           C  
ATOM     44  C6    C A   2      27.351  17.658  70.243  1.00 10.00           C  
--------------------------------------------------------------------------------
The coordinates of the mutant 5-methylcytosine generated by 'mutate_bases'
REMARK    Mutation#1 A:...2@:[..C] to [5CM]
ATOM     25  P   5CM A   2      31.659  20.469  70.978  1.00 10.00           P  
ATOM     26  OP1 5CM A   2      32.973  21.044  71.364  1.00 10.00           O  
ATOM     27  OP2 5CM A   2      30.973  21.143  69.849  1.00 10.00           O  
ATOM     28  O5' 5CM A   2      31.815  18.912  70.652  1.00 10.00           O  
ATOM     29  C5' 5CM A   2      30.629  18.184  70.293  1.00 10.00           C  
ATOM     30  C4' 5CM A   2      30.507  16.914  71.139  1.00 10.00           C  
ATOM     31  O4' 5CM A   2      29.293  17.051  71.947  1.00 10.00           O  
ATOM     32  C3' 5CM A   2      30.455  15.607  70.367  1.00 10.00           C  
ATOM     33  O3' 5CM A   2      31.724  14.971  70.316  1.00 10.00           O  
ATOM     34  C2' 5CM A   2      29.411  14.815  71.146  1.00 10.00           C  
ATOM     35  O2' 5CM A   2      29.987  14.227  72.301  1.00 10.00           O  
ATOM     36  C1' 5CM A   2      28.473  15.927  71.630  1.00 10.00           C  
ATOM     37  N1  5CM A   2      27.475  16.353  70.620  1.00  1.00           N  
ATOM     38  C2  5CM A   2      26.651  15.372  70.062  1.00  1.00           C  
ATOM     39  O2  5CM A   2      26.789  14.195  70.427  1.00  1.00           O  
ATOM     40  N3  5CM A   2      25.726  15.730  69.141  1.00  1.00           N  
ATOM     41  C4  5CM A   2      25.610  17.009  68.776  1.00  1.00           C  
ATOM     42  N4  5CM A   2      24.685  17.316  67.863  1.00  1.00           N  
ATOM     43  C5  5CM A   2      26.437  18.028  69.328  1.00  1.00           C  
ATOM     44  C5A 5CM A   2      26.359  19.547  68.947  1.00  1.00           C  
ATOM     45  C6  5CM A   2      27.347  17.660  70.238  1.00  1.00           C
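
The closeness of the common base atoms can be checked directly from the two listings above. The following Python sketch hard-codes the base-atom coordinates of residue 2 as copied from those listings and prints the displacement of each shared atom; the variable names are illustrative only.

```python
import math

# Base atoms of C2 on chain A in 6tna (original listing above).
original_c = {
    "N1": (27.474, 16.346, 70.621),
    "C2": (26.658, 15.368, 70.068),
    "O2": (26.802, 14.198, 70.441),
    "N3": (25.726, 15.730, 69.143),
    "C4": (25.601, 17.008, 68.767),
    "N4": (24.682, 17.314, 67.872),
    "C5": (26.436, 18.041, 69.324),
    "C6": (27.351, 17.658, 70.243),
}

# Base atoms of the 5CM mutant written by mutate_bases (second listing above).
mutant_5cm = {
    "N1": (27.475, 16.353, 70.620),
    "C2": (26.651, 15.372, 70.062),
    "O2": (26.789, 14.195, 70.427),
    "N3": (25.726, 15.730, 69.141),
    "C4": (25.610, 17.009, 68.776),
    "N4": (24.685, 17.316, 67.863),
    "C5": (26.437, 18.028, 69.328),
    "C5A": (26.359, 19.547, 68.947),
    "C6": (27.347, 17.660, 70.238),
}

# Atoms present in both residues; the new methyl carbon C5A is excluded.
common = sorted(set(original_c) & set(mutant_5cm))
displacements = {
    name: math.dist(original_c[name], mutant_5cm[name]) for name in common
}

for name in common:
    print(f"{name:>3s}  {displacements[name]:.3f} A")
print(f"largest displacement: {max(displacements.values()):.3f} A")
```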

General discussions (Q&As) / Data files for Table 3 of the standard base-reference frame article

« on: September 18, 2012, 10:32:15 pm »

Table 3 of the Olson et al. (2001) "standard base-reference frame article" lists the mean values and standard deviations of base geometric parameters for high resolution A-DNA and B-DNA crystal structures, as shown below.

The selection criteria of the A- and B-DNA datasets have recently been reported in the thread "Datasets and scripts for reproducing Figure 5 of the 3DNA NAR03 paper". For the sake of easy reference and completeness, here is the note again:

Quote

Selection Criteria:
   NDB ID: ad OR bd
   Classification: DNA
   Structure Description: Double Helix
   Conformation Type: A OR B
   No Drug, No Mismatch
   No Modifiers (Base/Sugar/Phosphate)
   Resolution better than 2.0 A
   =======================
   34 A-DNA and 27 B-DNA

For B-DNA, delete bd0012, bd0013 & bdf068 (following HMB)
   bd0001 bd0006_A
   bd0014: coordinates from PDB 463D
   bd0005 bd0016_A (with repeated atoms!)
   bd0018 bd0019 bdj017 bdj019 bdj025 bdj031 bdj036 bdj037 bdj051
   bdj052 bdj060 bdj061 
   bdj081 (Uses helix #1 with strands A and B. The other two are
           disordered)
   bdl001 bdl005 bdl020 bdl084
   bd0023_A  bd0029
   -------------------------- 27-3=24 structures

For A-DNA
   ad0002 ==> (ad0002_AB + ad0002_CD)
   ad0003 ad0004 adh008 adh010 adh0102 adh0103 adh0104 adh0105
   adh014 adh026 adh027 adh029 adh033 adh034 adh038 adh039 adh047
   adh070 adh078 adj0102 adj0103 adj0112 adj0113 adj022 adj049
   adj050 adj051 adj065 adj066 adj067 adj075 
   adl025 (suspicious! big Buckle, alternating Propeller)
   adl047 (with B-steps, not good either!)
   -------------------------- 34+1-2=33 structures

Outliers:
  A-DNA: ad0002_CD, steps 3-4,   bps 3-4-5
         ad0004,    steps 3-4-5, bps 3-4-5-6
  B-DNA: bdj025,    step 3,      bps 3-4
         bdj031,    step 3,      bps 3-4
         bdj037,    step 3,      bps 3-4

The six data files themselves are attached below; here the A- prefix is for A-DNA, and B- prefix for B-DNA:

'A-base-pair.dat' and 'B-base-pair.dat' contain the base-pair parameters in the order of Shear, Stretch, Stagger, Buckle, Propeller, and Opening.
'A-step-pars.dat' and 'B-step-pars.dat' contain the step parameters in the order of Shift, Slide, Rise, Tilt, Roll and Twist.
'A-heli-pars.dat' and 'B-heli-pars.dat' contain the helical parameters in the order of x-displacement, y-displacement, Helical rise, Inclination, Tip, and Helical twist.

While the content of Table 3 is derived from NDB entries with only Watson-Crick base pairs in A- and B-DNA duplexes, it serves as a reference for identifying and quantifying non-canonical (mismatched) pairs by taking advantage of the base-pair parameters. This approach is rigorous in its description of the relative base geometry in a pair, and is distinct from, and complementary to, the Leontis-Westhof classification scheme.

General discussions (Q&As) / How to identify triplets, quadruplets and higher-order base associations

« on: August 16, 2012, 11:48:16 pm »

The find_pair -p option can find all base pairs and higher-order base associations. I implemented this option early on in 3DNA v1.x; yet in the 2003 Nucleic Acids Research paper and the corresponding find_pair -h output for v1.5, I deliberately omitted mentioning this functionality. I was hoping to further refine the algorithm/implementation, and to write up a detailed method paper on find_pair, a core 3DNA component. After leaving Rutgers nearly a decade ago, I've continuously maintained and refined 3DNA. However, for various reasons, up to now I've not been able to finish the long overdue 'technical' manuscript.

Over the years, numerous RNA structural bioinformatics resources have taken advantage of the functionality provided by find_pair; RNAView & BPS, two Rutgers-based tools, are based directly on early versions of the program. It was only in the 3DNA 2008 Nature Protocols paper that I first illustrated the functionality of the find_pair -p option, in the protocol "identification of higher-order base associations in ribosomal RNA". This post provides further detailed examples so 3DNA users can take better advantage of this still underused functionality, which is useful in RNA structure related applications.

Let's create a new directory (folder), named 'find_pair-p-examples', and change to that directory. Now the directory is empty (check with ls).

Code: [Select]

mkdir find_pair-p-examples
cd find_pair-p-examples
ls

As an example, here we use the crystal structure of an RNA tetraplex (UGAGGU)₄ with A-tetrads, G-tetrads, U-tetrads and G-U octads: NDB id: ur0023; PDB id: 1j6s (see figure below). The structure was solved by Sundaralingam et al. at 1.4 Ångstroms resolution [Structure. 2003 Jul; 11(7):815-23]. Its asymmetric unit contains 4 single chains. The NDB/PDB provides 4 biological assemblies, each consisting of 4 identical chains from the asymmetric unit. Download biological assembly 1 from the NDB (or the PDB, if you prefer; but notice the case difference in PDB id):

Code: [Select]

wget ftp://ndbserver.rutgers.edu/NDB/coordinates/na-biol/1j6s.pdb1

Run find_pair -p on '1j6s.pdb1'. Note the -all_model option; by default, 3DNA programs (such as find_pair) handle only the first model (structure) in a given PDB data file.

Code: [Select]

find_pair -p -all_model 1j6s.pdb1 1j6s.mbp

At the end of output file '1j6s.mbp', one can see the following identified multiplets: one octad and three tetrads:

Code: [Select]

    1: #8 [1]...1>A:...1_:[BRU]u + [2]...1>A:...2_:[..G]G + [47]...2>A:...1_:[BRU]u + [48]...2>A:...2_:[..G]G + [93]...3>A:...1_:[BRU]u + [94]...3>A:...2_:[..G]G + [139]...4>A:...1_:[BRU]u + [140]...4>A:...2_:[..G]G
    2: #4 [3]...1>A:...3_:[..A]A + [49]...2>A:...3_:[..A]A + [95]...3>A:...3_:[..A]A + [141]...4>A:...3_:[..A]A
    3: #4 [4]...1>A:...4_:[..G]G + [50]...2>A:...4_:[..G]G + [96]...3>A:...4_:[..G]G + [142]...4>A:...4_:[..G]G
    4: #4 [5]...1>A:...5_:[..G]G + [51]...2>A:...5_:[..G]G + [97]...3>A:...5_:[..G]G + [143]...4>A:...5_:[..G]G

Among other outputs, there is also a file named 'multiplets.pdb' which contains the atomic coordinates of the corresponding multiplets, each oriented in its most-extended view. The base multiplets can be extracted with ex_str and then converted to .r3d format for Raster3D or PyMol rendering (see also post "What can 3DNA do for RNA structures?" for more examples).

Code: [Select]

ex_str -1 multiplets.pdb oct.pdb
r3d_atom -od -r=0.1 -b=0.2 oct.pdb stdout | render -png > oct.png

ex_str -2 multiplets.pdb A-tetrad.pdb
r3d_atom -od -r=0.1 -b=0.2 A-tetrad.pdb stdout | render -png > A-tetrad.png

ex_str -3 multiplets.pdb G-tetrad.pdb
r3d_atom -od -r=0.1 -b=0.2 G-tetrad.pdb stdout | render -png > G-tetrad.png

The three png images ('oct.png', 'A-tetrad.png' and 'G-tetrad.png') as generated directly above are attached below.
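
For scripted processing, the multiplet summary lines at the end of '1j6s.mbp' can also be parsed directly. The Python sketch below is inferred only from the four lines shown earlier in this post: it assumes each member has the form `[serial]model>chain:resnum+icode:[resname]letter` with '.' as padding, and that the number before '>' is the model number. The function name `parse_multiplet` is hypothetical, and the layout may differ for other structures or 3DNA versions.

```python
import re

# "    2: #4 member + member + ..." -> multiplet index, member count, member list
HEADER = re.compile(r"^\s*(\d+):\s+#(\d+)\s+(.*)$")
# "[3]...1>A:...3_:[..A]A" -> serial, model, chain, residue number, icode, name, letter
MEMBER = re.compile(r"\[(\d+)\]\.*(\d*)>(.):\.*(-?\d+)(.):\[(.{3})\](.)")


def parse_multiplet(line):
    """Parse one multiplet summary line into (index, list of member dicts)."""
    header = HEADER.match(line)
    if header is None:
        raise ValueError("not a multiplet summary line")
    index, size, rest = int(header.group(1)), int(header.group(2)), header.group(3)
    members = []
    for item in rest.split(" + "):
        match = MEMBER.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"cannot parse member: {item!r}")
        serial, model, chain, resnum, icode, resname, letter = match.groups()
        members.append({
            "serial": int(serial),
            "model": int(model) if model else None,
            "chain": chain,
            "resnum": int(resnum),
            "resname": resname.strip("."),
            "letter": letter,
        })
    if len(members) != size:
        raise ValueError(f"expected {size} members, found {len(members)}")
    return index, members


# The A-tetrad line copied from the output above.
line = ("    2: #4 [3]...1>A:...3_:[..A]A + [49]...2>A:...3_:[..A]A"
        " + [95]...3>A:...3_:[..A]A + [141]...4>A:...3_:[..A]A")
index, members = parse_multiplet(line)
print(index, [(m["model"], m["chain"], m["resname"], m["resnum"]) for m in members])
```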

Site announcements / The forum is shaping up nicely

« on: April 24, 2012, 03:09:31 pm »

Ever since the new 3DNA forum was made public in early March, it has been shaping up quite nicely. As demonstrated by the statistics, the numbers of registrations and posts have increased significantly since then. The following table is a snapshot of the section "Forum History (using forum time offset)" (at the bottom of the statistics page) as of today, while I am writing this post:

Yearly Summary     New Topics  New Posts   New Members   Most Online
 2012                  39          163          114            11
    April 2012         7           60           44             9
    March 2012         13          55           61             11
    February 2012      13          38           7              6
    January 2012       6           10           2              6
 2011                  3           6            8              6

Note that most of the posts in February were composed by myself in preparing the forum for public release.

As made clear in my initial welcome message, the forum was created to make 3DNA-related discussions archived, searchable, and viewable to the public (without registration). With support from the community at large, enthusiastic users in particular, the 3DNA forum is functioning as well as expected -- thank you! As 3DNA enjoys the wider recognition it deserves, the forum is more than likely to become more active, and potentially to turn into "an online community for DNA/RNA structural bioinformatics."

I'd like to emphasize again that any 3DNA-related questions are welcome and should be directed to this 3DNA forum. As always, I strive to provide a prompt and concrete response to each and every question posted here. No email or private forum message, please -- by asking your questions on the public 3DNA Forum, you are benefiting not only yourself but also the whole user community.

Xiang-Jun

FAQs / How to calculate DNA bending angle?

« on: March 21, 2012, 02:10:48 pm »

DNA bending angle is a frequently used parameter in the literature, often associated with DNA-protein complexes. Nevertheless, 3DNA does not provide a direct measure of the "bending angle" in its output file of structural parameters. The topic is more subtle and complicated than it appears.

On its face, an angle is defined by two vectors; let's call them a and b, and if each is normalized, then the angle (in degrees) between them is: acos(dot(a, b)) * 180/pi. Geometrically, after moving the tails of the two vectors into the same position (e.g., origin), the heads would normally define a plane, unless a and b are strictly parallel (0°) or anti-parallel (180°).
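
The formula above can be written out as a small Python sketch. This is only an illustration of acos(dot(a, b)) * 180/pi, not code from 3DNA; the function names are made up, and the dot product is clamped to [-1, 1] so that rounding errors cannot push it outside the domain of acos.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / length for c in v)


def angle_deg(a, b):
    """Angle in degrees between vectors a and b (0 = parallel, 180 = anti-parallel)."""
    a, b = normalize(a), normalize(b)
    cos_angle = sum(x * y for x, y in zip(a, b))
    # Guard against values such as 1.0000000000000002 caused by rounding.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


print(angle_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```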

DNA structures are three-dimensional, normally far more complicated than a single number can quantify. The concept of DNA bending angle, as I understand it, is only applicable to DNA structures with two relatively straight fragments (as in CAP-DNA complexes). Under such situations, one can fit a least-squares (LS) linear helical axis to each of the two fragments, and calculate the angle between them. Towards this end, 3DNA outputs the following section when it judges that the input structure is not strongly curved. Using 355d/bdl084, which is distributed with 3DNA, as an example:

Code: [Select]

Global linear helical axis defined by equivalent C1' and RN9/YN1 atom pairs
Deviation from regular linear helix: 3.30(0.52)
Helix:    -0.127  -0.275  -0.953
HETATM 9998  XS    X X 999      17.536  25.713  25.665
HETATM 9999  XE    X X 999      12.911  15.677  -9.080
Average and standard deviation of helix radius:
P: 9.42(0.82), O4': 6.37(0.85),  C1': 5.85(0.86)

Here the Helix: line gives the normalized vector along the "best-fit" helical axis. The two HETATM records provide the two end points of the helix, and they are directly related to the Helix: line by a simple equation. Following the above example, we have (Octave/Matlab code):

Code: [Select]

XE = [12.911  15.677  -9.080];
XS = [17.536  25.713  25.665];

dd = XE - XS
%   -4.6250  -10.0360  -34.7450

Helix = dd / norm(dd)
%  -0.12685  -0.27526  -0.95296  ==> [-0.127  -0.275  -0.953]

With the two HETATM records, one can easily add them into the original PDB file to display the helical axis using a molecular graphics program (e.g., RasMol, Jmol or PyMOL). Moreover, the two helix vectors can be used to reorient the original PDB structure into a view so that one helical fragment lies along the x-axis, and the other in the xy-plane. As documented in detail in recipe #4 on "Automatic identification of double-helical regions in a DNA–RNA junction" of the 2008 3DNA Nature Protocols paper, "The chosen view allows for easy visualization and protractor measurement of the overall bending angle between the two relatively straight helices."
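
One way to construct such a view is sketched below in Python. It is an illustration of the geometry only, not the procedure used by 3DNA or the Nature Protocols recipe: the x-axis is taken along the first helix vector, the z-axis along the cross product of the two vectors, and the y-axis completes the right-handed frame. The second helix vector `h2` is a made-up value for demonstration, since the 355d example provides only one axis.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    length = math.sqrt(dot(v, v))
    if length < 1e-12:
        raise ValueError("zero-length vector (are the two axes parallel?)")
    return tuple(c / length for c in v)


def bending_frame(h1, h2):
    """Return axes (x, y, z): x along h1, with h2 lying in the xy-plane."""
    x = unit(h1)
    z = unit(cross(h1, h2))   # normal to the plane spanned by h1 and h2
    y = cross(z, x)           # completes a right-handed frame
    return x, y, z


def to_frame(v, frame):
    """Components of vector v along the three frame axes."""
    return tuple(dot(v, axis) for axis in frame)


h1 = (-0.127, -0.275, -0.953)  # Helix: line of the 355d example above
h2 = (0.0, 0.0, -1.0)          # hypothetical axis of a second fragment

frame = bending_frame(h1, h2)
h1_new = to_frame(unit(h1), frame)
h2_new = to_frame(unit(h2), frame)

# In the new view the bending angle is a plain in-plane angle.
bend = math.degrees(math.atan2(h2_new[1], h2_new[0]))
print(h1_new, h2_new, bend)
```

The same `to_frame` projection, applied to each atom position relative to a chosen origin, would reorient a whole structure into this view.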

The following points are well worth noting:

The LS fitting procedure used in 3DNA follows SCHNAaP, which was based on the algorithm in the well-known NewHelix program, maintained by Dr. Richard Dickerson up to the 1990s. Fitting a global linear helical axis to a strongly curved DNA structure makes no sense, and neither do the parameters derived from it (NewHelix itself has been replaced by FreeHelix, also from Dickerson). However, I do believe it is meaningful to fit a linear helix to a relatively straight DNA fragment. That's why I have kept this functionality in SCHNAaP and 3DNA. The 3DNA bending angle calculation illustrates the point: it provides an "intuitive" way for biologists to understand how the bending angle is calculated, and the angle can actually be measured directly.
Instead of directly LS-fitting a linear helical axis with 3DNA, one can alternatively superimpose a regular fiber model onto the DNA fragment, and then derive the straight helical axis from the fitted coordinates. The two approaches normally give slightly different numerical values, as would be expected.
Overall, bending angle is (at most) an approximate measure of DNA curvature. In my opinion, the concept is only applicable for comparing a set of structures, each with two relatively straight helical fragments. Even in such cases, the relative spatial relationship between two segments is more complicated than a simple (bending) angle could quantify. Be watchful: do not exaggerate the significance of small variations in bending angle.

FAQs / How to handle modified (uncommon) bases?

« on: March 20, 2012, 09:40:19 pm »

In 3DNA, modified bases are mapped to their standard counterparts, e.g. 5‐iodouracil (5IU) to uracil (U) and 1‐methyladenine (1MA) to adenine (A), and are designated with lower case letters (as u and a respectively for the examples cited above). Technically, the mapping is stored in file $X3DNA/config/baselist.dat, and looks like this:

Code: [Select]

  A     A
 DA     A
ADE     A
....
5IU     u      # I connected to C5
....
1MA     a      # C connected to N1

Each mapped one-letter base (X = A/C/G/T/U for the standard nucleotides and x = a/c/g/t/u for the modified ones) has a corresponding Atomic_X.pdb (or Atomic.x.pdb) file oriented in the standard base reference frame. By default, the two sets (X and x) are identical, i.e., Atomic_A.pdb has the same content as Atomic.a.pdb. The mapping information is used in a ls-fitting procedure to define the base reference frame for each nucleotide in a PDB file, and allows for easy analysis of unusual DNA and RNA structures.
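
To make the mapping concrete, here is a minimal Python sketch that reads lines in the two-column layout shown above and derives the name of the matching reference file. It is based only on the excerpt in this post, so the real baselist.dat may contain details this does not handle; the function names are hypothetical.

```python
def parse_baselist(text):
    """Map residue names to one-letter codes from baselist.dat-style text."""
    mapping = {}
    for raw in text.splitlines():
        # Drop trailing comments, then expect exactly: NAME  one-letter-code
        fields = raw.split("#", 1)[0].split()
        if len(fields) != 2 or len(fields[1]) != 1:
            continue  # skips blank lines and the "...." elision markers
        mapping[fields[0]] = fields[1]
    return mapping


def atomic_file(letter):
    """Reference-frame file name: Atomic_X.pdb (standard) or Atomic.x.pdb (modified)."""
    if letter.isupper():
        return f"Atomic_{letter}.pdb"
    return f"Atomic.{letter}.pdb"


# Excerpt copied from the listing above.
excerpt = """\
  A     A
 DA     A
ADE     A
....
5IU     u      # I connected to C5
....
1MA     a      # C connected to N1
"""

mapping = parse_baselist(excerpt)
for name in ("DA", "5IU", "1MA"):
    letter = mapping[name]
    kind = "modified" if letter.islower() else "standard"
    print(name, letter, kind, atomic_file(letter))
```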

As of v2.1, when encountering a new modified base, 3DNA automatically performs the mapping and outputs the following message (using a contrived example):

Code: [Select]

Match '2MG' to 'g' for residue 2MG   10  on chain A [#1]
    check it & consider to add line '2MG     g' to file <baselist.dat>

Simply add a line containing 2MG g to file baselist.dat, and the above info message will be gone. This is a contrived example because I deliberately deleted that line from baselist.dat for this illustration.

I implemented this auto-mapping as an experimental feature at least as far back as v1.5, but did not document it for public use. My experience over the years has shown that the auto-mapping is functioning as designed. Now with this feature set by default, processing of large datasets can be fully automated. Moreover, using find_pair, it is easy to get a complete list of modified bases in a dataset, e.g., in all the NDB entries.

FAQs / How to fix missing (superfluous) base pairs identified by find_pair?

« on: March 20, 2012, 05:38:27 pm »

Structural analysis of nucleic acids used to be a rather tedious process, especially for irregular, complicated RNA structures and nucleic acid-protein complexes (e.g., the large ribosomal subunit 1jj2/rr0033). Without valid base-pairing information as input, the various analysis software will produce meaningless results. The program find_pair was originally created to solve this specific problem, by generating an input file for the 3DNA analysis routines (analyze/cehs) directly from a PDB file.

At its core, find_pair uses a purely geometric approach to identify all possible pairs (Watson-Crick or non-canonical) that actually exist in a structure, along with their H-bonding patterns and helical context. Specifically, the major criteria used are as follows:

The distance between the origins of the two bases (as defined by their standard reference frames) must be less than a certain limit (15.0 Å by default); otherwise, they would be too far apart to be called a pair.
The vertical separation (i.e., stagger) between the two base planes must be less than a certain limit (2.5 Å by default); otherwise, they would be stacking instead of pairing.
The angle between the two base z-axes (i.e., their normal vectors) must be less than a cut-off (65.0° by default).
There must be at least one pair of nitrogen/oxygen base atoms within an H-bonding cut-off distance (4.0 Å by default).
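
The four criteria above can be sketched as a single test in Python. This is a simplified illustration rather than the find_pair implementation: it assumes that each base is already described by its frame origin, its z-axis and the coordinates of its N/O base atoms, that the z-axis angle is taken as the acute angle between the two normals, and that the vertical separation is the origin-to-origin displacement projected onto the mean normal. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class BaseFrame:
    origin: tuple    # origin of the base reference frame
    z_axis: tuple    # base normal (need not be normalized)
    no_atoms: list   # coordinates of the base N/O atoms


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    length = math.sqrt(_dot(v, v))
    return tuple(c / length for c in v)


def meets_pair_criteria(b1, b2, max_origin_dist=15.0, max_stagger=2.5,
                        max_z_angle=65.0, max_hb_dist=4.0):
    """Apply the four geometric criteria with the default cut-offs quoted above."""
    # 1. Distance between the two base origins.
    d = tuple(q - p for p, q in zip(b1.origin, b2.origin))
    if math.sqrt(_dot(d, d)) >= max_origin_dist:
        return False

    # 3. Acute angle between the base normals (assumption: sign is ignored).
    z1, z2 = _unit(b1.z_axis), _unit(b2.z_axis)
    cos_angle = _dot(z1, z2)
    if cos_angle < 0.0:
        z2 = tuple(-c for c in z2)
        cos_angle = -cos_angle
    if math.degrees(math.acos(min(1.0, cos_angle))) >= max_z_angle:
        return False

    # 2. Vertical separation: displacement projected on the mean normal (assumption).
    mean_normal = _unit(tuple(p + q for p, q in zip(z1, z2)))
    if abs(_dot(d, mean_normal)) >= max_stagger:
        return False

    # 4. At least one N/O atom pair within the H-bonding cut-off.
    return any(math.dist(p, q) < max_hb_dist
               for p in b1.no_atoms for q in b2.no_atoms)


# Made-up coordinates, for demonstration only.
base1 = BaseFrame((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), [(1.0, 0.0, 0.0)])
base2 = BaseFrame((5.0, 0.0, 0.5), (0.0, 0.0, -1.0), [(3.9, 0.0, 0.5)])
print(meets_pair_criteria(base1, base2))
```

Passing a larger `max_hb_dist` to the function plays the same role as raising the H-bonding cut-off in the parameter file.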

If two bases fulfill these geometric requirements, they are defined to be a pair, without taking their chemical constituents into consideration. Thus our method allows for identification of unconventional pairs as easily as the canonical ones. The program then checks for possible H-bonding patterns, whether the normal donor-acceptor (noted by '-' as in O6 - N4 for a G·C pair) or the unusual donor-donor, acceptor-acceptor (noted by '*' as in O2 * N3 for a C·C pair in urx057). The non-canonical pairs, especially those with unusual H-bonding patterns, should be checked more carefully - they could be due to errors in structure determination, or they could have some special meaning/significance unnoticed previously.

The default criteria mentioned above are based on a survey of the NDB structures. Generally speaking, they are pretty generous and work quite well in the most common cases we've encountered. However, we are aware of special cases where some of them might be too restrictive or too generous, leading find_pair to miss base pairs or to produce superfluous ones. The default settings are stored in a text file named misc_3dna.par under the directory $X3DNA/config/, where users can modify them as they see fit. Changes in that directory will have a global effect - wherever you run find_pair on your system, the modified values will be used. Alternatively, users can make a copy of misc_3dna.par in their current working directory and change it there for local effect. Note that the local setting has precedence over the global one.
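
The precedence rule can be pictured with a short Python sketch that looks for the parameter file first in the working directory and then under $X3DNA/config. It only mirrors the lookup order described above and is not taken from the 3DNA source; the function name `locate_par_file` and its arguments are hypothetical.

```python
import os
from pathlib import Path


def locate_par_file(name="misc_3dna.par", cwd=None, x3dna_home=None):
    """Return the parameter file that would take effect, or None if none exists."""
    work_dir = Path(cwd) if cwd is not None else Path.cwd()
    home = x3dna_home if x3dna_home is not None else os.environ.get("X3DNA")

    candidates = [work_dir / name]                     # local copy wins
    if home:
        candidates.append(Path(home) / "config" / name)  # global default

    for path in candidates:
        if path.is_file():
            return path
    return None


print(locate_par_file())
```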

As an example, find_pair will miss the 127th base-pair I:..53_:[.DT]T-----A[.DA]:.-53_:J in structure 1kx5/pd0287 in its default settings. This is because the H-bonding distance between T:N3 - A:N1 is 4.20 Å and that for T:O4 - A:N6 is 4.85 Å; both of them are larger than the default 4.0 Å cut off. Increasing the H-bonding criterion in file misc_3dna.par from 4.0 Å to 5.0 Å will solve this problem. Please note that in 3DNA, users can start directly from an uncompressed PDB file, without having to extract the DNA fragment first:

find_pair 1kx5.pdb 1kx5.inp to get input file for analyze
analyze 1kx5.inp to get detailed structural parameters in file 1kx5.out
The above two steps can be combined into one: find_pair 1kx5.pdb stdout | analyze stdin

In addition to (or instead of) manipulating parameters in misc_3dna.par, oftentimes it may be preferable to manually edit find_pair-generated base-pair files before feeding them into analyze/cehs. This allows for maximum flexibility as to which pairs to consider in calculating 3DNA structural parameters.

Also worth noting is the -p option of find_pair: without this option, find_pair locates base pairs in double-helical regions; thus the Watson-Crick pairs take precedence over the wobble and other non-canonical pairs. With the -p option, all pairs and higher-order base associations (i.e., triplets and above) are detected.


FAQs / How do I build nucleic acid structures with sugar-phosphate backbone?

« on: March 20, 2012, 03:03:06 pm »

The easiest way to build a nucleic acid structure with the sugar-phosphate backbone, other than predefined fiber models, is to use the rebuild program. The backbone building scheme uses exactly the same protocol as the default for base-only model. The user needs to add the -atomic option to rebuild, and to choose the desired rigid sugar-phosphate backbone to be attached to the standard base geometry.

The four types of currently available backbone conformations are listed in the directory $X3DNA/config/atomic. To use any of these backbones, it is necessary to copy the standard nucleotide files associated with each type of backbone to $X3DNA/config or your current working directory, and to name each nucleotide as follows: Atomic_X.pdb (where X = A, C, G, T, U; or Atomic.x.pdb where x = a, c, g, t, u for modified bases). The default Atomic_X.pdb files contain only the C1' backbone atom, and the base geometry is independent of the backbone conformation.

To build a DNA structure with B-DNA backbone conformation, for example, one uses the BDNA_X.pdb set to replace Atomic_X.pdb. There is a sub-command cp_std of the Ruby utility program x3dna_utils to help with this: x3dna_utils cp_std BDNA. This will copy BDNA_X.pdb to the current working directory and rename it Atomic_X.pdb. Please note that rebuild searches for Atomic_X.pdb files first in the current working directory, and then in $X3DNA/config.
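
The lookup order described above (current working directory first, then $X3DNA/config) can be illustrated with a small shell sketch. The function find_atomic_file is a hypothetical helper written only to mimic that order from the outside; it is not part of 3DNA and says nothing about how rebuild is implemented internally.

```shell
# find_atomic_file is a hypothetical helper, not a 3DNA command.
# It reports which Atomic_X.pdb file would be picked up under the
# search order described above: working directory, then $X3DNA/config.
find_atomic_file() {
    name="Atomic_$1.pdb"          # $1 is one of A, C, G, T, U
    if [ -f "./$name" ]; then
        echo "./$name"
    elif [ -n "${X3DNA:-}" ] && [ -f "$X3DNA/config/$name" ]; then
        echo "$X3DNA/config/$name"
    else
        return 1                  # not found in either location
    fi
}

for b in A C G T U; do
    if path=$(find_atomic_file "$b"); then
        echo "Atomic_$b.pdb -> $path"
    else
        echo "Atomic_$b.pdb -> not found"
    fi
done
```

Running this before and after x3dna_utils cp_std BDNA is a quick way to check whether the working-directory copies are in place.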

To make the above description clear, here is an example. Go to the directory $X3DNA/examples/analyze_rebuild, and try to reproduce the following:

use the command, x3dna_utils cp_std BDNA, so that you will have Atomic_X.pdb files
use find_pair bdl084.pdb | analyze, to analyze the structure bdl084 (355d) and to generate a file named bp_step.par
use rebuild -atomic bp_step.par bdl084_3dna.pdb, to generate the PDB file bdl084_3dna.pdb with a standard B-backbone
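
The three steps above can also be chained in a small shell function. This is only a sketch: the command lines are the ones quoted in the steps, while the wrapper name rebuild_with_b_backbone and the output naming rule (input name with _3dna appended) are assumptions made for illustration.

```shell
# rebuild_with_b_backbone is a hypothetical wrapper, not part of 3DNA.
# It runs the three steps listed above for one PDB file.
rebuild_with_b_backbone() {
    pdb="$1"                       # e.g. bdl084.pdb
    out="${pdb%.pdb}_3dna.pdb"     # e.g. bdl084_3dna.pdb

    # Step 1: copy BDNA_X.pdb to the working directory as Atomic_X.pdb.
    x3dna_utils cp_std BDNA || return 1
    # Step 2: analyze the structure; this generates bp_step.par.
    # (Only the exit status of analyze is checked here.)
    find_pair "$pdb" | analyze || return 1
    # Step 3: rebuild the structure with the sugar-phosphate backbone.
    rebuild -atomic bp_step.par "$out" || return 1
    echo "$out"
}

# Run only when the 3DNA tools are on PATH and the example file exists.
if command -v rebuild >/dev/null 2>&1 && [ -f bdl084.pdb ]; then
    rebuild_with_b_backbone bdl084.pdb
fi
```

In $X3DNA/examples/analyze_rebuild, calling rebuild_with_b_backbone bdl084.pdb should leave bdl084_3dna.pdb in the current directory, assuming each 3DNA command succeeds.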

The RMSD between all atoms of the original bdl084.pdb file and the generated bdl084_3dna.pdb file is only 0.73 Å. Please note that in the rebuilt bdl084_3dna.pdb file, some O3'(i-1) to P(i) linkages can be quite long (broken). This structure, however, serves well as a starting point for further energy minimization. See post "Restraint optimization of DNA backbone geometry using PHENIX" for how to regularize the overlong bonds.


Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.
