2012 -- 601
2013 -- 667
2014 -- 696
2015 -- 598
Insight into the three-dimensional architecture of RNA is essential for understanding its cellular functions. However, even the classic transfer RNA structure contains features that are overlooked by existing bioinformatics tools. Here we present DSSR (Dissecting the Spatial Structure of RNA), an integrated and automated tool for analyzing and annotating RNA tertiary structures. The software identifies canonical and noncanonical base pairs, including those with modified nucleotides, in any tautomeric or protonation state. DSSR detects higher-order coplanar base associations, termed multiplets. It finds arrays of stacked pairs, classifies them by base-pair identity and backbone connectivity, and distinguishes a stem of covalently connected canonical pairs from a helix of stacked pairs of arbitrary type/linkage. DSSR identifies coaxial stacking of multiple stems within a single helix and lists isolated canonical pairs that lie outside of a stem. The program characterizes ‘closed’ loops of various types (hairpin, bulge, internal, and junction loops) and pseudoknots of arbitrary complexity. Notably, DSSR employs isolated pairs and the ends of stems, whether pseudoknotted or not, to define junction loops. This new, inclusive definition provides a novel perspective on the spatial organization of RNA. Tests on all nucleic acid structures in the Protein Data Bank confirm the efficiency and robustness of the software, and applications to representative RNA molecules illustrate its unique features. DSSR and related materials are freely available at http://x3dna.org/.
Figure 6: DSSR applies to RNA-DNA hybrid structures, such as the CRISPR Cas9-sgRNA-DNA ternary complex (chains B and C, PDB id: 4oo8 (47)). (A) The software identifies five helices (depicted by gray lines) and six stems (annotated) in the structure. The longest helix includes the RNA-DNA hybrid duplex (S1, depicted by intertwined gold-red backbone tubes) and the repeat:anti-repeat RNA stem (S2). (B) The secondary structure diagram, derived using DSSR, shows that the hybrid structure does not form a ‘closed’ junction loop. DSSR classifies the CUAG hairpin loop as a diloop (instead of a tetraloop) because the C and G form a Watson-Crick pair that closes the loop, leaving only a two-nucleotide (UA) loop segment. (C) Comparison of the CUAG diloop (center) with the UUGA diloop from a yeast Vts1p-RNA hairpin complex (referred to as part of a pentaloop(59), left) shows the remarkable similarity between the two loops despite the large difference in their base sequences. The CUAG diloop also shares common features with the NMR solution structure of the classic CUUG diloop(60) (often called a tetraloop, right), including the flipped out second position U and the stacking of the closing C–G pair over a neighboring G–C pair. The diloops differ, however, in terms of the inter-pair twist angle at the GpC dinucleotide step. These three images are oriented in the frames of the purines stacked above the terminal nucleotides (A9-left; G58-middle; G8-right) with the minor-groove edges facing the viewer.
Figure 5: DSSR pinpoints a linchpin-like U64–A85 pair that is shared by a four-way and a five-way junction loop in the S-adenosyl methionine I riboswitch (PDB id: 2gis (45)). (A) DSSR identifies two junction loops (right): a [4,0,3,0] four-way junction loop (red) and a [1,0,2,0,0] five-way junction loop (blue), which share a common side, i.e., the isolated U64–A85 pair (left). (B) The linear secondary structure diagram, annotated with DSSR-derived dot-bracket notation, depicts the pathways of the two junction loops. The four-way loop runs from C8 (*), follows the red arrows to the right, and returns along the outer G86→C8 arc. The five-way loop starts at G23 (*), moves to the right following the blue arrows along two arcs (C25→G68 and C69→G82), and returns to the start along three arcs (A85→U64, C65→G28, C29→G23). Note that the shared U64–A85 arc is traversed twice, from left to right along the four-way junction loop, and right to left along the five-way junction loop. (C) The U64–A85 pair is stabilized by base-stacking interactions in a way strikingly similar to the G2–C74 linchpin pair in the viral tRNA mimic (see Figure 3), and may also be regarded as a ‘linchpin’. These two images take advantage of unique visualization features within 3DNA/DSSR, including the capability to orient different molecules in a common frame (here, the frames of the linchpin pairs with the minor-groove edges facing the viewer) and to represent bases as color-coded rectangular blocks.
Figure 4: DSSR discloses complexity in the folding of the env22 twister ribozyme not apparent in the two-armed tertiary structure (chain A, PDB id: 4rge (43)). (A) The software automatically detects the long helical arm with five coaxially stacked stems and the short single-stemmed arm of the molecule. Failing to account for the pseudoknots within the structure leads to a characterization of the molecule very different from its real organization. When pseudoknots are omitted, the RNA appears to form a simplified [2,1,3] three-way junction as shown in both planar (B) and linear (C) secondary structure diagrams. In reality, the DSSR-derived dot-bracket notation points to a double-pseudoknotted structure (D) with two types of brackets distinguishing the pseudoknotted pairs (matched [] and {}), and uncovers a novel [4,2,2,0,1,3,0,0,1,1] ten-way junction loop (D,E). The junction, which can be traced by following the arrows along the red arcs and bases (starting from U3, marked with *) in D, contains both ends of four of the six stems and follows a supercoiled pathway in 3D (Supplementary Figure S5). In contrast, without consideration of pseudoknots (F), the junction forms a simple relaxed circle (Supplementary Figure S5). DSSR also detects three previously ignored base pairs that help to anchor the consecutive A-minor motifs reported in the literature (43) (G). U41 pairs with A42 and A43 through bifurcated hydrogen bonding, as well as with A26 (Supplementary Figure S4C,D). Moreover, U41 and A42 constitute a UpA dinucleotide platform, and in combination with G25 and A26, create a unique network of eight interacting nucleotides (G). All eight nucleotides are involved in the ten-way junction loop (labeled red in (E)).
Figure 3: DSSR reveals the striking global similarity and distinct local variations between the tRNA mimic from turnip yellow mosaic virus (PDB id: 4p5j (34)) and yeast tRNAPhe. (A) The viral tRNA mimic assumes an overall L-shaped tertiary structure (center) composed of two helices (gray lines). DSSR uncovers a [0,0,3,0,1] five-way junction loop (right) enabled by the hairpin-type pseudoknot at the 3′-end of the molecule and the G2–C74 linchpin pair. This critical linchpin is unique to the tRNA mimic, where it is stabilized by extensive base-stacking interactions (upper-left). The lower-left inset emphasizes the intricate interactions between the D- and T-loops in the mimic, including the three base pairs (within dashed ellipses) and the unique base triplet at the elbow (Supplementary Figure S3A). (B) The linear secondary structure diagram generated with the DSSR-derived dot-bracket notation shows the sequential location of the bases comprising the linchpin pair, the five-way junction loop (red), the G10–C49 pair at the elbow, and the hairpin-type pseudoknot. Note that the dashed arcs connecting the so-called first-order pseudoknotted pairs (indicated by matched []) do not cross each other along the linear sequence. The numbering of residues used here follows that in the PDB file, which is offset by two nucleotides from that given in the original publication (e.g., the G2–C74 linchpin is termed G4–C76 there).
Figure 1: Summary of steps used to identify nucleic acid structural components. (A) Nucleotides are recognized using standard atom names and base planarity. A base is taken as a pyrimidine (six-membered ring) unless it possesses one of three purine atoms (red). (B) Bases are assigned a standard reference frame independent of sequence: purines and pyrimidines (red) are symmetrically placed with respect to the sugar. (C) The standard base frame is derived from an idealized Watson-Crick base pair, where the x1, y1-axes of the sequence base align with the x2-, y2-axes of its complement (red) and define three base edges (Watson-Crick, minor groove, Major groove). (D) Base pairs are identified from the distance and coplanarity of base rings (highlighted by rectangular blocks with embedded reference frames and shaded minor-groove edges) and the occurrence of at least one hydrogen bond (dashed lines). (E) Helices are defined by base-stacking interactions. Whereas the two nearest neighbors of a terminal pair (black) lie on one side of the pair, those of a middle pair (red) lie on opposite sides. (F) Closed loops are delineated by the ends of stems and specified by the lengths of consecutive connecting loop segments. Here, the four-way junction (S1 to S4) is denoted [2,1,1,0] in terms of the loop nucleotides (white circles) running clockwise from S1 to S4. Arrows point from the 5′ to 3′ direction along each strand and dashed lines represent stem pairs.
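The rule in panel (A) can be sketched in a few lines of Python. This is only an illustration, not DSSR's actual code: the caption says "three purine atoms" without naming them, so the choice of N7, C8 and N9 (the ring atoms that exist only in the five-membered imidazole ring of a purine) is an assumption, and `classify_base` is a hypothetical helper name.

```python
# Assumption: the "three purine atoms" of Figure 1A are N7, C8 and N9,
# i.e. the ring atoms found in purines but not in pyrimidines.
PURINE_ONLY_ATOMS = {"N7", "C8", "N9"}


def classify_base(atom_names):
    """Return 'purine' if any purine-only ring atom is present,
    otherwise fall back to 'pyrimidine' (the default in the caption)."""
    names = {name.strip() for name in atom_names}
    return "purine" if names & PURINE_ONLY_ATOMS else "pyrimidine"


if __name__ == "__main__":
    guanine_ring = ["N9", "C8", "N7", "C5", "C6", "N1", "C2", "N3", "C4"]
    cytosine_ring = ["N1", "C2", "N3", "C4", "C5", "C6"]
    print(classify_base(guanine_ring))
    print(classify_base(cytosine_ring))
```

Requiring only one of the three atoms, rather than all of them, keeps the check tolerant of residues with missing atoms.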
ATOM 2 N9 G A 1 -1.289 4.551 0.000
ATOM 3 C8 G A 1 0.023 4.962 0.000
ATOM 4 N7 G A 1 0.870 3.969 0.000
ATOM 5 C5 G A 1 0.071 2.833 0.000
ATOM 6 C6 G A 1 0.424 1.460 0.000
ATOM 8 N1 G A 1 -0.700 0.641 0.000
ATOM 9 C2 G A 1 -1.999 1.087 0.000
ATOM 11 N3 G A 1 -2.342 2.364 0.001
ATOM 12 C4 G A 1 -1.265 3.177 0.000
Figure 2: DSSR captures well-known features and provides a new perspective on the classic yeast tRNAPhe structure (PDB id: 1ehz (46)). (A) The software automatically detects the four stems and the two helices that form the L-shaped molecule, depicted here in cartoon-block representation (center). Whereas the helices may include all types of base pairs and backbone breaks, the stems comprise only canonical pairs with continuous backbones. Note the coaxial stacking of the D and anti-codon stems and the noncanonical features of the composite helix (represented by a gray line, left). The red ‘circle’, overlaid on the central image and detailed to the right, reveals the 3D pathway along the [2,1,5,0] four-way junction loop. (B) The dot-bracket notation derived by DSSR serves as input for the depicted linear (arc) representation of secondary structure. The bases comprising the four-way junction loop (red) run in sequential order from U7 (*) following the arrows to the right and returning along the outer A66→U7 arc. The pseudoknotted G19–C56 pair (with matched []) is noted by the dashed arc. (C) Both the four-way junction (red) and the three hairpin loops follow ‘circular’ routes within the traditional cloverleaf representation of tRNA. Here the 14 modified nucleotides are represented by three-letter codes. The 3D images were created using PyMOL (A-red; C-yellow; G-green; T-blue; U-cyan; pseudouridine P-gray), the 2D diagrams using VARNA, and the annotations using Inkscape.
helix#1[2] bps=15
strand-1 5'-GCGGAUUcUGUGtPC-3'
bp-type ||||||||||||..|
strand-2 3'-CGCUUAAGACACaGG-5'
helix-form AA....xAAAAxx.
helical-rise: 3.00(0.90) *
helical-radius: 8.88(1.77) *
helical-axis: 0.617 0.739 -0.269 *
helix#2[2] bps=15
strand-1 5'-AAPcUGGAgCUCAGu-3'
bp-type ...||||.||||...
strand-2 3'-UcAGACCgCGAGUCU-5'
helix-form x..AAAAxAA.xxx
helical-rise: 3.07(1.12) *
helical-radius: 8.89(2.35) *
helical-axis: 0.071 0.444 0.893 *
1 # x-, y-, z-axes row-rise
0.000 0.000 0.000 # translation
0.617 0.739 -0.269 # h1
0.071 0.444 0.893 # h2
0.000 0.000 1.000 # z: can be anything
by rotation y 180
by rotation x 180
The transformed PDB coordinate file 1ehz-ok.pdb is the starting point of all the following illustrations.
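As a quick sanity check on the two helical-axis vectors listed above (h1 and h2), the angle between them can be computed directly. The sketch below uses only the Python standard library; `angle_deg` is a hypothetical helper, and the vectors are copied from the helix summaries rather than recomputed from coordinates.

```python
import math

# Helical-axis direction vectors as printed for helix#1 and helix#2.
H1 = (0.617, 0.739, -0.269)
H2 = (0.071, 0.444, 0.893)


def angle_deg(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))


if __name__ == "__main__":
    print(f"angle between helix axes: {angle_deg(H1, H2):.1f} degrees")
```

The result is a little over 82 degrees, which is consistent with the two helices forming the arms of an L-shaped molecule.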
REMARK-DSSR: helix#1
ATOM 1 P1 G A 1 -50.221 -58.766 28.361 1.00 99.85 H1 P
REMARK-DSSR: helix#1
ATOM 2 P2 C A 56 -92.115 -58.758 28.363 1.00 37.81 H1 P
REMARK-DSSR: helix#2
ATOM 3 P1 A A 36 -70.051 -7.424 32.844 1.00 81.67 H2 P
REMARK-DSSR: helix#2
HETATM 4 P2 H2U A 16 -75.673 -49.918 32.841 1.00 64.01 H2 P
CONECT 1 2
CONECT 2 1
CONECT 3 4
CONECT 4 3
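The four pseudo-atom records above can be cross-checked against the helix summaries. The sketch below assumes (this is an interpretation, not something stated in the post) that P1 and P2 mark the two ends of each helix axis, so their separation should be close to the helical rise multiplied by the number of base-pair steps. Coordinates and rise values are copied from the listings; `axis_length` is a hypothetical helper.

```python
import math

# End points copied from the REMARK-DSSR helix#1 / helix#2 records.
HELIX_ENDS = {
    1: ((-50.221, -58.766, 28.361), (-92.115, -58.758, 28.363)),
    2: ((-70.051, -7.424, 32.844), (-75.673, -49.918, 32.841)),
}
# (number of base pairs, mean helical rise in angstroms) from the summaries.
HELIX_STATS = {1: (15, 3.00), 2: (15, 3.07)}


def axis_length(p1, p2):
    """Euclidean distance between the two axis end points."""
    return math.dist(p1, p2)


if __name__ == "__main__":
    for helix, (p1, p2) in HELIX_ENDS.items():
        bps, rise = HELIX_STATS[helix]
        expected = (bps - 1) * rise  # 15 pairs give 14 steps
        print(f"helix#{helix}: axis {axis_length(p1, p2):.2f} A, "
              f"rise x steps {expected:.2f} A")
```

Under that assumption the two numbers agree to within about 0.1 Å for both helices. The coordinates also show that, in the transformed file, helix#1 runs almost exactly along x and helix#2 mostly along y.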
MODEL 1
REMARK model=1 nts=16
REMARK 4-way junction: nts=16; [2,1,5,0]; linked by [#1,#2,#3,#4]
ATOM 1 C1' U A 7 -65.936 -49.847 29.027 1.00 37.23 C
ATOM 2 C1' U A 8 -72.670 -44.818 30.530 1.00 30.28 C
ATOM 3 C1' A A 9 -72.606 -37.344 27.403 1.00 28.79 C
HETATM 4 C1' 2MG A 10 -66.888 -33.680 24.426 1.00 44.62 C
ATOM 5 C1' C A 25 -66.556 -29.785 34.413 1.00 51.93 C
HETATM 6 C1' M2G A 26 -66.983 -28.143 29.356 1.00 46.92 C
ATOM 7 C1' C A 27 -70.138 -25.556 25.591 1.00 48.68 C
ATOM 8 C1' G A 43 -80.779 -25.396 27.582 1.00 46.94 C
ATOM 9 C1' A A 44 -78.474 -28.381 24.234 1.00 54.14 C
ATOM 10 C1' G A 45 -75.498 -32.895 24.403 1.00 45.24 C
HETATM 11 C1' 7MG A 46 -76.230 -40.483 24.555 1.00 39.69 C
ATOM 12 C1' U A 47 -74.362 -46.762 19.557 1.00 50.55 C
ATOM 13 C1' C A 48 -75.266 -47.135 28.377 1.00 27.98 C
HETATM 14 C1' 5MC A 49 -68.564 -51.174 23.872 1.00 33.10 C
ATOM 15 C1' G A 65 -67.234 -61.378 20.695 1.00 42.23 C
ATOM 16 C1' A A 66 -64.217 -56.459 21.032 1.00 40.50 C
CONECT 1 16 2
CONECT 2 1 3
CONECT 3 2 4
CONECT 4 3 5
CONECT 5 4 6
CONECT 6 5 7
CONECT 7 6 8
CONECT 8 7 9
CONECT 9 8 10
CONECT 10 9 11
CONECT 11 10 12
CONECT 12 11 13
CONECT 13 12 14
CONECT 14 13 15
CONECT 15 14 16
CONECT 16 15 1
ENDMDL
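The loop descriptor in the REMARK line above (`nts=16; [2,1,5,0]; linked by [#1,#2,#3,#4]`) is easy to parse and check. The sketch below is a hypothetical helper, not part of DSSR. It relies on the convention explained in the DSSR output: the bracketed numbers are the lengths of the connecting loop segments, and the total count also includes the closing nucleotides, two for each bracketed segment (one closing pair per stem).

```python
import re

LOOP_RE = re.compile(r"nts=(\d+);\s*\[([\d,]+)\];\s*linked by \[([^\]]*)\]")


def parse_loop(line):
    """Extract (total nts, segment lengths, linked stem ids) from a loop line."""
    match = LOOP_RE.search(line)
    if match is None:
        raise ValueError(f"not a loop descriptor: {line!r}")
    total = int(match.group(1))
    segments = [int(x) for x in match.group(2).split(",")]
    links = match.group(3).split(",")
    return total, segments, links


def is_consistent(total, segments):
    """Total = loop-segment nts + two closing nts per segment."""
    return total == sum(segments) + 2 * len(segments)


if __name__ == "__main__":
    line = "REMARK 4-way junction: nts=16; [2,1,5,0]; linked by [#1,#2,#3,#4]"
    total, segments, links = parse_loop(line)
    print(total, segments, links, is_consistent(total, segments))
```

For the four-way junction, 2 + 1 + 5 + 0 = 8 loop nucleotides plus 8 closing nucleotides gives the 16 C1' atoms listed in the model.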
The PyMOL options -qkc are used to generate the file 1ehz-ok-jct-pymol.png from the command line. Note the extra white space around the image (see below).
load 1ehz-ok-jct.pdb, jct
hide everything, jct
set sphere_color, white, jct
set sphere_scale, 0.36, jct
show spheres, jct
set stick_radius, 0.3, jct
set stick_color, red, jct
set stick_transparency, 0.46, jct
show sticks, jct
# -----------------------------------------
bg_color white
util.cbaw
set sphere_quality, 4
set stick_quality, 16
# PyMOL FAQ recommendations
set depth_cue, 0
set ray_trace_fog, 0
set ray_shadow, off
set orthoscopic, 1
set antialias, 1
# cannot be: zoom complete, 1
zoom complete=1
# -----------------------------------------
ray 1800
png 1ehz-ok-jct-pymol.png
x3dna-dssr -i=1ehz.pdb --u-turn --non-pair --po4 -o=1ehz.out
****************************************************************************
DSSR: an Integrated Software Tool for
Dissecting the Spatial Structure of RNA
v1.2.8-2015jun15 by xiangjun@x3dna.org
This program is being actively maintained and developed. As always,
I greatly appreciate your feedback! Please report all DSSR-related
issues on the 3DNA Forum (forum.x3dna.org). I strive to respond
*promptly* to *any questions* posted there.
****************************************************************************
Note: Each nucleotide is identified by model:chainId.name#, where the
'model:' portion is omitted if no model number is available (as
is often the case for x-ray crystal structures in the PDB). So a
common example would be B.A1689, meaning adenosine #1689 on
chain B. One-letter base names for modified nucleotides are put
in lower case (e.g., 'c' for 5MC). For further information about
the output notation, please refer to the DSSR User Manual.
Questions and suggestions are always welcome on the 3DNA Forum.
Command: x3dna-dssr -i=1ehz.pdb --u-turn --non-pair --po4 -o=1ehz.out
Date and time: Mon Jun 15 02:58:49 2015
File name: 1ehz.pdb
no. of DNA/RNA chains: 1 [A=76]
no. of nucleotides: 76
no. of atoms: 1821
no. of waters: 160
no. of metals: 9 [Mg=6,Mn=3]
****************************************************************************
List of 11 types of 14 modified nucleotides
nt count list
1 1MA-a 1 A.1MA58
2 2MG-g 1 A.2MG10
3 5MC-c 2 A.5MC40,A.5MC49
4 5MU-t 1 A.5MU54
5 7MG-g 1 A.7MG46
6 H2U-u 2 A.H2U16,A.H2U17
7 M2G-g 1 A.M2G26
8 OMC-c 1 A.OMC32
9 OMG-g 1 A.OMG34
10 PSU-P 2 A.PSU39,A.PSU55
11 YYG-g 1 A.YYG37
****************************************************************************
List of 34 base pairs
nt1 nt2 bp name Saenger LW DSSR
1 A.G1 A.C72 G-C WC 19-XIX cWW cW-W
2 A.C2 A.G71 C-G WC 19-XIX cWW cW-W
3 A.G3 A.C70 G-C WC 19-XIX cWW cW-W
4 A.G4 A.U69 G-U Wobble 28-XXVIII cWW cW-W
5 A.A5 A.U68 A-U WC 20-XX cWW cW-W
6 A.U6 A.A67 U-A WC 20-XX cWW cW-W
7 A.U7 A.A66 U-A WC 20-XX cWW cW-W
8 A.U8 A.A14 U-A rHoogsteen 24-XXIV tWH tW-M
9 A.U8 A.A21 U+A -- n/a tSW tm+W
10 A.A9 A.A23 A+A -- 02-II tHH tM+M
11 A.2MG10 A.C25 g-C WC 19-XIX cWW cW-W
12 A.2MG10 A.G45 g+G -- n/a cHS cM+m
13 A.C11 A.G24 C-G WC 19-XIX cWW cW-W
14 A.U12 A.A23 U-A WC 20-XX cWW cW-W
15 A.C13 A.G22 C-G WC 19-XIX cWW cW-W
16 A.G15 A.C48 G+C rWC 22-XXII tWW tW+W
17 A.H2U16 A.U59 u+U -- n/a tSW tm+W
18 A.G18 A.PSU55 G+P -- n/a tWS tW+m
19 A.G19 A.C56 G-C WC 19-XIX cWW cW-W
20 A.G22 A.7MG46 G-g -- 07-VII tHW tM-W
21 A.M2G26 A.A44 g-A Imino 08-VIII cWW cW-W
22 A.C27 A.G43 C-G WC 19-XIX cWW cW-W
23 A.C28 A.G42 C-G WC 19-XIX cWW cW-W
24 A.A29 A.U41 A-U WC 20-XX cWW cW-W
25 A.G30 A.5MC40 G-c WC 19-XIX cWW cW-W
26 A.A31 A.PSU39 A-P -- n/a cWW cW-W
27 A.OMC32 A.A38 c-A -- n/a c.W c.-W
28 A.U33 A.A36 U-A -- n/a tSH tm-M
29 A.5MC49 A.G65 c-G WC 19-XIX cWW cW-W
30 A.U50 A.A64 U-A WC 20-XX cWW cW-W
31 A.G51 A.C63 G-C WC 19-XIX cWW cW-W
32 A.U52 A.A62 U-A WC 20-XX cWW cW-W
33 A.G53 A.C61 G-C WC 19-XIX cWW cW-W
34 A.5MU54 A.1MA58 t-a rHoogsteen 24-XXIV tWH tW-M
****************************************************************************
List of 4 multiplets
1 nts=3 UAA A.U8,A.A14,A.A21
2 nts=3 AUA A.A9,A.U12,A.A23
3 nts=3 gCG A.2MG10,A.C25,A.G45
4 nts=3 CGg A.C13,A.G22,A.7MG46
****************************************************************************
List of 2 helices
Note: a helix is defined by base-stacking interactions, regardless of bp
type and backbone connectivity, and may contain more than one stem.
helix#number[stems-contained] bps=number-of-base-pairs in the helix
bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
helix-form: classification of a dinucleotide step comprising the bp
above the given designation and the bp that follows it. Types
include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
'.' for an unclassified step, and 'x' for a step without a
continuous backbone.
--------------------------------------------------------------------
helix#1[2] bps=15
strand-1 5'-GCGGAUUcUGUGtPC-3'
bp-type ||||||||||||..|
strand-2 3'-CGCUUAAGACACaGG-5'
helix-form AA....xAAAAxx.
1 A.G1 A.C72 G-C WC 19-XIX cWW cW-W
2 A.C2 A.G71 C-G WC 19-XIX cWW cW-W
3 A.G3 A.C70 G-C WC 19-XIX cWW cW-W
4 A.G4 A.U69 G-U Wobble 28-XXVIII cWW cW-W
5 A.A5 A.U68 A-U WC 20-XX cWW cW-W
6 A.U6 A.A67 U-A WC 20-XX cWW cW-W
7 A.U7 A.A66 U-A WC 20-XX cWW cW-W
8 A.5MC49 A.G65 c-G WC 19-XIX cWW cW-W
9 A.U50 A.A64 U-A WC 20-XX cWW cW-W
10 A.G51 A.C63 G-C WC 19-XIX cWW cW-W
11 A.U52 A.A62 U-A WC 20-XX cWW cW-W
12 A.G53 A.C61 G-C WC 19-XIX cWW cW-W
13 A.5MU54 A.1MA58 t-a rHoogsteen 24-XXIV tWH tW-M
14 A.PSU55 A.G18 P+G -- n/a tSW tm+W
15 A.C56 A.G19 C-G WC 19-XIX cWW cW-W
--------------------------------------------------------------------------
helix#2[2] bps=15
strand-1 5'-AAPcUGGAgCUCAGu-3'
bp-type ...||||.||||...
strand-2 3'-UcAGACCgCGAGUCU-5'
helix-form x..AAAAxAA.xxx
1 A.A36 A.U33 A-U -- n/a tHS tM-m
2 A.A38 A.OMC32 A-c -- n/a cW. cW-.
3 A.PSU39 A.A31 P-A -- n/a cWW cW-W
4 A.5MC40 A.G30 c-G WC 19-XIX cWW cW-W
5 A.U41 A.A29 U-A WC 20-XX cWW cW-W
6 A.G42 A.C28 G-C WC 19-XIX cWW cW-W
7 A.G43 A.C27 G-C WC 19-XIX cWW cW-W
8 A.A44 A.M2G26 A-g Imino 08-VIII cWW cW-W
9 A.2MG10 A.C25 g-C WC 19-XIX cWW cW-W
10 A.C11 A.G24 C-G WC 19-XIX cWW cW-W
11 A.U12 A.A23 U-A WC 20-XX cWW cW-W
12 A.C13 A.G22 C-G WC 19-XIX cWW cW-W
13 A.A14 A.U8 A-U rHoogsteen 24-XXIV tHW tM-W
14 A.G15 A.C48 G+C rWC 22-XXII tWW tW+W
15 A.H2U16 A.U59 u+U -- n/a tSW tm+W
****************************************************************************
List of 4 stems
Note: a stem is defined as a helix consisting of only canonical WC/wobble
pairs, with a continuous backbone.
stem#number[#helix-number containing this stem]
Other terms are defined as in the above Helix section.
--------------------------------------------------------------------
stem#1[#1] bps=7
strand-1 5'-GCGGAUU-3'
bp-type |||||||
strand-2 3'-CGCUUAA-5'
helix-form AA....
1 A.G1 A.C72 G-C WC 19-XIX cWW cW-W
2 A.C2 A.G71 C-G WC 19-XIX cWW cW-W
3 A.G3 A.C70 G-C WC 19-XIX cWW cW-W
4 A.G4 A.U69 G-U Wobble 28-XXVIII cWW cW-W
5 A.A5 A.U68 A-U WC 20-XX cWW cW-W
6 A.U6 A.A67 U-A WC 20-XX cWW cW-W
7 A.U7 A.A66 U-A WC 20-XX cWW cW-W
--------------------------------------------------------------------------
stem#2[#2] bps=4
strand-1 5'-gCUC-3'
bp-type ||||
strand-2 3'-CGAG-5'
helix-form AA.
1 A.2MG10 A.C25 g-C WC 19-XIX cWW cW-W
2 A.C11 A.G24 C-G WC 19-XIX cWW cW-W
3 A.U12 A.A23 U-A WC 20-XX cWW cW-W
4 A.C13 A.G22 C-G WC 19-XIX cWW cW-W
--------------------------------------------------------------------------
stem#3[#2] bps=4
strand-1 5'-CCAG-3'
bp-type ||||
strand-2 3'-GGUc-5'
helix-form AAA
1 A.C27 A.G43 C-G WC 19-XIX cWW cW-W
2 A.C28 A.G42 C-G WC 19-XIX cWW cW-W
3 A.A29 A.U41 A-U WC 20-XX cWW cW-W
4 A.G30 A.5MC40 G-c WC 19-XIX cWW cW-W
--------------------------------------------------------------------------
stem#4[#1] bps=5
strand-1 5'-cUGUG-3'
bp-type |||||
strand-2 3'-GACAC-5'
helix-form AAAA
1 A.5MC49 A.G65 c-G WC 19-XIX cWW cW-W
2 A.U50 A.A64 U-A WC 20-XX cWW cW-W
3 A.G51 A.C63 G-C WC 19-XIX cWW cW-W
4 A.U52 A.A62 U-A WC 20-XX cWW cW-W
5 A.G53 A.C61 G-C WC 19-XIX cWW cW-W
****************************************************************************
List of 1 isolated WC/wobble pair
Note: isolated WC/wobble pairs are assigned negative indices to
differentiate them from the stem numbers, which are positive.
--------------------------------------------------------------------
[#1] -1 A.G19 A.C56 G-C WC 19-XIX cWW cW-W
****************************************************************************
List of 2 coaxial stacks
1 Helix#1 contains 2 stems: [#1,#4]
2 Helix#2 contains 2 stems: [#3,#2]
****************************************************************************
List of 92 non-pairing interactions
1 A.G1 A.C2 stacking: 5.4(2.6)--pm(>>,forward) H-bonds[1]: "OP2*OP2[2.99]"
2 A.G1 A.A73 stacking: 2.4(1.2)--mm(<>,outward)
3 A.C2 A.G3 stacking: 0.5(0.0)--pm(>>,forward)
4 A.G3 A.G4 stacking: 3.2(1.8)--pm(>>,forward)
5 A.G3 A.G71 stacking: 2.6(0.3)--mm(<>,outward)
6 A.G4 A.A5 stacking: 5.6(3.5)--pm(>>,forward)
7 A.A5 A.U6 stacking: 5.9(4.3)--pm(>>,forward)
8 A.U6 A.U7 stacking: 0.6(0.0)--pm(>>,forward)
9 A.U7 A.5MC49 stacking: 1.2(0.0)--pm(>>,forward) H-bonds[1]: "O2'(hydroxyl)-OP2[2.68]"
10 A.U8 A.C13 stacking: 2.0(0.0)--pp(><,inward)
11 A.U8 A.G15 stacking: 0.5(0.0)--mm(<>,outward)
12 A.A9 A.C11 H-bonds[1]: "O2'(hydroxyl)-N4(amino)[2.90]"
13 A.A9 A.C13 H-bonds[1]: "OP2-N4(amino)[3.01]"
14 A.A9 A.G22 stacking: 0.1(0.0)--mp(<<,backward)
15 A.A9 A.G45 stacking: 1.6(0.5)--pp(><,inward)
16 A.A9 A.7MG46 stacking: 1.6(0.7)--mm(<>,outward) H-bonds[1]: "O5'-N2(amino)[3.34]"
17 A.2MG10 A.C11 stacking: 4.2(1.3)--pm(>>,forward)
18 A.2MG10 A.M2G26 stacking: 1.0(0.0)--mm(<>,outward)
19 A.C11 A.U12 stacking: 0.9(0.0)--pm(>>,forward)
20 A.U12 A.C13 stacking: 1.3(0.3)--pm(>>,forward)
21 A.A14 A.G15 stacking: 2.4(0.8)--pm(>>,forward)
22 A.A14 A.G22 stacking: 1.9(0.1)--mm(<>,outward)
23 A.G15 A.H2U16 stacking: 0.4(0.0)--pm(>>,forward)
24 A.G15 A.U59 stacking: 0.4(0.0)--pm(>>,forward)
25 A.H2U16 A.C60 stacking: 1.4(0.0)--pm(>>,forward) H-bonds[1]: "O2'(hydroxyl)-N3[3.46]"
26 A.H2U17 A.G18 H-bonds[1]: "O2'(hydroxyl)-OP1[2.97]"
27 A.G18 A.G57 stacking: 4.3(1.5)--pp(><,inward) H-bonds[3]: "O3'-N2(amino)[3.29],O2'(hydroxyl)-N1(imino)[3.04],O2'(hydroxyl)-N2(amino)[2.71]"
28 A.G18 A.1MA58 stacking: 8.3(3.6)--mm(<>,outward) H-bonds[2]: "N2(amino)-O5'[3.22],N2(amino)-O4'[3.11]"
29 A.G19 A.G57 stacking: 3.3(0.9)--mm(<>,outward) H-bonds[1]: "O4'-N2(amino)[3.17]"
30 A.G19 A.C60 H-bonds[1]: "OP1-N4(amino)[3.27]"
31 A.G20 A.A21 H-bonds[1]: "OP1*OP2[2.74]"
32 A.G20 A.G22 H-bonds[1]: "N2(amino)-O4'[3.24]"
33 A.A21 A.G22 H-bonds[1]: "O2'(hydroxyl)-O4'[3.44]"
34 A.A21 A.7MG46 stacking: 5.0(2.1)--pp(><,inward)
35 A.A21 A.C48 stacking: 5.9(2.9)--mm(<>,outward)
36 A.G22 A.A23 stacking: 1.1(0.1)--pm(>>,forward)
37 A.A23 A.G24 stacking: 4.1(3.3)--pm(>>,forward)
38 A.G24 A.C25 stacking: 7.5(4.2)--pm(>>,forward)
39 A.C25 A.M2G26 stacking: 2.0(1.0)--pm(>>,forward)
40 A.M2G26 A.C27 stacking: 6.8(3.6)--pm(>>,forward)
41 A.C27 A.C28 stacking: 0.9(0.1)--pm(>>,forward)
42 A.C28 A.G43 stacking: 0.2(0.0)--mm(<>,outward)
43 A.A29 A.G30 stacking: 2.4(2.2)--pm(>>,forward)
44 A.A29 A.G42 stacking: 2.8(1.6)--mm(<>,outward)
45 A.G30 A.A31 stacking: 6.3(3.5)--pm(>>,forward)
46 A.G30 A.U41 stacking: 0.8(0.0)--mm(<>,outward)
47 A.A31 A.OMC32 stacking: 6.2(4.1)--pm(>>,forward)
48 A.OMC32 A.U33 stacking: 3.6(1.3)--pm(>>,forward)
49 A.U33 A.A35 H-bonds[1]: "O2'(hydroxyl)-N7[2.37]"
50 A.U33 A.YYG37 H-bonds[1]: "O2'(hydroxyl)-O22[3.41]"
51 A.OMG34 A.A35 stacking: 6.0(4.1)--pm(>>,forward) H-bonds[1]: "O2'(hydroxyl)-O4'[3.33]"
52 A.A35 A.A36 stacking: 4.7(2.1)--pm(>>,forward)
53 A.A36 A.YYG37 stacking: 5.3(3.9)--pm(>>,forward) H-bonds[4]: "O2'(hydroxyl)-O4'[2.49],N6(amino)-O17[3.25],N6(amino)*N20[2.94],N6(amino)-O22[3.25]"
54 A.YYG37 A.A38 stacking: 7.7(3.5)--pm(>>,forward)
55 A.A38 A.PSU39 stacking: 5.9(4.1)--pm(>>,forward)
56 A.PSU39 A.5MC40 stacking: 5.4(1.1)--pm(>>,forward)
57 A.G42 A.G43 stacking: 3.3(1.8)--pm(>>,forward)
58 A.G43 A.A44 stacking: 4.7(2.9)--pm(>>,forward)
59 A.A44 A.G45 stacking: 5.4(2.5)--pm(>>,forward)
60 A.7MG46 A.C48 H-bonds[1]: "O2'(hydroxyl)-OP2[3.55]"
61 A.U47 A.5MC49 H-bonds[1]: "O2'(hydroxyl)-O3'[3.21]"
62 A.U47 A.U50 H-bonds[1]: "O2'(hydroxyl)-OP1[2.71]"
63 A.C48 A.5MC49 H-bonds[1]: "O2'(hydroxyl)-OP1[3.13]"
64 A.C48 A.U59 H-bonds[1]: "O2'(hydroxyl)-O2'(hydroxyl)[3.07]"
65 A.U50 A.G51 stacking: 0.4(0.0)--pm(>>,forward)
66 A.U50 A.G65 stacking: 0.4(0.0)--mm(<>,outward)
67 A.G51 A.U52 stacking: 6.8(4.0)--pm(>>,forward)
68 A.G51 A.A64 stacking: 2.5(1.1)--mm(<>,outward)
69 A.G53 A.5MU54 stacking: 7.9(3.4)--pm(>>,forward)
70 A.G53 A.A62 stacking: 4.2(2.0)--mm(<>,outward)
71 A.5MU54 A.PSU55 stacking: 5.7(2.2)--pm(>>,forward)
72 A.PSU55 A.G57 H-bonds[1]: "O2'(hydroxyl)-N7[2.72]"
73 A.PSU55 A.1MA58 H-bonds[1]: "N3-OP2[2.77]"
74 A.C56 A.G57 stacking: 1.9(1.2)--pm(>>,forward)
75 A.1MA58 A.C60 H-bonds[1]: "O2'(hydroxyl)-OP2[2.42]"
76 A.1MA58 A.C61 stacking: 4.8(1.3)--pm(>>,forward)
77 A.U59 A.C60 stacking: 6.7(4.2)--pm(>>,forward)
78 A.C60 A.C61 H-bonds[1]: "OP1-N4(amino)[3.12]"
79 A.A62 A.C63 stacking: 4.7(3.0)--pm(>>,forward)
80 A.C63 A.A64 stacking: 0.6(0.0)--pm(>>,forward)
81 A.A64 A.G65 stacking: 4.0(2.9)--pm(>>,forward)
82 A.G65 A.A66 stacking: 3.3(1.7)--pm(>>,forward)
83 A.A66 A.A67 stacking: 4.7(3.9)--pm(>>,forward)
84 A.A67 A.U68 stacking: 4.5(3.1)--pm(>>,forward)
85 A.U68 A.U69 stacking: 2.6(1.0)--pm(>>,forward)
86 A.U69 A.C70 stacking: 0.4(0.0)--pm(>>,forward) H-bonds[1]: "O2'(hydroxyl)-O4'[3.16]"
87 A.C70 A.G71 stacking: 1.4(0.2)--pm(>>,forward)
88 A.G71 A.C72 stacking: 7.4(4.2)--pm(>>,forward)
89 A.C72 A.A73 stacking: 0.3(0.1)--pm(>>,forward)
90 A.A73 A.C74 stacking: 6.0(4.0)--pm(>>,forward)
91 A.C74 A.C75 stacking: 4.8(2.5)--pm(>>,forward)
92 A.C75 A.A76 H-bonds[1]: "O5'*OP1[3.27]"
****************************************************************************
List of 11 stacks
Note: a stack is an ordered list of nucleotides assembled together via
base-stacking interactions, regardless of backbone connectivity.
Stacking interactions within a stem are *not* included.
--------------------------------------------------------------------
1 nts=2 Uc A.U7,A.5MC49
2 nts=2 UC A.U8,A.C13
3 nts=2 GA A.G65,A.A66
4 nts=3 CgC A.C25,A.M2G26,A.C27
5 nts=3 gAC A.7MG46,A.A21,A.C48
6 nts=3 GtP A.G53,A.5MU54,A.PSU55
7 nts=4 GACC A.G1,A.A73,A.C74,A.C75
8 nts=4 GAcU A.G30,A.A31,A.OMC32,A.U33
9 nts=5 GGGaC A.G19,A.G57,A.G18,A.1MA58,A.C61
10 nts=7 gAAgAPc A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39,A.5MC40
11 nts=9 GAGAGAGUC A.G43,A.A44,A.G45,A.A9,A.G22,A.A14,A.G15,A.U59,A.C60
-----------------------------------------------------------------------
Nucleotides not involved in stacking interactions
nts=4 uGUA A.H2U17,A.G20,A.U47,A.A76
****************************************************************************
Note: for the various types of loops listed below, numbers within the first
set of brackets are the number of loop nts, and numbers in the second
set of brackets are the identities of the stems (positive number) or
isolated WC/wobble pairs (negative numbers) to which they are linked.
****************************************************************************
List of 3 hairpin loops
1 hairpin loop: nts=10; [8]; linked by [#2]
nts=10 CAGuuGGGAG A.C13,A.A14,A.G15,A.H2U16,A.H2U17,A.G18,A.G19,A.G20,A.A21,A.G22
nts=8 AGuuGGGA A.A14,A.G15,A.H2U16,A.H2U17,A.G18,A.G19,A.G20,A.A21
2 hairpin loop: nts=11; [9]; linked by [#3]
nts=11 GAcUgAAgAPc A.G30,A.A31,A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39,A.5MC40
nts=9 AcUgAAgAP A.A31,A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37,A.A38,A.PSU39
3 hairpin loop: nts=9; [7]; linked by [#4]
nts=9 GtPCGaUCC A.G53,A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59,A.C60,A.C61
nts=7 tPCGaUC A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59,A.C60
****************************************************************************
List of 1 junction
1 4-way junction: nts=16; [2,1,5,0]; linked by [#1,#2,#3,#4]
nts=16 UUAgCgCGAGgUCcGA A.U7,A.U8,A.A9,A.2MG10,A.C25,A.M2G26,A.C27,A.G43,A.A44,A.G45,A.7MG46,A.U47,A.C48,A.5MC49,A.G65,A.A66
nts=2 UA A.U8,A.A9
nts=1 g A.M2G26
nts=5 AGgUC A.A44,A.G45,A.7MG46,A.U47,A.C48
nts=0
****************************************************************************
List of 1 non-loop single-stranded segment
1 nts=4 ACCA A.A73,A.C74,A.C75,A.A76
****************************************************************************
List of 1 kissing loop interaction
1 isolated-pair #-1 between hairpin loops #1 and #3
****************************************************************************
List of 2 U-turns
1 A.U33-A.A36 H-bonds[1]: "N3(imino)-OP2[2.80]" nts=6 cUgAAg A.OMC32,A.U33,A.OMG34,A.A35,A.A36,A.YYG37
2 A.PSU55-A.1MA58 H-bonds[1]: "N3-OP2[2.77]" nts=6 tPCGaU A.5MU54,A.PSU55,A.C56,A.G57,A.1MA58,A.U59
****************************************************************************
List of 18 phosphate interactions
1 A.U7 OP1-hbonds[1]: "MG@A.MG580[2.60]"
2 A.A9 OP2-hbonds[1]: "N4@A.C13[3.01]"
3 A.A14 OP2-hbonds[1]: "MG@A.MG580[1.93]"
4 A.H2U16 OP2-cap: "A.H2U16"
5 A.G18 OP1-hbonds[1]: "O2'@A.H2U17[2.97]"
6 A.G19 OP1-hbonds[2]: "N4@A.C60[3.27],MN@A.MN530[2.19]"
7 A.G20 OP1-hbonds[1]: "MG@A.MG540[2.07]"
8 A.A21 OP2-hbonds[1]: "MG@A.MG540[2.11]"
9 A.A23 OP2-hbonds[1]: "N6@A.A9[3.12]"
10 A.A35 OP2-cap: "A.U33"
11 A.A36 OP2-hbonds[1]: "N3@A.U33[2.80]"
12 A.YYG37 OP2-hbonds[1]: "MG@A.MG590[2.53]"
13 A.C48 OP2-hbonds[1]: "O2'@A.7MG46[3.55]"
14 A.5MC49 OP1-hbonds[1]: "O2'@A.C48[3.13]" OP2-hbonds[1]: "O2'@A.U7[2.68]"
15 A.U50 OP1-hbonds[1]: "O2'@A.U47[2.71]"
16 A.G57 OP2-cap: "A.PSU55"
17 A.1MA58 OP2-hbonds[1]: "N3@A.PSU55[2.77]"
18 A.C60 OP1-hbonds[1]: "N4@A.C61[3.12]" OP2-hbonds[1]: "O2'@A.1MA58[2.42]"
****************************************************************************
This structure contains 1-order pseudoknot
o You may want to run DSSR again with the '--nested' option which removes
pseudoknots to get a fully nested secondary structure representation.
****************************************************************************
Secondary structures in dot-bracket notation (dbn) as a whole and per chain
>1ehz nts=76 [whole]
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGtPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....
>1ehz-A #1 nts=76 [chain] RNA
GCGGAUUUAgCUCAGuuGGGAGAGCgCCAGAcUgAAgAPcUGGAGgUCcUGUGtPCGaUCCACAGAAUUCGCACCA
(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))....
****************************************************************************
List of 12 additional files
1 dssr-stems.pdb -- an ensemble of stems
2 dssr-helices.pdb -- an ensemble of helices (coaxial stacking)
3 dssr-pairs.pdb -- an ensemble of base pairs
4 dssr-multiplets.pdb -- an ensemble of multiplets
5 dssr-hairpins.pdb -- an ensemble of hairpin loops
6 dssr-junctions.pdb -- an ensemble of junctions (multi-branch)
7 dssr-2ndstrs.bpseq -- secondary structure in bpseq format
8 dssr-2ndstrs.ct -- secondary structure in connect table format
9 dssr-2ndstrs.dbn -- secondary structure in dot-bracket notation
10 dssr-torsions.txt -- backbone torsion angles and suite names
11 dssr-Uturns.pdb -- an ensemble of U-turn motifs
12 dssr-stacks.pdb -- an ensemble of stacks
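The dot-bracket strings in the listing above can be turned back into explicit pair lists. The following is a minimal sketch, not part of DSSR: the function name dbn_to_pairs is hypothetical, and it assumes the common convention in which each bracket type is matched independently, with square brackets marking the pseudoknotted pair.

```python
# Dot-bracket string for 1ehz, copied from the DSSR output above.
DBN = "(((((((..((((.....[..)))).((((.........)))).....(((((..]....))))))))))))...."

def dbn_to_pairs(dbn):
    """Return sorted (i, j, bracket) tuples with 1-based positions.

    Each bracket type keeps its own stack, so pseudoknot brackets such
    as '[' and ']' are matched independently of the nested '(' and ')'.
    """
    openers = {"(": ")", "[": "]", "{": "}", "<": ">"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    pairs = []
    for pos, ch in enumerate(dbn, start=1):
        if ch in openers:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos, closers[ch]))
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket(s)")
    return sorted(pairs)

pairs = dbn_to_pairs(DBN)
knotted = [(i, j) for i, j, bracket in pairs if bracket == "["]
print(len(pairs), knotted)  # 21 [(19, 56)]
```

The single square-bracket pair links positions 19 and 56, which is consistent with the one isolated pair reported between hairpin loops #1 and #3 and with the 1-order pseudoknot message.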
List of 11 types of 14 modified nucleotides
nt count list
1 1MA-a 1 A.1MA58
2 2MG-g 1 A.2MG10
3 5MC-c 2 A.5MC40,A.5MC49
4 5MU-t 1 A.5MU54
5 7MG-g 1 A.7MG46
6 H2U-u 2 A.H2U16,A.H2U17
7 M2G-g 1 A.M2G26
8 OMC-c 1 A.OMC32
9 OMG-g 1 A.OMG34
10 PSU-P 2 A.PSU39,A.PSU55
11 YYG-g 1 A.YYG37
List of 34 base pairs
nt1 nt2 bp name Saenger LW DSSR
1 A.G1 A.C72 G-C WC 19-XIX cWW cW-W
2 A.C2 A.G71 C-G WC 19-XIX cWW cW-W
3 A.G3 A.C70 G-C WC 19-XIX cWW cW-W
4 A.G4 A.U69 G-U Wobble 28-XXVIII cWW cW-W
5 A.A5 A.U68 A-U WC 20-XX cWW cW-W
6 A.U6 A.A67 U-A WC 20-XX cWW cW-W
7 A.U7 A.A66 U-A WC 20-XX cWW cW-W
8 A.U8 A.A14 U-A rHoogsteen 24-XXIV tWH tW-M
......
With the --more option, it would become:
List of 34 base pairs
nt1 nt2 bp name Saenger LW DSSR
1 A.G1 A.C72 G-C WC 19-XIX cWW cW-W
[-167.8(anti) ~C3'-endo lambda=51.3] [-161.6(anti) ~C3'-endo lambda=56.2]
d(C1'-C1')=10.58 d(N1-N9)=8.85 d(C6-C8)=9.75 tor(C1'-N1-N9-C1')=-0.7
H-bonds[3]: "O6(carbonyl)-N4(amino)[2.83],N1(imino)-N3[2.88],N2(amino)-O2(carbonyl)[2.84]"
bp-pars: [-0.55 -0.28 -0.43 -6.30 -9.83 -0.70]
2 A.C2 A.G71 C-G WC 19-XIX cWW cW-W
[-163.8(anti) ~C3'-endo lambda=53.0] [-162.8(anti) ~C3'-endo lambda=52.7]
d(C1'-C1')=10.83 d(N1-N9)=9.06 d(C6-C8)=9.93 tor(C1'-N1-N9-C1')=-8.3
H-bonds[3]: "O2(carbonyl)-N2(amino)[3.01],N3-N1(imino)[2.97],N4(amino)-O6(carbonyl)[2.86]"
bp-pars: [0.13 -0.08 0.03 -7.96 -10.30 -2.67]
......
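The H-bonds field in the --more listing above packs atom names, optional atom-type labels, and distances into one quoted string. A minimal sketch of how such a field might be split apart follows. The function name parse_hbonds and the regular expression are illustrative assumptions based only on the two examples shown here; they are not a description of the full DSSR format, and they do not target the "MG@A.MG580[2.60]" style used in the phosphate-interaction list.

```python
import re

# One H-bond item looks like "O6(carbonyl)-N4(amino)[2.83]" or "N1(imino)-N3[2.88]".
# The parenthesized atom-type labels are optional and are discarded here.
HBOND_RE = re.compile(
    r"([A-Za-z0-9']+)(?:\([a-z]+\))?-([A-Za-z0-9']+)(?:\([a-z]+\))?\[(\d+\.\d+)\]"
)

def parse_hbonds(line):
    """Return a list of (atom1, atom2, distance) from an H-bonds line."""
    quoted = line.split('"')[1]  # text between the first pair of double quotes
    return [(a1, a2, float(dist)) for a1, a2, dist in HBOND_RE.findall(quoted)]

LINE = ('H-bonds[3]: "O6(carbonyl)-N4(amino)[2.83],'
        'N1(imino)-N3[2.88],N2(amino)-O2(carbonyl)[2.84]"')

hbonds = parse_hbonds(LINE)
print(hbonds)
# [('O6', 'N4', 2.83), ('N1', 'N3', 2.88), ('N2', 'O2', 2.84)]
```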
Figure S9: Images of 15 diloops (GGUC, CARG, CUUG, CUAG, and UUKA) identified by DSSR in the NR3A-dataset. The diloops can be categorized into five groups by base sequence: GGUC, where the second position G is flipped away from the closing pair; CARG, where the second position A is extruded into the minor-groove side of the closing pair; CUUG, which shows structural variations in the three crystallographic examples and differences from their NMR solution counterpart (PDB id: 1rng, Figure 6C); CUAG, where all four cases occur in Cas9 complexes either without (PDB id: 4oo8) or with (PDB ids: 4un3 and 4un5) a protospacer adjacent motif; and UUKA, where the two cases are quite distinct.
4oo8 2 nts=4 CUAG B.C55,B.U56,B.A57,B.G58 C3',C2',C2',C3' anti,anti,anti,anti
Figure S8: The linear (arc) secondary structure diagram of the RNA-DNA hybrid structure in the CRISPR Cas9-sgRNA-DNA ternary complex (PDB id: 4oo8), annotated with DSSR-derived dot-bracket notation and key structural elements. The target DNA base sequence is colored red, and the chain switch from sgRNA to DNA is marked by the dotted vertical line. DSSR detects no junction loops in this hybrid structure because of the chain break.
pdb_frag B 1:97 C 1:20 4oo8.pdb 4oo8-BC.pdb
x3dna-dssr -i=4oo8-BC.pdb -o=4oo8-BC.out --prefix=4oo8-BC
117 ENERGY = 0.0 [4oo8-BC] -- secondary structure derived by DSSR
1 G 0 1 117 1
2 G 1 2 116 2
3 A 2 3 115 3
4 A 3 4 114 4
5 A 4 5 113 5
6 U 5 6 112 6
7 U 6 7 111 7
8 A 7 8 110 8
9 G 8 9 109 9
10 G 9 10 108 10
11 U 10 11 107 11
12 G 11 12 106 12
13 C 12 13 105 13
14 G 13 14 104 14
15 C 14 15 103 15
16 U 15 16 102 16
17 U 16 17 101 17
18 G 17 18 100 18
19 G 18 19 99 19
20 C 19 20 98 20
21 G 20 21 50 21
22 U 21 22 49 22
23 U 22 23 48 23
24 U 23 24 47 24
25 U 24 25 46 25
26 A 25 26 45 26
27 G 26 27 0 27
28 A 27 28 0 28
29 G 28 29 40 29
30 C 29 30 39 30
31 U 30 31 38 31
32 A 31 32 37 32
33 G 32 33 0 33
34 A 33 34 0 34
35 A 34 35 0 35
36 A 35 36 0 36
37 U 36 37 32 37
38 A 37 38 31 38
39 G 38 39 30 39
40 C 39 40 29 40
41 A 40 41 0 41
42 A 41 42 0 42
43 G 42 43 0 43
44 U 43 44 0 44
45 U 44 45 26 45
46 A 45 46 25 46
47 A 46 47 24 47
48 A 47 48 23 48
49 A 48 49 22 49
50 U 49 50 21 50
51 A 50 51 0 51
52 A 51 52 0 52
53 G 52 53 61 53
54 G 53 54 60 54
55 C 54 55 58 55
56 U 55 56 0 56
57 A 56 57 0 57
58 G 57 58 55 58
59 U 58 59 0 59
60 C 59 60 54 60
61 C 60 61 53 61
62 G 61 62 0 62
63 U 62 63 0 63
64 U 63 64 0 64
65 A 64 65 0 65
66 U 65 66 0 66
67 C 66 67 0 67
68 A 67 68 0 68
69 A 68 69 80 69
70 C 69 70 79 70
71 U 70 71 78 71
72 U 71 72 77 72
73 G 72 73 0 73
74 A 73 74 0 74
75 A 74 75 0 75
76 A 75 76 0 76
77 A 76 77 72 77
78 A 77 78 71 78
79 G 78 79 70 79
80 U 79 80 69 80
81 G 80 81 0 81
82 G 81 82 96 82
83 C 82 83 95 83
84 A 83 84 94 84
85 C 84 85 93 85
86 C 85 86 92 86
87 G 86 87 91 87
88 A 87 88 0 88
89 G 88 89 0 89
90 U 89 90 0 90
91 C 90 91 87 91
92 G 91 92 86 92
93 G 92 93 85 93
94 U 93 94 84 94
95 G 94 95 83 95
96 C 95 96 82 96
97 U 96 97 0 97
98 G 0 98 20 1
99 C 98 99 19 2
100 C 99 100 18 3
101 A 100 101 17 4
102 A 101 102 16 5
103 G 102 103 15 6
104 C 103 104 14 7
105 G 104 105 13 8
106 C 105 106 12 9
107 A 106 107 11 10
108 C 107 108 10 11
109 C 108 109 9 12
110 T 109 110 8 13
111 A 110 111 7 14
112 A 111 112 6 15
113 T 112 113 5 16
114 T 113 114 4 17
115 T 114 115 3 18
116 C 115 116 2 19
117 C 116 0 1 20
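The connect table above can be read back programmatically, for example to locate the CUAG diloop. The sketch below is an illustration rather than DSSR's own method: the names parse_ct and hairpin_loops are hypothetical, and it assumes only that the first field is the serial index, the second the base, and the fifth the index of the paired partner (0 if unpaired). It runs on an excerpt (records 53 to 61) copied from the table.

```python
# Header line plus records 53-61, copied from the connect table above.
CT_EXCERPT = """\
117 ENERGY = 0.0 [4oo8-BC] -- secondary structure derived by DSSR
53 G 52 53 61 53
54 G 53 54 60 54
55 C 54 55 58 55
56 U 55 56 0 56
57 A 56 57 0 57
58 G 57 58 55 58
59 U 58 59 0 59
60 C 59 60 54 60
61 C 60 61 53 61
"""

def parse_ct(text):
    """Return (declared_length, {index: (base, partner_index)})."""
    rows = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    declared = int(rows[0][0])  # first field of the header line
    records = {}
    for fields in rows[1:]:
        records[int(fields[0])] = (fields[1], int(fields[4]))
    return declared, records

def hairpin_loops(records):
    """Return (i, j, sequence) for each pair i<j that encloses only unpaired bases."""
    loops = []
    for i, (_, j) in sorted(records.items()):
        if j - i > 1 and all(
            k in records and records[k][1] == 0 for k in range(i + 1, j)
        ):
            seq = "".join(records[k][0] for k in range(i, j + 1))
            loops.append((i, j, seq))
    return loops

declared, records = parse_ct(CT_EXCERPT)
loops = hairpin_loops(records)
print(declared, loops)  # 117 [(55, 58, 'CUAG')]
```

The closing pair 55-58 leaves only the two unpaired nucleotides U56 and A57 in the loop, matching the diloop entry reported for 4oo8 (B.C55,B.U56,B.A57,B.G58).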
Figure S6: The k-turn identified by DSSR in the SAM-I riboswitch (PDB id: 2gis). Base-stacking interactions are interrupted around the k-turn even though the backbone is continuous along each strand. Thus DSSR assigns two helices (depicted by gray lines), the canonical helix on the left, and the noncanonical one on the right.
Figure S5: Ribbon representations of the junction loop in the env22 twister ribozyme (PDB id: 4rge). The ribbons are defined in terms of the C1′ and P atoms of the nucleotides that make up the junction loop. Inclusion of pseudoknots in the analysis of the structure reveals a [4,2,2,0,1,3,0,0,1,1] ten-way junction and a ribbon that follows a super-coiled pathway, with a linking number of three (blue, top row). Upon pseudoknot removal, only a [2,1,3] three-way junction and a ribbon with a simple relaxed circular configuration remain (green, bottom row). The overlap of the two junction loops in the middle row clearly shows that the over-simplified three-way junction spans only a small portion of the ten-way loop. The ribbons are shown in three projections: down the x-axis (left column), the y-axis (middle column), and the z-axis (right column). The images were kindly generated by Dr. Nicolas Clauvelin using the approach described in ref. 49.
Figure S2: Three similarly positioned base pairs that hold the D- and T-loops of tRNAPhe (PDB id: 1ehz, gold) and its viral mimic (PDB id: 4p5j, magenta) in place. The interacting loops in the two molecules are overlaid on the reference frame of the common elbow G–C pair, which is oriented vertically with its major-groove edge facing the viewer, roughly matching Figures 2 and 3 (A-C). Since the two elbow G–C pairs have very similar base-pair parameters, they overlap nearly perfectly. Despite large structural variations between the D-loops, the H2U16+U59 pair in tRNA (B, detailed in D) is similar to the presumably semi-protonated C8+C52 pair (forming an i-motif) in the mimic (C, detailed in E). The other two pairs near the elbow (F and G) are also strikingly alike, despite dramatically different modes of interaction. Note that DSSR identifies the C+C pair (E) with the assumed acceptor-acceptor (N3 to N3) hydrogen bond highlighted (red).
Figure S7: The eight base triplets and associated hydrogen bonds (dashed lines) detected by DSSR in the SAM-I riboswitch (PDB id: 2gis). (A) GCG (G11,C44,G58), with G11 in a similar position and orientation as in a type II A-minor motif. (B) AGC (A12,G43,C59), a type I A-minor motif. (C) AGG (A20,G32,G35). (D) GCA (G22,C30,A61). (E) GCA (G23,C29,A62). (F) AUA (A24,U64,A85), with the isolated, linchpin-like U64-A85 pair. (G) AUa (A45,U57,SAM301), with the SAM adenosine moiety taken as a modified base in forming the triplet. (H) ACG (A46,C47,G56). Note that in (D) and (E), A61 and A62 employ their Watson-Crick edges, rather than the minor-groove edges as in A-minor motifs, to interact with the minor-groove edges of the two consecutive G–C pairs.
Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids
Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University