Author Topic: Figure 1 -- summary of methods to identify nucleic acid structural components (Read 124508 times)

xiangjun · « **on:** July 08, 2015, 08:35:14 pm »


Quote

Figure 1: Summary of steps used to identify nucleic acid structural components. (A) Nucleotides are recognized using standard atom names and base planarity. A base is taken as a pyrimidine (six-membered ring) unless it possesses one of three purine atoms (red). (B) Bases are assigned a standard reference frame independent of sequence: purines and pyrimidines (red) are symmetrically placed with respect to the sugar. (C) The standard base frame is derived from an idealized Watson-Crick base pair, where the x₁-, y₁-axes of the sequence base align with the x₂-, y₂-axes of its complement (red) and define three base edges (Watson-Crick, minor groove, Major groove). (D) Base pairs are identified from the distance and coplanarity of base rings (highlighted by rectangular blocks with embedded reference frames and shaded minor-groove edges) and the occurrence of at least one hydrogen bond (dashed lines). (E) Helices are defined by base-stacking interactions. Whereas the two nearest neighbors of a terminal pair (black) lie on one side of the pair, those of a middle pair (red) lie on opposite sides. (F) Closed loops are delineated by the ends of stems and specified by the lengths of consecutive connecting loop segments. Here, the four-way junction (S1 to S4) is denoted [2,1,1,0] in terms of the loop nucleotides (white circles) running clockwise from S1 to S4. Arrows point in the 5′ to 3′ direction along each strand and dashed lines represent stem pairs.

Note:

This figure illustrates key algorithms implemented in DSSR for the analysis of nucleic acid structures. Many other features, such as the identification of pseudoknots and various motifs, are not included here for simplicity. The figure was composed in Inkscape; it went through numerous iterations, with great attention to detail.
For identifying nucleotides (A), the nine ring atoms of guanine are used. Expressed in the standard base reference frame (see file 'Atomic_G.pdb' distributed with 3DNA), the atomic coordinates of the nine atoms in PDB format are as shown below. A nucleotide must have at least three properly labeled ring atoms, and the root-mean-square deviation (rmsd) of the least-squares fit between matched atom pairs must be less than a cutoff (0.28 Å by default). Note that using adenine as the reference would have no impact on the result, since the base rings of G and A can be aligned nearly perfectly.


ATOM      2  N9    G A   1      -1.289   4.551   0.000
ATOM      3  C8    G A   1       0.023   4.962   0.000
ATOM      4  N7    G A   1       0.870   3.969   0.000
ATOM      5  C5    G A   1       0.071   2.833   0.000
ATOM      6  C6    G A   1       0.424   1.460   0.000
ATOM      8  N1    G A   1      -0.700   0.641   0.000
ATOM      9  C2    G A   1      -1.999   1.087   0.000
ATOM     11  N3    G A   1      -2.342   2.364   0.001
ATOM     12  C4    G A   1      -1.265   3.177   0.000
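
The nucleotide test described above can be sketched in Python as follows. This is a minimal illustration, not DSSR's actual code: it uses a standard Kabsch (SVD) superposition via NumPy, and the function names (`fit_rmsd`, `identify_base`) are hypothetical. The choice of N7, C8 and N9 as the three purine-only atoms is an assumption, since the post does not name the atoms shown in red.

```python
import numpy as np

# Ring atoms of G in the standard base frame, copied from the listing above.
REF_RING = {
    "N9": (-1.289, 4.551, 0.000),
    "C8": (0.023, 4.962, 0.000),
    "N7": (0.870, 3.969, 0.000),
    "C5": (0.071, 2.833, 0.000),
    "C6": (0.424, 1.460, 0.000),
    "N1": (-0.700, 0.641, 0.000),
    "C2": (-1.999, 1.087, 0.000),
    "N3": (-2.342, 2.364, 0.001),
    "C4": (-1.265, 3.177, 0.000),
}
PURINE_ONLY = {"N7", "C8", "N9"}  # assumption: atoms absent from pyrimidines
RMSD_CUTOFF = 0.28                # default cutoff quoted in the post (in Å)


def fit_rmsd(ref, obs):
    """RMSD after optimal rigid-body superposition of ref onto obs (Kabsch)."""
    ref_c = ref - ref.mean(axis=0)
    obs_c = obs - obs.mean(axis=0)
    u, _, vt = np.linalg.svd(ref_c.T @ obs_c)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = ref_c @ rot.T - obs_c
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def identify_base(atoms):
    """atoms: dict of atom name -> (x, y, z). Returns (kind, rmsd) or None."""
    names = [n for n in REF_RING if n in atoms]
    if len(names) < 3:  # at least three properly labeled ring atoms
        return None
    ref = np.array([REF_RING[n] for n in names], dtype=float)
    obs = np.array([atoms[n] for n in names], dtype=float)
    rmsd = fit_rmsd(ref, obs)
    if rmsd >= RMSD_CUTOFF:
        return None
    kind = "purine" if PURINE_ONLY & set(names) else "pyrimidine"
    return kind, rmsd
```

Because only matched atom names enter the fit, the same nine-atom reference serves both purines (up to nine matches) and pyrimidines (the six atoms N1, C2, N3, C4, C5, C6).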

The standard base reference frame has unique features (B). It is symmetric with respect to purines and pyrimidines, and independent of base sequence. Its three axes also have a simple geometric meaning. Overall, the frame is well suited to the analysis of RNA structures and is superior to other ad hoc frames seen in the literature.
DSSR introduces three edges that are strictly base centered (C): the minor-groove edge, the Major-groove edge, and the Watson-Crick edge. The Major-groove edge corresponds to the Hoogsteen/C-H edge in the Leontis-Westhof (LW) notation. The minor-groove edge correlates with the LW sugar edge only when the base is in the anti conformation and, in RNA, the sugar is in the C3′-endo conformation. See the User Manual for details.
When the standard reference frames are attached to the planar base rings (D), the geometry-based definition of base pairs (first introduced in 3DNA over 15 years ago) becomes immediately obvious. Moreover, the algorithm applies to canonical as well as noncanonical base pairs, including those with modified nucleotides, in any tautomeric or protonation state.
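
The geometric test in (D) can be sketched as follows, assuming each base is already described by its frame origin and z-axis (the base normal). The function name `looks_like_pair` is hypothetical, and the three cutoffs are illustrative values chosen for this example, not documented DSSR defaults. The hydrogen-bond count is taken as an input rather than computed.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def looks_like_pair(org1, z1, org2, z2, n_hbonds,
                    max_dorg=15.0, max_dv=2.5, max_angle=65.0):
    """Check distance, coplanarity and H-bonding of two base frames.

    org1/org2: frame origins (Å); z1/z2: base normals.
    max_dorg, max_dv (Å) and max_angle (degrees) are illustrative cutoffs.
    """
    z1, z2 = _unit(z1), _unit(z2)
    cosang = _dot(z1, z2)
    if cosang < 0.0:  # normals of paired bases are typically antiparallel
        z2 = tuple(-c for c in z2)
        cosang = -cosang
    angle = math.degrees(math.acos(min(1.0, cosang)))  # between base planes
    zmean = _unit(tuple(a + b for a, b in zip(z1, z2)))
    d = tuple(b - a for a, b in zip(org1, org2))
    dorg = math.sqrt(_dot(d, d))  # distance between origins
    dv = abs(_dot(d, zmean))      # vertical separation along the mean normal
    return (n_hbonds >= 1 and dorg <= max_dorg
            and dv <= max_dv and angle <= max_angle)
```

Nothing in the test depends on base identity, which is why the same criteria cover canonical, noncanonical and modified bases alike.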
DSSR's definition of helices and stems is illustrated in (E). It distinguishes a stem of covalently connected canonical pairs from a helix of stacked pairs of arbitrary type/linkage. This differentiation leads naturally to a definition of coaxial stacking, another widely used concept. Moreover, the same algorithm also applies to the identification of continuous base stacks.
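
The terminal-versus-middle distinction in (E) can be illustrated with a toy sketch. It assumes each base pair is reduced to an origin and a unit normal, and it picks the two nearest neighbors purely by origin distance. DSSR itself works from base-stacking interactions, so this is a simplification, and `classify_pairs` is a hypothetical name.

```python
import math


def classify_pairs(origins, normals):
    """Label each pair 'terminal' or 'middle' from its two nearest neighbors.

    origins: list of (x, y, z) pair origins; normals: matching unit normals.
    Requires at least three pairs.
    """
    labels = []
    for i, (org, nrm) in enumerate(zip(origins, normals)):
        # Two nearest neighbors by origin-to-origin distance.
        others = sorted((j for j in range(len(origins)) if j != i),
                        key=lambda j: math.dist(org, origins[j]))
        sides = []
        for j in others[:2]:
            disp = [b - a for a, b in zip(org, origins[j])]
            # Signed height of the neighbor above (+) or below (-) the pair.
            sides.append(sum(d * n for d, n in zip(disp, nrm)))
        # Opposite signs: neighbors flank the pair, so it is a middle pair.
        labels.append("middle" if sides[0] * sides[1] < 0 else "terminal")
    return labels


# Four pairs stacked 3.4 Å apart along z, as in an idealized helix.
origins = [(0.0, 0.0, 3.4 * k) for k in range(4)]
normals = [(0.0, 0.0, 1.0)] * 4
labels = classify_pairs(origins, normals)
```

Walking from one terminal pair through the chain of middle pairs to the other terminal pair then traces out a helix; restricting the walk to covalently connected canonical pairs would give a stem.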
In DSSR, a loop forms a 'closed' circle (F) in which any two sequential nucleotides are connected either by a phosphodiester bond or by a canonical base pair. The loop is specified by the lengths of its consecutive bridging-nucleotide segments.
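
The loop notation can be made concrete with a short sketch. It assumes the loop is given as strand segments, each running 5′ to 3′ from one stem-end nucleotide to the next, with nucleotides identified by sequential integer indices. The function names and the example numbering are hypothetical, and the hairpin/bulge/internal-loop labels follow common usage rather than anything stated in this post.

```python
def loop_segment_lengths(segments, pairs=None):
    """Count bridging nucleotides in each strand segment of a closed loop.

    segments: list of (i, j), each running 5'->3' from paired nucleotide i
    to paired nucleotide j. pairs: optional dict mapping each segment end j
    to its partner, used to verify that the loop really closes.
    """
    if pairs is not None:
        for k, (_, j) in enumerate(segments):
            next_start = segments[(k + 1) % len(segments)][0]
            if pairs.get(j) != next_start:
                raise ValueError(f"nucleotide {j} is not paired with {next_start}")
    return [j - i - 1 for i, j in segments]


def loop_type(lengths):
    """Name a loop from its list of bridging-segment lengths."""
    n = len(lengths)
    if n == 1:
        return "hairpin loop"
    if n == 2:
        if not any(lengths):
            return "stacked pairs (no loop)"
        return "bulge" if 0 in lengths else "internal loop"
    return f"{n}-way junction"


# Made-up numbering reproducing the [2,1,1,0] junction from the caption.
segments = [(10, 13), (30, 32), (50, 52), (70, 71)]
pairs = {13: 30, 32: 50, 52: 70, 71: 10}
lengths = loop_segment_lengths(segments, pairs)
```

Here `lengths` is `[2, 1, 1, 0]` and `loop_type(lengths)` gives `"4-way junction"`, matching the descriptor in the caption.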
