Odd output for G-quadruplex structure

jms89:
Hi, for the structure 2chj, http://www.rcsb.org/pdb/explore/explore.do?structureId=2chj, DSSR produces some odd output.

First, it identifies two helices instead of one. This is somewhat understandable since removing two strands would still produce a helical structure, but it would be nice if DSSR could identify higher-order helices automatically. Perhaps the definition of a base-pair could be generalized to include any number of nucleotides, and helices could be defined based on the stacking of these generalized base-pairs.

Second, it identifies each strand as a single-stranded segment, which is clearly a bug. Not sure how a single-stranded segment is defined, but I'm guessing it has to do with there being no first order base pairs (i.e. one to one base pair) in the structure.

xiangjun:

--- Quote ---Hi, for the structure 2chj, http://www.rcsb.org/pdb/explore/explore.do?structureId=2chj, DSSR produces some odd output.
--- End quote ---

I understand what you meant. The helix/stem/loop/ss-fragment definitions, as described in the 2015 DSSR paper in NAR, follow the literature on RNA secondary structure which is based on canonical base pairs (WC and G-U wobble). From my experience, the community is not necessarily consistent with its nomenclature/definition (if any), on the basics of (double) helix/stem/arm/paired-region/loop/pseudoknot and coaxial stacking.


--- Quote ---First, it identifies two helices instead of one. This is somewhat understandable since removing two strands would still produce a helical structure, but it would be nice if DSSR could identify higher-order helices automatically. Perhaps the definition of a base-pair could be generalized to include any number of nucleotides, and helices could be defined based on the stacking of these generalized base-pairs.
--- End quote ---

As you noticed, DSSR identifies two helices from a G-quadruplex structure, which clearly looks odd for this well-known structure type. This is just how DSSR works on general RNA/DNA structures, where duplexes are the most frequent. Higher-level structures are case-specific: for example, the G-quadruplex and the i-motif are both composed of four strands, even though they are obviously different.

As of v1.6.1-2016aug22, DSSR can already detect and characterize i-motifs (see the DSSR User Manual). For a G-quadruplex structure (e.g. 2chj), did you notice the section "List of 4 G-quartets", shown below?


--- Code: ---List of 4 G-quartets
   1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
--- End code ---
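
For anyone who wants to post-process this section, here is a minimal Python sketch that reads the rows quoted above into a list of dictionaries. It assumes the row layout is exactly as shown (index, nts=N, one-letter bases, comma-separated nucleotide IDs). The names parse_g_quartets and chains_of are made up for illustration and are not part of DSSR.

```python
import re

# One row of the "List of N G-quartets" section as quoted above, e.g.
#    1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
# The layout is assumed from this thread only, not from a formal spec.
QUARTET_ROW = re.compile(r"^\s*(\d+)\s+nts=(\d+)\s+(\S+)\s+(\S+)\s*$")


def parse_g_quartets(text):
    """Return one dict per G-quartet row found in `text`."""
    quartets = []
    for line in text.splitlines():
        match = QUARTET_ROW.match(line)
        if match is None:
            continue  # header line or unrelated output
        index, count, bases, ids = match.groups()
        members = ids.split(",")
        if len(members) != int(count):
            continue  # row does not agree with its own nts= count
        quartets.append({"index": int(index), "bases": bases, "nts": members})
    return quartets


def chains_of(quartet):
    """Chain ID of each nucleotide, taken as the text before the first dot."""
    return [nt.split(".", 1)[0] for nt in quartet["nts"]]


SAMPLE = """List of 4 G-quartets
   1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
"""

if __name__ == "__main__":
    for quartet in parse_g_quartets(SAMPLE):
        print(quartet["index"], quartet["bases"], chains_of(quartet))
```

Each quartet in 2chj draws one guanine from each of the four chains, which is what chains_of makes easy to check.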

Is the above info useful? What more do you need? DSSR can potentially separate the four strands, and characterize the loops and directionality between the strands (as for i-motif).


--- Quote ---Second, it identifies each strand as a single-stranded segment, which is clearly a bug. Not sure how a single-stranded segment is defined, but I'm guessing it has to do with there being no first order base pairs (i.e. one to one base pair) in the structure.
--- End quote ---

From the perspective of a G-quadruplex structure, you may call DSSR's list of single-stranded segments "clearly a bug". Nevertheless, DSSR's output follows the conventions largely adopted by the RNA (secondary) structure community, to the extent, of course, that I understand them and can put them into a self-consistent software program.

As I see it, your concerns about DSSR (in general) can be addressed by furthering its characterization of G-quadruplex-specific features, as it already does for the i-motif. No one size fits all. DSSR has been designed to work for the most common cases (by default), but can be quickly tailored to specific needs on a case-by-case basis. Browsing the Forum, you'll find several such cases.

HTH,

Xiang-Jun

xiangjun:
I've followed the literature on G-quadruplex structures for a long while, and 3DNA/DSSR already contain the basic components for making sense of them. This thread has prompted me to integrate the pieces, and to add tailored analysis/annotation features for G-quadruplexes. The coming DSSR v1.7.0 release will have a section dedicated to the G-quadruplex, as it does for the i-motif.

Xiang-Jun

xiangjun:
As a follow-up, it is worth noting that DSSR v1.7.0-2017oct19, released yesterday, contains features to automatically identify and fully characterize G-quadruplexes. Here are some examples:


* PDB entry: 2chj
List of 1 G4-stem
  stem#1[#1] layers=4 inter-molecular parallel
   1 syn=---- WC-->Major area=23.67 rise=3.04 twist=21.10 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 syn=---- WC-->Major area=9.60  rise=3.65 twist=32.28 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 syn=---- WC-->Major area=18.59 rise=3.68 twist=22.74 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 syn=---- WC-->Major                                  nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
    strand#1  +1 DNA syn=---- nts=4 GgGg A.DG2,A.LCG3,A.DG4,A.LCG5
    strand#2  +1 DNA syn=---- nts=4 GgGg B.DG8,B.LCG9,B.DG10,B.LCG11
    strand#3  +1 DNA syn=---- nts=4 GgGg C.DG14,C.LCG15,C.DG16,C.LCG17
    strand#4  +1 DNA syn=---- nts=4 GgGg D.DG20,D.LCG21,D.DG22,D.LCG23
* PDB entry: 5dww
  stem#1[#1] layers=3 INTRA-molecular parallel
   1 syn=---- WC-->Major area=11.63 rise=3.65 twist=31.14 nts=4 GGGG A.DG1,A.DG5,A.DG9,A.DG14
   2 syn=.--- WC-->Major area=10.64 rise=3.54 twist=28.10 nts=4 GGGG A.DG2,A.DG6,A.DG10,A.DG15
   3 syn=---- WC-->Major                                  nts=4 GGGG A.DG3,A.DG7,A.DG11,A.DG16
    strand#1  +1 DNA syn=-.- nts=3 GGG A.DG1,A.DG2,A.DG3
    strand#2  +1 DNA syn=--- nts=3 GGG A.DG5,A.DG6,A.DG7
    strand#3  +1 DNA syn=--- nts=3 GGG A.DG9,A.DG10,A.DG11
    strand#4  +1 DNA syn=--- nts=3 GGG A.DG14,A.DG15,A.DG16
    loop#1 type=propeller strands=[#1,#2] nts=1 T A.DT4
    loop#2 type=propeller strands=[#2,#3] nts=1 T A.DT8
    loop#3 type=propeller strands=[#3,#4] nts=2 TT A.DT12,A.DT13
* PDB entry: 2hy9
  stem#1[#1] layers=3 INTRA-molecular anti-parallel
   1 syn=ss-s Major-->WC area=13.69 rise=3.14 twist=19.08 nts=4 GGGG 1.DG4,1.DG10,1.DG18,1.DG22
   2 syn=--s- WC-->Major area=13.40 rise=3.05 twist=28.05 nts=4 GGGG 1.DG5,1.DG11,1.DG17,1.DG23
   3 syn=--s- WC-->Major                                  nts=4 GGGG 1.DG6,1.DG12,1.DG16,1.DG24
    strand#1  +1 DNA syn=s-- nts=3 GGG 1.DG4,1.DG5,1.DG6
    strand#2  +1 DNA syn=s-- nts=3 GGG 1.DG10,1.DG11,1.DG12
    strand#3  -1 DNA syn=-ss nts=3 GGG 1.DG18,1.DG17,1.DG16
    strand#4  +1 DNA syn=s-- nts=3 GGG 1.DG22,1.DG23,1.DG24
    loop#1 type=propeller strands=[#1,#2] nts=3 TTA 1.DT7,1.DT8,1.DA9
    loop#2 type=lateral   strands=[#2,#3] nts=3 TTA 1.DT13,1.DT14,1.DA15
    loop#3 type=lateral   strands=[#3,#4] nts=3 TTA 1.DT19,1.DT20,1.DA21
* PDB entry: 5hix
  stem#1[#1] layers=4 inter-molecular anti-parallel
   1 syn=s--s Major-->WC area=12.93 rise=3.64 twist=16.82 nts=4 GGGG A.DG1,B.DG4,A.DG12,B.DG9
   2 syn=-ss- WC-->Major area=18.96 rise=3.71 twist=35.87 nts=4 GGGG A.DG2,B.DG3,A.DG11,B.DG10
   3 syn=s--s Major-->WC area=15.16 rise=3.64 twist=18.64 nts=4 GGGG A.DG3,B.DG2,A.DG10,B.DG11
   4 syn=-ss- WC-->Major                                  nts=4 GGGG A.DG4,B.DG1,A.DG9,B.DG12
    strand#1  +1 DNA syn=s-s- nts=4 GGGG A.DG1,A.DG2,A.DG3,A.DG4
    strand#2  -1 DNA syn=-s-s nts=4 GGGG B.DG4,B.DG3,B.DG2,B.DG1
    strand#3  -1 DNA syn=-s-s nts=4 GGGG A.DG12,A.DG11,A.DG10,A.DG9
    strand#4  +1 DNA syn=s-s- nts=4 GGGG B.DG9,B.DG10,B.DG11,B.DG12
    loop#1 type=diagonal  strands=[#1,#3] nts=4 TTTT A.DT5,A.DT6,A.DT7,A.DT8
    loop#2 type=diagonal  strands=[#2,#4] nts=4 TTTT B.DT5,B.DT6,B.DT7,B.DT8
* PDB entry: 2m4p
  stem#1[#1] layers=3 INTRA-molecular parallel bulged-strands=1
   1 syn=---- WC-->Major area=8.38  rise=3.64 twist=33.34 nts=4 GGGG A.DG3,A.DG8,A.DG12,A.DG16
   2 syn=---- WC-->Major area=10.73 rise=3.23 twist=32.42 nts=4 GGGG A.DG5,A.DG9,A.DG13,A.DG17
   3 syn=---- WC-->Major                                  nts=4 GGGG A.DG6,A.DG10,A.DG14,A.DG18
    strand#1* +1 DNA syn=--- nts=3 GGG A.DG3,A.DG5,A.DG6 bulged-nts=1 T A.DT4
    strand#2  +1 DNA syn=--- nts=3 GGG A.DG8,A.DG9,A.DG10
    strand#3  +1 DNA syn=--- nts=3 GGG A.DG12,A.DG13,A.DG14
    strand#4  +1 DNA syn=--- nts=3 GGG A.DG16,A.DG17,A.DG18
    loop#1 type=propeller strands=[#1,#2] nts=1 T A.DT7
    loop#2 type=propeller strands=[#2,#3] nts=1 T A.DT11
    loop#3 type=propeller strands=[#3,#4] nts=1 T A.DT15
Other features are also available, but are not currently reported in the output shown above. Details on the underlying algorithms and a survey of PDB entries will be reported in a formal publication.
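
The G4-stem listings above are regular enough to parse with a short script. The Python sketch below reads the stem header, the strand rows and the loop rows of one stem, assuming the layout is exactly as printed in this thread. The function names are hypothetical and not part of DSSR. The orientation check is only an observation from the five examples above (all strands +1 goes with "parallel", mixed +1/-1 goes with "anti-parallel"); it is not a statement of how DSSR itself decides.

```python
import re

# Row layouts are assumed from the listings in this thread only.
STEM_HEADER = re.compile(r"layers=(\d+)\s+(\S+)-molecular\s+(\S+)")
STRAND_ROW = re.compile(
    r"^\s*strand#(\d+)\*?\s+([+-]1)\s+(\S+)\s+syn=(\S+)\s+nts=(\d+)\s+(\S+)\s+(\S+)"
)
LOOP_ROW = re.compile(
    r"^\s*loop#(\d+)\s+type=(\S+)\s+strands=\[#(\d+),#(\d+)\]"
    r"\s+nts=(\d+)\s+(\S+)\s+(\S+)"
)


def parse_g4_stem(text):
    """Collect the header, strands and loops of a single G4-stem listing."""
    stem = {"layers": None, "molecularity": None, "reported": None,
            "strands": [], "loops": []}
    for line in text.splitlines():
        m = STRAND_ROW.match(line)
        if m:
            stem["strands"].append({
                "index": int(m.group(1)),
                "direction": int(m.group(2)),  # +1 or -1 as printed
                "syn": m.group(4),
                "nts": m.group(7).split(","),
            })
            continue
        m = LOOP_ROW.match(line)
        if m:
            stem["loops"].append({
                "index": int(m.group(1)),
                "type": m.group(2),
                "strands": (int(m.group(3)), int(m.group(4))),
                "nts": m.group(7).split(","),
            })
            continue
        m = STEM_HEADER.search(line)
        if m and stem["layers"] is None:
            stem["layers"] = int(m.group(1))
            stem["molecularity"] = m.group(2).lower()  # "intra" or "inter"
            stem["reported"] = m.group(3)
    return stem


def orientation_from_strands(stem):
    """Heuristic from the examples above: one shared direction means parallel."""
    if not stem["strands"]:
        raise ValueError("no strand rows found")
    directions = {strand["direction"] for strand in stem["strands"]}
    return "parallel" if len(directions) == 1 else "anti-parallel"


SAMPLE_2HY9 = """  stem#1[#1] layers=3 INTRA-molecular anti-parallel
   1 syn=ss-s Major-->WC area=13.69 rise=3.14 twist=19.08 nts=4 GGGG 1.DG4,1.DG10,1.DG18,1.DG22
    strand#1  +1 DNA syn=s-- nts=3 GGG 1.DG4,1.DG5,1.DG6
    strand#2  +1 DNA syn=s-- nts=3 GGG 1.DG10,1.DG11,1.DG12
    strand#3  -1 DNA syn=-ss nts=3 GGG 1.DG18,1.DG17,1.DG16
    strand#4  +1 DNA syn=s-- nts=3 GGG 1.DG22,1.DG23,1.DG24
    loop#1 type=propeller strands=[#1,#2] nts=3 TTA 1.DT7,1.DT8,1.DA9
    loop#2 type=lateral   strands=[#2,#3] nts=3 TTA 1.DT13,1.DT14,1.DA15
    loop#3 type=lateral   strands=[#3,#4] nts=3 TTA 1.DT19,1.DT20,1.DA21
"""

if __name__ == "__main__":
    stem = parse_g4_stem(SAMPLE_2HY9)
    print(stem["reported"], orientation_from_strands(stem))
    print([loop["type"] for loop in stem["loops"]])
```

Comparing stem["reported"] with orientation_from_strands(stem) gives a quick sanity check on a parsed listing. The bulged strand in 2m4p (strand#1*) is matched as well, since the pattern allows an optional asterisk after the strand number.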

Best regards,

Xiang-Jun

Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.
