
Author Topic: Odd output for G-quadruplex structure  (Read 22519 times)

Offline jms89

Odd output for G-quadruplex structure
« on: September 20, 2017, 02:50:41 pm »
Hi, for the structure 2chj, http://www.rcsb.org/pdb/explore/explore.do?structureId=2chj, DSSR produces some odd output.

First, it identifies two helices instead of one. This is somewhat understandable since removing two strands would still produce a helical structure, but it would be nice if DSSR could identify higher-order helices automatically. Perhaps the definition of a base-pair could be generalized to include any number of nucleotides, and helices could be defined based on the stacking of these generalized base-pairs.

Second, it identifies each strand as a single-stranded segment, which is clearly a bug. Not sure how a single-stranded segment is defined, but I'm guessing it has to do with there being no first order base pairs (i.e. one to one base pair) in the structure.


Offline xiangjun

Re: Odd output for G-quadruplex structure
« Reply #1 on: September 20, 2017, 06:31:49 pm »
Quote
Hi, for the structure 2chj, http://www.rcsb.org/pdb/explore/explore.do?structureId=2chj, DSSR produces some odd output.

I understand what you meant. The helix/stem/loop/ss-fragment definitions, as described in the 2015 DSSR paper in NAR, follow the literature on RNA secondary structure, which is based on canonical base pairs (WC and G-U wobble). In my experience, the community is not necessarily consistent in its nomenclature/definitions (if any) for basics such as (double) helix/stem/arm/paired-region/loop/pseudoknot and coaxial stacking.

Quote
First, it identifies two helices instead of one. This is somewhat understandable since removing two strands would still produce a helical structure, but it would be nice if DSSR could identify higher-order helices automatically. Perhaps the definition of a base-pair could be generalized to include any number of nucleotides, and helices could be defined based on the stacking of these generalized base-pairs.

As you noticed, DSSR identifies two helices in a G-quadruplex structure, which clearly looks odd for this well-known structure type. This is just how DSSR works on general RNA/DNA structures, where duplexes are the most frequent. Higher-level structures are case-specific: for example, the G-quadruplex and the i-motif are both composed of 4 strands, even though they are obviously different.

As of v1.6.1-2016aug22, DSSR can already detect and characterize i-motifs (see the DSSR User Manual). For a G-quadruplex structure (e.g. 2chj), did you notice the section "List of 4 G-quartets", as shown below?

Code:
List of 4 G-quartets
   1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23

Is the above info useful? What more do you need? DSSR can potentially separate the four strands, and characterize the loops and directionality between the strands (as for i-motif).
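
For readers who want to post-process the G-quartet listing above, here is a minimal Python sketch. It assumes the line layout seen in that listing (index, nts=N, one-letter codes, comma-separated nucleotide ids) and is not an official DSSR parser; the names parse_quartets and nts_by_chain are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Listing copied from the DSSR output for 2chj quoted above.
DSSR_OUTPUT = """\
List of 4 G-quartets
   1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
"""

# Assumed layout: index, "nts=N", one-letter codes, comma-separated ids.
QUARTET_LINE = re.compile(r"^\s*(\d+)\s+nts=(\d+)\s+(\S+)\s+(\S+)\s*$")


def parse_quartets(text):
    """Return one dict per quartet line; header lines are skipped."""
    quartets = []
    for line in text.splitlines():
        m = QUARTET_LINE.match(line)
        if not m:
            continue
        index, count, codes, ids = m.groups()
        nts = ids.split(",")
        if len(nts) != int(count):
            raise ValueError(f"expected {count} nucleotides in: {line!r}")
        quartets.append({"index": int(index), "codes": codes, "nts": nts})
    return quartets


def nts_by_chain(quartets):
    """Group the quartet members by the chain id before the first dot."""
    chains = defaultdict(list)
    for quartet in quartets:
        for nt in quartet["nts"]:
            chain, _, residue = nt.partition(".")
            chains[chain].append(residue)
    return dict(chains)


quartets = parse_quartets(DSSR_OUTPUT)
for chain, residues in nts_by_chain(quartets).items():
    print(chain, residues)
```

Grouping the members by chain id recovers, for each of the four chains, the run of guanines that takes part in the quartets.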

Quote
Second, it identifies each strand as a single-stranded segment, which is clearly a bug. Not sure how a single-stranded segment is defined, but I'm guessing it has to do with there being no first order base pairs (i.e. one to one base pair) in the structure.

From your own perspective, DSSR's list of single-stranded segments for a G-quadruplex structure may look like "clearly a bug". Nevertheless, the output of DSSR follows the conventions largely adopted by the RNA (secondary) structure community, at least to the extent that I understand them and can put them into a self-consistent software program.

As I see it, your concerns about DSSR (in general) can be addressed by furthering its characterization of G-quadruplex-specific features, as it already does for the i-motif. No one size fits all. DSSR has been designed to work for the most common cases (by default), but it can be quickly tailored to specific needs on a case-by-case basis. Browsing the Forum, you'll find several such cases.

HTH,

Xiang-Jun
« Last Edit: September 20, 2017, 07:27:31 pm by xiangjun »

Offline xiangjun

Re: Odd output for G-quadruplex structure
« Reply #2 on: September 22, 2017, 02:45:09 pm »
I've followed the literature on G-quadruplex structures for a long while, and 3DNA/DSSR already contain the basic components for making sense of them. This thread has prompted me to integrate the pieces and to add tailored analysis/annotation features for G-quadruplexes. The coming DSSR v1.7.0 release will have a section dedicated to G-quadruplexes, as it does for the i-motif.

Xiang-Jun

Offline xiangjun

Re: Odd output for G-quadruplex structure
« Reply #3 on: October 20, 2017, 01:22:48 pm »
As a follow-up, it is worth noting that DSSR v1.7.0-2017oct19, released yesterday, contains features to automatically identify and fully characterize G-quadruplexes. Here are some examples:

  • PDB entry: 2chj
    List of 1 G4-stem
      stem#1[#1] layers=4 inter-molecular parallel
       1 syn=---- WC-->Major area=23.67 rise=3.04 twist=21.10 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
       2 syn=---- WC-->Major area=9.60  rise=3.65 twist=32.28 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
       3 syn=---- WC-->Major area=18.59 rise=3.68 twist=22.74 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
       4 syn=---- WC-->Major                                  nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
        strand#1  +1 DNA syn=---- nts=4 GgGg A.DG2,A.LCG3,A.DG4,A.LCG5
        strand#2  +1 DNA syn=---- nts=4 GgGg B.DG8,B.LCG9,B.DG10,B.LCG11
        strand#3  +1 DNA syn=---- nts=4 GgGg C.DG14,C.LCG15,C.DG16,C.LCG17
        strand#4  +1 DNA syn=---- nts=4 GgGg D.DG20,D.LCG21,D.DG22,D.LCG23
  • PDB entry: 5dww
      stem#1[#1] layers=3 INTRA-molecular parallel
       1 syn=---- WC-->Major area=11.63 rise=3.65 twist=31.14 nts=4 GGGG A.DG1,A.DG5,A.DG9,A.DG14
       2 syn=.--- WC-->Major area=10.64 rise=3.54 twist=28.10 nts=4 GGGG A.DG2,A.DG6,A.DG10,A.DG15
       3 syn=---- WC-->Major                                  nts=4 GGGG A.DG3,A.DG7,A.DG11,A.DG16
        strand#1  +1 DNA syn=-.- nts=3 GGG A.DG1,A.DG2,A.DG3
        strand#2  +1 DNA syn=--- nts=3 GGG A.DG5,A.DG6,A.DG7
        strand#3  +1 DNA syn=--- nts=3 GGG A.DG9,A.DG10,A.DG11
        strand#4  +1 DNA syn=--- nts=3 GGG A.DG14,A.DG15,A.DG16
        loop#1 type=propeller strands=[#1,#2] nts=1 T A.DT4
        loop#2 type=propeller strands=[#2,#3] nts=1 T A.DT8
        loop#3 type=propeller strands=[#3,#4] nts=2 TT A.DT12,A.DT13
  • PDB entry: 2hy9
      stem#1[#1] layers=3 INTRA-molecular anti-parallel
       1 syn=ss-s Major-->WC area=13.69 rise=3.14 twist=19.08 nts=4 GGGG 1.DG4,1.DG10,1.DG18,1.DG22
       2 syn=--s- WC-->Major area=13.40 rise=3.05 twist=28.05 nts=4 GGGG 1.DG5,1.DG11,1.DG17,1.DG23
       3 syn=--s- WC-->Major                                  nts=4 GGGG 1.DG6,1.DG12,1.DG16,1.DG24
        strand#1  +1 DNA syn=s-- nts=3 GGG 1.DG4,1.DG5,1.DG6
        strand#2  +1 DNA syn=s-- nts=3 GGG 1.DG10,1.DG11,1.DG12
        strand#3  -1 DNA syn=-ss nts=3 GGG 1.DG18,1.DG17,1.DG16
        strand#4  +1 DNA syn=s-- nts=3 GGG 1.DG22,1.DG23,1.DG24
        loop#1 type=propeller strands=[#1,#2] nts=3 TTA 1.DT7,1.DT8,1.DA9
        loop#2 type=lateral   strands=[#2,#3] nts=3 TTA 1.DT13,1.DT14,1.DA15
        loop#3 type=lateral   strands=[#3,#4] nts=3 TTA 1.DT19,1.DT20,1.DA21
  • PDB entry: 5hix
      stem#1[#1] layers=4 inter-molecular anti-parallel
       1 syn=s--s Major-->WC area=12.93 rise=3.64 twist=16.82 nts=4 GGGG A.DG1,B.DG4,A.DG12,B.DG9
       2 syn=-ss- WC-->Major area=18.96 rise=3.71 twist=35.87 nts=4 GGGG A.DG2,B.DG3,A.DG11,B.DG10
       3 syn=s--s Major-->WC area=15.16 rise=3.64 twist=18.64 nts=4 GGGG A.DG3,B.DG2,A.DG10,B.DG11
       4 syn=-ss- WC-->Major                                  nts=4 GGGG A.DG4,B.DG1,A.DG9,B.DG12
        strand#1  +1 DNA syn=s-s- nts=4 GGGG A.DG1,A.DG2,A.DG3,A.DG4
        strand#2  -1 DNA syn=-s-s nts=4 GGGG B.DG4,B.DG3,B.DG2,B.DG1
        strand#3  -1 DNA syn=-s-s nts=4 GGGG A.DG12,A.DG11,A.DG10,A.DG9
        strand#4  +1 DNA syn=s-s- nts=4 GGGG B.DG9,B.DG10,B.DG11,B.DG12
        loop#1 type=diagonal  strands=[#1,#3] nts=4 TTTT A.DT5,A.DT6,A.DT7,A.DT8
        loop#2 type=diagonal  strands=[#2,#4] nts=4 TTTT B.DT5,B.DT6,B.DT7,B.DT8
  • PDB entry: 2m4p
      stem#1[#1] layers=3 INTRA-molecular parallel bulged-strands=1
       1 syn=---- WC-->Major area=8.38  rise=3.64 twist=33.34 nts=4 GGGG A.DG3,A.DG8,A.DG12,A.DG16
       2 syn=---- WC-->Major area=10.73 rise=3.23 twist=32.42 nts=4 GGGG A.DG5,A.DG9,A.DG13,A.DG17
       3 syn=---- WC-->Major                                  nts=4 GGGG A.DG6,A.DG10,A.DG14,A.DG18
        strand#1* +1 DNA syn=--- nts=3 GGG A.DG3,A.DG5,A.DG6 bulged-nts=1 T A.DT4
        strand#2  +1 DNA syn=--- nts=3 GGG A.DG8,A.DG9,A.DG10
        strand#3  +1 DNA syn=--- nts=3 GGG A.DG12,A.DG13,A.DG14
        strand#4  +1 DNA syn=--- nts=3 GGG A.DG16,A.DG17,A.DG18
        loop#1 type=propeller strands=[#1,#2] nts=1 T A.DT7
        loop#2 type=propeller strands=[#2,#3] nts=1 T A.DT11
        loop#3 type=propeller strands=[#3,#4] nts=1 T A.DT15
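
The layer lines in the listings above follow a regular layout, so the per-layer numbers can be pulled out with a short script. The following is a minimal Python sketch, assuming only the field order visible in those listings (index, syn=, groove label, optional area/rise/twist, nts=, codes, ids); parse_layers and mean_of are hypothetical helper names, and the script makes no claim about how DSSR computes the values. The last layer of each stem carries no area/rise/twist, so those fields are stored as None.

```python
import re
from statistics import mean

# G4-stem block copied from the 2chj example above.
G4_STEM_2CHJ = """\
  stem#1[#1] layers=4 inter-molecular parallel
   1 syn=---- WC-->Major area=23.67 rise=3.04 twist=21.10 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 syn=---- WC-->Major area=9.60  rise=3.65 twist=32.28 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 syn=---- WC-->Major area=18.59 rise=3.68 twist=22.74 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 syn=---- WC-->Major                                  nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
"""

# Assumed layout; the area/rise/twist group is optional (absent on the last layer).
LAYER_LINE = re.compile(
    r"^\s*(\d+)\s+syn=(\S+)\s+(\S+)\s+"
    r"(?:area=(\S+)\s+rise=(\S+)\s+twist=(\S+)\s+)?"
    r"nts=\d+\s+\S+\s+(\S+)\s*$"
)


def to_float(value):
    """Convert a captured number, keeping None for a missing field."""
    return float(value) if value is not None else None


def parse_layers(text):
    """Return one dict per layer line; stem/strand/loop lines are skipped."""
    layers = []
    for line in text.splitlines():
        m = LAYER_LINE.match(line)
        if not m:
            continue
        index, syn, groove, area, rise, twist, ids = m.groups()
        layers.append({
            "index": int(index),
            "syn": syn,
            "groove": groove,
            "area": to_float(area),
            "rise": to_float(rise),
            "twist": to_float(twist),
            "nts": ids.split(","),
        })
    return layers


def mean_of(layers, key):
    """Average a numeric field over the layers that report it."""
    return mean(layer[key] for layer in layers if layer[key] is not None)


layers = parse_layers(G4_STEM_2CHJ)
print(f"layers parsed: {len(layers)}")
print(f"mean rise:  {mean_of(layers, 'rise'):.2f}")
print(f"mean twist: {mean_of(layers, 'twist'):.2f}")
```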

Other features are also available, but not currently reported (as shown above). Details on the underlying algorithms and a survey of PDB entries will be reported in a formal publication.
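
Until those details are published, the labels on the stem lines above can at least be cross-checked against the strand lines. The following minimal Python sketch rests on two assumptions inferred only from the five examples in this post, not on DSSR's documented rules: a stem is labeled parallel when all strands share the same +1/-1 direction (anti-parallel otherwise), and INTRA-molecular when all strand nucleotides share one chain id (inter-molecular otherwise). The names parse_strands and guess_labels are hypothetical.

```python
import re

# Strand lines copied from the 2hy9 and 5hix examples above.
STRANDS_2HY9 = """\
        strand#1  +1 DNA syn=s-- nts=3 GGG 1.DG4,1.DG5,1.DG6
        strand#2  +1 DNA syn=s-- nts=3 GGG 1.DG10,1.DG11,1.DG12
        strand#3  -1 DNA syn=-ss nts=3 GGG 1.DG18,1.DG17,1.DG16
        strand#4  +1 DNA syn=s-- nts=3 GGG 1.DG22,1.DG23,1.DG24
"""
STRANDS_5HIX = """\
        strand#1  +1 DNA syn=s-s- nts=4 GGGG A.DG1,A.DG2,A.DG3,A.DG4
        strand#2  -1 DNA syn=-s-s nts=4 GGGG B.DG4,B.DG3,B.DG2,B.DG1
        strand#3  -1 DNA syn=-s-s nts=4 GGGG A.DG12,A.DG11,A.DG10,A.DG9
        strand#4  +1 DNA syn=s-s- nts=4 GGGG B.DG9,B.DG10,B.DG11,B.DG12
"""

# Assumed layout; "\*?" allows the asterisk that marks a bulged strand (as in 2m4p).
STRAND_LINE = re.compile(
    r"^\s*strand#(\d+)\*?\s+([+-]1)\s+\S+\s+syn=\S+\s+nts=\d+\s+\S+\s+(\S+)"
)


def parse_strands(text):
    """Return number, direction (+1 or -1) and nucleotide ids per strand line."""
    strands = []
    for line in text.splitlines():
        m = STRAND_LINE.match(line)
        if not m:
            continue
        number, direction, ids = m.groups()
        strands.append({
            "number": int(number),
            "direction": int(direction),
            "nts": ids.split(","),
        })
    return strands


def guess_labels(strands):
    """Guess the (molecularity, orientation) labels under the stated assumptions."""
    directions = {s["direction"] for s in strands}
    chains = {nt.split(".", 1)[0] for s in strands for nt in s["nts"]}
    molecularity = "INTRA-molecular" if len(chains) == 1 else "inter-molecular"
    orientation = "parallel" if len(directions) == 1 else "anti-parallel"
    return molecularity, orientation


for name, block in (("2hy9", STRANDS_2HY9), ("5hix", STRANDS_5HIX)):
    print(name, *guess_labels(parse_strands(block)))
```

Both guesses can be compared by eye with the stem#1 lines of the corresponding entries; a mismatch on another structure would mean the assumed rule is too simple, not that DSSR is wrong.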

Best regards,

Xiang-Jun

 

Funded by X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids (R24GM153869)

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University