
Author Topic: Odd output for G-quadruplex structure  (Read 116 times)

Offline jms89

Odd output for G-quadruplex structure
« on: September 20, 2017, 02:50:41 pm »
Hi, for the structure 2chj, http://www.rcsb.org/pdb/explore/explore.do?structureId=2chj, DSSR produces some odd output.

First, it identifies two helices instead of one. This is somewhat understandable since removing two strands would still produce a helical structure, but it would be nice if DSSR could identify higher-order helices automatically. Perhaps the definition of a base-pair could be generalized to include any number of nucleotides, and helices could be defined based on the stacking of these generalized base-pairs.

Second, it identifies each strand as a single-stranded segment, which is clearly a bug. Not sure how a single-stranded segment is defined, but I'm guessing it has to do with there being no first order base pairs (i.e. one to one base pair) in the structure.


Offline xiangjun

Re: Odd output for G-quadruplex structure
« Reply #1 on: September 20, 2017, 06:31:49 pm »
Quote
Hi, for the structure 2chj, http://www.rcsb.org/pdb/explore/explore.do?structureId=2chj, DSSR produces some odd output.

I understand what you mean. The helix/stem/loop/ss-fragment definitions, as described in the 2015 DSSR paper in NAR, follow the literature on RNA secondary structure, which is based on canonical base pairs (WC and G-U wobble). In my experience, the community is not necessarily consistent in its nomenclature/definitions (if any) for the basics of (double) helix/stem/arm/paired-region/loop/pseudoknot and coaxial stacking.

Quote
First, it identifies two helices instead of one. This is somewhat understandable since removing two strands would still produce a helical structure, but it would be nice if DSSR could identify higher-order helices automatically. Perhaps the definition of a base-pair could be generalized to include any number of nucleotides, and helices could be defined based on the stacking of these generalized base-pairs.

As you noticed, DSSR identifies two helices in a G-quadruplex structure, which clearly looks odd for this well-known structure type. This is simply how DSSR works on general RNA/DNA structures, where duplexes are the most frequent. Higher-level structures are case-specific: for example, G-quadruplexes and i-motifs are both composed of four strands, yet they are obviously different.

As of v1.6.1-2016aug22, DSSR can already detect and characterize i-motifs (see the DSSR User Manual). For a G-quadruplex structure (e.g. 2chj), did you notice the section "List of 4 G-quartets", shown below?

Code:
List of 4 G-quartets
   1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23

Is the above info useful? What more do you need? DSSR can potentially separate the four strands, and characterize the loops and directionality between the strands (as for i-motif).
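
For anyone who wants to post-process the G-quartet listing in the meantime, here is a minimal Python sketch. It assumes the section is laid out exactly as in the listing above (index, `nts=N`, one-letter base string, comma-separated `chain.residue` ids); the function names `parse_g_quartets` and `strands_by_chain` are hypothetical helpers and not part of DSSR.

```python
import re
from collections import defaultdict

# Text copied from the "List of 4 G-quartets" section shown above.
DSSR_OUTPUT = """\
List of 4 G-quartets
   1 nts=4 GGGG A.DG2,B.DG8,C.DG14,D.DG20
   2 nts=4 gggg A.LCG3,B.LCG9,C.LCG15,D.LCG21
   3 nts=4 GGGG A.DG4,B.DG10,C.DG16,D.DG22
   4 nts=4 gggg A.LCG5,B.LCG11,C.LCG17,D.LCG23
"""

# Assumed line layout: "<index> nts=<count> <bases> <comma-separated ids>"
LINE_RE = re.compile(r"^\s*(\d+)\s+nts=(\d+)\s+(\S+)\s+(\S+)\s*$")


def parse_g_quartets(text):
    """Return one dict per G-quartet line; lines that do not match are skipped."""
    quartets = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # e.g. the "List of 4 G-quartets" header
        index, count, bases, ids = match.groups()
        nts = ids.split(",")
        if len(nts) != int(count):
            raise ValueError(f"nts={count} but found {len(nts)} ids: {line!r}")
        quartets.append({"index": int(index), "bases": bases, "nts": nts})
    return quartets


def strands_by_chain(quartets):
    """Group the quartet nucleotides by chain id (the part before the first dot)."""
    strands = defaultdict(list)
    for quartet in quartets:
        for nt in quartet["nts"]:
            chain, residue = nt.split(".", 1)
            strands[chain].append(residue)
    return dict(strands)


quartets = parse_g_quartets(DSSR_OUTPUT)
for chain, residues in strands_by_chain(quartets).items():
    print(chain, residues)
```

Grouping by chain id is only a rough stand-in for separating the four strands: it recovers which residues each chain contributes to the quartets, but says nothing about loops or strand directionality.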

Quote
Second, it identifies each strand as a single-stranded segment, which is clearly a bug. Not sure how a single-stranded segment is defined, but I'm guessing it has to do with there being no first order base pairs (i.e. one to one base pair) in the structure.

From your own perspective on a G-quadruplex structure, you may call DSSR's list of single-stranded segments "clearly a bug". Nevertheless, the output of DSSR follows the conventions largely adopted by the RNA (secondary) structure community, to the extent, of course, that I understand them and can put them into a self-consistent software program.
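
To illustrate why a convention based on canonical pairs reports every strand of a G-quadruplex as single-stranded, here is a small Python sketch. It is an illustration under stated assumptions, not DSSR's actual algorithm: a single-stranded segment is taken to be a maximal run of consecutive nucleotides in a chain that take part in no canonical (WC or G-U wobble) pair, and `single_stranded_segments` is a hypothetical helper.

```python
def single_stranded_segments(chain_nts, canonically_paired):
    """Split a chain into maximal runs of nucleotides without a canonical pair.

    chain_nts: nucleotide ids of one chain, in sequence order.
    canonically_paired: set of ids involved in a WC or G-U wobble pair.
    """
    segments, current = [], []
    for nt in chain_nts:
        if nt in canonically_paired:
            if current:  # a paired nucleotide closes the running segment
                segments.append(current)
                current = []
        else:
            current.append(nt)
    if current:
        segments.append(current)
    return segments


# Chain A nucleotides taken from the G-quartet list above (only those ids).
chain_a = ["A.DG2", "A.LCG3", "A.DG4", "A.LCG5"]

# G-quartets are held together by non-canonical G-G pairs, so under this
# assumed definition the set of canonically paired nucleotides is empty.
print(single_stranded_segments(chain_a, canonically_paired=set()))
```

With no canonical pairs in the input, the whole chain comes back as one segment, which matches the behavior described in the original question.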

As I see it, your concerns about DSSR (in general) can be addressed by extending its characterization of G-quadruplex-specific features, as it already does for the i-motif. No one size fits all. DSSR has been designed to work for the most common cases (by default), but it can be quickly tailored to specific needs on a case-by-case basis. Browsing the Forum, you'll find several such cases.

HTH,

Xiang-Jun
« Last Edit: September 20, 2017, 07:27:31 pm by xiangjun »
Dr. Xiang-Jun Lu [律祥俊]
Email: xiangjun@x3dna.org
Homepage: http://x3dna.org/
Forum: http://forum.x3dna.org/

Offline xiangjun

Re: Odd output for G-quadruplex structure
« Reply #2 on: September 22, 2017, 02:45:09 pm »
I've followed the literature on G-quadruplex structures for a long while, and 3DNA/DSSR already contain the basic components for making sense of them. This thread has prompted me to integrate the pieces and to add tailored analysis/annotation features for G-quadruplexes. The coming DSSR v1.7.0 release will have a section dedicated to the G-quadruplex, as it does for the i-motif.

Xiang-Jun

 

Created and maintained by Dr. Xiang-Jun Lu [律祥俊] · Supported by the NIH grant R01GM096889 · Dr. Lu is currently a member of the Bussemaker Laboratory at the Department of Biological Sciences, Columbia University. The project is in collaboration with the Olson Laboratory at Rutgers, where 3DNA got started.