
Author Topic: Higher-order pseudoknots in DP output  (Read 37342 times)

Offline Sylverlin

  • with-posts
  • *
  • Posts: 4
    • View Profile
Higher-order pseudoknots in DP output
« on: May 11, 2014, 03:43:44 pm »
Hello,

first of all, thanks a lot for providing DSSR! It is one of the more useful secondary structure tools out there.

I ran into a problem with pseudoknots in the *.dbn output - higher-order pseudoknots are also assigned a '[]' pair instead of '{}' or some other bracket pair. This doesn't really limit the usefulness of DSSR, since I can bypass the problem by working with *.ct output, but it is confusing.
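To make the ambiguity concrete, here is a minimal Python sketch that parses extended dot-bracket strings. The bracket set, the function name `parse_dbn`, and both example strings are assumptions made for illustration; they are not taken from DSSR or its output. Three mutually crossing helices need three bracket families, and reusing '[]' for the third helix makes the parser recover different base pairs.

```python
# Bracket families assumed for this sketch; the thread does not state
# which families DSSR itself emits.
BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def parse_dbn(dbn):
    """Return sorted (i, j, opener) base pairs (1-based) from a dot-bracket string."""
    closers = {close: opener for opener, close in BRACKETS.items()}
    stacks = {opener: [] for opener in BRACKETS}  # one stack per bracket family
    pairs = []
    for pos, ch in enumerate(dbn, start=1):
        if ch in BRACKETS:
            stacks[ch].append(pos)
        elif ch in closers:
            opener = closers[ch]
            if not stacks[opener]:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stacks[opener].pop(), pos, opener))
    for opener, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {opener!r} at position {stack[-1]}")
    return sorted(pairs)


# Hypothetical structure: three helices that all cross one another.
correct = parse_dbn("((..[[..{{..))..]]..}}")
# Same structure, but with the third helix also written as '[]'.
mislabeled = parse_dbn("((..[[..[[..))..]]..]]")

print([(i, j) for i, j, _ in correct])
print([(i, j) for i, j, _ in mislabeled])
```

With distinct brackets the pairs 5-18, 6-17, 9-22 and 10-21 are recovered; with '[]' reused, the same positions are read back as 5-22, 6-21, 9-18 and 10-17, so the string no longer round-trips.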

By the way, what algorithm do you use to identify which of the conflicting base pairs should be output as pseudoknots and which as "normal" base pairs?

Thanks & thumbs up!

--Jan Hajic, Charles University in Prague

Offline xiangjun

  • Administrator
  • with-posts
  • *****
  • Posts: 1650
    • View Profile
    • 3DNA homepage
Re: Higher-order pseudoknots in DP output
« Reply #1 on: May 11, 2014, 04:35:03 pm »
Hi,

Thanks for using DSSR and posting your questions here. I am aware of the issue in dbn output on higher order pseudo-knots, and that's why in the main output file, DSSR just reports that "This structure contains at least one pseudo-knot."

The latest DSSR download does contain info on higher order pseudo-knots in dbn output by default. But overall, this part definitely needs to be improved. Thus far the algorithm for deriving dbn is unique to DSSR, based on my own understanding of the topic and of pseudo-knots, without drawing on the literature. I am now reading some references and will address this limitation in the near future. Do you have any recommendations for must-reads?

Best regards,

Xiang-Jun

Offline Sylverlin

Re: Higher-order pseudoknots in DP output
« Reply #2 on: May 11, 2014, 06:59:34 pm »
Well, I did find something by the group that started working on the Bio.RNA branch of Biopython, but I only skimmed it:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2248259/

Not sure if it really is a must-read.

I came up with coloring a conflict graph, in which each helix is a vertex and an edge joins two helices that conflict. (I assumed most such conflict graphs would be bipartite with comparatively few edges, where a few long-range pseudoknot helices are responsible for most of the edges.) The greedy criterion for selecting the next component to color is simply the number of conflicting helices: I sort components by their highest-degree vertex, from highest to lowest, and start coloring each one from that vertex, using a standard FIFO queue. The first color used is '[' (the helix with the most conflicts in the component gets labeled as a pseudoknotted helix), and further colors are assigned from a priority list that starts with '(', so that a helix is labeled as non-pseudoknotted whenever that is currently possible. If the graph component is trivial (a helix with no conflicts), I start coloring it with '('. There could be other greedy criteria, for instance some function of helix length, number of base pairs, number of spanned helices, etc.
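The coloring scheme described above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the poster's actual code: helices are represented as lists of 1-based (i, j) base pairs, two helices conflict when any of their pairs cross, the priority list `PRIORITY` is a guess at the bracket order, and all function names are hypothetical.

```python
from collections import deque

# Assumed priority list: '(' first, so a helix is left non-pseudoknotted
# whenever its already-colored neighbours allow it.
PRIORITY = ["(", "[", "{", "<"]


def crosses(p, q):
    """True if base pairs p and q interleave (i < k < j < l)."""
    (i, j), (k, l) = sorted([p, q])
    return i < k < j < l


def build_conflict_graph(helices):
    """Adjacency sets: helices a and b are joined if any of their pairs cross."""
    adj = {v: set() for v in range(len(helices))}
    for a in range(len(helices)):
        for b in range(a + 1, len(helices)):
            if any(crosses(p, q) for p in helices[a] for q in helices[b]):
                adj[a].add(b)
                adj[b].add(a)
    return adj


def components(adj):
    """Connected components of the conflict graph, found by BFS."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:
            v = queue.popleft()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps


def color_helices(helices):
    """Return one bracket character per helix, in input order."""
    adj = build_conflict_graph(helices)
    degree = {v: len(adj[v]) for v in adj}
    color = {}
    # Components with the most conflicted helix are colored first.
    ordered = sorted(components(adj),
                     key=lambda c: max(degree[v] for v in c), reverse=True)
    for comp in ordered:
        root = max(comp, key=lambda v: degree[v])
        # Most-conflicted helix becomes a pseudoknot; isolated helices stay '('.
        color[root] = "[" if degree[root] > 0 else "("
        queue = deque([root])  # FIFO queue, i.e. breadth-first coloring
        while queue:
            v = queue.popleft()
            for w in sorted(adj[v]):
                if w in color:
                    continue
                used = {color[u] for u in adj[w] if u in color}
                free = [c for c in PRIORITY if c not in used]
                if not free:
                    raise ValueError("ran out of bracket types")
                color[w] = free[0]
                queue.append(w)
    return [color[v] for v in range(len(helices))]


# One long-range helix crossing two local helices, plus an isolated helix.
star = [[(5, 30), (6, 29)], [(1, 10), (2, 9)], [(20, 40), (21, 39)], [(50, 60)]]
print(color_helices(star))

# Three mutually crossing helices: a non-bipartite component.
triangle = [[(1, 14), (2, 13)], [(5, 18), (6, 17)], [(9, 22), (10, 21)]]
print(color_helices(triangle))
```

In the first example only the long-range helix is labeled '['; in the second, the triangle forces a third bracket type, which is the higher-order case this thread is about. Ties between equally conflicted helices are broken by input order here, which is an arbitrary choice of the sketch.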

JH.

Offline xiangjun

Re: Higher-order pseudoknots in DP output
« Reply #3 on: May 11, 2014, 07:21:16 pm »
Hi JH,

It turned out that I just read the paper "From knotted to nested RNA structures: A variety of computational methods for pseudoknot removal" by Smit et al. last night! A nice point of the paper is that it comes with reference Python implementations of the various methods for pseudoknot removal.

I like the DP-based optimization approach (OA), and I am considering adapting it for DSSR with a customized scoring function that takes into account the geometric parameters of the conflicting helices. Of course, in the context of DSSR, the algorithm should output dbn with higher-order pseudo-knots properly labeled instead of removed. One caveat is that some chains in RNA structures deposited in the PDB have missing nucleotides. I will update DSSR when this revision is finished.
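As a rough illustration of the general idea behind a DP-based optimization, the Python sketch below selects a maximum-score nested subset of base pairs with an interval dynamic program. It is not the reference implementation from Smit et al. and not DSSR's code: the function name `max_nested_pairs`, the default score of 1 per pair, and the example pairs are all assumptions. The `score` argument stands in for a customized scoring function such as one based on helix geometry.

```python
def max_nested_pairs(pairs, n, score=lambda i, j: 1.0):
    """Pick a maximum-score nested (pseudoknot-free) subset of base pairs.

    pairs: 1-based (i, j) tuples with i < j, each position in at most one pair.
    n:     sequence length.
    score: hypothetical per-pair scoring function (default: count pairs).
    """
    partner = [0] * (n + 2)
    for i, j in pairs:
        partner[i] = j  # only the 5' side records its partner

    # best[i][j] = best score using pairs inside positions i..j (0 if i >= j).
    best = [[0.0] * (n + 2) for _ in range(n + 2)]
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            value = best[i + 1][j]  # option 1: leave position i out
            k = partner[i]
            if i < k <= j:          # option 2: keep the pair (i, k)
                value = max(value,
                            score(i, k) + best[i + 1][k - 1] + best[k + 1][j])
            best[i][j] = value

    # Trace back through the table to recover the chosen pairs.
    kept, stack = [], [(1, n)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        k = partner[i]
        if i < k <= j and best[i][j] == score(i, k) + best[i + 1][k - 1] + best[k + 1][j]:
            kept.append((i, k))
            stack.append((i + 1, k - 1))
            stack.append((k + 1, j))
        else:
            stack.append((i + 1, j))
    return sorted(kept)


# Hypothetical input: a 3-pair helix crossing a 2-pair helix.
example = [(1, 10), (2, 9), (3, 8), (5, 14), (6, 13)]
nested = max_nested_pairs(example, 14)
leftover = sorted(set(example) - set(nested))
print(nested, leftover)

# A custom score can flip the choice, e.g. strongly favouring the second helix.
favoured = max_nested_pairs(example, 14,
                            score=lambda i, j: 5.0 if i in (5, 6) else 1.0)
print(favoured)
```

For labeling rather than removal, one conceivable extension is to write the kept pairs as '()' and rerun the same selection on the leftover pairs for '[]', then '{}', and so on; whether DSSR does anything like this is not stated in the thread.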

Best regards,

Xiang-Jun
« Last Edit: May 12, 2014, 12:32:47 am by xiangjun »

Offline xiangjun

Re: Higher-order pseudoknots in DP output
« Reply #4 on: June 17, 2014, 07:03:36 pm »
Hi JH,

Over the past few weeks, I've refined DSSR to better handle pseudo-knots in complicated RNA structures. While there may still be corner cases to be dealt with, DSSR can now derive proper .dbn with higher-order pseudo-knots in the cases I have tested. Just for completeness, as of v1.1.3-2014jun18, DSSR also outputs RNA secondary structure in .bpseq format, in addition to .ct and .dbn.

Have a try, and please let me know if you notice any issues.

Best regards,

Xiang-Jun

Offline Sylverlin

Re: Higher-order pseudoknots in DP output
« Reply #5 on: November 02, 2014, 10:21:39 am »
AFAIK, it's working all right. Thanks for the fix!

JH.

 

Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.