Hi,
First, many thanks for your great software suite. It is a real pleasure to work with such complete and convenient outputs for my parsing purposes. I am a strong believer in collaboration in methods development through connectable building blocks, and it is nice to find others who share this view in the RNA-modeling field :)
I am building specific structural fragment libraries for RNA docking, and I am trying to parse DSSR JSON outputs to automatically keep track of my fragments' characteristics (2D structure, interactions ...). For this, I need to know which nucleotides were discarded because of, e.g., weird geometry. As I see it, a "&" is inserted in the "bseq" entry for chain breaks. But I can't find any indication of whether a nucleotide was discarded or the break was already present in the input PDB file, and retrieving this by comparing bseq to the input sequence can be ambiguous.
I am considering using the "Summary of structural features of xx nucleotides" in the text output, by running DSSR with and without the "--json" option. Is there a DSSR option I could use to get this information directly from the JSON output and avoid double work (as I am analysing thousands of PDB files)?
Thanks in advance for your help,
Isaure C. de Beauchene
PS: I'm using version v1.7.2-2017nov20
-
Hi,
Thanks for a thoughtful post.
> I am a strong believer in collaboration in methods development through connectable building blocks, and it is nice to find others who share this view in the RNA-modeling field :)
I cannot agree with you more. I am glad to know of another one of like mind across the Atlantic.
> As I see it, a "&" is inserted in the "bseq" entry for chain breaks. But I can't find any indication of whether a nucleotide was discarded or the break was already present in the input PDB file, and retrieving this by comparing bseq to the input sequence can be ambiguous.
Yes, the meaning of the symbol "&" in "bseq" with the DBN output is ambiguous. It could be due to: (1) a switch of chains for "all_chains", as in a DNA duplex (e.g., 355d); (2) missing atomic coordinates of nucleotides within a DNA/RNA chain, as in some X-ray crystal structures due to local disorder (e.g., 2fk6); (3) abasic sites, which were not considered by default before DSSR 1.7.3-2017dec26; (4) highly distorted bases, as from some MD simulations, that fall outside the default cutoff.
Since you mention you're using v1.7.2-2017nov20, please update to the latest DSSR v1.7.4-2018jan30, which accounts for case (3) above.
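As an illustration only (not part of DSSR), cases (1) and (2) can be flagged directly from the input file before DSSR is run. The sketch below reads the fixed-column ATOM/HETATM records of a PDB file and reports chain switches and gaps in residue numbering. The function names `pdb_residues` and `input_breaks` are hypothetical, and a numbering gap is only a heuristic for missing coordinates, since some structures use non-consecutive numbering on purpose. It also assumes the file contains nucleic acid chains only; waters are skipped, but protein or ligand residues would be listed too.

```python
def pdb_residues(pdb_text):
    """Return residues in file order as (chain, resseq, icode, resname) tuples.

    Uses the standard PDB fixed columns: resName 18-20, chainID 22,
    resSeq 23-26, iCode 27 (1-based).
    """
    residues = []
    last_key = None
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        line = line.ljust(27)  # guard against short lines
        resname = line[17:20].strip()
        if resname == "HOH":  # skip waters
            continue
        key = (line[21], int(line[22:26]), line[26].strip())
        if key != last_key:  # a new residue starts here
            residues.append(key + (resname,))
            last_key = key
    return residues


def input_breaks(residues):
    """Flag breaks already present in the input (heuristic, see lead-in)."""
    breaks = []
    for prev, cur in zip(residues, residues[1:]):
        if prev[0] != cur[0]:
            breaks.append(("chain_switch", prev, cur))
        elif cur[1] - prev[1] > 1:
            breaks.append(("numbering_gap", prev, cur))
    return breaks
```

Breaks that show up in "bseq" but are not reported here would then be candidates for cases (3) and (4).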
> I am considering using the "Summary of structural features of xx nucleotides" in the text output, by running DSSR with and without the "--json" option. Is there a DSSR option I could use to get this information directly from the JSON output and avoid double work (as I am analysing thousands of PDB files)?
The output of "Summary of structural features of xx nucleotides" matches all the detected nucleotides. This information (plus more) is also available from JSON output, as shown below:
x3dna-dssr -i=1ehz.pdb --json | jq .nts
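For batch processing, the same "nts" list can be read in a script instead of through jq. The following is a minimal sketch, assuming that each entry of "nts" carries the keys "chain_name" and "nt_resnum" (please check them against the JSON produced by your DSSR version), and that the list of residues present in the input has been collected separately, e.g. by reading the PDB file. The function names are hypothetical, and insertion codes are not handled.

```python
import json


def retained_keys(dssr_json_text):
    """Return the set of (chain, residue number) pairs listed under "nts".

    The key names "chain_name" and "nt_resnum" are assumptions about the
    DSSR JSON layout; adjust them if your output differs.
    """
    data = json.loads(dssr_json_text)
    return {(nt["chain_name"], nt["nt_resnum"]) for nt in data.get("nts", [])}


def discarded_residues(input_keys, dssr_json_text):
    """Return input residues, in input order, that are absent from "nts"."""
    kept = retained_keys(dssr_json_text)
    return [key for key in input_keys if key not in kept]


# Typical use, with the JSON saved beforehand via shell redirection:
#   x3dna-dssr -i=1ehz.pdb --json > 1ehz.json
#   with open("1ehz.json") as handle:
#       missing = discarded_residues(my_input_keys, handle.read())
# where my_input_keys is a list such as [("A", 1), ("A", 2), ...].
```

Residues returned by `discarded_residues` were present in the input but not recognized as nucleotides, so a single DSSR run with "--json" is enough.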
HTH,
Xiang-Jun
-
Hi Xiang-Jun,
Thanks a lot for your detailed reply, and for reminding me of the "nts" entry of the json output. That was very useful.
Best,
Isaure
Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids
Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University