
Topic: Documentation of the JSON output fields?

persalteas
Documentation of the Json output fields ?
« on: March 03, 2020, 05:25:07 am »
Hello,

I noticed that DSSR's JSON output for a structure analysis gives a lot of information (about individual nucleotides in particular), and I do not (yet) know what all the fields mean.
I searched a bit for documentation, but the manual does not describe the fields.

The list of fields I am talking about:

Quote
index
index_chain
chain_name
nt_resnum
nt_name
nt_code
nt_id
dbn
summary (and there are several fields inside)
alpha, beta, gamma, delta, epsilon, zeta
epsilon-zeta
bb_type
chi
glyco_bond
C5prime_xyz
P_xyz
form
ssZp, Dp
splay_angle, splay_distance, splay_ratio
eta, theta, eta', theta', eta_base, theta_base
v0, v1, v2, v3, v4
amplitude
phase_angle
puckering
sugar_class
bin
cluster
suiteness
filter_rmsd
frame (several descriptors)

In particular, I am interested in the following points:
  • Why are we interested in the C5' XYZ position?
  • What are the "splay" descriptors?
  • What are the v0-v4 descriptors?
  • What is the difference between sugar_class and puckering?
  • What is the frame, and what is the RMSD value provided with it?

I am mostly trying to learn more. If such documentation does not exist (yet), I would welcome some bibliography references if you have any.

Thanks a lot for your time,

Louis
PhD Student @IBISC, Univ Evry, Université-Paris Saclay

xiangjun (Administrator)
Re: Documentation of the Json output fields ?
« Reply #1 on: March 03, 2020, 12:06:04 pm »
Hi Louis,

Thanks for your questions about the DSSR JSON output. As you noticed, DSSR derives (far) more features than those already documented in the 108-page User Manual. That is on purpose: DSSR contains many unpublished results, as well as a lot of experimental functionality. On the other hand, some features already documented in the DSSR User Manual may not have been read as I'd expect. Overall, I do want to ensure that what has been documented and published is accurate and reproducible.

For nucleic acid structures in general, see "Principles of Nucleic Acid Structure" by Wolfram Saenger. For 3DNA/DSSR, read the 2003 NAR paper on 3DNA, the 2015 NAR paper on DSSR, and the DSSR User Manual.

Best regards,

Xiang-Jun
 

persalteas
Re: Documentation of the Json output fields ?
« Reply #2 on: March 13, 2020, 04:14:26 am »
Hi xiangjun,

Quote
Overall, I do want to ensure that what has been documented and published is accurate and reproducible.

Okay, I can understand. However, I found answers in your blog posts, which may help some future reader:

Now that I understand a bit more, I think some of these fields will be relevant to me. I am building a dataset with as many descriptors per nucleotide as possible, to do some feature selection and machine-learning predictions.
So I would like to include all the sugar conformation descriptors, the glyco-bond and backbone conformations, and in general everything that applies to a single nucleotide.
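For future readers, here is a minimal sketch of how such a per-nucleotide table could be collected from the JSON file. It assumes that the output has a top-level "nts" list holding one object per nucleotide, and that the keys match the field names listed earlier in this thread; I have not checked either against the manual. The helper name nucleotide_rows and the toy input (with made-up values, not real DSSR output) are purely illustrative.

```python
import json

# Scalar per-nucleotide descriptors to keep. The names are taken from the
# field list earlier in this thread and are ASSUMED to match the JSON keys.
SCALAR_FIELDS = [
    "nt_id", "chain_name", "nt_code", "chi", "glyco_bond",
    "alpha", "beta", "gamma", "delta", "epsilon", "zeta",
    "v0", "v1", "v2", "v3", "v4",
    "amplitude", "phase_angle", "puckering", "sugar_class",
]


def nucleotide_rows(dssr, fields=SCALAR_FIELDS):
    """Return one flat dict per nucleotide; absent fields become None."""
    rows = []
    # ASSUMPTION: per-nucleotide records live under the top-level "nts" key.
    for nt in dssr.get("nts", []):
        rows.append({name: nt.get(name) for name in fields})
    return rows


# Toy input with invented values, only to show the expected shape.
SAMPLE = """
{"nts": [
  {"nt_id": "A.G1", "chi": -165.2, "puckering": "C3'-endo", "v2": 38.1},
  {"nt_id": "A.C2", "chi": null, "puckering": "C2'-endo"}
]}
"""

rows = nucleotide_rows(json.loads(SAMPLE),
                       fields=["nt_id", "chi", "puckering", "v2"])
for row in rows:
    print(row)
```

For a real analysis, json.loads(SAMPLE) would be replaced by json.load() on the file written by DSSR. Missing or null values are kept as None on purpose, so that the feature-selection step can decide how to handle them (terminal nucleotides, for instance, may lack some backbone torsions).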

Can you explicitly list which descriptors are certified as working as expected and documented, and which I should not use or publish in my future work?
I just want to avoid the situation where my work strongly depends on something experimental and not citable. I am not going to implement your ideas ;)

Thanks a lot,

Louis


xiangjun (Administrator)
Re: Documentation of the Json output fields ?
« Reply #3 on: March 13, 2020, 09:24:28 am »
Hi Louis,

I am glad that you read my blog posts and the DSSR User Manual. I am impressed that you do not want to reinvent the wheel but rely on DSSR for your research projects.

Quote
Can you explicitly list which descriptors are certified as working as expected and documented, and which I should not use or publish in my future work? I just want to avoid the situation where my work strongly depends on something experimental and not citable. I am not going to implement your ideas ;)

As you quoted from my previous response,

Quote
Overall, I do want to ensure that what has been documented and published is accurate and reproducible.

Please take what has been published and documented in the DSSR User Manual as certified by me. Note that some of the documented features, e.g., the section on "Splayed-apart conformations" (which are novel and unique to DSSR), have not been published.

Please do cite the 2015 NAR DSSR paper, even for those documented but unpublished features. I may publish more papers on specific features from DSSR, just as I did with the DSSR-Jmol integration. No matter what, the NAR'15 paper is the fundamental one to cite for any DSSR-related application.

Best regards,

Xiang-Jun
« Last Edit: March 13, 2020, 01:17:10 pm by xiangjun »

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University