
Topic: DSSR: Analyzing NMR structures - overwritten output files

Offline meier74

  • with-posts
  • *
  • Posts: 7
    • View Profile
DSSR: Analyzing NMR structures - overwritten output files
« on: February 08, 2017, 06:45:55 pm »
Dear all,

I am trying to analyze an NMR ensemble with 10 models using x3dna-dssr with the --nmr option. I am interested in the backbone torsion angles. Unfortunately the file dssr-torsions.txt gets overwritten as the program proceeds from model to model in the PDB file. After the run finishes, the file only contains the data for the last model (#10). Is it possible to configure the program in such a way that it records the data for all models?

I am using DSSR v1.6.5-2017jan22 on Gentoo Linux (64 bit)

Thank you for your help!
Markus

--
Markus Meier, Ph.D.
Research Associate
University of Manitoba
Department of Chemistry
144 Dysart Road
Winnipeg, MB, R3T 2N2, Canada
E-mail: markus.meier@umanitoba.ca

Offline xiangjun

  • Administrator
  • with-posts
  • *****
  • Posts: 1652
    • View Profile
    • 3DNA homepage
Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #1 on: February 08, 2017, 06:52:39 pm »
Hi Markus,

Glad to see you here!

For the analysis of an NMR ensemble (or MD trajectories), your best bet is to use the --json output format together with --nmr. If you're unfamiliar with JSON, it may take a little while to get started. However, it is well worth the effort, since the structured JSON output is far easier to parse than free text. For parsing JSON, you may give jq a try.
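
As a rough illustration of why the JSON route suits ensembles, the Python sketch below flattens a multi-model result into one row per nucleotide per model. The layout used here (a top-level "models" list whose entries carry a "model" number and a "parameters" object containing "nts"), the torsion key "alpha", and all values are assumptions made for illustration and are not taken from this thread; check them against your own output before relying on them.

```python
import json

# Hypothetical stand-in for the output of `x3dna-dssr --nmr --json`.
# The "models"/"parameters"/"nts" nesting, the "alpha" key, and the
# numbers are assumed for illustration only.
sample = """
{"models": [
  {"model": 1, "parameters": {"nts": [
    {"nt_id": "A.DG1", "alpha": null},
    {"nt_id": "A.DG2", "alpha": -60.5}]}},
  {"model": 2, "parameters": {"nts": [
    {"nt_id": "A.DG1", "alpha": null},
    {"nt_id": "A.DG2", "alpha": -72.1}]}}
]}
"""


def flatten_models(doc):
    """Return one dict per nucleotide per model, tagged with the model number."""
    rows = []
    for entry in doc.get("models", []):
        nts = entry.get("parameters", {}).get("nts", [])
        for nt in nts:
            row = dict(nt)                    # copy so the input is not modified
            row["model"] = entry.get("model")
            rows.append(row)
    return rows


rows = flatten_models(json.loads(sample))
for row in rows:
    print(row["model"], row["nt_id"], row["alpha"])
```

Because every row keeps its model number, nothing is overwritten from one model to the next, which is the problem with the per-run text files described above.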

Please let me know how it goes.

Best regards,

Xiang-Jun
« Last Edit: February 08, 2017, 07:33:23 pm by xiangjun »

Offline meier74

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #2 on: February 09, 2017, 11:35:58 am »
Hi  Xiang-Jun,

long time no see :-)

Thanks for the suggestion, I will give it a try. I am currently using R to make some figures; it has libraries for reading JSON.

Best regards,
Markus


Offline meier74

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #3 on: February 09, 2017, 06:27:49 pm »
Hi everyone,

I have attached a short R script that reads the JSON output from x3dna-dssr and puts it into a tidy table. Using dplyr's filter() function, you can select the parameters for the nucleotides you are interested in (filter by chain ID, nucleotide number, nucleotide name, etc.), similar to a PyMOL selection. Pretty cool!
The script requires the dplyr and tidyjson libraries, which can be obtained from CRAN.
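
The attached R script is not reproduced in the thread. For readers who prefer Python, the filtering step might be sketched roughly as follows. The helper select_nts and the sample records are hypothetical; only the field names (index, chain_name, nt_name, nt_id) come from the "nts" records quoted later in this thread.

```python
def select_nts(nts, **criteria):
    """Keep the records whose fields satisfy every criterion.

    A criterion is either a single value (tested with ==) or a
    set/list/tuple of allowed values (tested with `in`).
    """
    def matches(nt):
        for key, wanted in criteria.items():
            value = nt.get(key)
            if isinstance(wanted, (set, list, tuple)):
                if value not in wanted:
                    return False
            elif value != wanted:
                return False
        return True

    return [nt for nt in nts if matches(nt)]


# Made-up records that reuse field names from DSSR's "nts" list.
nts = [
    {"index": 1, "chain_name": "A", "nt_name": "DG", "nt_id": "A.DG1"},
    {"index": 2, "chain_name": "A", "nt_name": "DT", "nt_id": "A.DT2"},
    {"index": 3, "chain_name": "B", "nt_name": "DG", "nt_id": "B.DG3"},
]

chain_a_g = select_nts(nts, chain_name="A", nt_name="DG")
first_two = select_nts(nts, index={1, 2})
print([nt["nt_id"] for nt in chain_a_g])
print([nt["nt_id"] for nt in first_two])
```

This mirrors the idea of a dplyr filter() call on a tidy table: each keyword argument narrows the selection by one column.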

I hope it will help someone else.

Best regards,
Markus


Offline xiangjun

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #4 on: February 09, 2017, 06:50:54 pm »
Hi Markus,

Thank you so much for your contribution. I'm sure (at least some) DSSR users would find your R script helpful. Hopefully, others will follow your lead by providing more DSSR use cases with example scripts.

Best regards,

Xiang-Jun

Offline meier74

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #5 on: February 10, 2017, 03:07:15 pm »
Hi Xiang-Jun,

You are very welcome. I found a problem with the script. Since x3dna-dssr does not record the nucleotide/residue numbers in the JSON data structure, I used the index field in lieu of the residue numbers. Unfortunately, this gives incorrect results if the nucleotide numbers do not start from 1 or are not consecutive. I have attached a revised version of the script that extracts the residue number from the nt_id field using pattern matching. However, it might not work in all cases. The best solution would be to modify x3dna-dssr so that it also outputs the residue number in the nts record.
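
The pattern-matching idea can be sketched in Python as below. This is not the attached R script; the helper name resnum_from_nt_id and the regular expression are illustrative assumptions. The sketch only handles ids that end in a plain integer, such as "A.DG2", which is one reason an approach like this might not work in all cases.

```python
import re

# Trailing integer (optionally negative) at the end of a short nt_id
# such as "A.DG2". Assumption: nothing (e.g. an insertion code) follows
# the number; ids that do not end in digits yield None.
_TRAILING_INT = re.compile(r"(-?\d+)$")


def resnum_from_nt_id(nt_id):
    """Return the residue number encoded at the end of nt_id, or None."""
    match = _TRAILING_INT.search(nt_id)
    return int(match.group(1)) if match else None


# A record whose serial index differs from its residue number:
# using "index" here would report 1 instead of 2.
nt = {"index": 1, "chain_name": "A", "nt_name": "DG", "nt_id": "A.DG2"}
print(nt["index"], resnum_from_nt_id(nt["nt_id"]))
```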

Best regards,
Markus

Offline xiangjun

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #6 on: February 10, 2017, 03:19:14 pm »
Hi Markus,

Thanks for your follow-up. What specifically needs to be done on my side to make the process more straightforward? Note that the "nt_id" field can be changed via the --idstr option. For example, with --idstr=long, the field is cleanly delineated by its components (e.g., model#, chain id, residue number, etc.).

I'll do whatever is needed and sensible to revise DSSR for better integration with downstream applications. Please be specific, using concrete examples.

Best regards,

Xiang-Jun
« Last Edit: February 10, 2017, 03:21:18 pm by xiangjun »

Offline meier74

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #7 on: February 10, 2017, 03:52:54 pm »
Hi Xiang-Jun,

A new "nt_resi" field with the residue number as below would help:

  "nts": [
    {
      "index": 1,
      "index_chain": 1,
      "chain_name": "A",
      "nt_resi": 2,
      "nt_name": "DG",
      "nt_code": "G",
      "nt_id": "A.DG2",
...
  }]

This would allow all the essential data to be extracted with the same block of code. I did not know about the --idstr option, which changes the nt_id field to ".A.DG.2.". That format makes pattern matching a safe bet and works as well, but needs a little extra code. So, if you think it is not worth the effort to add an "nt_resi" field, do not worry about it.

Best regards,
Markus



Offline xiangjun

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #8 on: February 10, 2017, 05:27:04 pm »
Hi Markus,

I see your point. However, since all the info is already there with --idstr=long, all one needs to do is split the id-string. A similar scheme has been used to integrate DSSR into Jmol (http://jmol.x3dna.org), which corresponds to --idstr=unit-it (see http://www.bgsu.edu/research/rna/help/rna-3d-hub-help/unit-ids.html).

See the DSSR User Manual for more details on the --idstr option.
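
Splitting the long id-string might look like the Python sketch below. It assumes five dot-separated fields in the order model, chain id, residue name, residue number, insertion code; that order is inferred from the ".A.DG.2." example and the description of --idstr=long in this thread, so please confirm it against the User Manual. The names LongId and parse_long_id are hypothetical.

```python
from typing import NamedTuple, Optional


class LongId(NamedTuple):
    model: str             # empty string when no model number is given
    chain: str
    name: str
    resnum: Optional[int]  # None when the field is empty
    icode: str             # empty string when there is no insertion code


def parse_long_id(nt_id: str) -> LongId:
    """Split a long-form nt_id such as '.A.DG.2.' into its five fields."""
    parts = nt_id.split(".")
    if len(parts) != 5:
        raise ValueError(
            f"expected 5 dot-separated fields, got {len(parts)}: {nt_id!r}"
        )
    model, chain, name, resnum, icode = parts
    return LongId(model, chain, name, int(resnum) if resnum else None, icode)


print(parse_long_id(".A.DG.2."))
```

Unlike matching a trailing number in the short form, fixed field positions leave no ambiguity when a residue name itself contains digits.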

Hope this makes sense.

Xiang-Jun
« Last Edit: February 10, 2017, 08:52:09 pm by xiangjun »

Offline xiangjun

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #9 on: February 13, 2017, 03:19:14 pm »
Hi Markus,

Following my previous response, I've thought about your request more carefully. It turns out that the fields you suggested (quoted below) are already there, except for one item: the residue number ("nt_resi" as you called it).

Code: [Select]
"nts": [
    {
      "index": 1,
      "index_chain": 1,
      "chain_name": "A",
      "nt_resi": 2,
      "nt_name": "DG",
      "nt_code": "G",
      "nt_id": "A.DG2",
...
  }]

So I've added this entry (termed "nt_resnum"), as shown below using 355d as an example.

Code: [Select]
# generated by the following command:
# x3dna-dssr -i=355d.pdb --json | jq '.nts[1]'
{
  "index": 2,
  "index_chain": 2,
  "chain_name": "A",
  "nt_resnum": 2,
  "nt_name": "DG",
  "nt_code": "G",
  "nt_id": "A.DG2",
...
}

This is where the JSON output shines: adding this new item does not break any existing DSSR parsers.
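
To illustrate that point, here is a minimal Python sketch of a reader that works with both the older and the newer output. It uses the fields of the 355d record shown above; the helper residue_number and its fallback to the trailing integer of nt_id are assumptions for illustration, not part of DSSR.

```python
import json
import re

# The record shown above, with the elided fields left out.
record = json.loads("""
{"index": 2, "index_chain": 2, "chain_name": "A", "nt_resnum": 2,
 "nt_name": "DG", "nt_code": "G", "nt_id": "A.DG2"}
""")


def residue_number(nt):
    """Prefer nt_resnum; otherwise fall back to the trailing integer of nt_id."""
    if "nt_resnum" in nt:
        return nt["nt_resnum"]
    match = re.search(r"(-?\d+)$", nt.get("nt_id", ""))
    return int(match.group(1)) if match else None


# Simulate output from a build that predates the new field.
old_style = {key: value for key, value in record.items() if key != "nt_resnum"}

print(residue_number(record), residue_number(old_style))
```

Code that looks up only the keys it knows simply ignores the extra entry, so scripts written before the change keep working unchanged.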

The DSSR distributions on the download page have been updated, even though the version string is still v1.6.5-2017jan22. You may want to revise your R script to take advantage of this new feature.

Many thanks and best regards,

Xiang-Jun
« Last Edit: February 18, 2017, 12:45:08 am by xiangjun »

Offline meier74

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #10 on: March 01, 2017, 12:07:13 pm »
Hi Xiang-Jun,

many thanks again for introducing the "nt_resnum" field into the "nts" record of the JSON output! I have attached version 3 of the example script that takes advantage of the new feature. I will also post a fully working example that produces a plot of the backbone torsion angles soon, as promised.

Best regards,
Markus

Offline xiangjun

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #11 on: March 01, 2017, 12:12:59 pm »
Hi Markus,

It's so kind of you! Yes, it would be great to provide a worked example that shows unambiguously how your R script can be used. This way, users can benefit better from your work. Better yet, your contribution may well inspire others to write similar scripts in Python, Ruby, Perl etc.

Best regards,

Xiang-Jun
« Last Edit: March 01, 2017, 10:12:45 pm by xiangjun »

Offline meier74

Re: DSSR: Analyzing NMR structures - overwritten output files
« Reply #12 on: March 02, 2017, 03:57:49 pm »
Dear all,

Here is a working example of how to read the JSON output from x3dna-dssr in an R script and produce a backbone torsion angle plot for the NMR ensemble with PDB ID 2M4P (a DNA G-quadruplex). I have attached a compressed tar archive (for those unfamiliar with this format, use 7-Zip, http://www.7-zip.org, to extract it). Please refer to INSTRUCTIONS.txt for how to use it. The script is made available under the MIT license.

Best regards,
Markus

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University