
Messages - chemikeris

1
Thank you very much for the explanations. In the future I will check not only the manual but also the forum for solutions.

And yes, the suggested options solved my problems. Thank you very much for the fast answer.

Regards,
Justas

2
Hello.
I see that using "--auxfile=no" does not create any auxiliary files. When using "--parallel", the x3dna-dssr.log file is created but not deleted. Probably it works as the authors intended :-)

Maybe the real problem is that this behavior and these command-line options are not described in the software manual? The options were already available in the DSSR version I had on my computer, so no update was necessary, but I did not find any information about them in the PDF manual.

Regards,
Justas

3
I also have a question related to the auxiliary files created by DSSR.

In the current version (I use v1.7.4-2018jan30), creation of these files can be controlled with the '--prefix' option, which is useful if you are interested only in the main output file. However, we have noticed that DSSR also creates a file named 'x3dna-dssr.log' in the directory from which it is run. This causes problems when running multiple instances of DSSR in parallel (for example, on a high-performance cluster), because some processes print an error message like this:

Code:
remove_file failed: x3dna-dssr.log

I do not know exactly what is happening, but probably another DSSR process has already removed the file by the time this message appears?

For now I have solved this problem by changing the working directory before calling DSSR in my script, but in general the problem persists, so I thought you should know about it :-) Please contact me if you need any additional information.
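
The working-directory workaround described above could look roughly like the Python sketch below. It is only an illustration under stated assumptions: `run_isolated` and `dssr_json` are hypothetical helper names, not part of DSSR, and the sketch assumes that `x3dna-dssr` is on the PATH and prints JSON to standard output when called with `-i=` and `--json`, as in the commands quoted elsewhere on this page.

```python
import subprocess
import tempfile
from pathlib import Path


def run_isolated(command):
    """Run `command` in a fresh temporary working directory and return its stdout.

    Any file the program writes to its current directory (for example
    x3dna-dssr.log) ends up in a directory that no other process shares,
    and the directory is deleted once the command has finished.
    """
    with tempfile.TemporaryDirectory(prefix="dssr-") as workdir:
        result = subprocess.run(
            command,
            cwd=workdir,          # the child process starts in the private directory
            capture_output=True,  # collect stdout/stderr instead of writing files
            text=True,
            check=True,           # raise CalledProcessError on a non-zero exit code
        )
    return result.stdout


def dssr_json(structure_file, executable="x3dna-dssr"):
    """Hypothetical wrapper: return the DSSR JSON output for one structure file."""
    # The input path must be absolute because the working directory changes.
    path = Path(structure_file).resolve()
    return run_isolated([executable, f"-i={path}", "--json"])
```

Because each call gets its own directory, parallel workers never see each other's log file, which avoids the suspected conflict without relying on any DSSR option.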

Regards,
Justas

4
Sorry for the delayed reply, but I have checked the software on ~5500 PDB entries and found that the fixed version of DSSR works correctly.

Thank you very much for the fast update.

5
I am trying to use DSSR for large-scale analysis of nucleic acid structures taken from the Biological Assemblies in the Protein Data Bank. For this I use the '--symmetry' option, which I find very useful.

Unfortunately, I noticed an inconsistency in the naming of chains in the DSSR JSON output, which causes trouble when parsing the DSSR results.

When chains in the Biological Assembly file come from different asymmetric units of the crystal structure, their names usually include the MODEL number and the chain name from the PDB file. For PDB entry 4ZSF, we see two chains named 'B', one from MODEL 1 and the other from MODEL 2:

Code:
[justas@catfish tmp]$ x3dna-dssr -i=4zsf.pdb1 --symm --json | jq .chains
{
  "m1_chain_B": {
    "num_nts": 14,
    "bseq": "CTCGACCGGTCGAG",
    "sstr": "((((((((((((((",
    "form": "ABBBB...BBBB.-",
    "helical_rise": 3.489,
    "helical_rise_std": 0.789,
    "helical_axis": [
      0.828,
      0.076,
      0.555
    ],
    "point1": [
      -18.24,
      -31.778,
      5.458
    ],
    "point2": [
      19.36,
      -28.329,
      30.661
    ],
    "num_chars": 40,
    "suite": "C1bT!!C4bG!!A!!C1bC!!G4bG!!T!!C!!G!!A!!G"
  },
  "m2_chain_B": {
    "num_nts": 14,
    "bseq": "CTCGACCGGTCGAG",
    "sstr": "))))))))))))))",
    "form": "ABBBB...BBBB.-",
    "helical_rise": 3.489,
    "helical_rise_std": 0.789,
    "helical_axis": [
      -0.828,
      0.076,
      -0.555
    ],
    "point1": [
      18.24,
      -31.778,
      31.283
    ],
    "point2": [
      -19.36,
      -28.329,
      6.08
    ],
    "num_chars": 40,
    "suite": "C1bT!!C4bG!!A!!C1bC!!G4bG!!T!!C!!G!!A!!G"
  }
}

However, when there are chains from two asymmetric units (MODEL 1 and MODEL 2 in the input file) but their names are different, we see no model numbers in the chains section of the output.
For example, in PDB entry 4ILM, Biological Assembly 2, we see only chains E and I:

Code:
[justas@catfish tmp]$ x3dna-dssr -i=4ilm.pdb2 --symm --json | jq .chains
{
  "chain_E": {
    "num_nts": 16,
    "bseq": "GCUAAUCUACUAUAGA",
    "sstr": "......((.....)).",
    "form": "A.....A......AA-",
    "helical_rise": 0.115,
    "helical_rise_std": 3.392,
    "helical_axis": [
      -0.734,
      -0.496,
      -0.464
    ],
    "point1": [
      64.548,
      -24.513,
      89.342
    ],
    "point2": [
      62.86,
      -25.653,
      88.275
    ],
    "num_chars": 46,
    "suite": "G!!C!!U!!A!!A4bU4nC1aU!!A!!C4pU2[A6pU!!A1aG1aA"
  },
  "chain_I": {
    "num_nts": 16,
    "bseq": "GCUAAUCUACUAUAGA",
    "sstr": "......((.....)).",
    "form": "A....BA......A.-",
    "helical_rise": 0.305,
    "helical_rise_std": 3.547,
    "helical_axis": [
      0.584,
      0.616,
      0.528
    ],
    "point1": [
      79.844,
      -52.783,
      70.422
    ],
    "point2": [
      83.132,
      -49.316,
      73.395
    ],
    "num_chars": 46,
    "suite": "G!!C!!U!!A!!A!!U4nC1aU!!A!!C4pU2[A6pU2aA1aG!!A"
  }
}


When analyzing the results in more detail (pairs, helices, multiplets, etc.), we see that chain E comes from MODEL 1 in the Biological Assembly file, and chain I is from MODEL 2:

Code:
[justas@catfish tmp]$ x3dna-dssr -i=4ilm.pdb2 --symm --json | jq .pairs
[
  {
    "index": 1,
    "nt1": "1:E.C7",
    "nt2": "1:E.G15",
    "bp": "C-G",
    "name": "WC",
    "Saenger": "19-XIX",
    "LW": "cWW",
    "DSSR": "cW-W"
  },
  {
    "index": 2,
    "nt1": "1:E.U8",
    "nt2": "1:E.A14",
    "bp": "U-A",
    "name": "WC",
    "Saenger": "20-XX",
    "LW": "cWW",
    "DSSR": "cW-W"
  },
  {
    "index": 3,
    "nt1": "2:I.U6",
    "nt2": "2:I.A16",
    "bp": "U+A",
    "name": "--",
    "Saenger": "n/a",
    "LW": "cWH",
    "DSSR": "cW+M"
  },
  {
    "index": 4,
    "nt1": "2:I.C7",
    "nt2": "2:I.G15",
    "bp": "C-G",
    "name": "WC",
    "Saenger": "19-XIX",
    "LW": "cWW",
    "DSSR": "cW-W"
  },
  {
    "index": 5,
    "nt1": "2:I.U8",
    "nt2": "2:I.A14",
    "bp": "U-A",
    "name": "WC",
    "Saenger": "20-XX",
    "LW": "cWW",
    "DSSR": "cW-W"
  }
]

This inconsistency causes trouble when parsing multiple DSSR output files generated for PDB Biological Assemblies. I wonder whether the model number for the PDB chain could be included everywhere in the DSSR output when the '--symmetry' option is used?
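
To illustrate the parsing problem, here is a minimal Python sketch of how the two naming styles could be reconciled on the parsing side. It assumes only the formats visible in the output above: chain keys such as "m1_chain_B" or "chain_E", and nucleotide identifiers such as "1:E.C7" (model, chain, nucleotide). The function names are hypothetical, and other identifier forms that DSSR may produce are not handled.

```python
import re

# Patterns inferred from the example output only (an assumption, not a DSSR spec).
CHAIN_KEY = re.compile(r"^(?:m(?P<model>\d+)_)?chain_(?P<chain>.+)$")
NT_ID = re.compile(r"^(?:(?P<model>\d+):)?(?P<chain>[^.]+)\.(?P<nt>.+)$")


def parse_chain_key(key):
    """Split 'm1_chain_B' into (1, 'B') and 'chain_E' into (None, 'E')."""
    match = CHAIN_KEY.match(key)
    if match is None:
        raise ValueError(f"unrecognized chain key: {key!r}")
    model = match.group("model")
    return (int(model) if model else None, match.group("chain"))


def parse_nt_id(nt_id):
    """Split '1:E.C7' into (1, 'E', 'C7'); the model prefix is optional."""
    match = NT_ID.match(nt_id)
    if match is None:
        raise ValueError(f"unrecognized nucleotide id: {nt_id!r}")
    model = match.group("model")
    return (int(model) if model else None, match.group("chain"), match.group("nt"))


def resolve_chains(chain_keys, pairs):
    """Return (model, chain) tuples, filling in missing models from the pairs.

    A model is filled in only when the chain appears with exactly one model
    number among the nt1/nt2 identifiers; otherwise it stays None.
    """
    models_seen = {}
    for pair in pairs:
        for field in ("nt1", "nt2"):
            model, chain, _ = parse_nt_id(pair[field])
            if model is not None:
                models_seen.setdefault(chain, set()).add(model)

    resolved = []
    for key in chain_keys:
        model, chain = parse_chain_key(key)
        candidates = models_seen.get(chain, set())
        if model is None and len(candidates) == 1:
            model = next(iter(candidates))
        resolved.append((model, chain))
    return resolved
```

With the 4ILM data shown above, the keys "chain_E" and "chain_I" would resolve to models 1 and 2 through the pairs section. A chain that takes part in no pair cannot be resolved this way, which is one reason why consistent naming in the DSSR output itself would be preferable.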

Thank you very much in advance for your feedback.


Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University