
Topic: Is there any tool to convert xyz file to pdb file that can be accepted by 3dna?

elixer
Hi Xiangjun,
I'm trying to convert an xyz file to a pdb file so that I can use it in 3DNA (I got the xyz file from another program, which cannot write pdb files directly). I have tried VESTA and OpenBabel, but they failed to create a pdb file that 3DNA can use (although VESTA itself works normally with these files). After a quick search I found similar problems posted before on the forum (like Topic: PDB conversion on: May 29, 2017, 06:49:35 am), but I have not found a quick solution to my problem. So I'm here to ask whether there is any tool that can do the job.
My xyz file is attached below.
Best regards,
Erik

xiangjun (Administrator)
Hi Erik,

Thanks for using 3DNA and for posting your questions on the Forum.

There is no quick solution to your problem. 3DNA relies on proper naming of atoms in DNA and RNA as a prerequisite. For example, the names of the sugar atoms all end with a prime (e.g., C1'). With only the atomic symbols C, O, N, P, and H, as in the .xyz file you attached, it is practically impossible to know whether an atom belongs to the base or the sugar moiety. The wwPDB is there to ensure that atom names comply with the standard nomenclature of biological macromolecular structures.

If VESTA or OpenBabel cannot do the job for you automatically, you will have to come up with your own solution. Sorry for not being able to give you a more encouraging answer.

Best regards,

Xiang-Jun
« Last Edit: April 04, 2019, 12:13:48 am by xiangjun »

elixer
Hello Xiangjun,
Thanks for replying. I think my situation is not that bad. You will find that software like OpenBabel is more powerful than you might think. OpenBabel can actually identify an atom's pdb name from an xyz file (although I don't know how it is done; pretty cool, I think). I have attached the pdb file that OpenBabel generated from my xyz file DNA_TypeA_12bp_1.xyz. It labels the C atoms as C5*, C4*, C8, etc., which probably mean C5', C4', C8.
So I understand that I need to change symbols like C5* to C5' and make some other format changes. But do I need to add records like HEADER, TITLE, COMPND, SOURCE, etc. to the converted pdb file? I want to use 3DNA to calculate some geometric parameters of the DNA segment, such as diameter, rise, and twist, so I think records like "HEADER" may not be necessary. Can I leave these parts empty, or do I have to fake them to make the pdb file acceptable to 3DNA? Is there anything besides the "ATOM" records that 3DNA will actually use in my case?
Best,
Erik
« Last Edit: April 04, 2019, 02:23:47 am by elixer »

xiangjun (Administrator)
Hi Erik,

Thanks for your follow-up. 3DNA needs only the ATOM/HETATM coordinate records for its analysis and is generally tolerant of some variants of the PDB format (e.g., C1* instead of C1'). As is often the case, please just try 3DNA (or other tools) and see what happens.
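
As a minimal sketch of what "only the ATOM/HETATM records" means in practice, everything else in a PDB file could be filtered out before analysis. The function name `keep_coords` and the REMARK line in the toy input are made up for illustration; the two atom lines are copied from this thread.

```shell
# Keep only the coordinate records of a PDB stream (reads stdin, writes stdout).
keep_coords() {
  grep -E '^(ATOM  |HETATM)'
}

# Toy input: one made-up REMARK line plus two atom lines quoted in this thread.
toy_pdb=$(cat <<'EOF'
REMARK made-up header line, for illustration only
ATOM    313  C5*   A A   2      23.026  10.171  18.027  1.00  0.00           C
ATOM    767  P     A A   2      21.033   8.490  18.045  1.00  0.00           P
END
EOF
)

# Prints the two ATOM lines; the REMARK and END lines are dropped.
printf '%s\n' "$toy_pdb" | keep_coords
```

The atom names are left untouched, since names such as C5* are reported above to be tolerated.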

Glad to hear that OpenBabel can be of more help in your case. The PDB file you attached, however, is still not in good condition. As an example, the first residue (designated as A) is listed as follows, with ONLY hydrogen atoms.

Code: [Select]
ATOM      1  H5*   A A   1      18.493   7.546  16.553  1.00  0.00           H
ATOM      2 H5*1   A A   1      18.971   5.507  14.623  1.00  0.00           H
ATOM      3 H5*2   A A   1      19.700   5.674  16.250  1.00  0.00           H
ATOM      4  H4*   A A   1      21.435   6.259  14.778  1.00  0.00           H
ATOM      5  H3*   A A   1      20.014   8.769  15.785  1.00  0.00           H
ATOM      6 H2*1   A A   1      21.745  10.136  14.663  1.00  0.00           H
ATOM      7 H2*2   A A   1      22.696   8.651  14.285  1.00  0.00           H
ATOM      8  H1*   A A   1      21.349   8.823  12.282  1.00  0.00           H
ATOM      9  H2    A A   1      21.090  13.832  12.321  1.00  0.00           H
ATOM     10  H8    A A   1      18.005   8.213  13.369  1.00  0.00           H
ATOM     11  H61   A A   1      16.614  14.046  11.537  1.00  0.00           H
ATOM     12  H62   A A   1      16.008  12.397  11.886  1.00  0.00           H

You need further work on this file before 3DNA can handle it.

Best regards,

Xiang-Jun


xiangjun (Administrator)
Hi Erik,

I dug a bit deeper into your attached PDB file and found that it has a strange arrangement of the nucleotides. Instead of putting all atoms (base, sugar, and phosphate) of a nucleotide together, as is the case for a standard PDB entry (e.g., 355d), your PDB file has the atoms arranged by element: H, C, N, O, and P. An example is shown below, for A2 on chain A. As you can see clearly, it is separated into 5 segments.

The issue probably originates from the .xyz file, which has its atoms ordered that way. How was your .xyz file generated? It is certainly the first time I have seen such a weird case for DNA structures.
ATOM     13  H2P   A A   2      21.335   7.601  19.955  1.00  0.00           H
ATOM     14 H5*1   A A   2      23.439   9.452  17.299  1.00  0.00           H
ATOM     15 H5*2   A A   2      23.472   9.989  19.020  1.00  0.00           H
ATOM     16  H4*   A A   2      24.470  11.663  17.626  1.00  0.00           H
ATOM     17  H3*   A A   2      21.726  12.489  18.712  1.00  0.00           H
ATOM     18 H2*1   A A   2      22.286  14.772  17.743  1.00  0.00           H
ATOM     19 H2*2   A A   2      23.960  14.206  17.436  1.00  0.00           H
ATOM     20  H1*   A A   2      22.953  13.709  15.302  1.00  0.00           H
ATOM     21  H2    A A   2      20.156  17.569  14.828  1.00  0.00           H
ATOM     22  H8    A A   2      20.392  11.254  16.304  1.00  0.00           H
ATOM     23  H61   A A   2      16.205  15.439  14.199  1.00  0.00           H
ATOM     24  H62   A A   2      16.535  13.725  14.562  1.00  0.00           H
ATOM    313  C5*   A A   2      23.026  10.171  18.027  1.00  0.00           C
ATOM    314  C4*   A A   2      23.361  11.581  17.596  1.00  0.00           C
ATOM    315  C3*   A A   2      22.779  12.705  18.457  1.00  0.00           C
ATOM    316  C2*   A A   2      22.902  13.896  17.502  1.00  0.00           C
ATOM    317  C1*   A A   2      22.479  13.239  16.178  1.00  0.00           C
ATOM    318  C4    A A   2      20.345  14.424  15.560  1.00  0.00           C
ATOM    319  C2    A A   2      19.876  16.522  14.986  1.00  0.00           C
ATOM    320  C6    A A   2      18.108  14.968  14.888  1.00  0.00           C
ATOM    321  C5    A A   2      19.025  13.992  15.353  1.00  0.00           C
ATOM    322  C8    A A   2      20.101  12.255  15.998  1.00  0.00           C

ATOM    548  N9    A A   2      21.031  13.291  15.971  1.00  0.00           N
ATOM    549  N3    A A   2      20.841  15.678  15.407  1.00  0.00           N
ATOM    550  N1    A A   2      18.575  16.241  14.721  1.00  0.00           N
ATOM    551  N7    A A   2      18.882  12.639  15.633  1.00  0.00           N
ATOM    552  N6    A A   2      16.809  14.704  14.628  1.00  0.00           N
ATOM    630  O1P   A A   2      19.542   8.422  18.155  1.00  0.00           O
ATOM    631  O2P   A A   2      21.886   7.642  19.130  1.00  0.00           O
ATOM    632  O5*   A A   2      21.577  10.009  18.124  1.00  0.00           O
ATOM    633  O4*   A A   2      22.891  11.854  16.254  1.00  0.00           O
ATOM    634  O3*   A A   2      23.577  12.788  19.672  1.00  0.00           O

ATOM    767  P     A A   2      21.033   8.490  18.045  1.00  0.00           P

Knowing the issue, one can work around this special case as shown below. Note that your attached PDB file is named DNA_TypeA_12bp_1.pdb. The resulting file, duplex.pdb, can be handled by 3DNA.

Code: [Select]
grep ' A A ' DNA_TypeA_12bp_1.pdb | sort -k 6n > chainA.pdb
grep ' T B ' DNA_TypeA_12bp_1.pdb | sort -k 6n > chainB.pdb
cat chainA.pdb chainB.pdb > duplex.pdb
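
For readers wondering what the three commands above do: `grep ' A A '` keeps the lines for residue A on chain A (and `' T B '` those for T on chain B), and `sort -k 6n` orders them numerically by the sixth whitespace-separated field, which is the residue number. That relies on each chain containing a single base type, which appears to be the case for this file, and on the fields being separated by spaces. A more general variant might look like the sketch below, which assumes standard fixed-width PDB columns (chain ID in column 22, residue number in columns 23-26) and keeps the original atom order within each residue. The function name `regroup_by_residue` is made up, and the toy input consists of lines copied from this thread and deliberately interleaved.

```shell
# Regroup the atoms of a PDB stream by chain and residue number.
# Each line is prefixed with sort keys, sorted, and then stripped of the keys.
regroup_by_residue() {
  awk '/^(ATOM  |HETATM)/ {
         # Keys: chain ID, residue number, original line number (keeps atom order).
         printf "%s\t%d\t%d\t%s\n", substr($0, 22, 1), substr($0, 23, 4), NR, $0
       }' |
    sort -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n |
    cut -f4-
}

# Toy input: residues A1 and A2 interleaved (artificial ordering).
toy_pdb=$(cat <<'EOF'
ATOM     13  H2P   A A   2      21.335   7.601  19.955  1.00  0.00           H
ATOM      1  H5*   A A   1      18.493   7.546  16.553  1.00  0.00           H
ATOM    313  C5*   A A   2      23.026  10.171  18.027  1.00  0.00           C
ATOM      2 H5*1   A A   1      18.971   5.507  14.623  1.00  0.00           H
EOF
)

# Prints the two A1 lines (serials 1, 2) followed by the two A2 lines (13, 313).
printf '%s\n' "$toy_pdb" | regroup_by_residue
```

Unlike the grep-based version, this does not depend on the residue names, but it does assume that no line contains a tab character.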

Please have a try and report back how it goes.

Xiang-Jun
« Last Edit: April 05, 2019, 01:32:22 pm by xiangjun »

elixer
Hello Xiangjun,
Sorry for replying late.
YES! You are right! It works! I had been unable to figure it out by myself for a long time and had almost given up.
I REALLY should have checked the forum earlier! You are a lifesaver, THANK YOU!
And to answer your question, my .xyz file was generated from a .cif file by VESTA. I have attached the .cif file below.

Best,
Erik
« Last Edit: April 10, 2019, 04:26:46 pm by elixer »

xiangjun (Administrator)
Hi Erik,

Thanks for your confirmation that the proposed method works.

Quote
And to answer your question, my .xyz file was generated from a .cif file by VESTA. I have attached the .cif file below.

The so-called CIF file is nothing but a simple list of atomic symbols and fractional xyz coordinates. As shown below, it is just like the .xyz file you attached at the beginning of the thread. If you're interested in generating a starting DNA or RNA structure of generic sequence, you may find 3DNA itself helpful. Try the new "Web 3DNA 2.0 for the analysis, visualization, and modeling of 3D nucleic acid structures" at http://web.x3dna.org, especially the "Fiber" module.

Best regards,

Xiang-Jun


_data_test

audit_creation_method generated by ABACUS

_cell_length_a 31.7506
_cell_length_b 31.7506
_cell_length_c 84.6683
_cell_angle_alpha 90
_cell_angle_beta 90
_cell_angle_gamma 90

_symmetry_space_group_name_H-M
_symmetry_Int_Tables_number

loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z

H 0.58245 0.237665 0.195505
H 0.597501 0.173458 0.172704
H 0.620449 0.178712 0.191924
H 0.675105 0.197142 0.174543
H 0.630339 0.276193 0.186432
H 0.684881 0.319228 0.173178
H 0.714824 0.272464 0.168714
H 0.672394 0.277871 0.145056
H 0.664234 0.435631 0.145526
H 0.567061 0.258673 0.157903
......
P 0.25159 0.656626 0.15821
P 0.395947 0.720755 0.126839
P 0.551859 0.703742 0.0949896
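
The coordinates in this file are fractional, so they have to be scaled by the cell lengths to obtain the Cartesian values that appear in the .xyz and .pdb files. Because the header above lists all three cell angles as 90, the conversion reduces to x = fract_x * a, y = fract_y * b, z = fract_z * c. The sketch below illustrates only this orthogonal case; a non-orthogonal cell would need the full transformation matrix. The function name `frac_to_cart` and the way atom lines are recognized (four fields, the last three numeric) are assumptions made for illustration.

```shell
# Convert "symbol fract_x fract_y fract_z" lines to Cartesian coordinates.
# Arguments: cell lengths a, b, c in angstroms. Valid only when all cell angles are 90.
frac_to_cart() {
  awk -v a="$1" -v b="$2" -v c="$3" '
    function isnum(s) { return s ~ /^[-+]?[0-9]*\.?[0-9]+$/ }
    # Atom lines have four fields, the last three numeric; other lines are skipped.
    NF == 4 && isnum($2) && isnum($3) && isnum($4) {
      printf "%-2s %8.3f %8.3f %8.3f\n", $1, $2 * a, $3 * b, $4 * c
    }
  '
}

# Toy input: a few lines copied from the file quoted above.
toy_cif=$(cat <<'EOF'
_cell_length_a 31.7506
audit_creation_method generated by ABACUS
H 0.58245 0.237665 0.195505
P 0.551859 0.703742 0.0949896
EOF
)

printf '%s\n' "$toy_cif" | frac_to_cart 31.7506 31.7506 84.6683
```

For the first H atom this gives 18.493, 7.546, 16.553, which matches the coordinates of ATOM 1 in the PDB excerpt earlier in the thread.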
« Last Edit: April 10, 2019, 06:01:45 pm by xiangjun »

elixer
Hi Xiangjun,
I have noticed that 3DNA can now generate a starting DNA structure. But because I didn't know this when I was generating my DNA structure, I chose another online tool to generate it and used that structure to run a DFT calculation (with another program called ABACUS), which gave me the .cif file. Because of the DFT calculation, instead of generating a .pdb file with 3DNA, I had to find a way to convert this .cif file to a .pdb file. And because I thought the .xyz format was easy to understand and edit, I chose to convert the .cif file to an .xyz file first, and then convert the .xyz file to a .pdb file.
This is why I posted this topic, and I hope it helps others who have similar problems.

Best,
Erik
« Last Edit: April 11, 2019, 12:03:03 am by elixer »

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University