
Topic: How to get structural parameters from find_pair?

heldenbr
How to get structural parameters from find_pair?
« on: March 23, 2012, 10:19:28 am »
Hello Dr. Lu and other users,

I am a very new user of 3DNA.  I have done some simulations of RNA and have found some examples of interesting base pairs in them.  Now I would like to search the PDB for other examples of these types of base pairs.  3DNA looks like it could be a very useful tool to accomplish this.

Like the user in this post: http://forum.x3dna.org/general-discussions/analyzing-a-pdb-containing-isolated-bases/msg35/#msg35 I have used Gaussian to optimize the gas phase structure of my base pair.  I have replaced everything beyond the glycosidic bond with a methyl group.  I have tried both Gaussview and OpenBabel to get my Gaussian output file into pdb format.  Some information is missing since Gaussian does not know/care about residue type, etc.  I have tried to add some of this information/reformat by hand, but I still worry that I do not have a correctly formatted file.

When I run find_pair with the options -pz on my pdb file, it does not find any pairs.  I suspect this is a formatting issue, although I do not hear any complaints from find_pair about missing fields (as I did when trying to run find_pair on the pdb file produced by Gaussview).

So I guess I have two questions:

1. Is the attached file properly formatted for use with 3DNA?

2. If find_pair does not find a base pair, will it still output the base pair geometry parameters that were calculated?  This would be very useful to me, because I believe that some of the base pairs I would like to search for may be outside of the default parameters.  This would tell me automatically how I should change 3DNA's base pair parameters to search the pdb for other examples.

Thanks!
-Hugh Heldenbrand
Graduate Student, U of MN

xiangjun (Administrator)
Re: How to get structural parameters from find_pair?
« Reply #1 on: March 23, 2012, 12:50:28 pm »
Thanks for using 3DNA and for your detailed post -- your attached PDB file helped uncover where the problem is.

Quote
1. Is the attached file properly formatted for use with 3DNA?
No, it is not. Specifically, the atom names do not conform to the PDB convention. Using one of the U residues as an example, see the following two images:
[Images: Gaussian-Babel PDB (left) | Standard PDB (right)]
On the left is the U from the Gaussian-Babel-generated PDB file, and on the right is the U from a standard PDB file. Notice how the standard PDB file has names like " N1 " instead of " N  ", and " O2 " instead of " O  ", etc. Proper atom names are important for 3DNA to identify which atom is which.
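For readers who want to check their own files, the following is a minimal Python sketch (not part of 3DNA) that lists atom records whose name field carries no digit, as in the " N  " and " O  " examples above. It assumes fixed-column PDB records with the atom name in columns 13-16; the function names and the `make_atom_line` helper are hypothetical, and the "no digit" test is only a heuristic (a legitimate name such as " P  " would also be flagged).

```python
def atom_name_field(line):
    """Return the 4-character atom-name field (PDB columns 13-16)."""
    return line[12:16]


def suspicious_atom_names(pdb_text):
    """Return (serial, name) for ATOM/HETATM records whose name has no digit.

    Heuristic only: base atoms in standard PDB files are named N1, C2, O2, ...
    so an element-only name such as ' N  ' suggests a non-standard file.
    """
    flagged = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 16:
            name = atom_name_field(line)
            if not any(ch.isdigit() for ch in name):
                flagged.append((int(line[6:11]), name))
    return flagged


def make_atom_line(serial, name, res="U", chain="A", resseq=1):
    """Hypothetical helper: build a fixed-column ATOM record for the demo."""
    x = y = z = 0.0
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up demo records, not taken from the attached test.pdb
standard = "\n".join([make_atom_line(1, " N1 "), make_atom_line(2, " O2 ")])
babel_like = "\n".join([make_atom_line(1, " N  ", res="LIG"),
                        make_atom_line(2, " O  ", res="LIG")])

print(suspicious_atom_names(standard))    # nothing flagged
print(suspicious_atom_names(babel_like))  # both atoms flagged
```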

Quote
2. If find_pair does not find a base pair, will it still output the base pair geometry parameters that were calculated?
The problem is not that find_pair misses a pair because of parameter cutoffs, but that the residues are not recognized as nucleotides at all. Your best bet is simply to make your PDB file standard-compliant; then both problems will be gone.

In your attached test.pdb file, there are two uracils, which follow the same atom ordering and naming convention. Could you provide me with example files containing A, C, G, and T? It may be worthwhile to have a utility program in 3DNA that can convert a Gaussian-Babel-generated PDB file to the standard format.

Xiang-Jun

« Last Edit: March 23, 2012, 01:04:16 pm by xiangjun »

heldenbr
Re: How to get structural parameters from find_pair?
« Reply #2 on: March 23, 2012, 02:56:44 pm »
Xiang-Jun,

Thank you for the very prompt reply.  I will modify my pdb to have the correct atom names and will try again.  Also, I will work to provide you with the files that you have requested.

-Hugh

heldenbr
Re: How to get structural parameters from find_pair?
« Reply #3 on: March 25, 2012, 10:31:46 am »
Xiang-Jun-

Fixing the atom names did indeed fix my problem.  find_pair now works correctly on the cases I have tested so far.

I have made some OpenBabel-formatted pdb files from Gaussian outputs with all 5 nucleobases.  Due to some technical issues they are not actually base pairing; the coordinates are just all there in the file.  As I was creating these I found that OpenBabel is actually more clever than I had originally believed: if the entire nucleoside is there in the Gaussian output (base with sugar included) then OpenBabel will correctly format the PDB with atom names, residue names, etc.  It is only in cases where parts have been changed (as when I replaced the sugar ring with a methyl group) that OpenBabel reverts to the default "LIG" for "ATOM" and default atom names, etc.

In my situation I still needed a way to get the correct atom names into a number of pdbs though, so I have written a perl script that can do this.  It works for all of the nucleobases and can handle an arbitrary number of them.  I have done only limited testing on it so far, so if I find any bugs I will post corrections to this thread.  I have taken a cue from the x3dna2charmm_pdb script for the usage line :-)  If this script is useful to others they can feel free to use/change it, but I don't know how many other people will face the same situation as I did.
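The Perl script itself is not reproduced in this thread text. As a rough illustration of the general idea only (not the author's method), here is a minimal Python sketch that overwrites the atom-name field of each ATOM/HETATM record, in file order, with names supplied by the caller. It assumes fixed-column PDB records and that the caller already knows the correct standard name for each atom in file order; `HYPOTHETICAL_ORDER`, the function names, and the demo records are all made up for the example.

```python
def pdb_name_field(name):
    """Format a name for PDB columns 13-16.

    Names of up to 3 characters for one-letter elements (C, N, O, H) start
    in column 14, e.g. 'N1' -> ' N1 '; 4-character names fill the field.
    """
    return name if len(name) == 4 else (" " + name).ljust(4)


def rename_atoms(pdb_text, new_names):
    """Replace atom names of ATOM/HETATM records, in file order."""
    names = iter(new_names)
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 16:
            try:
                name = next(names)
            except StopIteration:
                raise ValueError("fewer names than atom records") from None
            # Only columns 13-16 change; coordinates etc. are left untouched.
            line = line[:12] + pdb_name_field(name) + line[16:]
        out.append(line)
    if next(names, None) is not None:
        raise ValueError("more names than atom records")
    return "\n".join(out)


def make_atom_line(serial, name, res="LIG", chain="A", resseq=1):
    """Hypothetical helper: build a fixed-column ATOM record for the demo."""
    x = y = z = 0.0
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Assumption: the caller knows which standard name belongs to each record.
HYPOTHETICAL_ORDER = ["N1", "C2", "O2"]

babel_like = "\n".join(make_atom_line(i + 1, f" {element}  ")
                       for i, element in enumerate("NCO"))
fixed = rename_atoms(babel_like, HYPOTHETICAL_ORDER)
print(fixed)
```

Passing the names explicitly keeps the sketch honest about what it cannot know: the mapping from element-only names to standard names depends on the atom order in each particular file.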

I will go ahead and post the OpenBabel pdbs too.

-Hugh


xiangjun (Administrator)
Re: How to get structural parameters from find_pair?
« Reply #4 on: March 25, 2012, 11:15:12 am »
Hi Hugh,

Thank you so much for contributing back the Perl script that solved your problem, and for providing new sample Gaussian-Babel-generated PDB data files. As it turns out, the three PDB files you attached -- AT.pdb, AU.pdb, and GC.pdb -- are all fine with 3DNA. You can verify this point by running find_pair with the -s option:
Code:
find_pair -s GC.pdb stdout
# and it will output the following:
GC.pdb
GC.outs
    1      # single helix
    2      # number of bases
    1    1 # explicit bp numbering/hetero atoms
    1      # ....>A:...1_:[..G]G
    2      # ....>B:...1_:[..C]C
However, none of the three PDB files contains a base pair, per the default parameters -- check using a molecular graphics viewer like Jmol or PyMOL.

The atom naming issue with PDB files produced by computational chemistry packages (e.g. Gaussian) and Babel has come up a few times in the 3DNA forum. As far as 3DNA is concerned, your effort has led to the first solution to this problem that I am aware of. Your question has prompted me to read the article "Open Babel: An open chemical toolbox" and download the latest Open Babel v2.3.1.

Best regards,

Xiang-Jun
« Last Edit: March 25, 2012, 10:01:17 pm by xiangjun »

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University