
Author Topic: A way to regenerate experimental coordinates from the reference coordinates?

Offline jyvdf3asdg2

Hi,

Love 3DNA and really appreciate its efficiency and speed. Have to admire some quality C coding!

I'm using a Python script that is tied to your program, but need the all-atom coordinates of every base-pair 3DNA finds. Your program generates a lovely allpairs.pdb file that contains all the information I need and would make life so much simpler, but unfortunately it is in a local reference frame.

I was wondering if there was a way to regenerate the coordinates of the native PDB from the generated allpairs.pdb file. Would this be a simple arithmetic operation, or a more complex matrix transformation that would require linear algebra libraries? I've tried reading through the tech-details documentation, but am unclear as to how I could do this, if at all.

Thanks so much

Offline xiangjun

Thanks for using 3DNA and your kind words about it! Over the years, user feedback/encouragement has been the driving force to move the project forward.

As you noticed, 'allpairs.pdb' has each of its base pairs (bp) reoriented in the local bp reference frame. If you want the all-atom bp coordinates in the native PDB, you can write a simple Python script to extract them from the original PDB file -- certainly no "complex matrix transformation" required. To make the point clear, let's use '6tna' as an example.

Code:
find_pair -p 6tna.pdb 6tna.bps
# This generates 'allpairs.pdb', where each bp is delineated by a MODEL/ENDMDL pair
head allpairs.pdb
# The first 2 lines are as below:
#   MODEL        1
#   REMARK    Section #0001 ....>A:...1_:[..G]G-----C[..C]:..72_:A<....

For the first bp, you can then simply extract nucleotides G1 on chain A (A.G1) and A.C72 from '6tna.pdb'. Check $X3DNA/perl_scripts/pdb_frag for a rudimentary implementation in Perl. By looping through 'allpairs.pdb' and checking for pattern "REMARK    Section #", you can convert all the bps to their native PDB coordinates.
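
A minimal Python sketch of the scanning step described above, for illustration only. The regular expression is inferred from the single sample REMARK line in this thread; the '.' padding of residue numbers and '_' standing for a blank insertion code are assumptions, and the names `SECTION_RE`, `parse_section` and `list_pairs` are hypothetical, not part of 3DNA.

```python
import re

# Pattern inferred from one sample line:
#   REMARK    Section #0001 ....>A:...1_:[..G]G-----C[..C]:..72_:A<....
# Assumed layout: >chain:number+icode:[resname] ... [resname]:number+icode:chain<
SECTION_RE = re.compile(
    r"^REMARK\s+Section\s+#\d+\s+"
    r".*?>(?P<c1>.):(?P<n1>[.\d-]+)(?P<i1>.):\[.{3}\]"
    r".+?"
    r"\[.{3}\]:(?P<n2>[.\d-]+)(?P<i2>.):(?P<c2>.)<"
)


def residue_key(chain, number, icode):
    """Turn the padded fields into a (chain, residue number, insertion code) key."""
    # Assumption: '.' is padding and '_' marks a blank insertion code.
    return (chain, int(number.strip(".")), " " if icode == "_" else icode)


def parse_section(line):
    """Return the two residue keys of a 'REMARK    Section #' line, or None."""
    m = SECTION_RE.match(line)
    if m is None:
        return None
    return (
        residue_key(m["c1"], m["n1"], m["i1"]),
        residue_key(m["c2"], m["n2"], m["i2"]),
    )


def list_pairs(path="allpairs.pdb"):
    """Collect one (residue, residue) tuple per bp section found in the file."""
    with open(path) as handle:
        return [pair for pair in map(parse_section, handle) if pair is not None]
```

Each returned key can then be matched against the chain, residue number and insertion code columns of the ATOM/HETATM records in the original PDB file.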

HTH,

Xiang-Jun



« Last Edit: July 07, 2012, 07:58:02 pm by xiangjun »

Offline jyvdf3asdg2

Hi Xiang-Jun,

Thanks for the advice. I was hoping for something more efficient, as Python can be rather slow when parsing hundreds or more PDBs. That is why I was wondering whether there was a simple way to transform the local reference frame back to the experimental coordinates, since your program has already parsed the PDBs and organized the pairs so nicely. But I'll give it a shot and see whether it costs too much wall-time.

Thanks

Offline xiangjun

There are two aspects to your question:
  • You want the original coordinates of the base pairs (bp) instead of those reoriented w.r.t. bp reference frames.
  • You are concerned about the speed of transformations or extracting from PDB files.
Using Python or any other scripting language, you do not need to parse hundreds of PDBs, but only two: one is the original PDB file, and the other is 'allpairs.pdb'. By scanning the latter, you have a list of all bps, then you can extract them one-by-one from the same original PDB file. Have a try and report back how it goes -- I do not think speed is an issue here.
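
One way to keep the extraction cheap is to read the original PDB file once, group its atom records by residue, and then look up each bp in that index. The sketch below is only an illustration under stated assumptions: it relies on the standard fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27), and `index_residues` and `native_pair_records` are hypothetical helper names, not 3DNA functions. Each bp is assumed to be given as two (chain, residue number, insertion code) tuples, however they were obtained.

```python
from collections import defaultdict


def index_residues(pdb_lines):
    """Group ATOM/HETATM records by (chain, residue number, insertion code)."""
    index = defaultdict(list)
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")):
            # Fixed PDB columns (0-based slices): chain, resSeq, iCode.
            key = (line[21], int(line[22:26]), line[26])
            index[key].append(line.rstrip("\n"))
    return index


def native_pair_records(index, pair):
    """Return the original-coordinate records of both residues of one bp."""
    first, second = pair
    # .get() avoids creating empty entries for residues that are not present.
    return index.get(first, []) + index.get(second, [])
```

With this layout the original file is scanned a single time per structure, and each bp costs two dictionary lookups; for example, `native_pair_records(index, (("A", 1, " "), ("A", 72, " ")))` would collect the records for A.G1 and A.C72 in '6tna.pdb'.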

That said, the most efficient way would be to add an option to 'find_pair', presumably -original_coordinate, so that instead of being transformed to the bp reference frame, the original atomic coordinates are written directly to 'allpairs.pdb'. I will update the 3DNA v2.1beta distribution soon, so stay tuned.

Xiang-Jun



Offline jyvdf3asdg2

Thanks so much for the prompt reply.

Pardon my poor phrasing. By parsing "hundreds of PDB's", I mean repeating the operation across hundreds of different PDB's to generate hundreds of different coordinate objects. I'm looking through essentially every base-pair (generated by your program) in the PDB for a specific marker. So for each NA PDB I'd run your program, look at the normal output, then (would like to) look at the exp. coordinates for each BP.

I've written the script you mentioned, but the best I can do is ~1k bps per 5.48 seconds in Python. That'd be about 20-25 minutes for all the pairs I look at. C++ did 1k bps in about 0.17 sec with GCC optimization, but I'd like to stay away from compiled languages.

I'm definitely looking forward to the update. If you were able to include that in the update I'd just wait!

Thanks.

Offline xiangjun

Updated to 2012jul09 -- have a look and report back how it works.

Xiang-Jun

Offline jyvdf3asdg2

Oh man, you made my day.

Works perfectly for one structure so far, will let you know later after I've done it for many more.

Thank you so much!

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University