
Topic: baselist

ghzheng
baselist
« on: December 23, 2008, 07:59:14 am »
Hi Xiangjun,

As you know, a PDB file may contain unknown bases, and these must be added to the baselist.dat file before 3DNA can process the structure.

Since I have to pre-analyze all recently updated structures in the NDB, I need to handle this automatically. Here is what I did:

1) Run: find_pair XX.pdb XX.inp >& log
2) If XX.inp exists, go on to analyze; otherwise open the log, find the unknown bases, and go to step 3.
3) For each unknown base (a three-letter residue name): if the first letter matches any of {A, T, C, G, U}, represent the base by the matched letter in lower case {a, t, c, g, u}; if the first letter does not match, try the second, and then the third; if none of the three letters matches, assign the arbitrary letter 'n'.
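
Step 3 above can be sketched in Python roughly as follows. This is only an illustration of the letter-matching heuristic as described; the function name `guess_base_code` and the constant `STANDARD_BASES` are made up for this example and are not part of 3DNA.

```python
# Hypothetical helper illustrating the heuristic in step 3; not part of 3DNA.
STANDARD_BASES = "ATCGU"


def guess_base_code(residue_name: str, fallback: str = "n") -> str:
    """Guess a lower-case one-letter code for an unknown residue.

    Scans the residue name left to right and returns the first letter
    that matches a standard base, in lower case. If no letter matches,
    returns the arbitrary fallback letter.
    """
    for letter in residue_name.strip().upper():
        if letter in STANDARD_BASES:
            return letter.lower()
    return fallback


if __name__ == "__main__":
    for name in ("5MC", "PSU", "XYZ"):
        print(name, "->", guess_base_code(name))
```

Because the scan stops at the first matching letter, a name that happens to contain two base letters is resolved by position alone, which is exactly where the approach becomes risky.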

I understand that this approach is risky, although the name of a modified base normally contains the letter of one and only one standard base. What would you suggest for dealing with this? Do you know of any naming rule for modified bases?

Thanks.

Guohui

xiangjun (Administrator)
Re: baselist
« Reply #1 on: December 23, 2008, 11:02:12 pm »
Hi Guohui,

The way you suggested might work most of the time, but it is not a general approach. Certainly one should not count on the naming convention of nucleotide residues to judge their identity. On the other hand, it really does not matter (that much) which modified base you use; a generalized Atomic_N|n.pdb will do.

I could have automated this process by default, but decided to leave it as is. I have an updated version of 'baselist.dat' and 'atomlist.dat' that works for the latest NDB (Dec. 19, 2008 release), and I have attached them to this post. In general, you could write a script that processes each NDB entry with 'find_pair -s', which should identify each unknown nucleotide. You could then update your list accordingly.
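
A minimal sketch of such a script, in Python, might look like the following. Note the assumptions: the wording of the log message for an unknown residue is NOT taken from 3DNA, so the pattern `UNKNOWN_RE` is a placeholder to be adapted to the actual log, and the helper names `unknown_residues` and `scan_entry` are invented for illustration. Only the command line itself (`find_pair -s XX.pdb XX.inp`) comes from this thread.

```python
import re
import subprocess
from pathlib import Path
from typing import List

# ASSUMPTION: placeholder pattern. The real wording of the find_pair
# message for an unknown residue must be checked against an actual log.
UNKNOWN_RE = re.compile(r"unknown residue\s+(\S+)", re.IGNORECASE)


def unknown_residues(log_text: str, pattern: "re.Pattern" = UNKNOWN_RE) -> List[str]:
    """Return the sorted, de-duplicated residue names matched in a log."""
    return sorted({m.group(1) for m in pattern.finditer(log_text)})


def scan_entry(pdb_path: Path) -> List[str]:
    """Run 'find_pair -s' on one entry and collect unknown residue names.

    Requires find_pair to be on the PATH. Both stdout and stderr are
    searched, since the original shell command used '>&' to capture both.
    """
    inp_path = pdb_path.with_suffix(".inp")
    result = subprocess.run(
        ["find_pair", "-s", str(pdb_path), str(inp_path)],
        capture_output=True,
        text=True,
    )
    return unknown_residues(result.stdout + result.stderr)


if __name__ == "__main__":
    # Hypothetical directory of NDB entries in PDB format.
    for pdb in sorted(Path("ndb_entries").glob("*.pdb")):
        names = scan_entry(pdb)
        if names:
            print(pdb.name, names)
```

The collected names can then be reviewed and added to 'baselist.dat' by hand rather than assigned automatically.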

I refined 3DNA v2.0 a few months back, especially with regard to the NP_Recipes/ directory. The files intended for "official" release are currently at http://3dna.rutgers.edu:8080/3DNA_v2.0/. The directory is password protected: v2_beta/qB78Yaz. Please use this version to replace the one you are using.

HTH,

Xiang-Jun

Attachment: baselist.dat

 

Funded by X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids (R24GM153869)

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University