Hi Xiangjun,
As you know, a PDB file may contain bases that 3DNA does not recognize, and these must be added to the baselist.dat file before 3DNA can process the structure.
Since I have to pre-analyze all recently updated structures in the NDB, I must handle this automatically. What I do is:
1) find_pair XX.pdb XX.inp >& log
2) If XX.inp exists, go on to analyze; otherwise open the log, find the unknown bases, and go to the next step.
3) For each unknown base (three-letter name): if the first letter matches any of {A, T, C, G, U}, the base is represented by the matched letter in lower case {a, t, c, g, u}; if the first letter does not match, try the second, and then the third; if none of the three letters matches, the base is assigned the arbitrary letter 'n'.
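To make the rule in step 3 concrete, here is a minimal Python sketch of it. It only illustrates the letter-matching heuristic described above; the function name guess_base_code and the constant STANDARD_BASES are made up for this example and are not part of 3DNA, and the residue names in the usage lines are just sample strings.

```python
# Hypothetical helper, not part of 3DNA: illustrates step 3 only.
STANDARD_BASES = "ATCGU"


def guess_base_code(residue_name: str) -> str:
    """Map a three-letter residue name to a lower-case one-letter code.

    Scans the name left to right and returns the first letter that is
    one of A, T, C, G, U (lower-cased). Returns 'n' if none matches.
    """
    for letter in residue_name.strip().upper()[:3]:
        if letter in STANDARD_BASES:
            return letter.lower()
    return "n"


if __name__ == "__main__":
    for name in ("PSU", "1MA", "CBR", "XYZ"):
        print(name, guess_base_code(name))
```

Note that a name containing two standard letters (e.g. "TGX") is mapped by its first match only, here 't', which is exactly where the heuristic can go wrong.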
I understand that this approach is risky, although the name of a mutated base normally contains one and only one wild-type base letter. What would you suggest for dealing with this? Do you know of any naming rule for mutated bases?
Thanks.
Guohui