Hi Di,
Thanks for your well-formulated question regarding x3dna-dssr’s support of N1-methyl-pseudouridine (B8H).
I understand that x3dna-dssr can handle pseudouridine (PDB Chem ID: PSU) correctly. I'm inquiring about its support for N1-methyl-pseudouridine (PDB Chem ID: B8H). Specifically, does x3dna-dssr recognize B8H based on its PDB chemical ID, or does it rely on atomic connectivity?
FYI, I've tested x3dna-dssr with PDB entries 8PFK and 8PFQ, both containing B8H, and the analysis proceeded without errors, with the results looking reasonable. However, given the unique C5-C1′ glycosidic bond for B8H, I want to confirm that x3dna-dssr interprets this modification accurately.
DSSR uses atomic connectivity, not the PDB chemical ID, to identify pseudouridine and its modified forms, including B8H. The DSSR User Manual contains the following relevant information:
Note that pseudouridine, the most prevalently modified nt in RNA, is denoted P† in DSSR and the small case p is reserved for potential modified pseudouridines. ... footnote: †Not to be confused with the phosphorus atom in the backbone phosphate group. The distinction should be clear in context.
While anticipated, your reported case of B8H is the first time I have seen a modified pseudouridine. In the DSSR output for 8PFK, you will see the following:
#x3dna-dssr -i=8PFK.pdb -o=8PFK.out
From 8PFK.out
****************************************************************************
List of 1 type of 1 modified nucleotide
nt count list
1 B8H-p 1 A.B8H7
From dssr-torsions.txt
7 p A.B8H7 ... chi -155.3(anti)
The chi for B8H is defined using O4'--C1'--C5--C4 instead of O4'--C1'--N1--C2. The latter, used for standard pyrimidines, would make no sense for pseudouridine, where the glycosidic bond is C1'--C5. This is a little detail that DSSR pays attention to where other tools may not. See my blog post "Torsion angles from DSSR". You can easily verify the value yourself, for example in PyMOL, by clicking the four atoms in order to measure the torsion angle.
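If you prefer to check the number outside PyMOL, the following is a minimal Python sketch of the same calculation. It is not DSSR's code: the names `CHI_ATOMS`, `chi_atom_names`, and `dihedral_deg` are made up for illustration, the purine and pyrimidine entries are the standard IUPAC definitions, and the coordinates at the end are hypothetical, not taken from 8PFK.

```python
import math

# Atom names defining chi for each base class. The pseudouridine entry follows
# the O4'--C1'--C5--C4 definition described above; the other two are the
# standard IUPAC definitions for purines and pyrimidines.
CHI_ATOMS = {
    "purine": ("O4'", "C1'", "N9", "C4"),
    "pyrimidine": ("O4'", "C1'", "N1", "C2"),
    "pseudouridine": ("O4'", "C1'", "C5", "C4"),
}


def chi_atom_names(base_class):
    """Return the four atom names used for chi (hypothetical helper)."""
    return CHI_ATOMS[base_class]


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, in the range -180 to 180."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates keyed by atom name, for illustration only.
# In practice, read them from the B8H residue in the PDB or mmCIF file.
atoms = {
    "O4'": (1.0, 0.0, 0.0),
    "C1'": (0.0, 0.0, 0.0),
    "C5": (0.0, 0.0, 1.0),
    "C4": (0.0, 1.0, 1.0),
}
chi = dihedral_deg(*(atoms[name] for name in chi_atom_names("pseudouridine")))
print(f"chi = {chi:.1f}")
```

With the real coordinates of the four atoms of A.B8H7 substituted in, the result can be compared against the chi value listed in dssr-torsions.txt.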
The DSSR results for 8PFQ are also as expected, with B8H correctly identified as a modified pseudouridine.
Further, is there a comprehensive list of modified nucleotides currently supported by x3dna-dssr? I came across these two pages (https://x3dna.org/highlights/automatic-identification-of-nucleotides ; https://x3dna.org/highlights/modified-nucleotides-in-the-pdb ), but could not find the exact answer.
To answer your question, here is an excerpt from my recent response to a similar inquiry:
Over the years, I've refined the heuristics of the mapping process. In the early days with 3DNA, I kept an ever-increasing list, 'baselist.dat', with hundreds of entries such as 'MIA a', which maps MIA to a modified A, denoted by lowercase 'a'. In the current DSSR, I keep only the standard ones, with 48 entries total (see attached DSSR-baselist.txt). If a residue is not a standard one, the following function is called to do the mapping (DSSR performs filtering to decide whether it is a nucleotide and, if so, whether it is a purine, R, or a pyrimidine, Y). DSSR also has a command-line option --nt-mapping as documented in the screenshot.
The DSSR-baselist.txt is attached for your reference. I am planning to write a blog post with details on this topic.
Best regards,
Xiang-Jun