General questions about the H-bond section in DSSR

xiangjun:
Hi Honglue,

Thanks for clarifying the Unicode issue with your Python-based JSON parser. DSSR-generated JSON output is in the ASCII charset, so Unicode handling is overkill in this case. The detailed examples you provided are very helpful.

--- Quote ---x3dna-dssr -i=3bnq.pdb --symm --get-hbond --json | jq . hbonds[1]

However, it outputs

Processing file '3bnq.pdb'
jq: error: Could not open file hbonds[1]: No such file or directory
--- End quote ---

The error is caused by the extra space between the dot and hbonds[1]: the shell passes them as two separate arguments, so jq takes . as the filter and hbonds[1] as the JSON file to process.
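As a small illustration of the word splitting involved, here is a sketch using Python's standard shlex module. It only models how a POSIX-style shell splits a command line into words (no pipes, no globbing), and the variable names are just for this example.

```python
import shlex

# shlex.split mimics POSIX shell word splitting (it does not model pipes or globbing).
with_space = shlex.split("jq . hbonds[1]")
without_space = shlex.split("jq .hbonds[1]")

# With the space, jq receives two arguments: the filter "." and a file name "hbonds[1]".
print(with_space)     # ['jq', '.', 'hbonds[1]']

# Without the space, jq receives a single filter argument.
print(without_space)  # ['jq', '.hbonds[1]']
```

Quoting the filter, as in jq '.hbonds[1]', is a common extra precaution, since some shells treat unquoted square brackets as filename pattern characters.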

The following command (jq .hbonds[1]) works as expected. Also note that there is no need to use the --symm option in this case.

--- Code: ---#x3dna-dssr -i=3bnq.pdb --get-hbond --json | jq .hbonds[1]
{
"index": 2,
"atom1_serNum": 59,
"atom2_serNum": 975,
"donAcc_type": "standard",
"distance": 2.532,
"atom1_id": "O6@A.G3",
"atom2_id": "N4@B.C23",
"atom_pair": "O:N",
"residue_pair": "nt:nt"
}

--- End code ---

--- Quote ---Example 1: Hbond index 117. donAcc_type acceptable.
--- End quote ---

--- Code: ---#x3dna-dssr -i=3bnq.pdb --get-hbond --json | jq .hbonds[116]
{
"index": 117,
"atom1_serNum": 1426,
"atom2_serNum": 1928,
"donAcc_type": "acceptable",
"distance": 2.612,
"atom1_id": "OP2@C.G22",
"atom2_id": "O41@C.PAR101",
"atom_pair": "O:O",
"residue_pair": "nt:ligand"
}

--- End code ---

--- Quote ---Example 2: Hbond index 113. donAcc_type standard.
--- End quote ---

--- Code: ---#x3dna-dssr -i=3bnq.pdb --get-hbond --json | jq .hbonds[112]
{
"index": 113,
"atom1_serNum": 1406,
"atom2_serNum": 1937,
"donAcc_type": "standard",
"distance": 2.63,
"atom1_id": "OP2@C.C21",
"atom2_id": "N32@C.PAR101",
"atom_pair": "O:N",
"residue_pair": "nt:ligand"
}
--- End code ---

--- Quote ---In both cases, the hydrogen bond geometries seem very similar, so why does DSSR think they have different donAcc_type values?
--- End quote ---

In DSSR, the donAcc_type is based on known or heuristically derived donor/acceptor properties of the two atoms in an H-bond.

In the case of H-bond #113, OP2@C.C21 is a known acceptor, and N32@C.PAR101 is judged as a donor. So this H-bond is between an acceptor and a donor, which is 'standard'.

In the case of H-bond #117, OP2@C.G22 is a known acceptor, but O41@C.PAR101 is judged to be a hydroxyl group. Like the 2'-hydroxyl group of the RNA ribose sugar, it can act as either an acceptor or a donor. So this H-bond is classified as 'acceptable'.

--- Quote ---Example 3: Hbond index 107. donAcc_type questionable.
--- End quote ---

--- Code: ---#x3dna-dssr -i=3bnq.pdb --get-hbond --json | jq .hbonds[106]
{
"index": 107,
"atom1_serNum": 1323,
"atom2_serNum": 1367,
"donAcc_type": "questionable",
"distance": 3.358,
"atom1_id": "O4'@C.A17",
"atom2_id": "O4'@C.G19",
"atom_pair": "O:O",
"residue_pair": "nt:nt"
}

--- End code ---

--- Quote ---In this case, DSSR identifies an H-bond between two O4' atoms, but we know that for ribose, the O4' is unlikely to be protonated. Is this the reason why DSSR thinks the donAcc_type is questionable?
--- End quote ---

That's right. DSSR does not know (or care) about the protonation state of the ribose sugars. It only knows that the O4' atoms are H-bond acceptors. Yet the two atoms are close together in 3D space and fulfill DSSR's geometric definition of an H-bond. So the pair is reported as a 'questionable' H-bond.

This is a feature of DSSR, not a bug: it allows DSSR to detect all 3 H-bonds in C+C pairs in an i-motif, for example. In other cases, a 'questionable' H-bond may point to an error in the structure that users should pay attention to.
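The three cases discussed above can be summarized in a toy lookup. This is only a sketch of the reasoning in this thread, not DSSR's actual implementation: the function name classify_don_acc and the role labels "donor", "acceptor", and "either" are hypothetical, and the handling of pairs not discussed here (such as donor with donor) is an assumption made by symmetry.

```python
VALID_ROLES = {"donor", "acceptor", "either"}  # hypothetical labels for this sketch


def classify_don_acc(role1: str, role2: str) -> str:
    """Toy classification of an H-bond from the roles of its two atoms."""
    if role1 not in VALID_ROLES or role2 not in VALID_ROLES:
        raise ValueError(f"unknown role: {role1!r} / {role2!r}")
    roles = {role1, role2}
    # An atom that can be donor or acceptor (e.g. a hydroxyl) gives 'acceptable'.
    if "either" in roles:
        return "acceptable"
    # One clear donor and one clear acceptor gives 'standard'.
    if roles == {"donor", "acceptor"}:
        return "standard"
    # Same role on both sides (acceptor:acceptor here; donor:donor is assumed).
    return "questionable"


print(classify_don_acc("acceptor", "donor"))     # H-bond #113: standard
print(classify_don_acc("acceptor", "either"))    # H-bond #117: acceptable
print(classify_don_acc("acceptor", "acceptor"))  # H-bond #107: questionable
```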

Since you're interested in H-bonds between nucleotides and the ligands, you could run the following command. DSSR detects three H-bonds between RNA and the PAR ligand. Are they what you’d expect? Have you tried other well-known software tools for H-bonding identification?

--- Code: ---#x3dna-dssr -i=3bnq.pdb --get-hbond --json | jq '.hbonds[] | select(.residue_pair=="nt:ligand")'
{
"index": 113,
"atom1_serNum": 1406,
"atom2_serNum": 1937,
"donAcc_type": "standard",
"distance": 2.63,
"atom1_id": "OP2@C.C21",
"atom2_id": "N32@C.PAR101",
"atom_pair": "O:N",
"residue_pair": "nt:ligand"
}
{
"index": 117,
"atom1_serNum": 1426,
"atom2_serNum": 1928,
"donAcc_type": "acceptable",
"distance": 2.612,
"atom1_id": "OP2@C.G22",
"atom2_id": "O41@C.PAR101",
"atom_pair": "O:O",
"residue_pair": "nt:ligand"
}
{
"index": 118,
"atom1_serNum": 1438,
"atom2_serNum": 1926,
"donAcc_type": "acceptable",
"distance": 2.644,
"atom1_id": "N7@C.G22",
"atom2_id": "O31@C.PAR101",
"atom_pair": "N:O",
"residue_pair": "nt:ligand"
}

--- End code ---
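For readers who prefer Python to jq, the same selection can be sketched with the standard json module. The sample data below is a trimmed copy of three records shown in this thread, and the helper name select_hbonds is made up for this example; with real DSSR output you would load the full JSON instead.

```python
import json

# Trimmed sample in the shape of DSSR's --get-hbond --json output (records from this thread).
sample = json.loads("""
{"hbonds": [
  {"index": 2, "donAcc_type": "standard", "distance": 2.532,
   "atom1_id": "O6@A.G3", "atom2_id": "N4@B.C23", "residue_pair": "nt:nt"},
  {"index": 113, "donAcc_type": "standard", "distance": 2.63,
   "atom1_id": "OP2@C.C21", "atom2_id": "N32@C.PAR101", "residue_pair": "nt:ligand"},
  {"index": 117, "donAcc_type": "acceptable", "distance": 2.612,
   "atom1_id": "OP2@C.G22", "atom2_id": "O41@C.PAR101", "residue_pair": "nt:ligand"}
]}
""")


def select_hbonds(dssr_output: dict, residue_pair: str) -> list:
    """Return the H-bond records whose residue_pair matches, like jq's select()."""
    return [hb for hb in dssr_output.get("hbonds", [])
            if hb.get("residue_pair") == residue_pair]


for hb in select_hbonds(sample, "nt:ligand"):
    print(hb["index"], hb["atom1_id"], hb["atom2_id"], hb["donAcc_type"])
```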

I hope this clears up your confusion about H-bond identification in DSSR.

Xiang-Jun

PS. Please remember to be concrete when asking questions. Be generous in summarizing what you've learned, for your own benefit and that of other viewers of the thread. Let's work together to make the Forum more informative.

lvelve0901:
Hi Xiangjun,

Your answer is very clear and concrete. I think I now understand how 3DNA identifies H-bonds in general. I will keep posting other PDB examples in the future if we find something weird in the H-bond section, since my rotation students are manually inspecting them now. Do you think we should post here or start a new topic in this forum for other PDB entries?

Also, I tried your way of parsing the JSON with jq and it works. But there is an issue with doing it this way:

--- Quote from: lvelve0901 on November 11, 2017, 05:54:22 pm ---
x3dna-dssr -i=3bnq.pdb --symm --get-hbond --json | jq .hbonds[1]

--- End quote ---

You first have to run DSSR to generate the JSON output, and for a large PDB file that can take a long time. Is there any command to parse json if we have already generated the json file?

Thanks.

Best,
Honglue

xiangjun:

--- Quote ---Is there any command to parse json if we have already generated the json file?
--- End quote ---

Sure. The pipe form is just a shorthand to avoid an intermediate file. You can certainly generate the JSON file first, and then parse it using jq -- see the excellent documentation of jq for examples.

--- Code: Ruby ---x3dna-dssr -i=3bnq.pdb --get-hbond --json | jq .hbonds[1]
# can be decomposed into the following two steps:
x3dna-dssr -i=3bnq.pdb --get-hbond --json -o=3bnq-hbonds.json
jq .hbonds[1] 3bnq-hbonds.json
# all with the following results:
{
"index": 2,
"atom1_serNum": 59,
"atom2_serNum": 975,
"donAcc_type": "standard",
"distance": 2.532,
"atom1_id": "O6@A.G3",
"atom2_id": "N4@B.C23",
"atom_pair": "O:N",
"residue_pair": "nt:nt"
}
--- End code ---

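Once the JSON file exists, it can also be read from Python without rerunning DSSR. The following is a minimal sketch that assumes the file 3bnq-hbonds.json was produced by the two-step command in this post; the helper names load_hbonds and hbond_by_index are made up for this example. Note that the array position used by jq is 0-based while the "index" field is 1-based, which is why .hbonds[1] returns the record with "index": 2.

```python
import json
from pathlib import Path


def load_hbonds(json_path) -> list:
    """Read a saved DSSR JSON file and return its list of H-bond records."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle).get("hbonds", [])


def hbond_by_index(hbonds: list, index: int):
    """Look up a record by its 1-based "index" field; return None if absent."""
    return next((hb for hb in hbonds if hb.get("index") == index), None)


json_file = Path("3bnq-hbonds.json")  # assumed output of the -o= option used in this post
if json_file.exists():
    hbonds = load_hbonds(json_file)
    print(hbonds[1])                   # same record as: jq .hbonds[1] 3bnq-hbonds.json
    print(hbond_by_index(hbonds, 2))   # same record, selected by its "index" field
```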
Xiang-Jun

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University