Author Topic: Avoiding pairing that cross chains (Read 50384 times)

eswright · « **on:** July 26, 2019, 01:11:05 pm »

I used DSSR to calculate the secondary structure of PDB 4V6W chain A5, which is a eukaryotic LSU rRNA. This resulted in 896 left pairings "(" and 880 right pairings ")". Presumably this imbalance is because some of the pairs are inter-chain. Is there a way to force pairings only within single chains (i.e., intra-chain pairs only)? Ideally, I would like to obtain a balanced structure for each chain by itself.

Thanks,
Erik

xiangjun · « **Reply #1 on:** July 26, 2019, 02:18:32 pm »

Dear Erik,

Thanks for using DSSR and for posting your questions on the 3DNA Forum.

Quote

Is there a way to force pairings only within single chains (i.e., intra-chain pairs only)?

The default setting of DSSR is there for a reason. Think of a DNA duplex, such as 355d, for example. There would be no base pairs within each strand (A or B). So the DSSR DBN output for the [whole] input structure is balanced, but not necessarily within each chain, as you noticed in PDB 4V6W.

To derive properly formatted DBN per chain, you need to extract the chain into a new file and then run DSSR on it. I may consider adding a new option to DSSR to streamline this process in the future.
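The duplex argument above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the dot-bracket string is a made-up 4-bp duplex rather than real DSSR output, and the use of "&" as the chain separator is an assumption about how the string is laid out.

```python
def bracket_counts(dbn):
    """Return (opening, closing) counts of '(' and ')' in a dot-bracket string."""
    return dbn.count("("), dbn.count(")")


def per_chain_counts(dbn, sep="&"):
    """Split a multi-chain dot-bracket string on `sep` and count each piece.

    The separator is an assumption; adjust it to match the actual input.
    """
    return [bracket_counts(piece) for piece in dbn.split(sep)]


# Made-up 4-bp duplex: every pair joins the two strands.
duplex = "((((&))))"
print(bracket_counts(duplex))    # (4, 4): balanced for the whole structure
print(per_chain_counts(duplex))  # [(4, 0), (0, 4)]: unbalanced per strand
```

The whole string is balanced while each strand alone is not, which is the same kind of imbalance as the 896 versus 880 counts reported for chain A5.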

Best regards,

Xiang-Jun

eswright · « **Reply #2 on:** July 26, 2019, 02:33:17 pm »

Thank you for the quick and informative reply. What is the best way to extract a single chain from a pdb file?

xiangjun · « **Reply #3 on:** July 26, 2019, 03:01:46 pm »

It really depends. There are many choices. For example, pdb-tools 2.0.0: have a look at https://pypi.org/project/pdb-tools/ [pdb-tools: a swiss army knife for molecular structures. bioRxiv (2018). doi:10.1101/483305]. Jmol/PyMOL should also do the trick.

Best regards,

Xiang-Jun

eswright · « **Reply #4 on:** July 27, 2019, 07:12:42 am »

Thank you for the suggestions. I managed to extract a chain in R using bio3d. As an example, I can extract chain A5 of PDB 4V6W:

library(bio3d)
pdb <- read.cif("4V6W.cif")
inds <- atom.select(pdb, chain="A5")
pdb2 <- trim.pdb(pdb, inds)
write.pdb(pdb2, "4V6W.pdb")

When I ran DSSR on the resulting pdb file (attached), I obtained the error:

no ATOM/HETATM records found!

On the original cif file with all chains I obtained the expected result. Could you please help me understand why DSSR is not working on the attached file?

xiangjun · « **Reply #5 on:** July 27, 2019, 09:40:06 am »

The attached, bio3d-extracted PDB file is not in the proper PDB format, as shown below:

Code:

ATOM  146941 "O5'"   U A5   1      25.028 -40.244  90.648  1.00 70.00           O
ATOM  146942 "C5'"   U A5   1      25.523 -39.579  91.790  1.00 70.00           C
ATOM  146943 "C4'"   U A5   1      26.370 -38.340  91.444  1.00 70.00           C
ATOM  146944 "O4'"   U A5   1      27.391 -38.705  90.570  1.00 70.00           O
ATOM  146945 "C3'"   U A5   1      25.628 -37.207  90.717  1.00 70.00           C
ATOM  146946 "O3'"   U A5   1      24.913 -36.404  91.694  1.00 70.00           O

Specifically, the atom serial number is larger than 99999 (the maximum that fits in the 5-column field), the atom names are wrapped in double quotes and therefore exceed the 4-character field, and the chain ID should be a single character instead of A5. Could bio3d write the output in mmCIF format? You may need to read the documentation or contact the developer of bio3d.

An example of correctly formatted PDB ATOM records is shown below:

Code:

ATOM     25  P     C A   2      54.635  50.420  53.741  1.00100.19           P
ATOM     26  OP1   C A   2      55.145  51.726  54.238  1.00100.19           O
ATOM     27  OP2   C A   2      54.465  50.204  52.269  1.00100.19           O
ATOM     28  O5'   C A   2      55.563  49.261  54.342  1.00 98.27           O
ATOM     29  C5'   C A   2      55.925  49.246  55.742  1.00 95.40           C
ATOM     30  C4'   C A   2      56.836  48.075  56.049  1.00 93.33           C
ATOM     31  O4'   C A   2      56.122  46.828  55.830  1.00 92.18           O
ATOM     32  C3'   C A   2      58.090  47.947  55.197  1.00 92.75           C
ATOM     33  O3'   C A   2      59.174  48.753  55.651  1.00 92.89           O
ATOM     34  C2'   C A   2      58.416  46.463  55.298  1.00 91.81           C
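The three field limits named above (serial number, atom name, chain ID) can be checked before writing PDB format at all. The following is a minimal sketch, not part of DSSR or bio3d; the function name is hypothetical and it only tests the three limits discussed in this reply, not the full PDB specification.

```python
def pdb_field_problems(serial, atom_name, chain_id):
    """List the reasons these fields cannot be written as a PDB ATOM record.

    Only the three limits discussed in this thread are checked.
    """
    problems = []
    if not 0 <= serial <= 99999:
        problems.append("atom serial number does not fit in 5 columns")
    if len(atom_name) > 4:
        problems.append("atom name is longer than 4 characters")
    if len(chain_id) != 1:
        problems.append("chain ID is not exactly one character")
    return problems


# First record of the bio3d output: all three limits are violated.
print(pdb_field_problems(146941, '"O5\'"', "A5"))

# A record from the correctly formatted example: no problems reported.
print(pdb_field_problems(28, "O5'", "A"))
```

When any problem is reported, mmCIF is the safer output format, since it does not rely on fixed-width columns.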

To get past this issue quickly, please try the following in PyMOL:

Code:

reinitialize
load 4v6w.cif
save 4v6w-A5.cif, chain A5

The output file "4v6w-A5.cif" is what you need. Please have a try and let us know how it goes.
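For more than one chain, the same three commands can be scripted through PyMOL's Python interface. This is a hedged sketch that assumes the `pymol` module is importable in your Python environment; `chain_selection` and `save_chain` are hypothetical helper names, while `cmd.reinitialize`, `cmd.load`, and `cmd.save` are the PyMOL calls behind the commands above.

```python
def chain_selection(chain_id):
    """Build the PyMOL selection string for one chain, e.g. 'chain A5'."""
    return f"chain {chain_id}"


def save_chain(structure_file, chain_id, out_file):
    """Load a structure and write only the requested chain to out_file.

    PyMOL picks the output format from the file extension, so an
    out_file ending in .cif yields mmCIF.
    """
    # Imported here so the helper above works without PyMOL installed.
    from pymol import cmd

    cmd.reinitialize()
    cmd.load(structure_file)
    cmd.save(out_file, chain_selection(chain_id))
```

Calling `save_chain("4v6w.cif", "A5", "4v6w-A5.cif")` would mirror the three commands above, and the resulting file can then be passed to DSSR as before.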

Xiang-Jun

eswright · « **Reply #6 on:** July 27, 2019, 01:19:34 pm »

Using PyMOL worked much better! Thanks for the tip.
