RNA structures (DSSR) / Re: Definition of Helix Form
« on: November 13, 2017, 11:06:08 am »
I actually did some work to benchmark how well 3DNA identifies the helix form.
In our lab, we have an in-house database of all the DNA and RNA stem structures from every crystal structure labeled "Protein#DNA" or "Protein#RNA" deposited in RCSB with a resolution better than 4 Å. Yes, I used DSSR to generate the stem structures for each PDB entry.
Then I built idealized fiber models of B-form DNA and A-form RNA using 3DNA. If you type
fiber -m
it will list the available nucleic acid models. I picked number 4 for B-DNA and number 20 for A-RNA.
[hs189@summer:Plot] fiber -m
Fiber data in directory: /home/hs189/X3DNA/fiber/
id# Twist Rise Structure description
(degree) (Angstrom)
-------------------------------------------------------------------------------
1 32.7 2.548 A-DNA (calf thymus; generic sequence: A, C, G and T)
2 65.5 5.095 A-DNA poly d(ABr5U) : poly d(ABr5U)
3 0.0 28.030 A-DNA (calf thymus) poly d(A1T2C3G4G5A6A7T8G9G10T11) :
poly d(A1C2C3A4T5T6C7C8G9A10T11)
4 36.0 3.375 B-DNA (calf thymus; generic sequence: A, C, G and T)
5 72.0 6.720 B-DNA poly d(CG) : poly d(CG)
6 180.0 16.864 B-DNA (calf thymus) poly d(C1C2C3C4C5) : poly d(G6G7G8G9G10)
7 38.6 3.310 C-DNA (calf thymus; generic sequence: A, C, G and T)
8 40.0 3.312 C-DNA poly d(GGT) : poly d(ACC)
9 120.0 9.937 C-DNA poly d(G1G2T3) : poly d(A4C5C6)
10 80.0 6.467 C-DNA poly d(AG) : poly d(CT)
11 80.0 6.467 C-DNA poly d(A1G2) : poly d(C3T4)
12 45.0 3.013 D-DNA poly d(AAT) : poly d(ATT)
13 90.0 6.125 D-DNA poly d(CI) : poly d(CI)
14 -90.0 18.500 D-DNA poly d(A1T2A3T4A5T6) : poly d(A1T2A3T4A5T6)
15 -60.0 7.250 Z-DNA poly d(GC) : poly d(GC)
16 -51.4 7.571 Z-DNA poly d(As4T) : poly d(As4T)
17 0.0 10.200 L-DNA (calf thymus) poly d(GC) : poly d(GC)
18 36.0 3.230 B'-DNA alpha poly d(A) : poly d(T) (H-DNA)
19 36.0 3.233 B'-DNA beta2 poly d(A) : poly d(T) (H-DNA beta)
20 32.7 2.812 A-RNA poly (A) : poly (U)
I know that 3DNA identifies the helix form per dinucleotide step, so I generated two-base-pair-long idealized B-DNA and A-RNA models and aligned them to the coordinates of each dinucleotide step in my stem structures, using only backbone and sugar heavy atoms. This yielded an alignment RMSD for each dinucleotide step in my database.
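To make the alignment step concrete, here is a minimal sketch of a superposition RMSD between one dinucleotide step and an idealized model. It assumes the backbone and sugar heavy atoms have already been extracted into two N x 3 coordinate arrays with the same atom order. It uses the standard Kabsch algorithm with NumPy, which is not necessarily what my actual pipeline does, and the function names `kabsch_rmsd` and `within_cutoff` are made up for this illustration.

```python
import numpy as np


def kabsch_rmsd(mobile, reference):
    """RMSD after optimal rigid-body superposition of `mobile` onto `reference`.

    Both inputs are (N, 3) arrays of matched atoms in the same order.
    """
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two (N, 3) arrays with matching atoms")

    # Remove translation by centering both coordinate sets.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the mobile set (row vectors) and measure the deviation.
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))


def within_cutoff(rmsd, cutoff=2.0):
    """True if the step is within the RMSD cutoff (2 Å in this post)."""
    return rmsd < cutoff
```

A step would then count as close to the idealized form when `within_cutoff(kabsch_rmsd(step_atoms, ideal_atoms))` is true, where `step_atoms` and `ideal_atoms` are the hypothetical matched coordinate arrays.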
Here are the results:
My RMSD cutoff is 2 Å.
Protein#DNA
Total number of entries (dinucleotide step): 97366
Number of entries with RMSD > 2 Å that 3DNA classifies as B form: 49
Number of entries with RMSD < 2 Å that 3DNA classifies as X form (ambiguous): 29702
For the remaining entries, 3DNA agrees with my RMSD cutoff.
Protein#RNA
Total number of entries (dinucleotide step): 56530
Number of entries with RMSD > 2 Å that 3DNA classifies as A form: 0
Number of entries with RMSD < 2 Å that 3DNA classifies as X form (ambiguous): 22893
For the remaining entries, 3DNA agrees with my RMSD cutoff.
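For anyone who wants these counts as fractions, the arithmetic can be sketched as follows. The function name `agreement_summary` and its argument names are hypothetical; the numbers are the ones reported above.

```python
def agreement_summary(total, high_rmsd_but_canonical, low_rmsd_but_ambiguous):
    """Summarize agreement between the RMSD cutoff and the 3DNA assignment.

    total: number of dinucleotide steps
    high_rmsd_but_canonical: RMSD > cutoff, yet 3DNA assigns A or B form
    low_rmsd_but_ambiguous: RMSD < cutoff, yet 3DNA assigns X form
    """
    disagree = high_rmsd_but_canonical + low_rmsd_but_ambiguous
    agree = total - disagree
    return {
        "agree": agree,
        "disagree": disagree,
        "agree_fraction": agree / total,
        "high_rmsd_but_canonical_fraction": high_rmsd_but_canonical / total,
        "low_rmsd_but_ambiguous_fraction": low_rmsd_but_ambiguous / total,
    }


# Counts taken from the Protein#DNA and Protein#RNA results in this post.
dna = agreement_summary(97366, 49, 29702)
rna = agreement_summary(56530, 0, 22893)

for label, summary in (("Protein#DNA", dna), ("Protein#RNA", rna)):
    print(label, summary["agree"], f"{summary['agree_fraction']:.1%}")
```

So the disagreement is almost entirely in one direction: steps that are within the RMSD cutoff but that 3DNA leaves as ambiguous.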
I think 3DNA basically did a good job, considering how few entries across the entire PDB exceed the RMSD cutoff yet are classified by 3DNA as A or B form.
I am just wondering: does 3DNA also simply use coordinate alignment to identify the helix form?
Best,
Honglue
PS. These data are in the process of being published, so I am not sure whether I can provide further details, but I will try my best to give you as much detail as you want.