


Messages - lvelve0901

26
RNA structures (DSSR) / Re: Definition of Helix Form
« on: November 13, 2017, 11:06:08 am »
I actually did some work to benchmark how well 3DNA identifies the helix form.

In our lab, we have an in-house database of all the DNA and RNA stem structures from the crystal structures labeled "Protein#DNA" and "Protein#RNA" deposited in RCSB with resolution under 4Å. Yes, I use DSSR to generate the stem structures for each PDB entry.

Then I build idealized B-form DNA and idealized A-form RNA fiber models using 3DNA. If you type

fiber -m

it will list the available nucleic acid models. I pick number 4 for B-DNA and number 20 for A-RNA.

[hs189@summer:Plot] fiber -m
Fiber data in directory: /home/hs189/X3DNA/fiber/

id#  Twist   Rise        Structure description
   (degree) (Angstrom)
-------------------------------------------------------------------------------
 1   32.7   2.548  A-DNA  (calf thymus; generic sequence: A, C, G and T)
 2   65.5   5.095  A-DNA  poly d(ABr5U) : poly d(ABr5U)
 3    0.0  28.030  A-DNA  (calf thymus) poly d(A1T2C3G4G5A6A7T8G9G10T11) :
                                        poly d(A1C2C3A4T5T6C7C8G9A10T11)
 4   36.0   3.375  B-DNA  (calf thymus; generic sequence: A, C, G and T)
 5   72.0   6.720  B-DNA  poly d(CG) : poly d(CG)
 6  180.0  16.864  B-DNA  (calf thymus) poly d(C1C2C3C4C5) : poly d(G6G7G8G9G10)
 7   38.6   3.310  C-DNA  (calf thymus; generic sequence: A, C, G and T)
 8   40.0   3.312  C-DNA  poly d(GGT) : poly d(ACC)
 9  120.0   9.937  C-DNA  poly d(G1G2T3) : poly d(A4C5C6)
10   80.0   6.467  C-DNA  poly d(AG) : poly d(CT)
11   80.0   6.467  C-DNA  poly d(A1G2) : poly d(C3T4)
12   45.0   3.013  D-DNA  poly d(AAT) : poly d(ATT)
13   90.0   6.125  D-DNA  poly d(CI) : poly d(CI)
14  -90.0  18.500  D-DNA  poly d(A1T2A3T4A5T6) : poly d(A1T2A3T4A5T6)
15  -60.0   7.250  Z-DNA  poly d(GC) : poly d(GC)
16  -51.4   7.571  Z-DNA  poly d(As4T) : poly d(As4T)
17    0.0  10.200  L-DNA  (calf thymus) poly d(GC) : poly d(GC)
18   36.0   3.230  B'-DNA alpha poly d(A) : poly d(T) (H-DNA)
19   36.0   3.233  B'-DNA beta2 poly d(A) : poly d(T) (H-DNA  beta)
20   32.7   2.812  A-RNA  poly (A) : poly (U)


I know that 3DNA identifies the helix form per dinucleotide step, so I generated idealized B-DNA and A-RNA models two base pairs long and aligned them to the coordinates of the stem structures I generated, using only the backbone and sugar heavy atoms. This yields an alignment RMSD for each dinucleotide step in my database.

Here is the result:

My RMSD cutoff is 2Å.

Protein#DNA
Total number of entries (dinucleotide steps): 97366
Number of entries with RMSD > 2Å that 3DNA classifies as B form: 49
Number of entries with RMSD < 2Å that 3DNA classifies as X form (ambiguous): 29702
For the rest of the entries, 3DNA agrees with my RMSD cutoff.

Protein#RNA
Total number of entries (dinucleotide steps): 56530
Number of entries with RMSD > 2Å that 3DNA classifies as A form: 0
Number of entries with RMSD < 2Å that 3DNA classifies as X form (ambiguous): 22893
For the rest of the entries, 3DNA agrees with my RMSD cutoff.
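
As a quick check of the numbers above, here is a minimal sketch that derives the "agree" counts by subtraction and expresses them as a fraction of all steps. The function name `summarize` is made up for illustration; the only inputs are the counts reported in this post.

```python
# Counts are copied from the post; the "agree" numbers are derived by subtraction.
def summarize(total, over_cutoff_called_ideal, under_cutoff_called_x):
    """Return (agree_count, agree_fraction) for one data set."""
    disagree = over_cutoff_called_ideal + under_cutoff_called_x
    agree = total - disagree
    return agree, agree / float(total)

dna_agree, dna_frac = summarize(97366, 49, 29702)
rna_agree, rna_frac = summarize(56530, 0, 22893)

print("Protein#DNA: %d of %d steps agree (%.1f%%)" % (dna_agree, 97366, 100 * dna_frac))
print("Protein#RNA: %d of %d steps agree (%.1f%%)" % (rna_agree, 56530, 100 * rna_frac))
```

Almost all of the disagreement in both sets comes from steps that are within the RMSD cutoff but that 3DNA leaves unclassified.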

I think 3DNA basically does a good job, considering how few entries across the entire PDB exceed the RMSD cutoff yet are still classified as A/B form by 3DNA.
I am just wondering: does 3DNA also simply use coordinate alignment to identify the helix form?

Best,
Honglue

PS. These data are under publication, so I am not sure if I can provide further details, but I will try my best to give you as much detail as you want.

27
RNA structures (DSSR) / Re: FRABASE
« on: November 13, 2017, 10:06:40 am »
I see.

Thanks.

Best,
Honglue

28
RNA structures (DSSR) / Re: General questions of H-bond section in DSSR
« on: November 13, 2017, 10:05:22 am »
Hi Xiangjun,

Your answer is very clear and concrete. I think I now understand how 3DNA identifies H-bonds in general. I will keep posting other PDB examples in the future if we find something weird in the H-bond section, since my rotation students are manually inspecting them now. Do you think we should post here or start a new topic in this forum for other PDB entries?

Also, I tried your way of parsing the JSON using jq and it works. But there is an issue with doing it this way:


x3dna-dssr -i=3bnq.pdb --symm --get-hbond --json | jq .hbonds[1]


This first runs DSSR to generate the JSON output, which can take a long time for a large PDB file. Is there any command to parse the JSON if we have already generated the JSON file?
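
One way to avoid re-running DSSR is to read the saved file directly. jq itself accepts input files after the filter (for example `jq '.hbonds[1]' 3bnq.json`), and the same can be done in Python. The sketch below is only an illustration: `load_hbonds` is a made-up helper name, and the small temporary file stands in for a real DSSR output file such as 3bnq.json, holding two abbreviated records quoted elsewhere in this thread.

```python
import json
import os
import tempfile

def load_hbonds(json_path):
    """Read an already generated DSSR JSON file and return its 'hbonds' list."""
    with open(json_path) as handle:
        return json.load(handle).get("hbonds", [])

# Stand-in for a real DSSR output file (e.g. 3bnq.json); records abbreviated.
demo = {"hbonds": [
    {"index": 113, "atom1_id": "OP2@C.C21", "atom2_id": "N32@C.PAR101",
     "distance": 2.63, "donAcc_type": "standard"},
    {"index": 117, "atom1_id": "OP2@C.G22", "atom2_id": "O41@C.PAR101",
     "distance": 2.612, "donAcc_type": "acceptable"},
]}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(demo, tmp)
    demo_path = tmp.name

hbonds = load_hbonds(demo_path)
second = hbonds[1]  # zero-based, like jq's .hbonds[1]
print(json.dumps(second, indent=2))
os.remove(demo_path)
```

With a real file, only the path passed to `load_hbonds` changes.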

Thanks.

Best,
Honglue

29
RNA structures (DSSR) / Re: General questions of H-bond section in DSSR
« on: November 11, 2017, 05:54:22 pm »
Hi Xiangjun,

Sorry, I have been busy with other stuff in the lab, but I do remember your question from last time.

----------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------
For your last question:


Quote
{u'index': 31, u'atom2_serNum': 212, u'residue_pair': u'nt:aa', u'distance': 3.09, u'atom_pair': u'N:N', u'atom2_id': u'N@2:B.ALA6', u'donAcc_type': u'standard', u'atom1_id': u'N3@2:A.DG3', u'atom1_serNum': 69}

How did you get the above output for PDB id: 1PFE? Specifically, where does the 'u' before each tag name come from?


Basically, I just load the JSON my own way (with my own parser script) and print out the 'hbonds' section. In my Python, when I load the JSON file (using the json module), strings are loaded as unicode. I think that's why those strings have the 'u' prefix; it is just a Python string encoding issue. Here is more explanation of unicode strings (https://stackoverflow.com/questions/21808657/what-is-a-unicode-string). I also tried your way as you suggested (using jq), but I couldn't make it work. Do I need to install jq on my computer? I installed jq from the website
https://stedolan.github.io/jq/ and put the file in my working folder, then typed:

x3dna-dssr -i=3bnq.pdb --symm --get-hbond --json | jq . hbonds[1]

However, it outputs

Processing file '3bnq.pdb'
jq: error: Could not open file hbonds[1]: No such file or directory


I don't know if I did it the right way.
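
Two notes on the above. First, the jq error follows from the space in `jq . hbonds[1]`: jq takes `.` as the whole filter and then treats `hbonds[1]` as the name of an input file, which is exactly what the error message reports. Second, the `u` prefix only appears when Python 2 prints the repr of a dict containing unicode strings. A minimal sketch of printing the same record without it, using an abbreviated copy of the 1PFE record quoted above (`json.dumps` re-serializes the parsed record as plain JSON text):

```python
import json

# Abbreviated copy of the 1PFE hbond record quoted above.
raw = ('{"index": 31, "atom1_id": "N3@2:A.DG3", "atom2_id": "N@2:B.ALA6", '
       '"distance": 3.09, "donAcc_type": "standard"}')
record = json.loads(raw)

# print(record) shows the dict repr; under Python 2 that repr marks every
# unicode string with a leading u. Serializing back to JSON avoids it.
clean = json.dumps(record, sort_keys=True)
print(clean)
```

The prefix is purely cosmetic: the keys and values are the same either way.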

----------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------
My new questions:

My new target structure is the Mitochondrial Ribosomal Decoding Site (PDB ID: 3BNQ). I downloaded the PDB file (not the biological assembly file) from RCSB. Then I tried to generate the JSON file by typing:

x3dna-dssr -i=3bnq.pdb -o=3bnq.json --json --more --symm

I use my own JSON parser to look for the hydrogen bonds between the RNA and the ligand PAR.

There are three examples with different donAcc_type values here. All the hydrogen bonds mentioned below are labeled in 3bnq.pse: Measure01 is the first example, Measure02 is the second, and Measure03 is the third.

Example 1: Hbond index 117. donAcc_type acceptable.
{u'index': 117, u'atom2_serNum': 1928, u'residue_pair': u'nt:ligand', u'distance': 2.612, u'atom_pair': u'O:O', u'atom2_id': u'O41@C.PAR101', u'donAcc_type': u'acceptable', u'atom1_id': u'OP2@C.G22', u'atom1_serNum': 1426}

This is a hydrogen bond between a hydroxyl group in the ligand PAR and the OP2 atom in rG22.

Example 2: Hbond index 113. donAcc_type standard.
{u'index': 113, u'atom2_serNum': 1937, u'residue_pair': u'nt:ligand', u'distance': 2.63, u'atom_pair': u'O:N', u'atom2_id': u'N32@C.PAR101', u'donAcc_type': u'standard', u'atom1_id': u'OP2@C.C21', u'atom1_serNum': 1406}

This is a hydrogen bond between an amino group in the ligand PAR and the OP2 atom in rC21.

In both cases, the hydrogen bond geometries seem very similar, so why does DSSR assign them different donAcc_type values?

Example 3: Hbond index 107. donAcc_type questionable.
{u'index': 107, u'atom2_serNum': 1367, u'residue_pair': u'nt:nt', u'distance': 3.358, u'atom_pair': u'O:O', u'atom2_id': u"O4'@C.G19", u'donAcc_type': u'questionable', u'atom1_id': u"O4'@C.A17", u'atom1_serNum': 1323}

In this case, DSSR identifies an H-bond between two O4' atoms, but we know that for ribose, the O4' is unlikely to be protonated. Is this the reason why DSSR considers the donAcc_type questionable?
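
For reference, here is a minimal sketch of how the three records above can be pulled out and grouped by donAcc_type. The helper name `group_by_type` is made up for illustration; the records are abbreviated copies of the three examples, and only keys that appear in them are used.

```python
from collections import defaultdict

# Abbreviated copies of the three 3BNQ records discussed above.
hbonds = [
    {"index": 117, "residue_pair": "nt:ligand", "atom_pair": "O:O",
     "atom1_id": "OP2@C.G22", "atom2_id": "O41@C.PAR101",
     "distance": 2.612, "donAcc_type": "acceptable"},
    {"index": 113, "residue_pair": "nt:ligand", "atom_pair": "O:N",
     "atom1_id": "OP2@C.C21", "atom2_id": "N32@C.PAR101",
     "distance": 2.63, "donAcc_type": "standard"},
    {"index": 107, "residue_pair": "nt:nt", "atom_pair": "O:O",
     "atom1_id": "O4'@C.A17", "atom2_id": "O4'@C.G19",
     "distance": 3.358, "donAcc_type": "questionable"},
]

def group_by_type(records, residue_pair=None):
    """Group hbond records by donAcc_type, optionally keeping one residue_pair."""
    groups = defaultdict(list)
    for rec in records:
        if residue_pair is None or rec["residue_pair"] == residue_pair:
            groups[rec["donAcc_type"]].append(rec)
    return dict(groups)

ligand_groups = group_by_type(hbonds, "nt:ligand")
for kind in sorted(ligand_groups):
    for rec in ligand_groups[kind]:
        print(kind, rec["index"], rec["atom1_id"], rec["atom2_id"], rec["distance"])
```

Grouping this way makes it easy to line up records such as 113 and 117, which have nearly the same distance but different types.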

I really appreciate your help.

Best,
Honglue



30
RNA structures (DSSR) / Re: General questions of H-bond section in DSSR
« on: November 10, 2017, 10:35:42 pm »
Hi Xiangjun,

I have follow-up questions about donAcc_type in the H-bond section.

Here I attach the JSON output file of PDB 3BNQ. In the H-bond section, I see there are three donAcc_type values: standard, acceptable and questionable. You have already explained to me what questionable means, but could you please explain the difference between standard and acceptable?

Also, is there any way to tell which atom is the donor and which is the acceptor?

Thank you.

Best,
Honglue

31
RNA structures (DSSR) / General questions of H-bond section in DSSR
« on: November 01, 2017, 11:37:03 am »
Hi Xiangjun,

Sorry I gave a long list of questions yesterday. Here, I just post a few questions about the H-bonds in the DSSR JSON.

For the H-bonds between protein/peptide/ligand and nucleic acid, my target structure is 1PFE, which is a DNA bound to an antibiotic, echinomycin. I downloaded the biological assembly file and used the following command:

x3dna-dssr -i=1PFE.pdb -o=1PFE.json --json --more --symm

In the "hbonds" section of the output JSON file, I did find all the DNA-drug interactions. For example,

{u'index': 31, u'atom2_serNum': 212, u'residue_pair': u'nt:aa', u'distance': 3.09, u'atom_pair': u'N:N', u'atom2_id': u'N@2:B.ALA6', u'donAcc_type': u'standard', u'atom1_id': u'N3@2:A.DG3', u'atom1_serNum': 69}

However, I have a few questions about the hbonds output.

(1) How do I know which atom is the H-bond donor and which is the acceptor? For example, do you always put the acceptor first (atom1)?
(2) If the 'donAcc_type' is questionable, what does it mean? Does it mean that DSSR probably did not guess the valence properly?
(3) What does 'serNum' mean here?

Here, I attached all my files.

Thank you.

Best,
Honglue

32
Hi Xiangjun,

First, I want to apologize for not getting the feedback to you in time, though I carefully benchmarked the calculation as you suggested.

1. H-bond
For the H-bond between protein/peptide/ligand to nucleic acid, my target structure is 1PFE, which is a DNA bound to an antibiotic, echinomycin. I downloaded the biological assembly file and used the following command:

x3dna-dssr -i=1PFE.pdb -o=1PFE.json --json --more --symm

In the "hbonds" section of the output JSON file, I did find all the DNA-drug interactions. For example,

{u'index': 31, u'atom2_serNum': 212, u'residue_pair': u'nt:aa', u'distance': 3.09, u'atom_pair': u'N:N', u'atom2_id': u'N@2:B.ALA6', u'donAcc_type': u'standard', u'atom1_id': u'N3@2:A.DG3', u'atom1_serNum': 69}

I have a few questions about the hbonds output.

(1) How do I know which atom is the H-bond donor and which is the acceptor? For example, do you always put the acceptor first (atom1)?
(2) If the 'donAcc_type' is questionable, what does it mean? Does it mean that DSSR probably did not guess the valence properly?
(3) Generally speaking, for an arbitrary ligand (not peptide-linking or DNA/RNA-linking), how does DSSR guess the valence, i.e. how does it decide which heavy atoms should be bonded to hydrogen?
(4) What does 'serNum' mean here?

I am training a rotation student to use DSSR to parse all the drug-nucleic acid interactions from the entire PDB, so hopefully we will keep reporting DSSR H-bond issues on this same page.

--------------------------------------------------------------------------------

2. metal

I got a bit confused about the metal section in the JSON file.

My target structure is 1HR2. It is a P4P6 domain which has many Mg2+ ions in the coordinates.

I again downloaded the biological assembly file and typed:

x3dna-dssr -i=1HR2.pdb -o=1HR2.json --json --more --symm --metal

In the output JSON file, there is indeed a section called 'metals'. For example,

{u'index': 3, u'ligands_long': u'', u'num_ligands': 0, u'symbol': u'Mg', u'ligands_short': u'', u'id': u'A.MG55'}
{u'index': 4, u'ligands_long': u'A.A248,A.U249,A.G250', u'num_ligands': 3, u'symbol': u'Mg', u'ligands_short': u'AUG', u'id': u'A.MG57'}

In index 3, I assume it means that no residue/ligand interacts with MG55, right? However, if you open the PDB file, don't you think that MG55 is very close to cytosine 255? I guess I might be misunderstanding something, so my questions about metals are:

(1) How does DSSR define interactions involving metals?
(2) More generally, how does DSSR define a metal? For example,

In the structure 1D8X, COBALT HEXAMMINE(III) (NCO), which is a metal complex, is considered a metal in DSSR.
In the structure 3MGV, the VANADATE ION (VO4), which is already a negatively charged ion, is considered a metal in DSSR.

Is there a list of the atoms that DSSR assigns to the metal category?
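
To make the MG55 case easy to spot across many structures, here is a minimal sketch that splits 'metals' records into those with and without listed ligands. The helper name `split_by_coordination` is made up; the two records are the ones quoted above, and the sketch only reads keys that appear in them.

```python
# The two 'metals' records quoted above for 1HR2.
metals = [
    {"index": 3, "id": "A.MG55", "symbol": "Mg", "num_ligands": 0,
     "ligands_long": "", "ligands_short": ""},
    {"index": 4, "id": "A.MG57", "symbol": "Mg", "num_ligands": 3,
     "ligands_long": "A.A248,A.U249,A.G250", "ligands_short": "AUG"},
]

def split_by_coordination(records):
    """Return ({metal_id: [ligand ids]}, [metal ids with no listed ligands])."""
    bound, unbound = {}, []
    for rec in records:
        if rec["num_ligands"] == 0:
            unbound.append(rec["id"])
        else:
            bound[rec["id"]] = rec["ligands_long"].split(",")
    return bound, unbound

bound, unbound = split_by_coordination(metals)
print("with ligands:", bound)
print("without ligands:", unbound)
```

The ids in `unbound` are the ones worth checking by eye against the coordinates, as with MG55 and cytosine 255 here.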

--------------------------------------------------------------------------------

Again, I really appreciate all your help with my research, and I really hope DSSR will keep getting better and better.

Thanks again.

Best,
Honglue

33
Gotcha, I realized that the JSON file actually has what I need in the 'hbonds' section.

Also, is there also a section for the metals in the PDB file? For example, how many metal-RNA interactions are there in the PDB file?

Best,
Honglue

34

For example, in DSSR there is a --get-hbond option. It outputs all the H-bonds between two nucleotides and gives you the distance, donor and acceptor, etc., but I guess that only applies to nucleic acids. There is no way to detect H-bonds between, for example, protein and nucleic acid in the current DSSR, right?

If not, can you suggest any software that can parse the interactions between protein/peptide/ligand and nucleic acids?

Really appreciate your help.

Best,
Honglue

35
RNA structures (DSSR) / Can DSSR detect nucleic acid ligand interaction
« on: October 31, 2017, 03:48:34 pm »
Hi Xiangjun,

I am just curious whether DSSR can detect things like RNA-ligand or RNA-metal interactions in a PDB file.
For example, can it list all the hydrogen bonds between the RNA and a ligand?

Best,
Honglue

36
RNA structures (DSSR) / FRABASE
« on: October 18, 2017, 11:35:30 am »
Hi Xiangjun,

Do you have any idea how DSSR differs from online RNA databases such as FRABASE in terms of secondary structure identification?

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-231

Thanks.

Best,
Honglue

37
RNA structures (DSSR) / Re: Definition of Helix Form
« on: September 15, 2017, 05:03:35 pm »
Hi Xiangjun,

Could you please tell me when you will report the details of the helix form definition in DSSR? My advisor asked me some questions about how DSSR identifies the helix form, so I would like to know more details.

This is not urgent so please take your time.

Thank you so much.

Best,
Honglue

38
RNA structures (DSSR) / Re: Groove width distance in DSSR
« on: September 15, 2017, 05:00:36 pm »
Hi Honglue,

I've finally added groove widths into DSSR JSON output, per your request. Now with --more, you will see the groove_widths key in helices/stems output. The corresponding value is an array with 4 numbers: [minor_groove_width, major_groove_width, minor_groove_width_refined, major_groove_width_refined], as they appear in the 3DNA analyze output. For example, "groove_widths":[12.023,11.942,17.338,17.281] for the 4th dinucleotide step in 355d.

As mentioned in my previous responses, some additional features from the 3DNA analyze program have been implemented into DSSR. They will be revealed later.

Please have a try and report back how it goes. Note I've not updated the release version on the download page yet. However, the download files have been updated.

Best regards,

Xiang-Jun


Hi Xiangjun,

I just want to kindly confirm: the groove_widths output in the JSON file should be [minor_groove_width, minor_groove_width_refined, major_groove_width, major_groove_width_refined], not [minor_groove_width, major_groove_width, minor_groove_width_refined, major_groove_width_refined], right?
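
Until the order is confirmed, one cautious way to consume this array is to keep the field order as an explicit parameter instead of hard-coding it. The sketch below labels the example values for 355d under both candidate orders. It does not settle which order DSSR actually uses, and the names `ORDER_AS_ANNOUNCED`, `ORDER_AS_ASKED` and `label_groove_widths` are made up for illustration.

```python
# Order stated in the announcement quoted above.
ORDER_AS_ANNOUNCED = ("minor_groove_width", "major_groove_width",
                      "minor_groove_width_refined", "major_groove_width_refined")
# Order proposed in the question above.
ORDER_AS_ASKED = ("minor_groove_width", "minor_groove_width_refined",
                  "major_groove_width", "major_groove_width_refined")

def label_groove_widths(values, order):
    """Pair each of the four numbers with a field name, in the given order."""
    if len(values) != len(order):
        raise ValueError("expected %d values, got %d" % (len(order), len(values)))
    return dict(zip(order, values))

step = [12.023, 11.942, 17.338, 17.281]  # example values quoted for 355d
announced = label_groove_widths(step, ORDER_AS_ANNOUNCED)
asked = label_groove_widths(step, ORDER_AS_ASKED)
print("as announced:", announced)
print("as asked:    ", asked)
```

Switching from one interpretation to the other is then a one-argument change once the order is clarified.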

Best,
Honglue

39
RNA structures (DSSR) / Re: Bulge motif
« on: August 21, 2017, 03:06:06 pm »
Here is the secondary structure. However, in the actual PDB structure, U6, U11 and A18 form a base triple. I am not sure whether this base triple affects the secondary structure landscape?


40
RNA structures (DSSR) / Bulge motif
« on: August 21, 2017, 11:27:58 am »
Hi Xiangjun,

Long time no contact. How is everything going?

I am using DSSR to analyze the structure 1FUF to look for bulge motifs.

DSSR tells me there are two bulge motifs in this structure. However, some of the bulged bases contact other bases and form a base triple. I am not sure whether these count as additional bulges or as some other motif?

In general, could you tell me how 3DNA detects secondary structure motifs like bulges or internal loops? Is it just based on dot-bracket notation?

Best,
Honglue

41
Hi Xiangjun,

I have a general question about the definition of gamma torsion angles in 3DNA.

I read some papers about gamma torsion angles, and it seems the angular range they use is from -360 to 360 degrees.

For example, the paper titled "Sequence-specific transitions of the torsion angle gamma change the polar-hydrophobic profile of the DNA grooves" defines:

Values of γ angle were classified according to
classical three-fold pattern into: gauche + (60° ± 30°),
trans (180° ± 30°), gauche – (300° ± 30°) conformations.


However, in 3DNA, the angular range of gamma is from -180 to 180. So I am wondering whether there is any degeneracy in the angular space defined in 3DNA?
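
For what it is worth, the two conventions describe the same circle: angles that differ by 360° are the same torsion, so gauche- at 300° in the paper's convention is -60° in a -180 to 180 range. A minimal sketch of converting between the two ranges and applying the three-fold classification quoted above follows; the function names are made up for illustration.

```python
def to_0_360(angle):
    """Map any angle in degrees onto the range [0, 360)."""
    return angle % 360.0

def to_pm180(angle):
    """Map any angle in degrees onto the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def classify_gamma(angle, tolerance=30.0):
    """Classify gamma using the quoted centers 60, 180 and 300 (+/- 30 degrees)."""
    wrapped = to_0_360(angle)
    for name, center in (("gauche+", 60.0), ("trans", 180.0), ("gauche-", 300.0)):
        if abs(wrapped - center) <= tolerance:
            return name
    return "other"

for gamma in (55.0, -60.0, 300.0, -175.0, 120.0):
    print(gamma, to_0_360(gamma), to_pm180(gamma), classify_gamma(gamma))
```

Both -60.0 and 300.0 come out as gauche-, so nothing is lost by reporting gamma in the -180 to 180 range.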

Best,
Honglue





42
Hi Xiangjun,

I know that when you run x3dna-dssr on an RNA structure, you can output secondary structure motifs like bulges, hairpins, internal loops, etc. as PDB files.
But is there any way to search for and output what we call helix-junction-helix motifs? Here "junction" means either an internal loop or a bulge, and "helix" means a helix with 3 bps.

For example, for 1anr.pdb (HIV-1-TAR), when you run DSSR, there is a dssr-iloops.pdb file.
That dssr-iloops file only has G21 A22 U23 C24 U25 G26 C39 U40 C41.
My question is: is there any way to also output C19 A20 U42 G43 and A27 G28 C37 U38, so that each stem part is 3 bps long? I plan to calculate inter-helical angles for these helix-junction-helix motifs across all the PDB structures. If there is no such option in DSSR, I will have to work it out myself.

Thank you.

Best,
Honglue

43
RNA structures (DSSR) / Re: How to look for abasic site using DSSR
« on: April 26, 2017, 03:23:57 pm »
Does that mean DSSR cannot detect an abasic site, since there are no base atoms?

Thanks.

Best,
Honglue

44
RNA structures (DSSR) / How to look for abasic site using DSSR
« on: April 26, 2017, 12:50:57 pm »
Hi Xiangjun,

Long time no contact. I hope you have been well.

I have a question about DSSR. Can DSSR detect abasic sites?

For example, in 1l2c.pdb and 1l2d.pdb, there is an abasic site HPD18DG7. I thought this would be reported as an internal loop; however, the internal loop section only detects an open lower base pair (DA23DT2).

Best,
Honglue

45
General discussions (Q&As) / Re: nucleic acid sugar structure
« on: February 21, 2017, 06:42:23 pm »
I type:

fiber -b -seq=A ATbp.pdb

46
General discussions (Q&As) / nucleic acid sugar structure
« on: February 21, 2017, 01:13:21 pm »
Hi, xiangjun,

Here is a snapshot of a deoxyribose in an idealized B-form model generated by 3DNA, with some distance and angle measurements.

I am just curious where these bond lengths and bond angles come from. How do you build the idealized model? Is it purely from previous crystal structures or the literature, or do you build it based on your own criteria?

Best,
Honglue

47
General discussions (Q&As) / Swap the dinucleotide step index
« on: February 09, 2017, 05:39:06 pm »
Hi, xiangjun,

I have a general question about the coordinate frame.

For a given dinucleotide step, you can swap the step (5'-CA-3' ==> (3'-AC-5') 5'-TG-3') by renumbering the residue ids, and the signs of shear, buckle, shift, tilt, y-displacement and tip will then be flipped.

Could you kindly explain this to me? Is it because the coordinate frame changes? Does that mean we should be cautious when interpreting the signs of those parameters when we analyze structures?
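
To make the observation concrete, here is a minimal sketch that takes the statement above at face value: when a step is described from the other strand, the six listed parameters change sign and everything else is left alone. The numeric values are invented purely for illustration, and `swap_strand_view` is a made-up helper, not a 3DNA function.

```python
# Parameters reported above as changing sign when the step is swapped.
SIGN_FLIPPED = ("shear", "buckle", "shift", "tilt", "y_displacement", "tip")

def swap_strand_view(params):
    """Return a copy of params with the sign of the listed parameters flipped."""
    return dict((name, -value if name in SIGN_FLIPPED else value)
                for name, value in params.items())

# Invented values, for illustration only (not taken from any structure).
ca_step = {"shift": 0.4, "slide": -0.2, "rise": 3.3,
           "tilt": 1.5, "roll": 4.0, "twist": 35.0}
tg_step = swap_strand_view(ca_step)
print(tg_step)
```

Applying the swap twice returns the original values, which is a handy sanity check when pooling CA and TG steps in one analysis.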

Best,
Honglue

48
RNA structures (DSSR) / Re: Groove width distance in DSSR
« on: February 08, 2017, 12:26:00 pm »
I see.

Could you kindly explain to me the definition of helix-form in DSSR? I think you said it is undocumented right?

Thanks!

Best,
Honglue

49
RNA structures (DSSR) / Re: Groove width distance in DSSR
« on: February 08, 2017, 12:16:12 pm »
Hi, xiangjun,

Yes, it works very well! Thank you so much.

I assume you are also adding more details to the JSON output files, right? Would it also be OK to add the helix form (A, B, Z) to the stems and helices sections in the JSON file?

Best,
Honglue

50
RNA structures (DSSR) / Definition of Helix Form
« on: February 07, 2017, 10:44:31 pm »
Hi, xiangjun,

How is everything going?

In the DSSR helices section, there is a helix-form part that uses 'A', 'B' or 'Z' for the common A-, B- and Z-form helices, '.' for an unclassified step, and 'x' for a step without a continuous backbone.

My question is:
How do you identify the helix form? Which parameters do you use to define the 'A', 'B' and 'Z' forms?
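
As a side note on reading that helix-form string, here is a minimal sketch that tallies the per-step codes using the meanings listed above. The example string is hypothetical, and how the string is extracted from the DSSR output is left out.

```python
from collections import Counter

# Meanings of the per-step codes, as described above.
FORM_CODES = {"A": "A-form", "B": "B-form", "Z": "Z-form",
              ".": "unclassified", "x": "backbone break"}

def tally_helix_form(form_string):
    """Count how many dinucleotide steps fall into each helix-form class."""
    counts = Counter(form_string)
    unknown = set(counts) - set(FORM_CODES)
    if unknown:
        raise ValueError("unexpected helix-form code(s): %s" % sorted(unknown))
    return dict((FORM_CODES[code], n) for code, n in counts.items())

example = "AAA.BBx"  # hypothetical helix-form string, one character per step
tally = tally_helix_form(example)
print(tally)
```

A tally like this gives a quick per-helix summary before looking at how each step was classified.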

Best,
Honglue


Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University