Messages - xiangjun

16
Thanks for your feedback.

Quote
The data are unpublished, so if you could provide me a way of sharing it in a non-public way (e.g. email address), I'm able to provide a native trajectory.

I just need a sample dataset for checking purposes. It could come from a public data repository or be a deliberately revised version of your unpublished data. You could send such a minimal dataset (~1K frames) to my 3dna.lu gmail account. Of course, I will keep the data for internal testing purposes only and will not share it with anyone else.

Quote
Just as a note: I continued the DSSR analysis since yesterday and frame 14762 is currently loaded, so I do agree to the memory issue.

I've checked via valgrind that the slowness of the DSSR --nmr option on a large dataset is not due to a memory issue: valgrind reports "no leaks are possible". Otherwise, the program would eventually run out of memory. I now have a general idea of the issue and a possible fix.

Best regards,

Xiang-Jun



17
Hi Marcel,

Thanks for your follow-up.

Code:
do_x3dna -f ../mod.pdb -s ../mod.pdb -o test -hbond
and extract the dihedral information from "BackBoneCHiDihedrals_g.dat"

Could you dig further to see which 3DNA command is being called by the above do_x3dna run? What is the -hbond option? What does the do_x3dna output look like? Do you only need backbone torsion angles?

Quote
Based on my impression, DSSR and do_x3dna are approx. similar in analyzing the first ~1000 structures and DSSR gets slower and slower with every following structure.

This is an important piece of information. I'll check whether the slower performance after the first 1K structures (as you noticed) is related to increased memory allocation. In principle, DSSR should run each model at roughly constant speed.

Quote
As a workaround, I could cut my trajectory into 1000 structure fragments and analyze them independently, but I think the performance issue is important especially for the MD community to work with DSSR.

As noted above, I'll surely look into this issue. Would it be possible for you to provide me with a sample MD trajectory file?
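For reference, the chunking workaround mentioned in the quote could be sketched roughly as below. This is only an illustration, assuming the trajectory is a single multi-model PDB file delimited by MODEL/ENDMDL records; the function names and the 'chunk_NNNN.pdb' naming scheme are hypothetical and not part of 3DNA/DSSR.

```python
from typing import Iterable, Iterator, List


def iter_models(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each MODEL ... ENDMDL block as a list of lines."""
    block: List[str] = []
    inside = False
    for line in lines:
        if line.startswith("MODEL"):
            inside, block = True, [line]
        elif line.startswith("ENDMDL") and inside:
            block.append(line)
            yield block
            inside = False
        elif inside:
            block.append(line)


def chunk_models(lines: Iterable[str], size: int = 1000) -> Iterator[List[List[str]]]:
    """Group consecutive models into chunks of at most `size` models."""
    chunk: List[List[str]] = []
    for model in iter_models(lines):
        chunk.append(model)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # last, possibly shorter, chunk
        yield chunk


def write_chunks(path: str, prefix: str = "chunk", size: int = 1000) -> List[str]:
    """Write each chunk to '<prefix>_NNNN.pdb' (hypothetical naming) and return the names."""
    names = []
    with open(path) as handle:
        for index, chunk in enumerate(chunk_models(handle, size), start=1):
            name = f"{prefix}_{index:04d}.pdb"
            with open(name, "w") as out:
                for model in chunk:
                    out.writelines(model)
            names.append(name)
    return names
```

Each resulting file could then be analyzed independently. Records outside MODEL/ENDMDL blocks (e.g. header lines) are dropped by this sketch.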

Best regards,

Xiang-Jun

18
Hi Marcel,

Thanks for using DSSR and for your feedback on the Forum.

Your first question concerns the relative speed of DSSR vs. do_x3dna for calculating backbone parameters. In principle, DSSR would be slower than the 3DNA 'analyze' program since DSSR calculates many more (housekeeping) features in the 'background'. However, the difference you noticed (2 hours vs. 10 days) for calculating backbone parameters is well beyond my expectation. To investigate this issue further, could you elaborate on how you calculated the backbone parameters using do_x3dna? Does do_x3dna call 'analyze' with the '-t' option (for torsion angles)?

For your second question, what version of DSSR were you using? This bug should have been fixed in later releases of DSSR, as shown below:

Code:
Processing file 'frame1630.pdb'
  X.U.7               0.121
  X.U.42              0.143
    total number of nucleotides: 44
    total number of base pairs: 20
    total number of helices: 1
    total number of stems: 3
    total number of internal loops: 2

Also, note that you do not need to specify the --abasic option anymore. This feature is taken into consideration by default in recent releases of DSSR.

Best regards,

Xiang-Jun

19
Hi Shuxiang,

You've touched on a subtle point in the labeling of residues (nucleotides) in mmCIF vs. PDB. Taking PDB entry 3mgp (the one you used) as an example, excerpts from the corresponding PDB and mmCIF files (both downloaded from RCSB PDB) for I.DA.-73 are shown below:

# PDB format
ATOM   6169  O5'  DA I -73       2.638   0.163  93.308  1.00166.52           O 
ATOM   6170  C5'  DA I -73       3.279   0.178  94.579  1.00166.78           C 
ATOM   6171  C4'  DA I -73       3.645  -1.223  95.042  1.00167.01           C 
ATOM   6172  O4'  DA I -73       2.489  -2.096  95.012  1.00167.37           O 
ATOM   6173  C3'  DA I -73       4.650  -1.969  94.180  1.00166.94           C 
ATOM   6174  O3'  DA I -73       5.972  -1.523  94.462  1.00166.58           O 
ATOM   6175  C2'  DA I -73       4.428  -3.410  94.635  1.00167.20           C 
ATOM   6176  C1'  DA I -73       2.941  -3.442  94.998  1.00167.53           C 
ATOM   6177  N9   DA I -73       2.097  -4.257  94.106  1.00167.70           N 
ATOM   6178  C8   DA I -73       0.995  -3.832  93.410  1.00167.66           C 

# mmCIF format
ATOM   6161  O  "O5'" . DA  I  5 1   ? 2.638   0.163   93.308 1.00 166.52 ? -73  DA  I "O5'" 1
ATOM   6162  C  "C5'" . DA  I  5 1   ? 3.279   0.178   94.579 1.00 166.78 ? -73  DA  I "C5'" 1
ATOM   6163  C  "C4'" . DA  I  5 1   ? 3.645   -1.223  95.042 1.00 167.01 ? -73  DA  I "C4'" 1
ATOM   6164  O  "O4'" . DA  I  5 1   ? 2.489   -2.096  95.012 1.00 167.37 ? -73  DA  I "O4'" 1
ATOM   6165  C  "C3'" . DA  I  5 1   ? 4.650   -1.969  94.180 1.00 166.94 ? -73  DA  I "C3'" 1
ATOM   6166  O  "O3'" . DA  I  5 1   ? 5.972   -1.523  94.462 1.00 166.58 ? -73  DA  I "O3'" 1
ATOM   6167  C  "C2'" . DA  I  5 1   ? 4.428   -3.410  94.635 1.00 167.20 ? -73  DA  I "C2'" 1
ATOM   6168  C  "C1'" . DA  I  5 1   ? 2.941   -3.442  94.998 1.00 167.53 ? -73  DA  I "C1'" 1
ATOM   6169  N  N9    . DA  I  5 1   ? 2.097   -4.257  94.106 1.00 167.70 ? -73  DA  I N9    1
ATOM   6170  C  C8    . DA  I  5 1   ? 0.995   -3.832  93.410 1.00 167.66 ? -73  DA  I C8    1

As noted in the mmCIF header, the sequence number "-73" matches "_atom_site.auth_seq_id", and the number "1" matches "_atom_site.label_seq_id". Since the corresponding PDB entry uses _atom_site.auth_seq_id (-73), DSSR follows that convention.

DSSR currently has no option to use the "_atom_site.label_seq_id" numbering when "_atom_site.auth_seq_id" is present.
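To make the two numbering schemes concrete, here is a minimal Python sketch that pulls the residue number out of one line of each excerpt above. The column order of the _atom_site loop is an assumption (the order typically found in RCSB files, consistent with the excerpt); a real parser should read the loop header instead. The function names are hypothetical and have nothing to do with DSSR internals.

```python
import shlex

# ASSUMPTION: column order of the _atom_site loop, as in typical RCSB mmCIF files.
ATOM_SITE_COLUMNS = [
    "group_PDB", "id", "type_symbol", "label_atom_id", "label_alt_id",
    "label_comp_id", "label_asym_id", "label_entity_id", "label_seq_id",
    "pdbx_PDB_ins_code", "Cartn_x", "Cartn_y", "Cartn_z", "occupancy",
    "B_iso_or_equiv", "pdbx_formal_charge", "auth_seq_id", "auth_comp_id",
    "auth_asym_id", "auth_atom_id", "pdbx_PDB_model_num",
]


def parse_atom_site(line):
    """Split one mmCIF atom_site row into a {column: value} dict."""
    values = shlex.split(line)  # handles double-quoted names such as "O5'"
    if len(values) != len(ATOM_SITE_COLUMNS):
        raise ValueError("unexpected number of atom_site fields")
    return dict(zip(ATOM_SITE_COLUMNS, values))


def parse_pdb_atom(line):
    """Extract residue name, chain and number from a fixed-column PDB ATOM record."""
    return {
        "resName": line[17:20].strip(),  # columns 18-20
        "chainID": line[21],             # column 22
        "resSeq": line[22:26].strip(),   # columns 23-26
    }


CIF_LINE = '''ATOM   6161  O  "O5'" . DA  I  5 1   ? 2.638   0.163   93.308 1.00 166.52 ? -73  DA  I "O5'" 1'''
PDB_LINE = "ATOM   6169  O5'  DA I -73       2.638   0.163  93.308  1.00166.52           O"

cif = parse_atom_site(CIF_LINE)
pdb = parse_pdb_atom(PDB_LINE)
print("mmCIF auth_seq_id :", cif["auth_seq_id"])
print("mmCIF label_seq_id:", cif["label_seq_id"])
print("PDB resSeq        :", pdb["resSeq"])
```

The PDB residue number agrees with auth_seq_id rather than label_seq_id, which is the convention described above.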

Best regards,

Xiang-Jun

20
Thanks for the encouragement to write a paper on SNAP -- it's been on my to-do list for a while, but has been delayed for various reasons. Overall, I take it as a positive thing that a method paper is written after the corresponding program has been in active use for a while. My goal here is not to write a paper but to solve a set of related problems so that the community can build upon my work.

You are right in that pseudo-pairing/stacking interactions are between planar moieties in proteins and the standard base reference frame. The planar moieties include the amino-acids { "arg", "phe", "tyr", "trp", "his", "asn", "asp", "gln", "glu" } and the peptide bond. A reference frame is defined for each of them. The pseudo-pairing/stacking interactions of these planar moieties with nucleobases are identified and quantified using exactly the same algorithms as in 3DNA/DSSR. In addition to the pair-wise interactions, 'multiplets' and stacks (as in DSSR) involving both amino-acids and bases will be reported in future releases of SNAP.

HTH,

Xiang-Jun

21
General discussions (Q&As) / Re: x3dna-v2.3 no backbone ribbon
« on: July 31, 2018, 03:02:10 pm »
Quote
I did find a relatively easy "workaround" though with pymol-- If I load the file "pmiview1" and have pymol create a cartoon with it, it overlays nicely with the r3d file.

That's true. That's also the trick the DSSR --blocview option depends on. When 3DNA was initially released around the beginning of the century, MolScript/Raster3D (and RasMol) were very popular. Nowadays, these earlier programs are virtually gone and PyMOL (among others) has become the dominant player. DSSR takes advantage of what PyMOL has to offer, which greatly simplifies the creation of the characteristic cartoon-block images.

Best regards,

Xiang-Jun


 

22
Bug reports / MOVED: x3dna-v2.3 no backbone ribbon
« on: July 31, 2018, 12:54:39 pm »

23
General discussions (Q&As) / Re: x3dna-v2.3 no backbone ribbon
« on: July 31, 2018, 12:42:59 pm »
Are you sure you have MolScript installed and in your path? The 3DNA v2.x version of 'blocview' relies on MolScript for the backbone ribbon, be it for nucleic acids or proteins.

You may well consider switching to the DSSR --blocview option which relies ONLY on PyMOL.

Best regards,

Xiang-Jun

24
FAQs / Re: How to set up 3DNA on Windows
« on: July 30, 2018, 11:37:13 pm »
Hi,

You do not need ConEmu to run 3DNA, even though ConEmu is a much better command-line interface than the native cmd.exe on Windows. Moreover, ConEmu has nothing to do with importing PDB files.

Please follow the steps and show us where you got stuck, preferably with screenshots.

Best regards,

Xiang-Jun

25
It should be possible. The RNAstructure or ViennaRNA package may already have a suitable utility, or you may need to write some code to parse the DBN to get what you want. As mentioned before, DSSR does not have such functionality.

Best regards,

Xiang-Jun

26
Hi,

It is not completely clear to me what you want to achieve, based on the information provided. However, DSSR is unlikely to be the tool you are looking for since DSSR takes a 3D structure as its starting point.

If you already have the secondary structure in dot-bracket notation (DBN), I'd assume the nucleotides in matched () are involved in helices. Right?

Xiang-Jun

27
RNA structures (DSSR) / Re: List only base pairs in output
« on: July 26, 2018, 01:32:12 pm »
Try:

x3dna-dssr -i=1ehz.pdb --json | jq -c '.pairs[] | [.bp, .name]'

You would get:

Code:
["G-C","WC"]
["C-G","WC"]
["G-C","WC"]
["G-U","Wobble"]
["A-U","WC"]
["U-A","WC"]
["U-A","WC"]
["U-A","rHoogsteen"]
["U+A","--"]
["A+A","--"]
["g-C","WC"]
["g+G","--"]
["C-G","WC"]
["U-A","WC"]
["C-G","WC"]
["G+C","rWC"]
["u+U","--"]
["G+P","--"]
["G-C","WC"]
["G-g","--"]
["g-A","Imino"]
["C-G","WC"]
["C-G","WC"]
["A-U","WC"]
["G-c","WC"]
["A-P","--"]
["c-A","--"]
["U-A","--"]
["c-G","WC"]
["U-A","WC"]
["G-C","WC"]
["U-A","WC"]
["G-C","WC"]
["t-a","rHoogsteen"]

The jq utility program can be found on https://stedolan.github.io/jq/.
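If jq is not available, the same filtering could be done with a few lines of Python. The sketch below assumes only what the jq filter above implies: a top-level "pairs" array whose entries have "bp" and "name" keys. The sample JSON is a hand-made fragment for illustration (real DSSR output contains many more fields), and the function name is hypothetical.

```python
import json
from collections import Counter


def list_pairs(dssr_json_text):
    """Return [bp, name] for every entry of the top-level 'pairs' array."""
    data = json.loads(dssr_json_text)
    return [[pair["bp"], pair["name"]] for pair in data.get("pairs", [])]


# Hand-made sample with only the two keys used here (NOT real DSSR output).
SAMPLE = json.dumps({
    "pairs": [
        {"bp": "G-C", "name": "WC"},
        {"bp": "G-U", "name": "Wobble"},
        {"bp": "U+A", "name": "--"},
    ]
})

pairs = list_pairs(SAMPLE)
for pair in pairs:
    # Compact separators mimic the one-line-per-pair style of 'jq -c'.
    print(json.dumps(pair, separators=(",", ":")))

# Optional: tally the pair names.
counts = Counter(name for _, name in pairs)
```

In practice, the text passed to list_pairs would be the output of the x3dna-dssr --json run shown above, e.g. saved to a file first.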

HTH,

Xiang-Jun

28
Thanks for such a well-phrased question! Yes, 3DNA/DSSR has some utilities that may help in achieving your goal.

Quote
4JVH.pdb (QKI) http://www.rcsb.org/structure/4JVH has the RNA ACUAACAA and I need to trim that to _ACUAAC because that lines up the best with YNCURAY.  The underscore needs to be the C or T and to do that I need 3DNA to add the C or T to the 5' end of the ACUAAC.

I assume the conformation of the 5' C or T to be added is not defined, i.e. it can be any (reasonable) starting geometry. You could use frame_mol to set the ACUAAC fragment from 4JVH in the base reference frame of the first A. You can then choose any (or your favorite) CA (or TA) dinucleotide from the PDB, and use the same frame_mol utility to reset it to the base reference frame of the ending A. Since the ACUAAC fragment and the CA fragment now share the same A base reference frame, you can overlay them. With some manual editing, you should be able to get an approximate starting PDB structure with the required RNA base sequence for your MD simulations.

Have a try and let me know if it works.

Xiang-Jun

29
Sorry for being late in responding to your follow-up post on June 22, 2018. I was on a trip to China for the whole month of June and slightly beyond, and I was constrained by my schedule and limited internet access.

Now, back to your question: I am confused by what you mean by the following quantitative measures:

Quote
trying to classify structures as A ( 75-100 percent = A , 57/58- 75 percent = A-like , 43% - 58% = A to B)...
About 63 percent B so, B- Like...
9 percent A..?

The classification of a DNA dinucleotide step as A-, B- or TA-type is (mostly) based on Zp. See the 2000 A-DNA motif paper in JMB, and the 3DNA source code (ana_fncs.c) for further details.

Best regards,

Xiang-Jun

30
Thanks, Mauricio, for chiming in and for your nice words about my early publications.

Quote from: Puru
I was wondering if there is a script , another way to classify the structures that I have compiled by A, B, AB, TA- type( which is done and seen on the forum)?

Could you please clarify your question, preferably with a concrete example?

Xiang-Jun


Created and maintained by Dr. Xiang-Jun Lu [律祥俊], Principal Investigator of the NIH grant R01GM096889
Dr. Lu is currently affiliated with the Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.