3DNA Forum

Welcome => Feature requests => Topic started by: auffinger on August 04, 2012, 02:31:03 pm

Title: chain continuation character in analyze
Post by: auffinger on August 04, 2012, 02:31:03 pm
Hi Xiang-Jun,

I just went through a few options.
In the analyze file, you insert the '-' character when a strand is not broken and 'x' when it's broken.
Then a new chain starts. Maybe you could add a third '+' character for these residues?
This is a really minor point.

Also, for the same strand P...P and C1'...C1' distances, could you add two decimals instead of one?
This could be convenient for some applications.

Best,

Pascal
Title: Re: chain continuation character in analyze
Post by: xiangjun on August 04, 2012, 03:01:52 pm
Quote
In the analyze file, you insert the '-' character when a strand is not broken and 'x' when it's broken.
Then a new chain starts. Maybe you could add a third '+' character for these residues?
Could you provide an example?

Quote
Also, for the same strand P...P and C1'...C1' distances, could you add two decimals instead of one?
Done -- the distribution will be updated after clarification of the above point.

Xiang-Jun
Title: Re: chain continuation character in analyze
Post by: auffinger on August 05, 2012, 01:06:10 pm
Well, normally a new chain starts after an 'x' character, so this is easy. Yet putting a '+' instead of a '-' at these positions seems more informative to me, although it looks like a detail. A '+' should then also appear for the first residue of a structure. Hence, you would have a specific marker for all 3' and 5' residues.
Of course, an isolated residue can be found, and you would have to choose between '+' and 'x' (I suggest '+').

1   (0.012) ....>A:...1_:[..C]C-          <-- a '+' here instead of a '-'
2   (0.019) ....>A:...2_:[.DC]C-
3   (0.038) ....>A:...3_:[.DG]G-
4   (0.030) ....>A:...4_:[.DG]G-
5   (0.010) ....>A:...5_:[.DC]C-
6   (0.043) ....>A:...6_:[.DG]G-
7   (0.012) ....>A:...7_:[.DC]C-
8   (0.013) ....>A:...8_:[.DC]C-
9   (0.027) ....>A:...9_:[.DG]G-
10   (0.027) ....>A:..10_:[..G]Gx
11   (0.015) ....>B:..11_:[..C]C-       <-- a '+' here
12   (0.009) ....>B:..12_:[.DC]C-

Pascal
Title: Re: chain continuation character in analyze
Post by: xiangjun on August 06, 2012, 01:04:12 pm
Please download the updated aug06 release of 3DNA v2.1beta. Now you can specify helix begin/continuation/end and isolated bp characters through option -chain_markers. For example,

Code:
`find_pair 1egk.pdb stdout    #  default as before`
`find_pair -chain_markers='o|x+' 1egk.pdb stdout    #  with helix beginning character assigned to 'o'`
The same option can also be applied to 'analyze'.
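
To make the four-character convention concrete, here is a minimal Python sketch. It is not 3DNA's code. It assumes the marker string is ordered begin, continuation, end, isolated (as the description and the 'o|x+' example suggest), and that the input is simply a list of lengths of unbroken segments; `assign_markers` is a hypothetical helper name.

```python
def assign_markers(segment_lengths, markers="o|x+"):
    """Return one marker character per residue/bp.

    segment_lengths: lengths of consecutive unbroken segments (assumed input).
    markers: four characters, assumed order begin/continuation/end/isolated.
    """
    if len(markers) != 4:
        raise ValueError("expected exactly four marker characters")
    begin, cont, end, isolated = markers

    out = []
    for n in segment_lengths:
        if n < 1:
            raise ValueError("segment lengths must be positive")
        if n == 1:
            # A segment of length one gets the 'isolated' character.
            out.append(isolated)
        else:
            out.append(begin)
            out.append(cont * (n - 2))
            out.append(end)
    return "".join(out)


if __name__ == "__main__":
    # Three segments: length 3, an isolated one, and length 2.
    print(assign_markers([3, 1, 2], "+|x*"))
```

Passing the lengths in, rather than coordinates, keeps the sketch independent of how breaks are actually detected.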

Also, as noted previously, the same strand P...P and C1'...C1' distances are now output in two decimals.

Have a try and report back how it goes.

Xiang-Jun
Title: Re: chain continuation character in analyze
Post by: auffinger on August 08, 2012, 06:19:18 am
Thanks Xiang-Jun,

Works fine when no options for find_pair are used. Yet, I am using the following options
find_pair -s -pdbv3 -attach=off  -chain_markers='o|x+' ...
and would like to see these markers in the output of analyze

analyze -pdbv3 file.fp out

that is, in lines like

...
3   (0.038) ....>A:...3_:[.DG]G-
4   (0.030) ....>A:...4_:[.DG]G-
5   (0.010) ....>A:...5_:[.DC]C-
6   (0.043) ....>A:...6_:[.DG]G-
...

Sorry if I didn't explain myself well on this point.

Best,

Pascal
Title: Re: chain continuation character in analyze
Post by: xiangjun on August 08, 2012, 09:38:32 am
The -chain_markers option works only for duplexes (default for find_pair), not yet with the -s option for single-stranded (ss) structures. 3DNA analyze checks the O3'(i) to P(i+1) distance with a cut-off of 2.5 Å for chain breaks; only two characters are used: 'x' for a break, and '-' for a covalent bond. I am a bit hesitant to make the current settings more complicated; you are the first 3DNA user to notice such a small detail at all. Do you have a solid use case to convince me?
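
The distance check described above can be sketched in Python as follows. This is only an illustration, not the 3DNA implementation: the function name `link_markers`, the input layout (one O3' and one P coordinate per residue), the use of `<=` at the cut-off, and treating a missing atom as a break are all assumptions.

```python
import math

CUTOFF = 2.5  # Å, the O3'(i)...P(i+1) cut-off quoted in the post


def link_markers(o3_atoms, p_atoms, cutoff=CUTOFF):
    """Return one marker per link between residue i and residue i+1.

    o3_atoms[i] / p_atoms[i]: (x, y, z) of O3' / P in residue i, or None
    if the atom is missing (assumed here to count as a break).
    '-' means covalently linked, 'x' means chain break.
    """
    markers = []
    for o3, p_next in zip(o3_atoms[:-1], p_atoms[1:]):
        if o3 is None or p_next is None:
            markers.append("x")
        elif math.dist(o3, p_next) <= cutoff:
            markers.append("-")
        else:
            markers.append("x")
    return "".join(markers)


if __name__ == "__main__":
    o3 = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
    p = [None, (1.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
    print(link_markers(o3, p))  # first link is bonded, second is a break
```

The sketch returns n-1 characters for n residues, so it makes no claim about which character the very last residue should carry.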

In current 3DNA v2.1beta, no need to set -pdbv3; it's the default.

Xiang-Jun
Title: Re: chain continuation character in analyze
Post by: auffinger on August 08, 2012, 11:52:50 am
Well, if it's a lot of work, I understand.
For me, since I process all PDB files automatically, I thought it would be simple to change it for analyze with the -s option (I thought it is basically the same as with no option).
It would facilitate my work, but I can (and maybe will have to) use workarounds.
Let me know if you can do it.

Is the -pdbv3 option also the default for O1P_O2P?

Best,

Pascal
Title: Re: chain continuation character in analyze
Post by: xiangjun on August 08, 2012, 12:00:46 pm
I will think about the issue, and try to find a consistent way to handle find_pair in default and with the -s option, and streamline the output style between find_pair and analyze. I will post back in this thread once it is done.

The -pdbv3 is globally set to TRUE, so it also applies to o1p_o2p.

HTH

Xiang-Jun
Title: Re: chain continuation character in analyze
Post by: xiangjun on August 09, 2012, 07:43:01 am
Hi Pascal,

I've implemented the -chain_markers option to analyze as of 3DNA v2.1beta dated 2012aug09. As an example, run the following commands,

Code:
`find_pair -s 1egk.pdb stdout | analyze -chain_markers='+|x*' stdin`
You will see the portion below in the output file '1egk.outs':

   1   (0.013) ....>A:...1_:[..A]A     +
   2   (0.020) ....>A:...2_:[..G]G     |
   3   (0.019) ....>A:...3_:[..G]G     |
   4   (0.014) ....>A:...4_:[..A]A     |
   5   (0.014) ....>A:...5_:[..G]G     |
   6   (0.016) ....>A:...6_:[..A]A     |
   7   (0.020) ....>A:...7_:[..G]G     |
   8   (0.015) ....>A:...8_:[..A]A     |
   9   (0.028) ....>A:...9_:[..G]G     |
  10   (0.015) ....>A:..10_:[..A]A     |
  11   (0.015) ....>A:..11_:[..U]U     |
  12   (0.022) ....>A:..12_:[..G]G     |
  13   (0.015) ....>A:..13_:[..G]G     |
  14   (0.021) ....>A:..14_:[..G]G     |
  15   (0.025) ....>A:..15_:[..U]U     |
  16   (0.016) ....>A:..16_:[..G]G     |
  17   (0.016) ....>A:..17_:[..C]C     |
  18   (0.016) ....>A:..18_:[..G]G     |
  19   (0.012) ....>A:..19_:[..A]A     |
  20   (0.017) ....>A:..20_:[..G]G     x
  21   (0.010) ....>B:..21_:[..C]C     +
  22   (0.018) ....>B:..22_:[..T]T     |
  23   (0.007) ....>B:..23_:[..C]C     |
  24   (0.016) ....>B:..24_:[..G]G     |
  25   (0.011) ....>B:..25_:[..C]C     |
  26   (0.013) ....>B:..26_:[..A]A     |
  27   (0.011) ....>B:..27_:[..C]C     |
  28   (0.006) ....>B:..28_:[..C]C     |
  29   (0.010) ....>B:..29_:[..C]C     |
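
For anyone processing such output automatically, a small Python sketch for pulling the marker out of each residue record might look like this. The record layout is inferred only from the listing in this thread, so the regular expression is an assumption and may need adjusting for other structures; `Residue`, `parse_residues` and `chain_starts` are hypothetical names, not part of 3DNA.

```python
import re
from typing import NamedTuple


class Residue(NamedTuple):
    serial: int   # running index in the listing
    chain: str    # chain identifier
    resnum: int   # residue number
    name: str     # residue name with the '.' padding removed
    marker: str   # trailing chain marker character


# Layout assumed from the thread: serial, (value), ....>chain:resnum+icode:[name]base marker
_RECORD = re.compile(
    r"(\d+)\s+\(\d+\.\d+\)\s+\.+>(\S):\.*(-?\d+)\S:\[\.*([^\]]+)\]\S\s*(\S)"
)


def parse_residues(text):
    """Extract every residue record found in text, one per line or not."""
    return [
        Residue(int(m[1]), m[2], int(m[3]), m[4], m[5])
        for m in _RECORD.finditer(text)
    ]


def chain_starts(residues, begin="+"):
    """Residues carrying the 'begin' marker (here '+', as in the example run)."""
    return [r for r in residues if r.marker == begin]


if __name__ == "__main__":
    sample = """
       1   (0.013) ....>A:...1_:[..A]A     +
      20   (0.017) ....>A:..20_:[..G]G     x
      21   (0.010) ....>B:..21_:[..C]C     +
    """
    for res in chain_starts(parse_residues(sample)):
        print(res.chain, res.resnum, res.name)
```

Because it uses `finditer` rather than splitting on newlines, the same function also copes with records that ended up on a single line.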

As always, check it out and report back if that fits the bill.

Xiang-Jun
Title: Re: chain continuation character in analyze
Post by: auffinger on August 09, 2012, 10:36:59 am
Hi Xiang-Jun,

Thanks, it works quite well, and it's nice to have '+' for isolated nucleotides on top of it.
I hope it will be useful to others besides me.

Best,

Pascal

Created and maintained by Dr. Xiang-Jun Lu [律祥俊] (xiangjun@x3dna.org)
The Bussemaker Laboratory at the Department of Biological Sciences, Columbia University.