
Welcome => Feature requests => Topic started by: auffinger on August 04, 2012, 02:31:03 pm

Title: chain continuation character in analyze
Post by: auffinger on August 04, 2012, 02:31:03 pm

Hi Xiang-Jun,

I just went through a few options.
In the analyze file, you insert the '-' character when a strand is not broken and 'x' when it's broken.
Then a new chain starts. Maybe you could add a third '+' character for these residues?
This is a really minor point.

Also, for the same strand P...P and C1'...C1' distances, could you add two decimals instead of one?
This could be convenient for some applications.

Best,

Pascal

Title: Re: chain continuation character in analyze
Post by: xiangjun on August 04, 2012, 03:01:52 pm

Quote

In the analyze file, you insert the '-' character when a strand is not broken and 'x' when it's broken.
Then a new chain starts. Maybe you could add a third '+' character for these residues?

Could you provide an example?

Quote

Also, for the same strand P...P and C1'...C1' distances, could you add two decimals instead of one?

Done -- the distribution will be updated after clarification of the above point.

Xiang-Jun

Title: Re: chain continuation character in analyze
Post by: auffinger on August 05, 2012, 01:06:10 pm

Well, normally a new chain starts after an 'x' character, so this is easy. Yet putting a '+' instead of a '-' at these positions seems more informative to me, although it looks like a detail. A '+' should then also appear for the first residue of a structure. Hence, you would have a specific marker for all 3' and 5' residues.
Of course, an isolated residue can be found, and you would have to choose between '+' and 'x' (I suggest '+').

1 (0.012) ....>A:...1_:[..C]C- <-- a '+' here instead of a '-'
2 (0.019) ....>A:...2_:[.DC]C-
3 (0.038) ....>A:...3_:[.DG]G-
4 (0.030) ....>A:...4_:[.DG]G-
5 (0.010) ....>A:...5_:[.DC]C-
6 (0.043) ....>A:...6_:[.DG]G-
7 (0.012) ....>A:...7_:[.DC]C-
8 (0.013) ....>A:...8_:[.DC]C-
9 (0.027) ....>A:...9_:[.DG]G-
10 (0.027) ....>A:..10_:[..G]Gx
11 (0.015) ....>B:..11_:[..C]C- <-- a '+' here
12 (0.009) ....>B:..12_:[.DC]C-

Pascal
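
As a rough illustration of the proposal above, the relabeling can be sketched in Python as a post-processing step on the listing. This is only a sketch under stated assumptions, not how 3DNA implements anything: the function name `mark_strand_starts` is hypothetical, and it assumes each line ends with the marker character ('-' or 'x') exactly as in the listing, without trailing annotations.

```python
def mark_strand_starts(lines, start="+"):
    """Put `start` on every residue that begins a strand (hypothetical helper).

    Assumes each non-blank line ends with '-' (bonded to the next residue)
    or 'x' (break after this residue), as in the listing above.
    """
    out = []
    at_start = True  # the first residue of the structure begins a strand
    for line in lines:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines
        marker = line[-1]  # original marker, read before any replacement
        if at_start:
            # An isolated residue ('x' right at a strand start) also gets
            # `start`, following the suggestion in the post.
            line = line[:-1] + start
        out.append(line)
        at_start = (marker == "x")  # a break means the next residue starts a strand
    return out


sample = [
    "9 (0.027) ....>A:...9_:[.DG]G-",
    "10 (0.027) ....>A:..10_:[..G]Gx",
    "11 (0.015) ....>B:..11_:[..C]C-",
    "12 (0.009) ....>B:..12_:[.DC]C-",
]
for relabeled in mark_strand_starts(sample):
    print(relabeled)
```

Because the sample starts at residue 9, that residue is treated as the first of the structure and receives the start marker as well.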

Title: Re: chain continuation character in analyze
Post by: xiangjun on August 06, 2012, 01:04:12 pm

Please download the updated aug06 release of 3DNA v2.1beta. Now you can specify helix begin/continuation/end and isolated bp characters through option -chain_markers. For example,

Code:

find_pair 1egk.pdb stdout
    #  default as before
find_pair -chain_markers='o|x+' 1egk.pdb stdout
    #  with helix beginning character assigned to 'o'

The same option can also be applied to 'analyze'.

Also, as noted previously, the same strand P...P and C1'...C1' distances are now output with two decimal places.

Have a try and report back how it goes.

Xiang-Jun

Title: Re: chain continuation character in analyze
Post by: auffinger on August 08, 2012, 06:19:18 am

Thanks Xiang-Jun,

Works fine when no options for find_pair are used. However, I am using the following options:
find_pair -s -pdbv3 -attach=off -chain_markers='o|x+' ...
and would like to see these markers in the output of analyze:

analyze -pdbv3 file.fp out

that is like

...
3 (0.038) ....>A:...3_:[.DG]G-
4 (0.030) ....>A:...4_:[.DG]G-
5 (0.010) ....>A:...5_:[.DC]C-
6 (0.043) ....>A:...6_:[.DG]G-
...

Sorry if I didn't explain myself well on this point.

Best,

Pascal

Title: Re: chain continuation character in analyze
Post by: xiangjun on August 08, 2012, 09:38:32 am

The -chain_markers option works only for duplexes (default for find_pair), not yet with the -s option for single-stranded (ss) structures. 3DNA analyze checks the O3'(i) to P(i+1) distance with a cut-off of 2.5 Å for chain breaks; only two characters are used: 'x' for a break, and '-' for a covalent bond. I am a bit hesitant to make the current settings more complicated; you are the first 3DNA user to notice such fine detail at all. Do you have a solid use case to convince me?

In current 3DNA v2.1beta, no need to set -pdbv3; it's the default.

Xiang-Jun
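
The chain-break test described above can be sketched in a few lines of Python. This is a minimal illustration, not 3DNA's actual code: the names `link_markers` and `CUTOFF` are hypothetical, residues are represented as plain dictionaries of atom coordinates, the coordinates in the usage lines are made up, and two choices are assumptions on my part (a distance exactly at the cut-off counts as bonded, and a missing O3' or P atom counts as a break).

```python
import math

CUTOFF = 2.5  # Å, the O3'(i)...P(i+1) cut-off quoted in the post


def link_markers(residues, cutoff=CUTOFF):
    """Return one marker per consecutive residue pair (hypothetical helper).

    '-' means O3'(i) and P(i+1) are within `cutoff` (covalent bond),
    'x' means they are farther apart or an atom is missing (chain break).
    """
    markers = []
    for cur, nxt in zip(residues, residues[1:]):
        o3, p = cur.get("O3'"), nxt.get("P")
        if o3 is None or p is None:
            markers.append("x")  # assumption: missing atoms count as a break
        elif math.dist(o3, p) <= cutoff:
            markers.append("-")
        else:
            markers.append("x")
    return markers


# Illustrative (made-up) coordinates: residues 1 and 2 are linked,
# residue 3 sits far away from residue 2.
residues = [
    {"P": (0.0, 0.0, 0.0), "O3'": (5.0, 0.0, 0.0)},
    {"P": (6.6, 0.0, 0.0), "O3'": (11.0, 0.0, 0.0)},
    {"P": (30.0, 0.0, 0.0), "O3'": (35.0, 0.0, 0.0)},
]
print(link_markers(residues))
```

Note that the sketch returns one marker per pair of neighbours, so it makes no claim about what marker the very last residue of a structure receives.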

Title: Re: chain continuation character in analyze
Post by: auffinger on August 08, 2012, 11:52:50 am

Well, if it's a lot of work, I understand.
I process all PDB files automatically, so I thought it would be simple to add this to analyze with the -s option (I assumed it was basically the same as with no option).
It would facilitate my work, but I can (and maybe will have to) use workarounds.
Let me know if you can do it.

Is the option -pdbv3 standard also for O1P_O2P ?

Best,

Pascal

Title: Re: chain continuation character in analyze
Post by: xiangjun on August 08, 2012, 12:00:46 pm

I will think about the issue, and try to find a consistent way to handle find_pair in default and with the -s option, and streamline the output style between find_pair and analyze. I will post back in this thread once it is done.

The -pdbv3 is globally set to TRUE, so it also applies to o1p_o2p.

HTH

Xiang-Jun

Title: Re: chain continuation character in analyze
Post by: xiangjun on August 09, 2012, 07:43:01 am

Hi Pascal,

I've implemented the -chain_markers option to analyze as of 3DNA v2.1beta dated 2012aug09. As an example, run the following commands,

Code:

find_pair -s 1egk.pdb stdout | analyze -chain_markers='+|x*' stdin

You will see the portion below in the output file '1egk.outs':

   1   (0.013) ....>A:...1_:[..A]A     +    
   2   (0.020) ....>A:...2_:[..G]G     |    
   3   (0.019) ....>A:...3_:[..G]G     |    
   4   (0.014) ....>A:...4_:[..A]A     |    
   5   (0.014) ....>A:...5_:[..G]G     |    
   6   (0.016) ....>A:...6_:[..A]A     |    
   7   (0.020) ....>A:...7_:[..G]G     |    
   8   (0.015) ....>A:...8_:[..A]A     |    
   9   (0.028) ....>A:...9_:[..G]G     |    
  10   (0.015) ....>A:..10_:[..A]A     |    
  11   (0.015) ....>A:..11_:[..U]U     |
  12   (0.022) ....>A:..12_:[..G]G     |
  13   (0.015) ....>A:..13_:[..G]G     |
  14   (0.021) ....>A:..14_:[..G]G     |
  15   (0.025) ....>A:..15_:[..U]U     |
  16   (0.016) ....>A:..16_:[..G]G     |
  17   (0.016) ....>A:..17_:[..C]C     |
  18   (0.016) ....>A:..18_:[..G]G     |
  19   (0.012) ....>A:..19_:[..A]A     |
  20   (0.017) ....>A:..20_:[..G]G     x
  21   (0.010) ....>B:..21_:[..C]C     +
  22   (0.018) ....>B:..22_:[..T]T     |
  23   (0.007) ....>B:..23_:[..C]C     |
  24   (0.016) ....>B:..24_:[..G]G     |
  25   (0.011) ....>B:..25_:[..C]C     |
  26   (0.013) ....>B:..26_:[..A]A     |
  27   (0.011) ....>B:..27_:[..C]C     |
  28   (0.006) ....>B:..28_:[..C]C     |
  29   (0.010) ....>B:..29_:[..C]C     |

As always, check it out and report back if that fits the bill.

Xiang-Jun
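
For automated processing of a listing like the one above, the markers can be used to cut the residue list into strands. The Python below is a hedged sketch, not part of 3DNA: `split_strands` is a hypothetical name, it assumes each residue line has exactly four whitespace-separated fields (index, value in parentheses, residue label, marker), and it reads the four characters of the marker string in the order begin/continuation/end/isolated described earlier in this thread.

```python
def split_strands(lines, markers="+|x*"):
    """Group residue labels into strands using the marker column (hypothetical helper).

    `markers` holds four characters: begin, continuation, end, isolated.
    Lines that do not have exactly four fields are ignored.
    """
    begin, _cont, end, isolated = markers
    strands, current = [], []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # not a residue line in the assumed layout
        label, mark = fields[2], fields[3]
        if mark in (begin, isolated) and current:
            strands.append(current)  # close a strand that had no explicit end
            current = []
        current.append(label)
        if mark in (end, isolated):
            strands.append(current)  # this residue closes the strand
            current = []
    if current:
        strands.append(current)  # listing ended without an end marker
    return strands


listing = """
  19   (0.012) ....>A:..19_:[..A]A     |
  20   (0.017) ....>A:..20_:[..G]G     x
  21   (0.010) ....>B:..21_:[..C]C     +
  22   (0.018) ....>B:..22_:[..T]T     |
""".splitlines()

for strand in split_strands(listing):
    print(strand[0], "to", strand[-1], "(", len(strand), "residues )")
```

The first and last label of each group then give the 5' and 3' residues of a strand, which is the information the '+' and 'x' markers were requested for.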

Title: Re: chain continuation character in analyze
Post by: auffinger on August 09, 2012, 10:36:59 am

Hi Xiang-Jun,

Thanks, it works quite well, and it's nice to have '+' for isolated nucleotides on top of it.
I hope it will be useful to others besides me.

Best,

Pascal

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University

3DNA Forum
