
Author Topic: find_pair and HETATM  (Read 53085 times)

Offline auffinger

find_pair and HETATM
« on: March 22, 2010, 01:48:16 pm »
Dear Xiang-Jun,

Just found out that for some base pairs, find_pair extracts some additional atoms (CA, K, TL, MG among others) and also complete modified residues, as in 1MO5 (bp 12) and 2PIS (bp 9) (see attached files). Is that how it should be? Did we miss an option or misunderstand something? This does not occur with the -a option, but information related to modified residues is then lost.

Thanks for your help,

Pascal
pascal auffinger
ibmc-cnrs
15, rue rené descartes
67084 strasbourg cedex
france

web sites:
http://www-ibmc.u-strasbg.fr/arn/Westho ... er_pub.HTM
http://www-ibmc.u-

Offline xiangjun

Re: find_pair and HETATM
« Reply #1 on: March 23, 2010, 11:42:15 pm »
Dear Pascal,

Over the years, I have always been surprised by your sharp observations of some fine details of undocumented 3DNA features, through our extensive email communications (mostly before the 3DNA forum was set up) and your posts in the forum.

Quote from: "Pascal"
Just found out that for some base pairs, find_pair extracts some additional atoms (CA, K, TL, MG among others) and also complete modified residues, as in 1MO5 (bp 12) and 2PIS (bp 9) (see attached files). Is that how it should be? Did we miss an option or misunderstand something? This does not occur with the -a option, but information related to modified residues is then lost.

Yes, the attached HETATM moieties you observed (including metals and modified nucleotides) are expected, so you did not miss an option or misunderstand anything. Based on my experience with PDB format 2.x, I check for possible linkage between a hetero group (e.g., a drug molecule) and the base residue, and add the HETATM group to the output coordinate file if it is connected to the nucleotide.
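To see which hetero groups ended up in a coordinate file written by find_pair, one can simply tally its HETATM records. The following is a minimal sketch, not part of 3DNA: the function name `hetatm_groups` is made up for illustration, and it assumes the file follows the standard fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26).

```python
from collections import Counter


def hetatm_groups(pdb_lines):
    """Count HETATM atoms per (residue name, chain ID, residue number).

    Assumes standard fixed-column PDB records; this helper is
    hypothetical and not part of the 3DNA distribution.
    """
    counts = Counter()
    for line in pdb_lines:
        if line.startswith("HETATM"):
            res_name = line[17:20].strip()  # columns 18-20
            chain_id = line[21:22]          # column 22
            res_seq = line[22:26].strip()   # columns 23-26
            counts[(res_name, chain_id, res_seq)] += 1
    return counts
```

Passing an open file handle for one of the extracted base-pair files should list metals such as MG or K as one-atom groups, and modified nucleotides as groups with many atoms.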

Obviously, the extra mile I went goes too far for your purpose. So I am considering adding a new command-line option to "find_pair" that, if specified explicitly, would exclude such HETATM groups (metals or modified residues). I am pretty busy with my job right now, but I will keep your request in mind. Hopefully, I will be able to get a working solution for you in a week (by the end of this month).

Xiang-Jun

Offline auffinger

Re: find_pair and HETATM
« Reply #2 on: March 24, 2010, 06:37:57 am »
Dear Xiang-Jun,

Thanks for your reply. I understand better what you intended to do. This is, as you wrote, undocumented and therefore puzzling when one discovers it, although probably very useful when one is aware of the choice you made. Indeed, an option in find_pair related to this issue would be really welcome and would add more flexibility to 3DNA. Getting an updated version is great news for us and we are waiting for it.
Regarding our last post, we would also really appreciate it if you could at some point consider providing an output for "find_pair -p" (using the -p option) that could be run through analyze (it's not the case right now). This would also be more than helpful for us. Yet, I understand that you are really busy combining several jobs.

Thanks for being so reactive,

Kind regards,

Pascal

Offline xiangjun

Re: find_pair and HETATM
« Reply #3 on: March 24, 2010, 10:01:23 pm »
Dear Pascal,

Quote from: "Pascal"
Regarding our last post, we would also really appreciate it if you could at some point consider providing an output for "find_pair -p" (using the -p option) that could be run through analyze (it's not the case right now). This would also be more than helpful for us.
I may consider adding a new output file from "find_pair -p" that can be fed directly into "analyze" when I switch into "programming mode" to address your HETATM request. In the meantime, it should be straightforward to write a script that parses the output file from "find_pair -p" and feeds it into "analyze". More generally, as demonstrated in the 2008 Nature Protocols paper, 3DNA should be taken as a toolset that, when combined with other programs, can be driven by command-line scripts to fulfill specific needs.
Quote from: "Pascal"
Yet, I understand that you are really busy combining several jobs.
As mentioned several times in the forum and on the 3DNA websites, and made clear in my blog post titled "On maintaining the 3DNA forum", my support of 3DNA and the forum is purely on a voluntary basis, not a "job" duty at all:
Quote
Over the past few years, maintaining the 3DNA forum (i.e., answering questions, performing administrative tasks) has taken up a significant amount of my spare time. Sometimes it could be quite demanding, especially because I need to pay great attention to details. Overall, though, it is a valuable experience, and I feel that the time is well-spent: 3DNA has been continuously refined and more widely used; my knowledge of nucleic acid structures (especially RNA) has been significantly sharpened; I have stayed aware of progress in related research fields and see more of the world; and I feel great pleasure in being of help to the community.
I do enjoy what I am doing with 3DNA and I have learned a lot, even from a negative comment on my effort (BTW, the thread is well worth reading):
Quote
You [Xiang-Jun] clearly have the ability to avoid answers to the questions. You lecture me how to pose a question, I recommend you learn how to answer a question. I read your answers to other questions and the pattern is the same: to give as little help as possible. Still I like your program better than the other programs, so please, try to be more helpful. Do you think Wilma knows how to use this program? Maybe I should write to her?

Best regards,

Xiang-Jun

Offline auffinger

Re: find_pair and HETATM
« Reply #4 on: March 25, 2010, 07:10:46 am »
Dear Xiang-Jun,

Again, we appreciate your dedication to 3DNA; this program is the foundation on which a large part of our research rests. As for "find_pair -p", we wrote a workaround (and it is just a workaround), since we cannot translate everything (the two find_pair outputs contain different information). If you do that, it would certainly be much nicer and more accurate, and would benefit ourselves (certainly) and others (probably). Thanks.

Best regards,

Pascal

Offline xiangjun

Re: find_pair and HETATM
« Reply #5 on: March 27, 2010, 04:42:22 pm »
Hi Pascal,

What OS do you use? I will compile a version of 'find_pair' that works for you. As with v1.5, I will keep the currently distributed v2.0 as is, unless I notice some significant issues.

Xiang-Jun

Offline auffinger

Re: find_pair and HETATM
« Reply #6 on: March 29, 2010, 06:49:45 am »
Hi Xiang-Jun,

We use Linux Ubuntu 8.04 LTS. Thanks for your help,

Pascal

Offline xiangjun

Re: find_pair and HETATM
« Reply #7 on: March 29, 2010, 08:13:54 pm »
Hi Pascal,

I have finally found time to update "find_pair" to fulfill your requests:

  • I have added an option "-attach=STRING": where STRING is case-insensitive, and if it is "off", "no" or "0", then the attached HETATM groups will not be added to the output coordinate file.
  • With the "-p" option, I have added an output file named "allpairs.ana" that can be fed into "analyze".

See my email for detailed instructions to update. Please post back here to confirm that my modifications work as advertised, or if you have further questions.
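As a rough illustration of how the new option might be driven from a script, here is a minimal sketch. It mirrors the rule stated above (case-insensitive "off", "no" or "0" disables attachment) and builds a command line. The helper names `attach_enabled` and `find_pair_command` are hypothetical, the argument order (options, then input PDB file, then output file) is assumed from the usual find_pair usage, and treating every other string as "on" is an assumption. The -attach option is only available in the updated build described in this post.

```python
def attach_enabled(value):
    """Return False for 'off', 'no' or '0' (case-insensitive), True otherwise.

    Treating any other string as 'enabled' is an assumption.
    """
    return value.strip().lower() not in {"off", "no", "0"}


def find_pair_command(pdb_file, out_file, attach="on", all_pairs=False):
    """Build an argument list for find_pair (hypothetical helper).

    Assumes the layout: find_pair [options] PDBFILE OUTFILE.
    """
    cmd = ["find_pair"]
    if all_pairs:
        cmd.append("-p")  # detect all base pairs
    if not attach_enabled(attach):
        cmd.append(f"-attach={attach}")  # exclude attached HETATM groups
    cmd += [pdb_file, out_file]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; building the list separately keeps the option logic easy to check without having 3DNA installed.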

Xiang-Jun

Offline auffinger

Re: find_pair and HETATM
« Reply #8 on: April 09, 2010, 08:50:19 am »
Hi Xiang-Jun,

We finally tested your -attach option, and it works really well. A lot easier for us now! The .ana output looks less interesting to us than our workaround, since it lacks some information that we get by rewriting the find_pair output with -pz into a new output readable by analyze. At a later stage, it would be nice to have a complete output for analyze coming out of "find_pair -pz".

I was also just looking at the RN9-YN1 and RC8-YC6 info you provide in the analyze output. The labeling of these columns does not seem appropriate, since you do not consider the occurrence of YY and RR base pairs, do you? The YN1-YN1 and RN9-RN9 columns are apparently missing. From my point of view, the C1'-C1' and the two lambdas provide all the needed info.

Best wishes and thanks again for your help,

Pascal

Offline xiangjun

Re: find_pair and HETATM
« Reply #9 on: April 09, 2010, 11:45:18 pm »
Hi Pascal,

Quote from: "Pascal"
We finally tested your -attach option, and it works really well. A lot easier for us now!

Glad to hear.

Quote from: "Pascal"
The .ana output looks less interesting to us than our workaround, since it lacks some information that we get by rewriting the find_pair output with -pz into a new output readable by analyze. At a later stage, it would be nice to have a complete output for analyze coming out of "find_pair -pz".
Could you be more specific, so that I (or others who are interested) can see exactly what you want to achieve? As mentioned before, the "-p" option of "find_pair" was incompatible with "analyze" by design, for a valid reason. I understand that 3DNA also serves as a toolkit and is used in "unexpected" ways. I am open, and I'd be more than willing to be convinced by new ideas. As a design principle, though, I change 3DNA only in ways that make sense to me. When adding a new feature (excluding experimental and undocumented ones, of course), I always ask myself this question: will I be able to give users a concrete explanation, or quickly acknowledge (and possibly fix) a bug? As a supplement and complement, users are always welcome to share their tricks and scripts with the community in the Users' contributions section.

Quote from: "Pascal"
I was also just looking at the RN9-YN1 and RC8-YC6 info you provide in the analyze output. The labeling of these columns does not seem appropriate, since you do not consider the occurrence of YY and RR base pairs, do you? The YN1-YN1 and RN9-RN9 columns are apparently missing. From my point of view, the C1'-C1' and the two lambdas provide all the needed info.
Again, please be specific (using a concrete example) to show where (you think) something is wrong. As far as I can tell, even though the column headers are always labeled RN9-YN1 and RC8-YC6 (intentionally), 3DNA should handle YY and RR pairs properly. Please verify.

Xiang-Jun

Offline auffinger

Re: find_pair and HETATM
« Reply #10 on: April 12, 2010, 09:47:47 am »
Dear Xiang-Jun,

Well, the .ana output apparently does not provide the column filled with | and x characters that is used (it seems to us) for calculating stacking and other base-pair-step information. This results in an output from analyze in which some sections are just filled with zeros. At a later point we will probably need this info and drop our workaround. The workaround we wrote in Perl converts the output from "find_pair -p" by indiscriminately placing | characters in this column. Not perfect indeed (it's a workaround), but it works at least for our purpose.

As for the RN9-YN1 and RC8-YC6 columns, you are right: we noticed that the info is correct, but the columns are mislabeled, and that is something that might trouble other users. Maybe you would like to change the labeling of these columns at a later point, either to a longer text or to something simpler with a note explaining what it means (much more coherent and satisfactory). Again, yes, the data are correct, no doubt about that.

pascal

Offline auffinger

Re: find_pair and HETATM
« Reply #11 on: April 14, 2010, 08:59:29 am »
Dear Xiang-Jun,

Just a few words to let you know that, even with the -attach option, some HETATM groups remain attached to the base pair, as in one bp of 1VBZ (barium atom); see picture. We should be able to work around this for now.

Kind regards,

Pascal

Offline xiangjun

Re: find_pair and HETATM
« Reply #12 on: April 15, 2010, 11:12:25 pm »
Hi Pascal,

Could you provide a step-by-step reproducible example? At the bare minimum, please attach the PDB file you used to generate the image. I checked 1VBZ but failed to reproduce the problem you reported: I could not find any BA atom in file "allpairs.pdb".

If verified, it is certainly a bug and I won't hesitate to fix it.

Best regards,

Xiang-Jun

Offline auffinger

Re: find_pair and HETATM
« Reply #13 on: April 16, 2010, 05:55:35 am »
Hi Xiang-Jun,

I was a little bit quick on this. We had a selection issue with PyMOL, which for unknown reasons sometimes takes connectivity cards into consideration, leading to the picture I sent to the forum. Our scripts are becoming quite complex, and even after the recent PDB file updates (cleaning), there are a lot of special cases to deal with that complicate our work. I can remove this and my preceding post from the forum, since they might not be too useful to the community. Out of curiosity, do you take connectivity cards into account in 3DNA? Anyway, your programs don't need to be modified.

Cheers,

Pascal
pascal auffinger
ibmc-cnrs
15, rue rené descartes
67084 strasbourg cedex
france

web sites:
http://www-ibmc.u-strasbg.fr/arn/Westho ... er_pub.HTM
http://www-ibmc.u-

Offline xiangjun

Re: find_pair and HETATM
« Reply #14 on: April 16, 2010, 09:11:51 pm »
Hi Pascal,

Thanks for your verification. We all learn from mistakes, and our change of views serves as another example of the ever-refining process (kaizen; improve; 改善) that is required to keep a (software) product alive. As a general rule, I have never deleted or changed posts from users. I lock a thread when I believe the topic is finished.

3DNA does not read CONECT records from PDB files, which are incomplete (by design), inconsistent, and unreliable. Instead, 3DNA implements an algorithm to decide whether two atoms are covalently connected based on purely geometric criteria. If you want to discuss this topic further, please start a new thread.
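For readers unfamiliar with geometry-based bond detection, the general idea can be sketched as follows. This is not 3DNA's actual algorithm or its cutoffs, which are not given in this thread. It is a common textbook approach in which two atoms are taken as bonded when their distance is below the sum of their covalent radii plus a tolerance. The radii table, the tolerance of 0.45 Å, the lower bound of 0.4 Å and the function name `is_bonded` are all assumptions chosen for illustration.

```python
import math

# Approximate single-bond covalent radii in angstroms (illustrative values,
# not taken from 3DNA).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "P": 1.07, "S": 1.05}


def is_bonded(elem_a, xyz_a, elem_b, xyz_b, tolerance=0.45):
    """Guess whether two atoms are covalently bonded from distance alone.

    Bonded if 0.4 A < distance <= r_a + r_b + tolerance. The lower bound
    rejects overlapping or duplicated atoms. All thresholds are assumed.
    """
    dist = math.dist(xyz_a, xyz_b)
    cutoff = COVALENT_RADII[elem_a] + COVALENT_RADII[elem_b] + tolerance
    return 0.4 < dist <= cutoff
```

A check of this kind needs only element types and coordinates, so it does not depend on whether a PDB entry carries complete CONECT records.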

Best regards,

Xiang-Jun

 

Funded by the NIH R24GM153869 grant on X3DNA-DSSR, an NIGMS National Resource for Structural Bioinformatics of Nucleic Acids

Created and maintained by Dr. Xiang-Jun Lu, Department of Biological Sciences, Columbia University