The find_pair -p option can find all base pairs and higher-order base associations. I implemented this option early on in 3DNA v1.x; yet in the 2003 Nucleic Acids Research paper and the corresponding find_pair -h output for v1.5, I deliberately omitted mentioning this functionality. I was hoping to further refine the algorithm/implementation, and to write up a detailed method paper on find_pair, a core 3DNA component. After leaving Rutgers nearly a decade ago, I've continuously maintained and refined 3DNA. However, for various reasons, up to now I've not been able to finish the long overdue 'technical' manuscript.
Over the years, numerous RNA structural bioinformatics resources have taken advantage of the functionality provided by find_pair; RNAView and BPS, two Rutgers-based tools, are based directly on early versions of the program. It was only in the 3DNA 2008 Nature Protocols paper that I first illustrated the find_pair -p option, in the protocol "identification of higher-order base associations in ribosomal RNA". This post provides further detailed examples so 3DNA users can take better advantage of this still underused functionality, which is useful in RNA structure related applications.
Let's create a new directory (folder), named 'find_pair-p-examples', and change to that directory. The new directory is empty (check with ls).
mkdir find_pair-p-examples
cd find_pair-p-examples
ls
As an example, here we use the crystal structure of an RNA tetraplex (UGAGGU)4 with A-tetrads, G-tetrads, U-tetrads and G-U octads: NDB id: ur0023; PDB id: 1j6s (see figure below). The structure was solved by Sundaralingam et al. at 1.4 Ångstroms resolution [Structure. 2003 Jul; 11(7):815-23]. Its asymmetric unit contains 4 single chains. The NDB/PDB provides 4 biological assemblies, each consisting of 4 identical chains from the asymmetric unit. Download biological assembly 1 from the NDB (or the PDB, if you prefer; but notice the case difference in PDB id):
wget ftp://ndbserver.rutgers.edu/NDB/coordinates/na-biol/1j6s.pdb1
Run find_pair -p on '1j6s.pdb1'. Note the -all_model option; by default, 3DNA programs (such as find_pair) handle only the first model (structure) in a given PDB data file.
find_pair -p -all_model 1j6s.pdb1 1j6s.mbp
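To see why -all_model matters for this file, it can help to count the models it contains. The snippet below is only a sketch: it assumes the assembly file marks each copy with a standard PDB MODEL record, and count_models is a hypothetical helper name, not a 3DNA program.

```shell
# count_models: hypothetical helper (not part of 3DNA).
# Counts the lines that start with a MODEL record in a PDB-format file.
count_models() {
    grep -c '^MODEL' "$1"
}

# Usage, once the file has been downloaded:
#   count_models 1j6s.pdb1
```

If the count is greater than one, running find_pair without -all_model would analyze only the first of those models.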
At the end of the output file '1j6s.mbp', find_pair lists the identified multiplets, here one octad and three tetrads:
1: #8 [1]...1>A:...1_:[BRU]u + [2]...1>A:...2_:[..G]G + [47]...2>A:...1_:[BRU]u + [48]...2>A:...2_:[..G]G + [93]...3>A:...1_:[BRU]u + [94]...3>A:...2_:[..G]G + [139]...4>A:...1_:[BRU]u + [140]...4>A:...2_:[..G]G
2: #4 [3]...1>A:...3_:[..A]A + [49]...2>A:...3_:[..A]A + [95]...3>A:...3_:[..A]A + [141]...4>A:...3_:[..A]A
3: #4 [4]...1>A:...4_:[..G]G + [50]...2>A:...4_:[..G]G + [96]...3>A:...4_:[..G]G + [142]...4>A:...4_:[..G]G
4: #4 [5]...1>A:...5_:[..G]G + [51]...2>A:...5_:[..G]G + [97]...3>A:...5_:[..G]G + [143]...4>A:...5_:[..G]G
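Each multiplet line above starts with a serial number, then '#' and the number of bases, then the residues joined by ' + '. A rough way to summarize such lines is sketched below. It assumes the lines look exactly like the ones printed above (whitespace-separated fields, with the one-letter base code as the last character of each residue label), and summarize_multiplets is a hypothetical helper, not a 3DNA tool.

```shell
# summarize_multiplets: hypothetical helper (not part of 3DNA).
# For each line of the form "N: #K residue + residue + ...", print
# the serial number N, the stated size K, and the base letters.
summarize_multiplets() {
    awk '/^ *[0-9]+: #[0-9]+ / {
        id = $1; sub(/:$/, "", id)        # "1:" -> "1"
        size = $2; sub(/^#/, "", size)    # "#8" -> "8"
        bases = ""
        for (i = 3; i <= NF; i++) {
            if ($i == "+") continue       # skip the separators
            # the base letter is the last character of the residue label
            bases = bases substr($i, length($i), 1)
        }
        print id, size, bases
    }' "$1"
}

# Usage:
#   summarize_multiplets 1j6s.mbp
```

For the four lines shown above this should print '1 8 uGuGuGuG', '2 4 AAAA', '3 4 GGGG' and '4 4 GGGG'. The lowercase 'u' reflects the modified residue BRU.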
Among other outputs, there is also a file named 'multiplets.pdb' which contains the atomic coordinates of the corresponding multiplets, each oriented in its most-extended view. The base multiplets can be extracted with ex_str and then converted to .r3d format for Raster3D or PyMOL rendering (see also the post "What can 3DNA do for RNA structures?" for more examples).
ex_str -1 multiplets.pdb oct.pdb
r3d_atom -od -r=0.1 -b=0.2 oct.pdb stdout | render -png > oct.png
ex_str -2 multiplets.pdb A-tetrad.pdb
r3d_atom -od -r=0.1 -b=0.2 A-tetrad.pdb stdout | render -png > A-tetrad.png
ex_str -3 multiplets.pdb G-tetrad.pdb
r3d_atom -od -r=0.1 -b=0.2 G-tetrad.pdb stdout | render -png > G-tetrad.png
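The three command pairs above differ only in the multiplet number and the output name, so they can be wrapped in a small function. This is a convenience sketch, not part of 3DNA: render_multiplet and the DRY_RUN variable are hypothetical names, and the ex_str, r3d_atom and render options are copied unchanged from the commands above.

```shell
# render_multiplet: hypothetical wrapper around the two commands above.
#   $1 = multiplet number in multiplets.pdb, $2 = output base name
# With DRY_RUN=1 the commands are only printed, not executed.
render_multiplet() {
    idx=$1
    name=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ex_str -$idx multiplets.pdb $name.pdb"
        echo "r3d_atom -od -r=0.1 -b=0.2 $name.pdb stdout | render -png > $name.png"
    else
        ex_str "-$idx" multiplets.pdb "$name.pdb"
        r3d_atom -od -r=0.1 -b=0.2 "$name.pdb" stdout | render -png > "$name.png"
    fi
}

# Usage: preview the commands for the fourth multiplet (the second
# G-tetrad), which is not rendered above:
#   DRY_RUN=1; render_multiplet 4 G-tetrad-2
```

The dry-run branch makes it easy to check the generated file names before anything is overwritten.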
The three png images ('oct.png', 'A-tetrad.png' and 'G-tetrad.png') generated by the ex_str and r3d_atom commands above are attached below.