
Topic: SNAP: software for characterizing DNA-protein interactions

Posted by xiangjun (Administrator)
DNA/RNA-protein interactions underpin fundamental biological processes such as transcription, splicing, and translation. The increasing number of experimentally determined three-dimensional structures of nucleic acid-protein complexes provides unprecedented opportunities to decipher the underlying principles governing the process of DNA/RNA-protein recognition. Existing bioinformatics tools are fragmented, with limited scope or usability. We have developed SNAP, a new 3DNA program for the characterization of three-dimensional Structures of Nucleic Acid-Protein complexes. The program is currently in beta release, focusing on DNA-protein interactions. SNAP consolidates, refines, and significantly extends 3DNA's functionality for DNA-protein structural analysis.

Starting from a structure of a DNA-protein complex in PDB or PDBx/mmCIF format, SNAP automatically detects double-helical regions consisting of either canonical or non-canonical base-pairs using DSSR, and assigns the protein to secondary structural units (alpha-helices, beta-sheets, turns, and loops) using DSSP. The program aims to characterize DNA/RNA-protein interactions by checking all combinations between the two components: major groove, minor groove, and backbone for DNA/RNA, versus each alpha-helix, beta-sheet, turn, and loop for protein. SNAP recognizes and outputs base-amino-acid H-bonding and stacking interactions. To quantify the relative spatial relationship between interacting amino acids and bases, SNAP defines a local amino-acid reference frame on the side chain, and takes advantage of the standard base reference frame (see figures below). SNAP calculates all six rigid-body parameters so that large sets of DNA/RNA-protein complexes can be analyzed consistently and rigorously.
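
To illustrate the general idea of relating two reference frames, here is a minimal Python sketch. It assumes each frame is given as an origin plus three orthonormal axis vectors; the function names and this frame representation are hypothetical, and SNAP's own definitions of its six parameters may well differ.

```python
import math

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def relative_translation(base_origin, base_axes, aa_origin):
    """Coordinates of the amino-acid frame origin expressed in the base frame.

    base_axes holds the three orthonormal unit vectors (x, y, z) of the base frame.
    """
    d = [a - b for a, b in zip(aa_origin, base_origin)]
    return tuple(dot(axis, d) for axis in base_axes)

def rotation_angle(base_axes, aa_axes):
    """Overall rotation angle (degrees) between two orthonormal frames."""
    # The trace of the relative rotation matrix is the sum of the dot
    # products between matching axes of the two frames.
    trace = sum(dot(b, a) for b, a in zip(base_axes, aa_axes))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

# Made-up example: base frame aligned with the lab axes, amino-acid frame
# rotated by 90 degrees about z and shifted away from the base origin.
base_axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
aa_axes = [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
shift = relative_translation((0.0, 0.0, 0.0), base_axes, (1.0, 2.0, 2.0))
distance = math.sqrt(dot(shift, shift))
angle = rotation_angle(base_axes, aa_axes)
```

Expressing the amino acid in the base frame, rather than in lab coordinates, is what makes values comparable across different structures.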

Implemented in ANSI C as a standalone command-line program, SNAP follows the same minimalist design as DSSR. It is tiny (executables are less than 1MB) and self-contained, with zero runtime dependencies on third-party libraries or configurations. The program is currently under active development, and your feedback will make a difference!

List of users who have helped improve SNAP by reporting bugs, making comments/suggestions, etc.:


Auffinger; jdbrown444; ldfinger; miaozhichao; Phosphoserine; jms89

-- Xiang-Jun


Note: please start a new topic with a more specific title; do not post directly below this announcement.


Release history (in reverse chronological order):
Abbreviations used: AA: amino acid; BP: base-pair

  • beta-r10-2017apr10 -- documented the --type=string option, where string can be "base" (the default), "backbone", "either", or "both". The "base" argument reports protein interactions with only DNA/RNA base atoms, "backbone" with only DNA/RNA backbone atoms, "either" with base or backbone atoms, and "both" with base plus backbone atoms.
  • beta-r09-2016sept28 -- fixed a bug with the --cleanup option (thanks to jms89).
  • beta-r08-2016jun02 -- added the --tshape (or --t-shape) option to address the issue reported in Supplemental Table S1 of the Wilson et al. paper "Topology of RNA–protein nucleobase–amino acid π–π interactions and comparison to analogous DNA–protein π–π contacts". Specifically, the authors said:

    "Furthermore, although the recently released beta-r06-2015oct23 version of 3DNA-SNAP (Lu and Olson 2008) is able to distinguish between such errors, and accurately detects stacking interactions between nucleobases and amino acids, it unfortunately is currently unable to identify T-shaped interactions (see, for example, Supplemental Table S1)."

  • beta-r07-2016may21 -- fixed the "Segmentation fault" bug (due to undefined reference frames for certain amino acids with missing side-chain atoms); miscellaneous internal code refinements.
  • beta-r06-2015oct23 -- detected aromatic stacking interactions between bases and amino acids; numerous refinements along with DSSR.
  • beta-r05-2015may03 -- added option --get-hbond to output a list of H-bonds between protein and nucleic acid; numerous internal code refinements.
  • beta-r04-2014sep30 -- removed the (undocumented) --rna option so that RNA-protein complexes are handled the same way as DNA-protein complexes; relaxed default settings so SNAP now runs on pure nucleic acid or protein structures in addition to their complexes; added DSSP output for the protein component in file snap-dssp.txt.
  • beta-r03-2014sep16 -- listed base-AA pseudo-pairs and output an associated PDB ensemble file (snap-pseudoPairs.pdb); significant code speed-up.
  • beta-r02-2014may31 -- detailed listing of H-bonding interactions between a component of nucleotide (base/phosphate/sugar) and an amino acid.
  • beta-r01-2014may05 -- initial release to get the ball rolling. SNAP identifies base-AA or BP-AA interactions based on a distance cutoff (default: 4.5 angstroms), calculates six parameters to uniquely quantify the spatial relationships, and sets the coordinates in the standard base or BP reference frame for easy visualization and for deriving knowledge-based potentials.
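
The options named in the release history can be collected into a small helper for scripting. The Python sketch below only builds an argument list; it does not run SNAP. The helper name snap_command and its keyword arguments are hypothetical, and it assumes the listed flags can be combined freely, which this post does not confirm (x3dna-snap -h is the authority).

```python
import shlex

# Values of --type documented in beta-r10-2017apr10.
VALID_TYPES = ("base", "backbone", "either", "both")

def snap_command(pdb_file, out_file, interaction_type=None,
                 get_hbond=False, tshape=False, long_ids=False):
    """Build an x3dna-snap argument list from options named in this post."""
    cmd = ["x3dna-snap", f"-i={pdb_file}", f"-o={out_file}"]
    if interaction_type is not None:
        if interaction_type not in VALID_TYPES:
            raise ValueError(f"--type must be one of {VALID_TYPES}")
        cmd.append(f"--type={interaction_type}")
    if get_hbond:
        cmd.append("--get-hbond")   # added in beta-r05-2015may03
    if tshape:
        cmd.append("--tshape")      # added in beta-r08-2016jun02
    if long_ids:
        cmd.append("--idstr=long")  # mentioned in the program's own banner
    return cmd

cmd = snap_command("1oct.pdb", "1oct.out", interaction_type="backbone")
print(shlex.join(cmd))
```

With a working SNAP installation, the returned list could be passed to subprocess.run(cmd, check=True).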




Here is a sample run on 1oct, the crystal structure of the Oct-1 POU domain bound to an octamer site solved by Pabo et al. (see x3dna-snap -h for more info):

Code:
Run: x3dna-snap -i=1oct.pdb -o=1oct.out
****************************************************************************
    SNAP: a program for the characterization of three-dimensional
             Structures of Nucleic Acid-Protein complexes
              beta-r10-2017apr10, by xiangjun@x3dna.org

  This program is being actively maintained and developed. As always,
  I greatly appreciate your feedback! Please report all SNAP-related
  issues on the 3DNA Forum (forum.x3dna.org). I strive to respond
  *promptly* to *any questions* posted there.

****************************************************************************
Note: By default, each nucleotide/amino-acid is identified by chainId.name#.
      So a common case would be B.DA1689, meaning adenosine #1689 on chain B.
      Use the --idstr=long option to get strictly delineated id strings.

Command: x3dna-snap -i=1oct.pdb -o=1oct.out --aa_min=0 --nt_min=0
Date and time: Mon Apr 10 17:28:07 2017
File name: 1oct.pdb
    no. of peptide chains: 1 [C=131]
    no. of DNA/RNA chains: 2 [A=15,B=15]
    no. of amino acids:    131
    no. of nucleotides:    30
    no. of atoms:          1670
    no. of waters:         0
    no. of metals:         0

****************************************************************************
List of 1 helix
  Note: a helix is defined by base-stacking interactions, regardless of bp
        type and backbone connectivity, and may contain more than one stem.
      helix#number[stems-contained] bps=number-of-base-pairs in the helix
      bp-type: '|' for a canonical WC/wobble pair, '.' otherwise
      helix-form: classification of a dinucleotide step comprising the bp
        above the given designation and the bp that follows it. Types
        include 'A', 'B' or 'Z' for the common A-, B- and Z-form helices,
        '.' for an unclassified step, and 'x' for a step without a
        continuous backbone.
      --------------------------------------------------------------------
  helix#1[0] bps=14
      strand-1 5'-GTATGCAAATAAGG-3'
       bp-type    ||||||||||||||
      strand-2 3'-CATACGTTTATTCC-5'
      helix-form  BBBBBBBBBBBBB
   1 A.DG202        B.DC230        G-C WC           19-XIX    cWW  cW-W
   2 A.DT203        B.DA229        T-A WC           20-XX     cWW  cW-W
   3 A.DA204        B.DT228        A-T WC           20-XX     cWW  cW-W
   4 A.DT205        B.DA227        T-A WC           20-XX     cWW  cW-W
   5 A.DG206        B.DC226        G-C WC           19-XIX    cWW  cW-W
   6 A.DC207        B.DG225        C-G WC           19-XIX    cWW  cW-W
   7 A.DA208        B.DT224        A-T WC           20-XX     cWW  cW-W
   8 A.DA209        B.DT223        A-T WC           20-XX     cWW  cW-W
   9 A.DA210        B.DT222        A-T WC           20-XX     cWW  cW-W
  10 A.DT211        B.DA221        T-A WC           20-XX     cWW  cW-W
  11 A.DA212        B.DT220        A-T WC           20-XX     cWW  cW-W
  12 A.DA213        B.DT219        A-T WC           20-XX     cWW  cW-W
  13 A.DG214        B.DC218        G-C WC           19-XIX    cWW  cW-W
  14 A.DG215        B.DC217        G-C WC           19-XIX    cWW  cW-W

****************************************************************************
List of 30 nucleotide/amino-acid interactions
      id  nt-aa  nt           aa              Tdst    Rdst     Tx      Ty      Tz      Rx      Ry      Rz
   1 1oct A-arg A.DA208      C.ARG105        10.98  -45.39  -10.37   -1.40   -3.30   18.22   38.36  -16.42
   2 1oct A-arg A.DA209      C.ARG102        12.96  -64.64  -12.42   -3.69    0.50   19.39   38.44  -49.42
   3 1oct A-arg A.DA209      C.ARG105       -10.54  -77.58   -9.01    0.77   -5.43   32.27   40.40  -59.98
   4 1oct A-arg A.DA210      C.ARG102        13.00  -90.23  -12.62   -0.84   -3.03   31.67   30.70  -80.82
   5 1oct A-asn A.DA209      C.ASN151         9.24 -165.14   -6.41    6.54   -1.24   44.17   60.06 -161.30
   6 1oct A-asn A.DA210      C.ASN151         7.95  169.39    3.58   -6.02   -3.76  -61.24  -46.91  166.41
   7 1oct A-gln A.DA204      C.GLN44          8.90 -147.71   -1.25    8.28   -3.00   62.13   -4.03 -142.09
   8 1oct A-ser A.DA204      C.SER48         10.01 -126.61   -5.86    8.07    0.89   21.11   -7.37 -125.49
   9 1oct A-thr B.DA227      C.THR45         -6.83  150.54   -0.53   -5.80   -3.56  -38.91   -1.49  148.71
  10 1oct A-val A.DA210      C.VAL147         9.50  177.54    8.35   -4.40    1.13  -31.38  -26.50  177.38
  11 1oct C-arg A.DC207      C.ARG49        -10.55  130.65   -1.73  -10.40    0.39   42.89  -25.07  125.22
  12 1oct C-arg A.DC207      C.ARG105        10.64   61.56   -9.91   -3.22    2.16    3.33   59.24   17.22
  13 1oct C-arg B.DC226      C.ARG105       -11.37  -65.61   -8.90    6.45   -2.94   -3.39  -34.34  -56.68
  14 1oct C-thr B.DC226      C.THR45         -7.82 -173.77   -2.54    7.16   -1.83   17.56   14.19 -173.64
  15 1oct G-arg A.DG206      C.ARG49        -11.02  175.51    3.67   -9.94    3.04   54.67  -28.85  174.77
  16 1oct G-arg B.DG225      C.ARG49         10.84 -140.82   -1.39   10.75   -0.00   40.74   39.22 -135.24
  17 1oct G-arg B.DG225      C.ARG105       -10.31  -55.06   -9.54    3.91   -0.13   12.03  -47.33  -26.24
  18 1oct G-ser B.DG225      C.SER43         10.76 -127.61   -8.48    6.55   -1.00  -33.95   77.50 -106.70
  19 1oct G-thr A.DG206      C.THR45         -7.46  178.53   -1.92   -7.19    0.56   34.46  -11.97  178.45
  20 1oct G-thr B.DG225      C.THR45          8.31 -141.05   -5.19    5.96    2.57   27.42   16.56 -139.41
  21 1oct T-arg B.DT223      C.ARG102        12.45   46.89  -11.36    4.59    2.22   27.19  -16.42   34.94
  22 1oct T-arg B.DT224      C.ARG49         12.06 -116.46   -5.79   10.20    2.79   24.96   60.01 -102.74
  23 1oct T-arg B.DT224      C.ARG102       -14.08  -32.87  -11.42    8.12   -1.33   20.82  -25.38   -1.63
  24 1oct T-arg B.DT224      C.ARG105        10.20   47.44   -9.49    1.43    3.47   32.81  -33.87    5.35
  25 1oct T-cys B.DT219      C.CYS150        10.66  -81.40   -2.86   10.10    1.89    8.36   20.66  -78.81
  26 1oct T-gln A.DT203      C.GLN44         10.67 -115.42   -4.72    9.57   -0.05   65.02   27.06  -98.34
  27 1oct T-leu B.DT224      C.LEU55         11.64  144.35    9.48   -3.62   -5.72  -19.44  -72.43  134.61
  28 1oct T-ser A.DT205      C.SER48        -10.33 -151.96   -5.14    8.73   -2.03   28.42  -17.90 -150.69
  29 1oct T-thr A.DT205      C.THR45          6.48 -148.19   -1.00    5.50    3.28  -35.77    2.63 -146.52
  30 1oct T-val A.DT211      C.VAL147        -9.12  143.26    7.02   -5.17   -2.68  -31.13  -15.46  141.43

****************************************************************************
List of 24 base-pair/amino-acid interactions
      id   bp-aa  nt1          nt2          aa              Tdst    Rdst    Tx      Ty      Tz      Rx      Ry      Rz
   1 1oct AT-arg A.DA208      B.DT224      C.ARG49        -12.01  110.94   -6.48   -9.84   -2.30   21.51  -51.98  100.00
   2 1oct AT-arg A.DA208      B.DT224      C.ARG102        14.29  -31.63  -11.90   -7.65    1.98   13.43   28.49   -2.95
   3 1oct AT-arg A.DA208      B.DT224      C.ARG105       -10.59  -45.43   -9.93   -1.42   -3.39   25.50   36.06  -10.92
   4 1oct AT-arg A.DA209      B.DT223      C.ARG102        12.71  -54.74  -11.96   -4.19   -0.89   23.07   27.36  -42.12
   5 1oct AT-arg A.DA209      B.DT223      C.ARG105       -10.29  -68.58   -8.18    0.74   -6.21   36.66   30.50  -50.81
   6 1oct AT-arg A.DA210      B.DT222      C.ARG102       -13.20  -84.60  -12.59   -1.35   -3.71   32.25   24.46  -75.94
   7 1oct AT-asn A.DA209      B.DT223      C.ASN151         9.34 -156.07   -6.69    6.48   -0.66   53.59   59.89 -148.51
   8 1oct AT-asn A.DA210      B.DT222      C.ASN151         7.73  175.46    3.73   -5.97   -3.21  -66.31  -46.20  174.04
   9 1oct AT-cys A.DA213      B.DT219      C.CYS150       -10.45   82.10   -3.17   -9.77   -1.89    9.43  -17.73   80.04
  10 1oct AT-gln B.DA229      A.DT203      C.GLN44        -10.82  115.12   -4.79   -9.70   -0.08   67.58  -19.25   98.01
  11 1oct AT-gln A.DA204      B.DT228      C.GLN44          8.85 -145.35   -1.20    8.29   -2.86   64.97   -7.53 -138.55
  12 1oct AT-leu A.DA208      B.DT224      C.LEU55        -11.39 -149.57    9.27    3.36    5.69  -23.35   66.87 -142.43
  13 1oct AT-ser A.DA204      B.DT228      C.SER48          9.93 -125.38   -5.92    7.94    0.78   23.33  -11.93 -123.79
  14 1oct AT-ser B.DA227      A.DT205      C.SER48         10.37  152.19   -4.85   -8.92    2.09   26.76   18.96  150.98
  15 1oct AT-thr B.DA227      A.DT205      C.THR45         -6.65  149.36   -0.76   -5.65   -3.42  -37.34   -2.05  147.61
  16 1oct AT-val A.DA210      B.DT222      C.VAL147         9.44 -177.78   -8.53    3.83    1.33   37.06   24.02 -177.61
  17 1oct AT-val B.DA221      A.DT211      C.VAL147        -9.19 -149.81    7.12    5.31    2.37  -42.28   11.70 -147.39
  18 1oct GC-arg A.DG206      B.DC226      C.ARG49         11.14  174.67    3.34   -9.86    3.98   45.92  -30.78  173.98
  19 1oct GC-arg A.DG206      B.DC226      C.ARG105        11.24   73.06   -8.49   -6.69    3.11   -0.59   41.26   61.67
  20 1oct GC-arg B.DG225      A.DC207      C.ARG49         10.69 -135.66   -1.58   10.57   -0.18   41.82   32.16 -130.18
  21 1oct GC-arg B.DG225      A.DC207      C.ARG105       -10.47  -57.77   -9.76    3.61   -1.15    7.66  -53.29  -21.80
  22 1oct GC-ser B.DG225      A.DC207      C.SER43         10.70 -125.48   -8.19    6.60   -1.95  -34.20   69.17 -108.25
  23 1oct GC-thr A.DG206      B.DC226      C.THR45         -7.64  176.19   -2.23   -7.21    1.19   26.03  -13.35  176.06
  24 1oct GC-thr B.DG225      A.DC207      C.THR45          8.13 -137.49   -5.46    5.69    1.99   29.38    9.20 -135.83

****************************************************************************
List of 18 pair-wise phosphate-group/amino-acid H-bonding interactions
      id  nt-aa  nt           aa          H-bonds
   1 1oct T-arg A.DT203      C.ARG20      2:OP1-NH2[2.74],OP2-NH2[3.02]
   2 1oct T-gln A.DT203      C.GLN27      1:OP2-N[3.19]
   3 1oct A-gln A.DA204      C.GLN27      1:OP2-NE2[2.79]
   4 1oct A-ser A.DA204      C.SER48      1:OP2-OG[2.60]
   5 1oct A-arg A.DA208      C.ARG113     1:OP1-NH2[2.69]
   6 1oct A-thr A.DA210      C.THR106     2:OP1-N[2.96],OP1-OG1[2.37]
   7 1oct T-lys A.DT211      C.LYS103     1:OP1-N[3.10]
   8 1oct C-lys B.DC217      C.LYS142     1:OP1-NZ[3.64]
   9 1oct C-ser B.DC217      C.SER128     1:O5'-OG[3.41]
  10 1oct C-arg B.DC218      C.ARG146     1:OP2-NE[2.90]
  11 1oct T-lys B.DT219      C.LYS125     1:OP1-NZ[3.21]
  12 1oct T-arg B.DT219      C.ARG153     2:OP2-NE[3.04],OP2-NH2[2.58]
  13 1oct T-ser B.DT223      C.SER56      1:OP2-OG[2.61]
  14 1oct T-lys B.DT224      C.LYS62      1:OP1-NZ[2.42]
  15 1oct T-asn B.DT224      C.ASN59      1:OP2-ND2[2.83]
  16 1oct G-ser B.DG225      C.SER43      3:OP1-N[3.26],OP2-N[3.12],OP2-OG[2.74]
  17 1oct G-thr B.DG225      C.THR46      1:OP2-OG1[2.45]
  18 1oct C-arg B.DC226      C.ARG105     1:OP1-N[3.82]

****************************************************************************
List of 5 pair-wise sugar/amino-acid H-bonding interactions
      id  nt-aa  nt           aa          H-bonds
   1 1oct A-arg A.DA208      C.ARG105     2:O4'-NH2[3.40],N3-NH1[3.23]
   2 1oct A-thr A.DA209      C.THR106     1:O3'-N[3.48]
   3 1oct C-ser B.DC217      C.SER128     1:O5'-OG[3.41]
   4 1oct C-lys B.DC218      C.LYS125     1:O3'-NZ[3.44]
   5 1oct C-arg B.DC218      C.ARG153     1:O3'-NH2[3.14]

****************************************************************************
List of 10 base/amino-acid H-bonding interactions
      id  nt-aa  nt           aa          H-bonds
   1 1oct A-gln A.DA204      C.GLN44      2:N7-NE2[3.16],N6-OE1[3.37]
   2 1oct T-thr A.DT205      C.THR45      1:O4-OG1[2.93]
   3 1oct G-arg A.DG206      C.ARG49      1:N7-NH2[3.44]
   4 1oct A-arg A.DA208      C.ARG105     2:O4'-NH2[3.40],N3-NH1[3.23]
   5 1oct A-arg A.DA210      C.ARG102     1:N3-NH2[3.85]
   6 1oct A-asn A.DA210      C.ASN151     2:N7-ND2[3.47],N6-OD1[3.54]
   7 1oct T-arg B.DT223      C.ARG102     1:O2-NH2[3.69]
   8 1oct G-arg B.DG225      C.ARG49      2:O6-NH1[2.84],O6-NH2[3.21]
   9 1oct C-thr B.DC226      C.THR45      1:N4-OG1[3.22]
  10 1oct A-thr B.DA227      C.THR45      1:N6-OG1[3.76]

****************************************************************************
List of 7 base/amino-acid pseudo pairs
      id  nt-aa  nt           aa      vertical-distance   plane-angle
   1 1oct A-gln A.DA204      C.GLN44         0.05              9
   2 1oct G-arg A.DG206      C.ARG49         1.68             26
   3 1oct A-arg A.DA208      C.ARG105        0.33             44
   4 1oct A-arg A.DA210      C.ARG102        2.12             37
   5 1oct A-asn A.DA210      C.ASN151        0.70             27
   6 1oct T-arg B.DT223      C.ARG102        1.17             49
   7 1oct G-arg B.DG225      C.ARG49         0.44             32

****************************************************************************
List of 1 base/amino-acid pseudo stacks
      id  nt-aa  nt           aa      vertical-distance   plane-angle
   1 1oct G-arg B.DG225      C.ARG105        3.31             37
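
For downstream analysis, rows of the output above can be read with a few lines of Python. This is a minimal sketch, assuming the columns stay whitespace-separated as in this sample; the function names are hypothetical and not part of SNAP. In the rows checked here, the magnitude of Tdst matches the length of (Tx, Ty, Tz) to within rounding. That is an observation from the printed numbers, not a documented definition.

```python
import math

PARAM_KEYS = ("Tdst", "Rdst", "Tx", "Ty", "Tz", "Rx", "Ry", "Rz")

def parse_interaction_row(line):
    """Parse one row of the nucleotide/amino-acid interactions table."""
    fields = line.split()
    row = {"index": int(fields[0]), "id": fields[1],
           "nt-aa": fields[2], "nt": fields[3], "aa": fields[4]}
    row.update(zip(PARAM_KEYS, (float(x) for x in fields[5:13])))
    return row

def parse_hbonds(field):
    """Parse an H-bonds field such as '2:OP1-NH2[2.74],OP2-NH2[3.02]'."""
    count, _, bonds = field.partition(":")
    parsed = []
    for item in bonds.split(","):
        atoms, _, dist = item.rstrip("]").partition("[")
        nt_atom, _, aa_atom = atoms.partition("-")
        parsed.append((nt_atom, aa_atom, float(dist)))
    if len(parsed) != int(count):
        raise ValueError(f"expected {count} H-bonds, found {len(parsed)}")
    return parsed

# First row of the nucleotide/amino-acid table in the sample run.
row = parse_interaction_row(
    "   1 1oct A-arg A.DA208      C.ARG105        10.98  -45.39  -10.37"
    "   -1.40   -3.30   18.22   38.36  -16.42")
norm = math.sqrt(row["Tx"] ** 2 + row["Ty"] ** 2 + row["Tz"] ** 2)

# First entry of the phosphate-group H-bond list in the sample run.
hbonds = parse_hbonds("2:OP1-NH2[2.74],OP2-NH2[3.02]")
```

Each parsed H-bond is a (nucleotide atom, amino-acid atom, distance) tuple, with the distance as printed in the square brackets.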
« Last Edit: April 10, 2017, 05:28:35 pm by xiangjun »
Dr. Xiang-Jun Lu [律祥俊]
Email: xiangjun@x3dna.org
Homepage: http://x3dna.org/
Forum: http://forum.x3dna.org/

 

Created and maintained by Dr. Xiang-Jun Lu [律祥俊] · Supported by the NIH grant R01GM096889 · Dr. Lu is currently a member of the Bussemaker Laboratory at the Department of Biological Sciences, Columbia University. The project is in collaboration with the Olson Laboratory at Rutgers, where 3DNA got started.