NMRView scripts (contributed software by S Chandra Shekar)


Short description of the programs

Date: 31 May 2005

See the author's web site for more and newer information.

    Scripts to extract information from, manipulate, or generate cross peak
    (.xpk) files for 'nmrview' (a graphical package for viewing and analyzing
    nmr data, by Bruce Johnson).

    Also contains the directory xpk2nv w/ Fortran77, C and Perl
    codes to simulate, from a given .xpk file, 2d/3d spectra in the frequency domain by
    modelling the spectra as a collection of non-interacting Gaussian oscillators.
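
    To illustrate the modelling idea only (this is not the xpk2nv code), here is
    a minimal Python sketch that sums one 2d Gaussian per peak on a ppm grid. The
    peak tuple layout, the per-peak linewidths and the names simulate_2d and
    make_axis are assumptions made for this example; a real .xpk record carries
    many more fields.

```python
import math


def make_axis(start, stop, n):
    """Return n evenly spaced ppm values from start to stop (inclusive)."""
    step = (stop - start) / (n - 1)
    return [start + k * step for k in range(n)]


def simulate_2d(peaks, x_axis, y_axis):
    """Return a grid (list of rows) with one Gaussian summed in per peak.

    peaks: iterable of (x_ppm, y_ppm, height, x_sigma, y_sigma);
           this layout is hypothetical, not the .xpk format.
    x_axis, y_axis: lists of ppm values at the grid points.
    """
    grid = [[0.0] * len(x_axis) for _ in y_axis]
    for x0, y0, height, sx, sy in peaks:
        # Non-interacting peaks: each one is a separable product of 1d Gaussians
        gx = [math.exp(-0.5 * ((x - x0) / sx) ** 2) for x in x_axis]
        gy = [math.exp(-0.5 * ((y - y0) / sy) ** 2) for y in y_axis]
        for j, wy in enumerate(gy):
            row = grid[j]
            for i, wx in enumerate(gx):
                row[i] += height * wy * wx
    return grid


h_axis = make_axis(6.0, 10.0, 401)      # proton axis, 0.01 ppm per point
n_axis = make_axis(100.0, 130.0, 301)   # nitrogen axis, 0.1 ppm per point
spectrum = simulate_2d([(8.44, 105.4, 18.9, 0.02, 0.3)], h_axis, n_axis)
```

    Because the peaks do not interact, the spectrum is simply the sum of the
    individual line shapes, which is what makes a fast, compromise version of
    such a simulation possible (e.g. by truncating each Gaussian far from its
    centre).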

    The directory is organized into subdirectories containing mostly Perl scripts,
    with some c-shell, awk and a few sed scripts - there are about 225 in all.
    The usage information for almost all the scripts can be obtained just by typing
    the name of the script (first make the script executable where that is
    meaningful; e.g., c-shell and perl scripts can be made executable, but it is not
    meaningful to do so for awk and sed scripts). The bulk of the remaining scripts
    have the usage information written in the code itself (so look near the top of
    the source code).

    NOTE: some of the scripts may use bmrb chemical shift information (so you
    may need to provide that table). There is such a file, "bmrb.cs", in the
    main directory, but you may want to use the latest updated table
    from bmrb (bioMagResBank). Also, if any of the scripts have the path to
    bmrb.cs hard-wired into them, you may have to edit it suitably.

Name Type Purpose
aaShift csh extract chem.shifts for a given type of amino acid from a bmrb
(bioMagResBank) table (in the form of a text file; the pathname for the table
should be changed according to local needs); cacbCS --- related script to
extract just the ca and cb shifts for the given amino acid type
perl reformat nmrview .xpk file for use with the auto-assignment package
usage: file.xpk exptName
bmrb.cs txt is an input file for some of the scripts in this tree;
it was copied from the bioMagResBank (bmrb) web site sometime in 2002; you may
want to use the updated information from the bmrb web site, and you can place the
file anywhere you wish; but if a script has the pathname of bmrb.cs hardwired
into it, you may have to edit the script to make the pathname correct.
bmrb2tatapro/ directory a set of awk scripts to transform .xpk files into a format suitable
for use by the auto-assignment code tatapro
bmrb2xpk/ directory mostly perl scripts to generate, from a bmrb chemical shift (cs) table,
an nmrview .xpk file corresponding to a given type of nmr experiment;
for e.g., one generates a 3d .xpk file corresponding to the 3d, 3res
hncacb experiment and fills in the assignment boxes, while another generates the .xpk file w/o the assignments,
etc.; almost all of them have a variant to carry over the assignment information
into the .xpk file from table files based on the bmrb data base.
cacbCS (c)sh see entry for "aaShift" above; this is one example where the
name (i mean, full name including the pathname) for "bmrb.cs" has to be edited
diff.{awk,sed} awk/sed reformat output from unix command "diff"; see entry for sub{,.sed}
findXpk/ directory a bunch of perl and awk scripts to do various locating operations
regarding peaks in an nmr data set; the most extensively used (and hence
the most reliable and relevant?) of these are,; "fb" stands
for forward/backward;
usage: {hn,file{A,B}}.xpk c1 c2 [htol] [ctol] [ntol]
(dflts) htol=0.02 ctol=0.4 ntol=0.325
see also: ~/xpkScrpts/findXpk/
above, we have used c-shell expansion notation in which {a,b}.xpk means
a.xpk b.xpk
anyhow, as the "usage" line above shows (it will appear if just the
name of the perl script is typed), this script, given an hsqc.xpk,
two other .xpk files corresponding to a 'sister' pair of 3d 3res experiments
(giving the intra- and inter-residue connectivities, for example
hncacb and hncocacb), and two carbon shifts w/ the same h and n frequencies,
will try to find all possible matches arising from different h and n
frequencies; depending on whether the intra or the inter .xpk file is
the 2nd .xpk file (hsqc.xpk is always the 1st .xpk file in the command line
argument list), the possible matches are in either the forward (c-terminal)
or the backward direction.
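
The matching step described in this entry can be pictured as a search within tolerances. The Python sketch below is only an illustration under simplifying assumptions: it uses a single 3d peak list rather than the sister pair, the tuple layouts and the name find_roots are hypothetical, and only the default tolerances are taken from the usage line above.

```python
def find_roots(hsqc, peaks3d, c1, c2, htol=0.02, ctol=0.4, ntol=0.325):
    """Return hsqc (h, n) pairs whose 3d strip holds both carbon shifts c1 and c2.

    hsqc:    list of (h_ppm, n_ppm)
    peaks3d: list of (h_ppm, c_ppm, n_ppm)
    Both layouts are hypothetical simplifications of .xpk records.
    """
    matches = []
    for h, n in hsqc:
        # carbon shifts of every 3d peak lying on this (h, n) root
        strip = [c for (h3, c, n3) in peaks3d
                 if abs(h3 - h) <= htol and abs(n3 - n) <= ntol]
        has_c1 = any(abs(c - c1) <= ctol for c in strip)
        has_c2 = any(abs(c - c2) <= ctol for c in strip)
        if has_c1 and has_c2:
            matches.append((h, n))
    return matches


hsqc = [(8.44, 105.4), (8.18, 104.4)]
peaks3d = [(8.442, 55.092, 105.429), (8.436, 45.884, 105.426),
           (8.183, 45.447, 104.417), (8.181, 42.526, 104.406)]
print(find_roots(hsqc, peaks3d, 55.0, 45.9))   # prints [(8.44, 105.4)]
```

Whether a hit is read as a forward or a backward neighbour then depends only on which of the two 3d peak lists was searched, as explained above.
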
genXpk/ directory
to generate various types of peak lists (even modifications
such as renumbering the peaks in a continuum of natural numbers are treated as
generating a new peak list); some of the scripts that i have found very useful:
0) hsqcFile{1,2} [h1tol] [n15tol]
(default) h1tol=0.05 n15tol=0.3125
this perl script transfers assignments from one 2d .xpk file to another
2d .xpk file, if the peaks match in frequency and if the peak from the 2nd
.xpk file is unassigned! it does not touch the assigned xpk's in the 2nd
.xpk file.

...and its various versions, which have different selection criteria for
printing the matched/unmatched peaks from the 2nd .xpk file and also differ in the
way the output is ordered. some of the readme files in this folder may come in
handy, but the best way to get familiar with these scripts is to do just that
--- explore!

1) (and similar scripts) hsqc.xpk hcn.xpk [h1tol] [n15tol]
(default) h1tol=0.05 n15tol=0.3125
fill in two of the three assignable boxes in a 3d 3res .xpk file from the
assignments present in a 2d hsqc .xpk file
etc. filter 2d/3d .xpk files based on whether the peaks in the file match a given
2d .xpk file in 2 pre-specified coordinate/frequency axes;

2) -- this is one of those scripts which doesn't print usage
information when entered w/o arguments; it just waits for input ==> it can operate
on data streaming thro' a unix pipe, on unix file redirection, or on a file given as the
argument to the command (which is just the name of the perl script);
formats a given xpk file
enables using "diff" on different xpk files more easily
most importantly, tries to keep the length of each record (representing
one peak) to a minimum, thus reducing the required window size; 3d .xpk
file records can typically fit on a standard monitor screen this way.
(also see the script xtrctEtc/ which can take an .xpk file
and extract only the most important fields in a record,
thus making viewing the .xpk files extremely painless).

3), convert the shifts from hsqc -> trosy and
vice versa for peaks in an .xpk file

4) ... converts peak tables in xeasy format to .xpk files for

5) will rewrite the .xpk file such that the assigned
xpk's are in increasing order of the residue numbers to which they are assigned.

6) hsqc.xpk cCarrierPpm cLbl
generates a pseudo 3d 3-res .xpk file from an hsqc/trosy 2d .xpk file, w/
the 3rd (typically carbon) dimension of every peak set to a single
user-specified frequency

7), renumbers the peak numbers in an .xpk file

8) xpkFile <offset>
(dflt offset=0)
this xtremely useful script renumbers the assignments in an .xpk file
(very useful when starting amino acid number in a given sequence is changed,
based on one's need of the moment)

9) some of the above perl scripts have "awk" versions in this directory
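
As an illustration of the assignment transfer described in item 0) of this entry, here is a minimal Python sketch. It is not the perl script itself: the dict-based peak layout and the name transfer_assignments are hypothetical, and only the default tolerances (h1tol=0.05, n15tol=0.3125) come from the usage line above.

```python
def transfer_assignments(source, target, h1tol=0.05, n15tol=0.3125):
    """Copy labels from source peaks onto unassigned target peaks.

    Each peak is a dict with keys 'h', 'n' and 'label' (None if unassigned);
    this layout is a hypothetical stand-in for a 2d .xpk record.
    Returns a new list; assigned target peaks are left untouched.
    """
    result = []
    for peak in target:
        updated = dict(peak)
        if updated["label"] is None:
            for src in source:
                if (src["label"] is not None
                        and abs(src["h"] - peak["h"]) <= h1tol
                        and abs(src["n"] - peak["n"]) <= n15tol):
                    updated["label"] = src["label"]
                    break  # first match wins in this sketch
        result.append(updated)
    return result


source = [{"h": 8.44, "n": 105.4, "label": "68"},
          {"h": 8.18, "n": 104.4, "label": "71"}]
target = [{"h": 8.45, "n": 105.5, "label": None},   # unassigned, matches "68"
          {"h": 8.19, "n": 104.3, "label": "99"},   # already assigned, kept
          {"h": 7.00, "n": 120.0, "label": None}]   # no match, stays unassigned
print([p["label"] for p in transfer_assignments(source, target)])
# prints ['68', '99', None]
```

Taking the first match is a simplification; when several source peaks fall within the tolerances, a real script would need some rule (closest peak, or reporting the ambiguity) to choose between them.
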
meg/ directory Prof. Mark E. Girvin's Perl scripts.
Serve to assign a given sequence of connected cross peaks to a given sequence of
a.a. residues; i.e. translate peak connectivities into a possible sequence assignment
nmrDrw2nv/ perl convert nmrDraw generated peak tables to nmrview .xpk file; there is
a sample input file in the same directory;
usage: nmrDrw2nv/ parFile

sample parFile
# lblX lblY
hn n
# swX swY
5204.34 2603.96
# larmorX larmorY
800.2338 81.0963
# nmDrawXpkFile
plScrpt{,0} csh to generate a gnuplot script and execute the generated gnuplot script,
which in turn generates a postscript file (look at the usage info' in the script)
scrpts/ directory a small set of miscellaneous small scripts; just take a look
and try; some of them are not finished; look for any updates via link to my
web site or e.mail me
seq/ directory collection of awk and perl scripts to output a protein sequence in
various formats (some required for other popular and not so popular
packages/programs such as nmrview, tatapro (an automated nmr data to protein
sequence assignment package), etc.). Consult the readme files therein and
just try, take a look, whatever
sl/ directory scripts here are extensively used by the author for his day-to-day work
w/ spin labeled nmr studies of the protein under study;
0) -- perl script to calculate the amount of reducing agent
(phenyl hydrazine) to be added to the nmr sample;
usage: ~/xpkScrpts/sl/ protMolarConcn(mM) sampleVol vol2bAdded

usage: {oxd,red}.xpk tauc <sclFctr> <oxdNoiseLvl> <redNoiseLvl>
default sclFctr=1, noiseLvl=0
calculate intensity ratios from 'oxidized' and 'reduced' hsqc nmr data set
nmrview peak lists; it also calculates the distances from the spin labeled
site to the xpk in question. it only deals w/ "assigned" peaks; however
the directory contains scripts which calculate the intensity ratios for any
two matching (in both frequency dimensions) cross peaks from the data pair

usage: xpkScrpts/sl/ parFile
======================== sampleParFile ============================
# oxd.xpk
# red.xpk
# tauC
# mutResNum
# cutOffRatio (above which, ir = cutOffRatio, d-d(cutOffRatio)<=100
# sclFctr

calculates the 'binned' distances and generates a constraint file for use w/
cns (crystallography and nmr system) and xplorNih, packages for structure
refinement via molecular dynamics.
( for e.g. generates constraint files for Dyana/Cyana packages)

there are also a variety of other scripts here, for e.g. to generate color codes
and build the color codes into pdb files so that molecule viewing graphics
programs such as molmol can display the residues according to the color code
(for e.g., the intensity ratios can be color coded); the color coding is
achieved by assigning "temperature" factors; this is Prof. Girvin's idea.
strngs/ directory scripts to build chains of connectivity from 3res 3d/4d data (under
development, but the scripts work surprisingly well, even at this stage)
sub{,.sed} (c)sh/sed reformats output of unix command "diff" to make it very useful in
terms of sorting and other operations
tubes/ directory awk scripts put together when i was still new to heavy duty solution
nmr of proteins; the idea was to get the chains of connected xpk's in 3res 3d
nmr data; not worked on in a long time. just left there so....
unfold.awk awk "awk" coding of the aliasing and unaliasing formulae that i came up
with quite some time ago; interesting behavior; may still need some work.
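
The author's own formulae are not given here. Purely as an illustration of what aliasing and unaliasing involve, the Python sketch below assumes the simplest case, in which a peak outside the spectral window wraps around by whole multiples of the spectral width (other acquisition schemes reflect the peak instead). The function names and the window parameters are hypothetical.

```python
def alias(true_ppm, low_ppm, sw_ppm):
    """Position at which a peak appears inside the window [low, low + sw).

    Assumes simple wrap-around folding; this is an assumption of the sketch,
    not necessarily what unfold.awk implements.
    """
    return low_ppm + (true_ppm - low_ppm) % sw_ppm


def unalias_candidates(observed_ppm, sw_ppm, max_folds=2):
    """All true positions consistent with an observed peak, up to max_folds wraps."""
    return [observed_ppm + k * sw_ppm for k in range(-max_folds, max_folds + 1)]


print(alias(135.0, 100.0, 30.0))                        # prints 105.0
print(unalias_candidates(105.0, 30.0, max_folds=1))     # prints [75.0, 105.0, 135.0]
```

Unaliasing is not unique on its own: the candidate list has to be narrowed down with outside knowledge, for example the expected chemical shift range for the nucleus in question.
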
xpk2nv/ directory powerful and beautiful and useful c-code (w/ a couple of versions,
including a faster version w/ a reasonable compromise and a 'full' simulation
version) to simulate 3d spectra (3res or otherwise) modeled as a set of
noninteracting gaussian oscillators, using the information that may arise
from .xpk files of nmrview (which in turn can be synthesized from bmrb
data using scripts in the bmrb2xpk directory; see entry for "bmrb2xpk"); but only
FREQUENCY domain data is generated.
fortran77 and perl versions are also presented, but the c-code is the best! (as
fast as the fortran77 code but far more versatile); whatever the means (c,
fortran, perl), the generated binary file is converted to nmrPipe data format
(can be viewed in nmrDraw) via nmrPipe scripts (template scripts are included)
and further converted into nmrview format. one potential application is to
generate frequency domain nD nmr data from bmrb data banks.
xpk2xeasy/ directory contains script to convert nmrview's .xpk file into xeasy format
xtrctEtc/ directory a bunch of scripts to xtract info' from .xpk files; many of them
handle streaming data (i.e. from stdin, so no usage information will be
given by just typing the command w/o arguments; you have to look into the
code itself to know how to use it). There are also some readme files which
may come in handy.

0) for example,
cat ...pathname..../hncacb.xpk | |tail
1563 8.727 44.306 105.548 -1.24897 {} {?} {56.n}
1564 8.442 55.092 105.429 1.65900 {} {?} {68.n}
1565 8.436 45.884 105.426 -1.15888 {} {?} {68.n}
1566 8.440 45.299 105.400 18.89294 {} {?} {68.n}
1568 8.440 39.485 105.365 -3.13722 {} {?} {68.n}
1570 8.461 45.848 105.132 -1.31767 {} {?} {68.n}
1577 8.183 45.447 104.417 27.68684 {} {?} {71.n}
1578 8.181 42.526 104.406 -3.45622 {} {?} {71.n}
1579 8.185 56.074 104.352 3.44162 {} {?} {71.n}
1581 8.181 46.104 104.357 -1.32899 {} {?} {71.n}
extracting the peak number, the proton, carbon and nitrogen chemical shifts,
peak intensity and assignments, if any, from the .xpk file; the .xpk file
had 27 entries for each xpk, and could not fit on a screen w/ 80 characters.
the above output could be further piped into other programs or unix commands
such as lp, grep, sort etc.

1) similarly "assgndXpks.awk" will find all records w/ assignments
awk -f assgndXpks.awk xpkFile OR cat xpkFile|awk -f assgndXpks.awk

2) assgndXpkList{,1} are c-shell scripts that work, respectively, w/ an .xpk file
as an argument and w/ streaming data from stdin, and print to stdout a single column
of the xpk numbers in the .xpk file which have assignment boxes filled in; the
output is a numerically sorted list.

3) cunt{1,2}multiAssgn help extract records in an .xpk file w/ multiple
assignments in them.

4) fmtPpm.awk is a useful script which takes a ppm.out file (from nmrview)
and turns it into a 'horizontal' table (i.e. with all the chemical shifts
for a given residue on a single line)

5) rmmbrStrpsByXpk.awk, to help manage recording/remembering the strips in
strips and strips2 utilities of nmrview
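
As an illustration of the 'horizontal' table idea in item 4), here is a minimal Python sketch. It is not a port of fmtPpm.awk: it assumes that every input line starts with a 'residue.atom' token followed by a shift in ppm, which is a simplification of the ppm.out layout, and the name horizontal_table and the output layout are hypothetical.

```python
def horizontal_table(lines):
    """Group 'res.atom shift' records into one output row per residue.

    Assumes each line starts with a 'residue.atom' token (integer residue
    number) followed by a shift in ppm; any further fields are ignored.
    """
    rows = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or "." not in fields[0]:
            continue  # skip blank or malformed lines
        res, atom = fields[0].split(".", 1)
        rows.setdefault(int(res), []).append(f"{atom}={float(fields[1]):.3f}")
    # one line per residue, in increasing residue order
    return [f"{res} " + " ".join(shifts) for res, shifts in sorted(rows.items())]


sample = ["68.hn 8.440 0", "68.n 105.4 0", "56.hn 8.727 0", "", "68.ca 55.092 0"]
for row in horizontal_table(sample):
    print(row)
# prints:
# 56 hn=8.727
# 68 hn=8.440 n=105.400 ca=55.092
```

The output of such a table can be piped into sort, grep and similar unix commands in the same way as the other streaming scripts in this directory.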