Azara, v2.8, copyright (C) 1993-2010 Wayne Boucher
	and Department of Biochemistry, University of Cambridge.

Help for Azara suite of programs.

Date help created:  28 Dec 1993
Date last updated:  06 Apr 2010
If accessing this information from the azara program: When the help pauses, hit <carriage return> to continue, or any character followed by <carriage return> to stop.

e-mail address (bugs, etc.): azara@bioc.cam.ac.uk

Azara is a suite of programs to process and view NMR data. Copies of the source code are available from the above address. See the LICENSE for the terms and conditions of use. See the INSTALL notes for installation information.

The programs are available via anonymous ftp. See README-2.8 in the top-level directory. CHANGES are occasionally made to the release code.

The problem of support still needs to be worked out, but one statement definitely holds: no LICENSE, no support.

First, a quick guide to the programs currently available. [Motif] means that the Motif libraries are needed to compile the program, and an X server is needed to run it.

process :

	A general multi-dimensional NMR processing program.
	It can be used just to convert unblocked data to
	blocked data, for use in the other programs.

plot2 : [Motif]

	Allows contouring and viewing of (2-dimensional) planes
	from one or more data files, with hardcopy output also
	available.  Also allows (approximate) phasing of
	(1-dimensional) slices (rows or columns) of the planes.

plot1 : [Motif]

	Allows processing and viewing of 1-dimensional data,
	with 'real-time' control over arbitrary parameters.
	Hardcopy output is also available.

connect :

	Matches crosspeaks to one or more pairs of shifts.

contours :

	Contours (two-dimensional) planes from multi-dimensional
	data.  The contours are output in a format suitable for
	use by Per Kraulis' program Ansig.

peak_find :

	Finds extrema in a spectrum, and optionally allows a
	simple parabolic fit of the extrema centers.

peak_fit :

	Fits extrema (magnitude, phase, center and linewidth) in
	a spectrum using process scripts.

combine :

	Combines two or more separate data sets, e.g. by
	adding them together.  Only a couple of combining
	functions are currently defined.

project :

	Projects multi-dimensional data onto chosen dimensions.
	It can also be used to permute the ordering of the
	dimensions of data.  In particular, any 2 dimensions
	of a multi-dimensional data file can be transposed.

extract :

	Extracts (hyper)planes from multi-dimensional data.
	Useful for testing process on smaller data sets.

deflate :

	Compresses data by zeroing all data below a specified
	level (in absolute value), and then using 'run-length'
	encoding.  Can be used as input to Per Kraulis'
	program Ansig.

reflate :

	Uncompresses data compressed using deflate.

unblock :

	Converts blocked data to unblocked (i.e., sequential)
	data.  This provides a possible route to importing
	data into other programs.

Each program has (most of) its source code in its own directory. To find out more about a given program, type

	<program> help

There is also a Python module:

DataRows :

	Allows Python access to blocked data files a row at a time.

There are some other directories.

global :

	Contains source code that is used by more than one program.

utility :

	Contains miscellaneous utility programs (e.g. 'bin2asc',
	which converts binary data to ascii).  See the README file
	in the utility directory for a description of programs.

bin :

	Contains copies of (links to) the programs (executables).

help :

	Contains the source (text) for all the help files.

html :

	Contains HTML files for use with Web browsers.
	This is the recommended way for viewing the help files.

azara :

	Typing 'azara help' prints out this information.

The normal entry point into the suite is via the program process. All the other programs assume that the data has a 'blocked' structure. process accepts either blocked or unblocked (sequential) data as input, and automatically creates blocked data from unblocked data.

Blocked data files do not have headers as part of the data file. Instead, associated with every data file is a so-called par file which describes the data. This par file must also exist for unblocked data files. The par file must be created by hand for unblocked data files. All the other par files needed will be created by the programs (except the referencing may need changing).

The par files are plain text and so can be edited, but beware: only the referencing and the file name should be edited.

The processing programs all have an associated 'script' file, which specifies the input par file, the output data file, and whatever other parameters are needed. An output par file will be created, if that makes sense (it does not for contours or unblock, for example). Thus, a typical script file will look like

	input <par file of input data file>
	output  <output data file>
	[other parameters]

The output par file will have the name '<output data file>.par' and will appear in the directory in which the program is run, unless it cannot be created, in which case it will appear in the same directory as the <output data file>.

If another name for the output par file is desired then the following script can be used instead

	input <par file of input data file>
	output  <output data file>
	par <par file of output data file>
	[other parameters]

For explicit examples of script files for a given program, type

	<program> help

The general structure of these script files, and also par files, is that each line will have the form (except for comments)

	<keyword> <one or more parameter values>

All parameter values must be given explicitly for every keyword (i.e. there are no implicit default values), but some keywords are optional (such as par above).

Comments in (non-data) files are everything in a line following an occurrence of the character '!'. Blank lines are allowed. White space separates parameters in a line.
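As an illustration of the line structure just described, here is a minimal Python sketch that splits script or par file text into keywords and parameter values. It is not part of Azara; the function names parse_line and parse_text are hypothetical, and all values are left as strings rather than interpreted.

```python
def parse_line(line):
    """Split one script or par file line into (keyword, values).

    Returns None for blank lines and for lines holding only a comment.
    """
    # Everything from the first '!' to the end of the line is a comment.
    content = line.split("!", 1)[0]
    # White space separates the keyword and its parameter values.
    tokens = content.split()
    if not tokens:
        return None
    return tokens[0], tokens[1:]


def parse_text(text):
    """Parse the text of a whole file into a list of (keyword, values)."""
    parsed = []
    for line in text.splitlines():
        entry = parse_line(line)
        if entry is not None:
            parsed.append(entry)
    return parsed


example = """
! a comment line
ndim 3	! data is 3 dimensional
int
dim 1
npts 1024
"""

for keyword, values in parse_text(example):
    print(keyword, values)
```

Note that this sketch treats every '!' as the start of a comment, so it would mishandle a file name that itself contains '!'.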

Dimensions of data always present a problem with conventions. In Azara the dimension of data that is 'fastest' on disk is 'dim 1', the dimension that is second fastest is 'dim 2', etc. Thus the acquisition dimension in NMR experiments will be dimension 1.

Point counting is another place where there is a problem with conventions. In par files, points are counted in real points, even for complex data. This is to avoid having to specify whether the data is real or complex in the given dimensions. Thus a dimension with 16 complex points would have 32 points in the par file. However, in the program process, commands that need points assume that the count is given in terms of complex points for complex data.
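The two counting conventions can be summarised in a short Python sketch. The helper names par_points and complex_points are hypothetical and only illustrate the arithmetic.

```python
def par_points(n_points, is_complex):
    """Number of (real) points to give after 'npts' in a par file."""
    # Each complex point is stored as two real numbers.
    return 2 * n_points if is_complex else n_points


def complex_points(npts):
    """Number of complex points in a complex dimension with npts real points."""
    if npts % 2 != 0:
        raise ValueError("a complex dimension needs an even number of real points")
    return npts // 2


# The example from the text: 16 complex points are 32 points in the par file.
print(par_points(16, is_complex=True))
print(complex_points(32))
```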

All binary data files exported from the processing programs have the data as 4-byte floating point (with exceptions of the programs contours and deflate, which have some integer data).

For more information about blocked data (and the casual user will not need to know any more), type

	azara help blocked

and for more information about par files (and every user will need to know more), type

	azara help par

blocked

It is easiest to describe blocked data by considering an example. The corresponding statements are true no matter what the dimension.

Let N1, N2, N3 be the number of (real) points in the three dimensions of a three-dimensional data set.

A sequential ordering of the data has N3 sets of (N2 sets of N1 data points). A blocked data file chops up this 'cube' of data into sub-cubes. This makes for faster access of the data in dimensions 2 and 3.

Let B1, B2, B3 be the number of points in the three dimensions of a block (here, sub-cube). Then B = B1 x B2 x B3 is the size of one block.

The first B points in the blocked data file correspond to the first sub-cube of the cube of the sequential data file, the next B points correspond to the second sub-cube, etc. The ordering of the data in a sub-cube is inherited from the ordering of the sequential data file.

A block may be specified by its position in the (blocked) cube in the same way a point may be specified by its position in the (sequentially ordered) cube. This position may either be specified as a 3-vector (thinking of the data geometrically) or as a single number (thinking of the sequential ordering).

Blocked data files always have an integral number of blocks, even if N1 (resp. N2, N3) is not a multiple of B1 (resp. B2, B3). This padding of data can waste a bit of disk space, but such is life. Let M1 (resp. M2, M3) be the smallest multiple of B1 (resp. B2, B3) that is >= N1 (resp. N2, N3). Then the blocked data file is actually of size M1 x M2 x M3.
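The padded sizes can be computed as in the following Python sketch. The function names are hypothetical, and the code works for any number of dimensions.

```python
def padded_size(n, b):
    """Smallest multiple of the block size b that is >= n."""
    return ((n + b - 1) // b) * b


def blocked_shape(npts, block):
    """Return (padded sizes M, number of blocks) for each dimension."""
    padded = [padded_size(n, b) for n, b in zip(npts, block)]
    nblocks = [m // b for m, b in zip(padded, block)]
    return padded, nblocks


# Sizes that are already multiples of the block sizes need no padding.
print(blocked_shape((512, 256, 64), (64, 16, 4)))
# Sizes that are not multiples are rounded up to the next multiple.
print(blocked_shape((100, 30), (64, 16)))
```

In the second case the blocked file holds 128 x 32 points although only 100 x 30 of them are data.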

As an example, consider the (3-vector) point (x1, x2, x3) in the cube. This is position x1 + x2*N1 + x3*N1*N2 in the sequentially ordered file. It is also point (x1 % B1, x2 % B2, x3 % B3) in block (x1/B1, x2/B2, x3/B3). (Here, % means remainder, and x/B means the integral part of the quotient.) Conversely, point (y1, y2, y3) in block (b1, b2, b3) corresponds to the point (y1 + b1*B1, y2 + b2*B2, y3 + b3*B3) in the cube.
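The conversions above can be written out as a Python sketch. The function names are hypothetical. The last function, blocked_index, rests on an assumption drawn from the description above rather than from the Azara source code: that blocks are stored one after another in sequential block order, and that the points within each block are ordered with dimension 1 fastest.

```python
def point_to_block(point, block):
    """Return (block position, position within that block) for a point."""
    block_pos = tuple(x // b for x, b in zip(point, block))
    within = tuple(x % b for x, b in zip(point, block))
    return block_pos, within


def block_to_point(block_pos, within, block):
    """Inverse of point_to_block: recover the point in the cube."""
    return tuple(y + p * b for y, p, b in zip(within, block_pos, block))


def sequential_index(position, sizes):
    """Index with dimension 1 fastest: x1 + x2*N1 + x3*N1*N2 + ..."""
    index = 0
    stride = 1
    for x, n in zip(position, sizes):
        index += x * stride
        stride *= n
    return index


def blocked_index(point, npts, block):
    """Position of a point in the blocked file (ASSUMED layout, see text)."""
    # Number of blocks in each dimension, counting the padded partial blocks.
    nblocks = [(n + b - 1) // b for n, b in zip(npts, block)]
    block_pos, within = point_to_block(point, block)
    block_size = 1
    for b in block:
        block_size *= b
    return (sequential_index(block_pos, nblocks) * block_size
            + sequential_index(within, block))


npts = (512, 256, 64)
block = (64, 16, 4)
point = (70, 20, 5)
print(point_to_block(point, block))
print(sequential_index(point, npts))
print(blocked_index(point, npts, block))
```

All positions here count from 0, as in the formulas above.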

In Azara, B1, B2 and B3 are powers of 2. This is for the convenience of typical NMR processing. However, all of the block access routines are written so that B1, B2 and B3 could be anything. Again, the corresponding statements are true no matter what the dimension.

Blocked files do not have headers; they are just rearrangements of sequential data files. In place of headers there are par files. To find out more information about par files, type

	azara help par

par

A par file is used to describe the dimensions, referencing, etc., of a data set. It is a text file, hence can be edited. A par file must have the following at the very least:

	ndim <number of dimensions of associated data set>
	file <file name of associated data set>

and then for each dimension

	dim <dimension number, from 1 to number of dimensions>
	npts <number of (real) points for this dimension>

Optionally, before the occurrence of the first 'dim', there may be one or more of

	head <length of header of data file, in (4-byte) words>
	int		! integer (i.e. not floating point) data
	swap		! data has wrong byte ordering
	big_endian	! data file has big endian byte ordering
	little_endian	! data file has little endian byte ordering
	deflate <level>	! data has been compressed at <level>
	reflate <level>	! data has been compressed at <level>
			!	and then uncompressed
	blocks <desired block sizes for each of the dimensions>
	varian <dimension ordering>

and for each dimension there may optionally be one or more of

	block <block size for this dimension>
	sw <spectral width in Hz, e.g. 8065>
	sf <spectrometer frequency in MHz, e.g. 600>
	refppm <ref. ppm of ref. point for this dim., e.g. 4.72>
	refpt <reference point for this dimension, e.g. 512.5>
	nuc <nucleus for this dimension, e.g. 1H or 13C or 15N>

and for one dimension there may optionally be

	params ! list of length npts parameters
	sigmas ! list of length npts parameters

If 'block' occurs for one dimension it must occur for all. If it does not occur then the associated data file is assumed to be unblocked (i.e. sequential). Only process allows the data to be unblocked.
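The rule for 'block' can be checked mechanically. The following Python sketch is not Azara code; the function name is_blocked is hypothetical, and it assumes the par file has already been split into (keyword, values) pairs in file order.

```python
def is_blocked(entries):
    """Decide from par file entries whether the data file is blocked.

    entries is a list of (keyword, values) pairs in file order.
    Raises ValueError if 'block' is given for some dimensions only.
    """
    ndim = None
    current_dim = None
    dims_with_block = set()
    for keyword, values in entries:
        if keyword == "ndim":
            ndim = int(values[0])
        elif keyword == "dim":
            current_dim = int(values[0])
        elif keyword == "block":
            if current_dim is None:
                raise ValueError("'block' found before the first 'dim'")
            dims_with_block.add(current_dim)
    if ndim is None:
        raise ValueError("missing 'ndim'")
    if not dims_with_block:
        return False  # no 'block' at all: unblocked (sequential) data
    if dims_with_block != set(range(1, ndim + 1)):
        raise ValueError("'block' must occur for all dimensions or for none")
    return True


unblocked = [("ndim", ["2"]), ("dim", ["1"]), ("npts", ["1024"]),
             ("dim", ["2"]), ("npts", ["256"])]
blocked = [("ndim", ["2"]), ("dim", ["1"]), ("npts", ["512"]), ("block", ["64"]),
           ("dim", ["2"]), ("npts", ["256"]), ("block", ["16"])]
print(is_blocked(unblocked), is_blocked(blocked))
```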

If 'blocks' occurs then the data file must be unblocked, and hence is only relevant for data files that are used as input to process. The specified block sizes are then used for the output data file (otherwise process determines the block sizes).

'varian' should be used for data acquired on Varian spectrometers, and in this case <dimension ordering> specifies the ordering of the data. For example, for a 3D experiment where the data for each FID in the acquisition dimension is ordered as RR, IR, RI, II (R = real, I = imaginary) then 'varian 2 3' would be used, whereas if the data is ordered as RR, RI, IR, II then 'varian 3 2' would be used. Currently this is only allowed (and should only be needed) for process.

'params' is currently only used in plot2 in the fitting module. It can specify, for example, relaxation time or temperature or pH that is being varied from one plane to the next in a 3D experiment.

'sigmas' is currently only used in plot2 in the fitting module. It can only occur if 'params' occurs, and then in the same dimension. It gives an estimate of the standard deviation of the data, which then allows an estimate of the fitting error.

Any of the referencing information that is missing will be given a default (which will be wrong of course). Only some of the programs make use of the referencing information.

It is suggested that the correct referencing be put in the initial par file (i.e., for the unblocked data), but it should be remembered that process does not modify this information in any way. Alternatively, the referencing in the par file of the spectrum should be edited.

SPECIAL WARNING: Incorrect referencing can be a source of error. You have been warned.

The par file must be created by hand for unblocked data files. All the other par files needed will be created by the programs.

A typical par file for input to process might be (if the data was collected on a Bruker AMX and processed on a Silicon Graphics Indigo, or other Unix machine with the same byte ordering)

	! /usr/people/wb104/edl387/edl387_5.bin.par

	ndim 3	! data is 3 dimensional

	file /usr/people/wb104/edl387/edl387_5.bin
		! name of data file
		! use of the full path name is recommended

	int	! data is integer
	swap	! data has wrong byte ordering

	dim 1		! dimension 1 parameters
	npts 1024	! 1024 (real) points

	dim 2		! dimension 2 parameters
	npts 256	! 256 (real) points

	dim 3		! dimension 3 parameters
	npts 64		! 64 (real) points

and using this, a typical par file output by process might be

	! /usr/people/wb104/edl387/edl387_5.spc.par

	ndim 3
	file /usr/people/wb104/edl387/edl387_5.spc

	dim 1
	npts 512
	block 64
	sw 1000.00
	sf 500.00
	refppm 1.00
	refpt 1.0
	nuc 1H

	dim 2
	npts 256
	block 16
	sw 1000.00
	sf 500.00
	refppm 1.00
	refpt 1.0
	nuc 1H

	dim 3
	npts 64
	block 4
	sw 1000.00
	sf 500.00
	refppm 1.00
	refpt 1.0
	nuc 1H

and this might be then correctly referenced to read

	! /usr/people/wb104/edl387/edl387_5.spc.par

	ndim 3
	file /usr/people/wb104/edl387/edl387_5.spc

	dim 1
	npts 512
	block 64
	sw 4032.8
	sf 600.1
	refppm 4.72
	refpt 512.5
	nuc 1H

	dim 2
	npts 256
	block 16
	sw 8065.5
	sf 600.1
	refppm 4.72
	refpt 128.5
	nuc 1H

	dim 3
	npts 64
	block 4
	sw 1016.7
	sf 60.82
	refppm 117.4
	refpt 32.5
	nuc 15N

This correct referencing could have been put in the original par file.
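Since the par file for unblocked data has to be created by hand, it may be convenient to generate it from a small script. The following Python sketch is not part of Azara; the function make_par_text and its arguments are hypothetical. It only writes keywords described above and does not check that the values are sensible.

```python
def make_par_text(data_file, npts, flags=(), referencing=None):
    """Build the text of a par file for an unblocked data file.

    data_file   -- name of the data file (full path name recommended)
    npts        -- number of (real) points for each dimension, dim 1 first
    flags       -- optional lines such as 'int' or 'swap'
    referencing -- optional list with one dict per dimension, with keys
                   taken from 'sw', 'sf', 'refppm', 'refpt', 'nuc'
    """
    lines = ["ndim %d" % len(npts), "file %s" % data_file]
    # Flags such as 'int' and 'swap' must come before the first 'dim'.
    lines.extend(flags)
    for i, n in enumerate(npts):
        lines.append("")
        lines.append("dim %d" % (i + 1))
        lines.append("npts %d" % n)
        if referencing is not None:
            for keyword in ("sw", "sf", "refppm", "refpt", "nuc"):
                if keyword in referencing[i]:
                    lines.append("%s %s" % (keyword, referencing[i][keyword]))
    return "\n".join(lines) + "\n"


text = make_par_text("/usr/people/wb104/edl387/edl387_5.bin",
                     [1024, 256, 64], flags=["int", "swap"])
print(text)
```

The printed text can be saved under a name such as '<data file>.par' and then checked by eye before it is used as input to process.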

Azara help: azara / W. Boucher / azara@bioc.cam.ac.uk