2D HSQC Screening, Multivariate Analysis, and Chemical Shift Titration
Many applications of 1D and 2D NMR spectral series analysis have found use in
the drug design and evaluation process. In particular, the elegant approach
of Fesik and coworkers, which uses protein 2D HSQC perturbation screening as
part of a comprehensive scheme for drug design, is now well known (Shuker
et al., 1996).
NMRPipe facilities have been used to create special applications
to address a variety of HSQC analysis tasks.
Some of the applications are suitable for any collection of related
HSQC spectra acquired with similar parameters, and can be used
with collections of several spectra or several hundred. Other
applications are intended primarily for HSQC chemical shift titration series.
The AutoProc application automatically processes a related series of
HSQC spectra, with automatic phase correction of the directly-detected
dimension. It uses as input a list of the spectra to be processed,
and a representative NMRPipe conversion script for any one of the
individual spectra in the series.
The TitrView application follows the change in position of one or
more peaks in a spectral series. The results are summarized in a single
table. As such, TitrView is a complement to PCAView,
which does not use peak table information.
Facilities are provided for automated peak picking of the series,
automated tracking of each peak's position, and interactive adjustment
of the results, to insert, move or delete peaks.
In the TitrView graphical interface,
the initial results are found automatically, and confirmed and adjusted
interactively. In the display, a given row follows the evolution of a peak's position
over the series, as indicated by cursor lines; the first entry in each
row shows the given region drawn in overlay for all spectra in the series.
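The automated tracking step can be pictured with a minimal sketch. This is not TitrView's actual algorithm, which is not described here; it is a simple nearest-neighbor illustration. The function name track_peak, the 15N scaling factor n_scale, and the step limit max_step are all hypothetical choices made for this example.

```python
import math

def track_peak(start, peak_lists, n_scale=0.2, max_step=0.1):
    """Follow one peak through a series by nearest-neighbor matching.

    start      -- (h_ppm, n_ppm) position in the first spectrum
    peak_lists -- one list of (h_ppm, n_ppm) picked peaks per later spectrum
    n_scale    -- assumed factor that puts 15N ppm on a 1H-like scale
    max_step   -- assumed largest scaled move allowed between spectra
    Returns one position per spectrum; None marks a spectrum with no match.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], n_scale * (a[1] - b[1]))

    track = [start]
    last = start
    for peaks in peak_lists:
        best = min(peaks, key=lambda p: dist(last, p), default=None)
        if best is not None and dist(last, best) <= max_step:
            track.append(best)
            last = best
        else:
            track.append(None)  # gap left for interactive repair
    return track

# Made-up peak lists for three later spectra in a series.
series = [
    [(8.21, 120.1), (7.50, 115.0)],
    [(8.24, 120.3), (7.50, 115.0)],
    [(8.28, 120.6), (7.51, 115.0)],
]
track = track_peak((8.20, 120.0), series)
```

Entries left as None correspond to the cases that would be inserted, moved, or deleted by hand in the interactive step.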
The ModelTitr application is intended specifically for ligand titration
series. It provides a method for estimating dissociation constants (Kd)
for ligand binding to individual residues in the target protein according to
the 1H or 15N chemical shift evolutions in the HSQC titration series
(Zhou et al., 1996; Johnson et al., 1996).
ModelTitr fits each peak position evolution curve to
estimate a dissociation constant Kd for the corresponding residue.
The application uses the results of TitrView as its input, and works
non-interactively. The results for each curve are summarized in
PostScript plots. Either HN, 15N or a weighted combination of
the two types of shifts can be analyzed.
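As an illustration of this kind of curve fitting, the sketch below fits a single shift evolution curve to the standard 1:1 fast-exchange binding model. The model choice is an assumption, since the exact model used by ModelTitr is not stated here, and the names bound_fraction and fit_kd, the Kd search range, and the synthetic data are hypothetical.

```python
import math

def bound_fraction(p_tot, l_tot, kd):
    """Fraction of protein bound for 1:1 binding (all in the same units)."""
    s = p_tot + l_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)

def fit_kd(p_tot, l_tots, shifts, kd_lo=1e-3, kd_hi=1e5, n_grid=400):
    """Least-squares Kd and maximum shift change from a log-spaced Kd scan.

    For each trial Kd the best maximum shift change has a closed form,
    so only Kd has to be searched. At least one ligand concentration
    must be nonzero.
    """
    best = None
    for i in range(n_grid + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / n_grid)
        f = [bound_fraction(p_tot, l, kd) for l in l_tots]
        d_max = sum(x * y for x, y in zip(f, shifts)) / sum(x * x for x in f)
        sse = sum((y - d_max * x) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, d_max)
    return best[1], best[2]

# Synthetic, noise-free curve: 100 uM protein, true Kd 50 uM, 0.30 ppm at saturation.
ligand = [0.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
observed = [0.30 * bound_fraction(100.0, l, 50.0) for l in ligand]
kd_fit, d_max_fit = fit_kd(100.0, ligand, observed)
```

A grid scan is used only to keep the example free of dependencies; the resolution of the recovered Kd is limited by the grid spacing, and a production fit would normally use a nonlinear least-squares routine and report uncertainties.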
The ShowTitr application is an on-screen interactive alternative
to ModelTitr. The ShowTitr graphical interface
can step through and display the
individual evolution curves, and apply fitting routines to
selected curves via the script fitXY.tcl. The application uses the
results of TitrView as its input. As with ModelTitr, either HN,
15N, or a weighted combination of the two types of shifts can be analyzed.
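One common way to form a weighted combination of the two shift types is a scaled Euclidean distance from a reference position. The sketch below assumes that form; the weight n_weight = 0.2 and the function name combined_shift are hypothetical, and the weighting actually used by ModelTitr and ShowTitr may differ.

```python
import math

def combined_shift(h_ppm, n_ppm, h_ref, n_ref, n_weight=0.2):
    """Combined HN/15N shift change relative to a reference position.

    n_weight is an assumed scale factor that reduces 15N changes to a
    range comparable with 1H changes.
    """
    return math.hypot(h_ppm - h_ref, n_weight * (n_ppm - n_ref))

# Made-up positions of one peak over a three-point titration.
positions = [(8.20, 120.0), (8.23, 120.5), (8.26, 121.0)]
h0, n0 = positions[0]
curve = [combined_shift(h, n, h0, n0) for h, n in positions]
```

The resulting curve is a single evolution value per titration point, which can then be analyzed in the same way as an HN-only or 15N-only curve.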
The PCAView application is a unique approach to spectral series analysis
that allows an entire spectral series to be summarized and evaluated
graphically, without the need for peak picking. Instead, the
complete matrix of intensities for each spectrum is used directly,
and spectra are clustered according to the similarity of their overall
intensities. The clustering is performed interactively
by inspecting the results of Principal Component Analysis (PCA).
As a qualitative screening technique,
this method has been used to highlight cases where spectral perturbations
are due to nonspecific effects such as pH, to reveal cases where meaningful
changes to spectra have occurred, and to identify different modes of binding
(Ross et al., 2000).
The PCA technique can also find use in other types of spectral
series analysis, including 1D metabolite screening of biofluids.
Because the PCA method does not require peak analysis, it can be
quick and effective even in cases where the HSQC spectra are typically not fully resolved.
Multivariate Representations and Principal Component Analysis
The PCAView application makes use of a direct multivariate approach,
where an entire spectrum is represented as
a single object in a multidimensional space. The coordinates of that object
are simply the intensities at each point of the spectrum. There are some
useful properties of this representation. For example:
- Similar spectra will cluster together in the same region of the multidimensional space,
  since their intensities (i.e. their multivariate coordinates) are similar.
- Spectra with similar features but differing intensity will cluster along
  lines and curves in the multidimensional space. This is because they will have
  some intensities in common, and some others which vary continuously.
- Weaker spectra will cluster nearer to the origin of the multivariate space than more intense spectra.
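These properties can be checked on toy data. In the sketch below each "spectrum" is a tiny made-up 2x3 intensity matrix, flattened into a coordinate vector; the helper names flatten and norm are hypothetical and the values are illustrative only.

```python
import math

def flatten(spectrum):
    """Turn a 2D intensity matrix into one multivariate coordinate vector."""
    return [value for row in spectrum for value in row]

def norm(vector):
    """Distance of a spectrum from the origin of the multivariate space."""
    return math.sqrt(sum(x * x for x in vector))

ref     = flatten([[0.0, 9.0, 0.0], [0.0, 0.0, 6.0]])
similar = flatten([[0.0, 9.1, 0.0], [0.0, 0.0, 5.9]])  # nearly the same spectrum
weak    = flatten([[0.0, 0.9, 0.0], [0.0, 0.0, 0.6]])  # same peaks, 10x weaker
shifted = flatten([[9.0, 0.0, 0.0], [0.0, 0.0, 6.0]])  # one peak has moved

d_similar = math.dist(ref, similar)  # small: the spectra cluster together
d_shifted = math.dist(ref, shifted)  # large: the moved peak separates them
```

Here the weak spectrum lies on the line from the origin through ref, with a norm ten times smaller, which is the "cluster along lines" and "nearer to the origin" behavior described in the list.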
The multivariate space can be visualized by projecting it onto a smaller
number of dimensions, by the method of Principal Component Analysis (PCA).
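The first principal component points in the direction of maximum variance
in the data; each subsequent component points in the direction of maximum
remaining variance orthogonal to the components before it.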
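A minimal sketch of such a projection follows, using power iteration on mean-centered data so that no numerical library is needed. This is not PCAView's implementation; the function names, the iteration count, and the five tiny made-up "spectra" are assumptions for illustration.

```python
import math

def center(rows):
    """Subtract the mean spectrum from every spectrum."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def project(rows, axis):
    """Score of each spectrum along one axis."""
    return [sum(x * w for x, w in zip(row, axis)) for row in rows]

def principal_axis(rows, n_iter=100):
    """Leading principal axis of centered rows by power iteration."""
    v = max(rows, key=lambda r: sum(x * x for x in r))  # deterministic start
    for _ in range(n_iter):
        scores = project(rows, v)
        v = [sum(s * row[j] for s, row in zip(scores, rows))
             for j in range(len(v))]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
    return v

def deflate(rows, axis):
    """Remove the part of every spectrum that lies along axis."""
    return [[x - s * w for x, w in zip(row, axis)]
            for row, s in zip(rows, project(rows, axis))]

spectra = [
    [10.0, 0.0, 5.0, 0.0],
    [10.2, 0.0, 5.1, 0.0],
    [9.9, 0.1, 4.9, 0.0],
    [0.0, 10.0, 5.0, 0.0],   # first peak has moved to the second point
    [0.1, 9.8, 5.0, 0.1],
]
data = center(spectra)
pc1 = principal_axis(data)
pc2 = principal_axis(deflate(data, pc1))
scores = list(zip(project(data, pc1), project(data, pc2)))
```

Each pair in scores is one point in a two-dimensional scatter plot; in this toy data the first three spectra and the last two fall on opposite sides of the first component.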
The PCAView application implements the method of Ross and coworkers
for analyzing HSQC drug screening series (Shuker et al., 1996) by
Principal Component Analysis (PCA) (Ross et al., 2000). The
application uses multivariate statistics to provide a graphical
summary of the similarities and differences in a collection of
related spectra, in this case a series of automatically processed
HSQC spectra of roughly 200 samples of a target protein mixed with
various small molecules. Each number in the scatter plot at
lower left of the PCAView graphical interface
represents an entire HSQC spectrum in the series.
The distance between entries in the scatter plot relates to the
degree of similarity between spectra. The spectral window on the
right of the graphical interface allows one or more spectra or regions from the series to be
viewed in overlay.
In the examples shown above, the classes have already been interactively
selected and colored after inspection.
The yellow cluster contains the bulk of the spectra, which are
mostly unchanged. The green cluster is a subgroup of spectra that were
acquired under experimental conditions that make them more intense than
the others. The spectra in the red cluster are all exceptionally weak; the protein
in these samples has aggregated. The magenta cluster
reveals samples that have undergone extensive pH
changes, resulting in systematic elimination of certain peaks.
The blue cluster reveals spectra with collections of peaks which have
moved systematically, indicating that binding has occurred.
References
Johnson, P.E., Tomme, P., Joshi, M.D., and McIntosh, L.P. (1996) Biochemistry,
35, 13895-13906.
Ross, A., Schlotterbeck, G., Klaus, W., and Senn, H. (2000) J. Biomol.
NMR, 16, 139-146.
Shuker, S.B., Hajduk, P.J., Meadows, R.P., and Fesik, S.W. (1996) Science,
274, 1531-1534.
Zhou, M., Harlan, J.E., Wade, W.S., Crosby, S., Ravichandran, K.S., Burakoff,
S.J., and Fesik, S.W. (1996) J. Biol. Chem., 271, 31119-31123.