Bioinformatics Studies of Proteins
Biological processes are based on interactions between
proteins, peptides, nucleic acids, lipid bilayers
and ligands. The long-term research goal in the lab
is to find general principles that both describe the
molecular basis underlying these interactions and
provide a framework for predicting the functions of
the various molecules involved. In recent years, our
research has been focused mainly on two routes: (a)
membrane protein systems and (b) protein-protein interactions.
Our approach is based on combining fundamental physicochemical
principles with bioinformatics. Three-dimensional
(3D) structures are central to both routes, and we
use existing theoretical-computational tools and develop
new ones when necessary. Our calculations provide
information unobtainable through structural analysis
of the proteins alone, and the application of bioinformatics
allows us to extend the detailed, quantitative analysis
to annotating functions for whole families of proteins.
(A) Structure, function and motion in membrane proteins
About 20-30% of the genes in all organisms code for
integral membrane proteins. The overexpression and
crystallization of membrane proteins are, however,
difficult, and 3D structures have therefore been determined
for only a few of the close to 20,000 sequences
of transmembrane (TM) proteins currently available
in the SWISS-PROT database. The objective of this
research project is to develop and use algorithmic
tools to predict structure, function and motion in
membrane proteins.
We recently developed a novel method for predicting
preferred conformations of pairs of tightly packed
TM helices (Fleishman & Ben-Tal, 2002). The method
is particularly suitable for cases such as glycophorin
A, where packing is mediated by a GxxxG-like motif.
The motif allows for two small residues to be on consecutive
helix turns in the helix-helix interface. The method
was subsequently used in a search for compact conformations
in the TM domain of a homodimer of the receptor tyrosine
kinase (RTK) erbB2, also known as HER2 (Fleishman
et al., 2002). The domain consists of two TM helices,
one from each monomer, each of which contains two
GxxxG-like motifs, and experiments suggest that both
motifs mediate dimerization. The computational search
yielded two stable conformations of these helices,
corresponding to dimerization via the two motifs,
and we hypothesized that they correspond to the basal
(inactive) and active states of erbB2. Based on this
hypothesis, we explained in molecular detail the effect
of the dozen or so available mutations of this receptor,
including the constitutively active and transforming
mutation denoted as neu*.
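The GxxxG-like motif mentioned above places two small residues four positions apart, so that they fall on consecutive helix turns. A minimal Python sketch of scanning a sequence for this spacing is given here. It assumes that "small" means Gly, Ala or Ser, which is one common choice among several; the function name and the toy sequence are hypothetical, and the scan is not the scoring function of Fleishman & Ben-Tal (2002).

```python
# Assumption: "small" residues are Gly, Ala and Ser; other definitions exist.
SMALL_RESIDUES = frozenset("GAS")


def find_small_xxx_small(sequence, small=SMALL_RESIDUES):
    """Return 0-based positions i where residues i and i+4 are both small."""
    seq = sequence.upper()
    return [
        i
        for i in range(len(seq) - 4)
        if seq[i] in small and seq[i + 4] in small
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real TM domain.
    toy_helix = "LLGVVAGLLSAVLG"
    for start in find_small_xxx_small(toy_helix):
        # Print the start position and the five-residue window it opens.
        print(start, toy_helix[start:start + 5])
```

A hit only marks a candidate interface position; whether the motif actually mediates packing depends on the 3D arrangement of the helices.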
We will improve the methodology further to enable
structure prediction in other TM proteins. Constraints
derived from low-to-medium resolution structural data
obtained from cryo-EM or mutagenesis studies, and
constraints imposed by the loops connecting pairs
of TM helices will be added, and the methodology will
be made to deal with more than two helices. The revised
methodology will be used to study structure, function
and motion in TM receptors, transporters and channels.
Predictions will be tested in collaboration with experimental
(B) Protein-protein interactions
Experimental approaches for the identification of
functionally important regions on the surface of a
protein involve mutagenesis, in which exposed residues
are replaced one after another while the change in
binding to other molecules or changes in activity
are recorded. However, practical considerations limit
the use of these methods to small-scale studies, precluding
a full mapping of all the functionally important residues
on the surface of a protein. A main research direction
in the lab is the development of alternative approaches,
based on the use of evolutionary data on protein families
to identify surface patches that are likely to be
involved in recognition processes. In parallel, we
also characterize inter-protein interfaces using physicochemical
and bioinformatics tools. One of our long-term goals
is to develop algorithmic tools to identify proteins
that may associate with each other, and to dock them
to each other if possible.
The rate of evolution is not constant among amino-acid
sites; some positions are highly conserved while others
vary substantially. These rate variations correspond
to different levels of purifying selection acting
on these sites. This purifying selection can be the
result of geometrical constraints on the folding of
the protein into its 3D structure, constraints at
amino-acid sites involved in enzymatic activity or
in ligand binding or, alternatively, at amino-acid
sites that take part in protein-protein interactions.
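The site-to-site variation described above can be illustrated with a crude per-column measure on a multiple sequence alignment. The Python sketch below uses Shannon entropy as a simple stand-in for an evolutionary rate; this is an assumption made for illustration only, since entropy ignores the phylogenetic relations among the sequences. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps ('-')."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    # Sum of p * log2(1/p) over the residue types observed in the column.
    return sum((k / n) * math.log2(n / k) for k in counts.values())


def site_variability(alignment):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]


if __name__ == "__main__":
    # Hypothetical toy alignment: column 0 is invariant, column 2 fully variable.
    toy_alignment = ["ACD", "ACE", "ACF", "AGG"]
    for position, score in enumerate(site_variability(toy_alignment)):
        print(position, round(score, 3))
```

Low entropy marks a conserved column and high entropy a variable one; a tree-based rate estimate would additionally account for how closely related the sequences are.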
We developed and tested two new algorithmic tools for
mapping amino acid conservation onto the molecular
surface of proteins: ConSurf (Armon et al., 2001)
and Rate4Site (Pupko et al., 2002). Very recently
we developed the ConSurf web server (http://consurf.tau.ac.il/),
a web-based tool that uses the ConSurf
and Rate4Site algorithms. Given the 3D structure of
a protein, or preferably a domain, as an input,
the server automatically collects its close sequence homologues,
calculates conservation grades based on the phylogenetic
relations among them and maps the grades onto the
van der Waals surface of the protein. The protein,
with the conservation grades color-coded onto its
surface, can finally be visualized on-line (Fig. 1).
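The final step, turning per-residue scores into colors, can be sketched as a simple binning. The Python example below is a rough illustration, not the server's actual grading scheme: it assumes that lower scores mean higher conservation, standardizes the scores, and uses a hypothetical cutoff of 0.5 standard deviations to split residues into the three classes named in the caption of Fig. 1.

```python
from statistics import mean, pstdev


def color_classes(rates, cutoff=0.5):
    """Assign 'maroon', 'white' or 'turquoise' to each per-residue score.

    Assumptions (illustrative only): a low score means a conserved site,
    and `cutoff` is a hypothetical threshold in standard-deviation units.
    """
    mu = mean(rates)
    sigma = pstdev(rates)
    if sigma == 0:
        # All sites score the same, so none stands out as conserved or variable.
        return ["white"] * len(rates)
    colors = []
    for rate in rates:
        z = (rate - mu) / sigma
        if z < -cutoff:
            colors.append("maroon")      # conserved
        elif z > cutoff:
            colors.append("turquoise")   # variable
        else:
            colors.append("white")       # average
    return colors


if __name__ == "__main__":
    # Hypothetical scores for five residues.
    print(color_classes([0.1, 0.2, 1.0, 1.9, 2.5]))
```

The resulting labels could then be written out per residue for a molecular viewer to display; the output format for that step is left open here.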
Fig. 1. Evolutionary conservation
pattern in the potassium ion channel. The channel
is represented as a spacefill model, with each atom
represented as a sphere. Scores, representing the
degree of evolutionary conservation of the amino acids,
are color-coded onto the structure of the channel.
Evolutionarily conserved amino acids are colored maroon,
residues of average conservation are white, and variable
amino acids are turquoise. A potassium ion is shown
in yellow. The residues that are involved in ion binding
are highly conserved. The picture was produced using
(A complete list since 1998, including PDF copies, is provided at
http://ashtoret.tau.ac.il/ under “Manuscripts”.)
- Kessel, A. and Ben-Tal, N. (2002) Free energy
determinants of peptide association with lipid bilayers.
Current Topics in Membranes: Peptide-Lipid Interactions
52: 205-253 (Sydney Simon and Thomas McIntosh, Eds.),
Academic Press, San Diego.
- Fleishman, S.J. and Ben-Tal, N. (2002) A novel
scoring function for predicting the conformations
of tightly packed pairs of transmembrane α-helices.
J. Mol. Biol. 321: 363-378.
- Fleishman, S.J., Schlessinger, J. and Ben-Tal,
N. (2002) A putative molecular-activation switch
in the transmembrane domain of erbB2. Proc. Natl.
Acad. Sci. USA 99: 15937-15940.
- Armon, A., Graur, D. and Ben-Tal, N. (2001) ConSurf:
an algorithmic tool for the identification of functional
regions in proteins by surface-mapping of phylogenetic
information. J. Mol. Biol. 307: 447-463.
- Pupko, T., Bell, R.E., Mayrose, I., Glaser, F.
and Ben-Tal, N. (2002) Rate4Site: an algorithmic
tool for the identification of functional regions
in proteins by surface mapping of evolutionary determinants
within their homologues. Bioinformatics 18 Suppl.