Manual and Examples for ESPript
Options and parameters for ESPript are specified
on the standard input. Thus, it is possible to group all the required values
in an input file.
All commands described below are accessible either by running ESPript on-line or in the Expert mode
of the web interface. Fewer features
are accessible in the Beginner and Advanced modes.
We strongly recommend that you use
the web interface and read the accompanying
tutorial if you are starting with ESPript.
What do the input lines look like?
Input Line 1: Aligned Sequences
| Syntax | Sequence-File | Selected-Range | Start-Index | Extra-Input | PDB-File | CNS-File |
| Example | file.aln | 5-50 | 1 | + | file.pdb | cns.ctct |
| Mode in www | beginner | advanced | advanced | advanced | expert | expert |
-
Sequence-File
-
file name of the aligned sequences - see appendix1
for more details
-
Selected-Range [default: whole sequence]
-
range of residues to be displayed (for example 5-50).
-
Start-Index [default: 1]
-
renumbers residues, so that the first displayed sequence
starts with the specified Start-Index.
Remark: If the first displayed sequence starts
with ATREYES, the command line file.aln 5-4500 2 gives YES, and Y
is numbered as the second residue. Do not enter a Start-Index value if the first residue
is already numbered in file.aln, as explained in appendix1.
You can check the residue numbering of all sequences using option N described in section
Output Layout.
Extra-Input [default: none]
-
specifying a + enables overlay or extra input.
See the overlay example for more details.
-
PDB-File [default: none]
-
name of a PDB file. A PDB output file will be generated
with B-factors replaced by per-residue similarity scores - see appendix2.
-
CNS-File [default: none]
-
name of a CNS file containing a list of intermolecular
contacts - see appendix3.
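The interaction between Selected-Range and Start-Index in the ATREYES remark above can be sketched in a few lines of Python. This is only an illustration of the numbering rule, not ESPript's own code: the function name `select_and_renumber` is hypothetical, and gaps in the first sequence are not handled.

```python
def select_and_renumber(sequence, first, last, start_index=1):
    """Return (number, residue) pairs for the selected range.

    first/last are 1-based positions in the first displayed sequence
    (assumed ungapped in this sketch); the first residue kept is
    numbered start_index.
    """
    kept = sequence[first - 1:last]  # an upper bound past the end is simply clipped
    return [(start_index + i, res) for i, res in enumerate(kept)]


# The example from the remark: file.aln 5-4500 2
numbered = select_and_renumber("ATREYES", 5, 4500, start_index=2)
print(numbered)  # [(2, 'Y'), (3, 'E'), (4, 'S')]
```

The displayed residues are YES and Y carries number 2, as stated in the remark.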
Input Line 2: Secondary Structures
| Syntax | Sec.Str-File | Acc-Disp | Sec.Str-File | Acc-Disp | ScoreConfidence | AutomaticSearch |
| Example | file1.2st | A | file2.2st | A | 9 | all |
| Mode in www | beginner | beginner | advanced | advanced | advanced | advanced |
-
Sec.Str-File
-
Name of the file containing secondary
structure information. By default, displayed secondary elements are extracted from the first monomer, but
you can select a chainID with the command 'chain_': file1.2st chain_A
Remark: Three kinds of layout are used,
depending on whether one or two secondary structure files are supplied:
-
1. If one sec. structure file is supplied (uploaded in the box named
top in webESPript)
-
Sec. str. elements are displayed at the
top of each block of sequences and relative accessibility at the bottom.
-
2. If two sec. structure files are supplied:
-
Sec. str. elements of first file and corresponding accessibility are displayed at the
top of each block
-
Sec. str. elements of second file (uploaded in the box named
bottom in webESPript) and corresponding accessibility are displayed at the
bottom of each block
-
3a. If file1.2st is entered as usual and the string
none is entered
as file2.2st: sec. str. elements and relative accessibility are displayed
at the top of each block - see example1
-
3b. If the string none is entered as file1.2st and file2.2st is entered in turn:
sec. str. elements of second file and relative accessibility are displayed at the bottom
of each block.
-
By default, file1.2st
(top in webESPript) and file2.2st
(bottom in webESPript)
refer to the first and the last displayed sequences. This default can be changed
by using the Special Character X for the first sec. structure file
and Z for the second.
Remark: sec. str. elements can be extracted
by reading the alignment file file.aln,
if you type the character * instead of file1.2st (click on 'extract info' in webESPript).
This * option saves you from typing file.aln
twice and can be used for alignment files from
Predict Protein
or from NPS@, which contain information on predicted sec. structure elements - see example1.
You can also click on 'extract info' in webESPript to extract secondary structure
information with DSSP, if you have uploaded a PDB file in the box 'aligned sequences'.
-
Acc-Disp [default: none]
-
Displays relative accessibility if you upload DSSP or PHD files
as file1.2st or file2.2st.
-
ScoreConfidence [default: 9]
-
When the sec. structure file is a
PHD file, secondary
elements with a reliability of at least ScoreConfidence are highlighted.
If the reliability is below this limit, helices appear as small squiggles, beta strands
as dotted lines and labels are not written - see example1
-
AutomaticSearch [default: none]
- ESPript searches in the directory $DSSP_DIR for files having the same name
as aligned sequences. Thus, secondary structure information
of each aligned sequence with a known 3D structure can be displayed.
This option implies that you have a database of DSSP files installed on your disk. It can be
used with ENDscript, when you search for homologous sequences against the PDB.
Input Line 3: Output
| Syntax | Output-file | NumberingOption | SequenceOutput | BobScriptOutput |
| Example | file.ps | L or M | SEQ | BOB |
| Mode in www | beginner | beginner | expert | expert |
-
Output-file
-
name of the PostScript output file.
-
NumberingOption
-
By default, alpha helices, 3₁₀ helices, 5-helices (pi) and beta strands
are numbered with digits.
With the L option, helices and beta strands
are numbered with letters, starting at 'A'. With the M
option, helices are numbered with digits and strands with letters.
Remark: You can remove all sec. structure labels by using the
Special Character command A S all (button 'hide
labels' in webESPript) if you want to prepare figures such as example5.
-
SequenceOutput
- The SEQ option extracts
a single sequence in one-letter code from a multiple alignment file entered as file.aln.
By default, this sequence corresponds to the first one displayed in the ESPript figure and is written to a file named file.seq.
The extracted sequence can be used in NPS@
or other servers to perform further queries. The SEQ option can also be used to extract sequence information
from a PDB file.
- BobScriptOutput
- The BOB option generates a BOBSCRIPT command file, which is used in turn
to generate a PostScript figure showing secondary structure elements
defined by DSSP or STRIDE and coloured according to sequence similarity (see appendix2).
You can use ENDscript to obtain such
a figure more easily.
Remark: You can use the command BOBL instead of BOB to show the side chains
of strictly conserved residues in the PostScript file.
Input Line 4: Similarity Score
| Syntax | SimilarityGlobalScore | SimilarityDiffScore | SimilarityType | Consensus |
| Example | 0.7 | 0.5 | R, B, P, I or S, M, E | C |
| Mode in www | beginner | advanced | beginner | beginner |
Check appendix for a general view on similarity computation and colour scheme.
-
SimilarityGlobalScore [default: 0.7]
-
If SimilarityType is R, B, P or I: a global score is calculated
for all sequences by extracting all possible pairs of residues per column.
If applicable, a second score is calculated within each group of
sequences.
-
If SimilarityType is S, M or E: a percentage
is calculated for each column of residues.
If the score is greater than SimilarityGlobalScore,
residues are rendered as coloured characters (red characters on a white background by default,
and white characters on a red background if residues are strictly conserved in the column) with frames
(blue by default).
Note that strictly conserved residues are boxed but not framed if you enter a SimilarityGlobalScore greater than 1.
-
SimilarityDiffScore [default: 0.5]
-
Applicable if SimilarityType is R, B, P or I:
residues which are conserved within a group
but not conserved from one group to the other are highlighted (yellow background by default).
-
SimilarityType [default: R]
-
If R, B, P or I: a matrix
is used to calculate the similarity score.
Risler, Blosum62,
Pam250
and Identity are the four possibilities. We recommend a SimilarityGlobalScore
of 0.1-0.2 with B or P matrices and of 0.6-0.7 with R or I
matrices.
-
If S: a percentage of Strictly conserved residues
is calculated per column.
-
If M: a percentage of similarity
is calculated considering the criteria used in
MULTALIN.
- IV ; LM ; FY ; NDQEBZ
-
If E: a percentage of Equivalent residues
is calculated per column, considering physico-chemical properties.
- HKR are polar positive ; DE are polar negative ; STNQ are polar neutral ;
- AVLIM are non polar aliphatic ; FYW are non polar aromatic ; PG ; C
-
Consensus [default: none]
-
A consensus sequence is generated using criteria
from MULTALIN:
uppercase is identity, lowercase is consensus level > 0.5,
! is any one of IV, $ is any one of LM, % is any one of FY, # is
any one of NDQEBZ.
Remark: lowercase is consensus level
> SimilarityGlobalScore if S, M or E is used as SimilarityType.
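To make the Equivalent (E) classes listed above concrete, here is a minimal Python sketch that looks up the physico-chemical class of each residue in a column and reports the share of the most common class. The class lists come from this manual. The scoring rule itself (fraction of the column in its most populated class, with gaps counted in the denominator) is an assumption made for illustration, because the manual does not give ESPript's exact formula. The names `EQUIVALENCE_CLASSES` and `equivalent_fraction` are hypothetical.

```python
from collections import Counter

# Classes listed for SimilarityType E in this manual.
EQUIVALENCE_CLASSES = ["HKR", "DE", "STNQ", "AVLIM", "FYW", "PG", "C"]
CLASS_OF = {res: cls for cls in EQUIVALENCE_CLASSES for res in cls}


def equivalent_fraction(column):
    """Fraction of the column that belongs to its most common class.

    Assumption: gaps and unknown letters belong to no class but still
    count in the denominator.
    """
    if not column:
        return 0.0
    counts = Counter(CLASS_OF[r] for r in column if r in CLASS_OF)
    if not counts:
        return 0.0
    return max(counts.values()) / len(column)


print(equivalent_fraction("KRHK"))  # 1.0, all polar positive
print(equivalent_fraction("DEDK"))  # 0.75, three polar negative out of four
```

Under this reading, a column would be highlighted when the returned fraction exceeds SimilarityGlobalScore.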
Input Line 5: Output Layout
| Syntax | FontSize | ColumnNb | Vgap | Vshift | Hshift | Bshift | PrinterOpt | Paper | AllNumbered |
| Example | 7 | 70 | 6 | 0 | 0 | 0 | C, T, S, B, F | P, L, P3, L3 | N |
| Mode in www | beginner | beginner | beginner | beginner | beginner | advanced | beginner | beginner | beginner |
-
FontSize [default: 7]
-
Size in points for the fonts (Courier for sequence
names and residues).
-
ColumnNb [default: 70]
-
Number of residue columns per line.
Remark: a suggested number of columns is written at the end of the log
file (end of the OUT file in webESPript). This value can be re-entered in
order to obtain a justified figure (look for the sentence
'suggestion columns per line' in the log file or in OUT).
Vgap [default: 6]
-
Vertical gap between two blocks of sequences. The
unit for the distance is the height of a line.
-
Vshift [default: 0]
-
Vertical shift for the whole display. The unit for
the distance is the height of a line.
-
Hshift [default: 0, centred]
-
Horizontal shift for the whole display. The unit
for the distance is the width of a residue.
-
Bshift [default: 0]
-
Shift lines below bottom sequence. The unit
for the distance is the width of a residue.
-
PrinterOpt [default: C]
-
C coloured output ; T coloured with all
letters in bold, ideal for thermal printers, or for any printer when the figure will be reduced in an article ; S light yellow
background, ideal for slides ; B black & white, a grey scale is used
(see standard colours in section Special Characters) ; F flashy
colours, similar residues are written with black bold characters and boxed in yellow, ideal for overheads.
-
Paper [default: P]
-
P: Portrait(A4) ; L: Landscape(A4)
; P3: Portrait(A3) ; L3: Landscape(A3).
-
AllNumbered [default: first sequence]
-
By default, the first sequence is numbered every
ten residues as in example2. With the option N
(click on 'number sequences' in webESPript) all sequences
are numbered at the beginning of each block of sequences as in example3
Input Lines 6: Special Commands and Characters
hide sequences
| Example | @skip |
| Mode in www | advanced |
Aligned sequences are not written (click on 'hide sequence'
in webESPript). '@skip' is a shortcut for the block of
Special Characters below:
I S all ! skip all
F S all ! "
H S all ! "
B S all ! "
O S all ! "
N S all ! "
T S all ! "
Y S all ! "
This option can be used to build a figure including several
secondary structure elements as in example4.
| Example | @pp |
| Mode in www | advanced |
Extra information can be extracted with the command '@pp' if:
1. a mail from the Predict Protein server is entered as file.aln.
- ProDom domains are visualized with yellow bars below each block of sequences
- x marks from the SEG low-complexity [1] search are represented with dotted lines
- peptides resulting from a PROSITE [2] search are shown with bold letters
2. a file from the NPS@ server with multiple sequence alignment and predicted secondary structure elements is entered as file.aln.
- predicted sec. structure elements are shown below each aligned sequence, i.e. helices with squiggles, beta strands with arrows, ambiguous predictions with solid circles
minus or plus
| Example | @minus 5 40 |
| | @plus 63 |
| Mode in www | expert |
Residue numbering can be changed along a single sequence.
If '@minus' is used, numbering is shifted by -1 at the given column (here at columns
5 and 40). If '@plus' is used, residue numbering is shifted by +1 at the
given column. Before using this option, use the command '@ruler' described below to
visualize column numbers.
Remark: '@minus' and '@plus' are equivalent
to the buttons 'delete in seq' and 'insert in seq' in webESPript
Note that by default sequence numbering refers to the first
displayed sequence, but it can refer to the third displayed sequence (for example) if you enter the
Special Command Y D 3.
preview column numbers
| Example | @ruler |
| Mode in www | advanced |
Column numbers are displayed on the PostScript image. This option is useful when preparing a figure with
the special commands '@minus' or '@plus' presented above, or the
Special Characters Q, V, W (section Do_it).
| Example | @seq 5 text |
| | @seq vp7_ehdv1 text |
| Mode in www | expert |
The command is: @seq [sequence number or sequence name] [text or blank]
The text is then inserted above the chosen sequence.
Note that sequence numbers are given in the log file of ESPript.
Special case: the text is inserted below the last displayed sequence if you choose a number
greater than the number of displayed sequences. Thus, you can give a name to a line of
Special Characters and change the colour of the name with the Special Character T.
As a test, you can enter in example2 the two lines of command below:
@seq 100 important residues
T R 100
modify or create colours
| Example | @col R .8 0 0 B 0 0 .8 |
| Mode in www | expert |
Assigns a new rgb code for a Special Characters colour in ESPript (here red, R,
and blue, B, colours are modified).
You can also create a new special character colour, such as A for grey:
@col A .5 .5 .5 ! create a new colour named A
I A all ! strictly conserved residues are in grey
Remark: a new character colour must be created before being used, as in the example above. S
is reserved for skip. Otherwise, any uppercase character can be used. Have a look
at this site to choose new colours and corresponding percent rgb values
(range is 0-1 and white is 1 1 1).
replace labels
| Example | @aA1 aA2 bB1 hH1 bB2 |
| | @aA3 bB3 |
| Mode in www | advanced |
Sec. structure labels can be replaced by new ones defined by the user. Labels
starting with a, b, h, p refer to alpha helices, beta
strands, 3₁₀ helices and 5-helices respectively. These first
characters are not displayed. Replacement is made in the order
of entry (see example4), first through the top sec. structure elements, then through
the bottom sec. structure elements if applicable.
Command lines can be written with all alpha helices first, then all beta strands,
3₁₀ helices and 5-helices (for instance, you can remove the labels of
all 3₁₀ helices by typing @h h h h h with as many h as needed).
Remark: If the first letter is typed in uppercase (@Ag1 Ag2),
the second letter is displayed using a Symbol font (here, the displayed labels
would be gamma1 gamma2).
hide turns
| Example | @nott |
| Mode in www | advanced |
Strict alpha and beta turns, usually rendered as TTT and TT, are not
displayed (see information on secondary structures).
insert secondary structure elements
| Example | @top a 10-20 20-30 b 50-55 |
| | @bottom b 25-35 |
| Mode in www | expert |
Inserts alpha helices (a), beta strands (b), 3₁₀ helices (h)
or 5-helices (p) at the top or bottom of sequence blocks. Rules of numbering are the same
as in section Secondary Structures (i.e. by default, top and bottom secondary structure elements match top and bottom
sequences respectively).
Remark: You can enter up to 264 characters on this command line. If you use webESPript
and exceed this limit, click on the button +1 of the interface to duplicate the form.
You can then enter, for example, alpha helices in the first part
and beta strands in the second part, while staying under the limit of 264 characters in
each part.
hide names of secondary structure elements
| Example | @noname |
| Mode in www | advanced |
Removes the name of the corresponding sequence at the beginning of each line of sec. str. elements.
By default this name has the same colour as the first displayed element.
Remark: Assume a very special case, where your sequence starts at 10,
and you want to colour sec.structure name in red and
sec.structure elements in blue. Then you can use the Special Characters command
X:
X R 10-10
X B 11-4500.
hide alternate conformations
| Example | @noalt |
| Mode in www | advanced |
Removes grey stars added on the top of blocks of sequences, above residues modelled
with alternate conformations. These residues are flagged automatically
when you use the web interface and directly upload a PDB file in the box reserved for Secondary
Structure information.
hide disulphide bridges
| Example | @nodi |
| Mode in www | advanced |
Removes green digits (1 1, 2 2...)
added on the figure at the bottom of sequence blocks to show disulphide bridges. These bound cysteines are flagged automatically
when you use the web interface and directly upload a PDB file in the box reserved for Secondary
Structure information.
substitute sequence names
| Example | @sub oldname1 newname1 oldname2 newname2 |
| Mode in www | expert |
Replaces the name of a sequence contained in your alignment file file.aln by a new one.
You can substitute up to 15 names. Remark: Suppose you want to change
the names of the first and third displayed sequences.
You can also type: @sub 1 newname1 3 newname2
Input Lines 6b: Special Characters
| Character-Type | Character-Colour | Position |
| P, T, R, X, Y, Z, Q, V, W, U, D, G, J, S, C, E, L, K, A, I, F, M, H, B, O, N, s, t, u, a, b, c, d, e, f, g, h, i, j, k, l, m, n | D, B, R, P, G, C, O, Y, W, S | 2 9-39 |
| Example | U R 2 9-39 |
| Mode in www | advanced |
-
Entry on each line is Character-Type Colour Position. Thus,
the command to display red triangles on residues 2 and from 9 to 39 is:
U R 2 9-39
By default, residues are numbered according to the first displayed sequence.
Character-Type
-
Type 1: miscellaneous
- P calculates hydropathy ; T changes colour of sequence names ; R reads intermolecular contacts
-
Type 2: assignment
- X top secondary structure information is assigned to a chosen sequence, which is the first one by default; colour of secondary elements can be changed ;
- Y sequence numbering is assigned to a chosen sequence, which is the first one by default; colour of digits can be changed ;
- Z residue numbering of another sequence, which is the last one by default, can be displayed at the bottom of sequence blocks ; secondary structure information corresponding to this sequence can also be displayed (see example3)
-
Type 3: do it yourself
- Q boxes residues (check example5) ; V bold characters ; W adds frames
-
Type 4a: adding markers
- U triangle up (check example2) ; D triangle down ; G go ; J jammed ; S star ; C solid circle ; E open circle ; L dotted line ; K stroke
-
Type 4b: changing default colours of
- A labels above top sec. structure elements ; I identity boxes ; F identity characters ; M group similarity boxes ; H group similarity characters ; B global similarity frames ; O difference similarity boxes ; N low similarity scores
-
Type 4c: adding NMR markers
- s: amide proton slow exchange rate (< 1 min⁻¹)
- t: ³J(HN,Hα) NH-Hα coupling constant less than 6 Hz
- u: ³J(HN,Hα) NH-Hα coupling constant greater than or equal to 7 Hz
- a, b, c: dNN(i,i+1): NOE between proton NH of residue i and proton NH of i+1 (weak, medium, strong)
- d, e, f: dαN(i,i+1): NOE between proton alpha of residue i and proton NH of i+1 (weak, medium, strong)
- g, h, i: dβN(i,i+1): NOE between proton beta of residue i and proton NH of i+1 (weak, medium, strong)
- j: dNN(i,i+2): NOE between proton NH of residue i and proton NH of i+2
- k: dαN(i,i+2): NOE between proton alpha of residue i and proton NH of i+2
- l: dαN(i,i+3): NOE between proton alpha of residue i and proton NH of i+3
- m: dαβ(i,i+3): NOE between proton alpha of residue i and proton beta of i+3
- n: dαN(i,i+4): NOE between proton alpha of residue i and proton NH of i+4
-
Character-Colour (except if R is Character-Type)
-
Dark(black), Blue, Red, Pink, Green,
Cyan, Orange, Yellow, White, Skip(transparent)
Remark: a grey scale is used in mode black & white
-
Position, four cases, [] means mandatory and {} optional
-
1a. if P or T: [Character-Colour] [sequence name number or range, 1-1000 stands for all] {new sequence
name number or range} {....}
-
Example to calculate hydropathy of the third displayed sequence: P R 3
(the string 'hyd' will be written in red) or to colour the name of the second sequence in green: T G 2
1b. if R: [ChainId] [residue range] {new residue range} {...}
Check the appendix for details on intermolecular contacts.
2. if X, Y or Z: [Character-Colour] [name or number of sequence displayed]
{Start-Index, 1 by default}
OR
[residue range] {new residue range} {...}
-
Example to assign the first sec. str. file to the third displayed
sequence: X B 3 (sec. structure elements are in blue), to number the fourth displayed
sequence in blue: Z B 4
(the same command Z B 4 can be used to assign the second sec. structure file to
the fourth displayed sequence and to colour sec. structure elements in blue).
You can also colour elements in blue and red as in the example below
X B 3 ! sec. str. elements refer to the third displayed sequence and are in blue. This sequence is now the reference
X R 4-50 60-80 ! but sec. structure elements from residues 4 to 50 and from 60 to 80 are in red
Remark: you can type X B name_of_the_third_displayed_sequence
instead of X B 3 as in the example above.
3. if Q, V or W: [Character-Colour] [number or range of sequence displayed]
{column range} {new column range} {...}
- Note that column numbering is used in this case instead of
residue numbering.
Use the special command @ruler to preview column numbers.
Example to highlight in yellow residues of sequences 3-8 from
columns 40 to 45 and from 50 to 55: Q Y 3-8 40-45 50-55, or to highlight the last sequence in cyan:
Q C 1000.
4. if U, D, S, C, L, A,
I, F, M, H, B, O, N,
s, t, u, a, b, c, d, e, f,
g, h, i, j, k, l, m or n:
[Character-Colour] [residue number or range] {new
residue number or range} {...}
-
Example to add red triangles at residue 2 and from 9 to 39: U
R 2 9-39, to box all identical residues in blue: I B 1-4500, to remove all sec. structure
labels: A S 1-4500
-
By default, positions refer to residue numbering of the first displayed sequence.
Use the special command Y to change this default.
Y B 3 ! residue numbering refers to the third displayed sequence and is in blue
U R 9 20-30 ! adds red triangles below columns containing residues 9 and 20 to 30 of sequence 3
Remark: Enter one special character per line. You can
repeat a residue range or a column range on this line. You can use the string all
instead of 1-1000 in case 1a and of 1-4500 in cases 1b, 2, 3 and 4 (i.e. A S all
to remove all secondary structure labels).
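The Character-Type Colour Position layout can be illustrated with a small Python parser. This sketch covers only the simplest form, where every position is a residue number, a range or the string all (case 4 above). It is not ESPript's parser: the function names are hypothetical, treating text after ! as a comment is inferred from the examples in this manual, and sequence names or Start-Index arguments (case 2) are not handled.

```python
def parse_positions(tokens, upper=4500):
    """Turn tokens such as '2', '9-39' or 'all' into (first, last) pairs."""
    ranges = []
    for tok in tokens:
        if tok == "all":
            ranges.append((1, upper))  # 'all' stands for 1-4500 in case 4
        elif "-" in tok:
            first, last = tok.split("-", 1)
            ranges.append((int(first), int(last)))
        else:
            ranges.append((int(tok), int(tok)))
    return ranges


def parse_special_line(line):
    """Split 'U R 2 9-39 ! comment' into (type, colour, ranges)."""
    body = line.split("!", 1)[0]  # assumption: '!' starts a trailing comment
    char_type, colour, *rest = body.split()
    return char_type, colour, parse_positions(rest)


print(parse_special_line("U R 2 9-39"))
# ('U', 'R', [(2, 2), (9, 39)])
print(parse_special_line("A S all ! hide labels"))
# ('A', 'S', [(1, 4500)])
```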
Input Line 6c: Comment
| Example | %This is a reminder |
| Mode in www | advanced |
A line beginning with % will be displayed at the bottom of the generated PostScript,
as a comment or a title.
Input Line 6d: Ending the section
| Example | . |
| Mode in www | advanced |
A single dot on a line ends this section.
Input Lines 7...: Defining Groups and Blocks
| Example | 1-4 9 %8 |
| | 6 5 7 |
| | . |
| Mode in www | beginner |
You can select the sequences to be displayed and their order on a single line: 2 1 3-5
all can be used to select the rest of the
sequences: 2 all (see example5)
A % before a sequence number keeps the
sequence for similarity calculations but prevents it from being displayed: 2 %1 %3-5
(see example4).
You can also separate your sequences in groups for similarity computations, each line defining a group and giving the order
of the sequences to display as in example2 (ADVanced or EXPert modes in webESPript).
The calculation by group is not performed
if SimilarityType is Strict, Multalin or Equivalent (groups are just numbered).
This section is ended by a single dot on a single line.
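The group syntax described above (ranges, the % prefix, the keyword all and the closing dot) can be sketched as a short Python parser. The function names are hypothetical and this is not ESPript's implementation. In particular, all is resolved here against the sequences already named on the same line only, which is an assumption.

```python
def parse_group_line(line, n_sequences):
    """Return [(sequence_number, displayed)] in the order given on the line."""
    selected = []
    for tok in line.split():
        displayed = not tok.startswith("%")  # %N: used for scoring, not drawn
        tok = tok.lstrip("%")
        if tok == "all":
            already = {num for num, _ in selected}
            nums = [n for n in range(1, n_sequences + 1) if n not in already]
        elif "-" in tok:
            first, last = (int(x) for x in tok.split("-", 1))
            nums = list(range(first, last + 1))
        else:
            nums = [int(tok)]
        selected.extend((n, displayed) for n in nums)
    return selected


def parse_groups(lines, n_sequences):
    """Read one group per line until a line containing a single dot."""
    groups = []
    for line in lines:
        if line.strip() == ".":
            break
        groups.append(parse_group_line(line, n_sequences))
    return groups


print(parse_group_line("2 %1 %3-5", 5))
# [(2, True), (1, False), (3, False), (4, False), (5, False)]
print(parse_groups(["1-4 9 %8", "6 5 7", "."], 9))
```

The second call returns two groups for the example table above: sequences 1-4, 9 and the hidden sequence 8 in the first, then sequences 6, 5 and 7 in the second.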
Appendix
file.aln
file.aln is the ascii file containing aligned
sequences. The following formats are supported:
Should you have other aligned sequences, be sure
to keep two fields per line: the first one is the name of the sequence,
the second one the sequence itself. Use white characters (spaces) to separate
the two fields; use blank lines to separate two blocks as in:
vp7_btv1s MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE
vp7_btv10 MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE

vp7_btv1s MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
vp7_btv10 MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
FASTA format for multiple alignments is supported. Sequences can be entered
as below:
> vp7_btv1s
MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE
MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
> vp7_btv10
MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE
MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
If a Start-Index is present in file.aln (at
least in the first block of sequences), residue numbering is modified accordingly. Format
is title_Start-Index_ or title(Start-Index) as below:
vp7_btv1s(3) TIAARALTVMRACATLQEARIVLEANVMEIL
vp7_btv10(5) --AARALTVMRACATLQEARIVLEANVMEIL

vp7_btv1s GIAINRYNGLTLRGVTMRPTSLAQRNEMFFM
vp7_btv10 GIAINRYNGLTLRGVTMRPTSLAQRNEMFFM
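As an illustration of the two-field block format described above, the following Python sketch reads name/sequence pairs, concatenates the blocks and picks up a Start-Index written as title(Start-Index). The function name `parse_alignment` is hypothetical. The sketch does not handle the title_Start-Index_ form or FASTA input, and it is not the parser used by ESPript.

```python
import re

# Matches a name followed by a Start-Index in parentheses, e.g. vp7_btv1s(3)
NAME_WITH_INDEX = re.compile(r"^(?P<name>.+)\((?P<idx>-?\d+)\)$")


def parse_alignment(text):
    """Return ({name: sequence}, {name: start_index}) from block-format text."""
    sequences, start_index = {}, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:  # blank line between blocks, or a line we ignore
            continue
        name, chunk = fields
        match = NAME_WITH_INDEX.match(name)
        if match:  # title(Start-Index) form
            name = match.group("name")
            start_index.setdefault(name, int(match.group("idx")))
        sequences[name] = sequences.get(name, "") + chunk
    return sequences, start_index


example = """vp7_btv1s(3) TIAARALTVM
vp7_btv10(5) --AARALTVM

vp7_btv1s GIAINRYNGL
vp7_btv10 GIAINRYNGL
"""
seqs, starts = parse_alignment(example)
print(seqs["vp7_btv1s"])  # TIAARALTVMGIAINRYNGL
print(starts)             # {'vp7_btv1s': 3, 'vp7_btv10': 5}
```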
file.pdb
You can enter the name of a PDB file at the first input line
(instead of the multiple alignment file, file.aln). ESPript will extract a one letter code sequence, corresponding to all
the residues contained in this PDB file.
You can display the sequence of a single monomer defined by a ChainID in the PDB file, by using the command chain_X on the input line: file.pdb chain_A
The extracted sequence is given the name of the input PDB file. This
default can be changed, if the header of the PDB file contains a line starting by
DBREF. The string of characters following DBREF will be the name of the extracted
sequence: DBREF sequence_name
You can also enter the name of a multiple alignment file, file.aln, and of a PDB file, file.pdb, on
the first input line: file.aln file.pdb (see example2).
Then, a file named file_bcol.pdb is created by ESPript from file.pdb.
The B-factors of the original file file.pdb are replaced by similarity scores in file_bcol.pdb.
Note that similarity scores in file_bcol.pdb are rescaled between 0 and 100.
This makes it possible, in a next step, to show conserved regions along the structure with a colour ramp going from white to red.
The command chain_X copies the similarity scores of a chosen monomer into the output file file_bcol.pdb:
file.aln file.pdb chain_A
The output PDB file, file_bcol.pdb, can be read in
a BOBSCRIPT [13] command file
to produce the figure below (residues with SimilarityGlobalScore lower than 0.7 are in white, conserved areas
with SimilarityGlobalScore in the range 0.7-1.0 are colour-ramped in red with a 0-100 pseudo B-factor value).
The command file for BOBSCRIPT (vp7_btv10.bob) can be obtained
in ESPript by using the option BOB or more easily by using the interface
ENDscript.

Intermolecular contacts
A log file produced by CNS [14]
such as vp3_contact.log can be read
by ESPript to display protein-protein contacts (see example3).
You can also use ENDscript to generate
such a figure quickly. A list of contacts is generated as follows:
- Crystallographic contacts
Addition to CNS command file:
delete selection=(hydrogen) end flags exclude * include pvdw end
parameter nbond wmin=4.0 end end energy end
generates in CNS log file:
%atoms "A -62 -ASN -OD1 " and "C -112 -THR -C "(XSYM# 4) only 3.64 A apart
- Non-crystallographic contacts
Addition to CNS command file:
flags exclude * include vdw end parameter nbond wmin=0 end end
distance cuton=0.0 cutoff=4 from =(segid A) to =(not segid A) end
generates in CNS log file:
atoms "A -90 -ALA -CB " and "B -181 -HIS -CE1 " 3.6958 A apart
Residue names, residue numbers, first letter of segIDs and
distances are extracted from the CNS log file.
If the input line in ESPript is R A all, segIDs of all residues in contact
with molecule A are displayed on a bottom line named i_A.
The segID character is in red if the distance is less than
3.2 Å and in black if it is in the range 3.2-4 Å.
The shortest intermolecular distance is taken for each
residue. Thus, a B would be written under residue 90,
if the distance listed in the example above is the shortest between Ala90 segA and His181 segB.
A A would be written under His181 on a new bottom line named i_B with the command R B all.
Contacts can be further analysed by looking at the figure produced by ESPript:
a to z in italic are crystallographic contacts between residues
A to Z in italic are non-crystallographic contacts between residues
# is a crystallographic contact between two residues having same names, numbers and segIDs
(crystallographic identity)
a to z are crystallographic contacts between residues having same names, numbers
but different segIDs
A to Z are non-crystallographic contacts between residues having same names, numbers
but different segIDs (for example along a non-crystallographic 2-fold axis)
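The extraction rule described above (keep the shortest distance per residue, red below 3.2 Å, black in the range 3.2-4 Å) can be sketched in Python. The regular expression is written against the two log excerpts quoted in this appendix and assumes one contact per line. Real CNS logs may be laid out differently, the function name `contacts_for` is hypothetical, and the italic/lowercase distinctions between contact types are not modelled.

```python
import re

# One quoted atom: segID, residue number, residue name, atom name
ATOM = r'"\s*(\w+)\s*-(\d+)\s*-(\w+)\s*-(\S+)\s*"'
CONTACT = re.compile(ATOM + r"\s*and\s*" + ATOM + r".*?([\d.]+)\s+A apart")


def contacts_for(log_text, seg_id):
    """Map residue number in seg_id -> (partner segID letter, distance, colour)."""
    best = {}
    for m in CONTACT.finditer(log_text):
        seg1, num1, _res1, _atom1, seg2, num2, _res2, _atom2, dist = m.groups()
        dist = float(dist)
        for own, own_num, other in ((seg1, num1, seg2), (seg2, num2, seg1)):
            if own[0] != seg_id:  # only the first letter of the segID is used
                continue
            key = int(own_num)
            if key not in best or dist < best[key][1]:
                colour = "red" if dist < 3.2 else "black"
                best[key] = (other[0], dist, colour)
    return best


log = 'atoms "A -90 -ALA -CB " and "B -181 -HIS -CE1 " 3.6958 A apart'
print(contacts_for(log, "A"))  # {90: ('B', 3.6958, 'black')}
print(contacts_for(log, "B"))  # {181: ('A', 3.6958, 'black')}
```

This reproduces the example in the text: a B under residue 90 on line i_A, and an A under His181 on line i_B.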
file.2st
This file is an ASCII file from which ESPript
will extract Secondary Structure information. The following formats are
supported:
-
DSSP [15] (a PDB file can be uploaded directly if you use webESPript, DSSP
being executed on the server)
-
STRIDE [16]
-
PHD [17]
(enter the full mail from Predict Protein)
Alpha helices, 3₁₀ helices and 5-helices
are displayed as medium, small and large squiggles respectively. Beta strands are rendered as arrows, strict beta turns
as TT letters and strict alpha turns as TTT.
The secondary structures files of the two sequences have been entered in the excerpt below.

The residue names of the secondary structure file are checked against those of the chosen sequence
(which is the first displayed by default). If they differ, the program tries to align the
two sequences without gaps.
You get the following warning if some residues do not correspond between the two sequences:
-
Warning: DSSP residue M does not match seq residue D 2 sequence 1 column 2
If the program fails to align the two sequences, you get an error message:
-
Warning: DSSP residue M does not match seq residue D 2 sequence 1 column 2
- Warning: DSSP residue D does not match seq residue T 3 sequence 1 column 3
- ...........................
- Error: sec. structure elements are certainly misplaced
and the figure generated by ESPript gives you false information.
Remark: a file produced by DSSP can include the positions of disulphide bridges.
This information is rendered in ESPript by green digits (1 1, 2 2 ...)
written under each column with a bound cysteine. Residues with alternate positions can also be flagged in the DSSP file (we use a
modified version of DSSP on the www interface), in order to be marked
by grey stars on the top of sequence blocks in the PostScript figure.
Accessibility
The relative accessibility of each residue can be extracted
from DSSP [15] and PHD [18] files. It is rendered
as blue-coloured boxes located at the last or first line of each block (see remark in section Secondary Structures).
Note that DSSP includes only protein atoms in its calculation of accessibility. Co-ordinates of
water molecules, ligands, etc. are not taken into account.

The blue square scale is set as follow:
| colour |
value |
accessibility |
| blue |
A>0.4 |
accessible |
| cyan |
A>0.1 and
A<0.4 |
intermediate |
| white |
A<0.1 |
buried |
| blue with red borders |
A>1 |
| red |
either accessibility is not predicted in PHD18
or residue names between sequence and DSSP15 file do not match
|
|
Maximum accessibility values for each residue according to DSSP:
| A | C | D | E | F | G | H | I | K | L |
| 106 | 135 | 163 | 194 | 197 | 84 | 184 | 169 | 205 | 164 |
| M | N | P | Q | R | S | T | V | W | Y |
| 188 | 157 | 136 | 198 | 248 | 130 | 142 | 142 | 227 | 222 |
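As an illustration of how the colour scale and the maximum values fit together, here is a minimal Python sketch. It assumes that the relative accessibility A is the absolute accessibility divided by the residue's maximum value, and that the boundary values 0.1 and 0.4 fall in the lower class; neither point is stated in this manual, and the function names are hypothetical.

```python
# Maximum accessibility per residue, copied from the table above.
MAX_ACC = {
    "A": 106, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164,
    "M": 188, "N": 157, "P": 136, "Q": 198, "R": 248,
    "S": 130, "T": 142, "V": 142, "W": 227, "Y": 222,
}


def relative_accessibility(residue, absolute_acc):
    """Assumed definition: absolute accessibility over the residue maximum."""
    return absolute_acc / MAX_ACC[residue]


def accessibility_colour(rel):
    """Map a relative accessibility onto the colour scale of the manual."""
    if rel > 1:
        return "blue with red borders"
    if rel > 0.4:
        return "blue"          # accessible
    if rel > 0.1:
        return "cyan"          # intermediate
    return "white"             # buried (boundary handling is an assumption)


for residue, acc in [("G", 42), ("C", 27), ("A", 5.3), ("R", 300)]:
    rel = relative_accessibility(residue, acc)
    print(residue, acc, round(rel, 2), accessibility_colour(rel))
```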
Hydropathy
The hydropathic character of a sequence selected
with the P command (P D 1 for the first displayed sequence) is calculated according to the algorithm
of Kyte and Doolittle [19] with a window of 3.

| colour | value | hydropathy |
| pink | H > 1.5 | hydrophobic |
| grey | -1.5 < H < 1.5 | intermediate |
| cyan | H < -1.5 | hydrophilic |
Hydropathic values for each residue:
| I | V | L | F | C | M | A | G | T | S |
| 4.5 | 4.2 | 3.8 | 2.8 | 2.5 | 1.9 | 1.8 | -0.4 | -0.7 | -0.8 |
| W | Y | P | H | E | Q | D | N | K | R |
| -0.9 | -1.3 | -1.6 | -3.2 | -3.5 | -3.5 | -3.5 | -3.5 | -3.9 | -4.5 |
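The windowed calculation can be sketched in Python as follows. This is only an illustration: it assumes that H is the mean hydropathic value over a window centred on each residue, and that the window is truncated at the ends of the sequence. The manual does not say how ESPript treats the ends, and the function names are hypothetical.

```python
# Kyte and Doolittle hydropathic values, copied from the table above.
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(seq, window=3):
    """Mean hydropathy over a centred window (truncated at the ends)."""
    half = window // 2
    profile = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        values = [KD[res] for res in seq[lo:hi]]
        profile.append(sum(values) / len(values))
    return profile


def hydropathy_colour(h):
    """Map a hydropathy value onto the colour scale of the manual."""
    if h > 1.5:
        return "pink"   # hydrophobic
    if h < -1.5:
        return "cyan"   # hydrophilic
    return "grey"       # intermediate


for res, h in zip("IVLKRD", hydropathy_profile("IVLKRD")):
    print(res, round(h, 2), hydropathy_colour(h))
```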
Similarity computations
Similarity Scores
If the Risler, BLOSUM62, PAM250 or Identity matrix is selected, several
scores are calculated. In the formulas below, a pair such as AC denotes the
matrix score for residues A and C.
- in-Group Score (ISc): a classical similarity score computed within each
group.
For a column made of 3 residues ACD:
ISc = (AC+AD+CD) / 3
- Cross-Group Score (XSc): the average similarity score over every sequence pair in which each
sequence belongs to a different group.
For a column made of 6 residues divided into 3 groups (ACD)(DE)(G):
XSc = ( (AD+AE+CD+CE+DD+DE)/6 + (AG+CG+DG)/3 + (DG+EG)/2 ) / 3
- Total Score (TSc): the mean of the in-Group Score and the Cross-Group Score.
TSc = (ISc + XSc) / 2
The user specifies a threshold for the in-Group (ThIn)
and Diff-Group (ThDiff) scores.
Colours are chosen according to the following rule:
| rendering | meaning | criterion |
| red box, white character | strict identity | |
| red character (or black bold character with option flashy) | similarity in a group | ISc > ThIn |
| blue frame (filled in yellow with option flashy) | similarity across groups | TSc > ThIn |
| orange box | differences between conserved groups | (ISc - XSc)/2 > ThDiff |
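The three scores can be reproduced with a short Python sketch that follows the worked formulas for the ACD and (ACD)(DE)(G) columns. It is an illustration under stated assumptions, not ESPript's implementation: the Identity matrix is used as the pairwise score to keep the example self-contained, a group with a single residue is given an in-Group Score of 0, and all function names are hypothetical.

```python
from itertools import combinations, product


def identity_score(a, b):
    """Pairwise score from the Identity matrix ('.' never scores)."""
    return 1 if a == b and a != "." else 0


def in_group_score(group, score=identity_score):
    """ISc: mean pairwise score within one group of a column."""
    pairs = list(combinations(group, 2))
    if not pairs:
        return 0.0  # assumption: a one-residue group has no pair to score
    return sum(score(a, b) for a, b in pairs) / len(pairs)


def cross_group_score(groups, score=identity_score):
    """XSc: mean over group pairs of the mean cross-pair score."""
    group_pairs = list(combinations(groups, 2))
    if not group_pairs:
        return 0.0  # assumption: a single group has no cross-group pair
    means = []
    for g1, g2 in group_pairs:
        pairs = list(product(g1, g2))
        means.append(sum(score(a, b) for a, b in pairs) / len(pairs))
    return sum(means) / len(means)


def total_score(isc, xsc):
    """TSc = (ISc + XSc) / 2."""
    return (isc + xsc) / 2


isc = in_group_score("ACD")
xsc = cross_group_score(["ACD", "DE", "G"])
print(isc, xsc, total_score(isc, xsc))
```

Passing another `score` function, for example a lookup in one of the matrices listed in this manual, changes the values but not the averaging scheme.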
Similarity Scores Matrices
Risler Matrix [20]
A C D E F G H I K L M N P Q R S T V W Y .
A 22 -15 2 17 6 6 -6 17 14 13 10 13 -2 18 15 20 19 20 -9 2 -30
C -15 22 -17 -15 -16 -17 -18 -16 -16 -15 -16 -16 -18 -14 -15 -13 -14 -14 -18 -11 -30
D 2 -17 22 10 -3 -4 -13 0 1 -2 -5 8 -12 6 -1 7 0 0 -14 -4 -30
E 17 -15 10 22 6 3 -6 15 14 9 6 14 -1 21 19 18 16 16 -10 2 -30
F 6 -16 -3 6 22 -4 -11 10 1 10 -2 4 -11 7 4 5 3 8 -9 20 -30
G 6 -17 -4 3 -4 22 -12 0 -1 -2 -4 2 -12 2 1 7 2 1 -13 -2 -30
H -6 -18 -13 -6 -11 -12 22 -8 -10 -9 -12 -3 -16 -5 -4 -4 -9 -7 -17 -8 -30
I 17 -16 0 15 10 0 -8 22 10 21 9 9 -6 14 14 16 16 22 -7 4 -30
K 14 -16 1 14 1 -1 -10 10 22 7 4 10 -7 17 21 14 12 12 -11 5 -30
L 13 -15 -2 9 10 -2 -9 21 7 22 18 8 -8 11 12 13 12 20 -8 5 -30
M 10 -16 -5 6 -2 -4 -12 9 4 18 22 0 -12 12 11 6 8 8 -13 -2 -30
N 13 -16 8 14 4 2 -3 9 10 8 0 22 -10 16 12 19 11 11 -11 -1 -30
P -2 -18 -12 -1 -11 -12 -16 -6 -7 -8 -12 -10 22 -6 -3 -3 -5 -6 -16 -12 -30
Q 18 -14 6 21 7 2 -5 14 17 11 12 16 -6 22 20 18 17 15 -10 5 -30
R 15 -15 -1 19 4 1 -4 14 21 12 11 12 -3 20 22 20 19 15 -8 8 -30
S 20 -13 7 18 5 7 -4 16 14 13 6 19 -3 18 20 22 21 18 -8 4 -30
T 19 -14 0 16 3 2 -9 16 12 12 8 11 -5 17 19 21 22 16 -10 3 -30
V 20 -14 0 16 8 1 -7 22 12 20 8 11 -6 15 15 18 16 22 -7 3 -30
W -9 -18 -14 -10 -9 -13 -17 -7 -11 -8 -13 -11 -16 -10 -8 -8 -10 -7 22 -6 -30
Y 2 -11 -4 2 20 -2 -8 4 5 5 -2 -1 -12 5 8 4 3 3 -6 22 -30
. -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 -30 0
PAM250 Matrix [21]
A R N D C Q E G H I L K M F P S T W Y V .
A 2 -2 0 0 -2 0 0 1 -1 -1 -2 -1 -1 -4 1 1 1 -6 -3 0 -15
R -2 6 0 -1 -4 1 -1 -3 2 -2 -3 3 0 -4 0 0 -1 2 -4 -2 -15
N 0 0 2 2 -4 1 1 0 2 -2 -3 1 -2 -4 -1 1 0 -4 -2 -2 -15
D 0 -1 2 4 -5 2 3 1 1 -2 -4 0 -3 -6 -1 0 0 -7 -4 -2 -15
C -2 -4 -4 -5 12 -5 -5 -3 -3 -2 -6 -5 -5 -4 -3 0 -2 -8 0 -2 -15
Q 0 1 1 2 -5 4 2 -1 3 -2 -2 1 -1 -5 0 -1 -1 -5 -4 -2 -15
E 0 -1 1 3 -5 2 4 0 1 -2 -3 0 -2 -5 -1 0 0 -7 -4 -2 -15
G 1 -3 0 1 -3 -1 0 5 -2 -3 -4 -2 -3 -5 -1 1 0 -7 -5 -1 -15
H -1 2 2 1 -3 3 1 -2 6 -2 -2 0 -2 -2 0 -1 -1 -3 0 -2 -15
I -1 -2 -2 -2 -2 -2 -2 -3 -2 5 2 -2 2 1 -2 -1 0 -5 -1 4 -15
L -2 -3 -3 -4 -6 -2 -3 -4 -2 2 6 -3 4 2 -3 -3 -2 -2 -1 2 -15
K -1 3 1 0 -5 1 0 -2 0 -2 -3 5 0 -5 -1 0 0 -3 -4 -2 -15
M -1 0 -2 -3 -5 -1 -2 -3 -2 2 4 0 6 0 -2 -2 -1 -4 -2 2 -15
F -4 -4 -4 -6 -4 -5 -5 -5 -2 1 2 -5 0 9 -5 -3 -3 0 7 -1 -15
P 1 0 -1 -1 -3 0 -1 -1 0 -2 -3 -1 -2 -5 6 1 0 -6 -5 -1 -15
S 1 0 1 0 0 -1 0 1 -1 -1 -3 0 -2 -3 1 2 1 -2 -3 -1 -15
T 1 -1 0 0 -2 -1 0 0 -1 0 -2 0 -1 -3 0 1 3 -5 -3 0 -15
W -6 2 -4 -7 -8 -5 -7 -7 -3 -5 -2 -3 -4 0 -6 -2 -5 17 0 -6 -15
Y -3 -4 -2 -4 0 -4 -4 -5 0 -1 -1 -4 -2 7 -5 -3 -3 0 10 -2 -15
V 0 -2 -2 -2 -2 -2 -2 -1 -2 4 2 -2 2 -1 -1 -1 0 -6 -2 4 -15
. -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 -15 0
BLOSUM62 Matrix [22]
A R N D C Q E G H I L K M F P S T W Y V .
A 4 -1 -2 -2 0 -1 -1 0 -2 -1 -1 -1 -1 -2 -1 1 0 -3 -2 0 -4
R -1 5 0 -2 -3 1 0 -2 0 -3 -2 2 -1 -3 -2 -1 -1 -3 -2 -3 -4
N -2 0 6 1 -3 0 0 0 1 -3 -3 0 -2 -3 -2 1 0 -4 -2 -3 -4
D -2 -2 1 6 -3 0 2 -1 -1 -3 -4 -1 -3 -3 -1 0 -1 -4 -3 -3 -4
C 0 -3 -3 -3 9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -4
Q -1 1 0 0 -3 5 2 -2 0 -3 -2 1 0 -3 -1 0 -1 -2 -1 -2 -4
E -1 0 0 2 -4 2 5 -2 0 -3 -3 1 -2 -3 -1 0 -1 -3 -2 -2 -4
G 0 -2 0 -1 -3 -2 -2 6 -2 -4 -4 -2 -3 -3 -2 0 -2 -2 -3 -3 -4
H -2 0 1 -1 -3 0 0 -2 8 -3 -3 -1 -2 -1 -2 -1 -2 -2 2 -3 -4
I -1 -3 -3 -3 -1 -3 -3 -4 -3 4 2 -3 1 0 -3 -2 -1 -3 -1 3 -4
L -1 -2 -3 -4 -1 -2 -3 -4 -3 2 4 -2 2 0 -3 -2 -1 -2 -1 1 -4
K -1 2 0 -1 -3 1 1 -2 -1 -3 -2 5 -1 -3 -1 0 -1 -3 -2 -2 -4
M -1 -1 -2 -3 -1 0 -2 -3 -2 1 2 -1 5 0 -2 -1 -1 -1 -1 1 -4
F -2 -3 -3 -3 -2 -3 -3 -3 -1 0 0 -3 0 6 -4 -2 -2 1 3 -1 -4
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 7 -1 -1 -4 -3 -2 -4
S 1 -1 1 0 -1 0 0 0 -1 -2 -2 0 -1 -2 -1 4 1 -3 -2 -2 -4
T 0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 1 5 -2 -2 0 -4
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 1 -4 -3 -2 11 2 -3 -4
Y -2 -2 -2 -3 -2 -1 -2 -3 2 -1 -1 -2 -1 3 -3 -2 -2 2 7 -1 -4
V 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4 -4
. -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 1
Identity Matrix
A R N D C Q E G H I L K M F P S T W Y V .
A 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
R 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
D 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Q 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
E 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
G 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
H 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
I 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
L 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
K 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
M 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
F 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
W 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
. 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
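For readers who want to reuse these matrices in their own scripts, the following Python sketch reads a matrix written in the whitespace-separated layout used for BLOSUM62 above (a header row of residue letters, then one labelled row per residue). The function name is hypothetical, and the three-residue excerpt is copied from the top-left corner of the BLOSUM62 matrix only to keep the example short.

```python
def parse_matrix(text):
    """Parse a whitespace-separated score matrix into {(row, col): score}."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    columns = rows[0]
    matrix = {}
    for fields in rows[1:]:
        label, values = fields[0], fields[1:]
        if len(values) != len(columns):
            raise ValueError(f"row {label!r} has {len(values)} values, "
                             f"expected {len(columns)}")
        for column, value in zip(columns, values):
            matrix[(label, column)] = int(value)
    return matrix


# Top-left corner of the BLOSUM62 matrix listed above.
EXCERPT = """
A R N
A 4 -1 -2
R -1 5 0
N -2 0 6
"""

scores = parse_matrix(EXCERPT)
print(scores[("A", "R")], scores[("N", "N")])
```

Rows whose values are fused together, with no space before a minus sign, would first need to be separated before this parser can read them.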
Input file examples
These examples refer to a study carried out with Prof. David Stuart,
Division of Structural Biology, Oxford, on the viral proteins
VP7 and VP3 in orbiviruses [23, 24].
1. vp2_rota.inp (resulting PostScript,
Gif)
vp2_rota.phd ! mail from the Predict Protein server on vp2 rotavirus
* A none ! shows predicted sec. str. elements and accessibility on the top of each block
.
0.7 E ! physico-chemical boxing
6 81 ! layout
@pp ! extracts all infos from the Predict Protein file
@noname ! no names for sec. structures
.  
2-6 ! sequences to be displayed
.
2. vp7_adv.inp (resulting PostScript,
Gif)
vp7.aln vp7_btv10.pdb ! aligned sequences (from CLUSTALW) and pdb file
vp7_btv10.dssp A ! secondary structures (from DSSP)
vp7_adv.ps M ! PostScript output
.7 .5 R ! similarity criteria
7 60 ! layout
U R 127 250 ! -> red triangles
S B 168-170 178-180 ! -> blue stars
X B 1-126 254-349 ! -> sec. structure information in blue
X G 127-253 ! -> sec. structure information in green
T R 1-2 ! -> names of btv sequences in red
@noname ! no names for sec. structure
%Alignment for protein VP7.
. !
2 1 3-6 ! first group of sequences
7-8 ! second group of sequences
9 ! third group of sequences
. !
3. vp3_sup.inp (resulting PostScript,
Gif)
vp3.aln ! 1.CLU
vp31001.art_dssp ! DSSP file for btv-10 vp3
vp3_sup.ps M ! PostScript output
.7 .5 R
5 160 13 8 0 C L N
T R 1 ! title 1 sequence in red
Y R all ! numbering 1 sequence in red
@noname ! no names for sec. structure
%Alignment vp3 orbivirus and attempt against vp2 rotavirus
. ! end special characters
1-2 ! 1 group of sequences
9 10 ! 2 "
11 ! 3 "
. ! end
vp2_casper.aln 8 ! 2.THREADER alignment between btv vp3 and rota vp2
vp31001.art_dssp vp2_rota.phd ! DSSP file for btv and PHD file for rota
vp3_sup.ps
.7 R
5 160 16 -1 0 C L N
T R 1 ! title 1 sequence in red
Y R all ! numbering 1 sequence in red
T B 2 ! title 2 sequence in blue
Z B 2 ! numbering 2 sequence in blue
@noname ! no names for sec. structure
. ! end special characters
. ! all sequences in one group
The + on the
first line enables extra input.

4. vp3_art.inp (resulting PostScript,
Gif)
vp3.aln ! CLUSTAL alignment for VP3 and option +
vp32001.art_dssp ! DSSP file for first monomer
vp3_art.ps
.7 R
6 129 8 0 0 C L
X G 1  ! domains
X D 1-1
X B 298-587
X R 699-856
@skip ! hide sequence
@noname ! no names for sec. structure
@ah3 bbC bbD bbE ah4 ah5 bbF bbG bbH ah6 bbI ah7 ah8 bbJ bbK ah9
@ah10 bbL ah11 bbM bbN ah12 ah13 bbA ah1/bB ah2 ah3 bbD ah4 ah5
@ah6 bbE bbF bbG ah7 bbH bbI bbJ/h8 ah9 ah10 ah11 bbK ah12 bbL
@bbQ ah14 ah15 ah16 ah17 bbP/h18 ah19 ah20 ah21 bbA bbB bbC ah1
@bbE bbF bbG ah2 ah3 bbH bbI ah4 bbJ bbK bbL ah5 bbM bbN bbQ ah22
@bbR bbS ah23 bbT ! new sec. structure labels
.
1 %10 %11 %12
.
vp3.aln vp3_contact.log ! same alignment and X-PLOR output for contacts
vp31001.art_dssp ! DSSP file for second monomer
vp3_art.ps
.7 R
6 129 8 -1 0 C L ! vertical shift
X G 1  ! domains
X D 1-1
X B 298-587
X R 699-856
A S all ! no letters above sec. elements
R A all ! intermolecular contacts for VP3A
R B all ! intermolecular contacts for VP3B
@noname ! no names for sec. structure
%Secondary structures for vp3A and vp3B, article definition. 16 March 98.
.
1 %10 %11 %12
.
5. vp7_exp.inp (resulting PostScript,
Gif)
vp7.aln ! CLUSTAL alignment on orbivirus
vp7_btv10.dssp ! btv10 secondary elements
vp7_exp.ps M
.
7 60 6 0 F N
X B 1 ! btv10 sec. elements in blue
@skip ! hide all sequences
@h ! first 310 helix is not labeled
%Alignment on VP7 with two secondary structure elements. June 1999.
.
2 all ! btv10 in first then other sequences
.
vp7.aln ! same aligned sequences
vp7_btv1.dssp ! btv1 secondary elements
vp7_exp.ps M
.8 E ! homology criteria is %Equivalence
7 60 6 -1 F N ! vertical shift(-1) flashy colours(F) all sequences numbered(N)
X R 2 ! btv1 sec. elements in red
A S all ! remove sec. structure labels
T B 1 ! btv10 title in blue
T R 2 ! btv1 title in red
Q P 1-3 169-171 ! highlight RGD segment in pink
Q P 5-7 169-171
Q P 8 180-182
.
2 all ! btv10 is the first displayed sequence
.
References
1. Wootton, J.C. and Federhen, S. (1996)
Analysis of compositionally biased regions in sequence databases. Meth. in Enzymol.
266 554-571
2. Bairoch, A., Bucher, P. and Hofmann, K. (1997)
The PROSITE database, its status in 1997. Nucl. Acids Res. 25 217-221
3. Corpet, F. (1988)
Multiple sequence alignment with hierarchical clustering. Nucl.
Acids Res. 16 10881-10890
4. Corpet, F., Gouzy J. and Kahn D. (1999)
Recent improvements of the ProDom database of protein domain families. Nucl.
Acids Res. 27 263-267
5. Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994)
CLUSTAL W: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap penalties
and weight matrix choice. Nucl. Acids Res. 22 4673-4680
6. Combet, C., Blanchet, C., Geourjon, C. and Deleage, G. (2000)
NPS@: Network Protein Sequence Analysis.
TIBS 25 147-150
7. Wisconsin Package Version 9.0, Genetics Computer Group (GCG),
Madison, Wisc.
8. Pearson, W.R. and Lipman, D.J. (1988)
Improved Tools for Biological Sequence Analysis. Proc. Natl. Acad.
Sci. 85 2444-2448
9. Galtier, N., Gouy, M. and Gauthier, C. (1996)
SeaView and Phylo_win, two graphic tools for sequence alignment and
molecular phylogeny. Comput. Applic. Biosci. 12 543-548
10. Sander, C. and Schneider, R. (1991)
Database of homology-derived protein structures and the structural
meaning of sequence alignment. Proteins. 9 56-68
11. Jones, D.T., Taylor, W.R. and Thornton, J.M. (1992)
A new approach to protein fold recognition. Nature. 358
86-89
12. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H.,
Shindyalov, I.N. and Bourne, P.E. (2000)
The Protein Data Bank. Nucleic Acids Res. 28 235-242
13. Esnouf, R.M. (1997)
An extensively modified version of MolScript that includes greatly
enhanced coloring capabilities. J. Mol. Graphics. 15 132-134
14. Brünger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros P.,
Grosse-Kunstleve, R.W., Jiang J.S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J.,
Rice L.M., Simonson, T. and Warren, G.L. (1998)
Crystallography & NMR system: A new software suite for macromolecular
structure determination. Acta Crystallogr. D 54 905-921
15. Kabsch, W. and Sander, C. (1983)
Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded
and geometrical features. Biopolymers. 22 2577-2637
16. Frishman, D. and Argos, P. (1995)
Knowledge-based secondary structure assignment. Proteins. 23
566-579
17. Rost, B. (1996)
PHD: predicting one-dimensional protein structure by profile based
neural networks. Meth. in Enzym. 266 525-539
18. Rost, B. and Sander, C. (1994)
Conservation and prediction of solvent accessibility in protein families.
Proteins. 20 216-226
19. Kyte, J. and Doolittle, R. (1982)
A simple method for displaying the hydropathic character of a protein.
J. Mol. Biol. 157 105-132
20. Risler, J.L., Delorme, M.O., Delacroix, H. and Henaut, A. (1988)
Amino acid substitutions in structurally related proteins. A pattern
recognition approach. Determination of a new and efficient scoring matrix.
J. Mol. Biol. 204 1019-1029
21. Dayhoff, M. (1978)
"Atlas of protein sequences and structure" National Biomedical Research
Foundation. Washington, D.C. p. 345
22. Henikoff, J.G. and Henikoff, S. (1996)
Blocks database and applications. Meth. in Enzym. 266
88-105
23. Grimes, J.M., Burroughs, J.N., Gouet, P., Diprose, J.M., Malby,
R., Zientara, S., Mertens, P.P.C. and Stuart, D.I. (1998)
The atomic structure of the bluetongue virus core. Nature. 395
470-478
24. Gouet, P., Diprose, J.M., Grimes, J.M., Malby, R., Burroughs, J.N.,
Zientara, S., Stuart, D.I. and Mertens, P.P.C. (1999)
The highly ordered double-stranded RNA genome of bluetongue virus
revealed by crystallography. Cell. 97
481-490