Manual and Examples for
E S P r i p t  2.1

Options and parameters for ESPript are specified on the standard input, so all the required values can be grouped in an input file. All commands described below are accessible either by running ESPript on-line or in the Expert mode of the web interface. Fewer features are accessible in the Beginner and Advanced modes.
We strongly recommend that you use the web interface and read the accompanying tutorial if you are new to ESPript.

What do the input lines look like ?

Typical Input Lines

    Line  Section                      Example
    1     Aligned Sequences            file.aln 5-50 1 + file.pdb cns.ctct
    2     Secondary Structures         file1.2st A file2.phd A 9 all
    3     Output                       file.ps L SEQ BOB
    4     Similarity Scoring           0.7 0.5 R C
    5     Output Layout                7 70 6 0 0 0 C P N
    6     Special Commands             @skip
                                       @pp
                                       @minus 5 40
                                       @ruler
                                       @seq 5 text
                                       @col R .8 0 0 B 0 0 .8
                                       @aA1 aA2 bB1 hH1 bB2
                                       @nott
                                       @top a 10-20 30-40 b 50-55
                                       @noname
                                       @noalt
                                       @nodi
                                       @sub oldname1 newname1
          Special Characters           U B 2
                                       L D 10-16
          Comment                      %This is a reminder
          Ending the section           . (single dot on single line)
    7     Defining Groups and Blocks   1-4 9 %8
                                       6 5 7
                                       .
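
Since all options are read from the standard input, the typical input lines listed above can be written to a file by a small script. The sketch below is only an illustration: the function name build_espript_input and its arguments are hypothetical, and the values are the example values from this page.

```python
def build_espript_input(aligned, secondary, output, similarity, layout,
                        special_lines, group_lines):
    """Return the text of an input file, one input line per row (hypothetical helper)."""
    lines = [aligned, secondary, output, similarity, layout]  # input lines 1 to 5
    lines.extend(special_lines)   # input lines 6: commands, characters, comments
    lines.append(".")             # a single dot ends the special section
    lines.extend(group_lines)     # input lines 7...: groups and blocks
    lines.append(".")             # a single dot ends the groups section
    return "\n".join(lines) + "\n"


espript_input = build_espript_input(
    aligned="file.aln 5-50 1 + file.pdb cns.ctct",
    secondary="file1.2st A file2.phd A 9 all",
    output="file.ps L SEQ BOB",
    similarity="0.7 0.5 R C",
    layout="7 70 6 0 0 0 C P N",
    special_lines=["@ruler", "U B 2", "L D 10-16", "%This is a reminder"],
    group_lines=["1-4 9 %8", "6 5 7"],
)
print(espript_input)
```

The resulting text can be saved to a file and supplied to ESPript on its standard input.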

Input Line 1: Aligned Sequences

Syntax Sequence-File  Selected-Range  Start-Index  Extra-Input  PDB-File  CNS-File 
Example file.aln  5-50  file.pdb  cns.ctct 
Mode in www beginner advanced advanced advanced expert expert
Sequence-File
file name of the aligned sequences - see appendix1 for more details
Selected-Range [default: whole sequence]
range of residues to be displayed (for example 5-50).
Start-Index [default: 1]
renumbers residues, so that the first displayed sequence starts with the specified Start-Index.

Remark: If the first displayed sequence starts with ATREYES, the command line file.aln 5-4500 2 gives YES and Y is numbered as the second residue. Do not enter a Start-Index value if the first residue is already numbered in file.aln, as explained in appendix1. You can check the residue numbering of all sequences using option N described in section Output Layout.
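
The effect of Selected-Range and Start-Index in the remark above can be reproduced with a few lines of Python. This is a minimal sketch for an ungapped sequence; the function name select_and_renumber is hypothetical and is not part of ESPript.

```python
def select_and_renumber(sequence, first, last, start_index=1):
    """Return (residue, number) pairs for residues first..last (1-based, inclusive).

    A range end beyond the sequence length (such as 4500) simply selects
    everything up to the last residue.
    """
    selected = sequence[first - 1:last]
    return [(residue, start_index + i) for i, residue in enumerate(selected)]


# Equivalent of the input line: file.aln 5-4500 2
pairs = select_and_renumber("ATREYES", 5, 4500, start_index=2)
print(pairs)
```

Here the selected residues are Y, E and S, and Y receives the number 2.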

Extra-Input [default: none]
A + enables overlay or extra input. See the overlay example for details.
PDB-File [default: none]
Name of a PDB file. A PDB output is generated with B-factors replaced by per-residue similarity scores - see appendix2.
CNS-File [default: none]
name of a CNS file containing a list of intermolecular contacts - see appendix3.

Input Line 2: Secondary Structures

Syntax Sec.Str-File Acc-Disp Sec.Str-File Acc-Disp ScoreConfidence AutomaticSearch
Example file1.2st  file2.2st 9 all
Mode in www beginner beginner advanced advanced advanced advanced
Sec.Str-File
Name of the file containing secondary structure information. By default, displayed secondary elements are extracted from the first monomer, but you can select a chainID with the command 'chain_': file1.2st chain_A

Remark: Three kinds of layout are used, depending on whether one or two secondary structure files are supplied:
1. If one sec. structure file is supplied (uploaded in the box named top in webESPript)
Sec. str. elements are displayed at the top of each block of sequences and relative accessibility at the bottom.
2. If two sec. structure files are supplied:
Sec. str. elements of first file and corresponding accessibility are displayed at the top of each block
Sec. str. elements of second file (uploaded in the box named bottom in webESPript) and corresponding accessibility are displayed at the bottom of each block
3a. If file1.2st is entered as usual and the string none is entered as file2.2st: sec. str. elements and relative accessibility are displayed at the top of each block - see example1
3b. If the string none is entered as file1.2st and file2.2st is entered in turn: sec. str. elements of second file and relative accessibility are displayed at the bottom of each block.

By default, file1.2st (top in webESPript) and file2.2st (bottom in webESPript) refer to the first and the last displayed sequences. This default can be changed by using the Special Character X for the first sec. structure file and Z for the second.

Remark: sec. str. elements can be extracted from the alignment file file.aln itself, if you type the character * instead of file1.2st (click on 'extract info' in webESPript). The * option saves you from typing file.aln twice and can be used for alignment files from Predict Protein or from NPS@, which contain information on predicted sec. structure elements - see example1.
You can also click on 'extract info' in webESPript to extract secondary structure information with DSSP, if you have uploaded a PDB file in the box 'aligned sequences'.


Acc-Disp [default: none]
Displays relative accessibility if you upload DSSP or PHD files as file1.2st or file2.2st.
ScoreConfidence [default: 9]
When the sec. structure file is a PHD file, secondary elements with a reliability of at least ScoreConfidence are highlighted. If the reliability is below this limit, helices appear as small squiggles, beta strands as dotted lines, and labels are not written - see example1.
AutomaticSearch [default: none]
ESPript searches in the directory $DSSP_DIR for files having the same name as aligned sequences. Thus, secondary structure information of each aligned sequence with a known 3D structure can be displayed.
This option implies that you have a database of DSSP files installed on your disk. It can be used with ENDscript, when you search for homologous sequences against the PDB.

Input Line 3: Output

Syntax Output-file  NumberingOption  SequenceOutput  BobScriptOutput 
Example file.ps  L or M SEQ BOB
Mode in www beginner beginner expert expert
Output-file
name of the PostScript output file.
NumberingOption
By default, alpha helices, 3₁₀ helices, 5-helices (pi) and beta strands are numbered with digits.

With the L option, helices and beta strands are numbered with letters, starting at 'A'. With the M option, helices are numbered with digits and strands with letters.

Remark: You can remove all sec. structure labels by using the Special Character command: A S all (button 'hide labels' in webESPript) if you want to prepare figures such as example5.

SequenceOutput
The SEQ option extracts a single sequence in one-letter code from a multiple alignment file entered as file.aln. By default, this sequence corresponds to the first one displayed in the ESPript figure and is written to a file named file.seq. The extracted sequence can be used in NPS@ or other servers to perform further queries. The SEQ option can also be used to extract sequence information from a PDB file.
BobScriptOutput
The BOB option generates a BOBSCRIPT command file, which is used in turn to generate a PostScript figure showing secondary structure elements defined by DSSP or STRIDE and coloured according to sequence similarity (see appendix2). You can use ENDscript to obtain such a figure without too much pain.

Remark: You can use the command BOBL instead of BOB in order to show the side chains of strictly conserved residues in the PostScript file.

Input Line 4: Similarity Score

Syntax SimilarityGlobalScore  SimilarityDiffScore  SimilarityType  Consensus 
Example 0.7  0.5  R, B, P, I or S, M, E 
Mode in www beginner advanced beginner beginner

Check appendix for a general view on similarity computation and colour scheme.

SimilarityGlobalScore [default: 0.7]
If SimilarityType is R, B, P or I: a global score is calculated for all sequences by extracting all possible pairs of residues per column. If applicable, a second score is calculated within each group of sequences.
If SimilarityType is S, M or E: a percentage is calculated for each column of residues.

If the score is greater than SimilarityGlobalScore, it will be rendered as coloured characters (red characters on a white background by default and white characters on a red background if residues are strictly conserved in the column) with frames (blue by default). Note that strictly conserved residues are boxed but are not framed, if you enter a SimilarityGlobalScore greater than 1.

SimilarityDiffScore [default: 0.5]
Applicable if R, B, P or I as SimilarityType: residues which are conserved within a group but not conserved from one group to the other are highlighted (yellow background by default).
SimilarityType [default: R]
If R, B, P or I: a matrix is used to calculate the similarity score. Risler, Blosum62, Pam250 and Identity are the four possibilities. We recommend a SimilarityGlobalScore of 0.1-0.2 with B or P matrices and of 0.6-0.7 with R or I matrices.
If S: a percentage of Strictly conserved residues is calculated per column.
If M: a percentage of similarity is calculated using the criteria of MULTALIN.
IV ; LM ; FY ; NDQEBZ
If E: a percentage of Equivalent residues is calculated per column, considering physico-chemical properties.
HKR are polar positive ; DE are polar negative ; STNQ are polar neutral ;
AVLIM are non polar aliphatic ; FYW are non polar aromatic ; PG ; C
Consensus [default: none]
A consensus sequence is generated using criteria from MULTALIN: uppercase is identity, lowercase is consensus level > 0.5, ! is any one of IV, $ is any one of LM, % is any one of FY, # is any one of NDQEBZ.

Remark: lowercase is consensus level > SimilarityGlobalScore if S, M or E are used as SimilarityType
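
The per-column percentage used with SimilarityType E can be illustrated with the physico-chemical classes listed above. The exact formula used by ESPript is not given on this page, so the sketch below assumes that the score of a column is the share of its residues falling in the most populated class, with gaps counted in the denominator. The names EQUIVALENT_GROUPS and equivalent_fraction are hypothetical.

```python
from collections import Counter

# Classes taken from the description of SimilarityType E.
EQUIVALENT_GROUPS = ["HKR", "DE", "STNQ", "AVLIM", "FYW", "PG", "C"]
GROUP_OF = {res: idx for idx, grp in enumerate(EQUIVALENT_GROUPS) for res in grp}


def equivalent_fraction(column):
    """Share of the column occupied by its most populated class (assumed formula)."""
    classes = [GROUP_OF[res] for res in column if res in GROUP_OF]
    if not classes:
        return 0.0
    most_common_count = Counter(classes).most_common(1)[0][1]
    return most_common_count / len(column)


score = equivalent_fraction("KRHD")   # three polar positive residues, one polar negative
print(score, score > 0.7)
```

With the default SimilarityGlobalScore of 0.7, this column would be rendered as similar under the stated assumption.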

Input Line 5: Output Layout

Syntax FontSize  ColumnNb  Vgap  Vshift  Hshift  Bshift  PrinterOpt  Paper  AllNumbered 
Example 70  C, T, S, B, F P, L, P3, L3
Mode in www beginner beginner beginner beginner beginner advanced beginner beginner beginner
FontSize [default: 7]
Size in points for the fonts (Courier for sequence names and residues).
ColumnNb [default: 70]
Number of residue columns per line.

Remark: a suggested number of columns is calculated and written at the end of the log file (end of the OUT file in webESPript). This value can be re-entered in order to obtain a justified figure (look for the sentence 'suggestion columns per line' in the log file or in OUT).

Vgap [default: 6]

Vertical gap between two blocks of sequences. The unit for the distance is the height of a line.
Vshift [default: 0]
Vertical shift for the whole display. The unit for the distance is the height of a line.
Hshift [default: 0, centred]
Horizontal shift for the whole display. The unit for the distance is the width of a residue.
Bshift [default: 0]
Shift lines below bottom sequence. The unit for the distance is the width of a residue.
PrinterOpt [default: C]
C coloured output ; T coloured with all letters in bold, ideal for thermal printers or any others, before reduction of your figure in an article ; S light yellow background, ideal for slides ; B black & white, a grey scale is used (see standard colours in section Special Characters) ; F flashy colours, similar residues are written with black bold characters and boxed in yellow, ideal for overheads.
Paper [default: P]
P: Portrait(A4) ; L: Landscape(A4) ; P3: Portrait(A3) ; L3: Landscape(A3).
AllNumbered [default: first sequence]
By default, the first sequence is numbered every ten residues as in example2. With the option N (click on 'number sequences' in webESPript) all sequences are numbered at the beginning of each block of sequences as in example3
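
To picture how ColumnNb cuts an alignment into blocks of sequences, the sketch below splits aligned sequences into slices of at most ColumnNb columns. It only illustrates the layout idea and is not ESPript code; the function name split_into_blocks is hypothetical.

```python
def split_into_blocks(aligned, column_nb=70):
    """Split a dict {name: aligned sequence} into blocks of column_nb columns."""
    length = max(len(seq) for seq in aligned.values())
    blocks = []
    for start in range(0, length, column_nb):
        blocks.append({name: seq[start:start + column_nb]
                       for name, seq in aligned.items()})
    return blocks


alignment = {"seq1": "A" * 150, "seq2": "C" * 150}
blocks = split_into_blocks(alignment, column_nb=70)
print(len(blocks), [len(block["seq1"]) for block in blocks])
```

An alignment of 150 columns with the default ColumnNb of 70 gives two full blocks and a last block of 10 columns, which is why re-entering the suggested column number can produce a better justified figure.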

Input Lines 6: Special Commands and Characters

hide sequences

Example @skip
Mode in www advanced

Aligned sequences are not written (click on 'hide sequence' in webESPript). '@skip' is a shortcut for the block of Special Characters below:
I S all ! skip all
F S all ! "
H S all ! "
B S all ! "
O S all ! "
N S all ! "
T S all ! "
Y S all ! "
This option can be used to build a figure including several secondary structure elements as in example4.

more info from a file from Predict Protein or from NPS@

Example @pp
Mode in www advanced

Extra information can be extracted upon use of the command '@pp', if
1. a mail from the Predict Protein server is entered as file.aln.
- ProDom domains are visualized with yellow bars below each block of sequences
- x marks from the SEG low-complexity [1] search are represented with dotted lines
- peptides resulting from a PROSITE [2] search are shown with bold letters
2. a file from the NPS@ server with multiple sequence alignment and predicted secondary structure elements is entered as file.aln.
- predicted sec.structure elements are shown below each aligned sequence: i.e. helices with squiggles, beta strands with arrows, ambiguous predictions with solid circles

minus or plus

Example @minus 5 40
@plus 63
Mode in www expert

Residue numbering can be changed along a single sequence. If '@minus' is used, numbering is shifted by -1 at the given column (here at columns 5 and 40). If '@plus' is used, residue numbering is shifted by +1 at the given column. Before using this option, use the command '@ruler' described below to visualize column numbers.

Remark: '@minus' and '@plus' are equivalent to the buttons 'delete in seq' and 'insert in seq' in webESPript
Note that by default sequence numbering refers to the first displayed sequence, but it can refer to the third displayed sequence (for example) if you enter the Special Command Y D 3.
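
The renumbering performed by '@minus' and '@plus' can be sketched as follows. This is an interpretation of the description above, not ESPript code: it assumes that the shift is applied starting at the given column and that gap columns (written - or .) carry no number. The function name column_numbers is hypothetical.

```python
def column_numbers(aligned_seq, start_index=1, minus=(), plus=()):
    """Return the residue number shown at each column (None for gap columns)."""
    numbers = []
    current = start_index - 1
    for column, char in enumerate(aligned_seq, start=1):
        if column in minus:
            current -= 1          # '@minus': numbering shifted by -1 from this column
        if column in plus:
            current += 1          # '@plus': numbering shifted by +1 from this column
        if char in "-.":
            numbers.append(None)  # gaps are not numbered
        else:
            current += 1
            numbers.append(current)
    return numbers


print(column_numbers("ACDEFG", minus={3}))
print(column_numbers("ACDEFG", plus={3}))
```

Under these assumptions, a shift of -1 at column 3 repeats a residue number, and a shift of +1 at column 3 skips one.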

preview column numbers

Example @ruler
Mode in www advanced

Column numbers are displayed on the PostScript image. This option is useful when preparing a figure with the special commands '@minus' or '@plus' presented above, or the Special Characters Q, V, W (section Do_it).

insert text at sequences

Example @seq 5 text
@seq vp7_ehdv1 text
Mode in www expert

The command is: @seq [sequence number or sequence name] [text or blank]
The text is then inserted above the chosen sequence. Note that sequence numbers are given in the log file of ESPript.

Special case: the text is inserted below the last displayed sequence if you choose a number greater than the number of displayed sequences. Thus, you can give a name to a line of Special Characters and change the colour of the name with the Special Character T. As a test, you can enter in example2 the two command lines below:
@seq 100 important residues
T R 100

modify or create colours

Example @col R .8 0 0 B 0 0 .8
Mode in www expert

Assigns a new rgb code for a Special Characters colour in ESPript (here red, R, and blue, B, colours are modified).
You can also create a new special character colour, such as A for grey:
@col A .5 .5 .5 ! create a new colour named A
I A all      ! strictly conserved residues are in grey

Remark: a new character colour must be created before being used, as in the example above. S is reserved for skip. Otherwise, any uppercase character can be used. Have a look at this site to choose new colours and corresponding percent rgb values (range is 0-1 and white is 1 1 1).

replace labels

Example @aA1 aA2 bB1 hH1 bB2
@aA3 bB3
Mode in www advanced

Sec. structure labels can be replaced by new ones defined by the user. Labels starting with a, b, h, p refer to alpha helices, beta strands, 3₁₀ helices and 5-helices respectively. These first characters are not displayed. Replacement is made in order of entry (see example4), first through the top sec. structure elements, then through the bottom sec. structure elements if applicable.
Command lines can be written with all alpha helices first, then all beta strands, 3₁₀ helices and 5-helices (for instance you can remove the labels of all 3₁₀ helices by typing as many @h h h h h as needed).

Remark: If the first letter is typed in uppercase (@Ag1 Ag2), the second letter is displayed using a Symbol font (here, displayed labels would be gamma1 gamma2).

hide turns

Example @nott
Mode in www advanced

Strict alpha and beta turns, usually rendered as TTT and TT, are not displayed (see information on secondary structures).

insert secondary structure elements

Example @top a 10-20 20-30 b 50-55
@bottom b 25-35
Mode in www expert

Inserts alpha helices (a), beta strands (b), 3₁₀ helices (h) or 5-helices (p) at the top or bottom of sequence blocks. Rules of numbering are the same as in section Secondary Structures (i.e. by default, top and bottom secondary structure elements match top and bottom sequences respectively).

Remark: You can enter up to 264 characters on this line of command. Click on the button +1 of the interface to duplicate the form, if you use webESPript and if you exceed this limit. Thus, you may be able to enter alpha helices in part [01], and beta strands in part [0 1], while still being under the limit of 264 characters in each part.

hide names of secondary structure elements

Example @noname
Mode in www advanced

Removes the name of the corresponding sequence at the beginning of each line of sec. str. elements. By default, this name has the same colour as the first displayed element.

Remark: Assume a very special case, where your sequence starts at 10, and you want to colour sec.structure name in red and sec.structure elements in blue. Then you can use the Special Characters command X:
X R 10-10
X B 11-4500
.

hide alternate conformations

Example @noalt
Mode in www advanced
Removes grey stars added at the top of sequence blocks, above residues modelled with alternate conformations. These residues are flagged automatically when you use the web interface and directly upload a PDB file in the box reserved for Secondary Structure information.

hide disulphide bridges

Example @nodi
Mode in www advanced
Removes green digits (1 1, 2 2...) added on the figure at the bottom of sequence blocks to show disulphide bridges. These bound cysteines are flagged automatically when you use the web interface and directly upload a PDB file in the box reserved for Secondary Structure information.

substitute sequence names

Example @sub oldname1 newname1 oldname2 newname2
Mode in www expert
Replaces the name of a sequence contained in your alignment file file.aln by a new one. You can substitute up to 15 names.

Remark: Suppose you want to change the names of the first and third displayed sequences.
You can also type: @sub 1 newname1 3 newname2

Input Lines 6b: Special Characters

Syntax   Character-Type  Character-Colour  Position
Choices  Character-Type: P, T, R, X, Y, Z, Q, V, W, U, D, G, J, S, C, E, L, K, A, I, F, M, H, B, O, N,
         s, t, u, a, b, c, d, e, f, g, h, i, j, k, l, m, n
         Character-Colour: D, B, R, P, G, C, O, Y, W, S
         Position: residue numbers or ranges, for example 2 9-39
Example  U R 2 9-39
Mode in www advanced
Entry on each line is Character-Type Colour Position. Thus, the command to display red triangles on residues 2 and from 9 to 39 is:
U R 2 9-39
By default, residues are numbered according to the first displayed sequence.

Character-Type

Type 1: miscellaneous
P calculates hydropathy ; T changes colour of sequence names ; R reads intermolecular contacts

Type 2: assignment
X top secondary structure information is assigned to a chosen sequence, which is the first one by default; colour of secondary elements can be changed ; Y sequence numbering is assigned to a chosen sequence, which is the first one by default; colour of digits can be changed
; Z residue numbering of another sequence, which is the last one by default, can be displayed at the bottom of sequences blocks ; secondary structure information corresponding to this sequence can also be displayed (see example3 )

Type 3: do it yourself
Q boxes residues (check example5) ; V bold characters ; W adds frames

Type 4a: adding markers
U triangle up (check example2); D triangle down ; G go ; J jammed ; S star ; C solid circle ; E open circle ; L dotted line
; K stroke

Type 4b: changing default colours of
A labels above top sec. structure elements ; I identity boxes ; F identity characters ; M group similarity boxes ; H group similarity characters ; B global similarity frames ; O difference similarity boxes ; N low similarity scores

Type 4c: adding NMR markers
s: amide proton slow exchange rate (< 1 min⁻¹)
t: ³J(HN,Hα) NH-Hα coupling constant less than 6 Hz
u: ³J(HN,Hα) NH-Hα coupling constant greater than or equal to 7 Hz
a, b, c: dNN(i,i+1): NOE between proton NH of residue i and proton NH of i+1 (weak, medium, strong)
d, e, f: dαN(i,i+1): NOE between proton alpha of residue i and proton NH of i+1 (weak, medium, strong)
g, h, i: dβN(i,i+1): NOE between proton beta of residue i and proton NH of i+1 (weak, medium, strong)
j: dNN(i,i+2): NOE between proton NH of residue i and proton NH of i+2
k: dαN(i,i+2): NOE between proton alpha of residue i and proton NH of i+2
l: dαN(i,i+3): NOE between proton alpha of residue i and proton NH of i+3
m: dαβ(i,i+3): NOE between proton alpha of residue i and proton beta of i+3
n: dαN(i,i+4): NOE between proton alpha of residue i and proton NH of i+4
Character-Colour (except if R is Character-Type)
Dark(black), Blue, Red, Pink, Green, Cyan, Orange, Yellow, White, Skip(transparent)
Remark: a grey scale is used in mode black & white
Position, four cases, [] means mandatory and {} optional

1a. if P or T: [Character-Colour] [sequence name number or range, 1-1000 stands for all] {new sequence name number or range} {....}
Example to calculate hydropathy of the third displayed sequence: P R 3
(the string 'hyd' will be written in red) or to colour the name of the second sequence in green: T G 2
1b. if R: [ChainId] [residue range] {new residue range} {...}
check the appendix for details on intermolecular contacts

2. if X, Y or Z: [Character-Colour] [name or number of sequence displayed] {Start-Index, 1 by default} OR [residue range] {new residue range} {...}

Example to assign the first sec. str. file to the third displayed sequence: X B 3 (sec. structure elements are in blue), to number the fourth displayed sequence in blue: Z B 4 (the same command Z B 4 can be used to assign the second sec. structure file to the fourth displayed sequence and to colour sec. structure elements in blue).
You can also colour elements in blue and red as in the example below
X B 3 ! sec. str. elements refer to the third displayed sequence and are in blue. This sequence is now the reference
X R 4-50 60-80 ! but sec. structure elements from residues 4 to 50 and from 60 to 80 are in red

Remark: you can type X B name_of_the_third_displayed_sequence instead of X B 3 as in the example above.

3. if Q, V or W: [Character-Colour] [number or range of sequence displayed] {column range} {new column range} {...}

Note that column numbering is used in this case instead of residue numbering.
Use the special command @ruler to preview column numbers.
Example to highlight in yellow residues of sequences 3-8 from columns 40 to 45 and from 50 to 55: Q Y 3-8 40-45 50-55, or to highlight the last sequence in cyan: Q C 1000.

4. if U, D, S, C, L, A, I, F, M, H, B, O, N, s, t, u, a, b, c, d, e, f, g, h, i, j, k, l, m or n: [Character-Colour] [residue number or range] {new residue number or range} {...}
Example to add red triangles at residue 2 and from 9 to 39: U R 2 9-39, to box all identical residues in blue: I B 1-4500, to remove all sec. structure labels: A S 1-4500
By default, positions refer to residue numbering of the first displayed sequence. Use the special command Y to change this default.

Y B 3 ! residue numbering refers to the third displayed sequence and residue numbers are in blue
U R 9 20-30 ! adds red triangles below columns containing residues 9 and 20 to 30 of sequence 3

Remark: Enter one special character per line. You can repeat a residue range or a column range on this line. You can use the string all instead of 1-1000 in case 1a and of 1-4500 in cases 1b, 2, 3 and 4 (i.e. A S all to remove all secondary structure labels).
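
The Character-Type Colour Position syntax can be made concrete with a small parser. This is a sketch only: it handles numeric positions, ranges and the string all, but not sequence names (as in X B name) or negative residue numbers. The function names parse_positions and parse_special_character are hypothetical.

```python
def parse_positions(tokens, all_range):
    """Turn tokens such as '2', '9-39' or 'all' into (first, last) tuples."""
    ranges = []
    for token in tokens:
        if token == "all":
            ranges.append(all_range)
        elif "-" in token:
            first, last = token.split("-")
            ranges.append((int(first), int(last)))
        else:
            ranges.append((int(token), int(token)))
    return ranges


def parse_special_character(line):
    """Split one Special Character line into (type, colour, positions)."""
    body = line.split("!", 1)[0]                 # drop a trailing '! comment'
    char_type, colour, *positions = body.split()
    # 'all' stands for 1-1000 with P or T (case 1a) and for 1-4500 otherwise
    all_range = (1, 1000) if char_type in ("P", "T") else (1, 4500)
    return char_type, colour, parse_positions(positions, all_range)


print(parse_special_character("U R 2 9-39"))
print(parse_special_character("I S all ! skip all"))
```

The first call describes red triangles on residue 2 and on residues 9 to 39; the second corresponds to the first line of the '@skip' block.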

Input Line 6c: Comment

Example %This is a reminder
Mode in www advanced

A line beginning with % will be displayed at the bottom of the generated PostScript, as a comment or a title.

Input Line 6d: Ending the section

Example .
Mode in www advanced

A single dot on a line ends this section.

Input Lines 7...: Defining Groups and Blocks

Example 1-4 9 %8
6 5 7 
.
Mode in www beginner

You can select the sequences to be displayed and their order on a single line: 2 1 3-5
all can be used to select the rest of the sequences: 2 all (see example5)
A % before a sequence number keeps a sequence for similarity calculations but prevents it from displaying: 2 %1 %3-5 (see example4).

You can also separate your sequences in groups for similarity computations, each line defining a group and giving the order of the sequences to display as in example2 (ADVanced or EXPert modes in webESPript). The calculation by group is not performed if SimilarityType is Strict, Multalin or Equivalent (groups are just numbered).
This section is ended by a single dot on a single line.
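
A group line can be read as a list of sequence numbers to display, in order, plus sequences kept only for similarity calculations. The sketch below shows one way to interpret such a line; it does not handle the keyword all, and the function name parse_group_line is hypothetical.

```python
def parse_group_line(line):
    """Return (displayed, hidden) lists of sequence numbers for one group line."""
    displayed, hidden = [], []
    for token in line.split():
        target = displayed
        if token.startswith("%"):          # kept for similarity, not displayed
            target, token = hidden, token[1:]
        first, _, last = token.partition("-")
        target.extend(range(int(first), int(last or first) + 1))
    return displayed, hidden


print(parse_group_line("1-4 9 %8"))
print(parse_group_line("2 %1 %3-5"))
```

For the line 1-4 9 %8, sequences 1 to 4 and 9 are displayed in that order, while sequence 8 only contributes to the similarity calculation.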


Appendix

file.aln

file.aln is the ascii file containing aligned sequences. The following formats are supported: Should you have other aligned sequences, be sure to keep two fields per line: the first one is the name of the sequence, the second one the sequence itself. Use white characters (spaces) to separate the two fields; use blank lines to separate two blocks as in:
vp7_btv1s       MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE
vp7_btv10       MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE

vp7_btv1s       MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
vp7_btv10       MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
FASTA format for multiple alignments is supported. Sequences can be entered as below:
> vp7_btv1s
MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE
MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
> vp7_btv10
MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNE
MFFMCLDMMLSAAGINVGPISPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWG
If a Start-Index is present in file.aln (at least in the first block of sequences), residue numbering is modified accordingly. Format is title_Start-Index_ or title(Start-Index) as below:
vp7_btv1s(3)    TIAARALTVMRACATLQEARIVLEANVMEIL
vp7_btv10(5)    --AARALTVMRACATLQEARIVLEANVMEIL

vp7_btv1s       GIAINRYNGLTLRGVTMRPTSLAQRNEMFFM
vp7_btv10       GIAINRYNGLTLRGVTMRPTSLAQRNEMFFM
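
The two-field block format and the title(Start-Index) convention can be read with a short parser. This is a minimal sketch, not the reader used by ESPript: it only recognises the parenthesised form of the Start-Index and skips any line that does not have exactly two fields. The function name parse_alignment is hypothetical.

```python
import re


def parse_alignment(text):
    """Parse 'name sequence' lines grouped in blocks separated by blank lines."""
    sequences, start_index = {}, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue                      # blank separator or unsupported line
        name, chunk = fields
        match = re.fullmatch(r"(.+)\((\d+)\)", name)
        if match:                         # title(Start-Index)
            name = match.group(1)
            start_index[name] = int(match.group(2))
        sequences[name] = sequences.get(name, "") + chunk
    return sequences, start_index


example = """vp7_btv1s(3)    TIAARALTVMRACATLQEARIVLEANVMEIL
vp7_btv10(5)    --AARALTVMRACATLQEARIVLEANVMEIL

vp7_btv1s       GIAINRYNGLTLRGVTMRPTSLAQRNEMFFM
vp7_btv10       GIAINRYNGLTLRGVTMRPTSLAQRNEMFFM
"""
sequences, start_index = parse_alignment(example)
print(start_index)
```

The blocks of each sequence are concatenated under its name, and the Start-Index found in the first block is kept separately.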

file.pdb

You can enter the name of a PDB file at the first input line (instead of the multiple alignment file, file.aln). ESPript will extract a one letter code sequence, corresponding to all the residues contained in this PDB file. You can display the sequence of a single monomer defined by a ChainID in the PDB file, by using the command chain_X on the input line: file.pdb chain_A

The extracted sequence is given the name of the input PDB file. This default can be changed if the header of the PDB file contains a line starting with DBREF. The string of characters following DBREF will be the name of the extracted sequence: DBREF sequence_name

You can also enter the name of a multiple alignment file, file.aln, and of a PDB file, file.pdb, on the first input line: file.aln file.pdb (see example2).
Then, a file named file_bcol.pdb is created by ESPript from file.pdb. The B-factors of the original file file.pdb are replaced by similarity scores in file_bcol.pdb. Note that the similarity scores in file_bcol.pdb are rescaled between 0 and 100. This makes it possible, in a next step, to show conserved regions along the structure with a colour ramp going from white to red. The command chain_X copies the similarity score of a chosen monomer into the output file file_bcol.pdb: file.aln file.pdb chain_A

The output PDB file, file_bcol.pdb, can be read in a BOBSCRIPT [13] command file, to produce the figure below (residues with SimilarityGlobalScore lower than 0.7 are in white, conserved areas with SimilarityGlobalScore in the range 0.7-1.0 are colour-ramped in red with a 0-100 pseudo bfactor value). The command file for BOBSCRIPT (vp7_btv10.bob) can be obtained in ESPript by using the option BOB or more easily by using the interface ENDscript.


 

Intermolecular contacts

A log file produced by CNS [14] such as vp3_contact.log can be read by ESPript to display protein-protein contacts (see example3). You can also use ENDscript to generate such a figure quickly. A list of contacts is generated as follows:
Crystallographic contacts
Addition to CNS command file:
delete selection=(hydrogen) end flags exclude * include pvdw end
parameter nbond wmin=4.0 end end energy end
generates in CNS log file:
%atoms "A -62 -ASN -OD1 " and "C -112 -THR -C "(XSYM# 4) only 3.64 A apart
Non-crystallographic contacts
Addition to CNS command file:
flags exclude * include vdw end parameter nbond wmin=0 end end
distance cuton=0.0 cutoff=4 from =(segid A) to =(not segid A) end
generates in CNS log file:
atoms "A -90 -ALA -CB " and "B -181 -HIS -CE1 " 3.6958 A apart
Residue names, residue numbers, first letter of segIDs and distances are extracted from the CNS log file. If the input line in ESPript is R A all, segIDs of all residues in contact with molecule A are displayed on a bottom line named i_A. The segID character is in red if the distance is less than 3.2 Å and in black if it is in the range 3.2-4 Å. The shortest intermolecular distance is taken for each residue. Thus, a B would be written under residue 90, if the distance listed in the example above is the shortest between Ala90 segA and His181 segB. A A would be written under His181 on a new bottom line named i_B with the command R B all.
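
The extraction described above can be sketched with a regular expression applied to the two CNS log lines quoted in this appendix. This is an illustration, not the parser used by ESPript: the pattern is written for the line layout shown here and the function name contacts_for is hypothetical.

```python
import re

# One quoted atom: segID, -residue number, -residue name, -atom name
ATOM = r'"(\S+)\s*-(\d+)\s*-(\w+)\s*-(\S+)\s*"'
CONTACT = re.compile(r'atoms\s+' + ATOM + r'\s+and\s+' + ATOM
                     + r'.*?([\d.]+)\s+A apart')


def contacts_for(log_text, chain):
    """Map each residue number of `chain` to (partner segID letter, distance, colour)."""
    best = {}
    for match in CONTACT.finditer(log_text):
        seg1, res1, _, _, seg2, res2, _, _, dist = match.groups()
        dist = float(dist)
        for seg, res, partner in ((seg1, res1, seg2), (seg2, res2, seg1)):
            if seg[0] != chain:
                continue
            res = int(res)
            # keep only the shortest intermolecular distance per residue
            if res not in best or dist < best[res][1]:
                colour = "red" if dist < 3.2 else "black"
                best[res] = (partner[0], dist, colour)
    return best


log_text = '''%atoms "A -62 -ASN -OD1 " and "C -112 -THR -C "(XSYM# 4) only 3.64 A apart
atoms "A -90 -ALA -CB " and "B -181 -HIS -CE1 " 3.6958 A apart
'''
print(contacts_for(log_text, "A"))
print(contacts_for(log_text, "B"))
```

With these two lines, a black B is associated with residue 90 of molecule A and a black A with residue 181 of molecule B, as in the description of the commands R A all and R B all.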

Contacts can be further analysed by looking at the figure produced by ESPript:

  • a to z in italic are crystallographic contacts between residues
  • A to Z in italic are non-crystallographic contacts between residues
  • # is a crystallographic contact between two residues having same names, numbers and segIDs (crystallographic identity)
  • a to z are crystallographic contacts between residues having same names, numbers but different segIDs
  • A to Z are non-crystallographic contacts between residues having same names, numbers but different segIDs (for example along a non-crystallographic 2-fold axis)
file.2st

    This file is an ASCII file from which ESPript will extract Secondary Structure information. The following formats are supported: Alpha helices, 3-helices (3₁₀) and 5-helices are displayed as medium, small and large squiggles respectively. Beta strands are rendered as arrows, strict beta turns as TT letters and strict alpha turns as TTT. The secondary structures files of the two sequences have been entered in the excerpt below.

    A verification is performed between residue names of the secondary structure file and of the chosen sequence (which is the first displayed by default). In case of problem, the program will try to align the two sequences without gaps. You get the following warnings, if some residues do not correspond between the two sequences:
     Warning: DSSP residue M does not match seq residue D 2 sequence 1 column 2
    If the program failed to align the two sequences, you get an error message:
     Warning: DSSP residue M does not match seq residue D 2 sequence 1 column 2
     Warning: DSSP residue D does not match seq residue T 3 sequence 1 column 3
    ...........................
     Error: sec. structure elements are certainly misplaced
    and the figure generated by ESPript gives you a false information.

    Remark: a file produced by DSSP can include the positions of disulphide bridges. This information is rendered in ESPript by green digits (1 1, 2 2 ...) written under each column with a bound cysteine. Residues with alternate positions can also be flagged in the DSSP file (we use a modified version of DSSP on the www interface), in order to be marked by grey stars at the top of sequence blocks in the PostScript figure.

    Accessibility

    The relative accessibility of each residue can be extracted from DSSP [15] and PHD [18] files. It is rendered as blue-coloured boxes located at the last or first line of each block (see remark in section Secondary Structures). Note that DSSP includes only protein atoms in its calculation of accessibility. Co-ordinates of water molecules, ligands... are not taken into account.

    The blue square scale is set as follows:
    colour                  value             accessibility
    blue                    A>0.4             accessible
    cyan                    A>0.1 and A<0.4   intermediate
    white                   A<0.1             buried
    blue with red borders   A>1
    red                     either accessibility is not predicted in PHD [18] or residue names between sequence and DSSP [15] file do not match

    Maximum accessibility values for each residue according to DSSP:
    A    C    D    E    F    G    H    I    K    L
    106  135  163  194  197  84   184  169  205  164

    M    N    P    Q    R    S    T    V    W    Y
    188  157  136  198  248  130  142  142  227  222
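
    As a rough illustration of how these values relate to the colour scale, the Python sketch below divides an absolute accessibility by the maximum value for the residue type and maps the result to a colour. This is a sketch under assumptions, not ESPript's code: the function names are hypothetical, the relative accessibility is assumed to be the absolute value divided by the maximum, and the treatment of values exactly equal to 0.1 or 0.4 is a guess because the table does not cover them.

```python
# Maximum accessibility per residue type, as listed in this section.
MAX_ACC = {
    "A": 106, "C": 135, "D": 163, "E": 194, "F": 197,
    "G": 84, "H": 184, "I": 169, "K": 205, "L": 164,
    "M": 188, "N": 157, "P": 136, "Q": 198, "R": 248,
    "S": 130, "T": 142, "V": 142, "W": 227, "Y": 222,
}


def relative_accessibility(residue, absolute_acc):
    """Assumed definition: absolute accessibility divided by the residue maximum."""
    return absolute_acc / MAX_ACC[residue]


def accessibility_colour(a):
    """Map a relative accessibility A to the colour scale of this section."""
    if a > 1.0:
        return "blue with red borders"
    if a > 0.4:
        return "blue"    # accessible
    if a > 0.1:
        return "cyan"    # intermediate (assumption: A == 0.4 falls here)
    return "white"       # buried (assumption: A == 0.1 falls here)


a = relative_accessibility("A", 53)   # made-up absolute value for an alanine
print(a, accessibility_colour(a))
```

    The red case (no prediction, or mismatching residue names) is not modelled here because it depends on the input files rather than on a numerical value.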

    Hydropathy

    The hydropathic character of a sequence selected with the P command (P D 1 for the first displayed sequence) is calculated according to the algorithm of Kyte and Doolittle [19] with a window of 3.

    colour  values              hydropathy
    pink    H>1.5               hydrophobic
    grey    H>-1.5 and H<1.5    intermediate
    cyan    H<-1.5              hydrophilic

    Hydropathic values for each residue:
    I     V     L     F     C     M     A     G     T     S
    4.5   4.2   3.8   2.8   2.5   1.9   1.8   -0.4  -0.7  -0.8

    W     Y     P     H     E     Q     D     N     K     R
    -0.9  -1.3  -1.6  -3.2  -3.5  -3.5  -3.5  -3.5  -3.9  -4.5
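
    A sliding-window calculation of this kind can be sketched in Python as follows. The sketch assumes that the value at each position is the mean of the hydropathic values in a window of 3 centred on the residue, and that the window is simply truncated at the sequence ends; neither detail is stated above, and the function names are hypothetical.

```python
# Hydropathic values per residue, as listed in this section.
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(seq, window=3):
    """Mean hydropathy in a centred window (assumption: truncated at the ends)."""
    half = window // 2
    profile = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        values = [KD[res] for res in seq[lo:hi]]
        profile.append(sum(values) / len(values))
    return profile


def hydropathy_colour(h):
    """Map a hydropathy value H to the colour scale of this section."""
    if h > 1.5:
        return "pink"    # hydrophobic
    if h < -1.5:
        return "cyan"    # hydrophilic
    return "grey"        # intermediate (boundary values are assigned here by assumption)


seq = "ATREYES"   # the example sequence used earlier in this manual
for res, h in zip(seq, hydropathy_profile(seq)):
    print(res, round(h, 2), hydropathy_colour(h))
```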

    Similarity computations

    Similarity Scores

    If the Risler, BLOSUM62, PAM250 or Identity matrix is selected, several scores are calculated. The user specifies thresholds for the in-group (ThIn) and diff-group (ThDiff) scores.

    Similarity Scores Matrices

    Risler Matrix [20]

            A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y  .
         A 22-15  2 17  6  6 -6 17 14 13 10 13 -2 18 15 20 19 20 -9  2-30
         C-15 22-17-15-16-17-18-16-16-15-16-16-18-14-15-13-14-14-18-11-30
         D  2-17 22 10 -3 -4-13  0  1 -2 -5  8-12  6 -1  7  0  0-14 -4-30
         E 17-15 10 22  6  3 -6 15 14  9  6 14 -1 21 19 18 16 16-10  2-30
         F  6-16 -3  6 22 -4-11 10  1 10 -2  4-11  7  4  5  3  8 -9 20-30
         G  6-17 -4  3 -4 22-12  0 -1 -2 -4  2-12  2  1  7  2  1-13 -2-30
         H -6-18-13 -6-11-12 22 -8-10 -9-12 -3-16 -5 -4 -4 -9 -7-17 -8-30
         I 17-16  0 15 10  0 -8 22 10 21  9  9 -6 14 14 16 16 22 -7  4-30
         K 14-16  1 14  1 -1-10 10 22  7  4 10 -7 17 21 14 12 12-11  5-30
         L 13-15 -2  9 10 -2 -9 21  7 22 18  8 -8 11 12 13 12 20 -8  5-30
         M 10-16 -5  6 -2 -4-12  9  4 18 22  0-12 12 11  6  8  8-13 -2-30
         N 13-16  8 14  4  2 -3  9 10  8  0 22-10 16 12 19 11 11-11 -1-30
         P -2-18-12 -1-11-12-16 -6 -7 -8-12-10 22 -6 -3 -3 -5 -6-16-12-30
         Q 18-14  6 21  7  2 -5 14 17 11 12 16 -6 22 20 18 17 15-10  5-30
         R 15-15 -1 19  4  1 -4 14 21 12 11 12 -3 20 22 20 19 15 -8  8-30
         S 20-13  7 18  5  7 -4 16 14 13  6 19 -3 18 20 22 21 18 -8  4-30
         T 19-14  0 16  3  2 -9 16 12 12  8 11 -5 17 19 21 22 16-10  3-30
         V 20-14  0 16  8  1 -7 22 12 20  8 11 -6 15 15 18 16 22 -7  3-30
         W -9-18-14-10 -9-13-17 -7-11 -8-13-11-16-10 -8 -8-10 -7 22 -6-30
         Y  2-11 -4  2 20 -2 -8  4  5  5 -2 -1-12  5  8  4  3  3 -6 22-30
         .-30-30-30-30-30-30-30-30-30-30-30-30-30-30-30-30-30-30-30-30  0
    

    PAM250 Matrix [21]

            A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  .
         A  2 -2  0  0 -2  0  0  1 -1 -1 -2 -1 -1 -4  1  1  1 -6 -3  0-15
         R -2  6  0 -1 -4  1 -1 -3  2 -2 -3  3  0 -4  0  0 -1  2 -4 -2-15
         N  0  0  2  2 -4  1  1  0  2 -2 -3  1 -2 -4 -1  1  0 -4 -2 -2-15
         D  0 -1  2  4 -5  2  3  1  1 -2 -4  0 -3 -6 -1  0  0 -7 -4 -2-15
         C -2 -4 -4 -5 12 -5 -5 -3 -3 -2 -6 -5 -5 -4 -3  0 -2 -8  0 -2-15
         Q  0  1  1  2 -5  4  2 -1  3 -2 -2  1 -1 -5  0 -1 -1 -5 -4 -2-15
         E  0 -1  1  3 -5  2  4  0  1 -2 -3  0 -2 -5 -1  0  0 -7 -4 -2-15
         G  1 -3  0  1 -3 -1  0  5 -2 -3 -4 -2 -3 -5 -1  1  0 -7 -5 -1-15
         H -1  2  2  1 -3  3  1 -2  6 -2 -2  0 -2 -2  0 -1 -1 -3  0 -2-15
         I -1 -2 -2 -2 -2 -2 -2 -3 -2  5  2 -2  2  1 -2 -1  0 -5 -1  4-15
         L -2 -3 -3 -4 -6 -2 -3 -4 -2  2  6 -3  4  2 -3 -3 -2 -2 -1  2-15
         K -1  3  1  0 -5  1  0 -2  0 -2 -3  5  0 -5 -1  0  0 -3 -4 -2-15
         M -1  0 -2 -3 -5 -1 -2 -3 -2  2  4  0  6  0 -2 -2 -1 -4 -2  2-15
         F -4 -4 -4 -6 -4 -5 -5 -5 -2  1  2 -5  0  9 -5 -3 -3  0  7 -1-15
         P  1  0 -1 -1 -3  0 -1 -1  0 -2 -3 -1 -2 -5  6  1  0 -6 -5 -1-15
         S  1  0  1  0  0 -1  0  1 -1 -1 -3  0 -2 -3  1  2  1 -2 -3 -1-15
         T  1 -1  0  0 -2 -1  0  0 -1  0 -2  0 -1 -3  0  1  3 -5 -3  0-15
         W -6  2 -4 -7 -8 -5 -7 -7 -3 -5 -2 -3 -4  0 -6 -2 -5 17  0 -6-15
         Y -3 -4 -2 -4  0 -4 -4 -5  0 -1 -1 -4 -2  7 -5 -3 -3  0 10 -2-15
         V  0 -2 -2 -2 -2 -2 -2 -1 -2  4  2 -2  2 -1 -1 -1  0 -6 -2  4-15
         .-15-15-15-15-15-15-15-15-15-15-15-15-15-15-15-15-15-15-15-15  0
    

    BLOSUM62 Matrix [22]

            A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  .
         A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0 -4
         R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3 -4
         N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3 -4
         D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3 -4
         C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -4
         Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2 -4
         E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2 -4
         G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3 -4
         H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3 -4
         I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3 -4
         L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1 -4
         K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2 -4
         M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1 -4
         F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1 -4
         P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2 -4
         S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2 -4
         T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0 -4
         W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3 -4
         Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1 -4
         V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4 -4
         . -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4  1
    

    Identity Matrix

            A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  .
         A  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         R  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         N  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         D  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         C  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         Q  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         E  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
         G  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
         H  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0
         I  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0
         L  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0
         K  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0
         M  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0
         F  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0
         P  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
         S  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0
         T  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
         W  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0
         Y  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0
         V  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0
         .  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
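
    The matrices above are printed in fixed-width columns of three characters, so adjacent negative values touch (for example 22-15 in the first Risler row) and the rows cannot be split on whitespace. If you want to reuse the values in a script, a minimal Python sketch such as the following reads a row label followed by three-character fields. The function name is hypothetical and the excerpt contains only the first two Risler rows.

```python
RISLER_EXCERPT = """\
        A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y  .
     A 22-15  2 17  6  6 -6 17 14 13 10 13 -2 18 15 20 19 20 -9  2-30
     C-15 22-17-15-16-17-18-16-16-15-16-16-18-14-15-13-14-14-18-11-30
"""


def parse_matrix(text, width=3):
    """Parse a matrix printed as a one-letter row label plus fixed-width fields."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    columns = lines[0].split()           # header labels are separated by spaces
    matrix = {}
    for line in lines[1:]:
        row, body = line[0], line[1:]    # first character is the row label
        fields = [body[i:i + width] for i in range(0, len(body), width)]
        if len(fields) != len(columns):
            raise ValueError(f"row {row!r}: expected {len(columns)} fields, got {len(fields)}")
        matrix[row] = {col: int(field) for col, field in zip(columns, fields)}
    return matrix


risler = parse_matrix(RISLER_EXCERPT)
print(risler["A"]["C"], risler["C"]["A"], risler["A"]["."])
```

    The "." column and row hold the score used against a gap character, as printed in the tables.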
    

    Input file examples

    These examples refer to a study carried out with Prof. David Stuart, Division of Structural Biology, Oxford, on the viral proteins VP7 and VP3 of orbiviruses [23,24].

    1. vp2_rota.inp (resulting PostScript, Gif)

    vp2_rota.phd ! mail from the Predict Protein server on vp2 rotavirus
    *  A  none   ! shows predicted sec. str. elements and accessibility on the top of each block
    .
    0.7     E    ! physico-chemical boxing
    6 81         ! layout
    @pp          ! extracts all infos from the Predict Protein file
    @noname      ! no names for sec. structures
    .    
    2-6          ! sequences to be displayed
    . 

    2. vp7_adv.inp (resulting PostScript, Gif)

    vp7.aln vp7_btv10.pdb           ! aligned sequences (from CLUSTALW) and pdb file 
    vp7_btv10.dssp A                ! secondary structures (from DSSP) 
    vp7_adv.ps  M           ! PostScript output 
    .7 .5    R              ! similarity criteria 
    7 60                    ! layout
    U R 127 250                     !  -> red triangles
    S B 168-170 178-180             !  -> blue stars
    X B 1-126 254-349               !  -> sec. structure information in blue 
    X G 127-253                     !  -> sec. structure information in green
    T R 1-2                         !  -> names of btv sequences in red
    @noname                         ! no names for sec. structure
    %Alignment for protein VP7.   
    .                               ! 
    2 1 3-6                 ! first  group of sequences
    7-8                     ! second group of sequences     
    9                       ! third  group of sequences
    .                       !
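
    To make the group lines in this example easier to read, the Python sketch below expands a line such as "2 1 3-6" into an explicit list of sequence numbers. It rests on assumptions drawn from the annotated examples rather than on ESPript's source: text after "!" is treated as a comment, N-M is read as an inclusive range, and other tokens that appear in input files (such as "all" or entries starting with %) are rejected instead of being interpreted. The function name is hypothetical.

```python
def expand_group(line):
    """Expand a group line like '2 1 3-6 ! comment' into [2, 1, 3, 4, 5, 6]."""
    body = line.split("!", 1)[0]          # assumption: '!' starts a comment
    numbers = []
    for token in body.split():
        if token.isdigit():
            numbers.append(int(token))
        elif "-" in token and all(part.isdigit() for part in token.split("-", 1)):
            first, last = (int(part) for part in token.split("-", 1))
            numbers.extend(range(first, last + 1))   # inclusive range
        else:
            raise ValueError(f"unsupported token: {token!r}")
    return numbers


for line in ("2 1 3-6   ! first  group of sequences",
             "7-8       ! second group of sequences",
             "9         ! third  group of sequences"):
    print(expand_group(line))
```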

    3. vp3_sup.inp (resulting PostScript, Gif)

    vp3.aln +               ! 1.CLU
    vp31001.art_dssp        ! DSSP file for btv-10 vp3 
    vp3_sup.ps M            ! PostScript output
    .7 .5 R                         
    5 160 13 8 0 C L N
    T R 1                   ! title 1 sequence in red
    Y R all              ! numbering 1 sequence in red
    @noname                 ! no names for sec. structure
    %Alignment vp3 orbivirus and attempt against vp2 rotavirus
    .                       ! end special characters
    1-2                      ! 1 group of sequences 
    9 10                     ! 2  "
    11                       ! 3  " 
    .                        ! end 
    vp2_casper.aln 8         ! 2.THREADER alignment between btv vp3 and rota vp2
    vp31001.art_dssp  vp2_rota.phd ! DSSP file for btv and PHD file for rota
    vp3_sup.ps 
    .7 R     
    5 160 16 -1 0 C L N
    T R 1                  ! title 1 sequence in red
    Y R all             ! numbering 1 sequence in red
    T B 2                  ! title 2 sequence in blue
    Z B 2                  ! numbering 2 sequence in blue 
    @noname                ! no names for sec. structure
    .                      ! end special characters
    .                      ! all sequences in one group
    The + on the first line enables extra input.


     

    4. vp3_art.inp (resulting PostScript, Gif)

    vp3.aln +            ! CLUSTAL alignment for VP3 and option + 
    vp32001.art_dssp     ! DSSP file for first monomer
    vp3_art.ps 
    .7 R 
    6 129 8 0 0 C L 
    X G 1       ! domains
    X D 1-1
    X B 298-587
    X R 699-856
    @skip       ! hide sequence 
    @noname     ! no names for sec. structure
    @ah3 bbC bbD bbE ah4 ah5 bbF bbG bbH ah6 bbI ah7 ah8 bbJ bbK ah9
    @ah10 bbL ah11 bbM bbN ah12 ah13 bbA ah1/bB ah2 ah3 bbD ah4 ah5
    @ah6 bbE bbF bbG ah7 bbH bbI bbJ/h8 ah9 ah10 ah11 bbK ah12 bbL
    @bbQ ah14 ah15 ah16 ah17 bbP/h18 ah19 ah20 ah21 bbA bbB bbC ah1
    @bbE bbF bbG ah2 ah3 bbH bbI ah4 bbJ bbK bbL ah5 bbM bbN bbQ ah22
    @bbR bbS ah23 bbT ! new sec. structure labels 
    .
    1 %10 %11 %12
    .
    vp3.aln vp3_contact.log   ! same alignment and X-PLOR output for contacts
    vp31001.art_dssp          ! DSSP file for second monomer
    vp3_art.ps
    .7 R 
    6 129 8 -1 0 C L          ! vertical shift 
    X G 1       ! domains
    X D 1-1
    X B 298-587
    X R 699-856
    A S all  ! no letters above sec. elements
    R A all  ! intermolecular contacts for VP3A
    R B all  ! intermolecular contacts for VP3B
    @noname     ! no names for sec. structure
    %Secondary structures for vp3A and vp3B, article definition. 16 March 98.
    .
    1 %10 %11 %12 
    .

    5. vp7_exp.inp (resulting PostScript, Gif)

    vp7.aln   +      ! CLUSTAL alignment on orbivirus
    vp7_btv10.dssp   ! btv10 secondary elements 
    vp7_exp.ps   M           
    .                       
    7 60 6  0 F  N  
    X B 1            ! btv10 sec. elements in blue 
    @skip            ! hide all sequences
    @h               ! first 310 helix is not labeled
    %Alignment on VP7 with two secondary structure elements. June 1999.   
    .
    2 all            ! btv10 in first then other sequences
    .
    vp7.aln          ! same aligned sequences 
    vp7_btv1.dssp    ! btv1 secondary elements 
    vp7_exp.ps   M          
    .8      E        ! homology criteria is %Equivalence
    7 60 6 -1 F  N   ! vertical shift(-1) flashy colours(F) all sequences numbered(N)
    X R 2            ! btv1 sec. elements in red
    A S all          ! remove sec. structure labels
    T B 1            ! btv10 title in blue  
    T R 2            ! btv1  title in red
    Q P 1-3 169-171  ! highlight RGD segment in pink
    Q P 5-7 169-171
    Q P 8 180-182
    .
    2 all            ! btv10 is the first displayed sequence
    .


    References

    1. Wootton, J.C. and Federhen, S. (1996)
    Analysis of compositionally biased regions in sequence databases. Meth. in Enzymol. 266 554-571
    2. Bairoch, A., Bucher, P. and Hofmann, K. (1997)
    The PROSITE database, its status in 1997. Nucl. Acids Res. 25 217-221
    3. Corpet, F. (1988)
    Multiple sequence alignment with hierarchical clustering. Nucl. Acids Res. 16 10881-10890
    4. Corpet, F., Gouzy J. and Kahn D. (1999)
    Recent improvements of the ProDom database of protein domain families. Nucl. Acids Res. 27 263-267
    5. Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994)
    CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22 4673-4680
    6. Combet, C., Blanchet, C., Geourjon, C. and Deleage, G. (2000)
    NPS@: Network Protein Sequence Analysis. TIBS 25 147-150
    7. Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wisc.
    8. Pearson, W.R. and Lipman, D.J. (1988)
    Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. 85 2444-2448
    9. Galtier, N., Gouy, M. and Gauthier, C. (1996)
    SeaView and Phylo_win, two graphic tools for sequence alignment and molecular phylogeny. Comput. Applic. Biosci. 12 543-548
    10. Sander, C. and Schneider, R. (1991)
    Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins. 9 56-68
    11. Jones, D.T., Taylor, W.R. and Thornton, J.M. (1992)
    A new approach to protein fold recognition. Nature. 358 86-89
    12. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N. and Bourne, P.E. (2000)
    The Protein Data Bank. Nucleic Acids Res. 28 235-242
    13. Esnouf, R.M. (1997)
    An extensively modified version of MolScript that includes greatly enhanced coloring capabilities. J. Mol. Graphics. 15 132-134
    14. Brünger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros P., Grosse-Kunstleve, R.W., Jiang J.S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J., Rice L.M., Simonson, T. and Warren, G.L. (1998)
    Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D 54 905-921
    15. Kabsch, W. and Sander, C. (1983)
    Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 22 2577-2637
    16. Frishman, D. and Argos, P. (1995)
    Knowledge-based secondary structure assignment. Proteins. 23 566-579
    17. Rost, B. (1996)
    PHD: predicting one-dimensional protein structure by profile based neural networks. Meth. in Enzymol. 266 525-539
    18. Rost, B. and Sander, C. (1994)
    Conservation and prediction of solvent accessibility in protein families. Proteins. 20 216-226
    19. Kyte, J. and Doolittle, R. (1982)
    A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157 105-132
    20. Risler, J.L., Delorme, M.O., Delacroix, H. and Henaut, A. (1988)
    Amino acid substitutions in structurally related proteins. A pattern recognition approach. Determination of a new and efficient scoring matrix. J. Mol. Biol. 204 1019-1029
    21. Dayhoff, M. (1978)
    "Atlas of protein sequences and structure" National Biomedical Research Foundation. Washington, D.C. p. 345
    22. Henikoff, J.G. and Henikoff, S. (1996)
    Blocks database and applications. Meth. in Enzymol. 266 88-105
    23. Grimes, J.M., Burroughs, J.N., Gouet, P., Diprose, J.M., Malby, R., Zientara, S., Mertens, P.P.C. and Stuart, D.I. (1998)
    The atomic structure of the bluetongue virus core. Nature. 395 470-478
    24. Gouet, P., Diprose, J.M., Grimes, J.M., Malby, R., Burroughs, J.N., Zientara, S., Stuart, D.I. and Mertens, P.P.C. (1999)
    The highly ordered double-stranded RNA genome of bluetongue virus revealed by crystallography. Cell. 97 481-490
