IEDB Analysis Resource

MHC-II binding predictions - Tutorial

How to obtain predictions

This website provides access to predictions of peptide binding to MHC class II molecules. The screenshot below illustrates the steps necessary to make a prediction. Each of the steps is described in more detail below.

1. Specify sequences

First, specify the sequences you want to scan for binding peptides. The sequences can either be entered directly into the text area labeled "Enter protein sequence(s)" or be taken from a file uploaded using the button labeled "Browse".

The sequences can be supplied in three different formats:

The format of the sequences can be specified explicitly using the list box labeled "Choose sequence format". If that list box is set to "auto detect format", the input will be interpreted as FASTA if an opening ">" character is found, or as a continuous sequence otherwise.
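The auto-detection rule described above can be sketched in a few lines of Python. This is only an illustration, not the website's actual code: the function name parse_sequences is hypothetical, "an opening '>' character" is assumed to mean that the input starts with ">", and unnamed sequences are assumed to be labeled "sequence 1", "sequence 2", and so on.

```python
def parse_sequences(text):
    """Return a list of (name, sequence) pairs from raw user input."""
    text = text.strip()
    if not text.startswith(">"):
        # No opening ">": treat the whole input as one continuous sequence.
        return [("sequence 1", "".join(text.split()))]
    records = []
    for block in text.split(">")[1:]:
        # First line of each block is the FASTA header, the rest is sequence.
        lines = block.splitlines() or [""]
        name = lines[0].strip() or f"sequence {len(records) + 1}"
        sequence = "".join(line.strip() for line in lines[1:])
        records.append((name, sequence))
    return records
```

For example, parse_sequences(">p1\nACD\nEFG") yields one record named "p1" with the sequence "ACDEFG", while input without a ">" is returned as a single unnamed sequence.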

All sequences must be amino acid sequences specified in single-letter code (ACDEFGHIKLMNPQRSTVWY).
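Before submitting, it can be useful to check a sequence against the allowed alphabet locally. The following is a minimal sketch, with a hypothetical helper name (invalid_residues). It assumes a strict, case-sensitive comparison against the 20 letters listed above; how the website treats lowercase letters or other symbols is not specified here.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def invalid_residues(sequence):
    """Return the sorted list of characters that are not in the allowed alphabet."""
    return sorted(set(sequence) - VALID_RESIDUES)
```

An empty list means every character in the sequence is one of the 20 allowed single-letter codes.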

2. Choose a prediction method

The prediction method list box allows choosing between six currently implemented MHC class II binding prediction methods: Consensus, Average Relative Binding (ARB), Combinatorial library (manuscript in preparation), NN-align (netMHCII-2.2), SMM-align (netMHCII-1.1), and Sturniolo.

We have conducted two large-scale evaluations of the performance of the MHC class II binding predictions: a 2008 study based on over 10,000 binding affinities and a 2010 study based on over 40,000 binding affinities. In general, the methods rank as follows: 1) Consensus, 2) NN-align, 3) SMM-align, 4) Combinatorial library, 5) Sturniolo, and 6) ARB.

Supplementary information for the 2008 study, including all datasets used to establish and evaluate the methods, is available here.

Supplementary information for the 2010 study, including all datasets and scores used to establish and evaluate the methods, is available via the download tab.

By default, the overall best method (Consensus) is selected. However, not all methods can currently make predictions for all alleles, so only the alleles available for the selected method are displayed.

3. Specify what to make predictions for

Predictions are limited to alleles that are currently covered by the selected prediction method. Selecting a prediction method generates a list of the alleles available for it. You can then choose a specific allele to make predictions for.

4. Specify the output

The menus in this section change how the prediction output is displayed. Using the "Sort peptides by" list box, the results can be presorted by the order of the peptides in their source sequence (default) or by their predicted affinity.

To reuse the prediction results in an external program, it is possible to retrieve the predictions in a plain text format. To do this, choose "Text file" in the output format list box.

5. Submit the prediction

This one is easy. Click the submit button, and a result screen similar to the one below should appear.

Interpreting prediction output

Below is a screenshot of a prediction output page, with three relevant sections marked that are described in more detail below.

1. Input Sequences

This table displays the sequences and their names extracted from the user input. If no names were assigned by the user (which is only possible in FASTA format), the sequences are numbered in their input order (sequence 1, sequence 2, ...).

2. Prediction output table

Each row in this table corresponds to one peptide binding prediction. The columns contain the allele the prediction was made for, the position of the peptide in the input sequences (in the format [Sequence #]: [Start Position] - [End Position]), the core sequence, and the predicted score and percentile rank for each of ARB, combinatorial library, SMM_align and Sturniolo. The last column is the percentile rank for the consensus method. The table can be sorted by clicking on the column headers.
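When working with the results outside the website, the position column can be split into its three numbers. The sketch below assumes the position is written exactly as "[Sequence #]: [Start Position] - [End Position]" with optional whitespace, for example "1: 5 - 19"; the function name parse_position and the example values are made up for illustration.

```python
import re

# Matches e.g. "1: 5 - 19" -> sequence 1, start 5, end 19 (assumed layout).
_POSITION = re.compile(r"^\s*(\d+)\s*:\s*(\d+)\s*-\s*(\d+)\s*$")

def parse_position(text):
    """Return (sequence_number, start, end) as integers."""
    match = _POSITION.match(text)
    if match is None:
        raise ValueError(f"unrecognized position: {text!r}")
    sequence_number, start, end = (int(group) for group in match.groups())
    return sequence_number, start, end
```

Using parse_position as a sort key, as in sorted(positions, key=parse_position), orders peptides by their location in the source sequences rather than alphabetically.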

3. Interpreting predicted results

The predicted output is given in units of IC50 (nM) for ARB, combinatorial library and SMM_align, so a lower number indicates higher affinity. As a rough guideline, peptides with IC50 values <50 nM are considered high affinity, <500 nM intermediate affinity and <5000 nM low affinity. Most known epitopes have high or intermediate affinity. Some epitopes have low affinity, but no known T-cell epitope has an IC50 value greater than 5000 nM.

The prediction result for Sturniolo is given as a raw score. A higher score indicates higher affinity.
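The rough guideline above translates directly into a small classification helper. This is a sketch only: the function name affinity_class is hypothetical, and the label used for values of 5000 nM or more ("above guideline range") is an assumption, since the guideline only names the three classes below 5000 nM.

```python
def affinity_class(ic50_nm):
    """Map a predicted IC50 value (in nM) to the rough affinity classes."""
    if ic50_nm < 50:
        return "high"
    if ic50_nm < 500:
        return "intermediate"
    if ic50_nm < 5000:
        return "low"
    # Not a class named in the guideline; label chosen for this sketch.
    return "above guideline range"
```

This applies only to methods that report IC50 values (ARB, combinatorial library and SMM_align), not to Sturniolo raw scores.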

For each peptide, a percentile rank for each of the four methods (ARB, combinatorial library, SMM_align and Sturniolo) is generated by comparing the peptide's score against the scores of five million random 15-mers selected from the SWISSPROT database. A small percentile rank indicates high affinity. The median of the four methods' percentile ranks is then used as the rank for the consensus method.
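The percentile rank and consensus calculation can be illustrated with a short Python sketch. It is not the website's implementation: the function names are hypothetical, the background list stands in for the scores of the random 15-mers, and the exact rank definition (here, the percentage of background peptides scoring at least as well as the query, ties included) is an assumption.

```python
from bisect import bisect_left, bisect_right
from statistics import median

def percentile_rank(score, sorted_background, lower_is_better=True):
    """Percentage of background scores that are at least as good as `score`.

    `sorted_background` must be sorted in ascending order.
    Use lower_is_better=True for IC50-based methods and False for
    raw-score methods such as Sturniolo, where higher means stronger binding.
    """
    total = len(sorted_background)
    if lower_is_better:
        as_good = bisect_right(sorted_background, score)  # background <= score
    else:
        as_good = total - bisect_left(sorted_background, score)  # background >= score
    return 100.0 * as_good / total

def consensus_rank(method_ranks):
    """Median of the per-method percentile ranks."""
    return median(method_ranks)
```

With an even number of methods, statistics.median averages the two middle ranks, so per-method ranks of 1.0, 3.0, 5.0 and 20.0 give a consensus rank of 4.0.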