The epitope scan API runs the Rosetta MHC II epitope prediction algorithm on an input protein structure or sequence. It uses a machine learning model that predicts epitopes from the protein sequence alone. Structure input is accepted as a convenience; the API produces identical results whether a sequence or a structure is supplied.
The epitope scan predicts the immunogenicity of the protein with respect to the following alleles:
H-2-IAb
HLA-DPA10103-DPB10201
HLA-DPA101-DPB10401
HLA-DPA10201-DPB10101
HLA-DPA10201-DPB10501
HLA-DPA10301-DPB10402
HLA-DPB10301-DPB10401
HLA-DQA10101-DQB10501
HLA-DQA10102-DQB10602
HLA-DQA10301-DQB10302
HLA-DQA10401-DQB10402
HLA-DQA10501-DQB10201
HLA-DQA10501-DQB10301
HLA-DRB10101
HLA-DRB10301
HLA-DRB10401
HLA-DRB10404
HLA-DRB10405
HLA-DRB10701
HLA-DRB10802
HLA-DRB10901
HLA-DRB11101
HLA-DRB11302
HLA-DRB11501
HLA-DRB30101
HLA-DRB40101
HLA-DRB50101
Inputs
You must specify either a PDB file or a sequence, but not both.
- Input PDB file – a PDB file
  - CLI argument: --pdb-file input.pdb
  - Python submit() argument: pdb-file="input.pdb"
  - Do not include nonprotein residues.
  - Do not include multimodel (NMR-sourced) PDBs.
- Sequence – a protein sequence
  - CLI argument: --sequence NLYIQWLKDGGPSSGRPPPS
  - Python submit() argument: sequence="NLYIQWLKDGGPSSGRPPPS"
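The input rules above can be checked locally before submitting a job. The following is a minimal sketch and not part of the API: `check_pdb_text` and `choose_input` are hypothetical helper names, and the PDB check relies only on the standard `MODEL` and `HETATM` record types of the PDB format. It assumes that every `HETATM` record counts as nonprotein, so it will also flag modified amino acids stored as `HETATM`.

```python
from typing import Optional


def check_pdb_text(pdb_text: str) -> list[str]:
    """Return a list of problems found in PDB-format text (empty if none)."""
    problems = []
    model_count = 0
    hetatm_count = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # the PDB record name occupies columns 1-6
        if record == "MODEL":
            model_count += 1
        elif record == "HETATM":
            hetatm_count += 1
    if model_count > 1:
        problems.append(f"{model_count} MODEL records found; multimodel PDBs are not accepted")
    if hetatm_count > 0:
        problems.append(f"{hetatm_count} HETATM records found; remove nonprotein residues")
    return problems


def choose_input(pdb_file: Optional[str] = None,
                 sequence: Optional[str] = None) -> tuple[str, str]:
    """Enforce 'a PDB file or a sequence, but not both'; return (kind, value)."""
    # Exactly one of the two must be given: reject both-missing and both-present.
    if (pdb_file is None) == (sequence is None):
        raise ValueError("Specify exactly one of pdb_file or sequence")
    if pdb_file is not None:
        return ("pdb", pdb_file)
    return ("sequence", sequence)
```

An empty list from `check_pdb_text` means neither of the two restrictions was detected; it does not validate the file in any other way.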
Outputs
Output file descriptions
The API returns a CSV file with the following fields:
- begin_seqpos – The start position of the sequence window used in the prediction
- epitope_seq – The sequence of the epitope used in the prediction
- allele – The MHC allele for which binding affinity is predicted
- IC50_nM – The predicted IC50, in nanomolar (nM)
- rank_percentage – The percentile rank of the epitope: it is in the top n% of binders measured against a random background
- score – The score from the prediction model; lower is better
- genome_sequence – Whether the epitope occurs in the human reference genome
- known – Whether the sequence exists in the IEDB as a known T-cell activating epitope
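The fields above can be read with Python's standard `csv` module. This is a minimal sketch under stated assumptions: the file is comma-delimited with a header row that uses the field names exactly as listed, and `rank_percentage` parses as a number. The helper name `top_ranked` and the default cutoff of 10.0 are illustrative choices, not values recommended by the API.

```python
import csv
from typing import TextIO


def top_ranked(csv_file: TextIO, max_rank_percentage: float = 10.0) -> list[dict]:
    """Return rows with rank_percentage at or below the cutoff, best rank first."""
    reader = csv.DictReader(csv_file)  # uses the header row as dict keys
    hits = []
    for row in reader:
        # Assumption: rank_percentage is a plain numeric string such as "2.5".
        if float(row["rank_percentage"]) <= max_rank_percentage:
            hits.append(row)
    hits.sort(key=lambda r: float(r["rank_percentage"]))
    return hits
```

To use it, open the downloaded results file with `open(path, newline="")` and pass the handle to `top_ranked`. Each returned row is a dictionary keyed by field name, so `row["allele"]` and `row["epitope_seq"]` identify the predicted binder.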