This documentation is deprecated. Please use the new documentation site instead.

The epitope scan API runs the Rosetta MHC II epitope prediction algorithm on an input protein structure or sequence. The API uses a machine learning model that predicts epitopes from the protein sequence alone. Structure input is accepted as a convenience; the results are identical whether a sequence or a structure is submitted.
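Because only the sequence matters, a structure can be reduced to its sequence before submission. The following is a minimal sketch of that reduction, not the API's own extraction code: `sequence_from_pdb_text` is a hypothetical helper that assumes a single-model, protein-only PDB file containing only the 20 canonical amino acids, and it concatenates chains in file order.

```python
# Three-letter to one-letter codes for the 20 canonical amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb_text(pdb_text):
    """Return the one-letter sequence read from the CA atoms of a PDB string.

    Hypothetical helper (not part of the API): assumes a single-model,
    protein-only file; raises KeyError on a non-canonical residue name.
    """
    residues = []
    seen = set()
    for line in pdb_text.splitlines():
        # Fixed-column PDB format: atom name in columns 13-16, residue name
        # in 18-20, chain ID in 22, residue number plus insertion code in 23-27.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        key = (line[21], line[22:27])
        if key in seen:  # skip alternate locations of the same residue
            continue
        seen.add(key)
        residues.append(THREE_TO_ONE[line[17:20]])
    return "".join(residues)
```

Reading one CA atom per residue keeps the sketch short; residues with missing CA atoms would be silently skipped, so the result is worth checking against the expected sequence.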

The epitope scan predicts the immunogenicity of the protein with respect to the following alleles:

H-2-IAb

HLA-DPA10103-DPB10201

HLA-DPA101-DPB10401

HLA-DPA10201-DPB10101

HLA-DPA10201-DPB10501

HLA-DPA10301-DPB10402

HLA-DPB10301-DPB10401

HLA-DQA10101-DQB10501

HLA-DQA10102-DQB10602

HLA-DQA10301-DQB10302

HLA-DQA10401-DQB10402

HLA-DQA10501-DQB10201

HLA-DQA10501-DQB10301

HLA-DRB10101

HLA-DRB10301

HLA-DRB10401

HLA-DRB10404

HLA-DRB10405

HLA-DRB10701

HLA-DRB10802

HLA-DRB10901

HLA-DRB11101

HLA-DRB11302

HLA-DRB11501

HLA-DRB30101

HLA-DRB40101

HLA-DRB50101

Inputs

You must specify either a PDB file or a sequence, but not both.

  • Input PDB file – a PDB file
    • CLI argument: --pdb-file input.pdb
    • Python submit() argument: pdb-file="input.pdb"
    • Do not include nonprotein residues.
    • Do not include multimodel (NMR-sourced) PDBs.
  • Sequence – a protein sequence
    • CLI argument: --sequence NLYIQWLKDGGPSSGRPPPS
    • Python submit() argument: sequence="NLYIQWLKDGGPSSGRPPPS"
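The either-or rule above can be checked before a job is submitted. The sketch below is illustrative only: `build_epitope_scan_args` is a hypothetical helper, not part of the API client, and it assumes the CLI flags are spelled `--pdb-file` and `--sequence` with a double hyphen. The restriction to the 20 canonical one-letter codes is also an assumption about what the service accepts.

```python
def build_epitope_scan_args(pdb_file=None, sequence=None):
    """Return CLI-style arguments for exactly one of the two inputs.

    Hypothetical helper: the flag spellings are assumed from the
    documentation, and the residue check is an assumption as well.
    """
    # Exactly one of the two inputs must be given.
    if (pdb_file is None) == (sequence is None):
        raise ValueError("Specify either a PDB file or a sequence, but not both.")
    if pdb_file is not None:
        return ["--pdb-file", pdb_file]
    # Reject characters outside the 20 canonical amino acid codes.
    invalid = sorted(set(sequence.upper()) - set("ACDEFGHIKLMNPQRSTVWY"))
    if invalid:
        raise ValueError(f"Unexpected residue codes: {''.join(invalid)}")
    return ["--sequence", sequence.upper()]
```

For example, `build_epitope_scan_args(sequence="NLYIQWLKDGGPSSGRPPPS")` returns the two-element argument list for a sequence run, while passing both inputs or neither raises `ValueError`.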

Outputs

Output file descriptions

  • The API returns a CSV file with the following fields:

    • begin_seqpos – The start of the sequence window involved in the prediction

    • epitope_seq – The sequence of the epitope involved in the prediction

    • allele – The MHC allele for which binding affinity is predicted

    • IC50_nM – The predicted IC50, in nanomolar (nM)

    • rank_percentage – The percentile rank n of the epitope: it is in the top n% of binders measured against a random background

    • score – The score from the prediction model; lower is better

    • genome_sequence – Whether the epitope occurs in the human reference genome

    • known – Whether the sequence exists in the IEDB as a known T-cell-activating epitope
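The CSV output can be filtered with the Python standard library. The sketch below assumes the header row uses exactly the field names listed above and that `rank_percentage` parses as a number. `strong_binders` is a hypothetical helper, the 10% cutoff is an arbitrary illustration rather than a recommended threshold, and the rows in `example_csv` are made up; they are not real predictions.

```python
import csv
import io


def strong_binders(csv_text, max_rank_percentage=10.0):
    """Return rows whose rank_percentage is at or below the cutoff, best first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    hits = [r for r in rows if float(r["rank_percentage"]) <= max_rank_percentage]
    return sorted(hits, key=lambda r: float(r["rank_percentage"]))


# Made-up rows for illustration only; these are not real predictions.
example_csv = """begin_seqpos,epitope_seq,allele,IC50_nM,rank_percentage,score,genome_sequence,known
1,NLYIQWLKDGGPSSG,HLA-DRB10101,850.0,12.0,0.61,False,False
2,LYIQWLKDGGPSSGR,HLA-DRB10101,120.0,3.5,0.32,False,False
"""

for row in strong_binders(example_csv):
    print(row["begin_seqpos"], row["epitope_seq"], row["allele"], row["rank_percentage"])
```

`csv.DictReader` returns every value as a string, so numeric fields are converted explicitly before they are compared.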