B Cells can have surface antibodies that can bind proteins. This can elicit an immune response if other factors are present to activate the B Cell. One way to inhibit or increase antibody binding to a protein is to alter its surface residues to change the propensity of antibodies to bind. Here, we present a workflow for designing B Cell antibody binding into or out of your protein. It is based on this paper:
The paper describes a method for predicting the propensity of an antibody to bind to your protein. It uses a protein structure rather than sequence. This has been shown to be more predictive than sequence-based prediction of antigen binding. It’s called DiscoTope because it can predict epitopes that are discontinuous regions since it is based on structure, not sequence. Plus, >90% of B Cell epitopes are predicted to be discontinuous.
Antibodies are known to prefer binding regions that are on the surface, are hydrophilic, flexible, and are often β-turns. This work expands that knowledge by using XRay structures of antibody-antigen pairs to analyze the interacting residues. They have developed a scoring method that uses a log odds ratio for each amino acid, reflecting its propensity for being in an epitope compared to non-epitope regions. Epitopes are more likely to have N, R, P, K and not C, A, L, V, F. The log odds score is the average of log odds over 9 sequential residues.
DiscoTope Score = (log odds score summed for structurally proximal residues) + (score term for surface localization)
We are using two examples to show how to design your protein in order to increase or decrease binding by an antibody.
EXAMPLE ONE: DE-IMMUNIZATION OF AMA1
First, we will decrease antibody binding to the Malaria antigen (AMA1). AMA1 is known to bind an inhibitory monoclonal antibody. Below, we are showing the crystal structure, 2Q8A, which has AMA1 (blue) bound to the heavy (dark green) and light chain (light green) Fab of 1F9. The interacting residues colored in purple for the Fab and yellow for AMA1. Below-left shows the complex. Below-right zooms in on the interacting residues.
This shows that there are known antibody binding sites. However, we are using the crystal structure, 1Z40 of the unbound AMA1 in order to predict binding site.
1) Identify “sticky” regions – regions that B cell antibody are predicted to bind
Go to http://www.cbs.dtu.dk/services/DiscoTope/ and run the free online tool in order to get predictions of where an antibody is likely to bind. They also have a downloadable version. We may soon have a version that you can use in CAD if interest is found among users. Contact email@example.com if you are interested.
A DiscoTope run should take under a minute. Download the results files which is shown below. The DiscoTope score is weighted by the # of amino acids in its proximity, which helps select residues at the surface.
In this example above, DiscoTope Identified 76 B-Cell epitope residues (yellow) out of 311 total residues (blue). These are all potential spots for design. This seems like a daunting task since 76 mutation positions could drastically alter protein structure. But not all sites need to be mutated in order to drastically reduce the probability of antibody binding (if that is your goal).
The regions that are considered “sticky” will be indicated in the results file by <=B. Keep in mind that the numbering relates to the numbering the structure of the file it is analyzing CAD numbering may be different. To make the numbering match CAD, upload your structure into CAD, then download it from CAD. The new structure will have CAD numbering. So use this pdb to load into DiscoTope for analysis. And the results file will be much easier to correlate to the structure in CAD.
2.) Select small regions for design
For most cases, there will be a lot of positions around the protein that need design. These should first be split into multiple regions for design. Then, run multiple, separate design runs with a lot of repeats within each design set. Once individual sets are run, use the results are used to determine mutations that are tolerated at each position. This data will be used for a final design run using these tolerated mutation list. The philosophy behind this is that subsets will eliminate unfavorable mutations, then all interacting regions can be designed using these most probable mutations. The final design will be able to focus it’s sampling on fewer mutations so it can better determine which are best. Running design without doing pre-Design elimination will spend a lot of time eliminating bad mutations so does not have time to sample the ideal combination of good mutations.
We suggest each design subset include up to 6 interacting residues or up to 6 residues that cluster into the same region. Below, these pre-Design sites were split into 9 design sets, which are shown in 9 different colors. Non-design positions are in blue. Since interacting residues are best designed in the same set, large regions must be divided with that in mind. Below-right, we zoom in on a region where multiple sets are found. You can see in the case of the yellow set, two interacting loops are designed together. The peach design set includes two interacting beta sheets.
Create Selectors for all of these individual sets. For many situations, it is challenging to split the job into well-defined subsets. So you can include more subsets that have overlapping regions with other subsets. That expands sampling which can prevent incorrect elimination of tolerated mutations. In the example below, there are 3 sets selected for design on the left. On the right, the same positions are split into 5 sets. All 8 sets can be run to see if different mutations are tolerated with different regions included. This allows different co-mutating positions to be sampled.
3) Run Pre-Design jobs for all subsets
For each design set, run Design, Flex Design, or Relax Design. Determine which residues you will allow mutable positions to sample depending on the known propensity of antibody binding. See the Log Odds Ratio of antibody binding below. It is best to avoid mutating native cysteines or prolines. Cysteines are often involved in disulfide bonds. So, removing them could significantly alter the structure. Prolines form a unique, rigid backbone orientation so removing them can also greatly alter the structure. Also, it is recommended to avoid creating new cysteines because they can form new disulfide. Creating prolines is particularly destructive in secondary structures. Though, Rosetta’s energy function will usually select against this kind of mutation.
In the AMA1 example, there were 76 positions predicted to bind an antibody. This is an exceptionally large number and represents an extreme case. Since this is an example of design and not a real life project, we are not taking into account that functional sites are usually excluded from design in order to maintain behavior. So, we will allow mutation at all 76 positions. We are making this an aggressive design by allowing all mutation except the 7 residues which have the highest Log Odds score. Since that allows sampling of 13 residues per position, we only allowed 4 or 5 residues per pre-Design run with up to 1,000 repeats. A conservative variation would select mutations with similar properties, but a better Odds score.
HOW TO ANALYZE PRE-DESIGN RESULTS
Each pre-Design run will sample all 13 residues, but will probably only favor a couple amino acids at each spot. Occasional locations will allow much more than that because the protein doesn’t depend on the side chain for structural stability. Look at the Sequence Logo for each pre-Design run in order to select the favorable mutations at each position. In the example left-below, there are 4 mutatable positions and the vast majority of the repeats for this design region selected one residue as favorable.
The wild type sequence was YKDN, but the last 3 are positions are preferred antibody binding residues so were not allowed for sampling. Rosetta found that (of the allowed residues), the most favorable sequence is YSTI. For other regions of the protein (like the one to the right), a few favored residues were chosen. So a few mutations can be selected for use when mutating the whole protein.
Once all pre-Design runs have been analyzed, you will have a list of favored mutations for all mutatable positions in the protein. This is usually a much more tractable number of mutations to sample in a single design run.
4) Make a final design run using pre-Design data
Pre-Design runs should have vastly limited the number of allowable mutations that you need to sample for the final design. So you only need to compile a list of all residues that will be mutated and the favorable mutations from the pre-Design runs. Use these for a Design with an appropriate number of repeats.
[Number of Repeats = # of Permutations / 50]
Calculating # of Permutations: If you have 3 mutation positions with 13 mutations allowed at each spot, the # of permutations = 13 x 13 x 13 = 2,197. This need 44 repeats for complete sampling.] Note that running more than the suggested number of repeats won’t make your results worse, but will require more compute cycles and may take longer to return.
While running design, the structure will not find its energetic minimum because sampling is more focused on design. So the final structure should be submitted to at least 100 Relax repeats before analysis. For design situations that include a large number of residues, it is often necessary to run multiple rounds of Relax to find the energetic minimum. This should also be done to the wild type in order to have an appropriate structure for comparison.
In the case of AMA1, design was aggressive in the sense that the 7 amino acids with higher propensity for binding antibodies were not allowed at design regions. However, this is considered a conservative design case because all positions adapted readily to at least 1 of the remaining allowable residues. The final energy of the mutant AMA1 was -545 which is even better than WT, which is -543 and had an RMSD of 0.16.
The mutated structure has no positions predicted to bind a B-cell antibody based on it’s DiscoTope score. So, the final structure appears to be stable, maintain the same global fold, and is not predicted to bind B-cell antibodies.
EXAMPLE TWO: INCREASING B-CELL ANTIBODY AFFINITY TO ADH1
Secondly, we are showing how to mutate ADH1 in order to increase the probability of B-cell antibodies will bind.
1) Identify “sticky” regions – regions that B cell antibody are predicted to bind
Go to http://www.cbs.dtu.dk/services/DiscoTope/ and run the predictions of where an antibody is likely to bind. This should take under a minute. Download the results file to highlight regions that are already predicted to bind an antibody. Given this information, you can select proximal regions to expand predicted binding. Below, we are showing the protein in green with regions predicted to bind an antibody in pink. This target probably has sufficient affinity, but for the sake of this example, we increase affinity by designing the region surrounding the smaller region (see arrow).
2.) Select small regions for design
We selected surface regions circling this binding region for design. Below, the region already identified is still shown in pink, and two additional regions that we are designing to be “stickier” to antibodies are shown in blue and yellow.
3.) Run Pre-Design jobs for all subsets
These regions are 9 and 11 residues long so are best split into 2 design runs each. For aggressive design, we allowed only the 7 residues with the highest propensity for antibody binding (see below). For 6 positions, a neutral residues is wild type so it was allowed during sampling. So, in total, 4 design runs were done in order to determine prefered residues that will be used for the final design run.
4.) Make a final design run using pre-Design data
Preferred mutations were used for a final set of design. This is followed by 1,000 Relaxes using the most energetically favorable design. The wild type energy was -893 and the mutant is -889. This increase in energy indicates a minor loss in stability. If this is shown to affect the protein in a meaningful way, stability can be recovered by optimizing core stability through design.
The final design had a significantly higher predicted area of antibody binding. The image below shows the original region predicted to bind an antibody in pink. The mutant is predicted to bind this region in addition the the areas in purple.
Note that this expands beyond the region of design because the DiscoTope scoring method is a 9 residue sliding window. So, mutations can affect the score in proximal areas.