The disulfide single-chain homology modeling API creates homology models of single-chain proteins using the Rosetta hybridize homology modeling method. By default, the hybridize method automatically identifies the templates used to construct the model. However, the user can specify a custom template instead. If the input sequence has cysteine residues capable of forming disulfide bonds, and those residues form disulfide bonds in the selected templates, Rosetta will attempt to maintain the disulfides in the homology model. Additionally, the user can explicitly mark pairs of cysteine residues as disulfides.
The single-chain homology modeling protocol works best on proteins larger than ~20 residues and smaller than 500 residues. The API is limited to proteins of fewer than 1000 residues. Other modeling approaches are better suited to short peptides and very large structures.
This tool is a modification of the homology modeling (HM) tool in Cyrus Bench, with added hooks for the disulfide list. It should otherwise perform very similarly to that tool.
- Input sequence – a protein sequence
  - CLI argument: --sequence SECVECGGFCPDPEKMGDWCCGRCIRNECRCG
  - Python submit() argument: sequence="SECVECGGFCPDPEKMGDWCCGRCIRNECRCG"
  - Only canonical amino acids are supported
- Disulfide List (optional) – A list of residue number pairs which should form disulfide bonds
  - CLI argument: --disulfide-list "1:10,3:5"
  - Python submit() argument: disulfide_list=[(1,10), (3,5)]
  - Residues are numbered against the input sequence (the first residue is 1).
  - Each residue should be referenced in only one disulfide pair.
  - Each residue should be within range for the submitted sequence and must already be a cysteine in that sequence.
  - Residues should be located such that they can reasonably participate in a disulfide bond without major movement of the protein backbone.
  - Rosetta will attempt to satisfy these disulfides but is not guaranteed to form them.
  - Other disulfides may be formed depending on the protein sequence and template selection.
  - Residues in the disulfide list should not already be in competing disulfides. For example, if the templates have a disulfide from 10 to 30, do not submit 10-15 as a disulfide pair.
  - You may explicitly list disulfides that you expect to form and that are already in your templates, to reinforce the desired pattern. This will increase the occurrence of those disulfides in your results if Rosetta finds the disulfides marginal.
- Custom template (optional) – A PDB file to be used as a custom template. The input sequence will be threaded along this template.
  - CLI argument: --template template.pdb
  - Python submit() argument: template="template.pdb"
  - The template structure should have high sequence homology to the input sequence.
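
The disulfide list rules above (1-based numbering, each residue used in only one pair, every residue in range and already a cysteine) can be checked locally before submitting a job. The following is a minimal sketch, not part of the Cyrus API: the function names validate_disulfide_list and to_cli_disulfide_list are hypothetical helpers introduced here for illustration, and the pairs used are chosen only because they point at cysteines in the example sequence, not because they are known disulfides.

```python
def validate_disulfide_list(sequence, disulfide_list):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical pre-submission helper (not part of the Cyrus API).
    Residue numbers are 1-based, matching the input sequence numbering.
    """
    problems = []
    seen = set()
    for a, b in disulfide_list:
        for pos in (a, b):
            if not 1 <= pos <= len(sequence):
                problems.append(f"residue {pos} is outside 1..{len(sequence)}")
            elif sequence[pos - 1] != "C":
                problems.append(
                    f"residue {pos} is {sequence[pos - 1]}, not cysteine"
                )
            if pos in seen:
                problems.append(f"residue {pos} appears in more than one pair")
            seen.add(pos)
    return problems


def to_cli_disulfide_list(disulfide_list):
    """Format pairs in the "1:10,3:5" style shown for the CLI argument."""
    return ",".join(f"{a}:{b}" for a, b in disulfide_list)


if __name__ == "__main__":
    sequence = "SECVECGGFCPDPEKMGDWCCGRCIRNECRCG"
    # Illustrative pairs only: both positions in each pair are cysteines.
    pairs = [(3, 20), (6, 24)]
    for problem in validate_disulfide_list(sequence, pairs):
        print("Problem:", problem)
    print(to_cli_disulfide_list(pairs))
```

These checks cover only the rules that can be verified from the sequence alone. Whether two cysteines are close enough to bond, or compete with a disulfide in the templates, still has to be judged from the structures. Note also that the pairs in the argument examples above illustrate the format only: in the example sequence, residues 1 and 5 are not cysteines.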
Output file description
- Models folder – 5 PDB files representing the centers of the top-scoring clusters of models generated during the homology modeling process.
- score.sc – The Rosetta scores associated with the 5 cluster centers.
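
To compare the cluster centers by score, the score file can be read into a small table. The sketch below assumes score.sc follows the usual Rosetta score file layout: lines prefixed with "SCORE:", the first of which names the columns, typically including total_score and description. That layout is an assumption about this API's output, so check it against an actual file. The function name parse_score_file and the sample values are made up for illustration.

```python
def parse_score_file(text):
    """Parse Rosetta-style score file text into a list of dicts.

    Assumes the common layout: every relevant line starts with "SCORE:",
    the first such line holds column names, and later ones hold values.
    """
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skip other lines, such as a leading SEQUENCE: line
        fields = line.split()[1:]
        if header is None:
            header = fields
            continue
        row = {}
        for name, value in zip(header, fields):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value  # non-numeric column, e.g. the model name
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    sample = (
        "SEQUENCE:\n"
        "SCORE: total_score description\n"
        "SCORE: -100.5 model_1\n"
        "SCORE: -120.0 model_2\n"
    )
    # Lower Rosetta energy is better, so sort ascending by total_score.
    for row in sorted(parse_score_file(sample), key=lambda r: r["total_score"]):
        print(row["description"], row["total_score"])
```

For a real run, the text would come from reading score.sc, for example with open("score.sc").read().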
Output file interpretation
Cyrus’s HM tool returns 5 cluster centers after running a large number of HM trajectories. This clustering is balanced to return 5 models that have good energy within their structural cluster and represent different clusters.
If all 5 models are similar even after clustering, it means that HM was highly converged and/or that the template match was very high. This is a good sign: it means Rosetta has good confidence in this prediction.
If there are 5 distinct predictions, it may mean that the default sampling is insufficient, or that this particular problem is harder than this API is able to accommodate – please let Cyrus know and we can discuss other options for this type of modeling problem.