APPLICATIONS OF TECHNOLOGY:
- Design of novel proteins with a specified set of properties or functions
- Modification of properties of existing proteins
- Inference of the function of protein sequences
BACKGROUND:
- Protein design is difficult because not every amino acid sequence encodes a functional protein: a sequence must follow an underlying syntax in order to fold and function. This syntax is not understood well enough to design proteins explicitly without structural knowledge or an existing protein as a starting point.
TECHNOLOGY OVERVIEW:
Researchers at Berkeley Lab have developed a model that is trained on the entire known protein sequence space to generate syntactically correct proteins that are likely to fold and function.
The model learns the rules implicit in the structure of natural proteins. Applied to relevant downstream design, classification, and regression tasks, it can reconstruct sequences that are likely to retain their function.
Because far more protein sequence data is available than crystal structures, the method may improve accuracy in creating de novo proteins or modifying existing proteins. More accurate design reduces cost because fewer sequences need to be tested experimentally to find proteins that work for a particular application. Additionally, this method may help determine the function of uncharacterized proteins, making it easier to discover new metabolic pathways.
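As a loose illustration of learning sequence "syntax" from sequence data alone, the sketch below fits a first-order Markov (bigram) model over the 20 standard amino acids, then uses it to score and sample sequences. This is not the Berkeley Lab model, whose architecture is not described here; the bigram model is a deliberately simple stand-in, and the function names and toy training sequences are hypothetical.

```python
import math
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
START, END = "^", "$"                 # hypothetical boundary tokens


def train_bigram(sequences, pseudocount=1.0):
    """Estimate P(next residue | previous residue) with additive smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1.0
    vocab = list(AMINO_ACIDS) + [END]
    model = {}
    for prev in [START] + list(AMINO_ACIDS):
        total = sum(counts[prev].values()) + pseudocount * len(vocab)
        model[prev] = {t: (counts[prev][t] + pseudocount) / total for t in vocab}
    return model


def log_likelihood(model, seq):
    """Log-probability of a sequence (standard residues only) under the model."""
    tokens = [START] + list(seq) + [END]
    return sum(math.log(model[p][n]) for p, n in zip(tokens, tokens[1:]))


def sample(model, rng, max_len=50):
    """Draw residues one at a time until END is drawn or max_len is reached."""
    out, prev = [], START
    while len(out) < max_len:
        tokens, probs = zip(*model[prev].items())
        nxt = rng.choices(tokens, weights=probs, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Made-up toy sequences for illustration only; not real proteins.
toy_training = ["MKVLA", "MKVLG", "MKILA"]
toy_model = train_bigram(toy_training)
print(log_likelihood(toy_model, "MKVLA"), log_likelihood(toy_model, "ALVKM"))
print(sample(toy_model, random.Random(0)))
```

A sequence that follows the patterns seen in training scores higher than a scrambled one, which is the same idea, in miniature, as ranking candidate designs so that fewer need experimental testing.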
DEVELOPMENT STAGE: Early stage
PRINCIPAL INVESTIGATORS:
STATUS: Patent pending
OPPORTUNITIES: Available for licensing or collaborative research.
TECHNOLOGY CATEGORIES/SUBCATEGORIES:
- Bio-based products
- Synthetic Biology Tools and Software