Scalable protein design using optimization in a relaxed sequence space

October 24, 2024 

Christopher Frank, Ali Khoshouei, Lara Fuß, Dominik Schiwietz, Dominik Putz, Lara Weber, Zhixuan Zhao, Motoyuki Hattori, Shihao Feng, Yosta de Stigter, Sergey Ovchinnikov, and Hendrik Dietz

The paper “Scalable protein design using optimization in a relaxed sequence space,” produced in collaboration with a team of experts, was published in Science in October 2024.

Editor’s summary

Protein design methods have made rapid progress in recent years with the introduction of machine learning–based generative models. Frank et al. now debut an alternative method called relaxed sequence optimization that enables efficient and robust convergence toward optimal structures on the basis of iterative sequence evolution using gradient descent–based hallucination. The authors generated de novo protein designs ranging from 100 to 1000 amino acids that are structurally diverse and can be biased toward desired properties in a flexible manner. Thorough experimental characterization confirmed a high success rate and close correspondence of the design to protein structure. The design approach allowed for production of hetero- and homodimeric proteins and will be adaptable to other complex design tasks. —Michael A. Funk

Abstract

Machine learning (ML)–based design approaches have advanced the field of de novo protein design, with diffusion-based generative methods increasingly dominating protein design pipelines. Here, we report a “hallucination”-based protein design approach that functions in relaxed sequence space, enabling the efficient design of high-quality protein backbones over multiple scales and with broad scope of application without the need for any form of retraining. We experimentally produced and characterized more than 100 proteins. Three high-resolution crystal structures and two cryo–electron microscopy density maps of designed single-chain proteins comprising up to 1000 amino acids validate the accuracy of the method. Our pipeline can also be used to design synthetic protein-protein interactions, as validated experimentally by a set of protein heterodimers. Relaxed sequence optimization offers attractive performance with respect to designability, scope of applicability for different design problems, and scalability across protein sizes.

Editor’s summary and Abstract reproduced from Science, DOI: 10.1126/science.adq1741
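
For readers unfamiliar with the term, the idea of optimizing in a "relaxed sequence space" by gradient descent can be illustrated with a toy sketch. This is not the authors' pipeline. It assumes a made-up per-position cost table (`make_toy_cost`) in place of a structure-derived objective, and all function names are hypothetical. It shows only the general pattern: keep a continuous matrix of logits, turn each position into a probability distribution over the 20 amino acids with a softmax, take gradient steps on a differentiable loss, and read out a discrete sequence at the end.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def softmax(row):
    """Turn one position's logits into a probability distribution."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_cost(n_positions, seed=0):
    """Hypothetical stand-in objective: a random cost per (position, residue)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in AMINO_ACIDS] for _ in range(n_positions)]


def toy_loss(probs, cost):
    """Expected cost of the relaxed (probabilistic) sequence."""
    return sum(p * c for p_row, c_row in zip(probs, cost)
               for p, c in zip(p_row, c_row))


def optimize(cost, steps=300, lr=1.0):
    """Gradient descent on logits; returns (discrete sequence, loss history)."""
    n_pos, n_aa = len(cost), len(cost[0])
    # Start from all-zero logits, i.e. a uniform distribution at every position.
    logits = [[0.0] * n_aa for _ in range(n_pos)]
    history = []
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        history.append(toy_loss(probs, cost))
        for i in range(n_pos):
            expected = sum(p * c for p, c in zip(probs[i], cost[i]))
            for a in range(n_aa):
                # Analytic gradient of the loss through the softmax:
                # dL/dz[i][a] = p[i][a] * (cost[i][a] - E_p[cost[i]])
                logits[i][a] -= lr * probs[i][a] * (cost[i][a] - expected)
    probs = [softmax(row) for row in logits]
    history.append(toy_loss(probs, cost))
    # Discretize: pick the most probable residue at each position.
    sequence = "".join(
        AMINO_ACIDS[max(range(n_aa), key=row.__getitem__)] for row in probs
    )
    return sequence, history
```

Calling `optimize(make_toy_cost(8))` returns an 8-residue string together with the loss recorded at each step. Because the toy loss is linear in the probabilities, the result is simply the lowest-cost residue at each position. A real design objective would couple positions through the predicted structure, which is where gradient-based optimization becomes useful.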