In a recent paper published in the journal Nature Communications, the researchers introduce Evo-MD and demonstrate its ability to help explain important biophysical processes at the plasma membrane in mammalian cells. The paper also challenges a long-held hypothesis about how cholesterol accumulates near plasma membrane proteins, proposing an unusual mechanism that could better explain this phenomenon.
Cholesterol is a major component of plasma membranes in mammals, and its interactions with proteins are instrumental in processes that transmit biochemical signals into and out of cells. Evidence suggests that when cholesterol accumulates abnormally, it can also play a role in neurological diseases and cancer. Gaining a better understanding of how cholesterol interacts with proteins in the plasma membrane could thus help to identify new strategies for drug design.
Since 2009, data produced using a variety of scientific approaches have suggested that membrane proteins attract cholesterol at the amino acid motif CRAC (Cholesterol Recognition/Interaction Amino Acid Consensus) and its inverse, CARC. The hypothesis posits that a genetic sequence and its resulting 3D protein structure form a good fit for cholesterol molecules, attracting them and holding them in place. As interesting as this hypothesis has been, however, these motifs have only been loosely defined, and research has not succeeded in demonstrating conclusively that cholesterol can bind to proteins at such sites.
"The concept that cholesterol molecules bind to CRAC and CARC motifs is based on the idea that there is a specific attraction between ligands and proteins," Risselada explained. "Using x-ray crystallography, one can observe how cholesterol binds with other things, but resolving structure in CRAC / CARC binding has not been possible. When we started our research we wanted to prove that Evo-MD works by recovering a known motif. Surprisingly, however, it revealed that the motif was not responsible for cholesterol attraction. Instead, it led to a better hypothesis of why it occurs."
Risselada and postdoctoral scientist Dr. Jeroen Methorst set out to better understand the nature of cholesterol accumulation using Evo-MD. Specifically, they explored the thermodynamic forces that govern how a cholesterol molecule becomes attracted to a signaling protein that spans a plasma membrane.
Implemented as a computer algorithm, Evo-MD begins by generating a large set of random sequences of amino acids, the building blocks of proteins. It then evaluates the ability of each sequence to attract cholesterol and assigns it a fitness score. Sequences found to be more likely to interact with cholesterol receive a higher score, while those that are less likely to do so receive a lower one.
After a round is complete, the entire process repeats, but this time the best performers from the previous round become seeds for the next. The algorithm repeatedly selects high-fitness sequences and recombines them to create a new population. This process, called a genetic algorithm, mimics the passing of genetic material from parents to children, while also introducing random alterations in the amino acid sequences that prevent a mere copying of genetic information between generations.
To generate the fitness score for each amino acid sequence, the researchers use molecular dynamics (MD), a computationally demanding method that uses physical principles to simulate interactions among molecules down to the atomic level. Specifically, Risselada's team investigated the ability of cholesterol and transmembrane motifs to interact using force fields, quantifying the energy within the system that results from the arrangement of molecules and atoms in space.
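The select, recombine, and mutate loop described above can be sketched in a few lines of Python. This is only an illustration of a generic genetic algorithm, not the Evo-MD implementation: the function names, the single-point crossover, the mutation rate, and the default population and generation counts are assumptions, and `toy_fitness` (which simply counts hydrophobic residues) is a hypothetical stand-in for the molecular dynamics score.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids
HYDROPHOBIC = set("AILMFVW")


def toy_fitness(seq):
    """Hypothetical placeholder for the MD-derived score: counts hydrophobic residues."""
    return sum(aa in HYDROPHOBIC for aa in seq)


def evolve(fitness, length=10, pop_size=128, generations=40,
           n_parents=32, mutation_rate=0.05, seed=0):
    """Generic genetic algorithm over amino acid strings (all defaults are illustrative)."""
    rng = random.Random(seed)
    # Start from a population of random sequences.
    population = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Score every sequence and keep the best performers as seeds.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)            # single-point crossover
            child = mother[:cut] + father[cut:]
            # Random point mutations prevent mere copying between generations.
            child = "".join(rng.choice(AMINO_ACIDS) if rng.random() < mutation_rate else aa
                            for aa in child)
            children.append(child)
        population = parents + children               # parents survive unchanged
    return max(population, key=fitness)


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Because the parents are carried over unchanged in this sketch, the best score can never decrease from one generation to the next. In the real workflow each call to the fitness function would be a full simulation, which is what makes the search so expensive.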
Like a species adapting successfully to its environment, repeating this cycle numerous times guides evolution toward the goal of an optimal fitness solution. By using random sampling and this data-driven reinforcement learning approach, Evo-MD is much faster and more efficient at maximizing protein–cholesterol attraction than would be possible by individually screening every possible combination of amino acid sequences (in the example studied, this would require evaluating 20^10, or roughly 10 trillion, possibilities!). At the same time, the incorporation of molecular dynamics simulations within the algorithm ensures that well-established biophysical principles provide a scientifically reliable foundation for predicting how cholesterol and proteins interact.
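To put the size of the search space in perspective, the short calculation below compares exhaustive screening with an evolutionary run. It assumes the count refers to 20 standard amino acids at each of 10 variable positions, and it reads the article's reported figures of about 128 sequences and 40 iterations as 128 sequences evaluated in each iteration; both readings are assumptions.

```python
n_amino_acids = 20       # standard amino acids
sequence_length = 10     # assumed number of variable positions

exhaustive = n_amino_acids ** sequence_length
evaluated = 128 * 40     # assumed: 128 sequences in each of 40 iterations
fraction = evaluated / exhaustive

print(f"{exhaustive:,}")   # 10,240,000,000,000
print(evaluated)           # 5120
print(f"{fraction:.1e}")   # 5.0e-10
```

Under these assumptions the evolutionary search simulates well under a billionth of the sequences that brute-force screening would require.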
The investigators found that after simulating approximately 128 individual amino acid sequences over 40 iterations, Evo-MD begins to converge to an optimal fitness score in which the top sequences facilitate a stable interaction between protein and cholesterol molecules. "The interesting thing about using a data-driven approach is that it finds a solution that is the best fit for the available data," Risselada explained. "But then the question is, why is that the solution?"
Surprisingly, when looking at the optimal protein–cholesterol interactions, the investigators realized that CRAC / CARC motifs cannot by themselves attract cholesterol, because they contain amino acid sequences that actively repel it. Instead, the results suggest that there is a fine balance in protein structures surrounding CRAC / CARC between those that attract cholesterol and those that repel it. This exerts a more complex set of thermodynamic forces that holds the lipid in place. From this perspective, it is not that cholesterol binds to CRAC / CARC motifs, but that neighboring structures in the protein facilitate cholesterol accumulation. Experiments using nuclear magnetic resonance scans and cellular assays confirmed that this explanation is more likely, and provided more detail about the exact amino acid residues involved.
"The typical idea is that molecules bind because their shapes fit together, but in this case it's clear that the mechanism is not normal," Risselada said. "It makes a lot more sense to think about this as a membrane-mediated effect that fixes some compositional features but also enables a lot of flexibility. The protein induces a high energy state in the membrane, which causes an effective cholesterol attraction."
Evo-MD needs to run large pools of amino acid sequences simultaneously in order to capture the diversity of all potential solutions. This means running many simulations in parallel, using up to thousands of compute cores. For this reason, doing this research efficiently was only possible using HLRS's Hawk supercomputer and other HPC systems. By conducting the research on multiple computers and receiving consistent results, the investigators gained confidence that their method is sound.
As this research project developed, the investigators also improved the Evo-MD algorithm to use supercomputing capabilities more efficiently. Completing all simulations for each generation of amino acid sequences before moving to the next generation meant that initially many cores would spend time idling while waiting for others to finish their calculations. Today, the system uses a more complex workflow where generations do not evolve in a strictly chronological fashion but jump among pools of potential amino acid sequences in ways that improve load balancing and make better use of HPC's parallel computing capabilities.
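The article does not spell out how Evo-MD schedules its simulations, so the following is only a sketch of one common way to avoid idle cores: a steady-state evolutionary loop in which a new candidate is bred from the current pool as soon as any worker becomes free, rather than waiting for a whole generation to finish. The function names, pool size, worker count, and the `slow_toy_fitness` placeholder (which sleeps for a varying time to imitate simulations of unequal length) are all hypothetical.

```python
import random
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")


def slow_toy_fitness(seq):
    """Hypothetical stand-in for an MD run whose duration differs per sequence."""
    time.sleep(0.001 * (1 + sum(map(ord, seq)) % 5))
    return sum(aa in HYDROPHOBIC for aa in seq)


def make_child(pool, rng, mutation_rate):
    """Recombine two scored pool members and apply random point mutations."""
    (_, mother), (_, father) = rng.sample(pool, 2)
    cut = rng.randrange(1, len(mother))
    child = mother[:cut] + father[cut:]
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < mutation_rate else aa
                   for aa in child)


def evolve_async(length=10, pool_size=16, workers=4, evaluations=120,
                 mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    pool = []        # (score, sequence) pairs, best first
    pending = {}     # running evaluation -> sequence being scored
    submitted = 0
    with ThreadPoolExecutor(max_workers=workers) as executor:
        while submitted < evaluations or pending:
            # Refill free workers immediately instead of waiting for a full generation.
            while len(pending) < workers and submitted < evaluations:
                if len(pool) >= 2:
                    seq = make_child(pool, rng, mutation_rate)
                else:
                    seq = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
                pending[executor.submit(slow_toy_fitness, seq)] = seq
                submitted += 1
            # Block only until the first evaluation finishes.
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                pool.append((future.result(), pending.pop(future)))
            pool = sorted(pool, reverse=True)[:pool_size]   # keep the best performers
    return pool


top_score, top_seq = evolve_async()[0]
print(top_seq, top_score)
```

Because results arrive in whatever order the workers finish, two runs of this sketch need not produce the same pool. The trade-off is that no worker waits for the slowest simulation in a batch, which matters most when run times vary widely.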
In ongoing work, Risselada's team has been applying Evo-MD to new problems, including using evolutionary schemes to look at interactions among small molecules, which form much more complex systems. "Many research groups that use machine learning are very pragmatic, focusing on structure-based optimization for pharmaceutical applications," Risselada explained. "Evo-MD allows us to measure very dynamic properties, and is enabling us to pursue research that is more unique and fundamental in nature."
— Christopher Williams
Methorst J, Verwei N, Hoffmann, et al. 2025. Physics-based evolution of transmembrane helices reveals mechanisms of cholesterol attraction. Nat Commun. 16: 9275.
Funding for HLRS's Hawk supercomputer was provided by the Baden-Württemberg Ministry for Science, Research, and the Arts and the German Federal Ministry of Research, Technology and Space through the Gauss Centre for Supercomputing (GCS).