AlphaMissense and Genetic Sequencing

By João L. Carapinha

September 24, 2023

Artificial intelligence (AI) continues to show its capabilities in healthcare. Many examples show it deployed as a predictive tool for early disease identification, which has enabled earlier and more effective treatment where necessary. But could we detect disease even earlier? And how early could healthcare implement a tool like this?

Machine learning algorithms have helped predict harmful genetic changes, and healthcare benefits through improved rare disease diagnosis and targeted treatments. In a recent paper, researchers explore how to apply these algorithms to variants found through genetic sequencing.

Genome sequencing has revealed over 4 million missense variants. These are genetic variants that alter the amino acid sequence of proteins. However, researchers have clinically classified only about 2% of these variants as pathogenic or benign. So, the challenge is to predict accurately how the remaining variants will affect protein function and the health of the organism.

To address this, researchers have developed machine-learning approaches that exploit patterns in biological data. AlphaFold 1 is a protein structure prediction tool designed by DeepMind. It ranked first at the 13th Critical Assessment of Structure Prediction (CASP13) in 2018. In 2020, the researchers made significant advancements to the model, which produced AlphaFold 2. At CASP14, it achieved a median score above 90 out of 100 on the Global Distance Test (GDT). This was a powerful achievement. For reference, a score of 100 indicates a complete match with the experimentally determined structure of the protein.
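
To make the Global Distance Test score more concrete, here is a minimal Python sketch of how a GDT-style score can be computed. It is a simplification, not the official CASP procedure: it assumes the predicted and experimental structures have already been superimposed and that we only have one distance per residue, in angstroms. The function name `gdt_ts` and the example distances are made up for illustration.

```python
def gdt_ts(distances):
    """Simplified GDT-style score from per-residue distances (in angstroms).

    Assumption: the two structures are already superimposed. The real test
    searches for the best superposition at each cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    # Fraction of residues that fall within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    # Average the four fractions and express the result out of 100.
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a four-residue toy protein.
perfect = gdt_ts([0.0, 0.3, 0.5, 0.9])
partial = gdt_ts([0.5, 1.5, 3.0, 9.0])
print(perfect, partial)
```

In this toy example every residue of the first prediction sits within 1 angstrom of its true position, so the score is 100. The second prediction has one residue far out of place and scores 56.25.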

Researchers designed AlphaMissense by adapting the AlphaFold model and fine-tuning it on human and primate variant population frequency databases. It avoids circularity by learning from weak labels derived from population frequency data, rather than from human-curated clinical annotations, combined with unsupervised protein language modelling. The model takes an amino acid sequence as input and predicts the pathogenicity of all possible single amino acid changes at a given position in the sequence. They trained it in two stages. In the first stage, they trained the network to predict single-chain structure and to model protein language. In the second stage, they fine-tuned the model to classify variant pathogenicity on human proteins.
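
The idea of scoring every single amino acid change at every position can be illustrated with a short Python sketch. It only enumerates the candidate substitutions for a toy sequence; it does not score them and is not part of AlphaMissense itself. The function name `single_substitutions` and the three-letter example sequence are assumptions made for illustration.

```python
# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """List every single amino acid change for a protein sequence.

    Each variant is written as reference, position (1-based), alternate,
    for example "M1A" for methionine at position 1 changed to alanine.
    """
    variants = []
    for position, reference in enumerate(sequence, start=1):
        for alternate in AMINO_ACIDS:
            if alternate != reference:
                variants.append(f"{reference}{position}{alternate}")
    return variants


# Hypothetical three-residue sequence: 3 positions x 19 alternatives each.
variants = single_substitutions("MKT")
print(len(variants), variants[0], variants[-1])
```

Each position has 19 possible alternatives, so even a short protein produces a long list of candidates. This is why a model that can score all of them at once is useful.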

The researchers evaluated AlphaMissense against clinical annotations, de novo disease variants, and experimental assays. Compared with other models, it excelled at distinguishing between harmful and harmless missense variants. It also achieved state-of-the-art performance across the curated clinical benchmarks.

Using the model, the researchers created a dataset of 71 million missense variant predictions for the human proteome. Clinicians and researchers could use this resource to prioritise variants for rare disease diagnostics, inform studies of complex trait genetics, and design and interpret further experiments across the human proteome.
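
As a rough illustration of what prioritising variants might look like in practice, the Python sketch below ranks a handful of variants by a pathogenicity score between 0 and 1. Everything in it is an assumption made for illustration: the scores are invented, the function name `prioritise` is hypothetical, and the 0.5 cutoff is arbitrary rather than a published threshold.

```python
def prioritise(scores, cutoff=0.5):
    """Return variants scoring at or above the cutoff, highest score first.

    Assumption: scores run from 0 (likely harmless) to 1 (likely harmful).
    The default cutoff is illustrative only.
    """
    flagged = [
        (variant, score)
        for variant, score in scores.items()
        if score >= cutoff
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Invented example scores, not real predictions.
example_scores = {"M1A": 0.12, "K2E": 0.91, "T3P": 0.67}
ranked = prioritise(example_scores)
print(ranked)
```

Here the two higher-scoring variants would be reviewed first and the low-scoring one set aside. A real diagnostic workflow would combine such scores with other clinical and genetic evidence.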

AI and machine learning are poised to play a crucial role in healthcare by enabling accurate prediction of variant pathogenicity. Models like AlphaMissense could speed up our understanding of the molecular effects of variants on protein function. This will help find genes that cause diseases and improve the diagnosis of rare genetic diseases.
