KAUST’s DeepGO-SE AI tool excels in predicting functions of unknown proteins, offering promising applications in biotechnology and research.
A new artificial intelligence (AI) tool that draws logical inferences about the function of unknown proteins promises to help scientists unravel the inner workings of the cell.
Developed by KAUST bioinformatics researcher Maxat Kulmanov and colleagues, the tool outperforms existing analytical methods for forecasting protein functions and is even able to analyze proteins with no clear matches in existing datasets.
Advancements in Protein Function Analysis
The model, termed DeepGO-SE, takes advantage of large language models similar to those used by generative AI tools such as ChatGPT. It then employs logical entailment to draw meaningful conclusions about molecular functions based on general biological principles about the way proteins work.
In essence, it lets computers reason logically by constructing models of part of the world (in this case, protein function) and inferring the most plausible scenario from common sense and reasoning about what should happen in those world models.
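The idea of reasoning over several world models can be illustrated with a toy sketch. This is not DeepGO-SE's implementation: the three-term ontology, the hand-written world models, and the function names below are all hypothetical, and simple averaging stands in for whatever aggregation the real tool uses. The sketch only shows the general principle that a statement true in every model is treated as entailed, while a statement true in some models receives a lower score.

```python
from statistics import mean

# Hypothetical toy ontology (child function -> parent function).
# A protein with the child function also has the parent function.
PARENTS = {
    "protein kinase activity": "kinase activity",
    "kinase activity": "catalytic activity",
}


def close_under_ontology(functions):
    """Return the given functions plus every ancestor the ontology implies."""
    closed = set(functions)
    for function in functions:
        while function in PARENTS:
            function = PARENTS[function]
            closed.add(function)
    return closed


def approximate_entailment(world_models, function):
    """Fraction of world models in which the protein has the function.

    A score of 1.0 means the statement holds in every model considered,
    which is the toy analogue of the statement being entailed.
    """
    return mean(
        1.0 if function in close_under_ontology(model) else 0.0
        for model in world_models
    )


# Three hypothetical world models, each a guess at one protein's functions.
world_models = [
    {"protein kinase activity"},
    {"kinase activity"},
    {"catalytic activity"},
]

for term in ("catalytic activity", "kinase activity", "protein kinase activity"):
    print(term, round(approximate_entailment(world_models, term), 2))
```

In this sketch the general term "catalytic activity" holds in all three models, so it scores highest, while the most specific term holds in only one. A real system would learn its models from data rather than listing them by hand.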
Collaborative Research and Applications
“This method has many applications,” says Robert Hoehndorf, head of the KAUST Bio-Ontology Research Group, who supervised this research, “especially when it is necessary to reason over data and hypotheses generated by a neural network or another machine learning model.”
Kulmanov and Hoehndorf collaborated with KAUST’s Stefan Arold, as well as researchers at the Swiss Institute of Bioinformatics, to assess the model’s ability to decipher the functions of proteins whose roles in the body are unknown.
The tool successfully used data on the amino acid sequence of a poorly understood protein, together with its known interactions with other proteins, to precisely predict its molecular functions. The model was so accurate that DeepGO-SE ranked in the top 20 of more than 1,600 algorithms in an international competition of function prediction tools.
Impact and Future Directions
The KAUST team is now using the tool to investigate the functions of enigmatic proteins discovered in plants that thrive in the extreme environment of the Saudi Arabian desert. They hope that the findings will be useful for identifying novel proteins for biotechnological applications and would like other researchers to embrace the tool.
As Kulmanov explains: “DeepGO-SE’s ability to analyze uncharacterized proteins can facilitate tasks such as drug discovery, metabolic pathway analysis, disease associations, protein engineering, screening for specific proteins of interest, and more.”
Reference: “Protein function prediction as approximate semantic entailment” by Maxat Kulmanov, Francisco J. Guzmán-Vega, Paula Duek Roggli, Lydie Lane, Stefan T. Arold and Robert Hoehndorf, 14 February 2024, Nature Machine Intelligence.
DOI: 10.1038/s42256-024-00795-w