Using artificial intelligence (AI) and one of the world’s fastest supercomputers, Chinese scientists are designing previously unknown chemicals that could be used clinically in the future.
The Tianhe-2 supercomputer in south China’s Guangdong Province, which ranks among the 10 fastest computers in the TOP500 list published this month, has been used as a platform for drug discovery. Now, AI-based algorithms make the machine even smarter. Scientists from Sun Yat-sen University and Beijing-based AI startup Galixir, along with colleagues from the Georgia Institute of Technology and the Massachusetts Institute of Technology, reported a practical deep-learning toolkit, run on Tianhe-2, that predicts the biosynthetic pathways for natural products (NPs) or NP-like compounds.
Natural products are the primary source for clinical drug discovery. More than 60 percent of FDA-approved small-molecule drugs in the United States are NPs or their derivatives. Over 300,000 NPs have been recorded to date, but because producing them is so complex, only one-tenth have been developed as a substrate or product, which makes computer-aided screening an urgent need. In a recent study published in Nature Communications, the researchers presented a tool called BioNavi-NP that proposes optimal NP biosynthetic pathways from simple building blocks without relying on already-known biochemical rules.
First, a single-step bio-retrosynthesis prediction model is trained to generate candidate precursors for a target NP. The fully data-driven model is 1.7 times more accurate than the previous rule-based model, according to the study. Then, an automatic retro-biosynthesis route planning system efficiently samples plausible biosynthetic pathways. The study reports that the toolkit successfully identified biosynthetic pathways for 90.2 percent of 368 test compounds.
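The two-stage approach described above, a single-step model that proposes precursors followed by a planner that chains those steps back to simple building blocks, can be illustrated with a minimal sketch. This is not the BioNavi-NP code or the planning algorithm used in the study: the molecule names, confidence scores, lookup table, and the functions `propose_precursors` and `plan_route` are all hypothetical, and a plain best-first search stands in for the real route planner.

```python
import heapq
from itertools import count

# Hypothetical stand-in for a trained single-step model: each molecule maps
# to candidate precursor sets, each with a made-up confidence score.
TOY_SINGLE_STEP = {
    "target": [(("intermediate_a", "block_1"), 0.7), (("intermediate_b",), 0.2)],
    "intermediate_a": [(("block_2", "block_3"), 0.9)],
    "intermediate_b": [(("block_1",), 0.5)],
}

# Hypothetical set of simple building blocks where a route may stop.
BUILDING_BLOCKS = {"block_1", "block_2", "block_3"}


def propose_precursors(molecule):
    """Return candidate (precursors, confidence) pairs for one retro step."""
    return TOY_SINGLE_STEP.get(molecule, [])


def plan_route(target, max_steps=10):
    """Best-first search that always expands the highest-scoring partial route.

    A route's score is the product of its step confidences. Returns
    (score, steps) for the first complete route found, or None.
    """
    tie = count()  # tiebreaker so the heap never has to compare routes
    heap = [(-1.0, next(tie), (target,), [])]
    while heap:
        neg_score, _, unsolved, steps = heapq.heappop(heap)
        # Molecules that are already building blocks need no further steps.
        unsolved = tuple(m for m in unsolved if m not in BUILDING_BLOCKS)
        if not unsolved:
            return -neg_score, steps
        if len(steps) >= max_steps:
            continue
        molecule, rest = unsolved[0], unsolved[1:]
        for precursors, confidence in propose_precursors(molecule):
            heapq.heappush(
                heap,
                (
                    neg_score * confidence,
                    next(tie),
                    rest + precursors,
                    steps + [(molecule, precursors)],
                ),
            )
    return None


if __name__ == "__main__":
    result = plan_route("target")
    if result is not None:
        score, steps = result
        print(f"route score: {score:.2f}")
        for product, precursors in steps:
            print(f"{product} <- {' + '.join(precursors)}")
```

In this toy setup, the search prefers the route through `intermediate_a` because its combined confidence is higher than the alternative. A real system would replace the lookup table with a learned model and would typically use a more sophisticated search.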
The researchers also integrated an existing enzyme prediction tool to provide a user-friendly, publicly available web server that predicts biosynthetic pathways. It can also score the biological feasibility of those pathways based on the estimated preference of species and enzymes. By entering a relevant NP molecule into the online toolkit, users can obtain multiple predicted synthesis routes within a few minutes. The quick results are made possible only by Tianhe-2’s strong parallel computing capability and its customized GPU resources, which help shorten the training and testing time from more than two weeks to one day. China’s Tianhe-2 supercomputer has been widely used to promote research in health and medicine.
Source: This news was originally published by CGTN.