The availability of big data means that now, for the first time, we have access to more information than the human brain can comprehend in a lifetime. At the same time, significant advances in AI technology over the past few years mean that computers can now not only scan and read through these data, but also interpret and even learn from them.
These advances are down to a type of AI called deep learning. Deep learning is a type of machine learning that enables a computer to learn from unstructured data in a way loosely modelled on how networks of neurons work in the brain. These systems improve their performance each time they are exposed to more data. Deep learning has led to striking improvements in pattern and image recognition, which underpin, for example, self-driving cars.
‘Traditionally in computer programming we get data and carefully write down all the steps it takes to get from our input to output. The computer programmer of the past is like a micromanager who tells the computer what to do every step of the way,’ explained Peter Norvig, director of research at Google, at a seminar on AI at the AAAS (American Association for the Advancement of Science) annual meeting in Texas, US, in February 2018.
However, with the advent of big data, this logical step-by-step progression is neither possible nor desired. ‘We’re looking at speech input, vision and other data from the environment that is driven by uncertainty rather than logic, so we don’t have the ability to tell a computer what to do each step of the way, but we can teach it how to do it itself. The programmer has moved from being a micromanager to a teacher,’ says Norvig.
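Norvig’s contrast between micromanaging and teaching can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the temperature readings, the hard-coded 30-degree rule and the function names are invented, and real deep learning systems are far more elaborate than a single learned threshold.

```python
# Toy illustration only: the data, the 30-degree rule and all names are
# hypothetical and do not come from the article or from any real system.

def rule_based_is_hot(temp_c):
    """'Micromanager' style: the programmer hard-codes the decision."""
    return temp_c >= 30.0


def learn_threshold(examples):
    """'Teacher' style: choose the threshold that best fits labelled examples.

    examples is a list of (value, label) pairs with boolean labels. Each
    observed value is tried as a threshold, and the first one that classifies
    the most examples correctly is returned.
    """
    best_threshold, best_correct = None, -1
    for candidate, _ in sorted(examples):
        correct = sum((value >= candidate) == label for value, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Labelled examples stand in for the 'teaching' data.
examples = [
    (12.0, False), (18.5, False), (24.0, False),
    (27.5, True), (31.0, True), (35.0, True),
]
learned = learn_threshold(examples)


def learned_is_hot(temp_c):
    """Apply the threshold that was learned from the examples."""
    return temp_c >= learned
```

With these examples the learned threshold is 27.5, so a reading of 28 is classed as hot by the learned rule but not by the hand-written one. The programmer supplied examples rather than the decision itself.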
US biotechnology company Zymergen is one of several organisations aiming to use AI to take over some of the time-consuming tasks involved in doing R&D, and to accelerate the pace of discovery. Part of the company’s work involves fine-tuning the genes of industrial microbes produced by other companies, so that the microbes produce higher yields of biologic molecules that make up biofuels, plastics and drugs.
The microbes that arrive at Zymergen are already pretty good at what they do, so scientists need to do hundreds, or perhaps thousands, of experiments to create different strains, each with a single deliberate mutation, to see which ones produce the most product. With robotic machines they can do around 1000 experiments/week, but these machines must be told what to do, which holds up the process. ‘There are more ways to alter the DNA of a simple single-celled microbe than there are atoms in the universe,’ says Aaron Kimball, chief technology officer at Zymergen, adding: ‘There are simply too many possibilities to test in a lab, even with an automated approach.’
Zymergen is using AI to augment the work of its scientists. The AI system analyses the huge amount of data resulting from the experiments, looks for patterns and offers theories to explain them. It then suggests subsequent experiments, helping the scientists to find the precise genetic change that will elicit an improvement. It is also able to search through and scan millions of scientific papers, looking for patterns and emergent themes that could generate new research ideas.
So far, Zymergen has reduced the time-to-market for one customer by more than three years. For another customer, it claims to have more than doubled the net product margin. Even more interestingly, the genes discovered by the robots are ones that scientists would probably never have found, since the majority are not directly related to known pathways for synthesising the desired chemicals.
Over the past few years chemical giants like BASF, Dow, and Evonik Industries have all embraced AI technology. BASF’s new supercomputer Quriosity, for example, based in Dresden, Germany, can perform 1.5 quadrillion computing operations/second – the capacity of around 50,000 laptops. The supercomputer is being used to find new chemical compounds and products that would otherwise remain undiscovered. For example, BASF wanted to find a soluble form of an existing active agent used in crop protection. Instead of performing thousands of experiments themselves, researchers provided the supercomputer with a large number of possible structures and asked it to find the best candidates. The best hundred were then tested in the laboratory.
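The shortlisting step described above, in which many candidate structures are scored and only the best hundred go to the laboratory, can be sketched as follows. This is a minimal illustration under stated assumptions, not BASF’s method: the candidate records are made up, and the random numbers stand in for whatever solubility prediction a real model would produce.

```python
import heapq
import random


def predicted_score(candidate):
    """Hypothetical stand-in for a model's prediction; higher is better."""
    return candidate["score"]


def shortlist(candidates, k=100):
    """Return the k highest-scoring candidates, best first."""
    return heapq.nlargest(k, candidates, key=predicted_score)


# Placeholder data: random scores instead of real computed properties.
rng = random.Random(0)
candidates = [
    {"id": f"structure-{i}", "score": rng.random()} for i in range(10_000)
]

best = shortlist(candidates, k=100)
```

Here `best` holds the 100 candidates that would be passed on for laboratory testing. The remaining 9900 are never synthesised, which is where the saving in experimental effort comes from.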
In addition, BASF is developing a software app, called Maglis, to help farmers detect crop diseases in their fields. The app uses image-recognition software. The idea is that farmers take photographs of suspicious spots on crop leaves; the software then searches through a database of known pathogens and delivers early warnings for diseases. According to BASF, tests suggest the app can identify disease as well as, or better than, farmers. The software can also predict crop development. Farmers simply tell the program what day they will sow seeds and feed in the coordinates of the field, and Maglis predicts when the first seedlings will appear, what the yield will be and when the crop will be ready for harvest. The system learns and gets better as time goes on.
One reason drug discovery is so difficult is that many of the most obvious candidates have already been found. In addition, it is still a hit-and-miss affair, with some 50% of mid- and late-stage trials ending in failure.
But AI systems can analyse everything from scientific papers, through molecular structures and gene sequences, to images. Within weeks, they can make connections and form hypotheses, many of which scientists would never see. And, unlike people, AI systems scan the data and generate hypotheses in a totally unbiased way.
Not surprisingly, many big pharma companies are investing in the technology. In 2016, Pfizer announced it was using IBM’s Watson supercomputer to accelerate cancer drug discovery. In 2017, AstraZeneca in the UK teamed up with US biopharma company Berg to find biomarkers and drugs for neurological diseases. In the same year, Roche subsidiary Genentech announced a partnership with US-based GNS Healthcare to use its AI platform for cancer drug development.
Since 2015, UK pharma company GSK has been consolidating the huge amount of data it holds on R&D into one giant database. The database contains all the molecules it has explored for drug discovery purposes, together with the tests done on them. The records cover the synthetic routes used to make the molecules, the screens in which they were tested and their fate in testing, including human trials. In summer 2017, GSK announced the creation of its own in-house AI unit that will trawl through these data, looking for hidden connections that humans are unlikely to see.
GSK hopes that by using machine learning technology, it will be able to develop medicines more quickly, at a reduced cost and with higher precision. Ultimately, the company wants to reduce the time it takes to identify disease targets and molecules that act against them from an average of 5.5 years to just one year.
‘The cost of drug discovery and development is well documented,’ says John Baldoni, head of the AI unit at GSK. ‘Approximately a third of that cost is in going from a target associated with a patient with a particular disease to having a molecule that you can test on that patient.’
Currently, scientists approach this step in different ways. If they know a certain pathway is implicated in a disease, they might look at what proteins and molecules are implicated in this pathway. They might then try to design molecules that will fit into binding pockets in these target proteins and modify their activity. Today, this design process takes place in silico, ie using computer simulation. However, these approaches address one variable of the discovery process at a time, eg efficacy, toxicity, metabolism.
According to Baldoni, AI computers can help scientists in this process, testing various theoretical molecules for binding in the desired pocket (efficacy), while assessing potential toxicity and metabolism effects. This simultaneous assessment increases the efficiency of this aspect of the drug discovery cycle, reducing the time this step takes to a few days.
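The difference between assessing one variable at a time and assessing efficacy, toxicity and metabolism together can be shown with a small sketch. It is purely illustrative: the molecule names, the predicted values and the cut-offs are invented, and this is not a description of GSK’s systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical model outputs for one candidate molecule."""
    name: str
    binding: float    # predicted binding in the target pocket, higher is better
    toxicity: float   # predicted toxicity risk, lower is better
    stability: float  # predicted metabolic stability, higher is better


def passes_all(p, min_binding=0.7, max_toxicity=0.3, min_stability=0.5):
    """Check all three criteria in one pass (cut-offs are assumptions)."""
    return (
        p.binding >= min_binding
        and p.toxicity <= max_toxicity
        and p.stability >= min_stability
    )


predictions = [
    Prediction("mol-A", binding=0.90, toxicity=0.6, stability=0.8),  # binds, but toxic
    Prediction("mol-B", binding=0.80, toxicity=0.2, stability=0.7),  # acceptable on all three
    Prediction("mol-C", binding=0.75, toxicity=0.1, stability=0.3),  # binds, but unstable
    Prediction("mol-D", binding=0.40, toxicity=0.1, stability=0.9),  # weak binder
]

# Screening on binding alone keeps candidates that later fail elsewhere.
binding_only = [p.name for p in predictions if p.binding >= 0.7]

# Screening on all three at once leaves only the viable candidate.
survivors = [p.name for p in predictions if passes_all(p)]
```

In this made-up set, three molecules pass on binding alone, but only mol-B survives when all three properties are checked together, so the two that would have failed later are dropped before any further work is spent on them.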
By way of example, Baldoni explains: ‘We are interested in a particular pathway implicated in several diseases. We approached an AI company, and asked if it could help us find potential targets involved in this pathway. Using AI, the company trawled through public, scientific data sources and looked for patterns in genes that are over or under expressed in patients to identify potential protein targets for which we might design a drug. Based on those targets, we then needed to know how many molecules are out there that might bind to those targets and modulate them. The AI company did a virtual screen, and based on protein homology and known molecules, it identified 10,000 possible molecules.’ Currently, GSK is working with other AI companies that can weed out from that collection any compounds that are hard or impossible to make and aid the chemist in selection and synthesis. AI technology is transforming how scientists and engineers design, make and test molecules in silico, making the cycle much shorter.
The real revolution, however, is expected to come in the second stage of drug research. Currently the slowest stage in R&D is making the molecules that have been designed on the computer and testing them to see if they do what is required. According to Baldoni, machines equipped with AI will soon be able to aid in this too.
‘Imagine being able to use AI to design a molecule that fits into the active site of a protein, and simultaneously assess its potential toxicity and metabolism. Promising structures are fed into robotic chemistry for synthesis and then transferred into another system for biological screens. All the data from these operations are captured and reused to refine the AI algorithms and further increase the effectiveness of this design, make and test cycle. Because each element is individually available, I think this integration will be done within three years. When that happens, it will dramatically shorten the design, test, make cycle where you ultimately make the molecule. That’s the future.’
Biotech companies are also using AI to look for new drugs. UK company BenevolentAI, for example, uses deep learning linguistic models and algorithms to analyse scientific papers, patents and clinical trial information, as well as huge chemical, biological and medical databases.
‘The computer analyses and understands the context; then reasons, learns, explores, creates and translates what it has learnt to produce unique drug development hypotheses,’ says Jackie Hunter, CEO of BenevolentAI. ‘Our expert scientific teams then triage and validate the hypotheses we think are worth exploring, and subsequently test these hypotheses in biological experiments.’
The company has teamed up with the Sheffield Institute of Translational Neuroscience to find a new drug for motor neurone disease. There are only two drugs currently licensed to treat the condition, which affected the late astrophysicist Stephen Hawking. However, BenevolentAI has already come up with two new candidates, one of which has produced promising results in preclinical trials. ‘What is particularly exciting is that research suggests that our technology may have found a way to prevent the death of motor neurons, the key to finding a cure for the disease,’ says Hunter.
The company has also used its AI technology to research two drugs for Alzheimer’s disease, which it sold to a US company to develop in a deal worth £624m. As well as developing its own clinical pipeline, BenevolentAI is looking to repurpose existing drugs. For example, it is currently conducting a Phase 2b study on a previously unsuccessful compound from Johnson & Johnson to see if it can treat Parkinson’s disease patients who suffer from excessive daytime sleepiness.
‘We are now working on over 20 potential programmes driven by the technology,’ says Hunter. This includes collaborations with Parkinson’s UK and The Cure Parkinson’s Trust (CPT). ‘We aim to identify at least three currently available medicines that can be repurposed to address Parkinson’s and two new ways to treat the disease.’
Hunter believes that using AI to look for therapeutic drugs has saved her clients both time and money, delivering 60% savings and shaving four years off the early drug discovery cycle. ‘The results speak for themselves. AI and machine learning do not supplant human intelligence but instead augment it. Our AI-driven hypothesis engine generates bias-free hypotheses at a scale impossible to replicate through traditional methods.’