Reduce Risk, Cost and Time in the New Era of Small Molecule Drug Discovery

With the renaissance of Small Molecule Drug Discovery, the race to explore uncharted biologically active chemical space is on. However, the process of discovery has multiple challenges that have an impact on time, cost and risk to any Biotech working towards a therapeutic breakthrough.

Discover the challenges and how to remove avoidable risks, costs and time to accelerate your path to discovery.

Transcript: Reduce Risk, Cost and Time in the New Era of Small Molecule Drug Discovery

Hello, my name's Hayden Boehm. I'm the director of product marketing.

And I'm going to talk to you now a little bit about the Dotmatics solution for small molecule drug discovery. For those of you who are not as familiar with Dotmatics, we're a company that's been going for over 10 years, providing informatics software to the chemical and life science industries, really delivering cheminformatics and bioinformatics to lots of different organizations that you'll see here, and really supporting them at these preclinical phases.

So we have over 2 million users of our products, and they have been fundamental in collaborating with us as we keep developing the next iterations of our products, which brings us to where we are today with this offering for small molecule drug discovery. So, where did we see this opportunity come about? Really, it was about trying to help scientists understand their data and have the most comprehensive data, so they can make the best decisions and have the best insights.

So we said, this is pretty huge when it comes to the small molecule drug discovery industry, especially as there has been a renaissance in the opportunity for small molecule. Ten years ago, with the rise of biologics, the narrative was very much that small molecule was going to be dead in 10 years' time. Actually, what we see from the graph on the right, from the number of FDA approvals over the last 10 years, is an increase in small molecule. And the reasons for this are that, as screening technologies have improved, we've uncovered some new potential for small molecule against certain disease targets; a better understanding of the target site has changed that suitability profile; and better computational chemistry helps us understand what's happening and where, and how we can make a better fit there.

And what's happened is that we've really unlocked a huge amount of what was previously uncharted, biologically active chemical space. So what was deemed undruggable by small molecule is now very much in focus. That comes on top of the fact that the requirements to develop a small molecule therapy are a lot simpler versus the development of a biologic, and also the cost-efficacy profile. That is due to the fact that the complexity and the cost of R&D for a biologic are higher versus small molecule. And that cost is also a factor in how much is charged to a patient or a health organization, so there's that question of who can afford to buy it. A lower cost profile means that the therapy gets to a broader populace and has a bigger impact on the community at large. So, a huge opportunity around small molecule.

So where are the challenges where an improvement in cheminformatics could really help with this? This slide here shows a very simplified journey of small molecule drug discovery. The data that's represented here is homogenized and harmonized from multiple different sources. In terms of taking a drug from discovery through to commercialization, the cost of doing that is anywhere between 1 and 3 billion, where 30 to 40% of that is spent in the discovery phase. But the cost of failure is actually higher downstream, so as molecules get selected out, that's where the bigger cost comes. But within the discovery phase itself, which is what we're going to focus on today, there has been a 12% increase year on year over the past 10 years in R&D costs.

And so the question that we could probably ask ourselves is how much of this avoidable cost could be removed with improvements in informatics around drug discovery. Then we look at timelines: somewhere between 12 and 18 years from the start of discovery to the start of commercialization, so four to five years in discovery. Again, when we start digging into the process, maybe we can start thinking about how much of this time is wasted unnecessarily. Are there opportunities to optimize this? Then we look at the number of molecules that we're making to bring a drug to market. Within the discovery phase, we're looking at trying to develop around 10,000 compounds. As we go through hit identification to lead optimization and then candidate selection, our best 250, typically, are what we take into this preclinical testing phase, which then reduces to between five and 10 for these clinical trials.
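To make the scale of that attrition concrete, here is a minimal arithmetic sketch in Python using only the figures quoted in the talk (10,000 compounds, 250 preclinical candidates, 5 to 10 clinical candidates, one approved drug, and 30 to 40% of a 1 to 3 billion total spent in discovery). The names `FUNNEL`, `survival_rates` and `discovery_cost_range` are illustrative assumptions, not part of any Dotmatics product.

```python
# Stage counts quoted in the talk; ranges are kept as (low, high) tuples.
FUNNEL = [
    ("discovery compounds", (10_000, 10_000)),
    ("preclinical candidates", (250, 250)),
    ("clinical trial candidates", (5, 10)),
    ("approved drug", (1, 1)),
]


def survival_rates(funnel):
    """Return (stage, low, high) survival fractions relative to the previous stage."""
    rates = []
    for (_, (prev_lo, prev_hi)), (name, (lo, hi)) in zip(funnel, funnel[1:]):
        # Lowest fraction: fewest survivors out of the largest starting pool.
        # Highest fraction: most survivors out of the smallest starting pool.
        rates.append((name, lo / prev_hi, hi / prev_lo))
    return rates


def discovery_cost_range(total_billion=(1.0, 3.0), share=(0.30, 0.40)):
    """Discovery-phase spend, in billions, implied by the quoted total and share."""
    return (total_billion[0] * share[0], total_billion[1] * share[1])


if __name__ == "__main__":
    for stage, low, high in survival_rates(FUNNEL):
        print(f"{stage}: {low:.2%} to {high:.2%} of the previous stage")
    print("discovery spend (billions):", discovery_cost_range())
```

Read this way, only a few percent of compounds survive each of the first two selections, which is the attrition the following remarks refer to.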

And again, we said, how do we improve those odds? These are our opportunities to get better at that. And again, when we bring one drug to market, when we look back down the process, we have a huge attrition rate, lots of hidden costs, but the burning question is how do we find better candidates more quickly? So looking at the complexity of the workflow around drug discoveries, we sort of break discovery down a little bit further.

What we see is there are multiple pieces that go into bringing the profile of the drug and the molecule to this preclinical phase. And what we see is that there are a lot of different software solutions for each of these tasks that are shown here. Each of them is probably provided by a different vendor who has their own proprietary data, which means that the data between each of these points isn't the same.

So many times these systems and software packages don't talk to each other, or can't talk to each other, because they can't ingest somebody else's data. Likewise, they can't deliver their data in a format that can be ingested by other software. So what we're seeing is that anything up to 80% of data is stuck, and this is also coming from the recent Gartner paper that we talked about there. So that is a pretty significant challenge.

And then when we break it down to another part of this, distilling it further, we look purely at the small molecule drug discovery workflow. If we take the different approaches to get our hit, when we come into lead identification and lead optimization through this make-test-decide cycle, this is where those 10,000 molecules get developed, to bring us our magic 250 that we would move out into preclinical development, where we're going through those toxicology studies, for example. So again, taking that cycle and drilling down a little bit further to the tiny tasks that make up this cycle, we see there are a huge number of tasks associated with it: making something, then putting data into a system, looking at registration or service booking or visualization. There's also software that could help us here: once we've got visualization, how do you perhaps enumerate on different ideas? So there is software that provides machine learning and AI, to a certain degree, to go through this next cycle and pick the next, better set of compounds that we think are going to be more effective against the disease target.

So there are lots of different tasks here that have software to optimize that particular task in itself. And what that leads to is an ecosystem that looks a little bit like this, when looking at our optimal best-of-breed apps. I've marked these all in different colors to represent that each could be software from a different vendor. So in terms of optimizing your capabilities, you could bring all of these different capabilities together to optimize your process: requesting, in terms of getting stuff done, and communicating through different groups to get samples analyzed. But with all of that different software, as we said, these things don't necessarily talk.

So the burden of plumbing them all together, if you like, is actually with the research organization or research institute. They've got to really think about all these different elements of services to all of their users, as well as managing this database and all the things that they need to consider here: integrating their own data from different departments, as well as third-party data if they're working with external collaborators at different phases of the drug discovery process, and also bringing in different third-party data apps and web data as well.

So we've got this huge ecosystem that's being developed, and there's a big ask here of a research institute's informatics department. But there is one other challenge that we've noticed from talking with our customers, and this paper from Gartner beautifully captures the essence of what we've been hearing: we've done all the right things in terms of optimizing tasks and bringing in software to do that.

But when we talk about that burden of making them all talk together and be plumbed together, that is what's known as technical debt, and it is the responsibility of the research organization's IT or informatics department. Now, as we bring more applications into these workflows, or to optimize certain tasks in the workflow, they are all in different languages, so the burden of making them talk suddenly gets higher and higher and higher. You've then got multiple data models and multiple different architectures, and so really it's on the research organization to have to build this suite. The recommendation going forward asks, how could you optimize this process? Because there's a huge amount of human endeavor that's put on the data scientists: anywhere between 60 and 80% of a data scientist's time is actually focused on cleaning data and getting data ready for a secondary use (i.e., into the next software).

This is where there is a huge opportunity to create efficiencies around time, reduce cost in terms of the process, and improve the quality of the data. Because again, every time you are asking a human to parse data, to get it ready, there is an opportunity for error to creep in, because this is a linear task and humans are not great at linear tasks. They tire, they fatigue, and then errors creep in. So the recommendation here is: could the optimal way going forward be having a common data model, a common data code, common tooling and common architecture, so all these applications run off of one single platform utilizing a single data code? And that's really where we at Dotmatics have looked and said, that's the direction for us, we need to go here.
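As a rough illustration of what a common data model buys you, the Python sketch below converts two vendor exports into one shared record type, so downstream tools never have to clean or re-parse the data by hand. Everything here is hypothetical: the `AssayResult` record, the two vendor formats and their field names are invented for the example and do not describe the Dotmatics platform or any real vendor's export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayResult:
    """Hypothetical common record that every application would read and write."""
    compound_id: str
    assay: str
    value_nm: float  # potency normalised to nanomolar


def from_vendor_a(row: dict) -> AssayResult:
    # Hypothetical vendor A export: a dict with IC50 in micromolar under "ic50_uM".
    return AssayResult(row["cmpd"], row["assay_name"], float(row["ic50_uM"]) * 1000.0)


def from_vendor_b(line: str) -> AssayResult:
    # Hypothetical vendor B export: a "ID;ASSAY;IC50_nM" text line.
    compound_id, assay, value = line.strip().split(";")
    return AssayResult(compound_id, assay, float(value))


if __name__ == "__main__":
    a = from_vendor_a({"cmpd": "CPD-1", "assay_name": "kinase", "ic50_uM": "0.5"})
    b = from_vendor_b("CPD-1;kinase;500\n")
    # Both exports describe the same measurement once they share one model.
    print(a == b)
```

With a single model, the unit conversion and field mapping are written once per source instead of being repeated by hand every time data moves to the next tool.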

This is how we can best benefit the industry for small molecule drug discovery. This is where we can help our customers to make these breakthroughs and these innovations that are going to get better drugs to market faster and more cost-effectively. So this heralds the launch of our small molecule drug discovery solution. Really, we are supporting that part of the workflow from hit and lead identification through to candidate selection. It's a fully integrated informatics software that captures experiments, molecules, samples and assays, with visualization tools that keep layering on with every piece of information you add into it, and also elements of machine learning and AI to help enumerate and make suggestions of what would be the next best candidate or best library to continue to make.

So we talked in a slide earlier about all of these different tasks, and this is where over the last 10 years we've been developing best-of-breed applications, supported by all of these different capabilities behind them: R-group analysis, statistical modeling, library enumeration, and the ability to work with CRO portals for compound synthesis, to work in partnerships. As we've probably seen over the last three to four years, the massive breakthroughs in world health have not been made in isolation. They have relied extensively on partnerships. So having safe portals for collaboration, where that data is secure, has been something that we've been developing, to enable library synthesis, registration, QC analysis, and different analytical departments sharing, with everybody working from this same single source of data.

So it's a single source of truth for every department or function that has to help the molecule, and the decisions around the molecule, go through the process and the workflow. Just to visualize this: the benefit from a user perspective is that now you don't have to learn and navigate multiple different interfaces. It's one interface. You single sign-on and then you are there, and you can see all the different tools, functions and capabilities for where you are on this make-test-decide cycle. Other people are also seeing the same data when they log in and perform their part of the process to enable this work to flow. And as we're on a single data code, this means we've got multiple benefits in terms of data integrity, because we're not switching between different libraries and different data sources.

It's great from a compliance and traceability standpoint, and also from a data security standpoint. There are multiple benefits to having a single platform, a single data code, and everybody having this single source of truth. And that's very much where our customers have seen the benefits. They've said the workflows and broad capabilities that Dotmatics has been creating over the last 10-plus years have enabled them to make every experiment matter, so now they're not compromising speed or accuracy. On top of that, because of this single data code and everything being in one space, it's a single source of truth: reliable, repeatable results. Customers that have been working with us on early precursors to this solution have very much seen that benefit, that the scientists have much better access to the data to make better-informed decisions.

And therefore, when they're working on multiple different projects, if they're in a QC department, they know exactly what they're working on, and their data is taken to the right source. Everything just flows in terms of making those decisions. Here, from a cancer research institute: this platform enabled that collaboration, and it's the collaboration between different internal and external departments. So now there is a great team of experts working to give the best outcome possible.

And so we are taking the Dotmatics vision of having best-of-breed applications on a single platform that is based on the foundations of our electronic notebook, search capabilities, and visualization and analytics. We've got data management around assays, library design and registration. As we develop, we've also acquired some partners, with Prism and BioBright. These are things that will be integrated into the small molecule drug discovery solution to optimize our customers' work. You can see these tabs at the top here: the same thinking, methodology and vision that we have for biologics and for solutions around biology, formulations for your industry, and materials science, all powered through data management with instrument integration.

And again, that would also form part of the development and expansion of this small molecule drug discovery solution. So with all of these things, the goal for us today is making your small molecule drug discovery work flow. It's improving collaboration and productivity. It's removing any unnecessary workarounds that you have to date, and improving efficiency by reducing the needless hidden costs that are currently inherent in a workflow. It's improving data quality and compliance. By addressing all these near-term hidden problems or hidden friction points, we can eliminate them and build on that, so that you now have a platform and a data hub around small molecule drug discovery that allows you to make those insights and innovations a lot faster and more cost-effectively, and to really make those breakthroughs that are going to benefit, hopefully, me and you in the future.

So that's everything I have to talk about with regards to the small molecule drug discovery workflows and this introduction. If you are interested in learning more, then come to our webpage. And if you want to talk to somebody a bit further, then there's a request information form, which will put you in contact with one of our team members and help you on your way.

Learn More

Get a live demonstration of Dotmatics from a product expert who can help you learn how to optimize your small molecule discoveries.

Speakers and Panelists

Hironobu Koga

Hironobu has extensive experience in drug discovery research at domestic pharmaceutical companies and has skills in molecular modeling and chemoinformatics; since joining Dotmatics in April 2021, he has been primarily involved in consulting on Bioregister implementation.