Dotmatics
  • Platform

    Scientific Intelligence Platform

    AI-powered data management and workflow automation for multimodal scientific discovery

    Learn More

    Capabilities

    Adaptive Workflows

    Customize, automate, and scale your lab workflows

    Artificial Intelligence

    Leverage AI and ML to accurately predict scientific outcomes

    Material & Ontology Management

    Classify materials and manage entities with full traceability

    Luma Products

    BioGlyph Luma

    Next-gen protein design for complex biologics – integrating molecular modeling, registration, and production with seamless data traceability and precision.

    FCS Express Luma

    Streamlined flow cytometry data capture and traceability—connecting FCS Express outputs to the centralized Luma platform.

    Geneious Luma

    Accelerated antibody discovery for sequence analysis, construct design, and lab execution—integrating the power of Geneious Prime and Geneious Biologics with Luma’s adaptive workflows.

    Lab Connect

    Automated lab data ingestion and modeling—connect instruments, structure scientific data, and streamline lab operations with seamless integration.

  • Solutions

    The State of Chemicals & Materials

    Uncover key trends shaping the chemicals and materials industry

    Read More

    Solutions

    Antibody & Protein Engineering

    Integrated registration, lab workflow and data management

    Flow Cytometry

    Automated flow data processing and auto-gating

    Industry

    Biology Discovery

    Chemistry R&D

    Chemicals and Materials

  • Products

    R&D Software for Scientists

    Review our comprehensive portfolio of products driving scientific breakthroughs for R&D innovation and collaboration.

    Explore All

    BIOINFORMATICS

    SnapGene

    Geneious Prime

    Geneious Biologics

    CHEMINFORMATICS

    Vortex

    DATA ANALYSIS & VISUALIZATION

    Prism

    ELN

    ELN & Data Discovery Platform

    FLOW CYTOMETRY

    OMIQ

    FCS Express

    MULTIMODAL SCIENCE

    Scientific Intelligence Platform

    PROTEOMICS

    Protein Metrics

  • Resources

    Watch a Demo

    See Dotmatics in action with on-demand product tours and demos.

    View Demos

    Resources

    All Resources

    Explore the resource library

    Blog

    Latest insights and perspectives to lead your R&D

    Case Studies

    How our customers are using Dotmatics

    Ebooks & White Papers

    News and discoveries from industry leaders

    Videos

    On-demand videos from industry topics to product demos

    Events

    Dotmatics Summit

    Upcoming Events & Webinars

  • Company

    COMPANY

    About Us

    Careers

    Contact Us

    COMPANY

    News & Media

    Partners

    Portfolio

    Latest News

Request Demo
Want AI-enabled Scientific Discovery? Meet Dotmatics Luma.

Dan Ormsby, Nov 26, 2024

Many people are familiar with how large language models (LLMs) work. You ask a program like ChatGPT a question, and it replies using an AI model built on large public datasets. For most people, the impact on daily life falls somewhere between a novelty and a handy work tool. But what if the same approach were applied to drug discovery? What if your ELN deployment were AI-enabled?

Today, thousands of chemists working on small molecule programs use Dotmatics ELN to capture the reactions they have performed in the lab. We want to help these scientists, and all scientists, to become much more efficient. 

Now, imagine if your drug discovery R&D platform made useful predictions and recommendations on compounds to accelerate drug development. What if those predictions were based on a combination of neural network models trained on local private data and large public compound libraries folded into that private customer data? And what if, thanks to feature engineering, it could offer “magic suggestions” because it truly understands your data and the scientific domains behind the information? That’s exactly what Dotmatics is building.

Dotmatics Luma Now…and What’s Ahead

Earlier this year Dotmatics unveiled Luma, a breakthrough scientific intelligence platform that simplifies collecting and processing instrument data, and helps non-technical users easily gain critical insights directly from data. Luma opens up worlds of possibilities for customers to supercharge their existing ELNs with AI.

Currently in Luma we are building out an AI prototype solution that leans on large public datasets of molecules available for purchase. When a compound is sketched, Luma runs algorithms to calculate properties such as the molecular formula, mass, and IUPAC chemical name, and can register the compound to check whether it is novel within that customer’s system.
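To make the property calculation step concrete, here is a minimal sketch of deriving atom counts and molecular mass from a formula string. It is not Luma's implementation, which relies on Dotmatics' own chemical toolkit; the function names `parse_formula` and `molecular_mass` are hypothetical, and the parser assumes a flat formula with no brackets, charges, or isotopes.

```python
import re

# Standard average atomic masses in g/mol for a few common elements.
ATOMIC_MASS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C9H8O4' (no brackets, charges, or isotopes)."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"no mass on file for element {symbol!r}")
        # A missing count means a single atom, as in the 'C' of 'CH4'.
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def molecular_mass(formula):
    """Sum the average atomic masses weighted by atom count."""
    return sum(ATOMIC_MASS[s] * n for s, n in parse_formula(formula).items())


# Illustrative input: aspirin, C9H8O4.
aspirin = "C9H8O4"
print(parse_formula(aspirin), round(molecular_mass(aspirin), 3))
```

A production system would compute these values from the sketched structure itself rather than from a formula string, which is why a full cheminformatics toolkit is needed in practice.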

In that same prototype, Luma also runs additional algorithms against the same structure. These search a database of billions of purchasable molecules to show the chemist whether the compound they are considering synthesizing can be bought externally. Luma can also show similar structures (close but non-identical) from both external and internal sources.
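A common way to rank "close but non-identical" structures is Tanimoto similarity between molecular fingerprints. The article does not say which similarity measure Luma uses, so treat the following as a sketch under that assumption. Fingerprints are modeled as sets of "on" bit positions, and the library names and bit values are made up for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_structures(query, library, threshold=0.5, top_k=5):
    """Return (name, score) pairs at or above the threshold, best first.

    Entries whose fingerprint equals the query (score 1.0) are skipped,
    as a stand-in for "non-identical"; real systems compare structures directly.
    """
    hits = []
    for name, fingerprint in library.items():
        score = tanimoto(query, fingerprint)
        if threshold <= score < 1.0:
            hits.append((name, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_k]


# Hypothetical fingerprints from internal and external (purchasable) sources.
query = frozenset({1, 4, 7, 9})
library = {
    "in-house-001": frozenset({1, 4, 7, 9}),
    "in-house-002": frozenset({1, 4, 7, 9, 15}),
    "vendor-A": frozenset({1, 4, 7, 12}),
    "vendor-B": frozenset({1, 4, 20, 21, 22}),
}
print(similar_structures(query, library))
```

A linear scan like this only suits small sets. Searching billions of compounds in under a second requires an indexed, in-memory service of the kind described later in this post.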

Plus, Luma can calculate predicted activities for compounds using neural network models built on the customer’s local private data. This is an important element of Dotmatics’ AI roadmap, because customer data and the AI models built on that data will always remain private to that customer. To create a much broader pool of data, we can also fold in large public datasets to augment those local private datasets, similar to what ChatGPT does.

Best in Breed Technology Under the Hood

The technical implementation of these additional algorithms leans on AWS hardware and Databricks to deliver a scalable cloud-based solution.

To search billions of compounds, we use a highly scalable AWS in-memory database, so queries for chemical similarity against 10B+ compounds take under one second.

The neural network models used to predict activity are built by tokenizing the molecules using Dotmatics’ proprietary chemical toolkit, then training the models using TensorFlow within Databricks. TensorFlow was originally developed by the Google Brain team and released as open source. It is highly scalable and makes effective use of any CPU or GPU resources available on the AWS hardware Databricks runs on, all of which is delivered to Dotmatics customers through the Luma platform.

Feature Engineering is a Must Have

In high-complexity areas like drug discovery, feature engineering is an essential requirement of any AI solution. Feature engineering simply means adding human knowledge to guide the AI towards better models. It’s what raises a generic AI method to an application-specific one. In ChatGPT, feature engineering is done by “tokenizing words.” Tokenization is the process of converting a sequence of text into smaller parts, known as tokens. For small molecules within AI by Dotmatics, feature engineering consists of tokenizing molecules.

A number of tokenizer options are available for molecules in Dotmatics, but as most ELN sketches done by chemists are 2D (flat) structures, Morgan-style fingerprints are a reasonable starting point. Additional tokens based on properties like molecular weight or a logP estimate can give the neural network further information to train on.
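As a rough illustration of how a fingerprint plus property tokens can become one numeric input vector, the sketch below folds substructure features into a fixed-length bit vector and appends scaled molecular weight and logP. It is not Dotmatics' tokenizer: the feature strings stand in for the atom environments a chemical toolkit would enumerate, and the names `fold_features` and `tokenize_molecule`, the 64-bit length, and the scaling constants are all assumptions.

```python
import hashlib


def fold_features(features, n_bits=64):
    """Hash each substructure feature into one position of a fixed-length bit vector."""
    bits = [0.0] * n_bits
    for feature in features:
        # A stable hash keeps the feature-to-bit mapping identical across runs,
        # unlike Python's built-in hash(), which is randomized for strings.
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:8], "big") % n_bits] = 1.0
    return bits


def tokenize_molecule(features, mol_weight, logp, n_bits=64):
    """Build the model input: fingerprint bits followed by two scaled property tokens."""
    vector = fold_features(features, n_bits)
    vector.append(mol_weight / 500.0)  # assumed scale, keeps typical values near 0 to 1
    vector.append(logp / 5.0)          # assumed scale for a logP estimate
    return vector


# Hypothetical atom-environment strings for one sketched molecule.
example_features = ["c:aromatic-6", "C(=O)O", "C(=O)OC", "c-O"]
vector = tokenize_molecule(example_features, mol_weight=180.16, logp=1.2)
print(len(vector), int(sum(vector[:64])))
```

Folding into a fixed length means different features can collide on the same bit, which is the usual trade-off of hashed fingerprints: a longer vector reduces collisions at the cost of a larger model input.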

Neural network models in TensorFlow can be trained taking tokenized molecules as inputs and known activities (e.g., pIC50) as outputs. The model is trained to “see” the patterns in the input data; when a molecule of unknown activity passes through, it returns a prediction. Showing the compound a chemist is making in the context of similar chemistry within the customer’s intellectual property, plus similar chemistry from external purchasable sets, enables chemists to decide whether to continue with the synthesis in the lab or buy a compound instead. If a compound can be purchased from an external supplier, it is highly likely that many variants can be purchased directly too.
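The training loop described above can be sketched without any framework. The example below is a tiny one-hidden-layer network written in plain Python so that it runs anywhere; it is not the TensorFlow model Dotmatics trains, and the 4-bit fingerprints and pIC50 values are invented purely for illustration. For reference, pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L, so higher values mean more potent compounds.

```python
import math
import random


def forward(model, x):
    """Return hidden activations and the predicted activity for one input vector."""
    w1, b1, w2, b2 = model
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2


def predict(model, x):
    return forward(model, x)[1]


def train_mlp(inputs, targets, hidden=4, epochs=500, lr=0.05, seed=0):
    """Fit a one-hidden-layer regression network with plain stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = sum(targets) / len(targets)  # start predictions at the mean activity
    for _ in range(epochs):
        for x, y in zip(inputs, targets):
            h, pred = forward((w1, b1, w2, b2), x)
            err = pred - y  # gradient of 0.5 * (pred - y)**2 with respect to pred
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return w1, b1, w2, b2


def mse(model, inputs, targets):
    """Mean squared error of the model over a dataset."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(inputs, targets)) / len(targets)


# Made-up training set: 4-bit fingerprints with invented pIC50 labels.
inputs = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1],
          [0, 1, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]]
targets = [7.0, 7.2, 5.1, 5.0, 6.9, 5.2]

model = train_mlp(inputs, targets)
print("training MSE:", mse(model, inputs, targets))
print("prediction for an unseen fingerprint:", predict(model, [1, 1, 1, 0]))
```

In a real project the same idea would be expressed in a few lines of a deep learning framework, trained on far more compounds, and evaluated on held-out data rather than on the training set.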

This is critical because the ideal goal in drug discovery is to identify a developable series, not simply find one compound that is active. Discovering a family of compounds ensures that if one compound fails there is a supply of active “backup” compounds to go to next. Finding activity in purchasable compounds significantly reduces discovery time, cost and effort.

Small Molecules are Just the Start

Small molecules are the perfect starting point for these types of research programs and functionality pilots within the Luma platform, but this is only the beginning. Consider the possibility of modeling other domains of information, such as DNA-encoded libraries, formulations, and images.

Scientific acumen, world-class technology infrastructure, and domain-specific feature engineering are part of what separates Dotmatics solutions from more general off-the-shelf R&D approaches.

Learn more about Dotmatics Luma and how the scientific intelligence platform enables AI capabilities to accelerate scientific discovery.
