Dotmatics
What’s Complicating Good Data Practices and Data Integrity?


Data integrity is an ongoing concern across all R&D organizations, no matter what part of the research lifecycle they’re navigating. The concern extends beyond the potential for delayed timelines or cost overruns to something bigger: establishing a culture of quality; ensuring product efficacy and patient safety; and being a trusted brand, partner, or provider.

Prioritizing Data Integrity in the Lab 

Good data practices throughout the R&D process can positively impact data integrity in the lab. Companies must be able to defend the fidelity and confidentiality of all records and data generated throughout a product’s entire lifecycle, starting with the earliest points in research, including raw data, metadata, and transformed data. To do this, companies must have the right processes and technologies in place to ensure proper:

  • Data integrity - How is the completeness, consistency, validity, and accuracy of data impacted by the way it is produced, captured, quality checked, transformed, and traced?

  • Data governance - How does the company manage and track who has access to what data, via what means, how it is used, and to what degree?

  • Data security - How is data encrypted, transferred, stored, and backed up?

These factors, each challenging in its own right, are all intertwined, adding to the complexity of upholding good data practices in the modern lab.
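To make the data integrity question above more concrete, here is a minimal sketch of fingerprinting raw data at capture and re-checking it later. It is illustrative only and does not describe any particular product: the names `CaptureRecord`, `capture`, and `verify` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CaptureRecord:
    """Hypothetical record stored alongside raw data at capture time."""
    name: str
    sha256: str
    size_bytes: int
    captured_at: str  # ISO 8601 timestamp in UTC
    captured_by: str


def capture(name: str, payload: bytes, user: str) -> CaptureRecord:
    """Fingerprint the raw bytes at the moment they are captured."""
    return CaptureRecord(
        name=name,
        sha256=hashlib.sha256(payload).hexdigest(),
        size_bytes=len(payload),
        captured_at=datetime.now(timezone.utc).isoformat(),
        captured_by=user,
    )


def verify(record: CaptureRecord, payload: bytes) -> bool:
    """Return True only if the bytes match what was originally captured."""
    return (
        len(payload) == record.size_bytes
        and hashlib.sha256(payload).hexdigest() == record.sha256
    )


# Made-up example data, for illustration only.
raw = b"well,absorbance\nA1,0.412\nA2,0.398\n"
record = capture("plate_001.csv", raw, user="analyst_1")

print(verify(record, raw))                               # prints True
print(verify(record, raw.replace(b"0.398", b"0.498")))   # prints False
```

A check like this only detects that data changed after capture; it says nothing about who changed it or why, which is where governance and audit trails come in.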

A Shifting Data Management Landscape 

As R&D organizations digitize their data to make analytics at scale possible, best practices for data management must also evolve. Teams must have clear strategies for identifying and mitigating threats to data integrity, including technological, managerial, and external risks. This is no small task. In fact, in the realm of pharmaceuticals, the U.S. Food and Drug Administration (FDA) reports increasing data integrity violations in recent years.[1-3] Data integrity is at risk in many cases because the complexity of R&D data, processes, and technologies presents numerous opportunities for good data practices to go awry.

The most common types of warnings and violations cited by the FDA include data loss; missing metadata; non-contemporaneous collection or backdating; data deletion and copying; sample elimination or reprocessing; poorly investigated out-of-specification results; data access and security issues; and inadequate or disabled audit trails.[1-3] Missteps like these at any point in the R&D process can undermine the validity of the research as a whole. Data integrity and security breaches could lead to incorrect or irreproducible research results, have implications for patient safety and product efficacy, or generate violations that might cause a drug to be rejected at submission or pulled from the market later.[1]

Factors Impacting Good Data Practices

With stakes so high, it’s important to assess what’s inhibiting a culture of data quality. Three key factors complicating good data practices include:

  • Multimodal R&D creates huge volumes of disparate data that need proper handling.

  • Increased collaboration means data are shared more widely, and that sharing must be done with security and privacy in mind.

  • Artificial intelligence (AI) is changing how data are used to drive innovation. 

Let’s explore each factor. 

1. Multimodal R&D

Companies hoping to drive innovation are diversifying their R&D efforts and working across different areas of science with novel modalities. As a result, data are pouring in from wide-ranging sources via different means and in different formats. An organization or institution may have several different internal research groups collecting data from thousands of pieces of specialty equipment or instruments; in parallel, it could also be undertaking complex post-acquisition or legacy-data migration activities, all while working with multiple external CROs who have their own distinct systems and processes. All of these different data come from teams that work not only across different modalities and specialty areas of science, but also across different locations globally, each with its own compliance standards and regulations. This incredible volume and diversity of multimodal R&D data create lab integration and data management challenges that can risk compromising data integrity and security. Many companies are struggling to keep pace with the vast volume of diverse data and metadata needed to inform decision making throughout the R&D process.

2. Collaboration

Ensuring the success of R&D at scale means improving data flow between research groups so they can build on their collective knowledge. The importance of data sharing in advancing science was recently underscored by the United States National Institutes of Health (NIH), which established new 2023 data management and sharing policies to confirm findings, encourage reuse, and spur innovation.[4] Whether it’s chemists and biologists collaborating on chemically modified biologics, or internal and external partners working on projects across modalities and diseases, teamwork is more important than ever; unfortunately, it’s not always easy.

Many R&D groups, which have long worked in relative isolation, are now required to collaborate and share data, which requires shifts in mindset and culture. It also requires a shift in governance and execution. Bespoke and insulated research teams don’t have the systems and processes in place to share and hand off well-annotated data while at the same time controlling access, tracking changes, and ensuring good data practices are followed by all participants and collaborators. For many companies, it’s hard to facilitate efficient and secure data sharing that doesn’t compromise data integrity. Even the most erudite collaborators interact with instruments, software, workflows, and data types in ways that don’t align with each other, which complicates collaboration. Structured and unstructured data end up scattered in multiple repositories and across different mediums rather than within a secure, centralized, standardized data pool that appropriate collaborators can access and that leverages a well-defined data governance framework.

Data sharing challenges are growing so common that they’ve prompted calls to establish better data management standards. One well-known example is the FAIR guiding principles for scientific data management, which promote the adoption of technology and processes that make all data findable, accessible, interoperable, and reusable by both humans and machines.[5] Becoming FAIR compliant requires changes in the format, model, and storage of data, as well as in the ways that instruments, software, and systems are integrated. While this can seem overwhelming, the change can be made incrementally; it’s not an all-or-nothing proposition. Whether a company is building a comprehensive FAIR-compliant informatics ecosystem or adopting a data analysis and graphing solution that embraces FAIR data principles, moves toward implementing FAIR-aligned methods can pay dividends in time savings, reproducibility of research, improved knowledge sharing, and AI-readiness.
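As a rough illustration of how an incremental move toward FAIR might start, the sketch below flags which of the four principles a dataset's metadata does not yet support. This is a deliberately simplified stand-in: the published FAIR principles are far more detailed, and the `DatasetMetadata` fields and the `fair_gaps` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str = ""    # persistent ID, supports "findable"
    keywords: List[str] = field(default_factory=list)  # also "findable"
    access_url: str = ""    # where to retrieve it, supports "accessible"
    file_format: str = ""   # standard format, supports "interoperable"
    license: str = ""       # usage terms, supports "reusable"


def fair_gaps(meta: DatasetMetadata) -> List[str]:
    """List the FAIR principles this record does not yet support.

    Assumption: one or two populated fields per principle is treated
    as "good enough" here, which is a much cruder test than the
    published principles describe.
    """
    checks = {
        "findable": bool(meta.identifier and meta.keywords),
        "accessible": bool(meta.access_url),
        "interoperable": bool(meta.file_format),
        "reusable": bool(meta.license),
    }
    return [principle for principle, ok in checks.items() if not ok]


# A legacy file where only the format is known.
legacy = DatasetMetadata(file_format="csv")
print(fair_gaps(legacy))  # prints ['findable', 'accessible', 'reusable']
```

A report like this can be run across existing repositories to decide where to invest first, which fits the incremental approach described above.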

3. Artificial Intelligence

As AI arrives in R&D, organizations and institutions will need data infrastructures to capture and manage the proprietary data that will differentiate their research in an AI-everywhere world. For many universities and health companies, becoming AI-ready means first adopting technology and process changes to support exponential growth in data volumes, elimination of data silos, integration of bespoke software and systems, and normalization of data. The ultimate goal is that any data created and captured throughout the R&D process will be trustworthy, well-structured, correlated, shareable, and model-ready. While achieving these aligned data standards is uniquely challenging in scientific R&D because of the complexity of the workflows, data types, software, and systems, it is, nonetheless, essential.

Global compliance regulations are currently being updated to guide the use of AI and ML in medical and general research.[6-9] In March 2024, the EU passed an overarching Artificial Intelligence Act. This landmark law aims to protect human health, safety, and fundamental rights as AI is increasingly relied upon for innovation across a broad spectrum of industries, academia, government, and civil organizations.[9] Now is the time for companies to ensure that their existing systems and processes support the regulatory and ethical challenges of using AI in research, including assurance of data integrity, security, traceability, and bias limitation.

Good Data Practices 

Aligning data management and data integrity is vital to long-term research success and to preparing for the automated, connected, and collaborative future of research. Systems that support these imperatives include those that:

  • Support research transparency, credibility, and reproducibility by ensuring complete data capture.

  • Automate results and metadata collection from instruments and other lab systems wherever possible.

  • Tie and track results back to their precise samples and fully documented experiments. 

  • Aggregate all relevant R&D data into intelligent, correlated, model-ready data structures.

  • Give scientists tools to easily manage, search, and visualize their R&D data. 

  • Unite applications that produce and analyze data within one secure data-management platform.

  • Centralize and store data securely, with end-to-end encryption in transfer and at rest.

  • Configure checks-and-balances throughout the R&D process using features such as audit trails, QC/QA and SOP checks, signature requirements, permission and access controls to different data sets and functionality, project codes and aliases, encrypted reports, and secure dashboards.

Contact us to learn more about Dotmatics.

References

  1. Chen, S. Culture of Quality: Data Integrity and CGMP Compliance. U.S. Food and Drug Administration, SBIA Generic Drug Forum. April 26, 2022. (Accessed 02/06/2024)

  2. Neumeyer, M. Data Integrity: 2020 FDA Data Integrity Observations in Review. American Pharmaceutical Review. June 23, 2020.

  3. Vazquez, M.; Rayser, J. Regulatory warning letters in pharma: What can we learn post-COVID? Cleanroom Technology. July 27, 2022.

  4. 2023 NIH Data Management and Sharing Policy. National Institutes of Health. (Accessed 02/06/2024)

  5. Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data, 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18

  6. Using Artificial Intelligence & Machine Learning in the Development of Drug & Biological Products. Discussion Paper and Request for Feedback. U.S. Food & Drug Administration. 2023. (Accessed 02/06/2024)

  7. Artificial Intelligence in Drug Manufacturing. Discussion Paper. FDA Center for Drug Evaluation and Research. 2023. (Accessed 02/06/2024)
