Mestrelab Research presents SCI-DATA, a platform that accelerates health research through the advanced analysis of scientific data

The project is developing a platform capable of unifying, standardising and processing large volumes of scientific data, facilitating interoperability
It helps accelerate research processes, optimise analyses and reduce the time required to develop new treatments
Mestrelab Research presented the results of the SCI-DATA project on Thursday. The initiative focuses on developing an advanced and scalable technological solution for managing, processing and standardising large volumes of scientific data from multiple sources, with a particular emphasis on research in healthcare, pharmaceuticals, chemistry and the life sciences. Its aim is to address one of the major challenges currently facing these sectors: the difficulty of integrating and harnessing complex, heterogeneous scientific data, which in many cases remains fragmented or insufficiently interoperable, limiting its use in research and development.
The SCI-DATA project has enabled the development of a technology platform capable of unifying and processing large volumes of scientific data, and of automating the handling of that data, regardless of the source or analytical technology that generated it in the laboratory. These technologies include widely used research techniques such as nuclear magnetic resonance (NMR), mass spectrometry (LC/GC-MS) and various forms of spectroscopy, which are fundamental to chemical, pharmaceutical and biomedical analysis.
The presentation took place at the Mestrelab Research Centre (CIM) in Santiago de Compostela and brought together representatives of the organisations involved in the project, including Felipe Seoane, Software Development Director at Mestrelab; Agustín Barba, Software Development Director at SciY; Alejo Santolino, Data Legal Framework and Business Analysis Coordinator for the OneHealth Dataspace project at the Galician Supercomputing Centre (CESGA); Jorge Amor, Head of Data & AI Infrastructure at Gradiant; María Silveira, Innovation Project Manager at BIOGA; and Lucía Castro, Managing Director of DATAlife.
The challenge of spectral data in laboratory environments
One of the project’s main areas of focus has been the processing of spectral data in the pharmaceutical industry, a type of scientific information characterised by its high degree of complexity, volume and diversity of formats.
This data, which is commonly generated in laboratory environments, usually comes from a range of analytical technologies and is stored in heterogeneous formats, making it difficult to integrate, compare and reuse. SCI-DATA addresses this challenge through standardisation processes that transform this information into structured data ready to be used in research environments. As Felipe Seoane of Mestrelab explained: “The challenge was to enable data to be exported into other formats and to ensure that, during this export process, a schema was in place to support its interoperability and reusability.”
The technical solution therefore automates steps such as reading, analysing and converting data held in obsolete formats. It has also been designed in accordance with the FAIR principles (Findable, Accessible, Interoperable and Reusable), helping to ensure that scientific data can be used efficiently across different environments and by different organisations.
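As a rough illustration of what standardising a legacy export can involve, the following Python sketch reads a simple two-column text file and maps it onto a common record with explicit units and provenance. It is not the SCI-DATA implementation: the SpectrumRecord schema, its field names, the two-column input format and the identifier shown are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SpectrumRecord:
    """Hypothetical common schema for one spectrum (not the SCI-DATA schema)."""
    identifier: str      # stable ID so the record can be found again
    technique: str       # e.g. "NMR" or "LC-MS"
    x_unit: str          # explicit units make the values interoperable
    y_unit: str
    x: list[float]
    y: list[float]
    source_format: str   # provenance of the original file, useful for reuse


def parse_legacy_two_column(text: str) -> tuple[list[float], list[float]]:
    """Parse an assumed legacy export: one x/y pair per line, '#' starts a comment."""
    xs, ys = [], []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        # Accept either a comma or whitespace as the column separator.
        a, b = line.replace(",", " ").split()
        xs.append(float(a))
        ys.append(float(b))
    return xs, ys


def standardise(text, identifier, technique, x_unit, y_unit, source_format):
    """Convert the legacy text into the hypothetical common record."""
    xs, ys = parse_legacy_two_column(text)
    return SpectrumRecord(identifier, technique, x_unit, y_unit, xs, ys, source_format)


legacy = """# instrument export
1.00, 10.5
2.00  20.25
"""
record = standardise(legacy, "example-0001", "NMR", "ppm", "a.u.", "legacy-two-column")
payload = json.dumps(asdict(record), sort_keys=True)  # structured, machine-readable output
```

Once the data is in a single structured form such as the JSON payload above, spectra from different instruments can be validated, compared and exchanged with the same tooling, which is the kind of interoperability the FAIR principles describe.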
This approach contributes to improving data interoperability and advancing towards more connected data ecosystems, in which research can draw on integrated data sources rather than isolated systems.
A strategic project for the digitalisation of science
The project is aligned with European strategies for the digital transformation of strategic sectors, which aim to foster the creation of data spaces, facilitate federated data sharing and strengthen scientific and industrial competitiveness. Alejo Santolino of CESGA summed up this objective with a key idea: “Isolated data describes the past, but when we connect it, we create the future.”
SCI-DATA has been developed in collaboration with CESGA, the organisation behind the One Health DATAlife Multisectoral Data Space Demonstrator, and with the involvement of other academic, industrial and technology organisations, including Gradiant, Optimal and BIOGA.
The project has also received funding from the Spanish Ministry for Digital Transformation and Public Administration under the Recovery, Transformation and Resilience Plan, which is funded by the European Union through NextGenerationEU. The funding was awarded as part of the 2024 call for grants in the field of digitalisation, which supports the digital transformation of strategic productive sectors through the development of technological products and services for data spaces.
