What is INTERSECT?
The Interconnected Science Ecosystem (INTERSECT) connects scientific instruments and robot-controlled laboratories with computing and data resources at the edge, in the cloud, or at a high-performance computing center. This enables autonomous experiments, self-driving laboratories, smart manufacturing, and artificial-intelligence-driven design, discovery, and evaluation. Its novel architecture comprises science use case design patterns, a system-of-systems architecture, and a microservice architecture. A Software Development Kit (SDK) implements this architecture.
What is the INTERSECT Scientific Data Layer (SDL)?
The INTERSECT Scientific Data Layer (SDL) is a system-of-systems architecture that enables seamless scientific data management using semantic Web technologies. Built with a microservices approach and linked data principles, the SDL empowers researchers, developers, and institutions to manage, explore, and interlink scientific knowledge with precision and purpose.
Why is scientific data management challenging?
Modern scientific research generates vast, heterogeneous, and often siloed datasets. These datasets vary in structure, format, and provenance, making them hard to:
- Discover and access across domains
- Reuse or reproduce reliably
- Integrate into collaborative workflows
- Annotate with consistent metadata
- Track across evolving experiments or publications
Traditional data repositories often fail to meet the needs of interdisciplinary teams working with complex data life cycles and distributed infrastructures.
How does the SDL improve scientific data management?
The SDL introduces a semantic foundation for managing scientific data:
- Linked Data: Every resource is URI-addressable, typed, and queryable.
- Ontologies: The SDL supports domain-specific vocabularies (e.g., Data Catalog Vocabulary (DCAT), Semantic Sensor Network Ontology (SSN), PROV Ontology (PROV-O), Document Components Ontology (DoCO)) for meaningful classification.
- Modular Microservices: Each capability (cataloging, storage, provenance, content management) is handled by a dedicated service.
- Workspace Model: Researchers organize content semantically within shareable, customizable spaces.
- Dynamic User Interface: Interfaces render data and metadata based on RDF types, enabling flexible, reusable tools.
These features bridge the gap between raw data and curated knowledge.
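To make "URI-addressable, typed, and queryable" concrete, here is a minimal sketch that models resources as subject-predicate-object triples in plain Python. The vocabulary URIs (RDF, DCAT, Dublin Core Terms, PROV-O) are the standard ones; the `https://example.org/...` resource URIs, the title, and the helper names `match` and `resources_of_type` are hypothetical and not part of the SDL. A real deployment would likely use a triple store and SPARQL rather than an in-memory set.

```python
# Standard vocabulary terms.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_TITLE = "http://purl.org/dc/terms/title"
PROV_ACTIVITY = "http://www.w3.org/ns/prov#Activity"
PROV_WAS_GENERATED_BY = "http://www.w3.org/ns/prov#wasGeneratedBy"

# Hypothetical resource URIs, used only for illustration.
DATASET = "https://example.org/datasets/scan-001"
ACTIVITY = "https://example.org/activities/run-42"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (DATASET, RDF_TYPE, DCAT_DATASET),
    (DATASET, DCT_TITLE, "Example scan 001"),
    (DATASET, PROV_WAS_GENERATED_BY, ACTIVITY),
    (ACTIVITY, RDF_TYPE, PROV_ACTIVITY),
}


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def resources_of_type(type_uri):
    """Return the URIs of all resources declared with the given RDF type."""
    return [subject for subject, _, _ in match(p=RDF_TYPE, o=type_uri)]
```

Because the dataset and the activity that produced it are both addressed by URI, following the provenance link is just another pattern match: `match(s=DATASET, p=PROV_WAS_GENERATED_BY)`.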
Supporting Findable, Accessible, Interoperable, and Reusable (FAIR) Data Principles
The SDL is designed to support the FAIR data principles:
- Findable: Rich metadata with persistent Uniform Resource Identifiers (URIs), indexed by class and context.
- Accessible: Content is served through HTTP using standard RDF serializations and application programming interfaces (APIs).
- Interoperable: Data is aligned with shared ontologies and stored in machine-readable formats.
- Reusable: Provenance, annotations, and versioning support reproducibility and long-term value.
By embedding FAIR principles into the platform's architecture, the SDL enables more collaborative, transparent, and durable science.
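One common way to serve the same resource in several RDF serializations over HTTP is content negotiation on the `Accept` header. The sketch below assumes the SDL does something along these lines; the source does not say so. The function name `negotiate` and the choice of Turtle as the default are hypothetical. The media types `text/turtle` and `application/ld+json` are the registered ones for Turtle and JSON-LD. Only exact matches and `*/*` are handled, so partial wildcards such as `application/*` are ignored.

```python
# Serializations this hypothetical service can produce, in order of preference.
SUPPORTED = ("text/turtle", "application/ld+json")


def negotiate(accept_header, supported=SUPPORTED):
    """Pick a supported media type for an HTTP Accept header, or None."""
    if not accept_header.strip():
        # A missing or empty Accept header means any type is acceptable.
        return supported[0]

    candidates = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        # Sort by descending quality, then by order of appearance.
        candidates.append((-quality, position, media_type))

    for negative_quality, _, media_type in sorted(candidates):
        if negative_quality == 0:
            continue  # q=0 explicitly rejects this type
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return supported[0]
    return None
```

A caller would typically answer with HTTP status 406 (Not Acceptable) when `negotiate` returns `None`.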
Frontend Overview
The SDL frontend is a modern, workspace-driven interface built with SvelteKit. Inspired by tools like Notion, it supports semantic blocks, multiple workspaces, and dynamic user interface rendering based on RDF data types. It integrates Storybook for component documentation and supports visual workflows and ontology-guided interfaces.
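Rendering based on RDF types can be pictured as a lookup from type URI to renderer, with a generic fallback. The sketch below shows that dispatch idea in Python for brevity; the actual frontend is built with SvelteKit, and every name here (`RENDERERS`, `renderer_for`, `render`, the resource dictionary layout, the output strings) is hypothetical rather than taken from the SDL code base.

```python
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
PROV_ACTIVITY = "http://www.w3.org/ns/prov#Activity"

# Registry mapping an RDF type URI to the function that renders it.
RENDERERS = {}


def renderer_for(type_uri):
    """Decorator that registers a renderer for one RDF type."""
    def register(func):
        RENDERERS[type_uri] = func
        return func
    return register


@renderer_for(DCAT_DATASET)
def render_dataset(resource):
    return f"[Dataset card] {resource['title']}"


@renderer_for(PROV_ACTIVITY)
def render_activity(resource):
    return f"[Activity timeline] {resource['title']}"


def render_generic(resource):
    """Fallback used when no registered renderer matches."""
    return f"[Generic block] {resource['title']}"


def render(resource):
    """Render a resource with the first renderer matching one of its types."""
    for type_uri in resource["types"]:
        if type_uri in RENDERERS:
            return RENDERERS[type_uri](resource)
    return render_generic(resource)
```

With this shape, supporting a new ontology class means registering one more renderer; resources of unknown types still display through the generic fallback.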
Backend Overview
The backend is composed of Python microservices that follow the Linked Data Platform (LDP) standard. Key services include catalog, registry, repository, storage, and workspace services. The backend works with RDF data, including the JSON-LD and Turtle serializations, and uses Blazegraph, MinIO, and SQL/NoSQL stores.
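In LDP, a container is a resource that lists its members through the `ldp:contains` predicate, and new members are created by POSTing to the container. The following in-memory sketch illustrates that model and its JSON-LD representation. The `BasicContainer` class, its method names, and the `https://example.org/...` URIs are hypothetical and do not describe the SDL services themselves; only the two LDP vocabulary URIs come from the W3C specification.

```python
import json

# Terms from the W3C Linked Data Platform vocabulary.
LDP_BASIC_CONTAINER = "http://www.w3.org/ns/ldp#BasicContainer"
LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains"


class BasicContainer:
    """In-memory stand-in for an LDP Basic Container (illustration only)."""

    def __init__(self, uri):
        self.uri = uri.rstrip("/") + "/"  # container URIs end with a slash here
        self.members = []

    def post(self, slug):
        """Mimic an HTTP POST: create a member and return its new URI."""
        member_uri = self.uri + slug
        if member_uri in self.members:
            raise ValueError(f"member already exists: {member_uri}")
        self.members.append(member_uri)
        return member_uri

    def to_jsonld(self):
        """Describe the container and its members as a JSON-LD document."""
        return {
            "@id": self.uri,
            "@type": [LDP_BASIC_CONTAINER],
            LDP_CONTAINS: [{"@id": member} for member in self.members],
        }


# Hypothetical catalog container holding two dataset records.
catalog = BasicContainer("https://example.org/catalog")
first = catalog.post("scan-001")
second = catalog.post("scan-002")
document = json.dumps(catalog.to_jsonld(), indent=2)
```

In an HTTP service, `post` would correspond to a POST request carrying a `Slug` header, and `to_jsonld` to a GET on the container with `Accept: application/ld+json`.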