
Pristine Seas Science Database
The Pristine Seas Science Database is a centralized system for ecological data collected across more than a decade of scientific expeditions led by National Geographic Pristine Seas. It enables high-integrity, reproducible research on marine biodiversity and informs global ocean conservation policy.
40+ expeditions • All major ocean basins • Multiple survey methods • Standardized taxonomy • Analysis-ready
Why This Matters
Conservation begins with knowledge. The Pristine Seas Science Database is a globally unique resource that answers foundational questions about ocean health, biodiversity, and ecosystem change.
Spanning all major ocean basins across 40+ expeditions, it brings together standardized data from Arctic fjords to tropical reefs, and from coastal bays to deep-sea trenches and pelagic zones. Its strength lies in breadth and integration: methods ranging from seabird surveys and submersible dives to eDNA and benthic cover are unified by shared spatial and taxonomic frameworks.
This database enables:
- Fast, reproducible analysis across sites, regions, and years
- Cross-method synthesis through a modular yet consistent structure
- Scalable science that informs conservation, policy, and decision-making
- Global comparisons of pristine and impacted marine ecosystems
Key Features
🔬 Modular by Method
Each survey protocol (reef fish, benthic cover, eDNA, baited remote underwater video systems (BRUVS), submersibles, and others) maintains its own standardized schema while sharing spatial and taxonomic keys with the rest of the database.
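To illustrate the idea of per-method schemas that still share common keys, here is a minimal Python sketch. The column names (`station_id`, `taxon_id`, `percent_cover`, and so on) and the two example methods are assumptions made for illustration; they are not taken from the actual database schemas.

```python
# Hypothetical key columns assumed to be shared by every method table.
SHARED_KEYS = {"station_id", "taxon_id"}

# Hypothetical per-method schemas (column name -> expected Python type).
SCHEMAS = {
    "fish": {
        "station_id": str,
        "taxon_id": str,
        "count": int,
        "total_length_cm": float,
    },
    "benthic_cover": {
        "station_id": str,
        "taxon_id": str,
        "percent_cover": float,
    },
}


def missing_shared_keys(method: str) -> set:
    """Return shared key columns that a method schema fails to declare."""
    return SHARED_KEYS - set(SCHEMAS[method])


def validate_row(method: str, row: dict) -> list:
    """Return a list of problems found in one record (empty if valid)."""
    schema = SCHEMAS[method]
    problems = [f"missing column: {col}" for col in schema if col not in row]
    problems += [f"unexpected column: {col}" for col in row if col not in schema]
    problems += [
        f"wrong type for {col}: expected {expected.__name__}"
        for col, expected in schema.items()
        if col in row and not isinstance(row[col], expected)
    ]
    return problems


# Placeholder records, not real survey data.
good = {"station_id": "ST_1", "taxon_id": "TX_1", "percent_cover": 42.5}
bad = {"station_id": "ST_1", "percent_cover": "high"}

good_problems = validate_row("benthic_cover", good)
bad_problems = validate_row("benthic_cover", bad)
```

Each method keeps its own measurement columns, while the shared keys are what would let tables from different methods be joined to the same stations and taxa.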
🌍 Spatially Anchored
Built on a hierarchical spatial model: expedition → region → subregion → site → station. This structure enables robust spatial integration and filtering across all datasets.
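The hierarchy above can be sketched as a small Python data structure. This is an illustration under stated assumptions, not the database's actual implementation: the `Station` class, the `filter_stations` helper, and all identifiers in the sample records are hypothetical.

```python
from dataclasses import dataclass

# Level names follow the hierarchy described above, from broadest to finest.
LEVELS = ("expedition", "region", "subregion", "site", "station")


@dataclass(frozen=True)
class Station:
    expedition: str
    region: str
    subregion: str
    site: str
    station: str

    def key(self, level: str) -> tuple:
        """Return the path from expedition down to the requested level."""
        depth = LEVELS.index(level) + 1
        return tuple(getattr(self, name) for name in LEVELS[:depth])


def filter_stations(stations, **criteria):
    """Keep stations whose hierarchy fields match every given criterion."""
    unknown = set(criteria) - set(LEVELS)
    if unknown:
        raise ValueError(f"Unknown hierarchy level(s): {sorted(unknown)}")
    return [
        s for s in stations
        if all(getattr(s, level) == value for level, value in criteria.items())
    ]


# Illustrative placeholder records, not real expedition data.
stations = [
    Station("EXP_A", "Region 1", "North", "SITE_01", "SITE_01_fish"),
    Station("EXP_A", "Region 1", "South", "SITE_02", "SITE_02_fish"),
    Station("EXP_B", "Region 2", "East", "SITE_01", "SITE_01_bruvs"),
]

north = filter_stations(stations, expedition="EXP_A", subregion="North")
site_keys = {s.key("site") for s in stations}
```

Building keys from the full path means a site label reused on two expeditions (`SITE_01` here) still resolves to two distinct sites.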
🐟 Taxonomically Standardized
Centralized taxonomy with harmonized species names, ecological traits, and functional groups. Based on the World Register of Marine Species (WoRMS), with expert curation for regional accuracy.
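As a rough sketch of what name harmonization against a central lookup can look like, the Python example below maps raw field names to an accepted name and a trait. The lookup entries are placeholders (not real WoRMS records), and the `normalize` and `harmonize` helpers and the `trophic_group` field are hypothetical names chosen for illustration.

```python
def normalize(name: str) -> str:
    """Collapse whitespace and case so lookups tolerate messy field entries."""
    return " ".join(name.split()).lower()


# Hypothetical lookup table: normalized raw name -> accepted name plus a trait.
# Entries are placeholders, not real WoRMS records.
TAXONOMY = {
    normalize("Genus alpha"): {
        "accepted_name": "Genus alpha",
        "trophic_group": "herbivore",
    },
    # A misspelled variant that resolves to the same accepted name.
    normalize("Genus alfa"): {
        "accepted_name": "Genus alpha",
        "trophic_group": "herbivore",
    },
    normalize("Othergenus beta"): {
        "accepted_name": "Othergenus beta",
        "trophic_group": "top predator",
    },
}


def harmonize(raw_names):
    """Split raw names into matched records and names needing expert review."""
    matched, unmatched = [], []
    for raw in raw_names:
        record = TAXONOMY.get(normalize(raw))
        if record is None:
            unmatched.append(raw)
        else:
            matched.append({"raw_name": raw, **record})
    return matched, unmatched


matched, unmatched = harmonize(["genus  ALFA", "Othergenus beta", "Unknown sp."])
```

Names that fail to match are returned separately rather than dropped, so they could be routed to the kind of expert curation described above.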
⚡ Analysis-Ready
Tidy-format tables, clear join keys, and native organization in Google BigQuery make the system efficient for large-scale ecological analyses.
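The value of tidy tables with clear join keys can be shown with a small, self-contained Python example. It uses an in-memory SQLite database from the standard library as a stand-in for BigQuery so that it runs anywhere. The table names, column names, and rows are all hypothetical and do not reflect the real dataset layout.

```python
import sqlite3

# In-memory SQLite stands in for the cloud warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stations (station_id TEXT PRIMARY KEY, site_id TEXT, depth_m REAL);
    CREATE TABLE taxa (taxon_id TEXT PRIMARY KEY, accepted_name TEXT);
    CREATE TABLE fish_observations (station_id TEXT, taxon_id TEXT, n_individuals INTEGER);
""")

# Placeholder rows, one observation per row (tidy format).
conn.executemany(
    "INSERT INTO stations VALUES (?, ?, ?)",
    [("ST_1", "SITE_01", 10.0), ("ST_2", "SITE_01", 20.0)],
)
conn.executemany(
    "INSERT INTO taxa VALUES (?, ?)",
    [("TX_1", "Genus alpha"), ("TX_2", "Othergenus beta")],
)
conn.executemany(
    "INSERT INTO fish_observations VALUES (?, ?, ?)",
    [("ST_1", "TX_1", 4), ("ST_2", "TX_1", 6), ("ST_2", "TX_2", 1)],
)

# Join observations to stations and taxa through their keys, then
# aggregate counts per site and accepted taxon name.
rows = conn.execute("""
    SELECT s.site_id, t.accepted_name, SUM(o.n_individuals) AS total_individuals
    FROM fish_observations AS o
    JOIN stations AS s ON o.station_id = s.station_id
    JOIN taxa AS t ON o.taxon_id = t.taxon_id
    GROUP BY s.site_id, t.accepted_name
    ORDER BY t.accepted_name
""").fetchall()
conn.close()
```

Because each table holds one kind of entity and carries explicit keys, the same join pattern would apply to any other method table that records `station_id` and `taxon_id`.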
🤝 Built for Collaboration
Transparent, well-documented, and modular. Designed for reuse, extension, and shared scientific workflows across institutions.
Common Applications
From baseline assessments to policy-relevant metrics, this database supports research such as:
Community Ecology
- Biodiversity baselines in remote ecosystems
- Body size distributions and fishing impacts
- Food web structure and trophic cascades
Conservation Science
- Shark and apex predator abundance
- Reef health and resilience indicators
- Marine protected area (MPA) performance and recovery trajectories
Comparative Analysis
- Pristine vs. impacted ecosystem contrasts
- Depth gradient patterns across oceans
- Cross-basin biodiversity synthesis
FAIR Data Principles
Built to global standards for open science:
- Findable: Persistent identifiers and standardized metadata
- Accessible: Cloud-hosted with SQL query access
- Interoperable: Tidy data, SI units, ISO standards
- Reusable: Documented, versioned, reproducible
Technical Foundation
Built on proven tools for scalable, reproducible marine science:
Infrastructure
- Google BigQuery — Petabyte-scale data warehouse
- WoRMS — Authoritative taxonomic backbone
- GitHub — Open-source code and workflows
Analysis Stack
- R/Tidyverse — Statistical computing and data pipelines
- SQL — Efficient queries across millions of records
- Quarto — Literate programming and documentation
Connect With Us
Questions about the data? Reach out to the Pristine Seas science team at marine.data.science@ngs.org
Ready to collaborate? We partner with researchers, institutions, and conservation organizations worldwide. Let’s discuss how this database can support your work.
Explore the code: Visit our GitHub repository for data pipelines, analysis templates, and quality control workflows.