Pristine Seas Science Database
The Pristine Seas Science Database is a centralized system for ecological data collected across more than a decade of scientific expeditions led by National Geographic Pristine Seas. It enables high-integrity, reproducible research on marine biodiversity and informs global ocean conservation policy.
Why This Matters
Conservation begins with knowledge. The Pristine Seas Science Database is a globally unique resource that answers foundational questions about ocean health, biodiversity, and ecosystem change.
It spans all major ocean basins across 40+ expeditions and brings together standardized data from Arctic fjords to tropical reefs, and from coastal bays to deep-sea trenches and pelagic zones. Its strength lies in breadth and integration: seabird surveys, submersible dives, eDNA, and benthic cover are all unified by shared spatial and taxonomic frameworks.
The system enables:
- Fast, reproducible analysis across sites, regions, and years
- Cross-method synthesis through modular yet consistent structure
- Scalable science that informs conservation, policy, and decision-making
Key Features
Modular by Method
Each survey protocol (reef fish, benthic cover, eDNA, BRUVS, submersibles, and others) maintains its own standardized schema while sharing the same spatial and taxonomic frameworks as every other method.
Spatially Anchored
Built on a hierarchical spatial model: expedition → region → subregion → site → station. This structure enables robust spatial integration and filtering across all datasets.
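As a rough illustration of how the hierarchy supports filtering, the sketch below models one station record with a field per level. The class, the field names, and the sample values are assumptions made for this example; they are not the database's actual schema or data.

```python
from dataclasses import dataclass


# Hypothetical station record: one field per hierarchy level.
# Field names are illustrative, not the database's real column names.
@dataclass(frozen=True)
class Station:
    expedition: str
    region: str
    subregion: str
    site: str
    station: str


def filter_stations(stations, **levels):
    """Return the stations that match every hierarchy level given as a keyword."""
    unknown = set(levels) - set(Station.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unknown hierarchy level(s): {sorted(unknown)}")
    return [
        s for s in stations
        if all(getattr(s, level) == value for level, value in levels.items())
    ]


# Placeholder records, invented for the example.
stations = [
    Station("EXP_A", "North", "Bay 1", "site_001", "site_001_10m"),
    Station("EXP_A", "North", "Bay 1", "site_001", "site_001_20m"),
    Station("EXP_A", "South", "Reef 2", "site_002", "site_002_10m"),
]

north = filter_stations(stations, region="North")
print([s.station for s in north])
```

Because every record carries all five levels, the same filter works at any depth, from a whole expedition down to a single station.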
Taxonomically Standardized
Centralized taxonomy with harmonized species names, ecological traits, and functional groups. Based on WoRMS with expert curation for regional accuracy.
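The idea of harmonizing names against a central taxonomy can be sketched as a lookup from raw field names to one accepted name and identifier. This is a minimal sketch only: the lookup table, the function name, the taxon names, and the ID values are placeholders invented for illustration, not real WoRMS records, and it does not describe how the database or the WoRMS API performs the matching.

```python
# Hypothetical lookup: normalized raw name -> (accepted name, aphia_id).
# Names and IDs are placeholders, not real WoRMS entries.
TAXON_LOOKUP = {
    "genus alpha": ("Genus alpha", 900001),
    "genus alfa": ("Genus alpha", 900001),      # misspelling from a field sheet
    "oldgenus alpha": ("Genus alpha", 900001),  # outdated synonym
    "genus beta": ("Genus beta", 900002),
}


def harmonize(raw_name):
    """Map a raw name to (accepted name, aphia_id), or None if unmatched."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(raw_name.strip().lower().split())
    return TAXON_LOOKUP.get(key)


print(harmonize("  Genus  ALFA "))
print(harmonize("Unknown sp."))
```

Returning None for unmatched names leaves them visible for expert review instead of silently guessing a match.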
Analysis-Ready
Tidy-format tables, clear join keys, and native organization in Google BigQuery make the system efficient for large-scale ecological analyses.
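To show what tidy tables with a shared join key make possible, the sketch below joins an observation table to a site table on ps_site_id and aggregates by region. It runs against an in-memory SQLite database purely as a stand-in for BigQuery; the table names, the columns other than ps_site_id and aphia_id, and all values are assumptions for this example.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery; tables and values are hypothetical.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sites (ps_site_id TEXT PRIMARY KEY, region TEXT);
CREATE TABLE fish_obs (ps_site_id TEXT, aphia_id INTEGER, n_ind INTEGER);
INSERT INTO sites VALUES ('site_001', 'North'), ('site_002', 'South');
INSERT INTO fish_obs VALUES
  ('site_001', 900001, 4),
  ('site_001', 900002, 1),
  ('site_002', 900001, 7);
""")

# One row per observation plus a shared key means a plain join is enough.
rows = con.execute("""
SELECT s.region, SUM(o.n_ind) AS total_ind
FROM fish_obs AS o
JOIN sites AS s USING (ps_site_id)
GROUP BY s.region
ORDER BY s.region
""").fetchall()

print(rows)
con.close()
```

The same pattern (join on the site key, then group) would apply to any method table that carries ps_site_id.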
Built for Collaboration
Transparent, well-documented, and modular. Designed for reuse, extension, and shared scientific workflows across institutions.
Getting Started
- Browse the documentation using the sidebar navigation
- Start with Architecture to understand the system structure
- Explore Method Datasets for specific survey protocols
- Review Taxonomy for species standardization
- Query data via the BigQuery Console
Common Applications
- Species richness and diversity analysis
- Biomass assessments by trophic group
- Temporal trends in coral cover
- Multi-method data integration
- Conservation priority identification
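As an example of the first application, species richness and Shannon diversity (H' = -Σ p_i ln p_i, where p_i is the proportion of individuals belonging to taxon i) can be computed from per-taxon counts at a site. The function name, the input layout, and the sample counts below are assumptions for illustration, not part of the database.

```python
import math
from collections import Counter


def richness_and_shannon(observations):
    """Return (richness, Shannon H') for an iterable of (taxon_id, count) pairs."""
    totals = Counter()
    for taxon_id, n in observations:
        if n > 0:  # zero counts add neither richness nor diversity
            totals[taxon_id] += n
    total = sum(totals.values())
    if total == 0:
        return 0, 0.0
    shannon = -sum((n / total) * math.log(n / total) for n in totals.values())
    return len(totals), shannon


# Placeholder counts for one site: two taxa with five individuals each.
obs = [(900001, 5), (900002, 5), (900001, 0)]
richness, h = richness_and_shannon(obs)
print(richness, round(h, 3))
```

With two equally abundant taxa, H' equals ln 2, which makes the example easy to check by hand.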
FAIR Data Principles
The database adheres to FAIR data principles, ensuring all records are:
Findable
Unique identifiers (ps_site_id, aphia_id) and rich metadata enable discovery and indexing
Accessible
Hosted in Google BigQuery with open protocols and tools for querying and download
Interoperable
Tidy data principles, SI units, ISO 8601 dates, and controlled vocabularies
Reusable
Comprehensive documentation, versioning, and modular design support transparency and replication
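Some of these conventions can be checked mechanically. The sketch below validates one record for the two identifiers named above and for an ISO 8601 calendar date. The record layout, the "date" field name, and the check itself are assumptions for illustration, not the database's actual validation rules.

```python
from datetime import date


def check_record(record):
    """Return a list of problems found in one observation record (a dict)."""
    problems = []
    if not record.get("ps_site_id"):
        problems.append("missing ps_site_id")
    if not isinstance(record.get("aphia_id"), int):
        problems.append("aphia_id must be an integer")
    try:
        # Accepts ISO 8601 calendar dates such as 2024-01-31.
        date.fromisoformat(str(record.get("date", "")))
    except ValueError:
        problems.append("date is not ISO 8601 (YYYY-MM-DD)")
    return problems


# Placeholder records, invented for the example.
good = {"ps_site_id": "site_001", "aphia_id": 900001, "date": "2024-01-31"}
bad = {"ps_site_id": "", "aphia_id": "900001", "date": "31/01/2024"}

print(check_record(good))
print(check_record(bad))
```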
Technical Foundation
The database is built on:
- Google BigQuery for scalable data storage and querying
- R/RStudio for data processing and analysis
- Quarto for documentation
- GitHub for version control
- WoRMS API for taxonomic standardization
Learn More
For questions about the database, contact the Pristine Seas data team or explore our GitHub repository.