Taxa Information

The taxa_info/ dataset is the taxonomic backbone of the Pristine Seas Science Database. It provides standardized species names, ecological traits, and functional classifications for all organisms recorded across expeditions—enabling consistent identification, trait-based analysis, and cross-method integration.

Why This Matters

Without standardized taxonomy, a species recorded as “Epinephelus polyphekadion” on one expedition and as “Camouflage grouper” on another cannot be matched, so its records cannot be compared. This dataset solves that problem and adds the ecological context needed for conservation science.


Three Essential Functions

Identity Resolution

Maps field identifications to accepted scientific names using authoritative sources (WoRMS, Coral Traits Database, FishBase). Handles synonyms, taxonomic revisions, and regional naming variations.
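
As an illustration of identity resolution, the sketch below maps the two spellings from the example above to one accepted record. The lookup table, the helper names (normalize, resolve), and the ID 900001 are assumptions made for this example; 900001 is a placeholder, not a real AphiaID, and the actual tables may resolve names differently.

```python
# Hypothetical lookup: normalized field name -> (placeholder ID, accepted name).
# 900001 is a made-up placeholder, not a real WoRMS AphiaID.
NAME_LOOKUP = {
    "epinephelus polyphekadion": (900001, "Epinephelus polyphekadion"),
    "camouflage grouper": (900001, "Epinephelus polyphekadion"),
}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so spelling variants share one key."""
    return " ".join(name.lower().split())


def resolve(field_name: str):
    """Return (accepted_aphia_id, accepted_name), or None if the name is unknown."""
    return NAME_LOOKUP.get(normalize(field_name))


records = ["Epinephelus polyphekadion", "Camouflage  grouper", "Unknown sp."]
resolved = [resolve(r) for r in records]
```

Unresolved names come back as None rather than raising, so they can be collected and reviewed by hand instead of being dropped silently.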

Ecological Context

Assigns trophic groups, functional roles, and life history traits. Enables analyses like “total predator biomass” or “herbivore diversity” without manual trait lookups.
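
A trait lookup keyed by accepted_aphia_id is what makes a summary such as “total predator biomass” a simple grouped sum. The sketch below assumes hypothetical IDs, taxon names, trophic group labels, and biomass values; none of them come from the database.

```python
from collections import defaultdict

# Hypothetical trait rows keyed by accepted_aphia_id (IDs and labels are placeholders).
TAXA = {
    1: {"accepted_name": "Taxon A", "trophic_group": "top_predator"},
    2: {"accepted_name": "Taxon B", "trophic_group": "herbivore"},
    3: {"accepted_name": "Taxon C", "trophic_group": "top_predator"},
}

# Hypothetical observations: (accepted_aphia_id, biomass in grams).
OBSERVATIONS = [(1, 1200.0), (2, 300.0), (3, 800.0), (2, 150.0)]


def biomass_by_trophic_group(observations, taxa):
    """Sum observed biomass per trophic group using the trait lookup."""
    totals = defaultdict(float)
    for aphia_id, biomass_g in observations:
        totals[taxa[aphia_id]["trophic_group"]] += biomass_g
    return dict(totals)


totals = biomass_by_trophic_group(OBSERVATIONS, TAXA)
```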

Data Integration

Provides the join key (accepted_aphia_id) that links observations across all survey methods to a single, authoritative taxonomic reference.
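
The join can be sketched as follows: observations from different survey methods carry only accepted_aphia_id, and the taxonomic reference supplies the name. The method labels ("uvs", "other"), the row layout, and the taxon names are assumptions made for this example.

```python
# Hypothetical observation tables from two survey methods; rows carry only the join key.
TABLES = {
    "uvs": [{"accepted_aphia_id": 1, "count": 4}, {"accepted_aphia_id": 2, "count": 1}],
    "other": [{"accepted_aphia_id": 2, "count": 3}, {"accepted_aphia_id": 3, "count": 2}],
}

# Hypothetical taxonomic reference: accepted_aphia_id -> accepted_name.
TAXA = {1: "Taxon A", 2: "Taxon B", 3: "Taxon C"}


def methods_per_taxon(tables, taxa):
    """Map each accepted_name to the sorted list of methods that recorded it."""
    seen = {}
    for method, rows in tables.items():
        for row in rows:
            name = taxa[row["accepted_aphia_id"]]
            seen.setdefault(name, set()).add(method)
    return {name: sorted(methods) for name, methods in seen.items()}


coverage = methods_per_taxon(TABLES, TAXA)
```

Because every method resolves through the same key, a taxon recorded by two methods appears once in the result with both methods listed.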


Dataset Structure

Each major taxonomic group has its own table, tailored to group-specific traits while following a consistent core schema:

taxa_info/
├── fish              # Reef and pelagic fishes (length-weight, trophic group)
├── benthos           # Sessile taxa (corals, algae, sponges, CCA)
├── inverts           # Mobile and sessile invertebrates (echinoderms, mollusks)
└── uvs_fish_codes    # Field code translator for UVS surveys

Core Schema

All taxa_info tables share these foundational fields:

Field                               Description
accepted_aphia_id                   WoRMS AphiaID (the canonical taxonomic key)
accepted_name                       Valid scientific name (Genus species)
rank                                Taxonomic rank (Species, Genus, Family)
kingdom to family                   Full taxonomic hierarchy
common_name                         Vernacular name(s) when widely used
trophic_group / functional_group    Ecological classification (group-specific)

Additional traits vary by group (e.g., length-weight parameters for fish, growth forms for corals).
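
One way to picture the core schema is as a typed record. The sketch below is an assumption-laden illustration: the class name TaxonRecord, the Python types, the optional defaults, and the placeholder ID are not taken from the database, and the intermediate hierarchy fields between kingdom and family are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Ranks listed in the core schema; the real tables may allow others.
ALLOWED_RANKS = {"Species", "Genus", "Family"}


@dataclass(frozen=True)
class TaxonRecord:
    """Hypothetical in-memory view of one core-schema row."""
    accepted_aphia_id: int
    accepted_name: str
    rank: str
    kingdom: str
    family: Optional[str] = None
    common_name: Optional[str] = None
    trophic_group: Optional[str] = None  # or functional_group, depending on the table

    def __post_init__(self):
        # Reject ranks outside the set named in the schema description.
        if self.rank not in ALLOWED_RANKS:
            raise ValueError(f"unexpected rank: {self.rank!r}")


record = TaxonRecord(
    accepted_aphia_id=900001,  # placeholder, not a real AphiaID
    accepted_name="Epinephelus polyphekadion",
    rank="Species",
    kingdom="Animalia",
    common_name="Camouflage grouper",
)
```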


Key Principles

WoRMS as Foundation

All taxa are anchored to the World Register of Marine Species (WoRMS) via accepted_aphia_id. This ensures:

  • Global interoperability with other marine datasets
  • Automatic updates as taxonomy evolves
  • Synonym resolution across field records and literature

Expert Curation

While WoRMS provides the taxonomic backbone, ecological traits are curated from:

  • Peer-reviewed literature (e.g., Parravicini et al. 2020 for fish trophic groups)
  • Regional expert knowledge (e.g., Pacific giant clam taxonomy)
  • Field validation by expedition scientists

Hierarchy and Aggregation

Observations at higher taxonomic ranks (genus, family) are explicitly flagged via the rank field. This supports flexible aggregation:

  • Species-level analysis when identification is certain
  • Genus-level summaries when identifications are incomplete
  • Family-level trends for broad ecological patterns
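
The three aggregation levels above can be sketched as follows. The rows, the taxon names, and the presence of separate genus and family columns are assumptions made for this example; only the rank field and its values come from the core schema.

```python
from collections import Counter

# Hypothetical rows: counts already joined to taxonomy fields (names are placeholders).
ROWS = [
    {"accepted_name": "Genusa alpha", "rank": "Species", "genus": "Genusa", "family": "Familyidae", "count": 5},
    {"accepted_name": "Genusa beta", "rank": "Species", "genus": "Genusa", "family": "Familyidae", "count": 2},
    {"accepted_name": "Genusa", "rank": "Genus", "genus": "Genusa", "family": "Familyidae", "count": 3},
    {"accepted_name": "Familyidae", "rank": "Family", "genus": None, "family": "Familyidae", "count": 1},
]


def aggregate(rows, level):
    """Sum counts at `level` ('genus' or 'family'), skipping rows with no value at that level."""
    totals = Counter()
    for row in rows:
        key = row[level]
        if key is not None:
            totals[key] += row["count"]
    return dict(totals)


# Species-level analysis keeps only records identified to species.
species_total = sum(r["count"] for r in ROWS if r["rank"] == "Species")
genus_totals = aggregate(ROWS, "genus")
family_totals = aggregate(ROWS, "family")
```

The family-only record is excluded from the genus summary because it has no genus value, but it still contributes to the family total.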

Table Documentation

Detailed schemas, trait sources, and field definitions for each taxonomic group:

  • Fish — Reef and pelagic fishes with length-weight parameters
  • Benthos — Corals, algae, sponges, and other sessile organisms
  • Invertebrates — Mobile and sessile invertebrates
  • UVS Fish Codes — Field code mapping for underwater visual surveys