Environmental DNA (eDNA)

Overview

The edna/ dataset contains all environmental DNA (eDNA) data collected during Pristine Seas expeditions. eDNA sampling provides a non-invasive method to assess marine biodiversity by detecting genetic material shed by organisms into the water column.

The eDNA protocol involves collecting water samples at standardized depths and locations, filtering them to capture genetic material, and preserving the filters for subsequent laboratory analysis. This approach enables detection of species that may be difficult to observe through traditional visual methods, including cryptic, rare, or elusive taxa.

The dataset structure captures the hierarchical nature of eDNA sampling:

  • Sites: Deployment locations where eDNA collection occurred
  • Stations: Depth-stratified sampling units within each site
  • Samples: Individual water samples with specific collection and processing metadata
  • Detections: Species identifications from laboratory analysis (future implementation)

edna/
├── sites         # Deployment-level metadata for eDNA collection locations
├── stations      # Depth-stratified sampling metadata
├── samples       # Individual water sample collection and processing data
└── detections    # Species detections from laboratory analysis (future)
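
To make the hierarchy concrete, the following minimal Python sketch nests sample IDs under stations and stations under sites, using only the key fields documented in the tables of this page (ps_site_id, ps_station_id, ps_sample_id). The function name build_hierarchy and the list-of-dicts representation are assumptions for illustration; they are not part of any Pristine Seas tooling.

```python
from collections import defaultdict


def build_hierarchy(stations, samples):
    """Nest sample IDs under their station, and stations under their site."""
    tree = defaultdict(lambda: defaultdict(list))
    # Register every station first so stations without samples still appear.
    for station in stations:
        tree[station["ps_site_id"]][station["ps_station_id"]]
    # Samples carry both parent keys (ps_site_id is denormalized onto samples).
    for sample in samples:
        tree[sample["ps_site_id"]][sample["ps_station_id"]].append(
            sample["ps_sample_id"]
        )
    return {site_id: dict(by_station) for site_id, by_station in tree.items()}


# Illustrative rows only; real rows carry many more fields.
stations = [
    {"ps_site_id": "PNG_2024_edna_001", "ps_station_id": "PNG_2024_edna_001_10m"},
    {"ps_site_id": "PNG_2024_edna_001", "ps_station_id": "PNG_2024_edna_001_20m"},
]
samples = [
    {
        "ps_site_id": "PNG_2024_edna_001",
        "ps_station_id": "PNG_2024_edna_001_10m",
        "ps_sample_id": "PNG_2024_edna_001_10m_r001",
    },
    {
        "ps_site_id": "PNG_2024_edna_001",
        "ps_station_id": "PNG_2024_edna_001_10m",
        "ps_sample_id": "PNG_2024_edna_001_10m_r002",
    },
]
hierarchy = build_hierarchy(stations, samples)
```

In this sketch the 20 m station appears with an empty sample list, which is one way to spot stations that have no sample rows yet.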

Tables

Sites (edna.sites)

The edna.sites table contains one row per eDNA sampling site. Each site represents a distinct point in space and time where eDNA was collected and serves as the primary spatial unit for eDNA fieldwork. Within a site, water may be sampled at several depth strata; each stratum is recorded as a station in the edna.stations table.

This table stores site-level metadata for all eDNA collection deployments, which are typically coordinated with other survey methods such as UVS or BRUVS. It follows the core site schema shared across all methods and adds eDNA-specific fields: target (taxonomic target), paired_site_id (link to a coordinated survey site), and site_photos (photo documentation). The table captures spatial metadata, sampling platform, habitat descriptors, and effort summaries (n_stations, n_samples, water_l).

Table 1: Schema for edna.sites: eDNA collection site metadata
Field           Type     Required  Description
ps_site_id      STRING   true      Unique site ID (exp_id_method_###), e.g., PNG_2024_edna_001
exp_id          STRING   true      Foreign key to expeditions.info
method          STRING   true      Field method used: edna
region          STRING   true      Broad geographic or administrative unit
subregion       STRING   true      Intermediate feature within the region
locality        STRING   false     Local named feature such as a village, bay, or reef
date            DATE     true      Date of eDNA collection in ISO 8601 format (YYYY-MM-DD)
time            TIME     true      Local time of collection start in 24-hour format (HH:MM:SS)
latitude        FLOAT    true      Site latitude (decimal degrees, WGS84)
longitude       FLOAT    true      Site longitude (decimal degrees, WGS84)
team_lead       STRING   true      Lead scientist or team leader for the eDNA collection
notes           STRING   false     Optional site-level comments or observations
habitat         STRING   true      Dominant habitat type (e.g., fore_reef, lagoon)
exposure        STRING   true      Wave and wind exposure (e.g., windward, leeward, lagoon)
bottom_type     STRING   false     Substrate type (e.g., rubble, consolidated reef, spur and groove)
target          STRING   true      Primary taxonomic target for eDNA analysis (e.g., fish, sharks, all_taxa)
paired_site_id  STRING   false     ID of coordinated survey site (e.g., UVS or BRUVS site at same location)
n_stations      INTEGER  true      Number of depth-stratified stations sampled at this site
n_samples       INTEGER  true      Total number of individual water samples collected
water_l         FLOAT    true      Total volume of water filtered across samples (L)
site_photos     BOOLEAN  false     TRUE if photo documentation exists for this site

eDNA-Specific Fields

target: The primary taxonomic focus for laboratory analysis, one of:

  • fish: General fish diversity assessment
  • sharks: Specialized primers for elasmobranch detection
  • all_taxa: Broad taxonomic coverage using universal primers
  • specific_taxa: Targeted detection of particular species or groups

paired_site_id: Links an eDNA site to the underwater visual survey (UVS) or BRUVS deployment conducted at the same location.
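
As an illustration of how these conventions could be checked before loading rows into edna.sites, here is a minimal Python sketch. It rests on several assumptions that the schema does not state outright: that exp_id is an uppercase three-letter code plus a four-digit year, that the site counter always has exactly three digits, and that the four target values listed above form a closed vocabulary. The helper name check_site and the paired ID used in the example (PNG_2024_uvs_001) are hypothetical.

```python
import re

# Assumed pattern: ISO3 code, 4-digit year, the literal method "edna", 3-digit counter.
SITE_ID_PATTERN = re.compile(r"^[A-Z]{3}_\d{4}_edna_\d{3}$")
# Assumption: the four documented target values form a closed vocabulary.
ALLOWED_TARGETS = {"fish", "sharks", "all_taxa", "specific_taxa"}


def check_site(row):
    """Return a list of human-readable problems found in one edna.sites row."""
    problems = []
    site_id = row.get("ps_site_id") or ""
    if not SITE_ID_PATTERN.match(site_id):
        problems.append(f"ps_site_id {site_id!r} does not match exp_id_method_###")
    elif not site_id.startswith(f"{row.get('exp_id')}_"):
        problems.append(f"ps_site_id {site_id!r} does not start with exp_id")
    if row.get("target") not in ALLOWED_TARGETS:
        problems.append(f"unexpected target {row.get('target')!r}")
    # paired_site_id is optional, but it should never reference the eDNA site itself.
    paired = row.get("paired_site_id")
    if paired is not None and paired == site_id:
        problems.append("paired_site_id points at the eDNA site itself")
    return problems


good_row = {
    "ps_site_id": "PNG_2024_edna_001",
    "exp_id": "PNG_2024",
    "target": "fish",
    "paired_site_id": "PNG_2024_uvs_001",  # hypothetical UVS site ID
}
bad_row = {"ps_site_id": "png_2024_edna_1", "exp_id": "PNG_2024", "target": "whales"}

good_problems = check_site(good_row)
bad_problems = check_site(bad_row)
```

Returning a list of problems rather than raising on the first one lets a loader report every issue in a row at once.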


Stations (edna.stations)

This table contains metadata for depth-stratified eDNA sampling stations. Each row represents a discrete sampling unit within a site, corresponding to a specific depth stratum where multiple replicate samples were collected.

Stations follow the standard Pristine Seas depth stratification and include summary metrics about collection effort. Key spatial and site metadata are denormalized here to minimize joins for common analyses.

Table 2: Schema for edna.stations: depth-stratified eDNA sampling metadata
Field            Type     Required  Description
ps_station_id    STRING   true      Unique station ID (ps_site_id_depth), e.g., PNG_2024_edna_001_10m
ps_site_id       STRING   true      Foreign key to edna.sites
exp_id           STRING   true      Expedition ID (ISO3_YEAR) - denormalized for performance
region           STRING   true      Region name - from edna.sites
subregion        STRING   true      Subregion name - from edna.sites
locality         STRING   false     Locality name - from edna.sites
habitat          STRING   true      Habitat type - from edna.sites
collectors       STRING   true      Team members who collected samples (pipe-delimited)
depth_m          FLOAT    true      Target sampling depth (meters)
depth_strata     STRING   true      Depth category: surface, supershallow, shallow, or deep
paired_site_id   STRING   false     ID of coordinated survey site from UVS or pelagic BRUVS
filter_type      STRING   true      Primary filter type used (e.g., Sterivex)
filter_size_um   FLOAT    true      Filter pore size in micrometers
target           STRING   false     Primary target taxonomic group
n_replicates     INTEGER  true      Number of replicate samples collected
total_water_l    FLOAT    true      Total volume of water filtered across replicates (L)
collection_date  DATE     true      Date of sample collection - denormalized for queries
notes            STRING   false     Station-level comments or processing notes
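
Because n_replicates and total_water_l summarize rows that live in edna.samples, the two tables can drift apart. The sketch below shows one possible consistency check in Python. It assumes that the station totals should equal the count and the summed water_liters of the matching sample rows; the function names and the tolerance value are hypothetical.

```python
from collections import defaultdict


def summarize_stations(samples):
    """Compute n_replicates and total_water_l per station from sample rows."""
    summary = defaultdict(lambda: {"n_replicates": 0, "total_water_l": 0.0})
    for sample in samples:
        entry = summary[sample["ps_station_id"]]
        entry["n_replicates"] += 1
        entry["total_water_l"] += sample["water_liters"]
    return dict(summary)


def find_mismatches(stations, samples, tolerance=1e-6):
    """Return IDs of stations whose stored summaries disagree with the samples."""
    expected = summarize_stations(samples)
    empty = {"n_replicates": 0, "total_water_l": 0.0}
    mismatched = []
    for station in stations:
        got = expected.get(station["ps_station_id"], empty)
        same_count = station["n_replicates"] == got["n_replicates"]
        same_volume = abs(station["total_water_l"] - got["total_water_l"]) <= tolerance
        if not (same_count and same_volume):
            mismatched.append(station["ps_station_id"])
    return mismatched


# Illustrative rows only; volumes are made up for the example.
samples = [
    {"ps_station_id": "PNG_2024_edna_001_10m", "water_liters": 2.0},
    {"ps_station_id": "PNG_2024_edna_001_10m", "water_liters": 2.0},
    {"ps_station_id": "PNG_2024_edna_001_10m", "water_liters": 1.5},
]
stations = [
    {"ps_station_id": "PNG_2024_edna_001_10m", "n_replicates": 3, "total_water_l": 5.5},
    {"ps_station_id": "PNG_2024_edna_001_20m", "n_replicates": 2, "total_water_l": 4.0},
]
mismatched = find_mismatches(stations, samples)
```

Here the 20 m station is flagged because it claims two replicates but has no sample rows in the example data.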

Samples (edna.samples)

This table contains detailed metadata for individual water samples. Each row represents a single replicate sample with specific collection parameters, processing details, and preservation information. Key station and site metadata are denormalized to enable efficient querying without multiple joins.

Table 3: Schema for edna.samples: individual water sample metadata
Field            Type     Required  Description
ps_sample_id     STRING   true      Unique sample ID (ps_station_id_r###), e.g., PNG_2024_edna_001_10m_r001
ps_station_id    STRING   true      Foreign key to edna.stations
ps_site_id       STRING   true      Foreign key to edna.sites - denormalized for performance
exp_id           STRING   true      Expedition ID (ISO3_YEAR) - denormalized for performance
region           STRING   true      Region name - denormalized from sites
subregion        STRING   true      Subregion name - denormalized from sites
habitat          STRING   true      Habitat type - denormalized from sites
collection_date  DATE     true      Date of sample collection - denormalized for temporal queries
replicate        STRING   true      Replicate number within the station (e.g., 001, 002)
depth_strata     STRING   true      Depth category - denormalized from station
depth_m          FLOAT    true      Exact sampling depth for this replicate (meters)
water_liters     FLOAT    true      Volume of water filtered for this sample (L)
filter_type      STRING   true      Type of filter used for this sample
filter_size_um   FLOAT    true      Filter pore size in micrometers
preservative     STRING   false     Preservation method (e.g., ethanol, frozen, RNAlater)
filter_date      DATE     true      Date when filtration was performed
filter_time      TIME     false     Time when filtration started
target           STRING   false     Target taxonomic group for laboratory analysis
notes            STRING   false     Sample-specific comments or processing notes
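
Since ps_sample_id embeds the station and site IDs, the denormalized key columns can be cross-checked against it. The following Python sketch splits a sample ID from the right, assuming the depth token (for example 10m) and the replicate token (for example r001) never contain underscores. The names split_sample_id and denormalized_keys_match are hypothetical helpers, not part of the dataset.

```python
def split_sample_id(ps_sample_id):
    """Split a sample ID into its site, station, depth, and replicate parts."""
    # Split from the right: the last token is the replicate, the one before is depth.
    station_id, replicate_token = ps_sample_id.rsplit("_", 1)
    site_id, depth_token = station_id.rsplit("_", 1)
    if not replicate_token.startswith("r"):
        raise ValueError(f"replicate token {replicate_token!r} should start with 'r'")
    return {
        "ps_site_id": site_id,
        "ps_station_id": station_id,
        "depth_token": depth_token,
        "replicate": replicate_token[1:],  # drop the leading "r"
    }


def denormalized_keys_match(row):
    """Check that a sample row's key columns agree with its ps_sample_id."""
    parts = split_sample_id(row["ps_sample_id"])
    return all(
        row[key] == parts[key]
        for key in ("ps_site_id", "ps_station_id", "replicate")
    )


parts = split_sample_id("PNG_2024_edna_001_10m_r001")

consistent_row = {
    "ps_sample_id": "PNG_2024_edna_001_10m_r001",
    "ps_station_id": "PNG_2024_edna_001_10m",
    "ps_site_id": "PNG_2024_edna_001",
    "replicate": "001",
}
# Same sample, but the station key was copied from a different depth.
inconsistent_row = dict(consistent_row, ps_station_id="PNG_2024_edna_001_20m")
```

Keeping replicate as a string, as the schema does, preserves the zero padding so that it compares equal to the suffix of the ID.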