
Optimizing plankton image classification with metadata-enhanced representation learning

Masoudi, Mojtaba; Giering, Sarah L.C. ORCID: https://orcid.org/0000-0002-3090-1876; Eftekhari, Noushin; Massot-Campos, Miquel; Irisson, Jean-Olivier; Thornton, Blair. 2024 Optimizing plankton image classification with metadata-enhanced representation learning. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 1-18. https://doi.org/10.1109/JSTARS.2024.3424498

Full text not available from this repository.

Abstract/Summary

Automated camera-based sensors are widely used in vessel-based research to monitor plankton and marine particles. However, current methods suffer from the costly and time-consuming requirement of annotating data for fully supervised learning, especially in plankton grouping tasks characterised by long-tailed datasets. In response, we propose a novel self-supervised learning (SSL) framework that significantly reduces reliance on expensive human annotations by leveraging crucial metadata such as water depth and location. The method comprises three major steps: self-supervised training, innovative sampling, and final classification. It identifies key sample subsets from an unlabelled dataset using a hierarchical clustering approach and incorporates an innovative balancing representative subsampling strategy that addresses the challenge of dataset imbalance and enhances generalisability across diverse plankton classes. Our approach prioritises discerning representation features observed in images that exhibit correlations with the patterns found in their associated metadata. Furthermore, our method introduces a novel grouping method based on visual perspective selection, enabling the identification of balanced subset views that depart from traditional class-based categorisation. Our experimental results show a significant enhancement in image classification accuracy, with a 23% improvement over methods that do not utilise metadata, and a macro F1-score of 54% for 10 populated species from a severely long-tailed dataset. This is achieved with a mere 0.3% of the entire dataset used for annotation.
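
Note on the reported metric: the abstract quotes a macro F1-score, which averages per-class F1 without weighting by class size and is therefore commonly used for long-tailed data. The sketch below is an illustration only and is not taken from the paper. The function names, the class labels, and the toy predictions are all made up, and for simplicity the average is taken over the classes present in the ground truth.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # predicted class p wrongly
            fn[t] += 1          # missed an instance of class t
    scores = []
    for c in sorted(set(y_true)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical long-tailed toy data: 8 samples of a common class, 2 of a rare one.
y_true = ["common"] * 8 + ["rare"] * 2
# A classifier that ignores the rare class entirely.
y_pred = ["common"] * 10

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

In this toy case a classifier that never predicts the rare class still reaches high accuracy, while its macro F1 is pulled down by the zero score on the rare class, which is why macro-averaged scores are informative for imbalanced plankton datasets.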

Item Type: Publication - Article
Digital Object Identifier (DOI): https://doi.org/10.1109/JSTARS.2024.3424498
ISSN: 1939-1404
Date made live: 22 Jul 2024 14:23 +0 (UTC)
URI: https://nora.nerc.ac.uk/id/eprint/537742
