Habitat-Lite: a GSC case study based on free text terms for environmental metadata

Hirschman, Lynette; Clark, Cheryl; Cohen, K. Bretonnel; Mardis, Scott; Luciano, Joanne; Kottmann, Renzo; Cole, James; Markowitz, Victor; Kyrpides, Nikos; Morrison, Norman; Schriml, Lynn M.; Field, Dawn. 2008 Habitat-Lite: a GSC case study based on free text terms for environmental metadata. OMICS: A Journal of Integrative Biology, 12 (2). 129-136. https://doi.org/10.1089/omi.2008.0016

Full text not available from this repository.

Abstract/Summary

There is an urgent need to capture metadata on the rapidly growing number of genomic, metagenomic and related sequences, such as 16S ribosomal genes. This need is a major focus within the Genomic Standards Consortium (GSC), and Habitat is a key metadata descriptor in the proposed “Minimum Information about a Genome Sequence” (MIGS) specification. The goal of the work described here is to provide a light-weight, easy-to-use (small) set of terms (“Habitat-Lite”) that captures high-level information about habitat while preserving a mapping to the recently launched Environment Ontology (EnvO). Our motivation for building Habitat-Lite is to meet the needs of multiple users, such as annotators curating these data, database providers hosting the data, and biologists and bioinformaticians alike who need to search and employ such data in comparative analyses. Here, we report a case study based on semiautomated identification of terms from GenBank and GOLD. We estimate that the terms in the initial version of Habitat-Lite would provide useful labels for over 60% of the kinds of information found in the GenBank isolation_source field, and around 85% of the terms in the GOLD habitat field. We present a revised version of Habitat-Lite defined within the EnvO Environmental Ontology through a new category, EnvO-Lite-GSC. We invite the community's feedback on its further development to provide a minimum list of terms to capture high-level habitat information and to provide classification bins needed for future studies.

Item Type: Publication - Article
Digital Object Identifier (DOI): https://doi.org/10.1089/omi.2008.0016
Programmes: CEH Programmes pre-2009 publications > Biodiversity
UKCEH and CEH Sections/Science Areas: Hails
NORA Subject Terms: Biology and Microbiology; Data and Information
Date made live: 12 Feb 2009 14:34 +0 (UTC)
URI: https://nora.nerc.ac.uk/id/eprint/5608
