Data shopping in an open marketplace: introducing the Ontogrator web application for marking up data using ontologies and browsing using facets

Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn. 2011 Data shopping in an open marketplace: introducing the Ontogrator web application for marking up data using ontologies and browsing using facets. Standards in Genomic Sciences, 4 (2). 286-292.

Full text not available from this repository.


In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and PubMed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on the quality of both the data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.
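The general idea described in the abstract, marking up records with controlled-vocabulary terms and then browsing them by facet, can be illustrated with a minimal Python sketch. This is not Ontogrator's implementation: the vocabulary, the sample records, and the function names `mark_up`, `facet_counts` and `select` are all hypothetical, and simple whole-word synonym matching stands in for the text mining the abstract mentions.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary (facet -> term -> synonyms).
# Illustrative only; not taken from any real ontology.
VOCAB = {
    "habitat": {
        "soil": ["soil"],
        "marine": ["marine", "seawater", "ocean"],
    },
    "organism": {
        "bacteria": ["bacterium", "bacteria"],
        "archaea": ["archaeon", "archaea"],
    },
}


def mark_up(text):
    """Return {facet: set of terms} for terms whose synonyms occur as whole words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    tags = {}
    for facet, terms in VOCAB.items():
        hits = {term for term, synonyms in terms.items() if words & set(synonyms)}
        if hits:
            tags[facet] = hits
    return tags


def facet_counts(records, facet):
    """Count how many records carry each term of the given facet."""
    counts = Counter()
    for record in records:
        counts.update(record["tags"].get(facet, ()))
    return counts


def select(records, **choices):
    """Keep only records that match every chosen facet=term pair."""
    return [
        record
        for record in records
        if all(term in record["tags"].get(facet, ()) for facet, term in choices.items())
    ]


# Hypothetical sample records, annotated at load time.
RECORDS = [
    {"id": "r1", "text": "Bacteria isolated from seawater"},
    {"id": "r2", "text": "Soil bacterium with unusual metabolism"},
    {"id": "r3", "text": "Archaea from a marine hydrothermal vent"},
]
for record in RECORDS:
    record["tags"] = mark_up(record["text"])
```

In this sketch, `facet_counts(RECORDS, "habitat")` gives the per-term counts a faceted interface would display next to each filter, and `select(RECORDS, habitat="marine", organism="bacteria")` narrows the result set the way successive facet clicks would. As the abstract notes, the results of such an approach depend entirely on the quality of the data and of the vocabulary used.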

Item Type: Publication - Article
Digital Object Identifier (DOI):
Programmes: CEH Topics & Objectives 2009 - 2012 > Biodiversity > BD Topic 1 - Observations, Patterns, and Predictions for Biodiversity > BD - 1.1 - Standards for data collection, quality, management and integration ...
UKCEH and CEH Sections/Science Areas: Hails
ISSN: 1944-3277
Additional Information. Not used in RCUK Gateway to Research.: Open Access article - click on the Official URL link for full text
NORA Subject Terms: Data and Information
Date made live: 07 Sep 2011 13:28 +0 (UTC)
