The M5nr: a novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools

Wilke, Andreas; Harrison, Travis; Wilkening, Jared; Field, Dawn; Glass, Elizabeth M.; Kyrpides, Nikos; Mavrommatis, Konstantinos; Meyer, Folker. 2012. The M5nr: a novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools. BMC Bioinformatics, 13 (1), 141. doi:10.1186/1471-2105-13-141

Abstract
Background: Computing of sequence similarity results is becoming a limiting factor in metagenome analysis. Sequence similarity search results encoded in an open, exchangeable format have the potential to limit the needs for computational reanalysis of these data sets. A prerequisite for sharing of similarity results is a common reference.

Description: We introduce a mechanism for automatically maintaining a comprehensive, non-redundant protein database and for creating a quarterly release of this resource. In addition, we present tools for translating similarity searches into many annotation namespaces, e.g. KEGG or NCBI's GenBank.

Conclusions: The data and tools we present allow the creation of multiple result sets using a single computation, permitting computational results to be shared between groups for large sequence data sets.
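The idea in the abstract, one similarity computation against a common non-redundant reference and later translation into several annotation namespaces, can be illustrated with a minimal sketch. It rests on assumptions not stated in this record: sequences are keyed by an MD5 checksum of the normalized sequence, and the function names (`sequence_key`, `build_nonredundant`, `translate_hits`), the identifiers, and the sequences are all hypothetical. It is not the authors' implementation.

```python
import hashlib
from collections import defaultdict


def sequence_key(sequence: str) -> str:
    """Checksum of a protein sequence after stripping whitespace and upper-casing.

    Assumption: MD5 is used as the key; this record does not specify the scheme.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def build_nonredundant(records):
    """Collapse (namespace, identifier, sequence) records into unique sequences.

    Returns two mappings:
      sequences:   key -> normalized sequence (one entry per distinct sequence)
      annotations: key -> namespace -> set of identifiers from that source
    """
    sequences = {}
    annotations = defaultdict(lambda: defaultdict(set))
    for namespace, identifier, sequence in records:
        key = sequence_key(sequence)
        sequences.setdefault(key, "".join(sequence.split()).upper())
        annotations[key][namespace].add(identifier)
    return sequences, annotations


def translate_hits(hit_keys, annotations, namespace):
    """Map similarity hits (sequence keys) to identifiers in one namespace."""
    return {
        key: sorted(annotations.get(key, {}).get(namespace, set()))
        for key in hit_keys
        if key in annotations
    }


# Hypothetical input: identifiers and sequences are made up for illustration.
records = [
    ("GenBank", "gb|EXAMPLE1", "MKTAYIAKQR"),
    ("KEGG", "kegg:example_1", "mktayiakqr"),  # same protein, different source
    ("KEGG", "kegg:example_2", "MSLNEQ"),
]

sequences, annotations = build_nonredundant(records)
hit = sequence_key("MKTAYIAKQR")  # stands in for one similarity search hit
kegg_view = translate_hits([hit], annotations, "KEGG")
genbank_view = translate_hits([hit], annotations, "GenBank")
```

In this sketch the same hit yields both a KEGG view and a GenBank view without rerunning the search, which is the kind of reuse the abstract describes.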
Documents
Full text not available from this repository.