Nucleic Acids Res; co-auth.: I.Xenarios

Nucleic Acids Res. 2012 Nov 27. [Epub ahead of print]

HAMAP in 2013, new developments in the protein family classification and annotation system.

Source

Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland, The EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven St NW, Suite 1200, Washington, DC 20007 and University of Delaware, 15 Innovation Way, Suite 205, Newark, DE 19711, USA.

Abstract

HAMAP (High-quality Automated and Manual Annotation of Proteins-available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.