Sequence Databases

Authors: Paul Rangel
Abstract: DNA and protein sequence databases are the cornerstone of bioinformatics research. DNA databases such as GenBank and EMBL accept genome data from sequencing projects around the world and make it available to researchers via the internet. Protein sequence databases play the same role for protein sequences: they are the central repositories for protein sequence data submissions. PIR's Protein Sequence Database (PSD) and SWISS-PROT are the two main protein sequence databases. Both provide a variety of ways to query the data, along with bioinformatics analysis tools that facilitate genetic research. The underlying organization of these databases has shaped the way computer-based molecular biology research is conducted. This chapter builds an understanding of sequence databases by reviewing data storage, common tools, and online resources associated with them.
