There are thousands of online bioinformatics databases available on the Internet.
The best way to find a database that you are interested in is to look through
the NAR Database Issue or the online journal Database.
List of databases
- A web portal for browsing and searching biological databases, developed by
the Beijing Institute of Genomics.
NAR Database Issue
- The journal Nucleic Acids Research publishes a Database Issue on 1 January
each year.
- The online Open Access Journal of Biological Databases and Curation.
Databases from China
- Databases of agricultural information at the Chinese Academy of Agricultural Sciences.
- The web portal to the NIH genetic sequence database maintained by NCBI,
also a part of the International Nucleotide Sequence Database Collaboration.
Literature citation, release notes and an example record can be found in this
- The web portal to the EMBL nucleotide sequence database maintained by EBI, also
a part of the International Nucleotide Sequence Database Collaboration.
Documentation including release notes, database statistics, a user guide, the
feature table definition, a sample entry and FAQs is provided.
- A data repository for omics data maintained by the Beijing Institute of
Genomics (BIG), Chinese Academy of Sciences (CAS), serving as a primary
archive of genome sequencing data.
- The Reference Sequence collection constructed by NCBI to provide a
comprehensive, integrated, non-redundant set of DNA, RNA sequences and protein
products. It provides a stable reference for genome annotation, gene
identification and characterization, mutation and polymorphism analysis,
expression studies and comparative analyses.
- An organized view of the transcriptome created by NCBI. Each UniGene
entry is a set of transcript sequences that appear to come from the same
transcription locus, together with information on protein similarities,
gene expression, cDNA clone reagents, and genomic location.
- The database of single nucleotide polymorphisms maintained by NCBI.
- The main website of the international protein sequence database, which
consists of the protein knowledgebase (UniProtKB), the sequence clusters
(UniRef) and the sequence archive (UniParc).
- The main repository of macromolecular structures maintained by the Research
Collaboratory for Structural Bioinformatics.
- The entry point for the EBI macromolecular structure database.
- The PDB summary database maintained by NCBI.
- The macromolecular database maintained by NCBI.
- The biological magnetic resonance data bank maintained at University of
- The structural biology knowledgebase maintained by the Protein Structure
- The database of comparative protein structure models developed and
maintained at University of California, San Francisco.
- The database of Structural Classification of Proteins developed and
maintained by Cambridge University.
- The database of Class, Architecture, Topology and Homologous
superfamily developed and maintained by University College London.
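
RefSeq records, as described in the Reference Sequence entry above, carry
accession prefixes that indicate the molecule type. The following is a minimal
Python sketch of a lookup based on the commonly documented prefixes. The names
REFSEQ_PREFIXES and classify_refseq are hypothetical helpers introduced for
illustration only, and the table is not exhaustive.

```python
# Illustrative sketch: map a RefSeq accession prefix to a molecule type.
# The table covers only commonly seen prefixes; other prefixes exist.
REFSEQ_PREFIXES = {
    "NC_": "genomic (complete molecule)",
    "NG_": "genomic (region)",
    "NT_": "genomic (contig or scaffold)",
    "NW_": "genomic (contig or scaffold)",
    "NM_": "mRNA",
    "NR_": "non-coding RNA",
    "NP_": "protein",
    "XM_": "predicted mRNA (model)",
    "XR_": "predicted non-coding RNA (model)",
    "XP_": "predicted protein (model)",
}


def classify_refseq(accession: str) -> str:
    """Return the molecule type for a RefSeq-style accession, or 'unknown'."""
    # Prefixes are two letters plus an underscore, e.g. "NM_".
    prefix = accession.strip().upper()[:3]
    return REFSEQ_PREFIXES.get(prefix, "unknown")


if __name__ == "__main__":
    for acc in ("NM_000546.6", "NP_000537.3", "XP_123456.1", "AB000001"):
        print(acc, "->", classify_refseq(acc))
```

An accession without a recognized prefix, such as a GenBank or EMBL primary
accession, is reported as "unknown" rather than guessed.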
Protein function and domain databases
- A database of protein domains, families and functional sites, created and
maintained by the Swiss Institute of Bioinformatics.
- A database of protein fingerprints consisting of conserved motifs within
a protein family, created and maintained by Manchester University, UK.
BLOCKS - A database of multiply aligned
ungapped segments corresponding to the most highly conserved regions of
proteins, created and maintained by the Fred Hutchinson Cancer Research
- A database of conserved protein domains created and maintained by the
NCBI structure group.
- A comprehensive database of protein domain families automatically
generated from the SWISS-PROT and TrEMBL sequence databases, developed and
maintained by the University Claude Bernard, France.
- The website of the Human Protein Atlas, which shows the expression and
localization of proteins in a large variety of normal human tissues, cancer
cells and cell lines with the aid of immunohistochemistry images, developed
and maintained by the Proteome Resource Center, Sweden.
- The Pfam database is a large collection of protein families, each represented
by multiple sequence alignments and hidden Markov models (HMMs).
- The Rfam database is a collection of RNA families, each represented by multiple
sequence alignments, consensus secondary structures and covariance models (CMs).
- The Dfam database is a collection of repetitive DNA element sequence alignments,
hidden Markov models (HMMs) and match lists for complete eukaryote genomes.
- A database composed of phylogenetic trees inferred from animal genomes. It
provides orthology/paralogy predictions as well as the evolutionary history of
- The iPfam database is a catalog of protein family interactions, including
domain and ligand interactions, calculated from known structures.
- An open-source, open access, manually curated and peer-reviewed pathway database.
- The REACTOME mirror at the National Center for Protein Science, Beijing (NCPSB).
- The Plant REACTOME at Gramene.
- Kyoto Encyclopedia of Genes and Genomes.
Genome databases and genome browsers
- The web server of the European eukaryotic genome resource developed by EBI
and the Sanger Institute.
UCSC Genome Information
- The genome browser website containing the reference sequence and working
draft assemblies for a large collection of genomes at the University of
California at Santa Cruz (UCSC), originally known as GoldenPath.
NCBI Map Viewer
- The NCBI genomic map viewer for visualizing completed and
ongoing genome sequences.
- The entry portal to various NCBI genomic biology tools and resources,
including the Map Viewer, the Genome Project Database and the Plant Genomes
NCBI Genome Information
- The NCBI genomic information table lists general information on genomes
for all species.
- A comprehensive suite of programs and databases for comparative analysis of
- Genomes Online Database, a comprehensive information resource for complete
and ongoing genome sequencing projects with flowcharts and tables of
Databases of Model Organisms
- The international database resource for the laboratory mouse,
providing integrated genetic, genomic, and biological data to
facilitate the study of human health and disease.
- The Rat Genome Database at the Medical College of Wisconsin, which collects,
consolidates, and integrates data generated from ongoing rat genetic
and genomic research.
- The biology and genomics resource for the African clawed frog Xenopus laevis
and for Xenopus tropicalis.
- The Zebrafish International Resource Center.
- A comprehensive database of Drosophila genes and genomes maintained
by Indiana University.
- The biology and genome resource of the Caenorhabditis elegans
- The Saccharomyces Genome Database.
Plant Genome Databases
- A tool for green plant comparative genomics, maintained by the Center for
Integrative Genomics, Joint Genome Institute.
- A curated open-source data resource for comparative genome analysis in the
grasses including rice, maize, wheat, barley and sorghum, as well as other
plants including Arabidopsis, poplar and grape. Cross-species homology
relationships can be found using information derived from genomic and EST
sequencing, protein structure and function analysis, genetic and physical
mapping, interpretation of biochemical pathways, gene and QTL localization
and descriptions of phenotypic characters and mutations.
- The Arabidopsis information resource maintained by Stanford University.
It includes the complete genome sequence along with gene structure, gene
product information, metabolism, gene expression, DNA and seed stocks,
genome maps, genetic and physical markers, publications, and information
about the Arabidopsis research community.
- Araport is a one-stop-shop for Arabidopsis thaliana genomics. Araport offers
gene and protein reports with orthology, expression, interactions and the
latest annotation, plus analysis tools, community apps, and web services.
Araport is 100% free and open-source. Registered members can save their
analysis, publish science apps, and post announcements.
- A rice knowledgebase for data integration through community-contributed
modules, integrating data from remote resources through web APIs and featuring
collaborative integration of rice data from multiple committed modules and low
costs for database update and maintenance.
- A comprehensive rice science database maintained by National Institute
of Genetics, Japan. It contains genetic resource stock information, gene
dictionary, chromosome maps, mutant images and fundamental knowledge of
- The community database for biological information about the crop plant
Zea mays ssp. mays, with genetic, genomic, sequence, gene
product, functional characterization and literature reference data.
- Integrating Genetics and Molecular Biology for Soybean Researchers.
- A collection of data resources for Solanaceae species including
tomato, potato, pepper, eggplant, petunia and Nicotiana.
- The web portal for the International Cucurbit Genomics Initiative
including melon, cucumber, watermelon, pumpkin, etc.
Bacterial Genome Databases
- The Bacterial Bioinformatics Resource Center, an information system
designed to support the biomedical research community's work on bacterial
infectious diseases via integration of vital pathogen information with rich
data and analysis tools.
- The bacterial genome database maintained at the Pasteur Institute.
- The genome database for cyanobacteria developed by Kazusa Institute, Japan.
Virus Genome Databases
- The main page of the NCBI viral genome information resource.
- Global Initiative on Sharing Avian Influenza Data.
- A database for human and animal influenza viruses.
- NCBI Influenza Virus Resource with influenza genomic data and analysis
- This site provides a central source of information about viruses, viroids
and satellites of plants, fungi and protozoa.