Automatically constructing a directory of molecular biology databases

Luciano Barbosa, Sumit Tandon, Juliana Freire

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


The volume of biology-related information available in online databases has exploded, but finding the right information can be challenging. Not only is this information spread over multiple sources, but it is often hidden behind the form interfaces of online databases. Several ongoing efforts aim to simplify the process of finding, integrating, and exploring these data; notable examples include the NCBI databases and the NAR database compilation. However, existing approaches are not scalable and require substantial manual input. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation showing that the infrastructure is scalable and effective: it efficiently locates and accurately identifies the relevant online databases.

Original language: English (US)
Title of host publication: Data Integration in the Life Sciences - 4th International Workshop, DILS 2007, Proceedings
Publisher: Springer Verlag
Number of pages: 11
ISBN (Print): 3540732543, 9783540732549
State: Published - 2007
Event: 4th International Workshop on Data Integration in the Life Sciences, DILS 2007 - Philadelphia, PA, United States
Duration: Jun 27 2007 to Jun 29 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4544 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 4th International Workshop on Data Integration in the Life Sciences, DILS 2007
Country/Territory: United States
City: Philadelphia, PA

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

