TY - GEN
T1 - Automatically constructing a directory of molecular biology databases
AU - Barbosa, Luciano
AU - Tandon, Sumit
AU - Freire, Juliana
PY - 2007
Y1 - 2007
N2 - There has been an explosion in the volume of biology-related information that is available in online databases. But finding the right information can be challenging. Not only is this information spread over multiple sources, but often, it is hidden behind form interfaces of online databases. There are several ongoing efforts that aim to simplify the process of finding, integrating and exploring these data. However, existing approaches are not scalable, and require substantial manual input. Notable examples include the NCBI databases and the NAR database compilation. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation which shows that the infrastructure is scalable and effective: it is able to efficiently locate and accurately identify the relevant online databases.
AB - There has been an explosion in the volume of biology-related information that is available in online databases. But finding the right information can be challenging. Not only is this information spread over multiple sources, but often, it is hidden behind form interfaces of online databases. There are several ongoing efforts that aim to simplify the process of finding, integrating and exploring these data. However, existing approaches are not scalable, and require substantial manual input. Notable examples include the NCBI databases and the NAR database compilation. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation which shows that the infrastructure is scalable and effective: it is able to efficiently locate and accurately identify the relevant online databases.
UR - http://www.scopus.com/inward/record.url?scp=34547468874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547468874&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-73255-6_3
DO - 10.1007/978-3-540-73255-6_3
M3 - Conference contribution
AN - SCOPUS:34547468874
SN - 3540732543
SN - 9783540732549
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 6
EP - 16
BT - Data Integration in the Life Sciences - 4th International Workshop, DILS 2007, Proceedings
PB - Springer Verlag
T2 - 4th International Workshop on Data Integration in the Life Sciences, DILS 2007
Y2 - 27 June 2007 through 29 June 2007
ER -