Exploring repositories of scientific workflows

Julia Stoyanovich, Ben Taskar, Susan Davidson

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Scientific workflows are gaining popularity, and repositories of workflows are starting to emerge. In this paper we present some initial experiences of information discovery in repositories of scientific workflows. In the first part of the paper we consider a collection of VisTrails workflows, and explore how this collection may be summarized when workflow modules are used as features. We present a hierarchical browsable view of the repository in which categories are derived using frequent itemset mining or latent Dirichlet allocation. We demonstrate that both approaches may be used for effective data exploration. In the second part of the paper we focus on a collection of Taverna workflows from myExperiment.org, and consider how these workflows may be browsed using modules and tags as features. Finally, we outline some interesting challenges and describe conditions under which these techniques work well for repositories of scientific workflows, and conditions under which additional work is needed for effective data exploration.
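
    As a rough illustration of the first idea in the abstract (workflow modules as features, with categories derived by frequent itemset mining), the sketch below treats each workflow as a set of module names and counts module sets shared by several workflows. It is not the authors' implementation: the workflow names, module names, the `frequent_itemsets` and `members` helpers, and the brute-force enumeration (practical only for tiny inputs) are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(workflows, min_support, max_size=3):
    """Return {frozenset(modules): count} for every module set of size
    1..max_size that occurs in at least `min_support` workflows.

    Brute-force enumeration: fine for a toy example, not for a real repository.
    """
    counts = Counter()
    for modules in workflows.values():
        unique = sorted(set(modules))  # each workflow counts a module once
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


def members(workflows, itemset):
    """List the workflows that contain every module in `itemset`."""
    return sorted(
        name for name, modules in workflows.items() if itemset <= set(modules)
    )


# Hypothetical toy repository: workflow name -> modules it uses.
workflows = {
    "wf1": ["FileReader", "ContourFilter", "Renderer"],
    "wf2": ["FileReader", "ContourFilter", "Renderer", "Camera"],
    "wf3": ["FileReader", "Histogram", "Plot"],
    "wf4": ["HTTPFetch", "Histogram", "Plot"],
}

frequent = frequent_itemsets(workflows, min_support=2)

# Each frequent itemset is a candidate browsing category; larger itemsets
# nest under their subsets, which suggests a hierarchical view.
for itemset in sorted(frequent, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), "->", members(workflows, itemset))
```

    In this toy setting, a module set such as {Histogram, Plot} would label one category and {FileReader, ContourFilter, Renderer} another, with the single-module sets acting as parent categories.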

    Original language: English (US)
    Title of host publication: Proceedings of the 1st International Workshop on Workflow Approaches to New Data-centric Science, Wands '10
    DOIs: https://doi.org/10.1145/1833398.1833405
    State: Published - 2010
    Event: 1st International Workshop on Workflow Approaches to New Data-centric Science, Wands '10 - Indianapolis, IN, United States
    Duration: Jun 6 2010 → Jun 6 2010

    Publication series

    Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
    ISSN (Print): 0730-8078

    Other

    Other: 1st International Workshop on Workflow Approaches to New Data-centric Science, Wands '10
    Country: United States
    City: Indianapolis, IN
    Period: 6/6/10 → 6/6/10

    ASJC Scopus subject areas

    • Software
    • Information Systems

    Cite this

    Stoyanovich, J., Taskar, B., & Davidson, S. (2010). Exploring repositories of scientific workflows. In Proceedings of the 1st International Workshop on Workflow Approaches to New Data-centric Science, Wands '10 [7] (Proceedings of the ACM SIGMOD International Conference on Management of Data). https://doi.org/10.1145/1833398.1833405