Lattice based clustering of temporal gene-expression matrices

Yang Huang, Martin Farach-Colton

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Individuals show different cell classes when they are in different stages of a disease, have different disease subtypes, or respond differently to a treatment or environmental stress. It is important to identify an individual's cell class, for example, to decide which disease subtype they have or how they will respond to a certain drug. In a temporal gene-expression matrix (TGEM), each row represents a time series of expression values of a gene. TGEMs of the same cell class should show similar gene-expression patterns. However, given a set of TGEMs, it can be difficult to classify the matrices by cell class. In this paper, we develop a tool called LABSTER (LAttice Based cluSTERing) to cluster gene-expression matrices by cell class. Rather than treating each row or column as a vector, we create a Galois lattice for each matrix, which yields a natural distance function between gene-expression matrices. Finally, we cluster based on these distances. A key advantage of our method is that it effectively handles missing values, which are a problem in gene-expression data. We evaluated LABSTER on both simulated data and clinical data. The results show that LABSTER has better clustering performance than several widely used vector-based clustering methods. A bootstrapping procedure is also proposed to further improve the performance of LABSTER. LABSTER has the potential to be used on matrices containing data other than gene expression.
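
    To make the lattice idea concrete, the sketch below enumerates the formal concepts (the nodes of a Galois lattice) of a small binary matrix and compares two matrices by the concepts they share. It is an illustration only, not LABSTER itself: the abstract does not say how expression values are discretized or how the distance is defined, so the 0/1 input, the function names `concepts` and `concept_jaccard_distance`, and the Jaccard-style distance are all assumptions made here.

```python
def concepts(matrix):
    """Enumerate the formal concepts (extent, intent) of a binary matrix.

    Rows are objects (e.g. genes) and columns are attributes (e.g. time
    points). ASSUMPTION: the matrix is already discretized to 0/1; the
    abstract does not describe how LABSTER does this.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    # Attribute set held by each row.
    row_attrs = [frozenset(j for j in range(n_cols) if row[j]) for row in matrix]

    # Every intent is an intersection of row attribute sets; the full
    # attribute set is the intersection of the empty family of rows.
    intents = {frozenset(range(n_cols))}
    for attrs in row_attrs:
        intents |= {attrs & intent for intent in intents}

    # The extent of an intent is the set of rows that contain all of it.
    result = set()
    for intent in intents:
        extent = frozenset(i for i in range(n_rows) if intent <= row_attrs[i])
        result.add((extent, intent))
    return result


def concept_jaccard_distance(matrix_a, matrix_b):
    """Illustrative distance: 1 - |shared concepts| / |all concepts|.

    ASSUMPTION: this is NOT the distance defined in the paper. It only
    shows how two lattices over the same rows and columns can be compared.
    """
    ca, cb = concepts(matrix_a), concepts(matrix_b)
    union = ca | cb
    if not union:
        return 0.0
    return 1.0 - len(ca & cb) / len(union)


if __name__ == "__main__":
    a = [[1, 1, 0],
         [1, 0, 1],
         [1, 1, 1]]
    b = [[1, 1, 0],
         [1, 0, 1],
         [1, 1, 0]]
    print(len(concepts(a)), "concepts in a")
    print("distance(a, b) =", concept_jaccard_distance(a, b))
```

    A pairwise matrix of such distances could then be passed to any standard distance-based clustering routine. Which clustering algorithm LABSTER uses, and how it treats missing values, are not stated in the abstract.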

    Original language: English (US)
    Title of host publication: Proceedings of the 7th SIAM International Conference on Data Mining
    Publisher: Society for Industrial and Applied Mathematics Publications
    Pages: 398-409
    Number of pages: 12
    ISBN (Print): 9780898716306
    State: Published - 2007
    Event: 7th SIAM International Conference on Data Mining - Minneapolis, MN, United States
    Duration: Apr 26 2007 to Apr 28 2007

    Publication series

    Name: Proceedings of the 7th SIAM International Conference on Data Mining

    Conference

    Conference: 7th SIAM International Conference on Data Mining
    Country/Territory: United States
    City: Minneapolis, MN
    Period: 4/26/07 to 4/28/07

    Keywords

    • Clustering
    • Galois lattice
    • Gene expression
    • Matrix distance

    ASJC Scopus subject areas

    • General Engineering
