Turning CARTwheels: An alternating algorithm for mining redescriptions

Naren Ramakrishnan, Deept Kumar, Bud Mishra, Malcolm Potts, Richard F. Helm

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We present an unusual algorithm involving classification trees CARTwheels - where two trees are grown in opposite directions so that they are joined at their leaves. This approach finds application in a new data mining task we formulate, called redescription mining. A redescription is a shift-of-vocabulary, or a different way of communicating information about a given subset of data; the goal of redescription mining is to find subsets of data that afford multiple descriptions. We highlight the importance of this problem in domains such as bioinformatics, which exhibit an underlying richness and diversity of data descriptors (e.g., genes can be studied in a variety of ways). CARTwheels exploits the duality between class partitions and path partitions in an induced classification tree to model and mine redescriptions. It helps integrate multiple forms of characterizing datasets, situates the knowledge gained from one dataset in the context of others, and harnesses high-level abstractions for uncovering cryptic and subtle features of data. Algorithm design decisions, implementation details, and experimental results are presented.

Original languageEnglish (US)
Title of host publicationKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
EditorsR. Kohavi, J. Gehrke, W. DuMouchel, J. Ghosh
Pages266-275
Number of pages10
StatePublished - 2004
EventKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Seattle, WA, United States
Duration: Aug 22 2004Aug 25 2004

Publication series

NameKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

OtherKDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
CountryUnited States
CitySeattle, WA
Period8/22/048/25/04

Keywords

  • Classification trees
  • Data mining in biological domains
  • Redescriptions

ASJC Scopus subject areas

  • Engineering(all)

Fingerprint Dive into the research topics of 'Turning CARTwheels: An alternating algorithm for mining redescriptions'. Together they form a unique fingerprint.

Cite this