Pincer-search: A new algorithm for discovering the maximum frequent set

Dao I. Lin, Zvi M. Kedem

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and topdown searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicite examination of every frequent itemset. Therefore the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which therefore specifies immediately all frequent itemsets. We evaluate the performance of the Mgorithm using a well-known benchmark database. The improvements can be up to several orders of magnitude, compared to the best current algorithms.

Original languageEnglish (US)
Title of host publicationAdvances in Database Technology, EDBT 1998 - 6th International Conference on Extending Database Technology, Proceedings
PublisherSpringer Verlag
Pages105-119
Number of pages15
ISBN (Print)3540642641, 9783540642640
DOIs
StatePublished - 1998
Event6th International Conference on Extending Database Technology, EDBT 1998 - Valencia, Spain
Duration: Mar 23 1998Mar 27 1998

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume1377 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other6th International Conference on Extending Database Technology, EDBT 1998
Country/TerritorySpain
CityValencia
Period3/23/983/27/98

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Pincer-search: A new algorithm for discovering the maximum frequent set'. Together they form a unique fingerprint.

Cite this