Pincer-search: An efficient algorithm for discovering the maximum frequent set

Dao I. Lin, Zvi M. Kedem

Research output: Contribution to journalArticle

Abstract

Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up, breadth-first search direction. The computation starts from frequent 1-itemsets (the minimum length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform well when all maximal frequent itemsets are short. However, performance drastically deteriorates when some of the maximal frequent itemsets are long. We present a new algorithm which combines both the bottom-up and the top-down searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure, the maximum frequent candidate set. It is used to prune early candidates that would be normally encountered in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicit examination of every frequent itemset. Therefore, the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, thus specifying immediately all frequent itemsets. We evaluate the performance of the algorithm using well-known synthetic benchmark databases, real-life census, and stock market databases. The improvement in performance can be up to several orders of magnitude, compared to the best previous algorithms.

Original languageEnglish (US)
Pages (from-to)553-566
Number of pages14
JournalIEEE Transactions on Knowledge and Data Engineering
Volume14
Issue number3
DOIs
StatePublished - May 2002

Keywords

  • Association rule
  • Data mining
  • Knowledge discovery
  • Maximum frequent candidate set
  • Maximum frequent set
  • Pincer Search

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

Fingerprint Dive into the research topics of 'Pincer-search: An efficient algorithm for discovering the maximum frequent set'. Together they form a unique fingerprint.

  • Cite this