TY - GEN
T1 - Pincer-search
T2 - 6th International Conference on Extending Database Technology, EDBT 1998
AU - Lin, Dao I.
AU - Kedem, Zvi M.
PY - 1998
Y1 - 1998
N2 - Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and topdown searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicite examination of every frequent itemset. Therefore the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which therefore specifies immediately all frequent itemsets. We evaluate the performance of the Mgorithm using a well-known benchmark database. The improvements can be up to several orders of magnitude, compared to the best current algorithms.
AB - Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and topdown searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicite examination of every frequent itemset. Therefore the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which therefore specifies immediately all frequent itemsets. We evaluate the performance of the Mgorithm using a well-known benchmark database. The improvements can be up to several orders of magnitude, compared to the best current algorithms.
UR - http://www.scopus.com/inward/record.url?scp=84890521199&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890521199&partnerID=8YFLogxK
U2 - 10.1007/bfb0100980
DO - 10.1007/bfb0100980
M3 - Conference contribution
AN - SCOPUS:84890521199
SN - 3540642641
SN - 9783540642640
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 105
EP - 119
BT - Advances in Database Technology, EDBT 1998 - 6th International Conference on Extending Database Technology, Proceedings
PB - Springer Verlag
Y2 - 23 March 1998 through 27 March 1998
ER -