Frequent Pattern Mining (2024)

Curated by: Xifeng Yan

Frequent patterns are itemsets, subsequences, or substructures that appear in a data set with frequency no less than a user-specified threshold. For example, a set of items, such as milk and bread, that frequently appear together in a transaction data set is a frequent itemset. A subsequence, such as buying first a PC, then a digital camera, and then a memory card, is a (frequent) sequential pattern if it occurs frequently in a shopping history database. A substructure can refer to different structural forms, such as subgraphs, subtrees, or sublattices, which may be combined with itemsets or subsequences. If a substructure occurs frequently in a graph database, it is called a (frequent) structural pattern.

Finding frequent patterns plays an essential role in mining associations, correlations, and many other interesting relationships among data. It also helps in data indexing, classification, clustering, and other data mining tasks.

Frequent pattern mining is an important data mining task and a focused theme in data mining research. Abundant literature has been dedicated to it and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structured pattern mining, correlation mining, associative classification, and frequent pattern-based clustering, as well as their broad applications [1]. A few textbooks are available on this topic, e.g., [2].

[1] J. Han, H. Cheng, D. Xin, and X. Yan. Frequent Pattern Mining: Current Status and Future Directions. Data Mining and Knowledge Discovery, Vol. 15, Issue 1, pp. 55-86, 2007.

[2] Charu Aggarwal and Jiawei Han (Eds.). Frequent Pattern Mining. Springer, 2014.
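
The definition of a frequent itemset at the top of this page (an itemset whose frequency is no less than a user-specified threshold) can be illustrated with a minimal Python sketch. The function name `frequent_itemsets`, the parameter `min_support`, and the toy baskets are assumptions made for illustration; the brute-force enumeration is not one of the scalable algorithms surveyed in [1], only a direct reading of the definition.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support.

    Brute-force sketch: every candidate itemset is checked against every
    transaction, so this is only practical for a handful of distinct items.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support count: number of transactions containing the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[candidate] = count
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is itself frequent, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return result


# Hypothetical toy transaction data (not taken from the article).
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"eggs"},
]

frequent = frequent_itemsets(baskets, min_support=3)
for itemset, count in frequent.items():
    print(itemset, count)
```

With a threshold of 3, only bread, milk, and the pair (bread, milk) qualify in this toy data, which matches the milk-and-bread example in the text.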

Related KDD2016 Papers

Title & Authors
DeepIntent: Learning Attentions for Online Advertising with Recurrent Neural Networks
Author(s): Shuangfei Zhai*, Binghamton University; Keng-hao Chang, Microsoft; Ruofei Zhang, Microsoft; Zhongfei Zhang,
Annealed Sparsity via Adaptive and Dynamic Shrinking
Author(s): Kai Zhang*, NEC labs America; Shandian Shan, Purdue University; Zhengzhang Chen, NEC Lab America; Chaoran Cheng, New Jersey Institute of Technology; Zhi Wei, New Jersey Institute of Technology; Guofei Jiang, NEC labs America; Jieping Ye,
Multi-Task Feature Interaction Learning
Author(s): Kaixiang Lin*, Michigan State University; Jianpeng Xu, Michigan State University; Shuiwang Ji, Washington State University; Jiayu Zhou, Michigan State University
Analyzing Volleyball Match Data from the 2014 World Championships Using Machine Learning Techniques
Author(s): Jan Van Haaren*, KU Leuven; Horesh Ben Shitrit*, PlayfulVision; Jesse Davis, KU Leuven; Pascal Fua, EPFL
Lexis: An Optimization Framework for Discovering the Hierarchical Structure of Sequential Data
Author(s): Payam Siyari*, Georgia Institute of Technology; Bistra Dilkina, Georgia Tech; Constantine Dovrolis, Georgia Institute of Technology
Just One More: Modeling Binge Watching Behavior
Author(s): William Trouleau, EPFL; Azin Ashkan*, Technicolor; Weicong Ding, Technicolor Research; Brian Eriksson, Technicolor
Online Feature Selection: A Limited-Memory Substitution Algorithm and its Asynchronous Parallel Vari
Author(s): Haichuan Yang*, University of Rochester; Ryohei Fujimaki, NEC Laboratories America; Yukitaka Kusumura, NEC lab; Ji Liu, University of Rochester
A Closed-Loop Approach in Data-Driven Resource Allocation to Improve Network User Experience
Author(s): Yanan Bao*, University of California, Davi; Huasen Wu, UC Davis; Xin Liu, UC Davis
Towards Robust and Versatile Causal Discovery for Business Applications
Author(s): Giorgos Borboudakis*, University of Crete; Ioannis Tsamardinos,
Interpretable Decision Sets: A Joint Framework for Description and Prediction
Author(s): Himabindu Lakkaraju*, Stanford University; Stephen Bach, Stanford University; Jure Leskovec, Stanford University
Causal Clustering for 1-Factor Measurement Models
Author(s): Erich Kummerfeld*, University of Pittsburgh; Joseph Ramsey, Carnegie Mellon University
Efficient Frequent Directions Algorithm for Sparse Matrices
Author(s): Mina Ghashami*, University of Utah; Edo Liberty, Yahoo; Jeff Phillips, School of Computing, University of Utah
Subjectively Interesting Component Analysis: Data Projections that Contrast with Prior Expectations
Author(s): Bo Kang*, Ghent University; Jefrey Lijffijt, Ghent University; Raul Santos-Rodriguez, University of Bristol; Tijl De Bie, Ghent University
Robust and Effective Metric Learning Using Capped Trace Norm
Author(s): Zhouyuan Huo, University of Texas, Arlington; Feiping Nie, University of Texas at Arlington; Heng Huang*, Univ. of Texas at Arlington
Inferring Network Effects from Observational Data
Author(s): David Arbour*, University of Massachusetts Am; Dan Garant, University of Massachusetts Amherst; David Jensen, UMass Amherst
Generalized Hierarchical Sparse Model for Arbitrary-Order Interactive Antigenic Sites Identification
Author(s): Lei Han*, Rutgers University; Yu Zhang, Hong Kong University of Science and Technology; Xiu-Feng Wan, Mississippi State University; Tong Zhang, Rutgers University
Predict Risk of Relapse for Patients with Multiple Stages of Treatment of Depression
Author(s): Zhi Nie*, Arizona State University; Pinghua Gong; Jieping Ye, University of Michigan at Ann Arbor

Comments

Frequent pattern mining focuses on identifying recurring patterns within datasets: itemsets, subsequences, and substructures whose frequency is no less than a specified threshold.

The article introduces three kinds of frequent pattern:

  1. Frequent Itemsets: These are sets of items that commonly occur together in a dataset. For instance, if "milk" and "bread" frequently appear together in purchase transactions, they form a frequent itemset.

  2. Sequential Patterns: These are subsequences that occur frequently across a database of event or transaction sequences. For instance, purchasing a "PC," then a "digital camera," and finally a "memory card" is a sequential pattern if that ordered sequence occurs frequently in a shopping history database.

  3. Substructures: These are structural forms such as subgraphs, subtrees, or sublattices, which may be combined with itemsets or subsequences. When such a form occurs frequently in a graph database, it is called a frequent structural pattern.
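
The sequential pattern in item 2 can be sketched in the same spirit. The sketch below assumes the simplest setting, where each shopping history is an ordered list of single items and gaps between pattern items are allowed. The function names and the toy histories are hypothetical; sequential pattern mining in general also handles sequences whose elements are itemsets, which this sketch does not.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order.

    Other items may appear in between (gaps are allowed).
    """
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until it finds `item`,
    # so each later pattern item is searched for only after the earlier one.
    return all(item in remaining for item in pattern)


def sequence_support(pattern, database):
    """Number of sequences in `database` that contain `pattern` in order."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


# Hypothetical shopping histories, one ordered list per customer.
histories = [
    ["PC", "mouse", "digital camera", "memory card"],
    ["PC", "digital camera", "memory card"],
    ["digital camera", "PC", "memory card"],  # wrong order: not a match
    ["PC", "memory card"],                    # camera missing: not a match
]

pattern = ["PC", "digital camera", "memory card"]
support = sequence_support(pattern, histories)
print(support)
```

Whether the pattern counts as frequent then depends on comparing `support` with the user-specified threshold, exactly as for itemsets.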

The significance of mining frequent patterns extends beyond mere identification. It aids in discovering associations, correlations, and relationships within data. Moreover, it contributes to tasks like data indexing, classification, clustering, and various other data mining endeavors.

The article also names several research frontiers in the field:

  • Sequential Pattern Mining: Analyzing sequences of events or transactions.
  • Structured Pattern Mining: Focusing on structural forms within data like graphs.
  • Correlation Mining: Finding items whose co-occurrence is statistically correlated rather than merely frequent.
  • Associative Classification: Building classifiers from association rules whose consequents are class labels.
  • Frequent Pattern-Based Clustering: Clustering data based on frequent patterns.
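
To make the difference between "frequent" and "correlated" in the list above more concrete, the sketch below computes confidence and lift for a rule such as milk → bread. These two measures are not named in the article; they are used here as an assumption because they are common ways to score association rules. The function names and toy baskets are likewise hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent).

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's baseline support.

    A value above 1 suggests positive correlation, near 1 independence,
    and below 1 negative correlation.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


# Hypothetical toy transaction data (not taken from the article).
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"eggs"},
]

rule_confidence = confidence({"milk"}, {"bread"}, baskets)
rule_lift = lift({"milk"}, {"bread"}, baskets)
print(rule_confidence, rule_lift)
```

In this toy data every basket with milk also contains bread, so the confidence is 1.0, while the lift is lower (1.25) because bread is already common on its own.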

The volume of literature, including textbooks and dedicated research articles, reflects the scope of frequent pattern mining and its continuing development.

The related KDD 2016 papers listed in the article span a range of application areas, including online advertising, sports analytics, network resource allocation, modeling of binge-watching behavior, and predicting relapse risk for patients treated for depression. This range suggests how broadly pattern-oriented analysis applies across domains.

Article information

Author: Lilliana Bartoletti
