Christian Borgelt's Web Pages

FP-growth - Frequent Item Set Mining

Download

fpgrowth	(513 kb)	GNU/Linux executable
fpgrowth.exe	(356 kb)	Windows console executable
fpgrowth.zip	(284 kb)	C sources, version 6.21 (2022.11.22)
fpgrowth.tar.gz	(258 kb)
census.zip	(382 kb)	census data set (UCI ML repository)
census	(2 kb)	shell script used for the conversion

Description

FP-growth is a program to find frequent item sets (as well as closed and maximal item sets and generators) with the FP-growth algorithm (Frequent Pattern growth [Han et al. 2000]). The algorithm represents the transaction database as a prefix tree that is enhanced with links organizing the nodes into lists, each of which refers to a single item. The search is carried out by projecting the prefix tree, working recursively on the result, and pruning the original tree. The implementation also supports filtering for closed and maximal item sets with conditional item set repositories as suggested in [Grahne and Zhu 2003], although the approach used in the program differs insofar as it uses top-down prefix trees rather than FP-trees. It does not cover the clever implementation of FP-trees with two integer arrays as suggested in [Rasz 2004]. Since version 6.0 the program made available above can also be used to find association rules.
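To make the data structure concrete, here is a minimal Python sketch of the textbook FP-growth scheme: a prefix tree whose nodes are additionally collected in per-item lists, mined by projecting the tree onto each item and recursing. It is an illustration only and not the C implementation offered above. In particular, it rebuilds a conditional tree from the collected prefix paths instead of pruning the original tree, and it does not filter for closed or maximal item sets. The names `Node`, `build_tree` and `fpgrowth_sketch` are hypothetical and chosen for this example.

```python
from collections import defaultdict

class Node:
    """One prefix-tree node: an item, a support counter, and a parent link."""
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_tree(weighted_tracts, smin):
    """Build a prefix tree from (items, weight) pairs.

    Returns (header, freq): header maps each frequent item to the list
    of tree nodes carrying that item; freq maps it to its support.
    """
    supp = defaultdict(int)
    for items, weight in weighted_tracts:
        for item in set(items):
            supp[item] += weight
    freq = {i: s for i, s in supp.items() if s >= smin}
    root = Node(None, None)
    header = defaultdict(list)
    for items, weight in weighted_tracts:
        # keep frequent items only, most frequent first (ties by item)
        path = sorted((i for i in set(items) if i in freq),
                      key=lambda i: (-freq[i], i))
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                header[item].append(child)   # per-item node list
            child.count += weight
            node = child
    return header, freq

def fpgrowth_sketch(weighted_tracts, smin, suffix=()):
    """Yield (item set, support) for every frequent item set."""
    header, freq = build_tree(weighted_tracts, smin)
    for item, nodes in header.items():
        found = (item,) + suffix
        yield frozenset(found), freq[item]
        # projection: the prefix path of each node in this item's list,
        # weighted with that node's counter
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fpgrowth_sketch(conditional, smin, found)

# Toy database of five transactions, absolute minimum support 3.
tracts = [("a", "b", "c"), ("a", "b"), ("a", "c"), ("b", "c"), ("a", "b", "c")]
result = dict(fpgrowth_sketch([(t, 1) for t in tracts], smin=3))
```

On this toy database each single item has support 4 and each pair has support 3, while the full set {a, b, c} occurs in only two transactions and is therefore not reported at a minimum support of 3.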

Note that versions of this program before 6.0 could only find frequent item sets, not association rules.
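Association rules are conventionally derived from frequent item sets and their supports: a rule "body -> head" is kept if its confidence, the support of the whole item set divided by the support of the body, reaches a chosen threshold. The following Python sketch illustrates that step under the assumption that only rules with a single item in the head are wanted. The function name `rules_sketch` and the toy supports are hypothetical, and nothing here describes the options or output format of the program itself.

```python
def rules_sketch(supports, min_conf):
    """Yield (body, head, support, confidence) for single-head rules.

    supports maps frozensets of items to absolute supports and must
    contain every frequent item set, so that each rule body is present.
    """
    for itemset, supp in supports.items():
        if len(itemset) < 2:
            continue                      # a rule needs a non-empty body
        for head in itemset:
            body = itemset - {head}
            conf = supp / supports[body]  # confidence of body -> head
            if conf >= min_conf:
                yield body, head, supp, conf

# Toy supports: item a occurs 4 times, item b 3 times, both together 3 times.
supports = {frozenset("a"): 4, frozenset("b"): 3, frozenset("ab"): 3}
strong = list(rules_sketch(supports, min_conf=0.8))
```

With these numbers the rule b -> a has confidence 3/3 = 1.0 and passes the threshold of 0.8, whereas a -> b has confidence 3/4 = 0.75 and is dropped.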

This implementation may also be used through the Python interface provided by the PyFIM library.

Full description of the FP-growth program (included in the source package).

If you have trouble executing the program on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable for Visual Studio 2022 (see under "Other Tools and Frameworks") installed, as the program was compiled with Microsoft Visual Studio 2022.

Papers that describe this algorithm/implementation:

Frequent Item Set Mining
Christian Borgelt
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2(6):437-456.
J. Wiley & Sons, Chichester, United Kingdom 2012
doi:10.1002/widm.1074 wiley.com
(20 pages)
An Implementation of the FP-growth Algorithm
Christian Borgelt
Workshop Open Source Data Mining Software (OSDM'05, Chicago, IL), 1-5
ACM Press, New York, NY, USA 2005
fpgrowth.pdf (152 kb) fpgrowth.ps.gz (116 kb) (5 pages)

Some other references:

Mining Frequent Patterns without Candidate Generation
J. Han, J. Pei, and Y. Yin
Proc. Conf. on the Management of Data (SIGMOD'00, Dallas, TX), 1-12
ACM Press, New York, NY, USA 2000
Efficiently Using Prefix-trees in Mining Frequent Itemsets
G. Grahne and J. Zhu
Proc. Workshop Frequent Item Set Mining Implementations (FIMI 2003, Melbourne, FL)
CEUR Workshop Proceedings 90, Aachen, Germany 2003
Reducing the Main Memory Consumptions of fpmax* and fpclose
G. Grahne and J. Zhu
Proc. Workshop Frequent Item Set Mining Implementations (FIMI 2004, Brighton, UK)
CEUR Workshop Proceedings 126, Aachen, Germany 2004
nonordfp: An FP-growth Variation without Rebuilding the FP-Tree
B. Rász
Proc. Workshop Frequent Item Set Mining Implementations (FIMI 2004, Brighton, UK)
CEUR Workshop Proceedings 126, Aachen, Germany 2004

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.