Christian Borgelt's Web Pages

FP-growth - Frequent Item Set Mining

Download

fpgrowth (513 kb) GNU/Linux executable
fpgrowth.exe (337 kb) Windows console executable
fpgrowth.zip (284 kb) C sources, version 6.21 (2022.11.22)
fpgrowth.tar.gz (259 kb)
census.zip (382 kb) census data set (UCI ML repository)
census (2 kb) shell script used for the conversion

Description

FP-growth is a program to find frequent item sets (also closed and maximal item sets as well as generators) with the FP-growth algorithm (Frequent Pattern growth [Han et al. 2000]). The algorithm represents the transaction database as a prefix tree that is enhanced with links organizing the nodes into lists, with each list referring to the same item. The search is carried out by projecting the prefix tree, working recursively on the result, and pruning the original tree. The implementation also supports filtering for closed and maximal item sets with conditional item set repositories as suggested in [Grahne and Zhu 2003], although the approach used in the program differs insofar as it uses top-down prefix trees rather than FP-trees. It does not cover the clever implementation of FP-trees with two integer arrays as suggested in [Rasz 2004]. Since version 6.0, the program made available above can also be used to find association rules.
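The prefix tree with per-item node lists and the recursive projection can be illustrated with a minimal Python sketch. This is not the C implementation offered above: it rebuilds a fresh tree for every projection instead of pruning the original one, it finds only plain frequent item sets (no closed/maximal filtering, no generators), and all names in it (Node, build_tree, fp_growth, the toy transactions) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


class Node:
    """One prefix-tree node: an item, a counter, a parent link and children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build a prefix tree from (items, weight) pairs.

    Returns the frequent items with their support and, for each item,
    the list of all tree nodes that carry this item (the "links").
    """
    freq = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            freq[item] += weight
    freq = {i: c for i, c in freq.items() if c >= min_count}

    root = Node(None, None)
    links = defaultdict(list)
    for items, weight in weighted_transactions:
        # Keep only frequent items, most frequent first (ties: by item name).
        path = sorted((i for i in set(items) if i in freq),
                      key=lambda i: (-freq[i], i))
        node = root
        for item in path:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                links[item].append(child)
            child.count += weight
            node = child
    return freq, links


def fp_growth(weighted_transactions, min_count, suffix=frozenset(), found=None):
    """Return a dict mapping each frequent item set (frozenset) to its support."""
    if found is None:
        found = {}
    freq, links = build_tree(weighted_transactions, min_count)
    for item, count in freq.items():
        itemset = suffix | {item}
        found[itemset] = count
        # Projection: for every node carrying `item`, collect the items on
        # the path to the root, weighted with the counter of that node.
        projected = []
        for node in links[item]:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                projected.append((path, node.count))
        if projected:
            fp_growth(projected, min_count, itemset, found)
    return found


# Toy database (made up for this sketch); every transaction has weight 1.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c", "d"],
]
frequent = fp_growth([(t, 1) for t in transactions], min_count=3)
for itemset, support in sorted(frequent.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

With a minimum support of 3 transactions, the sketch reports the single items a, b and c (support 4 each) and the pairs {a, b}, {a, c} and {b, c} (support 3 each); the set {a, b, c} occurs in only two transactions and is therefore not reported.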

Note that versions of this program before 6.0 can only find frequent item sets, not association rules.

This implementation may also be used through the Python interface provided by the PyFIM library.
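A call through the Python interface might look like the following sketch. It assumes that PyFIM is installed and importable as the module fim, that its fpgrowth function accepts the keyword parameters target, supp, zmin and report, that a negative supp value is read as an absolute number of transactions, and that report='a' yields (item set, support) pairs. These details are stated from memory of the PyFIM documentation and should be checked against the installed version. The helper name mine_with_pyfim and the toy transactions are hypothetical.

```python
def mine_with_pyfim(tracts, min_count):
    """Mine frequent item sets with PyFIM, or return None if it is missing.

    Assumption: fim.fpgrowth returns a list of (items, support) pairs
    when called with report='a'.
    """
    try:
        from fim import fpgrowth  # provided by the PyFIM library
    except ImportError:
        return None
    patterns = fpgrowth(tracts, target="s", supp=-min_count, zmin=1,
                        report="a")
    return {frozenset(items): supp for items, supp in patterns}


# Toy database (made up for this sketch).
pyfim_transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c", "d"],
]
pyfim_result = mine_with_pyfim(pyfim_transactions, min_count=3)
if pyfim_result is None:
    print("PyFIM is not installed; nothing was mined.")
else:
    for itemset, support in pyfim_result.items():
        print(sorted(itemset), support)
```

The helper returns None instead of raising an exception when the library is absent, so the sketch can be run unchanged on a machine without PyFIM.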

Full description of the FP-growth program (included in the source package).

If you have trouble executing the program on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable for Visual Studio 2022 (see under "Other Tools and Frameworks") installed, as the program was compiled with Microsoft Visual Studio 2022.

Papers that describe this algorithm/implementation:

Some other references:

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.