Christian Borgelt's Web Pages

Apriori - Association Rule Induction / Frequent Item Set Mining

Download

apriori	(399 kb)	GNU/Linux executable
apriori.exe	(250 kb)	Windows console executable
apriacc	(395 kb)	GNU/Linux executable (accretion analog)
apriacc.exe	(246 kb)	Windows console executable (accretion analog)
apriori.zip	(459 kb)	C sources, version 6.31 (2022.11.22)
apriori.tar.gz	(441 kb)
census.zip	(382 kb)	census data set (UCI ML repository)
census	(2 kb)	shell script used for the conversion

Description

Apriori is a program to find association rules and frequent item sets (also closed and maximal item sets as well as generators) with the Apriori algorithm [Agrawal and Srikant 1994], which carries out a breadth-first search on the subset lattice and determines the support of item sets by subset tests. This implementation is fairly fast because it uses a prefix tree to organize the counters for the item sets. However, Apriori is outperformed on essentially all data sets by depth-first algorithms like Eclat or FP-growth.
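To make the breadth-first search concrete, here is a minimal Python sketch of the level-wise scheme: frequent item sets of size k-1 are joined into candidates of size k, candidates with an infrequent subset are discarded, and support is counted by subset tests against each transaction. It is an illustration only, not the code of this program: the function name apriori_item_sets and the toy transactions are hypothetical, and a plain dictionary stands in for the prefix tree used by the actual implementation.

```python
from itertools import combinations


def apriori_item_sets(transactions, min_support):
    """Return {frozenset: support} for all item sets whose absolute
    support (number of containing transactions) is >= min_support.

    Illustrative sketch only: counters live in a dict, not a prefix tree.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(level)

    k = 1
    while level:
        k += 1
        # Candidate generation: join two frequent (k-1)-sets into a k-set
        # and keep it only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting by subset tests against every transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        level = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(level)

    return frequent


# Hypothetical toy data, used only to exercise the sketch.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
for item_set, support in sorted(
    apriori_item_sets(toy, 3).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(item_set), support)
```

The inner loop over all candidates for every transaction is the part a prefix tree speeds up, since the tree lets a transaction be matched against many candidates with shared prefixes at once.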

At present, this program is useful only because it can generate association rules directly (all other programs available on this web site find only frequent item sets) and because it allows association rules as well as item sets to be evaluated with a wide range of different measures.
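As an illustration of what generating and evaluating rules from frequent item sets involves, the following Python sketch derives rules from a table of item set supports and scores them with two common measures, confidence and lift. The function name association_rules, the restriction to single-item consequents, and the support table are assumptions made for this example; they are not taken from the program's documentation, and the program's own set of evaluation measures is considerably larger.

```python
def association_rules(supports, n_transactions, min_confidence):
    """Derive rules body -> head from a {frozenset: absolute support} table.

    Simplifying assumption of this sketch: the consequent (head) is a
    single item. Returns a list of (body, head, support, confidence, lift).
    """
    rules = []
    for item_set, supp in supports.items():
        if len(item_set) < 2:
            continue  # a rule needs a non-empty body and a head
        for head in item_set:
            body = item_set - {head}
            body_supp = supports.get(body)
            head_supp = supports.get(frozenset([head]))
            if not body_supp or not head_supp:
                continue  # supports missing from the table: skip the rule
            # Confidence: fraction of body transactions that contain the head.
            confidence = supp / body_supp
            if confidence >= min_confidence:
                # Lift: confidence relative to the head's overall frequency.
                lift = confidence / (head_supp / n_transactions)
                rules.append((body, head, supp, confidence, lift))
    return rules


# Hypothetical support table for 5 transactions over items a, b, c.
example_supports = {
    frozenset("a"): 4, frozenset("b"): 4, frozenset("c"): 4,
    frozenset("ab"): 3, frozenset("ac"): 3, frozenset("bc"): 3,
}
for body, head, supp, conf, lift in association_rules(example_supports, 5, 0.7):
    print(sorted(body), "->", head, supp, round(conf, 3), round(lift, 3))
```

Because every subset of a frequent item set is itself frequent, the body and head supports needed for these measures are available in the table whenever it lists all frequent item sets.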

The special program version "apriacc" is designed to find neuronal assemblies in spike train data and can be seen as an analog of the implementation of the accretion algorithm, as it provides essentially the same set of options. However, the output is not identical, because accretion executes an incomplete search.

Full description of the Apriori program (included in the source package).

Pseudo-code of the original Apriori algorithm, which does not refer to a prefix tree.

If you have trouble executing the program on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable for Visual Studio 2022 (see under "Other Tools and Frameworks") installed, as the program was compiled with Microsoft Visual Studio 2022.

Earlier versions of this Apriori implementation are incorporated in the data mining tool Clementine, available from SPSS (Apriori version 1.8 in Clementine version 5.0, Apriori version 2.7 in Clementine version 7.0; last version shipped to SPSS is 4.30; nevertheless newer versions may be available in newer versions of Clementine).

A graphical user interface for this program (ARuleGUI), written in Java, is available here.

Another graphical user interface for this program, which is based on Gnome 2 and was developed by togaware, can be found here. Still another graphical user interface, which is based on wxWidgets and was developed by the STK++ team, can be found here. (However, I cannot guarantee that these GUIs work with the latest version of the command line program made available here.)

This program (possibly in an earlier version) is also accessible through the arules package of the statistical software package R. Furthermore, it can be used through the Python interface provided by the PyFIM library.

Papers that describe the Apriori algorithm and some implementation aspects of this program:

Frequent Item Set Mining
Christian Borgelt
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2(6):437-456.
J. Wiley & Sons, Chichester, United Kingdom 2012
doi:10.1002/widm.1074 wiley.com
(20 pages)

Recursion Pruning for the Apriori Algorithm
Christian Borgelt
2nd Workshop of Frequent Item Set Mining Implementations (FIMI 2004, Brighton, UK).
fimi_04.pdf (41 kb) fimi_04.ps.gz (29 kb) (2 pages)

Efficient Implementations of Apriori and Eclat
Christian Borgelt
Workshop of Frequent Item Set Mining Implementations (FIMI 2003, Melbourne, FL, USA).
fimi_03.pdf (304 kb) fimi_03.ps.gz (197 kb) (9 pages)

Induction of Association Rules: Apriori Implementation
Christian Borgelt and Rudolf Kruse
15th Conference on Computational Statistics (Compstat 2002, Berlin, Germany), 395-400
Physica Verlag, Heidelberg, Germany 2002
cstat_02.pdf (105 kb) cstat_02.ps.gz (91 kb) (6 pages)

Some other references:

Fast Algorithms for Mining Association Rules
R. Agrawal and R. Srikant
Proc. 20th Int. Conf. on Very Large Databases (VLDB 1994, Santiago de Chile), 487-499
Morgan Kaufmann, San Mateo, CA, USA 1994

Fast Discovery of Association Rules
R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo
In: U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds.
Advances in Knowledge Discovery and Data Mining, 307-328
AAAI Press / MIT Press, Cambridge, MA, USA 1996

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.