Christian Borgelt's Web Pages

Apriori - Association Rule Induction / Frequent Item Set Mining

Download

32 bit                 64 bit                    (32/64 bit only for executables)
apriori (290 kb)       apriori (325 kb)          GNU/Linux executable
apriori.exe (176 kb)   apriori.exe (203 kb)      Windows console executable
apriacc (286 kb)       apriacc (321 kb)          GNU/Linux executable (accretion analog)
apriacc.exe (170 kb)   apriacc.exe (197 kb)      Windows console executable
apriori.zip (352 kb)   apriori.tar.gz (332 kb)   C sources, version 6.12, 2014.08.28
census.zip (390 kb)                              census data set (UCI ML repository)
census (1 kb)                                    shell script used for the conversion

Description

Apriori is a program to find association rules and frequent item sets (also closed and maximal item sets as well as generators) with the Apriori algorithm [Agrawal and Srikant 1994], which carries out a breadth-first search on the subset lattice and determines the support of item sets by subset tests. This implementation is fairly fast, because it uses a prefix tree to organize the counters for the item sets. Nevertheless, Apriori is outperformed on basically all data sets by depth-first algorithms like Eclat or FP-growth.
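To make the breadth-first search concrete, here is a minimal Python sketch of the level-wise scheme: candidates with k items are built from the frequent item sets with k-1 items, and their support is counted by subset tests against every transaction. It is an illustration only, not the C implementation offered here: it uses plain dictionaries instead of a prefix tree, and the function name `apriori` and its parameters are chosen for this example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: absolute support} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(level)

    k = 2
    while level:
        # Join step: unite pairs of frequent (k-1)-sets that give a k-set.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in level
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Support counting by subset tests against each transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        level = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(level)
        k += 1
    return frequent
```

For example, calling `apriori([("a", "b", "c"), ("a", "b"), ("a", "c"), ("b",), ("a", "b", "c")], 3)` never counts the set {a, b, c}, because its subset {b, c} is already infrequent at the previous level.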

This program currently remains useful only because it can generate association rules directly (all other programs available on this web site find only frequent item sets) and because it allows association rules as well as item sets to be evaluated with a wide range of different measures.
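As a rough illustration of what generating and evaluating rules involves, the following Python sketch derives rules from a dictionary of frequent item sets and scores each rule with confidence and lift. These two measures are picked only as familiar examples and are not meant to reflect the list of measures the program supports. The function name `generate_rules`, its parameters, and the restriction to rules with a single item in the consequent are assumptions made for this sketch.

```python
def generate_rules(frequent, n_transactions, min_confidence):
    """Derive rules (antecedent, consequent, confidence, lift).

    `frequent` maps frozensets to absolute supports and must contain
    every subset of each item set it lists (as Apriori output does).
    """
    rules = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent
        for item in sorted(itemset):
            consequent = frozenset([item])
            antecedent = itemset - consequent
            # Confidence: fraction of transactions containing the
            # antecedent that also contain the consequent.
            confidence = supp / frequent[antecedent]
            if confidence < min_confidence:
                continue
            # Lift: confidence relative to the prior of the consequent.
            prior = frequent[consequent] / n_transactions
            rules.append((antecedent, consequent,
                          confidence, confidence / prior))
    return rules
```

With supports a: 4, b: 4, c: 3, {a, b}: 3, {a, c}: 3 over 5 transactions and a minimum confidence of 0.8, only the rule c -> a passes the threshold; the other candidate rules have confidence 0.75.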

The special program version "apriacc" is designed for finding neuronal assemblies in spike train data and can be seen as an analog of the implementation of the accretion algorithm, as it provides basically the same set of options. However, the output is not identical, because accretion performs an incomplete search.

Full description of the Apriori program (included in the source package).

Pseudo-code of the original Apriori algorithm, which does not refer to a prefix tree.
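Since the pseudo-code does not refer to a prefix tree, here is a small Python sketch of how counters for candidate item sets could be organized in one. It shows the general idea only and is not taken from the C sources; the names `Node`, `build_tree`, `count_transaction` and `support_counts` are hypothetical, and all candidates are assumed to have the same number k of items.

```python
class Node:
    """Prefix tree node: a counter plus children keyed by item."""
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0
        self.children = {}

def build_tree(candidates):
    """Insert each candidate as a path of its items in sorted order."""
    root = Node()
    for cand in candidates:
        node = root
        for item in sorted(cand):
            node = node.children.setdefault(item, Node())
    return root

def count_transaction(node, items, start, depth, k):
    """Increment the counter of every k-item candidate in `items`.

    `items` must be sorted; `depth` is the depth of `node` in the tree.
    """
    for i in range(start, len(items)):
        child = node.children.get(items[i])
        if child is None:
            continue  # no candidate continues with this item
        if depth + 1 == k:
            child.count += 1  # reached a complete candidate
        else:
            count_transaction(child, items, i + 1, depth + 1, k)

def support_counts(candidates, transactions, k):
    """Return {frozenset: support} for k-item candidates."""
    root = build_tree(candidates)
    for t in transactions:
        count_transaction(root, sorted(set(t)), 0, 0, k)
    result = {}
    for cand in candidates:
        node = root
        for item in sorted(cand):
            node = node.children[item]
        result[frozenset(cand)] = node.count
    return result
```

Candidates that share a prefix share the corresponding tree nodes, so a transaction is matched against all of them in a single descent rather than by one subset test per candidate.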

If you have trouble executing the program on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable Packages for Visual Studio 2013 installed, as the C program was compiled with Microsoft Visual Studio 2013.

Earlier versions of this Apriori implementation are incorporated in the data mining tool Clementine, available from SPSS (Apriori version 1.8 in Clementine version 5.0, Apriori version 2.7 in Clementine version 7.0; last version shipped to SPSS is 4.30; nevertheless newer versions may be available in newer versions of Clementine).

A graphical user interface for this program (ARuleGUI), written in Java, is available here.

Another graphical user interface for this program, which is based on Gnome 2 and was developed by togaware, can be found here. Still another graphical user interface, which is based on wxWidgets and was developed by the STK++ team, can be found here. (However, I cannot guarantee that these GUIs work with the latest version of the command line program made available here.)

This program (possibly in an earlier version) is also accessible through the arules package of the statistical software package R. Furthermore, it can be used through the Python interface provided by the PyFIM library.

Papers that describe the Apriori algorithm and some implementation aspects of this program:

Some other references:

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.