Christian Borgelt's Web Pages

Apriori - Association Rule Induction / Frequent Item Set Mining

Download

apriori Linux executable (138 kb)
apriori.exe Windows console executable (147 kb)
apriacc Linux executable (accretion analog) (134 kb)
apriacc.exe Windows console executable (147 kb)
apriori.zip C sources, version 5.67, 2012.01.23 (306 kb)
apriori.tar.gz (289 kb)
census.zip census data set (from the UCI ML repository) (390 kb)
census shell script used for the conversion (1 kb)

Description

A program to find association rules and frequent item sets (also closed and maximal item sets as well as generators) with the Apriori algorithm (Agrawal et al. 1993), which carries out a breadth-first search on the subset lattice and determines the support of item sets by subset tests. This is a fairly fast implementation that uses a prefix tree to organize the counters for the item sets. However, Apriori is outperformed on nearly all data sets by depth-first algorithms like Eclat or FP-growth.
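To illustrate the breadth-first search and the support counting by subset tests, here is a minimal Python sketch. It is not the code of the apriori program, which is written in C and organizes its counters in a prefix tree. The function name `apriori`, the parameter `min_support` (an absolute transaction count) and the plain dictionary of counters are assumptions made for this illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset of items: support count} for all frequent item sets.

    Illustrative sketch only: counters live in a plain dict, not in a
    prefix tree, and min_support is an absolute number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(level)

    k = 2
    while level:
        # Candidate generation: join frequent (k-1)-sets into k-sets and
        # keep only those whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting: one subset test per candidate and transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        level = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(level)
        k += 1  # move on to the next level of the subset lattice

    return frequent
```

Each pass of the `while` loop handles one level of the subset lattice, so all item sets of size k are counted before any set of size k+1 is generated. The inner double loop over candidates and transactions is the part that a prefix tree of counters would speed up.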

The special program version "apriacc" is designed to find neuronal assemblies in spike train data. It can be seen as an analog of the implementation of the accretion algorithm, since it provides basically the same set of options.

Full description of the apriori program (included in the source package).

Earlier versions of this program are incorporated in the data mining tool Clementine, available from SPSS (Apriori version 1.8 in Clementine version 5.0, Apriori version 2.7 in Clementine version 7.0). The last version shipped to SPSS is 4.30; nevertheless, newer versions may be available in newer versions of Clementine.

A graphical user interface for this program (ARView), written in Java, is available here.

Another graphical user interface for this program, which is based on Gnome 2 and was developed by togaware, can be found here. Another graphical user interface, which is based on wxWidgets and was developed by the STK++ team, can be found here. (However, I cannot guarantee that these GUIs work with the latest version of the command line program made available here.)

This program (possibly in an earlier version) is also accessible through the arules package of the statistical software package R.

Papers that describe some implementation aspects of this program:

Some other references:

More information about frequent item set mining, implementations of other algorithms, and test data sets can be found at the Frequent Itemset Mining Implementations Repository.