Christian Borgelt's Web Pages

IsTa - Closed Frequent Item Set Mining

Download

ista Linux executable (121 kb)
ista.exe Windows console executable (121 kb)
ista.zip C sources, version 2.33, 2012.01.20 (147 kb)
ista.tar.gz (124 kb)
census.zip census data set (from the UCI ML repository) (390 kb)
census shell script used for the conversion (1 kb)

Description

IsTa (Intersecting Transactions) is a program that finds closed frequent item sets by intersecting transactions. It is based on the insight that an item set is closed if it is the intersection of all transactions that contain it. Such an approach can be highly competitive in special cases, namely if there are few transactions and (very) many items, which is a common situation in biological data sets such as gene expression data. For other data sets, however, this approach is not recommended.
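The closedness criterion can be illustrated with a small Python sketch. This is not the code of the ista program (which is written in C); the function names `closure`, `is_closed` and `closed_frequent_sets`, as well as the toy transactions, are assumptions made up for this illustration. The mining function uses a naive cumulative intersection scheme with a plain support count at the end, which is only meant to show the idea on tiny inputs.

```python
def closure(itemset, transactions):
    """Intersect all transactions that contain itemset (None if none do)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset.intersection(*covering)


def is_closed(itemset, transactions):
    """An item set is closed if it equals the intersection of its covering transactions."""
    return closure(itemset, transactions) == itemset


def closed_frequent_sets(transactions, min_support):
    """Naive sketch: collect all non-empty intersections of transactions,
    then keep those contained in at least min_support transactions."""
    candidates = set()
    for t in transactions:
        new = {t}
        for c in candidates:
            inter = c & t
            if inter:               # ignore the empty item set
                new.add(inter)
        candidates |= new
    result = {}
    for c in candidates:
        support = sum(1 for t in transactions if c <= t)
        if support >= min_support:
            result[c] = support
    return result


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [frozenset("abc"), frozenset("abd"),
                frozenset("ac"), frozenset("bd")]

for items, support in sorted(closed_frequent_sets(transactions, 2).items(),
                             key=lambda kv: sorted(kv[0])):
    print("".join(sorted(items)), support)
```

In this toy data set the item set {c} is not closed, because every transaction that contains c also contains a, so the intersection of those transactions is {a, c}.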

The program can also find maximal item sets. However, it does so by filtering the closed item sets, which is a quick hack and not particularly efficient.
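Why filtering works can be sketched in a few lines of Python: every maximal frequent item set is closed, and a closed frequent item set is maximal exactly if no other closed frequent item set is a proper superset of it. The function name `maximal_sets` and the sample input are assumptions for illustration only; this quadratic check is not necessarily how the ista program performs the filtering.

```python
def maximal_sets(closed_frequent):
    """Keep the closed frequent item sets that have no proper superset
    among the closed frequent item sets (simple quadratic check)."""
    sets = list(closed_frequent)
    return [s for s in sets if not any(s < other for other in sets)]


# Hypothetical input: closed frequent item sets of a toy data set.
closed_frequent = [frozenset("a"), frozenset("b"), frozenset("ab"),
                   frozenset("ac"), frozenset("bd")]

for items in maximal_sets(closed_frequent):
    print("".join(sorted(items)))
```

Here {a} and {b} are dropped because {a, b} (among others) is a proper superset of each of them.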

The algorithm used in this program is described in the following paper:

A reference to a closely related approach:

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.