Christian Borgelt's Web Pages

PyFIM - Frequent Item Set Mining for Python

Download

32 bit              64 bit                 (32/64 bit only for shared object/dynamic module)
fim.so (766 kb)     fim.so (778 kb)        GNU/Linux Python 2.7.x shared object
fim.so (766 kb)     fim.so (778 kb)        GNU/Linux Python 3.5.x shared object
fim.pyd (322 kb)    fim.pyd (369 kb)       Windows Python 2.7.x dynamic module
fim.pyd (322 kb)    fim.pyd (369 kb)       Windows Python 3.5.x dynamic module
pyfim.zip (817 kb)  pyfim.tar.gz (757 kb)  C sources, version 6.28 (2017.03.24)

Description

PyFIM is an extension module that makes several frequent item set mining implementations available as functions in Python 2.7.x and 3.5.x. Currently apriori, eclat, fpgrowth, sam, relim, carpenter, ista, accretion and apriacc are available as functions, although the interfaces do not offer all of the options of the command line programs. (Note that lcm is available as an algorithm mode of eclat.) There is also a "generic" function fim, which is essentially the same function as fpgrowth, only with a simplified interface (fewer options). Finally, there is a function arules for generating association rules (with a simplified interface compared to apriori, eclat and fpgrowth, which can also be used to generate association rules).

How to use the functions can be seen in the example scripts testfim.py and testacc.py in the source package (directory pyfim/ex). From a Python script or command prompt interface, call help(fim), help(apriori) (or help(fim.apriori)), help(eclat) (or help(fim.eclat)) etc. or print, for example, apriori.__doc__, eclat.__doc__ etc. for a description of the functions and their arguments.
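The documentation lookup described above can also be scripted. The following is a minimal sketch and not part of PyFIM: the helper names doc_summaries and fim_available are made up for illustration, and the code assumes Python 3.4 or later (for importlib.util.find_spec). It reads the first docstring line of each named function and only imports fim if the module can actually be found.

```python
import importlib
import importlib.util


def doc_summaries(module, names):
    """Map each name to the first line of its docstring (hypothetical helper)."""
    summaries = {}
    for name in names:
        obj = getattr(module, name, None)
        if obj is None:
            # The module does not provide this name at all.
            summaries[name] = ""
            continue
        doc = getattr(obj, "__doc__", None) or ""
        lines = doc.strip().splitlines()
        summaries[name] = lines[0] if lines else ""
    return summaries


def fim_available():
    """Return True if a module named fim can be found on the import path."""
    return importlib.util.find_spec("fim") is not None


if __name__ == "__main__":
    if fim_available():
        fim = importlib.import_module("fim")
        names = ["apriori", "eclat", "fpgrowth"]
        for name, line in doc_summaries(fim, names).items():
            print(name, "-", line)
    else:
        print("fim not found; check that fim.so / fim.pyd is on your PYTHONPATH")
```

For the full description of a function and its arguments, help(fim.apriori) or print(fim.apriori.__doc__) remains the more complete route.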

This extension module was originally developed for Python 2.7. The shared objects made available above were compiled particularly for Python 2.7.11 and Python 3.5.1 on Ubuntu 16.04 LTS and the dynamic modules made available above were compiled for Python 2.7.10 and Python 3.5.1 on Windows 10.

Installation: precompiled version

If you have a GNU/Linux system (Ubuntu 12.04 or later preferred), you can use this extension module by simply downloading the shared object made available above (choose the version depending on your system and your installed Python version) and storing it in a directory that is on your PYTHONPATH (environment variable). This should also work on other GNU/Linux distributions, particularly Debian based ones. A typical (local) installation directory is $HOME/lib/ while typical (global) installation directories are /usr/local/lib/python2.7/site-packages/ or /usr/local/lib/python3.5/site-packages/. (Note that you may need root rights to copy into the latter directories.) Typical (local) installation directories for the Anaconda Python distribution are $HOME/anaconda/lib/python27/site-packages/ or $HOME/anaconda/lib/python35/site-packages/.

If you have a Windows system, downloading the Python dynamic module made available above (choose the version depending on your system and your installed Python version) and placing it into the extension module directory of your Python installation should work. Consult the manual of your Python installation to find the correct directory. Typical directories for (global) installation are C:\Program Files\Python27\Lib\site-packages\ or C:\Program Files\Python35\Lib\site-packages\. (Note that you may need administrator rights to copy into these directories.) Typical installation directories for the Anaconda Python distribution are C:\Anaconda2\Lib\site-packages\ or C:\Anaconda3\Lib\site-packages\.
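To check whether the downloaded file ended up in a directory that Python actually searches, a small script can scan the module search path. This is only a sketch: the function name find_module_files is hypothetical, and it assumes the file kept its original name (fim.so on GNU/Linux, fim.pyd on Windows).

```python
import os
import sys


def find_module_files(name="fim", search_path=None, suffixes=(".so", ".pyd")):
    """Return the paths of all files <name><suffix> found in the search path."""
    if search_path is None:
        # sys.path already includes the directories listed in PYTHONPATH.
        search_path = sys.path
    hits = []
    for directory in search_path:
        # An empty entry in sys.path stands for the current directory.
        directory = directory or os.getcwd()
        for suffix in suffixes:
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    print("PYTHONPATH =", os.environ.get("PYTHONPATH", "(not set)"))
    for path in find_module_files():
        print("found:", path)
```

If nothing is reported, the directory holding the module is probably not on the search path yet.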

If you run into trouble on Microsoft Windows, check whether you have the Microsoft Visual C++ Redistributable Packages for Visual Studio 2015 installed, as these libraries were compiled with Microsoft Visual Studio 2015 (Community Edition).

Installation: using distutils

Another way to install the extension module for your system is to use the Python script setup_fim.py (in the source package), which uses Python's distutils package to build and install the module.

On a GNU/Linux system call the script with

./setup_fim.py install

in a terminal window to build and install the extension module. If you get a "Permission denied" error message, check whether the file setup_fim.py is marked as executable. If it is not, add the executable flag with the command

chmod +x setup_fim.py

Alternatively, call the script explicitly through Python:

python setup_fim.py install

On a Microsoft Windows system call the script with

python setup_fim.py install

in a command prompt window to build and install the extension module. Note, however, that this direct call to Python is possible on Microsoft Windows only if the directory, in which the program python.exe resides, is contained in your PATH variable (environment variable, check its contents at a command prompt with echo %PATH%). Otherwise you may have to specify the full path to the Python program. A typical form of the command for this case is

"C:\Program Files\Python-2.7.10\python.exe" setup_fim.py install

In addition, building the module requires a C compiler. On a GNU/Linux system Python uses the system C compiler, which is usually the GNU C compiler gcc. This compiler is essentially part of the system and thus basically always available. One may only have to install the Python development files (package python-dev or python3-dev for Debian based GNU/Linux distributions).

On a Windows system Python commonly uses Microsoft Visual Studio C/C++, which therefore needs to be installed. Note that the Community Edition of this C compiler can be obtained (perfectly legally) free of charge. Although Python's distutils package works directly with Microsoft Visual Studio 2008 (or 9.0), it is not recommended to use this compiler, because it does not include the header stdint.h, which the sources need (hardly an unusual requirement, as it is part of the C99 standard). If you want to use Microsoft Visual Studio 2008 nevertheless, you can download the missing stdint.h here: stdint.h.

Rather it is recommended to use Microsoft Visual Studio 2010 (or 10.0), 2012 (or 11.0), 2013 (or 12.0) or 2015 (or 14.0). However, to make these work, an environment variable needs to be set/changed beforehand. Consult the files vs2010.bat, vs2012.bat, vs2013.bat and vs2015.bat (included in the source packages) for this. Executing vs2010.bat sets the environment variable for Microsoft Visual Studio 2010 (or 10.0), vs2012.bat for Microsoft Visual Studio 2012 (or 11.0), vs2013.bat for Microsoft Visual Studio 2013 (or 12.0) and vs2015.bat for Microsoft Visual Studio 2015 (or 14.0).

Note that on a 64 bit Microsoft Windows system some complications may arise. The reason is that on a 64 bit system the Microsoft Visual Studio Compiler (which is a 32 bit program, even on 64 bit systems, at least in the Community Editions up to Visual Studio 2015/14.0) comes in two versions (both of which are usually installed): a so-called "native" version, which produces 32 bit programs ("native", because a 32 bit program creates a 32 bit program), and a so-called "cross" version, which produces 64 bit programs ("cross", because a 32 bit program creates a 64 bit program). As it seems, executing the above installation commands (that is, vs201[0235].bat followed by python setup_fim.py install) in a standard command prompt window invokes the "native" version of the Visual Studio Compiler, which will fail if the installed Python version is 64 bit. The reason is simply that the compiler, being "native", searches for the 32 bit Python interface, which it cannot find if the installed Python version is 64 bit.

However, there may also be other problems that can arise if the installation commands are used in a standard command prompt window, because Python is configured to work with Visual Studio 2008 (or 9.0). To avoid such problems, it is advisable to use one of the command prompt windows that come with the installation of Microsoft Visual Studio (to be found under Programs > Microsoft VisualStudio). For example, the Community Edition of Microsoft Visual Studio 2015 (or 14.0) has the two command prompt windows "VS2015 x86 Native Tools Command Prompt", which refers to the "native" version of the compiler (producing 32 bit output), and "VS2015 x64 Cross Tools Command Prompt", which refers to the "cross" version of the compiler (producing 64 bit output). Use the former if you have a 32 bit Python installed (which is necessarily the case if you are on a 32 bit system), and the latter if you have a 64 bit Python installed. Open the command prompt window and change directory (using the cd command) to the directory where vs201[0235].bat and setup_fim.py reside. Then execute vs201[0235].bat first (using the batch file whose name matches the name of the command prompt window, which also contains the Microsoft Visual Studio year/version) and python setup_fim.py install afterward.
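If you are unsure whether your installed Python is a 32 bit or a 64 bit build, the interpreter can report this itself. The sketch below is an illustration only: the function names are hypothetical, and the returned strings simply repeat the Visual Studio 2015 prompt names quoted above without the "VS2015" prefix.

```python
import struct
import sys


def python_bits():
    """Return the pointer width (32 or 64) of the running interpreter."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


def suggested_prompt(bits):
    """Map the interpreter bitness to the compiler variant named in the text."""
    if bits == 32:
        return "x86 Native Tools Command Prompt"
    if bits == 64:
        return "x64 Cross Tools Command Prompt"
    raise ValueError("unexpected pointer width: %r" % (bits,))


if __name__ == "__main__":
    bits = python_bits()
    print("Python %d.%d, %d bit" % (sys.version_info[0], sys.version_info[1], bits))
    print("use:", suggested_prompt(bits))
```

The pointer width describes the Python build, not the operating system, and it is the build that decides between the "native" and the "cross" compiler.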

Note generally (for GNU/Linux as well as for Microsoft Windows) that installing this extension module for all users may require root/administrator rights in order to copy the shared object/Python dynamic module to the standard extension module directory. Local installations (for individual users) are also possible. Consult the help provided by Python's distutils package by calling the setup script with

python setup_fim.py --help install

to get information about the installation options.

Installation: recompilation with makefiles

On GNU/Linux (provided the Python development files are installed – package python-dev on Debian based distributions), you may also install the extension module (for Python 2.7) by simply calling

make all

in the source directory pyfim/src and copying the resulting shared object fim.so to a directory that is on your PYTHONPATH (environment variable).

For Python 3.5.x (which requires the package python3-dev to be installed on Debian based distributions), simply call

make py3

in the source directory pyfim/src and copy the resulting shared object fim.so to a directory that is on your PYTHONPATH (environment variable).

On Windows you may also install the extension module (for Python 2.7.10) by simply calling

nmake /f pyfim.mak all

in a command prompt of Microsoft Visual Studio C/C++ in the source directory pyfim/src (make sure to choose the "native" or the "cross" command prompt depending on your Python version, see above, "Installation: using distutils") and copying the resulting dynamic module fim.pyd to the extension module directory of your Python installation.

For Python 3.5.x, simply call

nmake /f pyfim.mak py3

in a command prompt of Microsoft Visual Studio C/C++ in the source directory pyfim/src (make sure to choose the "native" or the "cross" command prompt depending on your Python version, see above, "Installation: using distutils") and copy the resulting dynamic module fim.pyd to the extension module directory of your Python installation.

Should the compilation fail, check the definition of the variables PY2DIR and/or PY3DIR in the files makefile (GNU/Linux) or pyfim.mak (Windows).
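When adjusting these variables, it can help to ask the target Python where its header files and libraries live. The following sketch uses only the standard sysconfig module. It is an assumption, not something stated by the makefiles, that PY2DIR and PY3DIR relate to these locations, so compare the output with the definitions in the makefile before changing anything. The function name python_build_dirs is hypothetical.

```python
import sysconfig


def python_build_dirs():
    """Return the header directory and (if known) library directory of this Python."""
    return {
        # Directory that should contain Python.h if the development files are installed.
        "include": sysconfig.get_paths()["include"],
        # Library directory from the build configuration; may be None, e.g. on Windows.
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


if __name__ == "__main__":
    for key, value in python_build_dirs().items():
        print("%-8s %s" % (key, value))
```

Run the script with the same interpreter that the module is being built for (for example python3 rather than python), since each installation reports its own directories.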

If you are using the Anaconda Python distribution, you may use the special makefile pyfim_conda.mak, which is configured for Anaconda 1.8.0 installed in the default path. If you have a different version or installed to a non-standard path, you may have to adapt the definitions of CONDAINC and CONDALIB in pyfim_conda.mak.

References

An overview of frequent item set mining in general and several specific algorithms can be found in the following paper:

More information about frequent item set mining, implementations of other algorithms as well as test data sets can be found at the Frequent Itemset Mining Implementations Repository.