Package fim

Class SeqMine

java.lang.Object
fim.SeqMine
All Implemented Interfaces:
Runnable

public class SeqMine extends Object implements Runnable
Class for mining (all/closed/maximal) frequent sequences.
Since:
2017.06.26
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected boolean
    aborted
    the flag whether a mining run has been aborted
    static final int
    CLOSED
    target pattern subtype: closed frequent item patterns; to be combined with SEQUENCE
    static final int
    CMCHAIN
    operation mode: use a chain of prefix trees to filter for closed/maximal patterns
    static final int
    CMEXTS
    operation mode: check for closed/maximal patterns via extensions
    static final int
    CMMASK
    mask for operation mode flags concerning closed/maximal filtering
    static final int
    CMTREE
    operation mode: use a single prefix tree to filter for closed/maximal patterns
    static final int
    CMTREEMASK
    mask for operation mode flags concerning closed/maximal filtering with a prefix tree repository (single tree or tree chain)
    static final String
    the copyright information for this program
    static final int
    DEFAULT
    operation mode: default setting
    static final String
    DESCRIPTION
    the program description
    static final double
    EPSILON
    the difference between 1.0 and the smallest number greater than 1.0 that is representable as a double precision number; used for handling rounding errors
    static final int
    FREQUENT
    target pattern subtype: simple frequent item patterns; to be combined with SEQUENCE
    protected util.IdMap
    ibase
    the underlying item base
    protected int
    imode
    the operation mode (DEFAULT or REDUCE, set by initMining())
    protected double
    ismin
    the minimum support of an item pattern (set by initMining())
    protected int
    itarget
    the target pattern type and subtype of the search (ITEMSET or SEQUENCE as the main target pattern type and FREQUENT, CLOSED or MAXIMAL as the target pattern subtype)
    static final int
    ITEMSET
    target pattern type: item sets (item order is ignored)
    protected int
    iumax
    the number of times an item may be used in a pattern (set by initMining())
    protected int
    izmax
    the maximum length of an item pattern (number of items, set by initMining())
    protected int
    izmin
    the minimum length of an item pattern (number of items, set by initMining())
    static final int
    MAXIMAL
    target pattern subtype: maximal frequent item patterns; to be combined with SEQUENCE
    protected int
    mode
    the search/operation mode
    static final int
    NONE
    operation mode: no special operation, identical to DEFAULT
    protected PatternSet
    pats
    the result set of frequent item patterns (item sequences)
    static final int
    PREPRUNE
    operation mode: force pre-check pruning for closed/maximal mining (attention: may lead to wrong results!)
    static final int
    REDUCE
    operation mode: reduce transaction suffix lists
    protected int
    sbase
    the base support (support of empty sequence/database size)
    static final int
    SEQUENCE
    target pattern type: item sequence
    protected int
    smin
    the minimum support of an item pattern (item sequence)
    static final int
    SUBTYPEMASK
    target pattern subtype mask; to extract the target pattern subtype, that is, FREQUENT, CLOSED or MAXIMAL
    protected TrActBag
    tabag
    the transactions to mine in a threaded mining run
    protected int
    target
    the target pattern type and subtype of the search
    static final int
    TYPEMASK
    target pattern type mask; to extract the main target pattern type, that is, ITEMSET or SEQUENCE
    static final String
    VERSION
    the version of this program
    protected int
    zmax
    the maximum length of an item pattern (number of items)
    protected int
    zmaxx
    the maximum length of an item pattern (number of items) that needs to be checked (zmax+1 if a closed/maximal filter is to be used, so that extensions are being checked, otherwise equal to zmax)
    protected int
    zmin
    the minimum length of an item pattern (number of items)
  • Constructor Summary

    Constructors
    Constructor
    Description
    SeqMine()
    Create a miner for item patterns (frequent item sequences).
  • Method Summary

    Modifier and Type
    Method
    Description
    final void
    abort()
    Abort a mining run.
    final void
    clear()
    Clear results of mining run.
    final PatternSet
    getResult()
    Get result of a sequence mining run.
    final Thread
    getThread()
    Get the thread that was started last (if any).
    final void
    initMining(TrActBag tabag, int target, double smin, int zmin, int zmax, int umax, int mode)
    Initialize mining frequent item patterns (sequences) in a thread.
    static void
    main(String[] args)
    Main function for command line use.
    final PatternSet
    mine(TrActBag tabag, int target, double smin, int zmin, int zmax, int umax, int mode)
    Find frequent item patterns (sequences).
    final PatternSet
    mineSeq(TrActBag tabag, int target, double smin, int zmin, int zmax, int umax, int mode)
    Find frequent item sequences.
    final void
    run()
    Run mining (which must have been initialized with initMining()).
    final Thread
    runAsThread()
    Run mining as a thread (must have been initialized with initMining()).

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • DESCRIPTION

      public static final String DESCRIPTION
      the program description
    • VERSION

      public static final String VERSION
      the version of this program
    • TYPEMASK

      public static final int TYPEMASK
      target pattern type mask; to extract the main target pattern type, that is, ITEMSET or SEQUENCE
    • SUBTYPEMASK

      public static final int SUBTYPEMASK
      target pattern subtype mask; to extract the target pattern subtype, that is, FREQUENT, CLOSED or MAXIMAL
    • ITEMSET

      public static final int ITEMSET
      target pattern type: item sets (item order is ignored)
    • SEQUENCE

      public static final int SEQUENCE
      target pattern type: item sequence
    • FREQUENT

      public static final int FREQUENT
      target pattern subtype: simple frequent item patterns; to be combined with SEQUENCE
    • CLOSED

      public static final int CLOSED
      target pattern subtype: closed frequent item patterns; to be combined with SEQUENCE
    • MAXIMAL

      public static final int MAXIMAL
      target pattern subtype: maximal frequent item patterns; to be combined with SEQUENCE
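The type and subtype constants above are described as being combined into a single target value, with TYPEMASK and SUBTYPEMASK used to pull the two parts back out. The following is a minimal sketch of that idea. The class TargetFlags, its helper methods, the numeric bit values and the use of bitwise OR for combining are assumptions made for illustration; the actual constant values of SeqMine are not given here, so real code should refer to the SeqMine constants by name.

```java
final class TargetFlags {
    // Hypothetical bit values, chosen only so that the masks work in this sketch.
    static final int FREQUENT    = 0x00;
    static final int CLOSED      = 0x01;
    static final int MAXIMAL     = 0x02;
    static final int SUBTYPEMASK = 0x03;
    static final int ITEMSET     = 0x00;
    static final int SEQUENCE    = 0x10;
    static final int TYPEMASK    = 0x30;

    /** Extract the main pattern type (ITEMSET or SEQUENCE). */
    static int type(int target) { return target & TYPEMASK; }

    /** Extract the pattern subtype (FREQUENT, CLOSED or MAXIMAL). */
    static int subtype(int target) { return target & SUBTYPEMASK; }

    /** Render a combined target value as text. */
    static String describe(int target) {
        String sub;
        switch (subtype(target)) {
            case CLOSED:  sub = "closed";   break;
            case MAXIMAL: sub = "maximal";  break;
            default:      sub = "frequent"; break;
        }
        return sub + (type(target) == SEQUENCE ? " sequence" : " item set");
    }
}
```

With these assumed values, a target for closed sequences would be written as SEQUENCE | CLOSED, and describe() turns it back into readable text.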
    • DEFAULT

      public static final int DEFAULT
      operation mode: default setting
    • NONE

      public static final int NONE
      operation mode: no special operation, identical to DEFAULT
    • REDUCE

      public static final int REDUCE
      operation mode: reduce transaction suffix lists
    • CMCHAIN

      public static final int CMCHAIN
      operation mode: use a chain of prefix trees to filter for closed/maximal patterns
    • CMTREE

      public static final int CMTREE
      operation mode: use a single prefix tree to filter for closed/maximal patterns
    • CMEXTS

      public static final int CMEXTS
      operation mode: check for closed/maximal patterns via extensions
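To illustrate what checking for closed/maximal patterns via extensions can mean, here is a small sketch that works on an already computed collection of frequent sequences. It is not the SeqMine implementation. The class ExtensionCheck, its methods, the sample data and the choice of gap-tolerant subsequence matching are all assumptions. A pattern is treated as closed if no one-item extension has the same support, and as maximal if no one-item extension is frequent at all.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ExtensionCheck {
    /** True if sub occurs in seq in the same order (gaps allowed; an assumption). */
    static boolean isSubsequence(List<String> sub, List<String> seq) {
        int i = 0;
        for (String item : seq)
            if (i < sub.size() && sub.get(i).equals(item)) i++;
        return i == sub.size();
    }

    /** Look for a frequent one-item extension of p, optionally with equal support. */
    static boolean hasExtension(List<String> p, Map<List<String>, Integer> frequent,
                                boolean sameSupportOnly) {
        int supp = frequent.get(p);
        for (Map.Entry<List<String>, Integer> e : frequent.entrySet()) {
            List<String> q = e.getKey();
            if (q.size() != p.size() + 1 || !isSubsequence(p, q)) continue;
            if (!sameSupportOnly || e.getValue() == supp) return true;
        }
        return false;
    }

    static boolean isClosed(List<String> p, Map<List<String>, Integer> frequent) {
        return !hasExtension(p, frequent, true);
    }

    static boolean isMaximal(List<String> p, Map<List<String>, Integer> frequent) {
        return !hasExtension(p, frequent, false);
    }

    /** Made-up frequent sequences with their supports (minimum support 2). */
    static Map<List<String>, Integer> sample() {
        Map<List<String>, Integer> f = new HashMap<>();
        f.put(Arrays.asList("a"), 3);
        f.put(Arrays.asList("b"), 3);
        f.put(Arrays.asList("c"), 2);
        f.put(Arrays.asList("a", "b"), 3);
        f.put(Arrays.asList("a", "c"), 2);
        return f;
    }
}
```

In the sample data the sequence (a) is neither closed nor maximal, because its extension (a, b) is frequent and has the same support, while (a, b) and (a, c) are both closed and maximal.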
    • CMTREEMASK

      public static final int CMTREEMASK
      mask for operation mode flags concerning closed/maximal filtering with a prefix tree repository (single tree or tree chain)
    • CMMASK

      public static final int CMMASK
      mask for operation mode flags concerning closed/maximal filtering
    • PREPRUNE

      public static final int PREPRUNE
      operation mode: force pre-check pruning for closed/maximal mining (attention: may lead to wrong results!)
    • EPSILON

      public static final double EPSILON
      the difference between 1.0 and the smallest number greater than 1.0 that is representable as a double precision number; used for handling rounding errors

      Note that this value differs from what some other languages call double.EPSILON (the Java counterpart is Double.MIN_VALUE), which is the smallest positive number that is representable as a double precision number.

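The distinction made for EPSILON can be checked with standard Java alone. The sketch below uses Math.ulp(1.0), which by definition is the gap between 1.0 and the next larger double, and contrasts it with Double.MIN_VALUE. The class EpsilonDemo and its helper are hypothetical and only mirror the description above; they are not part of the fim package.

```java
final class EpsilonDemo {
    /** Gap between 1.0 and the next larger double, i.e. 2^-52. */
    static final double EPSILON = Math.ulp(1.0);

    /** True if adding d to 1.0 yields a value greater than 1.0. */
    static boolean changesOne(double d) {
        return 1.0 + d > 1.0;
    }

    public static void main(String[] args) {
        System.out.println(EPSILON);                      // machine epsilon at 1.0
        System.out.println(Double.MIN_VALUE);             // smallest positive double
        System.out.println(changesOne(EPSILON));          // large enough to change 1.0
        System.out.println(changesOne(Double.MIN_VALUE)); // far too small to change 1.0
    }
}
```

The machine epsilon is the relevant quantity when compensating for rounding errors near 1.0, whereas Double.MIN_VALUE vanishes completely when added to 1.0.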
    • tabag

      protected TrActBag tabag
      the transactions to mine in a threaded mining run
    • itarget

      protected int itarget
      the target pattern type and subtype of the search (ITEMSET or SEQUENCE as the main target pattern type and FREQUENT, CLOSED or MAXIMAL as the target pattern subtype)
    • ismin

      protected double ismin
      the minimum support of an item pattern (set by initMining())
    • izmin

      protected int izmin
      the minimum length of an item pattern (number of items, set by initMining())
    • izmax

      protected int izmax
      the maximum length of an item pattern (number of items, set by initMining())
    • imode

      protected int imode
      the operation mode (DEFAULT or REDUCE, set by initMining())
    • iumax

      protected int iumax
      the number of times an item may be used in a pattern (set by initMining())
    • ibase

      protected util.IdMap ibase
      the underlying item base
    • target

      protected int target
      the target pattern type and subtype of the search
    • smin

      protected int smin
      the minimum support of an item pattern (item sequence)
    • sbase

      protected int sbase
      the base support (support of empty sequence/database size)
    • zmin

      protected int zmin
      the minimum length of an item pattern (number of items)
    • zmax

      protected int zmax
      the maximum length of an item pattern (number of items)
    • zmaxx

      protected int zmaxx
      the maximum length of an item pattern (number of items) that needs to be checked (zmax+1 if a closed/maximal filter is to be used, so that extensions are being checked, otherwise equal to zmax)
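The rule stated for zmaxx can be written down in a few lines. This is a sketch based on the description above, not the code of SeqMine. The class LengthLimit, the method name and the boolean parameter are hypothetical, and the guard against integer overflow is an added assumption.

```java
final class LengthLimit {
    /**
     * Maximum pattern length that has to be examined: one more than zmax when a
     * closed/maximal filter is active, so that extensions of patterns of length
     * zmax can still be checked; otherwise zmax itself.
     */
    static int checkedMaxLength(int zmax, boolean closedMaximalFilter) {
        // Assumption: avoid overflow if zmax is already the largest int.
        if (closedMaximalFilter && zmax < Integer.MAX_VALUE) return zmax + 1;
        return zmax;
    }
}
```

For example, with zmax = 5 and an active filter, patterns of length 6 are still examined, because a pattern of length 5 is only closed or maximal if no suitable extension of length 6 exists.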
    • mode

      protected int mode
      the search/operation mode
    • pats

      protected PatternSet pats
      the result set of frequent item patterns (item sequences)
    • aborted

      protected boolean aborted
      the flag whether a mining run has been aborted
  • Constructor Details

    • SeqMine

      public SeqMine()
      Create a miner for item patterns (frequent item sequences).
      Since:
      2017.06.26 (Christian Borgelt)
  • Method Details

    • mine

      public final PatternSet mine(TrActBag tabag, int target, double smin, int zmin, int zmax, int umax, int mode)
      Find frequent item patterns (sequences).
      Parameters:
      tabag - the (sequence) transactions to mine
      target - the type of frequent item patterns to mine (pattern type SEQUENCE and pattern subtype FREQUENT, CLOSED or MAXIMAL)
      smin - the minimum support of an item pattern (positive: percentage, negative: absolute value)
      zmin - the minimum size of an item pattern (number of items)
      zmax - the maximum size of an item pattern (number of items)
      umax - the maximum number of times an item may be used in a pattern
      mode - the operation mode (e.g. REDUCE)
      Returns:
      the found set of frequent item patterns
      Since:
      2017.06.26 (Christian Borgelt)
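The smin parameter uses its sign to select the unit: a positive value is a percentage of the database size, a negative value is an absolute count. How SeqMine turns this into the integer threshold held in the smin field is not documented here, so the following is only a plausible sketch. The class SupportThreshold, the method absoluteSupport, the rounding up, and the use of a machine epsilon to absorb rounding errors are assumptions.

```java
final class SupportThreshold {
    /** Machine epsilon at 1.0, used to absorb floating point rounding errors. */
    static final double EPSILON = Math.ulp(1.0);

    /**
     * Convert a user-supplied minimum support into an absolute count.
     * @param smin positive: percentage of n; negative: absolute value
     * @param n    the base support (number of transactions)
     */
    static int absoluteSupport(double smin, int n) {
        double s = (smin < 0)
                 ? -smin                                  // already absolute
                 : (smin / 100.0) * n * (1.0 - EPSILON);  // percentage of n
        return (int) Math.ceil(s);                        // round up to a count
    }
}
```

Under these assumptions, absoluteSupport(10.0, 50) and absoluteSupport(-5.0, 50) describe the same threshold of five transactions. The factor (1.0 - EPSILON) keeps a product that lands marginally above a whole number from being rounded up by one.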
    • mineSeq

      public final PatternSet mineSeq(TrActBag tabag, int target, double smin, int zmin, int zmax, int umax, int mode)
      Find frequent item sequences.
      Parameters:
      tabag - the (sequence) transactions to mine
      target - the type of frequent item sequences to mine (FREQUENT, CLOSED, or MAXIMAL)
      smin - the minimum support of an item sequence (positive: percentage, negative: absolute value)
      zmin - the minimum size of an item sequence (number of items)
      zmax - the maximum size of an item sequence (number of items)
      umax - the maximum number of times an item may be used in a pattern
      mode - the operation mode (e.g. REDUCE)
      Returns:
      the found set of frequent item sequences
      Since:
      2017.06.26 (Christian Borgelt)
    • initMining

      public final void initMining(TrActBag tabag, int target, double smin, int zmin, int zmax, int umax, int mode)
      Initialize mining frequent item patterns (sequences) in a thread.
      Parameters:
      tabag - the (sequence) transactions to mine
      target - the type of frequent item patterns to mine (pattern type SEQUENCE and pattern subtype FREQUENT, CLOSED, or MAXIMAL)
      smin - the minimum support of an item pattern (positive: percentage, negative: absolute value)
      zmin - the minimum size of an item pattern (number of items)
      zmax - the maximum size of an item pattern (number of items)
      umax - the number of times an item may be used in a pattern
      mode - the operation mode (e.g. REDUCE)
      Since:
      2017.06.26 (Christian Borgelt)
    • run

      public final void run()
      Run mining (which must have been initialized with initMining()). The result can be retrieved with getResult().
      Specified by:
      run in interface Runnable
      Since:
      2017.06.26 (Christian Borgelt)
    • runAsThread

      public final Thread runAsThread()
      Run mining as a thread (must have been initialized with initMining()).

      The result can be retrieved with getResult().

      Returns:
      the created and started thread
      Since:
      2017.06.26 (Christian Borgelt)
    • getThread

      public final Thread getThread()
      Get the thread that was started last (if any).
      Returns:
      the thread that was started last
      Since:
      2017.06.26 (Christian Borgelt)
    • abort

      public final void abort()
      Abort a mining run.
      Since:
      2017.06.26 (Christian Borgelt)
    • getResult

      public final PatternSet getResult()
      Get result of a sequence mining run.
      Returns:
      the sequence mining result
      Since:
      2017.06.26 (Christian Borgelt)
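The methods initMining(), runAsThread(), getThread(), abort() and getResult() together form a small lifecycle for background mining. Since the fim classes are not available in a stand-alone example, the sketch below uses a hypothetical stand-in class, SketchMiner, that mirrors only this lifecycle with a trivial filtering task. Its parameters, its behavior on abort (leaving the result unset) and the helper mineInBackground are assumptions and do not describe SeqMine itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, NOT fim.SeqMine: mirrors only the init/run/abort/result flow. */
final class SketchMiner implements Runnable {
    private int[] data;                         // stored by initMining()
    private int threshold;                      // stored by initMining()
    private List<Integer> result;               // filled by run()
    private Thread thread;                      // the thread that was started last
    private volatile boolean aborted;           // polled by the worker loop

    void initMining(int[] data, int threshold) {
        this.data = data;
        this.threshold = threshold;
        this.result = null;
        this.aborted = false;
    }

    @Override public void run() {
        List<Integer> found = new ArrayList<>();
        for (int x : data) {
            if (aborted) return;                // assumed: an aborted run yields no result
            if (x >= threshold) found.add(x);
        }
        result = found;
    }

    Thread runAsThread() {
        thread = new Thread(this);
        thread.start();
        return thread;
    }

    Thread getThread() { return thread; }
    void abort() { aborted = true; }
    List<Integer> getResult() { return result; }

    /** Initialize, run in a background thread, wait for it, and fetch the result. */
    static List<Integer> mineInBackground(int[] data, int threshold) {
        SketchMiner miner = new SketchMiner();
        miner.initMining(data, threshold);
        Thread t = miner.runAsThread();
        try {
            t.join();                           // wait until the run has finished
        } catch (InterruptedException e) {
            miner.abort();                      // stop the worker if we are interrupted
            Thread.currentThread().interrupt();
        }
        return miner.getResult();
    }
}
```

The important point is the ordering: parameters are stored first, the run is started second, and the result is read only after the thread has been joined, which also makes the worker's writes visible to the calling thread.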
    • clear

      public final void clear()
      Clear results of mining run.
      Since:
      2017.06.26 (Christian Borgelt)
    • main

      public static void main(String[] args)
      Main function for command line use.
      Parameters:
      args - the command line arguments as an array of strings
      Since:
      2017.06.26 (Christian Borgelt)