Package fim

Class FIM


public class FIM extends Object
Java interface to frequent item set mining implemented in C.
Since:
2014.09.26
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    processing mode: add all (closed) item sets to repository (for Carpenter algorithm)
    static final int
    target type: all frequent item sets
    static final int
    item may appear only in a rule body/antecedent
    static final int
    item may appear only in a rule body/antecedent
    static final int
    Apriori variant: basic algorithm
    static final int
    algorithm variant: automatic choice (always applicable)
    static final int
    aggregation mode: average of individual measure values
    static final int
    aggregation mode: average of individual measure values
    static final int
    item may appear only in a rule body/antecedent
    static final int
    item may appear anywhere in a rule
    static final int
    item may appear anywhere in a rule
    static final int
    Carpenter variant: transaction identifier lists
    static final int
    Carpenter variant: item occurrence counter table
    static final int
    Carpenter variant: transaction identifier lists
    static final int
    Carpenter variant: transaction identifier lists
    static final int
    evaluation measure: certainty factor (larger is better)
    static final int
    evaluation measure: certainty factor (larger is better)
    static final int
    evaluation measure: normalized chi^2 measure (larger is better)
    static final int
    evaluation measure: p-value from chi^2 measure (smaller is better)
    static final int
    target type: closed (frequent) item sets
    static final String
    pattern spectrum report format: three columns size (integer), support (integer) and (average) occurrence frequency (double)
    static final int
    evaluation measure: conditional probability ratio (larger is better)
    static final int
    evaluation measure: rule confidence (larger is better)
    static final int
    evaluation measure: absolute confidence difference to prior (larger is better)
    static final int
    evaluation measure: rule confidence (larger is better)
    static final int
    item may appear only in a rule head/consequent
    static final int
    item may appear only in a rule head/consequent
    static final int
    evaluation measure: conviction (larger is better)
    static final int
    evaluation measure: conditional probability ratio (larger is better)
    static final int
    evaluation measure: conviction (larger is better)
    static final int
    evaluation measure: difference of conviction to 1 (larger is better)
    static final int
    evaluation measure: difference of conviction quotient to 1 (larger is better)
    static final int
    Eclat variant: transaction id lists intersection (basic)
    static final int
    Eclat variant: transaction id lists as bit vectors
    static final int
    Eclat variant: transaction id difference sets (diffsets)
    static final int
    Eclat variant: occurrence deliver from transaction lists
    static final int
    Eclat variant: transaction id lists intersection (improved)
    static final int
    Eclat variant: occurrence deliver from transaction lists
    static final int
    Eclat variant: transaction id range lists intersection
    static final int
    Eclat variant: item occurrence table (simplified)
    static final int
    Eclat variant: item occurrence table (standard)
    static final int
    Eclat variant: transaction id lists intersection (improved)
    static final int
    evaluation measure: Fisher's exact test (chi^2 measure) (smaller is better)
    static final int
    evaluation measure: Fisher's exact test (information gain) (smaller is better)
    static final int
    evaluation measure: Fisher's exact test (table probability) (smaller is better)
    static final int
    evaluation measure: Fisher's exact test (support) (smaller is better)
    static final int
    FP-growth variant: complex tree nodes (children and sibling)
    static final int
    FP-growth variant: simple tree nodes (link and parent)
    static final int
    FP-growth variant: top-down processing on a single prefix tree
    static final int
    FP-growth variant: top-down processing of the prefix trees
    static final int
    target type: all frequent item sets
    static final int
    target type: generator (frequent) item sets
    static final int
    target type: generator (frequent) item sets
    static final int
    item may appear only in a rule head/consequent
    static final int
    processing mode: check extensions for closed/maximal item sets with a horizontal scheme (default: use a repository)
    static final int
    surrogate method: identity (keep original data)
    static final int
    item may not appear anywhere in a rule
    static final int
    evaluation measure: importance (larger is better)
    static final int
    evaluation measure: importance (larger is better)
    static final int
    evaluation measure: information difference to prior (larger is better)
    static final int
    evaluation measure: p-value from information difference (smaller is better)
    static final int
    item may appear anywhere in a rule
    static final int
    item may appear only in a rule body/antecedent
    static final int
    processing mode: invalidate evaluation below expected support
    static final int
    IsTa variant: patricia tree (compact prefix tree)
    static final int
    IsTa variant: standard prefix tree
    static final int
    JIM: Baroni--Buser S_B = (x+s)/(x+r)
    static final int
    JIM variant: basic algorithm
    static final int
    JIM: Czekanowski S_D = 2s/(r+s)
    static final int
    JIM: Dice S_D = 2s/(r+s)
    static final int
    JIM: Faith S_F = (s+z/2)/n
    static final int
    JIM: generic measure S = (c_0s +c_1z +c_2n +c_3x) / (c_4s +c_5z +c_6n +c_7x)
    static final int
    JIM: Gower--Legendre S_N = 2(s+z)/(n+s+z)
    static final int
    JIM: Hamming S_M = (s+z)/n
    static final int
    JIM: Jaccard/Tanimoto S_J = s/r
    static final int
    JIM: Kulczynski S_K = s/q
    static final int
    JIM: no cover similarity
    static final int
    JIM: Rogers--Tanimoto S_T = (s+z)/(n+q)
    static final int
    JIM: Russell--Rao S_R = s/n
    static final int
    JIM: Sokal--Michener S_M = (s+z)/n
    static final int
    JIM: Sokal--Sneath 1 S_S = s/(r+q)
    static final int
    JIM: Sokal--Sneath 2 S_N = 2(s+z)/(n+s+z)
    static final int
    JIM: Sokal--Sneath 3 S_O = (s+z)/q
    static final int
    JIM: Sorensen S_D = 2s/(r+s)
    static final int
    JIM: Jaccard/Tanimoto S_J = s/r
    static final int
    evaluation measure: binary logarithm of support quotient (larger is better)
    static final int
    evaluation measure: lift value (confidence divided by prior) (larger is better)
    static final int
    evaluation measure: difference of lift value to 1 (larger is better)
    static final int
    evaluation measure: difference of lift quotient to 1 (larger is better)
    static final int
    aggregation mode: maximum of individual measure values
    static final int
    target type: maximal (frequent) item sets
    static final int
    aggregation mode: maximum of individual measure values
    static final int
    aggregation mode: minimum of individual measure values
    static final int
    aggregation mode: minimum of individual measure values
    static final int
    item may not appear anywhere in a rule
    static final int
    processing mode: do not collate equal transactions (for Carpenter algorithm)
    static final int
    processing mode: do not use a 16-items machine
    static final int
    processing mode: do not use head union tail (hut) pruning (for maximal item sets)
    static final int
    evaluation measure/aggregation mode: none
    static final int
    processing mode: do not use perfect extension pruning
    static final int
    processing mode: do not prune the prefix/patricia tree (for IsTa algorithm)
    static final int
    processing mode: do not sort items w.r.t. conditional support
    static final int
    processing mode: do not organize transactions as a prefix tree (for Apriori algorithm)
    static final String
    pattern spectrum report format: objects of type PatSpecElem
    static final int
    processing mode: use original support definition for rules (body & head instead of only body)
    static final int
    item may appear only in a rule head/consequent
    static final int
    processing mode: a-posteriori pruning of infrequent item sets (for Apriori algorithm)
    static final int
    surrogate method: random transaction generation
    static final int
    RElim variant: basic recursive elimination algorithm
    static final int
    processing mode: filter maximal item sets with repository (for Carpenter algorithm)
    static final int
    target type: association rules
    static final int
    SaM variant: basic split and merge algorithm
    static final int
    SaM variant: split and merge with binary search
    static final int
    SaM variant: split and merge with double source buffering
    static final int
    SaM variant: split and merge with transaction prefix tree
    static final int
    target type: all frequent item sets
    static final int
    surrogate method: shuffle table-derived data (columns)
    static final int
    evaluation measure: item set size times cover similarity (larger is better) (only for JIM algorithm)
    static final int
    evaluation measure: rule support (larger is better)
    static final int
    evaluation measure: rule support (larger is better)
    static final int
    surrogate method: permutation by pair swaps
    static final String
    the version string
    static final int
    processing mode: check extensions for closed/maximal item sets with a vertical scheme (default: use a repository)
    static final int
    evaluation measure: normalized chi^2 measure (Yates corrected) (larger is better)
    static final int
    evaluation measure: p-value from chi^2 measure (Yates corrected) (smaller is better)
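The target types in the field list differ in which frequent item sets are reported. As an illustration only, the following self-contained sketch enumerates item sets by brute force on a toy data set and applies the usual textbook definitions: an item set is closed if no proper superset has the same support, and maximal if no proper superset is frequent. The class TargetTypes, its method names and the bitmask encoding of transactions are assumptions of this sketch. It does not call the fim package and says nothing about how the C implementation works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: brute-force view of the target types (not library code). */
final class TargetTypes {

    /** Returns all non-empty item sets (as bitmasks) with support >= smin. */
    static Map<Integer, Integer> frequent(int[] tracts, int items, int smin) {
        Map<Integer, Integer> out = new TreeMap<>();
        for (int set = 1; set < (1 << items); set++) {
            int supp = 0;
            for (int t : tracts)
                if ((t & set) == set) supp++;   // transaction contains the set
            if (supp >= smin) out.put(set, supp);
        }
        return out;
    }

    /** True if a contains every item of b and at least one more. */
    static boolean isProperSuperset(int a, int b) {
        return a != b && (a & b) == b;
    }

    /** Closed: no proper superset has the same support. */
    static List<Integer> closed(Map<Integer, Integer> freq) {
        List<Integer> out = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : freq.entrySet()) {
            boolean isClosed = true;
            for (Map.Entry<Integer, Integer> f : freq.entrySet())
                if (isProperSuperset(f.getKey(), e.getKey())
                        && f.getValue().equals(e.getValue()))
                    isClosed = false;
            if (isClosed) out.add(e.getKey());
        }
        return out;
    }

    /** Maximal: no proper superset is frequent. */
    static List<Integer> maximal(Map<Integer, Integer> freq) {
        List<Integer> out = new ArrayList<>();
        for (int set : freq.keySet()) {
            boolean isMaximal = true;
            for (int other : freq.keySet())
                if (isProperSuperset(other, set)) isMaximal = false;
            if (isMaximal) out.add(set);
        }
        return out;
    }

    public static void main(String[] args) {
        // items: a = 1, b = 2, c = 4; each transaction is a bitmask
        int[] tracts = { 7, 3, 3, 1, 4 };
        Map<Integer, Integer> freq = frequent(tracts, 3, 2);
        System.out.println("frequent: " + freq);
        System.out.println("closed:   " + closed(freq));
        System.out.println("maximal:  " + maximal(freq));
    }
}
```

In this toy data the item set {b} is frequent but not closed, because {a, b} occurs in exactly the same transactions.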
  • Constructor Summary

    Constructors
    Constructor
    Description
    FIM()
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    static PatternSet
    accretion(TrActBag tracts, double supp, int zmin, int zmax, int stat, double siglvl, int maxext, int mode, int[] border)
    Java interface to Accretion algorithm in C.
    static PatternSet
    apriacc(TrActBag tracts, double supp, int zmin, int zmax, int stat, double siglvl, int prune, int mode, int[] border)
    Java interface to accretion-style Apriori algorithm in C (wrapper with Java objects).
    static ARuleSet
    apriori(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to Apriori algorithm in C (association rule induction, wrapper with Java objects).
    static ARuleSet
    apriori(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border, int[][] appear)
    Java interface to Apriori algorithm in C (association rule induction, wrapper with Java objects).
    static PatternSet
    apriori(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border)
    Java interface to Apriori algorithm in C (frequent item set mining, wrapper with Java objects).
    static ARuleSet
    arules(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int mode)
    Java interface to association rule induction in C (wrapper with Java objects).
    static ARuleSet
    arules(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int mode, int[][] appear)
    Java interface to association rule induction in C (wrapper with Java objects).
    static PatternSet
    carpenter(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to Carpenter algorithm in C (wrapper with Java objects).
    static ARuleSet
    eclat(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to Eclat algorithm in C (association rule induction, wrapper with Java objects).
    static ARuleSet
    eclat(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border, int[][] appear)
    Java interface to Eclat algorithm in C (association rule induction, wrapper with Java objects).
    static PatternSet
    eclat(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border)
    Java interface to Eclat algorithm in C (frequent item set mining, wrapper with Java objects).
    static PatSpecElem[]
    estpsp(TrActBag tracts, int target, double supp, int zmin, int zmax, int equiv, double alpha, int smpls, int seed)
    Estimate a pattern spectrum from data characteristics (wrapper with Java objects).
    static PatternSet
    fim(TrActBag tracts, int target, double supp, int zmin, int zmax, int[] border)
    Java interface to frequent item set mining in C (very simplified interface, wrapper with Java objects).
    static ARuleSet
    fpgrowth(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to FP-growth algorithm in C (association rule induction, wrapper with Java objects).
    static ARuleSet
    fpgrowth(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border, int[][] appear)
    Java interface to FP-growth algorithm in C (association rule induction, wrapper with Java objects).
    static PatternSet
    fpgrowth(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border)
    Java interface to FP-growth algorithm in C (frequent item set mining, wrapper with Java objects).
    static PatSpecElem[]
    genpsp(TrActBag tracts, int target, double supp, int zmin, int zmax, int cnt, int surr, int seed, int cpus, int[] ctrl)
    Pattern spectrum generation with surrogate data sets (wrapper with Java objects).
    static PatternSet
    ista(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to IsTa algorithm in C (wrapper with Java objects).
    static PatternSet
    jim(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int covsim, double[] simps, double sim, int algo, int mode, int[] border)
    Java interface to JIM algorithm in C (wrapper with Java objects).
    static void
    main(String[] args)
    Main program for testing.
    static PatternSet
    relim(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to RElim algorithm in C (wrapper with Java objects).
    static PatternSet
    sam(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
    Java interface to SaM algorithm in C (wrapper with Java objects).
    static PatternSet
    xfim(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int[] border)
    Java interface to frequent item set mining in C (less simplified interface, wrapper with Java objects).

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
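The eval and thresh parameters of the rule induction methods refer to the evaluation measures described in the field list. The sketch below computes a few of them from raw support counts, following the descriptions on this page: rule confidence, lift (confidence divided by prior) and the absolute confidence difference to the prior. Two points are assumptions, since this page does not spell them out: the prior is taken as the relative frequency of the rule head, and conviction uses the common textbook form (1 - prior) / (1 - confidence). The class RuleMeasures and its method names are hypothetical; the actual computation is done in the C code.

```java
/** Illustrative only: a few rule evaluation measures (not library code). */
final class RuleMeasures {

    /** Confidence of body -> head: supp(body and head) / supp(body). */
    static double confidence(int suppBoth, int suppBody) {
        return (double) suppBoth / suppBody;
    }

    /** Prior confidence, assumed here to be supp(head) / n. */
    static double prior(int suppHead, int n) {
        return (double) suppHead / n;
    }

    /** Lift value: confidence divided by prior. */
    static double lift(double conf, double prior) {
        return conf / prior;
    }

    /** Absolute confidence difference to prior. */
    static double confDiff(double conf, double prior) {
        return Math.abs(conf - prior);
    }

    /** Conviction, textbook form (assumption); infinite if conf == 1. */
    static double conviction(double conf, double prior) {
        return (1.0 - prior) / (1.0 - conf);
    }

    public static void main(String[] args) {
        int n = 10, suppBody = 4, suppHead = 5, suppBoth = 3;
        double conf = confidence(suppBoth, suppBody);
        double pri = prior(suppHead, n);
        System.out.println("confidence = " + conf);
        System.out.println("lift       = " + lift(conf, pri));
        System.out.println("confdiff   = " + confDiff(conf, pri));
        System.out.println("conviction = " + conviction(conf, pri));
    }
}
```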
  • Field Details

    • VERSION

      public static final String VERSION
      the version string
    • SETS

      public static final int SETS
      target type: all frequent item sets
    • ALL

      public static final int ALL
      target type: all frequent item sets
    • FREQUENT

      public static final int FREQUENT
      target type: all frequent item sets
    • CLOSED

      public static final int CLOSED
      target type: closed (frequent) item sets
    • MAXIMAL

      public static final int MAXIMAL
      target type: maximal (frequent) item sets
    • GENERATORS

      public static final int GENERATORS
      target type: generator (frequent) item sets
    • GENERAS

      public static final int GENERAS
      target type: generator (frequent) item sets
    • RULES

      public static final int RULES
      target type: association rules
    • IGNORE

      public static final int IGNORE
      item may not appear anywhere in a rule
    • NEITHER

      public static final int NEITHER
      item may not appear anywhere in a rule
    • INPUT

      public static final int INPUT
      item may appear only in a rule body/antecedent
    • BODY

      public static final int BODY
      item may appear only in a rule body/antecedent
    • ANTE

      public static final int ANTE
      item may appear only in a rule body/antecedent
    • ANTECEDENT

      public static final int ANTECEDENT
      item may appear only in a rule body/antecedent
    • OUTPUT

      public static final int OUTPUT
      item may appear only in a rule head/consequent
    • CONS

      public static final int CONS
      item may appear only in a rule head/consequent
    • CONSEQUENT

      public static final int CONSEQUENT
      item may appear only in a rule head/consequent
    • BOTH

      public static final int BOTH
      item may appear anywhere in a rule
    • INOUT

      public static final int INOUT
      item may appear anywhere in a rule
    • CANDA

      public static final int CANDA
      item may appear anywhere in a rule
    • NONE

      public static final int NONE
      evaluation measure/aggregation mode: none
    • SUPPORT

      public static final int SUPPORT
      evaluation measure: rule support (larger is better)
    • SUPP

      public static final int SUPP
      evaluation measure: rule support (larger is better)
    • CONFIDENCE

      public static final int CONFIDENCE
      evaluation measure: rule confidence (larger is better)
    • CONF

      public static final int CONF
      evaluation measure: rule confidence (larger is better)
    • CONFDIFF

      public static final int CONFDIFF
      evaluation measure: absolute confidence difference to prior (larger is better)
    • LIFT

      public static final int LIFT
      evaluation measure: lift value (confidence divided by prior) (larger is better)
    • LIFTDIFF

      public static final int LIFTDIFF
      evaluation measure: difference of lift value to 1 (larger is better)
    • LIFTQUOT

      public static final int LIFTQUOT
      evaluation measure: difference of lift quotient to 1 (larger is better)
    • CONVICTION

      public static final int CONVICTION
      evaluation measure: conviction (larger is better)
    • CVCT

      public static final int CVCT
      evaluation measure: conviction (larger is better)
    • CVCTDIFF

      public static final int CVCTDIFF
      evaluation measure: difference of conviction to 1 (larger is better)
    • CVCTQUOT

      public static final int CVCTQUOT
      evaluation measure: difference of conviction quotient to 1 (larger is better)
    • CPROB

      public static final int CPROB
      evaluation measure: conditional probability ratio (larger is better)
    • CONDPROB

      public static final int CONDPROB
      evaluation measure: conditional probability ratio (larger is better)
    • IMPORTANCE

      public static final int IMPORTANCE
      evaluation measure: importance (larger is better)
    • IMPORT

      public static final int IMPORT
      evaluation measure: importance (larger is better)
    • CERTAINTY

      public static final int CERTAINTY
      evaluation measure: certainty factor (larger is better)
    • CERT

      public static final int CERT
      evaluation measure: certainty factor (larger is better)
    • CHI2

      public static final int CHI2
      evaluation measure: normalized chi^2 measure (larger is better)
    • CHI2PVAL

      public static final int CHI2PVAL
      evaluation measure: p-value from chi^2 measure (smaller is better)
    • YATES

      public static final int YATES
      evaluation measure: normalized chi^2 measure (Yates corrected) (larger is better)
    • YATESPVAL

      public static final int YATESPVAL
      evaluation measure: p-value from chi^2 measure (Yates corrected) (smaller is better)
    • INFO

      public static final int INFO
      evaluation measure: information difference to prior (larger is better)
    • INFOPVAL

      public static final int INFOPVAL
      evaluation measure: p-value from information difference (smaller is better)
    • FETPROB

      public static final int FETPROB
      evaluation measure: Fisher's exact test (table probability) (smaller is better)
    • FETCHI2

      public static final int FETCHI2
      evaluation measure: Fisher's exact test (chi^2 measure) (smaller is better)
    • FETINFO

      public static final int FETINFO
      evaluation measure: Fisher's exact test (information gain) (smaller is better)
    • FETSUPP

      public static final int FETSUPP
      evaluation measure: Fisher's exact test (support) (smaller is better)
    • LDRATIO

      public static final int LDRATIO
      evaluation measure: binary logarithm of support quotient (larger is better)
    • SIZESIM

      public static final int SIZESIM
      evaluation measure: item set size times cover similarity (larger is better) (only for JIM algorithm)
    • MIN

      public static final int MIN
      aggregation mode: minimum of individual measure values
    • MINIMUM

      public static final int MINIMUM
      aggregation mode: minimum of individual measure values
    • MAX

      public static final int MAX
      aggregation mode: maximum of individual measure values
    • MAXIMUM

      public static final int MAXIMUM
      aggregation mode: maximum of individual measure values
    • AVG

      public static final int AVG
      aggregation mode: average of individual measure values
    • AVERAGE

      public static final int AVERAGE
      aggregation mode: average of individual measure values
    • AUTO

      public static final int AUTO
      algorithm variant: automatic choice (always applicable)
    • APRI_BASIC

      public static final int APRI_BASIC
      Apriori variant: basic algorithm
    • ECLAT_BASIC

      public static final int ECLAT_BASIC
      Eclat variant: transaction id lists intersection (basic)
    • ECLAT_LISTS

      public static final int ECLAT_LISTS
      Eclat variant: transaction id lists intersection (improved)
    • ECLAT_TIDS

      public static final int ECLAT_TIDS
      Eclat variant: transaction id lists intersection (improved)
    • ECLAT_BITS

      public static final int ECLAT_BITS
      Eclat variant: transaction id lists as bit vectors
    • ECLAT_TABLE

      public static final int ECLAT_TABLE
      Eclat variant: item occurrence table (standard)
    • ECLAT_SIMPLE

      public static final int ECLAT_SIMPLE
      Eclat variant: item occurrence table (simplified)
    • ECLAT_RANGES

      public static final int ECLAT_RANGES
      Eclat variant: transaction id range lists intersection
    • ECLAT_OCCDLV

      public static final int ECLAT_OCCDLV
      Eclat variant: occurrence deliver from transaction lists
    • ECLAT_LCM

      public static final int ECLAT_LCM
      Eclat variant: occurrence deliver from transaction lists
    • ECLAT_DIFFS

      public static final int ECLAT_DIFFS
      Eclat variant: transaction id difference sets (diffsets)
    • FPG_SIMPLE

      public static final int FPG_SIMPLE
      FP-growth variant: simple tree nodes (link and parent)
    • FPG_COMPLEX

      public static final int FPG_COMPLEX
      FP-growth variant: complex tree nodes (children and sibling)
    • FPG_SINGLE

      public static final int FPG_SINGLE
      FP-growth variant: top-down processing on a single prefix tree
    • FPG_TOPDOWN

      public static final int FPG_TOPDOWN
      FP-growth variant: top-down processing of the prefix trees
    • SAM_BASIC

      public static final int SAM_BASIC
      SaM variant: basic split and merge algorithm
    • SAM_BSEARCH

      public static final int SAM_BSEARCH
      SaM variant: split and merge with binary search
    • SAM_DOUBLE

      public static final int SAM_DOUBLE
      SaM variant: split and merge with double source buffering
    • SAM_TREE

      public static final int SAM_TREE
      SaM variant: split and merge with transaction prefix tree
    • RELIM_BASIC

      public static final int RELIM_BASIC
      RElim variant: basic recursive elimination algorithm
    • JIM_BASIC

      public static final int JIM_BASIC
      JIM variant: basic algorithm
    • CARP_TABLE

      public static final int CARP_TABLE
      Carpenter variant: item occurrence counter table
    • CARP_LISTS

      public static final int CARP_LISTS
      Carpenter variant: transaction identifier lists
    • CARP_TIDLIST

      public static final int CARP_TIDLIST
      Carpenter variant: transaction identifier lists
    • CARP_TIDLISTS

      public static final int CARP_TIDLISTS
      Carpenter variant: transaction identifier lists
    • ISTA_PREFIX

      public static final int ISTA_PREFIX
      IsTa variant: standard prefix tree
    • ISTA_PATRICIA

      public static final int ISTA_PATRICIA
      IsTa variant: patricia tree (compact prefix tree)
    • NOFIM16

      public static final int NOFIM16
      processing mode: do not use a 16-items machine
    • NOPERFECT

      public static final int NOPERFECT
      processing mode: do not use perfect extension pruning
    • NOSORT

      public static final int NOSORT
      processing mode: do not sort items w.r.t. conditional support
    • NOHUT

      public static final int NOHUT
      processing mode: do not use head union tail (hut) pruning (for maximal item sets)
    • HORZ

      public static final int HORZ
      processing mode: check extensions for closed/maximal item sets with a horizontal scheme (default: use a repository)
    • VERT

      public static final int VERT
      processing mode: check extensions for closed/maximal item sets with a vertical scheme (default: use a repository)
    • INVBXS

      public static final int INVBXS
      processing mode: invalidate evaluation below expected support
    • ORIGSUPP

      public static final int ORIGSUPP
      processing mode: use original support definition for rules (body & head instead of only body)
    • NOTREE

      public static final int NOTREE
      processing mode: do not organize transactions as a prefix tree (for Apriori algorithm)
    • POSTPRUNE

      public static final int POSTPRUNE
      processing mode: a-posteriori pruning of infrequent item sets (for Apriori algorithm)
    • REPOFILT

      public static final int REPOFILT
      processing mode: filter maximal item sets with repository (for Carpenter algorithm)
    • ADDALL

      public static final int ADDALL
      processing mode: add all (closed) item sets to repository (for Carpenter algorithm)
    • NOCOLLATE

      public static final int NOCOLLATE
      processing mode: do not collate equal transactions (for Carpenter algorithm)
    • NOPRUNE

      public static final int NOPRUNE
      processing mode: do not prune the prefix/patricia tree (for IsTa algorithm)
    • JIM_NONE

      public static final int JIM_NONE
      JIM: no cover similarity
    • JIM_RUSSEL_RAO

      public static final int JIM_RUSSEL_RAO
      JIM: Russell--Rao S_R = s/n
    • JIM_KULCYNSKI

      public static final int JIM_KULCYNSKI
      JIM: Kulczynski S_K = s/q
    • JIM_JACCARD

      public static final int JIM_JACCARD
      JIM: Jaccard/Tanimoto S_J = s/r
    • JIM_TANIMOTO

      public static final int JIM_TANIMOTO
      JIM: Jaccard/Tanimoto S_J = s/r
    • JIM_DICE

      public static final int JIM_DICE
      JIM: Dice S_D = 2s/(r+s)
    • JIM_SORENSEN

      public static final int JIM_SORENSEN
      JIM: Sorensen S_D = 2s/(r+s)
    • JIM_CZEKANOWSKI

      public static final int JIM_CZEKANOWSKI
      JIM: Czekanowski S_D = 2s/(r+s)
    • JIM_SOKAL_SNEATH_1

      public static final int JIM_SOKAL_SNEATH_1
      JIM: Sokal--Sneath 1 S_S = s/(r+q)
    • JIM_SOKAL_MICHENER

      public static final int JIM_SOKAL_MICHENER
      JIM: Sokal--Michener S_M = (s+z)/n
    • JIM_HAMMING

      public static final int JIM_HAMMING
      JIM: Hamming S_M = (s+z)/n
    • JIM_FAITH

      public static final int JIM_FAITH
      JIM: Faith S_F = (s+z/2)/n
    • JIM_ROGERS_TANIMOTO

      public static final int JIM_ROGERS_TANIMOTO
      JIM: Rogers--Tanimoto S_T = (s+z)/(n+q)
    • JIM_SOKAL_SNEATH_2

      public static final int JIM_SOKAL_SNEATH_2
      JIM: Sokal--Sneath 2 S_N = 2(s+z)/(n+s+z)
    • JIM_GOWER_LEGENDRE

      public static final int JIM_GOWER_LEGENDRE
      JIM: Gower--Legendre S_N = 2(s+z)/(n+s+z)
    • JIM_SOKAL_SNEATH_3

      public static final int JIM_SOKAL_SNEATH_3
      JIM: Sokal--Sneath 3 S_O = (s+z)/q
    • JIM_BARONI_BUSER

      public static final int JIM_BARONI_BUSER
      JIM: Baroni--Buser S_B = (x+s)/(x+r)
    • JIM_GENERIC

      public static final int JIM_GENERIC
      JIM: generic measure S = (c_0s +c_1z +c_2n +c_3x) / (c_4s +c_5z +c_6n +c_7x)
    • IDENT

      public static final int IDENT
      surrogate method: identity (keep original data)
    • RANDOM

      public static final int RANDOM
      surrogate method: random transaction generation
    • SWAP

      public static final int SWAP
      surrogate method: permutation by pair swaps
    • SHUFFLE

      public static final int SHUFFLE
      surrogate method: shuffle table-derived data (columns)
    • COLUMNS

      public static final String COLUMNS
      pattern spectrum report format: three columns size (integer), support (integer) and (average) occurrence frequency (double)
    • OBJECTS

      public static final String OBJECTS
      pattern spectrum report format: objects of type PatSpecElem
  • Constructor Details

    • FIM

      public FIM()
      Constructor.
      Since:
      2023.07.30 (Christian Borgelt)
  • Method Details

    • fim

      public static PatternSet fim(TrActBag tracts, int target, double supp, int zmin, int zmax, int[] border)
      Java interface to frequent item set mining in C (very simplified interface, wrapper with Java objects).
      Parameters:
      tracts - transaction set to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
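The sign convention of the supp parameter (positive: percentage, negative: absolute number) can be illustrated with a small helper. This is a sketch under assumptions, not the library's code: the class and method names are hypothetical, and rounding percentages up to the next integer is an assumption, since the rounding used by the C implementation is not stated here.

```java
/** Sketch only: interprets a supp argument as an absolute transaction count. */
final class SupportThresholdSketch {

    /**
     * @param supp             positive: percentage of transactions;
     *                         negative: absolute number of transactions
     * @param transactionCount number of transactions in the data set
     * @return minimum absolute support (assumption: percentages are rounded up)
     */
    static int toAbsolute(double supp, int transactionCount) {
        if (supp < 0) {
            return (int) Math.ceil(-supp);      // already an absolute number
        }
        return (int) Math.ceil(supp * transactionCount / 100.0);
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute(10.0, 200));  // 10% of 200 transactions
        System.out.println(toAbsolute(-3.0, 200));  // absolute support of 3
        // With the library on the class path, a call could look like this
        // (not compiled here; tracts is a TrActBag built elsewhere):
        //   PatternSet sets = FIM.fim(tracts, FIM.CLOSED, -3, 1, 5, null);
    }
}
```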
    • xfim

      public static PatternSet xfim(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int[] border)
      Java interface to frequent item set mining in C (less simplified interface, wrapper with Java objects).
      Parameters:
      tracts - transaction set to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
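The interplay of agg and thresh can be sketched as follows: individual measure values are aggregated (minimum, maximum or average), and the result is compared against thresh as a lower or an upper bound depending on the direction of the measure. The enum, class and method names below are hypothetical stand-ins, and treating the bound as inclusive is an assumption.

```java
import java.util.Arrays;

/** Sketch only: aggregation of measure values and direction-aware threshold test. */
final class EvalThresholdSketch {

    /** Hypothetical stand-ins for the aggregation mode constants MIN, MAX and AVG. */
    enum Agg { MIN, MAX, AVG }

    /** Aggregates individual measure values according to the given mode. */
    static double aggregate(double[] values, Agg agg) {
        switch (agg) {
            case MIN: return Arrays.stream(values).min().orElse(Double.NaN);
            case MAX: return Arrays.stream(values).max().orElse(Double.NaN);
            default:  return Arrays.stream(values).average().orElse(Double.NaN);
        }
    }

    /**
     * thresh is a lower bound if larger values are better and an upper bound
     * otherwise (assumption: the bound itself is accepted).
     */
    static boolean passes(double value, double thresh, boolean largerIsBetter) {
        return largerIsBetter ? value >= thresh : value <= thresh;
    }

    public static void main(String[] args) {
        double[] values = {0.2, 0.6, 0.4};
        double min = aggregate(values, Agg.MIN);
        System.out.println("min passes lower bound 0.3: " + passes(min, 0.3, true));
        System.out.println("p-value 0.01 passes upper bound 0.05: " + passes(0.01, 0.05, false));
    }
}
```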
    • arules

      public static ARuleSet arules(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int mode)
      Java interface to association rule induction in C (wrapper with Java objects).
      Parameters:
      tracts - transaction set to analyze
      supp - minimum support of an association rule
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per association rule
      zmax - maximum number of items per association rule
      eval - measure for association rule evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      Returns:
      a set of association rules
      Since:
      2014.10.23 (Christian Borgelt)
    • arules

      public static ARuleSet arules(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int mode, int[][] appear)
      Java interface to association rule induction in C (wrapper with Java objects).
      Parameters:
      tracts - transaction set to analyze
      supp - minimum support of an association rule
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per association rule
      zmax - maximum number of items per association rule
      eval - measure for association rule evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      appear - map from items to item appearance indicators, given as two integer arrays of equal size: the first holds the items, the second the corresponding item appearance indicators.
      This parameter may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
      Returns:
      a set of association rules
      Since:
      2015.02.27 (Christian Borgelt)
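The layout of the appear argument (two integer arrays of equal size, items first, indicators second) can be built from an ordinary map. The following is a minimal sketch: the class name, the helper method and the numeric indicator values are hypothetical placeholders. Real code would pass the indicator constants of the FIM class instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: builds the two-array layout described for the appear parameter. */
final class AppearMapSketch {

    // Placeholder values for illustration; use FIM.BODY, FIM.HEAD, FIM.BOTH in real code.
    static final int BODY = 1, HEAD = 2, BOTH = 3;

    /** Converts an item-to-indicator map into { items, indicators }. */
    static int[][] toAppear(Map<Integer, Integer> indicatorByItem) {
        int[] items = new int[indicatorByItem.size()];
        int[] indicators = new int[indicatorByItem.size()];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : indicatorByItem.entrySet()) {
            items[i] = e.getKey();
            indicators[i] = e.getValue();
            i++;
        }
        return new int[][] { items, indicators };
    }

    public static void main(String[] args) {
        Map<Integer, Integer> m = new LinkedHashMap<>();
        m.put(-1, BODY);   // pseudo-item with negative identifier: default indicator
        m.put(7, HEAD);    // item 7 may appear only in a rule head
        int[][] appear = toAppear(m);
        System.out.println(appear[0].length + " items, " + appear[1].length + " indicators");
    }
}
```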
    • apriori

      public static PatternSet apriori(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border)
      Java interface to Apriori algorithm in C (frequent item set mining, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      algo - algorithm variant to use
      (AUTO or APRI_BASIC)
      mode - operation mode indicators/flags
      (NONE, NOPERFECT, NOTREE, POSTPRUNE, INVBXS)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.01 (Christian Borgelt)
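The border parameter holds one support threshold per item set size, indexed by size. A minimal sketch of such a filter is given below. The names are hypothetical, and two details are assumptions: sizes beyond the end of the array are not filtered, and the threshold itself is accepted.

```java
/** Sketch only: per-size support filtering as described for the border parameter. */
final class BorderFilterSketch {

    /**
     * @param size    number of items in the item set (index into border)
     * @param support absolute support of the item set
     * @param border  support thresholds per size; may be null (no filtering)
     */
    static boolean passes(int size, int support, int[] border) {
        if (border == null || size >= border.length) {
            return true;                    // assumption: no threshold means no filtering
        }
        return support >= border[size];     // assumption: inclusive comparison
    }

    public static void main(String[] args) {
        int[] border = {0, 0, 10, 4};       // sizes 2 and 3 carry extra thresholds
        System.out.println(passes(2, 5, border));  // size 2 needs support >= 10
        System.out.println(passes(3, 5, border));  // size 3 needs support >= 4
    }
}
```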
    • apriori

      public static ARuleSet apriori(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to Apriori algorithm in C (association rule induction, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      algo - algorithm variant to use
      (AUTO or APRI_BASIC)
      mode - operation mode indicators/flags
      (NONE, NOTREE, POSTPRUNE, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of association rules
      Since:
      2014.10.23 (Christian Borgelt)
    • apriori

      public static ARuleSet apriori(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border, int[][] appear)
      Java interface to Apriori algorithm in C (association rule induction, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      algo - algorithm variant to use
      (AUTO or APRI_BASIC)
      mode - operation mode indicators/flags
      (NONE, NOTREE, POSTPRUNE, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      appear - map from items to item appearance indicators, given as two integer arrays of equal size: the first holds the items, the second the corresponding item appearance indicators.
      This parameter may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
      Returns:
      a set of association rules
      Since:
      2014.10.23 (Christian Borgelt)
    • eclat

      public static PatternSet eclat(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border)
      Java interface to Eclat algorithm in C (frequent item set mining, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      algo - algorithm variant to use
      (AUTO, ECLAT_BASIC, ECLAT_TIDS, ECLAT_BITS, ECLAT_TABLE, ECLAT_SIMPLE, ECLAT_RANGES, ECLAT_OCCDLV, ECLAT_DIFFS)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT, NOSORT, NOHUT, HORZ, VERT, INVBXS, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
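The three cases of the prune parameter differ in how the evaluation results of an item set's subsets are combined. The sketch below models only the sign of prune. The role of its magnitude as a minimum size is not modeled, and the class and method names are hypothetical.

```java
/** Sketch only: how the sign of prune selects the subset check. */
final class PruneModeSketch {

    /**
     * @param prune           = 0: backward filtering, < 0: weak forward, > 0: strong forward
     * @param subsetQualifies evaluation result of each relevant subset
     * @return whether the subset check lets the item set through
     */
    static boolean subsetCheck(int prune, boolean[] subsetQualifies) {
        if (prune == 0) {
            return true;                        // backward filtering: no subset check
        }
        boolean any = false, all = true;
        for (boolean ok : subsetQualifies) {
            any |= ok;
            all &= ok;
        }
        return prune < 0 ? any : all;           // weak: one subset, strong: all subsets
    }

    public static void main(String[] args) {
        boolean[] subsets = {false, true};
        System.out.println("weak:   " + subsetCheck(-2, subsets));
        System.out.println("strong: " + subsetCheck(2, subsets));
    }
}
```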
    • eclat

      public static ARuleSet eclat(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to Eclat algorithm in C (association rule induction, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      algo - algorithm variant to use
      (AUTO or ECLAT_OCCDLV)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of association rules
      Since:
      2014.10.23 (Christian Borgelt)
    • eclat

      public static ARuleSet eclat(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border, int[][] appear)
      Java interface to Eclat algorithm in C (association rule induction, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      algo - algorithm variant to use
      (AUTO or ECLAT_OCCDLV)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      appear - map from items to item appearance indicators, given as two integer arrays of equal size: the first holds the items, the second the corresponding item appearance indicators.
      This parameter may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
      Returns:
      a set of association rules
      Since:
      2015.02.27 (Christian Borgelt)
    • fpgrowth

      public static PatternSet fpgrowth(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, int agg, double thresh, int prune, int algo, int mode, int[] border)
      Java interface to FP-growth algorithm in C (frequent item set mining, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED, MAXIMAL or GENERATORS)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      agg - evaluation measure aggregation mode
      (NONE, MIN, MINIMUM, MAX, MAXIMUM, AVG, AVERAGE)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      algo - algorithm variant to use
      (AUTO, FPG_SIMPLE, FPG_COMPLEX, FPG_SINGLE, FPG_TOPDOWN)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT, NOSORT, NOHUT, INVBXS, ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • fpgrowth

      public static ARuleSet fpgrowth(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to FP-growth algorithm in C (association rule induction, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      algo - algorithm variant to use
      (AUTO or FPG_SINGLE)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of association rules
      Since:
      2014.10.23 (Christian Borgelt)
    • fpgrowth

      public static ARuleSet fpgrowth(TrActBag tracts, double supp, double conf, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border, int[][] appear)
      Java interface to FP-growth algorithm in C (association rule induction, wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      conf - minimum confidence of an association rule
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO, CONFIDENCE, CONF, CONFDIFF, LIFT, LIFTDIFF, LIFTQUOT, CONVICTION, CVCT, CVCTDIFF, CVCTQUOT, CPROB, CONDPROB, IMPORTANCE, IMPORT, CERTAINTY, CERT, CHI2, CHI2PVAL, YATES, YATESPVAL, INFO, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      thresh - threshold for evaluation measure (lower bound for measures for which larger is better, upper bound for measures for which smaller is better)
      algo - algorithm variant to use
      (AUTO or FPG_SINGLE)
      mode - operation mode indicators/flags
      (NONE or ORIGSUPP)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      appear - map from items to item appearance indicators, given as two integer arrays of equal size: the first holds the items, the second the corresponding item appearance indicators.
      This parameter may be null, in which case items may appear anywhere in a rule.
      Each item appearance indicator must be one of IGNORE, NEITHER, NONE, BODY, INPUT, ANTE, ANTECENDENT, HEAD, OUTPUT, CONS, CONSEQUENT, BOTH, INOUT, CANDA. The default appearance indicator is set via a pseudo-item which has a negative identifier.
      Returns:
      a set of association rules
      Since:
      2015.02.27 (Christian Borgelt)
    • sam

      public static PatternSet sam(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to SaM algorithm in C (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, SAM_SIMPLE, SAM_BSEARCH, SAM_DOUBLE, SAM_TREE)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • relim

      public static PatternSet relim(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to RElim algorithm in C (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, RELIM_BASIC)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • jim

      public static PatternSet jim(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int covsim, double[] simps, double sim, int algo, int mode, int[] border)
      Java interface to JIM algorithm in C (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL, FREQUENT, CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation
      (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      covsim - cover similarity measure (JIM_NONE, JIM_RUSSEL_RAO, JIM_KULCZYNSKI, JIM_JACCARD, JIM_TANIMOTO, JIM_DICE, JIM_SORENSEN, JIM_CZEKANOWSKI, JIM_SOKAL_SNEATH_1, JIM_SOKAL_MICHENER, JIM_HAMMING, JIM_FAITH, JIM_ROGERS_TANIMOTO, JIM_SOKAL_SNEATH_2, JIM_GOWER_LEGENDRE, JIM_SOKAL_SNEATH_3, JIM_BARONI_BUSER, JIM_GENERIC)
      simps - cover similarity measure parameters (if generic) S = (c_0 s + c_1 z + c_2 n + c_3 x) / (c_4 s + c_5 z + c_6 n + c_7 x)
      sim - threshold for cover similarity measure
      algo - algorithm variant to use
      (AUTO, SAM_SIMPLE, SAM_BSEARCH, SAM_DOUBLE, SAM_TREE)
      mode - operation mode indicators/flags
      (NONE, NOFIM16, NOPERFECT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2018.03.21 (Christian Borgelt)
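The generic measure selected with JIM_GENERIC is a ratio of two linear combinations whose eight coefficients are passed in simps. A minimal sketch of the formula follows. It assumes that simps[i] holds the coefficient c_i, takes the quantities s, z, n and x as given inputs without defining them, and uses hypothetical class and method names.

```java
/** Sketch only: the generic cover similarity formula with coefficients c_0..c_7. */
final class GenericCoverSimilaritySketch {

    /** S = (c_0 s + c_1 z + c_2 n + c_3 x) / (c_4 s + c_5 z + c_6 n + c_7 x). */
    static double generic(double[] c, double s, double z, double n, double x) {
        if (c.length != 8) {
            throw new IllegalArgumentException("expected 8 coefficients");
        }
        double numerator   = c[0] * s + c[1] * z + c[2] * n + c[3] * x;
        double denominator = c[4] * s + c[5] * z + c[6] * n + c[7] * x;
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Coefficients reproducing Sokal-Michener, S_M = (s+z)/n.
        double[] sokalMichener = {1, 1, 0, 0,  0, 0, 1, 0};
        // Coefficients reproducing Faith, S_F = (s+z/2)/n.
        double[] faith = {1, 0.5, 0, 0,  0, 0, 1, 0};
        System.out.println(generic(sokalMichener, 2, 5, 10, 0));
        System.out.println(generic(faith, 2, 5, 10, 0));
    }
}
```

Choosing coefficients this way shows that several of the named measures are special cases of the generic form.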
    • carpenter

      public static PatternSet carpenter(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to Carpenter algorithm in C (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, CARP_TABLE, CARP_TIDLIST)
      mode - operation mode indicators/flags
      (NONE, NOPERFECT, REPOFILT, MAXONLY, NOCOLLATE)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • ista

      public static PatternSet ista(TrActBag tracts, int target, double supp, int zmin, int zmax, int eval, double thresh, int algo, int mode, int[] border)
      Java interface to IsTa algorithm in C (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (CLOSED or MAXIMAL)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      eval - measure for item set evaluation (NONE, LDRATIO)
      thresh - threshold for evaluation measure
      algo - algorithm variant to use
      (AUTO, ISTA_PREFIX, ISTA_PATRICIA)
      mode - operation mode indicators/flags
      (NONE, NOPRUNE, REPOFILT)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • apriacc

      public static PatternSet apriacc(TrActBag tracts, double supp, int zmin, int zmax, int stat, double siglvl, int prune, int mode, int[] border)
      Java interface to accretion-style Apriori algorithm in C (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      stat - test statistic for item set evaluation (NONE, CHI2PVAL, YATESPVAL, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      siglvl - significance level (maximum p-value)
      prune - minimum size for evaluation filtering
      = 0: backward filtering (no subset check)
      < 0: weak forward filtering (one subset must qualify)
      > 0: strong forward filtering (all subsets must qualify)
      mode - operation mode indicators/flags
      (NONE, INVBXS)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • accretion

      public static PatternSet accretion(TrActBag tracts, double supp, int zmin, int zmax, int stat, double siglvl, int maxext, int mode, int[] border)
      Java interface to Accretion algorithm in C.
      Parameters:
      tracts - transactions to analyze
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      stat - test statistic for item set evaluation (NONE, CHI2PVAL, YATESPVAL, INFOPVAL, FETPROB, FETCHI2, FETINFO, FETSUPP)
      siglvl - significance level (maximum p-value)
      maxext - maximum number of extension items
      mode - operation mode indicators/flags
      (NONE, INVBXS)
      border - array of support thresholds per item set size (item set size is index of this array); may be null if this additional filtering is not needed
      Returns:
      a set of (frequent) item sets
      Since:
      2014.10.23 (Christian Borgelt)
    • genpsp

      public static PatSpecElem[] genpsp(TrActBag tracts, int target, double supp, int zmin, int zmax, int cnt, int surr, int seed, int cpus, int[] ctrl)
      Pattern spectrum generation with surrogate data sets (wrapper with Java objects).
      Parameters:
      tracts - transactions to process
      target - type of the item sets to find (SETS, ALL or FREQUENT)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      cnt - number of surrogate data sets to generate
      surr - surrogate data generation method (IDENT, RANDOM, SWAP or SHUFFLE)
      seed - seed value for random number generator
      cpus - number of CPUs/threads to use
      ctrl - control array (progress indicator, stop flag)
      Returns:
      an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency
      Since:
      2014.10.23 (Christian Borgelt)
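A pattern spectrum maps each pattern signature (size, support) to an occurrence frequency, averaged over the data sets. The sketch below illustrates that idea on plain arrays. It does not use PatSpecElem, whose fields are not described here, and the class name, method name and string key format are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: counts (size, support) signatures and averages over data sets. */
final class PatternSpectrumSketch {

    /**
     * @param patterns one entry {size, support} per pattern found in any data set
     * @param dataSets number of data sets the patterns were collected from
     * @return average occurrence frequency per signature, keyed as "size,support"
     */
    static Map<String, Double> spectrum(int[][] patterns, int dataSets) {
        Map<String, Double> frequency = new TreeMap<>();
        for (int[] p : patterns) {
            String signature = p[0] + "," + p[1];
            frequency.merge(signature, 1.0 / dataSets, Double::sum);
        }
        return frequency;
    }

    public static void main(String[] args) {
        int[][] patterns = { {2, 3}, {2, 3}, {3, 2} };  // collected from 2 data sets
        System.out.println(spectrum(patterns, 2));
    }
}
```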
    • estpsp

      public static PatSpecElem[] estpsp(TrActBag tracts, int target, double supp, int zmin, int zmax, int equiv, double alpha, int smpls, int seed)
      Estimate a pattern spectrum from data characteristics (wrapper with Java objects).
      Parameters:
      tracts - transactions to analyze
      target - type of the item sets to find (SETS, ALL or FREQUENT)
      supp - minimum support of an item set
      (positive: percentage, negative: absolute number)
      zmin - minimum number of items per item set
      zmax - maximum number of items per item set
      equiv - equivalent number of surrogate data sets
      alpha - probability dispersion factor
      smpls - number of samples per item set size
      seed - seed value for random number generator
      Returns:
      an array with objects of type PatSpecElem, each of which specifies a pattern signature together with its occurrence frequency
      Since:
      2014.10.23 (Christian Borgelt)
    • main

      public static void main(String[] args)
      Main program for testing.
      Parameters:
      args - the command line arguments
      Since:
      2013.11.21 (Christian Borgelt)