About the developer

blackAndrechen
Description

Apriori and FP-growth implementations in Python

data_mine

Please star this repo!

introduction

This repository implements six association rule mining algorithms:

1. Apriori (apriori.py)

apriori algorithm

2. Apriori_compress (apriori_compress.py)

transaction compression for the Apriori algorithm

3. Apriori_hash (apriori_hash.py)

hash-based technique for the Apriori algorithm

4. Apriori_plus (apriori_plus.py)

transaction compression + dataset compression + hashing + Apriori

5. Fp_growth (fp_growth.py)

FP-growth algorithm

6. Fp_growth_plus (fp_growth_plus.py)

dataset compression + FP-growth
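
For orientation, the level-wise idea behind the baseline Apriori entry in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function name `apriori_frequent_itemsets` is hypothetical, and `min_support` is assumed to be an absolute transaction count.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset: support_count} for every frequent itemset.

    Hypothetical helper for illustration; min_support is assumed to be
    an absolute number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}

    frequent = dict(current)
    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: scan the transactions once per candidate.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        frequent.update(current)
        k += 1
    return frequent


data = [["l1", "l2", "l3", "l4"], ["l1", "l3", "l5"], ["l1", "l3", "l4"]]
frequent = apriori_frequent_itemsets(data, min_support=2)
```

The compression and hashing variants in the list aim to reduce the cost of the count step; the sketch leaves that step unoptimized to keep it short.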

  • running progress

  • the result of association rule data mining

how to use

  • download the repository
    git clone https://github.com/blackAndrechen/data_mine
    
  • change into the folder

    cd data_mine
    
  • write your own code, taking the Apriori algorithm as an example

```
from apriori import *

# each inner list is one transaction
data=[["l1","l2","l3","l4"], ["l1","l3","l5"], ["l1","l3","l4"]]
min_support=2
min_confident=0.6

apr=Apriori()
rule_list=apr.generate_R(data,min_support,min_confident)
```
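
To make the two thresholds in the example above concrete, here is a small sketch of how support and confidence could be computed by hand on the same sample data. It assumes the minimum support is an absolute transaction count (as the integer values in this README suggest) and the minimum confidence is the ratio of the rule's support to the antecedent's support; the helper names `support_count` and `confidence` are hypothetical and not part of this repository.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / support_count(antecedent, transactions)


data = [["l1", "l2", "l3", "l4"], ["l1", "l3", "l5"], ["l1", "l3", "l4"]]

sup = support_count(["l1", "l4"], data)    # l1 and l4 appear together in 2 transactions
conf_a = confidence(["l4"], ["l1"], data)  # rule l4 -> l1: 2 / 2
conf_b = confidence(["l1"], ["l4"], data)  # rule l1 -> l4: 2 / 3
```

Under these assumptions, with a minimum support of 2 and a minimum confidence of 0.6, both rules would pass, since 2/3 is above 0.6.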

tips

  • the other algorithms are used the same way, for example
    from fp_growth import *
    fp=Fp_growth()
    rule_list=fp.generate_R(data,min_support,min_confident)
    
  • my code uses the groceries.csv and 药方.xls ("prescriptions") data files; you can try running it

```
filename="groceries.csv"
min_support=25
min_conf=0.7

# filename="药方.xls"
# min_support=600
# min_conf=0.9

import os
current_path=os.getcwd()
path=current_path+"/dataset/"+filename
# path='/home/czpchen/文档/github/data_mine/dataset/groceries.csv'

data=loaddata(path)
apr=Apriori()
rule_list=apr.generate_R(data,min_support,min_conf)
```

  • if you want to use your own dataset, write your own function to read it, and make sure the data looks like this

```
data=[["l1","l2","l3","l4"], ["l1","l3","l5"], ["l1","l3","l4"]]
```

  • if you want to save the resulting association rules

```
save_path=current_path+"/log/"+filename.split(".")[0]+"_apriori.txt"
# save_path='/home/czpchen/文档/github/data_mine/log/groceries_apriori.txt'

save_rule(rule_list,save_path)
```
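
For the custom-dataset tip above, a reader function might look something like the sketch below. It is only an illustration under stated assumptions: the name `load_transactions` is hypothetical (it is not this repository's `loaddata`), and the file is assumed to hold one transaction per row with comma-separated items and no header row.

```python
import csv


def load_transactions(path, encoding="utf-8"):
    """Read one transaction per CSV row and return a list of item lists.

    Hypothetical helper; assumes comma-separated items and no header row.
    """
    transactions = []
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.reader(f):
            # Trim whitespace and drop empty cells such as trailing commas.
            items = [cell.strip() for cell in row if cell.strip()]
            if items:  # skip blank lines
                transactions.append(items)
    return transactions
```

The returned list of lists has the same shape as the `data` example, so it could be passed to the algorithm classes in the same way. A spreadsheet file such as an .xls would need a different reader.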

Performance analysis

A simple analysis on my datasets

Reference

Data Mining, 3rd edition (数据挖掘 第三版)
