
Association Analysis / Association Rule Mining

Last updated 11/25/19

What Is Association Rule Mining?

Association rule mining finds frequent patterns or associations among sets of items or objects, usually in transactional data.

Applications include market basket analysis, cross-marketing, catalog design, etc.

Association Rules

Examples (rule form: "Body → Head [support, confidence]"):

buys(x, "diapers") → buys(x, "beer") [0.5%, 60%]
buys(x, "bread") → buys(x, "milk") [0.6%, 65%]
major(x, "CS") ∧ takes(x, "DB") → grade(x, "A") [1%, 75%]
age = "30-45", income = "50K-75K" → car = "SUV"

Usually there is only one relation, so a rule is written more simply: Diapers → Beer [0.5%, 60%].

We have seen rules before. In what context? Rule learning (e.g., Ripper).

What is the left side of a rule called? The right side? LHS: antecedent; RHS: consequent.

Market-Basket Analysis & Finding Associations

Question: do items occur together?

Proposed by Agrawal et al. in 1993; an important data mining task studied extensively by the database and data mining community.

Assumes all data are categorical, usually binary: we care whether someone purchased Diet Coke, not how many bottles they purchased.

Initially used for market basket analysis to find how items purchased by customers are related, e.g., Bread → Milk [sup = 5%, conf = 100%].

Association Rule Mining: The Goal

Given: a database of transactions, where each transaction is a list of items (an itemset), for example the items purchased by a customer.

Goal: find all rules that correlate one set of items with another set.

Example: 98% of people who purchase tires and auto accessories also get automotive services done.

Simple Supermarket Example

Does "→" mean causality or co-occurrence? Co-occurrence.

[Figure: a table of market-basket transactions.]

Example association rules:

{Diaper} → {Beer}
{Milk, Bread} → {Eggs, Coke}
{Beer, Bread} → {Milk}

An itemset is simply a set of items.

Transaction data representation

This is a simplistic view of "shopping baskets." Some important information is not considered: the quantity of each item purchased and the price paid.

Applications

How could association rules help a grocery store manager? Say you have a rule X → Y for items X and Y. How could you help your store?

If you want to increase sales of Y, run a sale on X.
Locate X and Y near each other. Examples? Bananas near cereal.
Locate X and Y far from each other to make the shopper walk through the store.
Print out a coupon at checkout for Y if the shopper bought X but not Y.

Association "Rules" - Standard Format

Rule format (a set can consist of just a single item):

If {set of items} Then {set of items}
If {Diapers, Baby Food} Then {Beer, Chips}

The "If" part is the condition; the "Then" part is the result.

[Figure: Venn diagram of customers who buy diapers, customers who buy beer, and customers who buy both.]

The right side is very often a single item.

Remember: rules do not imply causality.

What Is an Interesting Association?

Deciding this requires domain-knowledge validation: an interesting rule is actionable, non-trivial, and understandable. Algorithms provide statistics that help identify useful rules, but they do not guarantee usefulness.

Two statistics are universally used (assume a rule C → R):

Support (of the rule) ≈ P(R & C): the percentage of transactions/baskets where the rule holds (all items present).
Confidence (of the rule) ≈ P(R | C): the percentage of times R holds when C holds (i.e., of the times the rule fires, how often R is present).

What kind of rules are we most interested in (in terms of support and confidence)?

Calculating Confidence from Support

The confidence of a rule LHS → RHS can be computed as the support of the whole itemset divided by the support of the LHS:

Confidence(LHS → RHS) = Support(LHS ∪ RHS) / Support(LHS)

[Figure: Venn diagram of customers who buy diapers, customers who buy beer, and customers who buy both.]
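To make these two formulas concrete, here is a minimal Python sketch (not from the original slides) that computes support and confidence directly from a list of transactions. The five transactions are the classic market-basket example usually paired with these slides; they are an assumption here, since the tables in the original deck were images.

def support(itemset, transactions):
    # Fraction of transactions that contain every item in `itemset`.
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(lhs, rhs, transactions):
    # Confidence(LHS -> RHS) = Support(LHS ∪ RHS) / Support(LHS).
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

print(support({"Milk", "Diaper", "Beer"}, transactions))       # 0.4
print(confidence({"Milk", "Diaper"}, {"Beer"}, transactions))  # 0.666...

These are the same s = 0.4 and c ≈ 0.67 quoted for {Milk, Diaper} → {Beer} on the later "Mining Association Rules" slide.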

Definition: Frequent Itemset

An itemset is a collection of one or more items, e.g., {Milk, Bread, Diaper}. A k-itemset is an itemset with k items.

Support count (σ): the frequency count of occurrence of an itemset, e.g., σ({Milk, Bread, Diaper}) = 2.

Support (s): the fraction of transactions containing the itemset, e.g., s({Milk, Bread, Diaper}) = 2/5.

Frequent itemset: an itemset whose support is greater than or equal to a minsup threshold.

Support and Confidence Calculations

Given the association rule {Milk, Diaper} → {Beer}, the rule evaluation metrics are:

Support (s): the fraction of transactions that contain both X and Y.
Confidence (c): measures how often items in Y appear in transactions that contain X.

Now compute these two metrics.
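The transaction table for this exercise was an image in the original slides; assuming the same five-transaction example as in the sketch above (which yields the s = 0.4, c = 0.67 quoted on the later "Mining Association Rules" slide), the calculation is:

s = σ({Milk, Diaper, Beer}) / |T| = 2 / 5 = 0.4
c = σ({Milk, Diaper, Beer}) / σ({Milk, Diaper}) = 2 / 3 ≈ 0.67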

Association Rule Mining Task

Given a set of transactions T, the goal of association rule mining is to find all rules having support ≥ a minsup threshold and confidence ≥ a minconf threshold.

Brute-force approach: list all possible association rules, compute the support and confidence for each rule, and prune the rules that fail the minsup and minconf thresholds. Computationally prohibitive!

So what do we do? What have we done in such cases in this course? We have used heuristic methods and not achieved optimal solutions. But not in this case: here we will act smarter and still find the optimal (correct) solution.

Number of Itemsets

Given d distinct items, how many distinct itemsets can be formed? You all learned this in CISC 1100/1400; it has to do with set theory.

Answer: it is the number of all possible subsets, i.e., the cardinality of the power set of the d items. Every item is either present or not present (2 choices), so the answer is 2^d.
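A few lines of Python (not from the slides) confirm the 2^d count by enumerating every subset of a small item set:

from itertools import combinations

items = ["Bread", "Milk", "Diaper", "Beer", "Eggs", "Coke"]  # d = 6

# All subsets (including the empty set), grouped by size.
itemsets = [set(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

print(len(itemsets), 2 ** len(items))  # 64 64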

Computational Complexity

Given d unique items, how many association rules can we come up with? Can someone explain the expression for R? Do not worry about the expression below it.

C(d, k) is the number of ways to pick the k LHS items, and the next expression is the number of ways to pick the RHS items.
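The expression for R was an image in the original slides; presumably it is the standard count (as given, for example, in Tan, Steinbach, and Kumar):

R = Σ_{k=1..d-1} C(d, k) × Σ_{j=1..d-k} C(d-k, j) = 3^d - 2^(d+1) + 1

For the six items of the running example this evaluates to 3^6 - 2^7 + 1 = 602 rules.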

Computational Complexity

[Figure: plot of the number of possible rules versus the number of items, showing exponential growth.]

Mining Association Rules

Example rules from the itemset {Milk, Diaper, Beer}:

{Milk, Diaper} → {Beer} (s = 0.4, c = 0.67)
{Milk, Beer} → {Diaper} (s = 0.4, c = 1.0)
{Diaper, Beer} → {Milk} (s = 0.4, c = 0.67)
{Beer} → {Milk, Diaper} (s = 0.4, c = 0.67)
{Diaper} → {Milk, Beer} (s = 0.4, c = 0.5)
{Milk} → {Diaper, Beer} (s = 0.4, c = 0.5)

Observations:

All of the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}.
Rules originating from the same itemset have identical support (by definition) but may have different confidence values.
This suggests decoupling the support and confidence requirements.

Mining Association Rules

Two-step approach:

1) Frequent itemset generation: generate all itemsets whose support ≥ minsup.
2) Rule generation: generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset.

Frequent itemset generation is still computationally expensive.

Many Association Rule Mining Algorithms

They use different strategies and data structures, but the resulting sets of rules are the same: given a transaction data set T, a minimum support, and a minimum confidence, the set of association rules existing in T is uniquely determined.

We study one famous algorithm: the Apriori algorithm. This algorithm, like k-means, was voted into the top 10 data mining algorithms.

The Apriori Algorithm

Two-step approach:

1) Frequent itemset generation: generate all itemsets with support ≥ minsup.
2) Rule generation: generate all rules with confidence ≥ minconf. Usually a template is provided that restricts the form of the rules, for example allowing only one item on the right side.

One step is computationally much more expensive than the other. Which one? Frequent itemset generation. Minsup is also critical.

The Apriori Property

The Apriori property (downward closure property): all subsets of a frequent itemset are also frequent. Does this make sense?

[Figure: itemset lattice over the items A, B, C, D - the 1-itemsets A, B, C, D; the 2-itemsets AB, AC, AD, BC, BD, CD; and the 3-itemsets ABC, ABD, ACD, BCD.]

Steps in Association Rule Discovery

1) Find frequent itemsets: itemsets with at least minimum support.
Support is "downward closed," so a subset of a frequent itemset must be frequent: if {A, B} is a frequent itemset, both {A} and {B} are frequent itemsets. Equivalently, if an itemset does not satisfy minimum support, none of its supersets will either (this is the key point that allows pruning of the search space).
Algorithm: iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets), as in the sketch below.

2) Use the frequent itemsets to generate association rules.
Generate all binary partitions (which may have to fit a template) and prune those with confidence < minconf.
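The following is a minimal Python sketch of the level-wise frequent-itemset step just described. It is illustrative only (no candidate hashing or other optimizations), and the function and variable names are mine, not from the slides.

from itertools import combinations

def apriori_frequent_itemsets(transactions, minsup):
    # transactions: list of sets of items; minsup: minimum support as a fraction.
    # Returns all frequent itemsets as a list of frozensets.
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # L1: frequent 1-itemsets.
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= minsup]
    frequent = list(level)

    k = 2
    while level:
        # Join step: pairs of frequent (k-1)-itemsets whose union has k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must already be frequent.
        prev = set(level)
        candidates = [c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))]
        # Scan the database and keep only candidates that meet minsup.
        level = [c for c in candidates if support(c) >= minsup]
        frequent.extend(level)
        k += 1
    return frequent

On the five example transactions used in the earlier sketch, apriori_frequent_itemsets(transactions, 0.6) returns seven itemsets: the four frequent single items plus {Bread, Milk}, {Milk, Diaper}, and {Diaper, Beer}.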

Frequent Itemset Generation

Given d items, there are 2^d possible candidate itemsets.

Illustrating the Apriori Principle

[Figure: itemset lattice in which one itemset is found to be infrequent and all of its supersets are pruned.]

Illustrating the Apriori Principle (continued)

Minimum support = 3.

[Figure, built up over several slides: support-count tables for the items (1-itemsets), the pairs (2-itemsets), and the triplets (3-itemsets). There is no need to generate candidates involving Coke or Eggs, since those items are not frequent.]

If every subset is considered: 6C1 + 6C2 + 6C3 = 6 + 15 + 20 = 41 candidates.
With support-based pruning: 6 + 6 + 4 = 16 candidates.

The Apriori Algorithm

Terminology: Ck is the set of candidate k-itemsets; Lk is the set of frequent k-itemsets.

Join step: Ck is generated by joining two elements of Lk-1. There must be a lot of overlap for the join to increase the length by only 1; as we will see, k-2 items must overlap (the two itemsets each differ by one item).

Prune step: for a k-itemset to be frequent, all of its subsets must be frequent. If any subset is not frequent, prune the candidate k-itemset.

To use this, you simply start with k = 1 (single-item itemsets) and work your way up from there.

The Algorithm

An iterative algorithm, also called level-wise search: find all frequent 1-itemsets, then all frequent 2-itemsets, and so on. In each iteration k, only consider itemsets built from frequent (k-1)-itemsets.

The Join Step

All items in the itemsets to be joined are kept in a consistent order - any order will do, such as lexicographic (alphabetical) order.

Elements of Ck are created by joining two itemsets from Lk-1 that have k-2 items in common. The two (k-1)-itemsets are joined only if they differ in the last position (so the first k-2 items are in common). When you join them, the size of the itemset goes up by one: (k-2) + 1 + 1 = k.

Example: joining pqr and pqs yields pqrs (see the sketch below).
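Here is a small Python sketch of just this join step, assuming each itemset is kept as a sorted tuple; the function name is mine, not from the slides.

def join_step(L_prev):
    # Join two frequent (k-1)-itemsets (sorted tuples) that agree on
    # everything except the last item; the result has (k-2) + 1 + 1 = k items.
    return [a + (b[-1],)
            for a in L_prev for b in L_prev
            if a[:-1] == b[:-1] and a[-1] < b[-1]]

print(join_step([("p", "q", "r"), ("p", "q", "s")]))
# [('p', 'q', 'r', 's')]

L3 = [("a","b","c"), ("a","b","d"), ("a","c","d"), ("a","c","e"), ("b","c","d")]
print(join_step(L3))
# [('a', 'b', 'c', 'd'), ('a', 'c', 'd', 'e')]

The second call reproduces the tentative C4 = {abcd, acde} from the candidate-generation example on the next slides. The Apriori-property prune (checking that every (k-1)-subset of each candidate is frequent) happens afterward, as in the full sketch shown earlier; here it would keep abcd and drop acde.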

Example of Generating Candidates (1)

L3 = {abc, abd, acd, ace, bcd}. Self-joining: L3 * L3. What do we merge first?

abc and abd yield abcd (add to C4). Can abc successfully merge with any other element of L3? No: the others do not differ from it in only the last item. What merges next?

acd and ace yield acde (add to C4).

We do not join abd and acd, since they differ in the 2nd position, even though the join would give abcd, which is a candidate. Why? If the product were a candidate, it would already have been generated (see the next slide).

Example of Generating Candidates (2)

Note that for abcd to be frequent, the Apriori property requires abc, abd, and bcd (among its other subsets) to be frequent. abc and abd come alphabetically before bcd, so if we see abc and bcd we do not need to generate abcd from them: if abd were present, abcd would already have been generated from abc and abd. If abd is not present, abcd would be pruned later anyway.

Example of Generating Candidates (3)

So C4 tentatively equals {abcd, acde}. Recall L3 = {abc, abd, acd, ace, bcd}. Based on the Apriori property, do we remove abcd or acde from C4?

We remove acde but not abcd. Why? We do not remove abcd, since abc, abd, acd, and bcd are all in L3. We remove acde, since cde is not in L3.

The join step does not guarantee that the Apriori property is not violated. And even if the Apriori property is not violated, you still need to scan the database to ensure support ≥ minsup before placing an itemset into L4.

The Apriori Algorithm - Example (minsup = 30%)

[Figure: worked example on a small database D. Scan D to count the candidate 1-itemsets C1 and keep the frequent ones as L1; join L1 with itself to form C2, scan D to count C2, and keep L2; join again to form C3, scan D, and keep L3.]

Warning: Do Not Forget Pruning

Candidates get pruned in two ways: the Apriori property is violated, or, if the Apriori property is not violated, the database scan shows that minsup is not met. The Apriori property is necessary but not sufficient to keep a candidate.

If you forget to prune via the Apriori property you will still get the same results, since the database scan will catch it, but I will take off points on an exam. Make it clear when you prune using the Apriori property (do not fill in a count when crossing a candidate off).

The Apriori property cannot be violated until k = 3. It begins to get trickier at k = 4, since there are more subsets to check.

More Complex Example

Given the following database, list all frequent 3-itemsets and 4-itemsets with a minsup of 40%.

TID  A  B  C  D  E
T1   1  1  1  0  0
T2   1  1  1  1  1
T3   1  0  1  1  0
T4   1  0  1  1  1
T5   1  1  1  1  0

Solution

The details are provided on this webpage: http://www2.cs.uregina.ca/~dbd/cs831/notes/itemsets/itemset_apriori.html

Frequent 3-itemsets: ABC, ABD, ACD, ACE, ADE, BCD, CDE
Frequent 4-itemsets: ABCD, ACDE
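As a check (not part of the original slides), a brute-force count over the table above reproduces this solution; the variable names are mine.

from itertools import combinations

# The database from the previous slide, one set of items per transaction.
transactions = [set("ABC"), set("ABCDE"), set("ACD"), set("ACDE"), set("ABCD")]
minsup_count = 2  # 40% of 5 transactions

def frequent_k_itemsets(k):
    items = sorted(set().union(*transactions))
    return sorted("".join(c) for c in combinations(items, k)
                  if sum(set(c) <= t for t in transactions) >= minsup_count)

print(frequent_k_itemsets(3))  # ['ABC', 'ABD', 'ACD', 'ACE', 'ADE', 'BCD', 'CDE']
print(frequent_k_itemsets(4))  # ['ABCD', 'ACDE']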

Step 2: Rules from Frequent Itemsets

Frequent itemsets → association rules: one more step is needed.

For each frequent itemset X, for each proper nonempty subset A of X, let B = X - A. Then A → B is an association rule if Confidence(A → B) ≥ minconf, where Confidence(A → B) = support(A ∪ B) / support(A).
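A minimal Python sketch of this rule-generation step (the names are mine, not from the slides); it assumes the supports of all frequent itemsets were recorded during itemset generation, as a later slide points out.

from itertools import combinations

def rules_from_itemset(X, support, minconf):
    # Generate A -> (X - A) for every proper nonempty subset A of the frequent
    # itemset X, keeping rules with confidence >= minconf.
    # support: dict mapping frozensets to their support values.
    X = frozenset(X)
    rules = []
    for r in range(1, len(X)):                         # sizes of the antecedent A
        for A in map(frozenset, combinations(sorted(X), r)):
            conf = support[X] / support[A]             # support(A ∪ B) / support(A)
            if conf >= minconf:
                rules.append((set(A), set(X - A), conf))
    return rules

# The {2, 3, 4} example from the next slide, with the supports given there:
support = {
    frozenset({2, 3, 4}): 0.50,
    frozenset({2, 3}): 0.50, frozenset({2, 4}): 0.50, frozenset({3, 4}): 0.75,
    frozenset({2}): 0.75, frozenset({3}): 0.75, frozenset({4}): 0.75,
}
for lhs, rhs, conf in rules_from_itemset({2, 3, 4}, support, minconf=0.80):
    print(lhs, "->", rhs, round(conf, 2))
# {2, 3} -> {4} 1.0
# {2, 4} -> {3} 1.0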

Generating Rules: an Example

Suppose {2, 3, 4} is frequent, with sup = 50%. Its proper nonempty subsets are {2, 3}, {2, 4}, {3, 4}, {2}, {3}, {4}, with sup = 50%, 50%, 75%, 75%, 75%, 75% respectively.

These generate the following association rules (recall Confidence(A → B) = support(A ∪ B) / support(A), where support(A ∪ B) = 50%):

2,3 → 4, confidence = 100% (50%/50%)
2,4 → 3, confidence = 100% (50%/50%)
3,4 → 2, confidence = 67% (50%/75%)
2 → 3,4, confidence = 67% (50%/75%)
3 → 2,4, confidence = 67% (50%/75%)
4 → 2,3, confidence = 67% (50%/75%)

All rules have support = 50%. Then apply the confidence threshold to identify strong rules: rules that meet both the support and confidence requirements. If the confidence threshold is 80%, we are left with 2 strong rules.

Generating Rules: Summary

To recap: in order to obtain A → B, we need support(A ∪ B) and support(A). All the information required for the confidence computation was already recorded during itemset generation, so this step is not as time-consuming as frequent itemset generation.

Hint: I almost always ask this on the exam.

On the Apriori Algorithm

It seems to be very expensive, but: it is a level-wise search, and if K is the size of the largest itemset, it makes at most K passes over the data. In practice, K is bounded (around 10), so the algorithm is very fast. Under some conditions, all rules can be found in linear time, and the algorithm scales up to large data sets.

Granularity of items

One exception to the "ease" of applying association rules is selecting the granularity of the items. Should you choose:

diet coke?
coke product?
soft drink?
beverage?

Should you include more than one level of granularity? Some association-finding techniques allow you to represent hierarchies explicitly.

Multiple-Level Association Rules

Items often form a hierarchy. Items at the lower level are expected to have lower support. Rules regarding itemsets at appropriate levels could be quite useful. The transaction database can be encoded based on dimensions and levels.

[Figure: an item hierarchy - Food at the top; Milk and Bread below it; Skim and 2% under Milk; Wheat and White under Bread.]

Mining Multi-Level Associations

A top-down, progressive-deepening approach. First find high-level strong rules, e.g., milk → bread [20%, 60%]. Then find their lower-level "weaker" rules, e.g., 2% milk → wheat bread [6%, 50%]. This usually requires different thresholds at different levels to find meaningful rules: lower support at lower levels.

Interestingness Measurements

Objective measures - two popular measurements: support and confidence.

Subjective measures (Silberschatz & Tuzhilin, KDD95): a rule (pattern) is interesting if it is unexpected (surprising to the user) and/or actionable (the user can do something with it).

Drawback of Confidence

         Coffee   No Coffee   Total
Tea        15         5         20
No Tea     75         5         80
Total      90        10        100

Association rule: Tea → Coffee

Confidence = P(Coffee | Tea) = 15/20 = 0.75, but P(Coffee) = 0.9.

Although the confidence is high, the rule is misleading: P(Coffee | No Tea) = 75/80 = 0.9375.

We address this by considering the lift of a rule; for a rule to be interesting, its lift should not be near 1.0.

Lift: P(Y | X) / P(Y)

Lift(Tea → Coffee) = P(Coffee | Tea) / P(Coffee) = 0.75 / 0.9 = 0.833

Note that if P(Coffee) were 0.25, the lift would have been 3.0.
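For completeness, a few lines of Python (not from the slides) reproduce these numbers from the contingency table.

# Counts from the contingency table above.
tea_and_coffee, tea_total, coffee_total, total = 15, 20, 90, 100

confidence = tea_and_coffee / tea_total   # P(Coffee | Tea) = 0.75
p_coffee = coffee_total / total           # P(Coffee) = 0.9
lift = confidence / p_coffee              # 0.75 / 0.9 ≈ 0.833

print(round(confidence, 3), round(lift, 3))  # 0.75 0.833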

Customer Number vs. Transaction ID

In the homework you may have a problem where there is a customer id for each transaction, and you may be asked to do association analysis based on the customer id. If so, you need to aggregate the transactions to the customer level: if a customer has 3 transactions, you simply create an itemset containing the union of the items in those 3 transactions. Note that we ignore the frequency of purchase. A small sketch of this aggregation follows.
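A minimal Python sketch of this customer-level aggregation (the data and names are illustrative, not from the homework).

from collections import defaultdict

# Each record is (customer_id, items_in_one_transaction).
records = [
    ("cust1", {"Bread", "Milk"}),
    ("cust1", {"Diaper", "Beer"}),
    ("cust2", {"Milk", "Coke"}),
    ("cust1", {"Bread", "Eggs"}),
]

baskets = defaultdict(set)
for customer, items in records:
    baskets[customer] |= items   # union: frequency of purchase is ignored

print(dict(baskets))
# cust1 -> {Bread, Milk, Diaper, Beer, Eggs}; cust2 -> {Milk, Coke} (set order may vary)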

Virtual items

If you are interested in including other possible variables, you can create "virtual items": gift-wrap, used-coupon, new-store, winter-holidays, bought-nothing, ...

Associations: Pros and Cons

Pros:
can quickly mine patterns describing business/customers/etc. without major effort in problem formulation
virtual items allow much flexibility
an unparalleled tool for hypothesis generation

Cons:
unfocused: not clear exactly how to apply the mined "knowledge"; only hypothesis generation
can produce many, many rules! There may be only a few nuggets among them (or none).

Association Rules

Association rule types:Actionable Rules – contain high-quality, actionable informationTrivial Rules – information already well-known by those familiar with the businessInexplicable Rules – no explanation and do not suggest actionTrivial and inexplicable rules occur most often54