HYPE: Hierarchical Sequential Pattern Mining

July 12, 2017 | Author: Maguelonne Teisseire | Category: Sequential Pattern Mining


Description

Marc Plantevit, Anne Laurent, Maguelonne Teisseire
LIRMM, University Montpellier II, France

DOLAP’06, Arlington, November 10, 2006

06/10/11 M. Plantevit

Plan

1. Introduction
   OLAP & data mining
   Sequential Patterns
   Multidimensional Framework
   Hierarchies

2. Contributions
   Data Model
   Definitions
   Algorithms
   Experiments

3. Conclusions and Future Work

OLAP & KDD

User Navigation

OLAP users are now decision makers. They navigate the aggregated data cube in order to discover knowledge, using operators such as ROLL UP and DRILL DOWN.

Our goal: providing knowledge automatically, thanks to data mining approaches.

Sequential Patterns

Discovering correlations between events through time.

Well adapted for temporal data.

Several applications: marketing, decision making, protein sequences, network security, music, ...

Example: the itemsets (A, B), (A), and (B, C), observed in that order along the time axis, form the sequence ⟨(A, B), (A), (B, C)⟩.

Limitation: sequential patterns are quite poor (only one mined dimension).
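
As an illustration of what mining such patterns involves, the sketch below checks whether a pattern is contained in a data sequence and computes its support over a small database. The slides do not give this code; the function names (`contains`, `support`, `seq`) and the three-sequence database are assumptions made for the example.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]
Sequence = List[Itemset]


def seq(*itemsets: str) -> Sequence:
    """Build a sequence from strings, e.g. seq("AB", "A") -> <(A, B), (A)>."""
    return [frozenset(items) for items in itemsets]


def contains(data_seq: Sequence, pattern: Sequence) -> bool:
    """True if the pattern's itemsets occur, in order, as subsets of
    itemsets of data_seq (greedy left-to-right matching)."""
    pos = 0
    for wanted in pattern:
        # Advance to the first remaining itemset that includes `wanted`.
        while pos < len(data_seq) and not wanted <= data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1  # the next pattern itemset must occur strictly later
    return True


def support(database: List[Sequence], pattern: Sequence) -> float:
    """Fraction of data sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in database) / len(database)


# Hypothetical database; the first sequence is the one from the slide.
database = [
    seq("AB", "A", "BC"),
    seq("A", "BC"),
    seq("B", "A"),
]
pattern = seq("A", "B")  # "A, then later B"
print(support(database, pattern))
```

A pattern is then called frequent when its support reaches a user-chosen minimal support threshold.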

Data Cube

Knowledge is mined along a single dimension: the product dimension. What about the other ones?

[Figure: data cube with dimensions City, Product, and Customer Group]

Combining Several Analysis Dimensions

Items are not defined on one dimension; they are defined on several dimensions.

Classical item: c
Multidimensional item: (France, c, 100), (Germany, c, ∗)
Multidimensional sequence: ⟨{(France, c, 100), (Germany, d, 54)}{(∗, b, 2)}⟩ instead of ⟨(c, d), b⟩
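
One possible reading of the wildcard ∗ is that it matches any value on its dimension. Under that assumption, the sketch below tests whether a multidimensional pattern is supported by a data sequence. The names `item_matches`, `itemset_matches`, and `sequence_matches`, as well as the data sequence itself, are hypothetical and not taken from the slides.

```python
WILDCARD = "*"


def item_matches(pattern_item, data_item):
    """A pattern item matches a data item if every dimension is either
    the wildcard or equal to the data value."""
    return len(pattern_item) == len(data_item) and all(
        p == WILDCARD or p == d for p, d in zip(pattern_item, data_item)
    )


def itemset_matches(pattern_itemset, data_itemset):
    """Every pattern item must match some item of the data itemset."""
    return all(
        any(item_matches(p, d) for d in data_itemset) for p in pattern_itemset
    )


def sequence_matches(pattern, data_seq):
    """Pattern itemsets must match data itemsets in the same order."""
    pos = 0
    for itemset in pattern:
        while pos < len(data_seq) and not itemset_matches(itemset, data_seq[pos]):
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1
    return True


# Hypothetical data sequence: two dates, items are (place, product, quantity).
data_seq = [
    [("France", "c", 100), ("Germany", "d", 54)],
    [("UK", "b", 2)],
]
# The multidimensional sequence from the slide.
pattern = [
    [("France", "c", 100), ("Germany", "d", 54)],
    [(WILDCARD, "b", 2)],
]
print(sequence_matches(pattern, data_seq))
```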

Taking Hierarchies into Account

The dilemma: support vs. number of patterns

Minimal support too high: too little frequent knowledge is found to be of use and to enhance the decision-making process.
Minimal support too low: too much frequent knowledge is found, which is unusable for the decision maker.
It is very difficult to choose the right support value for mining relevant knowledge.

Taking hierarchies into account to solve this dilemma:
Mining rules on several levels of hierarchy.
Subsumption power.

State-of-the-art

Approach  Multidimensionality  Simulation of multi.  Sequential patterns  Hierarchy in patterns
(1)       No                   No                    Yes                  Several
(2)       No                   ??                    No                   Single
(3)       Yes                  __                    Yes                  No

(1) Agrawal & Srikant (1995): the pioneering approach.
(2) Han & Fu (2001): an original approach.
(3) Yu & Chen (2005): using hierarchies for a smart time representation.

No existing approach mines multidimensional sequences over several levels of hierarchy.

Database & Blocks

A database can be partitioned into different blocks according to some dimensions.

Market     Cust-Grp  Date  Place    Product
Carrefour  Educ.     1     Germany  beer
Carrefour  Educ.     1     Germany  pretzel
Carrefour  Educ.     2     Germany  M2
Carrefour  Educ.     3     Germany  chocolate
Carrefour  Educ.     4     Germany  M1
Carrefour  Employ.   1     France   soda
Carrefour  Employ.   2     France   wine
Carrefour  Employ.   2     France   chocolate
Carrefour  Employ.   3     France   M2
wellmart   retir.    1     UK       whisky
wellmart   retir.    1     UK       pretzel
wellmart   retir.    2     UK       M2
wellmart   Educ.     1     LA       chocolate
wellmart   Educ.     2     LA       M1
wellmart   Educ.     3     NY       whisky
wellmart   Educ.     4     NY       soda

Dimension Set Partition

D = D_R ⊕ D_A ⊕ D_t

D_t: temporal dimensions
D_A: analysis dimensions
D_R: reference dimensions

A tuple c = (d_1, ..., d_n) = (r, a, t), where:
r is the restriction of c on D_R
a is the restriction of c on D_A
t is the restriction of c on D_t
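
The partition can be sketched on the example table as follows. The slides do not state which columns play which role; this sketch assumes D_R = (Market, Cust-Grp), D_t = (Date), and D_A = (Place, Product). The helper names `restrict`, `blocks`, and `data_sequence` are likewise hypothetical.

```python
from collections import defaultdict

# Rows of the example table: (Market, Cust-Grp, Date, Place, Product).
ROWS = [
    ("Carrefour", "Educ.", 1, "Germany", "beer"),
    ("Carrefour", "Educ.", 1, "Germany", "pretzel"),
    ("Carrefour", "Educ.", 2, "Germany", "M2"),
    ("Carrefour", "Educ.", 3, "Germany", "chocolate"),
    ("Carrefour", "Educ.", 4, "Germany", "M1"),
    ("Carrefour", "Employ.", 1, "France", "soda"),
    ("Carrefour", "Employ.", 2, "France", "wine"),
    ("Carrefour", "Employ.", 2, "France", "chocolate"),
    ("Carrefour", "Employ.", 3, "France", "M2"),
    ("wellmart", "retir.", 1, "UK", "whisky"),
    ("wellmart", "retir.", 1, "UK", "pretzel"),
    ("wellmart", "retir.", 2, "UK", "M2"),
    ("wellmart", "Educ.", 1, "LA", "chocolate"),
    ("wellmart", "Educ.", 2, "LA", "M1"),
    ("wellmart", "Educ.", 3, "NY", "whisky"),
    ("wellmart", "Educ.", 4, "NY", "soda"),
]

# Assumed roles of the columns (not stated on the slides).
REFERENCE = (0, 1)  # D_R: Market, Cust-Grp
TEMPORAL = (2,)     # D_t: Date
ANALYSIS = (3, 4)   # D_A: Place, Product


def restrict(row, positions):
    """Restriction of a tuple on a subset of the dimensions."""
    return tuple(row[i] for i in positions)


def blocks(rows):
    """Group rows that share the same restriction r on D_R."""
    result = defaultdict(list)
    for row in rows:
        result[restrict(row, REFERENCE)].append(row)
    return dict(result)


def data_sequence(block):
    """Order a block by t; each date yields one itemset of restrictions a."""
    by_date = defaultdict(set)
    for row in block:
        by_date[restrict(row, TEMPORAL)].add(restrict(row, ANALYSIS))
    return [by_date[t] for t in sorted(by_date)]


for key, block in blocks(ROWS).items():
    print(key, data_sequence(block))
```

Under these assumptions each block plays the role that a customer's purchase history plays in classical sequential pattern mining.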

Hierarchies

Hierarchical relations are defined on each analysis dimension.

These relations are materialized in the form of trees (is-a relation). Only the leaves can appear in the database.

Hierarchies over the PLACE and PRODUCT dimensions:

Products
    Foods: Chocolate, Pretzel
    Drinks: Alcoholic drinks (Whisky, Wine, Beer), Soda
    Medicines: M1, M2
    ...

Place
    USA: L.A, N.Y, Chicago, ...
    EU: France, UK, Germany
    ...

Relation Between Elements

ancestor: x̂ denotes an ancestor of x according to the hierarchy.
descendant: x̌ denotes a descendant of x.

Examples in the Place hierarchy: France is a descendant of E.U, and Germany is a descendant of Place.

[Figure: Place hierarchy, with USA (L.A, N.Y, Chicago, ...) and EU (France, UK, Germany)]
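
A minimal sketch of these two relations, assuming the Place hierarchy is stored as a child-to-parent map. The map below is reconstructed from the figure and omits the elided ("...") nodes; `PARENT`, `ancestors`, `is_ancestor`, `is_descendant`, and `leaves` are hypothetical names.

```python
# Child -> parent map for the Place hierarchy (assumed shape; "..." nodes omitted).
PARENT = {
    "USA": "Place",
    "EU": "Place",
    "L.A": "USA",
    "N.Y": "USA",
    "Chicago": "USA",
    "France": "EU",
    "UK": "EU",
    "Germany": "EU",
}


def ancestors(x):
    """All ancestors of x, from its parent up to the root."""
    result = []
    while x in PARENT:
        x = PARENT[x]
        result.append(x)
    return result


def is_ancestor(a, x):
    """True if a is an ancestor of x in the hierarchy."""
    return a in ancestors(x)


def is_descendant(d, x):
    """True if d is a descendant of x in the hierarchy."""
    return is_ancestor(x, d)


def leaves():
    """Nodes that are nobody's parent; only these can appear in the database."""
    return set(PARENT) - set(PARENT.values())


print(ancestors("France"))
print(is_descendant("Germany", "Place"))
```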

H-generalized (H.G.) Item, Itemset and Sequence

H.G. Multidimensional Item: a tuple e = (d_1, ..., d_m) defined over the set of analysis dimensions D_A such that d_i ∈ {label(T_i)}.

Examples: (France, Chocolate), (Germany, Drinks)

Hierarchical Inclusion

Let e = (d_1, ..., d_m) and e' = (d'_1, ..., d'_m), then:

e is more general than e' (e >_h e') if ∀ d_i, d_i = d̂'_i (an ancestor of d'_i) or d_i = d'_i
e is more specific than e' (e <_h e') [...] >_h (USA, soda).

(France, wine)
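
The "more general than" test can be sketched dimension by dimension, assuming both hierarchies are stored in one child-to-parent map reconstructed from the Hierarchies figure (elided nodes omitted). The function names are hypothetical, and the items compared below are illustrations rather than the examples lost from the slide.

```python
# Child -> parent map for the Place and Products hierarchies (assumed shape).
PARENT = {
    "USA": "Place", "EU": "Place",
    "L.A": "USA", "N.Y": "USA", "Chicago": "USA",
    "France": "EU", "UK": "EU", "Germany": "EU",
    "Foods": "Products", "Drinks": "Products", "Medicines": "Products",
    "Chocolate": "Foods", "Pretzel": "Foods",
    "Alcoholic drinks": "Drinks", "Soda": "Drinks",
    "Whisky": "Alcoholic drinks", "Wine": "Alcoholic drinks",
    "Beer": "Alcoholic drinks",
    "M1": "Medicines", "M2": "Medicines",
}


def is_ancestor_or_equal(general, specific):
    """True if `general` equals `specific` or lies on its path to the root."""
    while specific != general and specific in PARENT:
        specific = PARENT[specific]
    return specific == general


def generalizes(e, e_prime):
    """The condition as written on the slide: on every dimension, d_i is an
    ancestor of d'_i or equal to it. Note that e == e' also satisfies it."""
    return len(e) == len(e_prime) and all(
        is_ancestor_or_equal(d, d_prime) for d, d_prime in zip(e, e_prime)
    )


def specializes(e, e_prime):
    """Symmetric relation: e is more specific than e'."""
    return generalizes(e_prime, e)


print(generalizes(("EU", "Drinks"), ("France", "Wine")))
print(specializes(("France", "Wine"), ("EU", "Drinks")))
print(generalizes(("USA", "Drinks"), ("France", "Wine")))
```

Two items can be incomparable: in the last call neither item generalizes the other, because USA is not an ancestor of France.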