HYPE: Hierarchical Sequential Pattern Mining
Marc Plantevit, Anne Laurent, Maguelonne Teisseire
LIRMM, University Montpellier II, France
DOLAP’06, Arlington, November the 10th 2006
06/10/11 M. Plantevit

Plan
1 Introduction: OLAP & data mining, Sequential Patterns, Multidimensional Framework, Hierarchies
2 Contributions: Data Model, Definitions, Algorithms, Experiments
3 Conclusions and Future Work

OLAP & KDD
User Navigation
OLAP users are now decision makers.
Users navigate the aggregated data cube (ROLL UP, DRILL DOWN, ...) in order to discover knowledge.
Our goal: providing knowledge automatically through data mining approaches.

Sequential Patterns
Discovering correlations between events through time.
Well adapted for temporal data.
Several applications: marketing, decision making, protein sequences, network security, music, ...
Example: the itemsets (A, B), then (A), then (B, C), observed in this order along the time axis, form the sequence ⟨(A, B), (A), (B, C)⟩.
Limitation: sequential patterns are quite poor (only one mined dimension).
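
To make the notion of a sequence such as ⟨(A, B), (A), (B, C)⟩ concrete, here is a minimal Python sketch of the usual containment test and support count for classical (one-dimensional) sequential patterns. The function names (`contains`, `support`, `seq`) and the toy database are hypothetical illustrations, not taken from the slides; only the first data sequence comes from the example above.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def contains(data_seq: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """Return True if `pattern` is a subsequence of `data_seq`.

    Each pattern itemset must be included in a distinct itemset of the data
    sequence, and the matched itemsets must appear in the same order.
    """
    pos = 0
    for wanted in pattern:
        # Greedily advance to the earliest itemset that includes `wanted`.
        while pos < len(data_seq) and not wanted <= data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1  # the next pattern itemset must match strictly later
    return True


def support(database: List[Sequence[Itemset]], pattern: Sequence[Itemset]) -> float:
    """Fraction of data sequences in `database` that contain `pattern`."""
    return sum(contains(s, pattern) for s in database) / len(database)


def seq(*itemsets: str) -> List[Itemset]:
    """Helper: seq("AB", "A", "BC") builds <(A, B), (A), (B, C)>."""
    return [frozenset(i) for i in itemsets]


# Hypothetical toy database; the first sequence is the one from the slide.
toy_db = [seq("AB", "A", "BC"), seq("A", "BC"), seq("B", "A")]
pattern = seq("A", "C")  # <(A), (C)>
print(support(toy_db, pattern))
```

A pattern is called frequent when its support reaches a user-defined minimal support threshold.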

Data Cube
Knowledge is mined along a single dimension: the product dimension. What about the other ones?
[Figure: data cube with dimensions City, Product, and Customer Group]

Combining Several Analysis Dimensions
Items are not defined on one dimension; they are defined on several dimensions.
Classical item: c
Multidimensional item: (France, c, 100), (Germany, c, ∗)
Multidimensional sequence: ⟨{(France, c, 100), (Germany, d, 54)}{(∗, b, 2)}⟩ instead of ⟨(c, d), b⟩
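
The multidimensional items above can be sketched as plain tuples in which ∗ acts as a wildcard. The snippet below is a minimal illustration under that assumption; the names `WILDCARD`, `matches`, and `itemset_matches` are hypothetical, and the slides do not say how matching is actually implemented.

```python
from typing import Iterable, Tuple

WILDCARD = "*"  # stands for the ∗ symbol used on the slide
Item = Tuple[object, ...]


def matches(pattern_item: Item, data_item: Item) -> bool:
    """True if every non-wildcard position of `pattern_item` equals `data_item`."""
    return len(pattern_item) == len(data_item) and all(
        p == WILDCARD or p == d for p, d in zip(pattern_item, data_item)
    )


def itemset_matches(pattern_itemset: Iterable[Item], data_itemset: Iterable[Item]) -> bool:
    """True if each pattern item is matched by at least one item of the data itemset."""
    data_items = list(data_itemset)
    return all(any(matches(p, d) for d in data_items) for p in pattern_itemset)


# The pattern item (Germany, c, *) matches any data item with the same first two values.
print(matches(("Germany", "c", WILDCARD), ("Germany", "c", 54)))
```

With this representation, a classical item such as c is the special case of a one-position tuple.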

Taking Hierarchies into Account
The support / number-of-patterns dilemma:
Minimal support too high: too little frequent knowledge is found to be useful and to enhance the decision-making process.
Minimal support too low: too much frequent knowledge is found, which makes it unusable for the decision maker.
It is very difficult to choose the right support value for mining relevant knowledge.
Taking hierarchies into account to solve this dilemma:
Mining rules on several levels of hierarchy.
Subsumption power.

State of the Art
Approach  Multidimensionality  Simulation of multi.  Sequential patterns  Hierarchy in patterns
(1)       No                   No                    Yes                  Several
(2)       No                   ??                    No                   Single
(3)       Yes                  __                    Yes                  No
(1) Agrawal & Srikant (1995): the pioneering approach.
(2) Han & Fu (2001): an original approach.
(3) Yu & Chen (2005): using hierarchies for a smart time representation.
No existing approach mines multidimensional sequences across several levels of hierarchy.

Database & Blocks
BLOCK: a database can be partitioned into different blocks according to some dimensions.
Market     Cust-Grp  Date  Place    Product
Carrefour  Educ.     1     Germany  beer
Carrefour  Educ.     1     Germany  pretzel
Carrefour  Educ.     2     Germany  M2
Carrefour  Educ.     3     Germany  chocolate
Carrefour  Educ.     4     Germany  M1
Carrefour  Employ.   1     France   soda
Carrefour  Employ.   2     France   wine
Carrefour  Employ.   2     France   chocolate
Carrefour  Employ.   3     France   M2
wellmart   retir.    1     UK       whisky
wellmart   retir.    1     UK       pretzel
wellmart   retir.    2     UK       M2
wellmart   Educ.     1     LA       chocolate
wellmart   Educ.     2     LA       M1
wellmart   Educ.     3     NY       whisky
wellmart   Educ.     4     NY       soda

Dimension Set Partition
$D = D_R \oplus D_A \oplus D_t$
$D_t$: temporal dimensions
$D_A$: analysis dimensions
$D_R$: reference dimensions
A tuple $c = (d_1, \dots, d_n)$ can be written $c = (r, a, t)$ where:
$r$ is the restriction of $c$ on $D_R$
$a$ is the restriction of $c$ on $D_A$
$t$ is the restriction of $c$ on $D_t$
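
The partition of the dimension set and the resulting blocks can be illustrated on the example database. The sketch below assumes that Market and Cust-Grp are the reference dimensions, Date the temporal dimension, and Place and Product the analysis dimensions; this assignment is an assumption made for illustration, and the helper names (`restrict`, `build_blocks`) are hypothetical.

```python
from collections import defaultdict

COLUMNS = ("Market", "Cust-Grp", "Date", "Place", "Product")
D_R = ("Market", "Cust-Grp")  # reference dimensions (assumed)
D_A = ("Place", "Product")    # analysis dimensions (assumed)
D_T = ("Date",)               # temporal dimension (assumed)

ROWS = [
    ("Carrefour", "Educ.", 1, "Germany", "beer"),
    ("Carrefour", "Educ.", 1, "Germany", "pretzel"),
    ("Carrefour", "Educ.", 2, "Germany", "M2"),
    ("Carrefour", "Educ.", 3, "Germany", "chocolate"),
    ("Carrefour", "Educ.", 4, "Germany", "M1"),
    ("Carrefour", "Employ.", 1, "France", "soda"),
    ("Carrefour", "Employ.", 2, "France", "wine"),
    ("Carrefour", "Employ.", 2, "France", "chocolate"),
    ("Carrefour", "Employ.", 3, "France", "M2"),
    ("wellmart", "retir.", 1, "UK", "whisky"),
    ("wellmart", "retir.", 1, "UK", "pretzel"),
    ("wellmart", "retir.", 2, "UK", "M2"),
    ("wellmart", "Educ.", 1, "LA", "chocolate"),
    ("wellmart", "Educ.", 2, "LA", "M1"),
    ("wellmart", "Educ.", 3, "NY", "whisky"),
    ("wellmart", "Educ.", 4, "NY", "soda"),
]


def restrict(row, dims):
    """Restriction of a tuple to the given dimensions."""
    return tuple(row[COLUMNS.index(d)] for d in dims)


def build_blocks(rows):
    """Group rows by their restriction r on D_R; order each block by t on D_T.

    Returns {r: [itemset at first date, itemset at second date, ...]},
    where each itemset is a sorted list of restrictions a on D_A.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for row in rows:
        r, a, t = restrict(row, D_R), restrict(row, D_A), restrict(row, D_T)
        grouped[r][t].add(a)
    return {
        r: [sorted(by_date[t]) for t in sorted(by_date)]
        for r, by_date in grouped.items()
    }


blocks = build_blocks(ROWS)
for key, sequence in blocks.items():
    print(key, sequence)
```

Under these assumptions each block yields one multidimensional data sequence, so the example database produces four of them.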

Hierarchies
Hierarchical relations are defined on each analysis dimension.
These relations are materialized in the form of trees (is-a relation).
Only the leaves can appear in the database.
Hierarchies over the PLACE and PRODUCT dimensions:
[Figure: two hierarchy trees. Products: Foods (Chocolate, Pretzel), Drinks (Alcoholic drinks (Whisky, Wine, Beer), Soda), Medicines (M1, M2), ... Place: USA (L.A, N.Y, Chicago), EU (France, UK, Germany), ...]

Relation Between Elements
ancestor: $\hat{x}$ is an ancestor of $x$ according to the hierarchy.
descendant: denoted $\check{x}$.
Examples: $\check{EU} = France$, $\check{Place} = Germany$
[Figure: Place hierarchy. USA (L.A, N.Y, Chicago), EU (France, UK, Germany), ...]
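
The ancestor and descendant relations can be sketched with a simple child-to-parent map. The map below only contains the Place labels visible in the figure (nodes elided with "..." are left out), and the function names are hypothetical; the slides do not prescribe a data structure.

```python
from typing import Dict, List

# Child -> parent map for the Place hierarchy, as read from the figure.
PARENT: Dict[str, str] = {
    "USA": "Place", "EU": "Place",
    "L.A": "USA", "N.Y": "USA", "Chicago": "USA",
    "France": "EU", "UK": "EU", "Germany": "EU",
}


def ancestors(x: str) -> List[str]:
    """All ancestors of x, from its parent up to the root."""
    result = []
    while x in PARENT:
        x = PARENT[x]
        result.append(x)
    return result


def descendants(x: str) -> List[str]:
    """All nodes located below x in the hierarchy."""
    children = [child for child, parent in PARENT.items() if parent == x]
    result = list(children)
    for child in children:
        result.extend(descendants(child))
    return result


def is_leaf(x: str) -> bool:
    """Leaves have no children; only leaves may appear in the database."""
    return x not in PARENT.values()


print(ancestors("France"))    # parent first, root last
print(descendants("EU"))
```

Since a node may have several ancestors and several descendants, both functions return lists rather than a single value.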

H-generalized (H.G.) Item, Itemset and Sequence
H.G. Multidimensional Item:
A tuple $e = (d_1, \dots, d_m)$ defined over the set of analysis dimensions $D_A$ such that $d_i \in \{label(T_i)\}$.
Examples: (France, Chocolate), (Germany, Drinks)
Hierarchical Inclusion
Let $e = (d_1, \dots, d_m)$ and $e' = (d'_1, \dots, d'_m)$, then:
$e$ is more general than $e'$ ($e >_h e'$) if $\forall d_i$, $d_i = \hat{d'_i}$ or $d_i = d'_i$
$e$ is more specific than $e'$ ($e$ [...] $>_h$ (USA, soda).
(France, wine)
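
The "more general than" relation between H-generalized items can be sketched as a position-by-position ancestor-or-equal test. The hierarchy below is an assumption read from the figures of the Hierarchies slide (in particular, Soda is placed directly under Drinks), and the names `is_ancestor_or_equal` and `more_general` are hypothetical.

```python
from typing import Dict, Tuple

# Child -> parent map covering the Place and Products hierarchies (assumed shape).
PARENT: Dict[str, str] = {
    "USA": "Place", "EU": "Place",
    "L.A": "USA", "N.Y": "USA", "Chicago": "USA",
    "France": "EU", "UK": "EU", "Germany": "EU",
    "Foods": "Products", "Drinks": "Products",
    "Chocolate": "Foods", "Pretzel": "Foods",
    "Alcoholic drinks": "Drinks", "Soda": "Drinks",
    "Whisky": "Alcoholic drinks", "Wine": "Alcoholic drinks",
    "Beer": "Alcoholic drinks",
}


def is_ancestor_or_equal(general: str, specific: str) -> bool:
    """True if `general` equals `specific` or is one of its ancestors."""
    while True:
        if general == specific:
            return True
        if specific not in PARENT:
            return False
        specific = PARENT[specific]


def more_general(e: Tuple[str, ...], e_prime: Tuple[str, ...]) -> bool:
    """e >_h e_prime: each d_i is an ancestor of, or equal to, d'_i."""
    return len(e) == len(e_prime) and all(
        is_ancestor_or_equal(d, d_prime) for d, d_prime in zip(e, e_prime)
    )


# Hypothetical example: (EU, Drinks) generalizes (France, Wine).
print(more_general(("EU", "Drinks"), ("France", "Wine")))
```

Following the definition as stated on the slide, equality is allowed at every position, so an item is also "more general" than itself under this test.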