Complex object comparison in a fuzzy context

July 9, 2017 | Autor: Daniel Sánchez | Categoría: Information Systems, Literatura, Computer Software, Object Oriented Model, Fuzzy Set, Object Oriented, Object Oriented Database, Object Oriented, Object Oriented Database

Share Embed

Laporkan tautan ini

Descripción

Information and Software Technology 45 (2003) 431–444 www.elsevier.com/locate/infsof

Complex object comparison in a fuzzy context N. Marı´n*, J.M. Medina, O. Pons, D. Sa´nchez, M.A. Vila Intelligent Databases and Information Systems Research Group, Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Andalucı´a, Spain

Abstract The comparison concept plays a determining role in many problems related to object management in an Object-Oriented Database Model. Object comparison is appropriately managed in a crisp object-oriented context by means of the concepts of identity and value equality. However, when dealing with imprecise or imperfect objects, questions like ‘To which extent may two objects be the same one?’ or ‘How similar are two objects?’ have not a clear answer, because the equality concept becomes fuzzy. In this paper we present a set of operators that are useful when comparing objects in a fuzzy environment. In particular, we introduce a generalized resemblance degree between two fuzzy sets of imprecise objects and a generalized resemblance degree to compare complex fuzzy objects within a given class. q 2003 Elsevier Science B.V. All rights reserved. Keywords: Object-oriented model; Fuzzy database; Objects resemblance; Inclusion degree

1. Introduction Probably one of the most important data paradigms in both the programming [1] and the databases world [2] is the Object-Oriented Data Model. Modeling the reality which is behind many software problems as a set of objects grouped around classes has proved to be a suitable approach for many developers, particularly when dealing with complex and dynamic problems. This well-deserved popularity has allowed the objectoriented data model to become one of the active research fields in the world of Computer Science. One of the major areas where this research has occurred has been in the field of fuzzy database modelling. As was the case with relational database modelling, the object-oriented data model is being widely studied in order to accept the representation of imperfect (i.e. imprecise, uncertain, vague) information in the database. As a result of this research many relevant works have appeared in the literature [3]. Identity equality is a fundamental concept in the objectoriented data model and constitutes the basic criterion used to distinguish objects. This notion of equality between objects states that two objects are identical if they are the same object, i.e. if they have the same object identifier. However, there exists situations where this criterion is * Corresponding author. E-mail addresses: [email protected] (N. Marı´n), medina@decsai. ugr.es (J.M. Medina), [email protected] (O. Pons), [email protected] (D. Sa´nchez), [email protected] (M.A. Vila).

insufficient and we need to use the concept of value equality i.e. two objects are equal if the values of all their attributes are recursively equal. Query management in object-oriented databases is an example where this last criterion is commonly used. When the user orders the execution of a query he or she writes a set of value conditions that must be fulfilled by the objects that will belong to the query result. It should be noted that identity restrictions may also appear. When we deal with perfect information the usual set of relational operators (including the basic equality operator) allows us to solve these kinds of problems. If the database is affected by imperfection the classical concept of equality is not valid. In some cases it could be replaced by a similarity concept or more generally by a resemblance concept. In the context of object-oriented knowledge bases there are some approximations to this problem. For example Yazici et al. propose in Ref. [4] an approach to calculate the matching between an object and the conditions of a rule. The aim of this paper is to propose a more general way of managing fuzzy object comparisons in a fuzzy objectoriented data model similar to that presented in Ref. [5]. The paper is organized as follows. Section 2 presents the comparison problem, while Section 3 is devoted to the study of the generalization of the value equality concept when dealing with basic objects. In Section 4 we explain how to compare fuzzy sets of fuzzy objects. Based on the material presented in the previous sections, Section 5 demonstrates how to obtain a resemblance degree between two complex

0950-5849/03/$ - see front matter q 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S0950-5849(03)00014-4

432

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

Fig. 1. The problem.

objects in a fuzzy object-oriented environment. An explanatory example is presented in Section 6 and finally the paper ends with some concluding remarks and a discussion of future work in Section 7.

2. Value equality generalization: resemblance relationships Consider the example illustrated in Fig. 1. The rectangles represent two rooms characterized by their quality, their extension, the floor they are on, and the set of students which attend their lessons in each room. The three first attributes that characterize the class Room may take imprecise values: the quality is expressed by an imprecise label and the attributes extension and floor can be expressed using a numerical value or an imprecise label. The set of students is fuzzy, taking into account the percentage of time each student spends in the room receiving the lessons. If we want to compare these two objects we need to solve the following tasks: † Firstly we have to handle resemblance in basic domains. † Secondly we also have to be able to compare fuzzy collections of imprecise objects. † Finally we need to aggregate the resemblance information that we have collected by studying the attributes and calculate a general resemblance opinion for the whole complex objects. The following sections are devoted to the study of each one of these problems.

3. Resemblance in basic domains If we wish to compare complex objects, the first level where resemblance must be studied corresponds to basic objects. That is, simple objects that have no OID (object identifier) and whose definition is not made up using attributes. We will refer to these kinds of objects as values of a given domain.

We are going to consider the following classification of simple objects: † Precise values. This category of values involves all the classical basic classes that usually appear in an objectoriented data model (e.g. numerical classes, string classes, etc.). Values of this kind of domains are easily compared using the classic set of relational operators (e.g. equal, less than, greater than, etc.). In these situations, resemblance is implemented using the model proposed by the classical equality. † Imprecise values. The case of imprecise values is a bit more complex. Different types of imprecise values must be considered according to the semantics of the imprecise value. As we will see, the equality concept generalization depends on the domain nature. We are going to consider the different kinds of simple imprecise domains defined in the fuzzy object-oriented model proposed in Ref. [5]. In this work, we propose the use of linguistic labels [6 –8] in order to express vagueness. Taking into account the model presented in Refs. [9,10], we consider three different types of imprecise basic domains (see Fig. 2). † Domains made up by a set of labels whose semantics cannot be expressed over an underlying base domain (e.g. the domain used to express the quality of a room, ‘The quality of the room is high’, or the prospects of a student, ‘Mary’s prospects are good’). The only alternative to handle comparison of this type is to ask the designer of the domain for the definition of the fuzzy relation that manages the resemblance among the labels. As an example, Table 1 represents a similarity relation for the attribute quality. † Domains where the labels can be expressed using a fuzzy set defined over an underlying domain. There are two possibilities: (1) The interpretation of the fuzzy set that represents the label is disjunctive. That is, it is a possibility distribution and only represents one value among a set of possible values (e.g. in the domain used to express the extension of a room, ‘The room is big’ or the age of a student, ‘Mary is young’,

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

433

Fig. 2. Three different kinds of imprecise information.

and so on). In this case, the domain is made up of precise values (those of the underlying domain) and imprecise labels, and the comparison can be managed considering a generalization of the classical equality that holds in the underlying domain, taking into account the definitions of the labels. If B stands for the basic domain, L is the labels set, and D ¼ B < L then ;x; y; [ D we can use the following resemblance relation (^ stands for a t-norm): 8 > 1 > > > > > 0 > > < mS ðx;yÞ ¼ ml ðzÞ > > > > > > > > : sup ðm ðzÞ^m ðzÞÞ z[B x y

ðx ¼ yÞ^ðx;y [ BÞ

4. Resemblance in fuzzy sets of fuzzy objects In Section 3, we have presented the way of comparing the values of the two first attributes that characterize a room. This section studies the comparison of fuzzy collections of objects. Fig. 4 shows the two sets of students that we have to compare in order to evaluate the resemblance between our two rooms. To compare the two fuzzy sets of students we need to generalize the fuzzy set comparison operators, taking into account that the objects of the set may be imprecise.

ðx – yÞ^ðx;y [ BÞ ððx ¼ l [ LÞ^ðy ¼ z [ BÞÞ

4.1. Resemblance driven inclusion degree

_ððy ¼ l [ LÞ^ðx ¼ z [ BÞÞ

Conjunctive fuzzy set comparison is often done by means of the concept of inclusion:

otherwise ð1Þ

Consider for example the attribute extension of the class Room. The basic underlying domain of this attribute is the interval [0,1)and we add to the domain the set of labels {small, middel-size, big} whose definition is represented in Fig. 3. Taking into account the definitions of the labels and Eq. (1) we have that mS ð30; 30Þ ¼ 1; mS ð30; 35Þ ¼ 0; mS ð30; bigÞ ¼ 0:9; and mS ðbig; middle-sizedÞ ¼ 0:5: Although Eq. (1) is an accepted resemblance approach for these situations, there are several alternative proposals in the literature that describe a calculus for similarity and resemblance measures between fuzzy sets. A complete study of them can be found in Ref. [11]. (2) If the interpretation of the fuzzy set that represents the labels is conjunctive then we are not comparing imprecise values but sets. In this last case, whether labels are used to express the set (e.g. academic years of a student, ‘Mary is in her final academic years’) or not (e.g. fuzzy collections of objects— the students of a room), the comparison process is more complicated. Section 4 will analyze this problem in depth.

A ¼ B if; and only if; ðA # BÞ ^ ðB # AÞ

ð2Þ

Several proposals for the calculus of this inclusion degree can be found in the literature. In Ref. [12] the inclusion of A in B is calculated as follows: NðBlAÞ ¼ min {IðmA ðuÞ; mB ðuÞÞ}

ð3Þ

u[U

where I stands for an implication operator, and mX is the membership function that describes the fuzzy set X: The implication operator can be chosen in accordance to the properties we want the inclusion degree to fulfill. Nevertheless, independently of the chosen implication operator, this formulation supposes that both A and B are defined over a reference universe U made up of precise elements, where the classical equality is the basis of Table 1 Domain for the quality attribute

High Regular Low

High

Regular

Low

1

0.8 1

0 0.8 1

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

434

in a lower degree the implication condition between the membership degrees of this element to both sets. In Eq. (4) since the membership degrees of similar (but not necessarily equal) elements can be compared, we restrict the implication degree using the resemblance degree of the two elements. In summary, for each element that belongs with a certain degree to the set A we look for a quite similar object in U that belongs to the set B with a higher degree.

Fig. 3. Extension.

the comparisons. That is, the implication operator compares the degree with which each element of the universe U belongs to each one of the fuzzy sets (i.e. we compare the membership degrees of the same object to both sets). However in the context of fuzzy databases it frequently happens that the elements of the universe U are imprecise objects between which classical equality cannot be applied. Instead a similarity or a resemblance relationship must be used. That is, for a given element in the set A; it is not clear which element of B has to be taken in order to compare the membership degrees. In our example, the students may be defined with imprecision and two different students (from the identity equality point of view) may be the same one (from the value equality point of view). Let us study a way to solve this problem. If the reference universe U is formed by imprecise elements, Eq. (3) can be generalized as follows.

Consider the following example: let A ¼ 0:9=a þ 1=d and B ¼ 1=a þ 0:7=b þ 0:9=c be two fuzzy sets, where {a; b; c; d} is the reference universe U over which a resemblance relation S is defined, such that mS ða; bÞ ¼ mS ða; cÞ ¼ mS ða; dÞ ¼ mS ðb; cÞ ¼ mS ðb; dÞ ¼ 0 and mS ðc; dÞ ¼ 0:7: In this situation:

QS ðBlAÞ ¼ min{ max{^ðIðmA ðaÞ; mB ðaÞÞ; mS ða;aÞÞ;…^ðIðmA ðaÞ;

mB ðdÞÞ; mS ða; dÞÞ};…; max{^ðIðmA ðdÞ; mB ðaÞÞ; mS ðd;aÞÞ;…; ^ðIðmA ðdÞ;

mB ðdÞÞ; mS ðd; dÞÞ}}: If we use the product as t-norm and Eq. (6) as implication operator, then:

QS ðBlAÞ ¼ min{max{1; 0; 0;0};max{0; 1;0; 0}; max{0;0;1; 0:7}; max{0;0; 0:63; 0}}

Definition 1. (Resemblance driven inclusion degree). Let A and B be two fuzzy sets defined over a finite reference universe U, S be a resemblance relation defined over the elements of U, and ^ be a t-norm. The inclusion degree of A in B driven by the resemblance relation S is calculated as follows:

QS ðBlAÞ ¼ min max uA;B;S ðx; yÞ

ð4Þ

x[U y[U

where

uA;B;S ðx; yÞ ¼ ^ðIðmA ðxÞ; mB ðyÞÞ; mS ðx; yÞÞ

Iðx;yÞ ¼

¼ min{1;1; 1;0:63} ¼ 0:63: ( 1; if x # y

ð6Þ

y=x; otherwise 4.1.1. Properties of the resemblance driven inclusion degree The first property that the resemblance driven inclusion degree ðQÞ must fulfill is to be valid when the resemblance relation is the classic equality.

ð5Þ

In Eq. (3), the inclusion degree is calculated taking into account the element of the reference universe U that fulfills

Proposition 1. Let A and B be two fuzzy sets defined over a finite reference universe U over which a resemblance relation S is defined. Let S be the classic equality relation. Then, Q¼ ðBlAÞ ¼ NðBlAÞ: Proof 1. Q¼ ðBlAÞ ¼ minx[U maxy[U uA;B;¼ ðx; yÞ: Since uA;B;¼ ðx; yÞ ¼ ^ðIðmA ðxÞ; mB ðyÞÞ; m¼ ðx; yÞÞ and taking into account that ;x – y; m¼ ðx; yÞ ¼ 0 then ;x – y; uA;B;¼ ðx; yÞ ¼ ^ðIðmA ðxÞ; mB ðyÞÞ; 0Þ ¼ 0 which yields ;x [ U; maxy[U uA;B;¼ ðx; yÞ ¼ uA;B;¼ ðx; xÞ ¼ ^ðIðmA ðxÞ; mB ðxÞÞ; 1Þ ¼ IðmA ðxÞ; mB ðxÞÞ and thus Q¼ ðBlAÞ ¼ minx[U {IðmA ðxÞ; mB ðxÞÞ} ¼ NðBlAÞ (as we want to demonstrate). A

Fig. 4. Resemblance in fuzzy collections.

The inclusion degree must be monotonic with respect to the inclusion relationship.

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

435

Proposition 2. Let A, B, and C be three fuzzy sets defined over a finite reference universe U over which a resemblance relation S is defined. Then, A # B ) QS ðClAÞ $ QS ðClBÞ:

and ^ be a t-norm. The generalized resemblance degree between A and B restricted by ^ is calculated by means of the following formulation:

Proof 2. A # B ) ;x [ U; mA ðxÞ # mB ðxÞ: That is, ;x [ U; mA ðxÞ þ 1 ¼ mB ðxÞ; with 1 $ 0: Also SupportðAÞ # SupportðBÞ: By the properties of implications, QS ðClAÞ ¼ minx[U maxy[U uA;C;S ðx; yÞ ¼ minx[U maxy[U ^ðIðmA ðxÞ; mC ðyÞÞ; mS ðx; yÞÞ $ minx[U maxy[U ^ðIððmA ðxÞ þ 1Þ; mC ðyÞÞ; mS ðx; yÞÞ; ;1 $ 0: Particularly, QS ðClAÞ $ minx[U maxy[U ^ ðIðmB ðxÞ; mC ðyÞÞ; mS ðx; yÞÞ ¼ QS ðClBÞ: A

Since every t-norm is commutative and the inclusion operator Q is idempotent, then the above measure is a resemblance relation.

The following property also holds. Proposition 3. Let A, B, and C be three fuzzy sets defined over a finite reference universe U over which a resemblance relation S is defined. Then, A # B ) QS ðBlCÞ $ QS ðAlCÞ: Proof 3. A # B ) ;x [ U; mA ðxÞ # mB ðxÞ: That is, ;x [ U; mA ðxÞ þ 1 ¼ mB ðxÞ; with 1 $ 0: Also SupportðAÞ # SupportðBÞ: By the properties of implications, QS ðBlCÞ ¼ minx[U maxy[U uC;B;S ðx; yÞ ¼ minx[U maxy[U ^ ðIðmC ðxÞ; mB ðyÞÞ; mS ðx; yÞÞ $ minx[U maxy[U ^ðIððmC ðxÞÞ; mB ðyÞ 2 1Þ; mS ðx; yÞÞ; ;1 $ 0: Particularly, QS ðBlCÞ $ minx[U maxy[U ^ ðIðmC ðxÞ; mA ðyÞÞ; mS ðx; yÞÞ ¼ QS ðAlCÞ: A

4.2. Matching resemblance opinions When comparing two fuzzy sets of imperfect elements we can consider both inclusion directions, as shown in Eq. 2, by means of a resemblance driven inclusion degree (Fig. 5). However to take this approach we need to propose a way to match both inclusion degrees in order to obtain the resemblance degree between the two fuzzy sets. Using Eq. (2) as a basis we can define a resemblance degree operator between two fuzzy sets as follows. Definition 2. (Generalized resemblance between fuzzy sets). Let A and B be two fuzzy sets defined over a finite reference universe U over which a resemblance relation S is defined,

S;^ ðA; BÞ

¼ ^ðQS ðBlAÞ; QS ðAlBÞÞ

ð7Þ

For example, let A ¼ 1=a þ 1=b þ 1=c and B ¼ 1=d be two fuzzy sets, such that {a; b; c; d} is a subset of a universe U over which a resemblance relation S is defined, and mS ða; dÞ ¼ mS ðb; dÞ ¼ mS ðc; dÞ ¼ 0:5: In this situation: QS ðBlAÞ ¼ 0:5 ¼ QðAlBÞ: Using minimum as t-norm, S;min ðA; BÞ ¼ 0:5: 4.2.1. Cardinality ratio In the last example, the generalized resemblance operator ( ) has indicated that both fuzzy sets of objects have a resemblance equal to 0.5. Now, suppose that C ¼ 1=a is another fuzzy set defined over the same universe. In this case, QS ðBlCÞ ¼ 0:5 ¼ QS ðClBÞ and S;min ðC; BÞ ¼ 0:5: The above example illustrates one drawback of the use of the generalized resemblance operator ( ). The use of resemblance relations instead of equality when calculating inclusion degrees, may make the generalized resemblance between two fuzzy sets be distinct to 0, even if there is a great difference on terms of cardinality. In the above example the cardinality of A and B is different whereas for B and C it is the same. However this is not taken into account when determining the resemblance degree. In some situations cardinality may not be important. Imagine that we are comparing two sets of tools: we are probably looking for similar capabilities and thus the number of tools is not relevant. However in the case of students in a room the actual number of students may be important when comparing the sets. If we wish to distinguish between these kinds of situations we need to weight the generalized resemblance degree with a factor that takes into account the distance between the cardinalities of the fuzzy sets that are being compared (Fig. 6). Definition 3. (Cardinality ratio). Let A and B be two fuzzy sets. We define the cardinality ratio between A and B as a measure of the relative resemblance between their cardinalities. This ratio can be calculated as follows: 8 if A ¼ B ^ B ¼ B > < 1; FðA; BÞ ¼ minðlAl; lBlÞ ð8Þ > ; otherwise : maxðlAl; lBlÞ

Fig. 5. Two directions of inclusion.

The ratio reaches its maximum value (1) when both sets present the same cardinality and approaches 0 when the cardinality of one set is much greater than the other.

436

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

Fig. 6. Need for a cardinality ratio.

The cardinality ratio F preserves commutativity and idempotency. Consequently its use as a weighting factor for does not prevent it being a resemblance measure. Using the cardinality ratio in the example, we have that S;min ðA; BÞ ¼ 0:17 and S;min ðC; BÞ ¼ 0:5:

sets of objects. Let us now consider the comparison of more general kind of objects. An object can be viewed as a set of values that correspond with the set of attributes that describe the type of the class the object belongs to. Each of these values can be in turn a basic object, another object, or a fuzzy set of objects (basic or not). In general, an object is considered to be complex if other objects take part in its definition. When comparing two objects, the starting point is to study the resemblance between the attribute values that describe their definition. The previous sections have provided us with tools to face the attribute comparison. In this section we are going to explain the way to aggregate the resemblance degrees obtained from the comparison of the attributes values in order to get a general resemblance measure of the two objects. 5.1. Formal definition of the problem

4.2.2. Consistency degree between fuzzy sets In Section 4.2.1 we have studied the way of matching both directions of inclusion into the one resemblance degree. To achieve this we have proposed combining the degrees using a t-norm. The use of a t-norm as an aggregation function may be a bit restrictive, but it is suitable if we want to use resemblance to generalize the concept of equality. In some cases it is interesting to study the consistency degree between the definition of two fuzzy sets instead of studying the resemblance degree. Imagine we are interested in the resemblance of two collections and we only want to know to which extent one of the collections can be completed to match the other collection. In this case a less restrictive resemblance measure would be of interest. Definition 4. (Consistency degree between fuzzy sets). Let A and B be two fuzzy sets defined over a finite reference universe U over which a resemblance relation S is defined, and % a t-conorm. The consistency degree between A and B restricted by % is calculated as follows: S;% ðA; BÞ

¼ %ðQS ðBlAÞ; QS ðAlBÞÞ

ð9Þ

With the consistency degree we intend to calculate to which extent there exists an inclusion relation between the contents of both sets (in at least one of the two directions). This measure, as the reader can easily see, is also a resemblance measure.

5. Calculating objects resemblance The problem we want to solve is to establish the resemblance degree between two objects that belong to the same class. In the previous sections we have seen the way to calculate the resemblance between basic objects and fuzzy

Firstly, let us formally define the problem. Let o1 and o2 be two objects of the class C; characterized by the type TC ; whose structural component StrC is characterized by the attributes set {a1 ; a2 ; …; an }: Our goal is to find the resemblance between o1 and o2 from the aggregation of the resemblance degrees that can be observed in the values of their attributes. We can outline the problem as a multi criteria aggregation problem, where different resemblance opinions (one for each pair of values of the same attribute) must be aggregated to obtain a resemblance consensus. Let Sai ðo1 ; o2 Þ stand for the resemblance degree observed between the values of attribute ai in objects o1 and o2 ; and Sðo1 ; o2 Þ stand for the aggregated resemblance opinion we want to calculate. 5.2. Attribute importance It is natural to think that not all of the resemblance degrees calculated for the attribute values of the objects being compared have the same weight when computing the resemblance between the objects. For example, if we are comparing people, some attributes may be more determining (e.g. name, father, brother) while others may be less determining (e.g. weight, age). In order to reflect this fact, let us consider that every attribute ai has associated a weight pai that points out the importance that the resemblance in this attribute must have when computing the resemblance degree between objects of this class. Without sacrificing generality we are going to consider that ;i; pai [ ½0; 1: 5.3. Resemblance values aggregation The calculus of the resemblance degree between two objects of the same class is governed by the following

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

special vague sentence: Two objects are similar if: ‘Most of the important attributes of the class present similar values in the objects’. Let us consider some interesting points with respect to the previous sentence: † The resemblance between the values that both objects have for a given attribute can be measured by means of a degree. In general, we will have a set of values {Sai ðo1 ; o2 Þ [ ½0; 1}: † The importance of an attribute in determining the resemblance between two objects must be given by the class designer. In general, we will have a set of values {pai }; also belonging to the [0,1] interval. † The term most is a linguistic quantifier that denotes the extent to which we want the whole set of attributes to be taken into account when computing the resemblance degree between two objects of the class. In the literature there are two main types of vague sentences involving linguistic quantifiers [13,14]. They can be represented as follows:

437

query environment: †

Zadeh’s approach [14] calculates the accomplishment degree using the relative cardinal of A with respect to D ðaA=D Þ; computing the compatibility degree of this cardinal with the quantifier Q definition: 0X 1 ðmA ðxÞ> mD ðxÞÞ B x[X C C ð12Þ X ZQ ðA=DÞ¼ mQ ðaA=D Þ¼ mQ B @ A mD ðxÞ x[X

†

Yager’s approach [13] is founded on the use of the Ordered Weighted Average (OWA) [15,16] aggregation operator. This operator involves the calculus of two sequences of values and its ranking from the highest to the lowest value: ·{ci ¼ mA ðxi Þ_ð12 mD ðxi ÞÞ}; with xi [X: ·{di ¼ mD ðxi Þ}; with xi [X: If {bi } and {ei } are the descendant ordered permutations of {ci } and {di } (respectively), then the accomplishment degree is given by the following formulation: YQ ðA=DÞ¼

n X

ð13Þ

wi ·bi

i¼1

† Type I: Q of X are A (e.g. ‘Most of the students are tall’). † Type II: Q of D are A (e.g. ‘Most of the tall students are intelligent’). where Q is a linguistic quantifier, X is a finite reference universe, and A and D are vague properties, both defined as fuzzy sets over X with membership functions mA and mD ; respectively. The vague expression that begins this section can be easily transformed in a Type II sentence. We only have to consider the following:

P P where wi ¼ mQ ðei = n1 di Þ2 mQ ððei = n1 di Þ21Þ † Vila’s approach [17] is founded on the concept of coherent family of quantifiers. Let us look at this proposal in more detail. Definition 5. (Coherent family of quantifiers). Let Q ¼ {Q1 ; …; Ql } be a set of linguistic quantifiers. Q is a coherent family of quantifiers if it verifies the following properties: The membership functions of the elements of Q are non-decreasing. (ii) A partial order relation X is defined in Q, and this relation has as maximal element Q1 ¼ ’ and as minimal element Ql ¼ ;: Besides, ;Qi ; Qj [ Q; Qi # Qj ) Qj X Qi : (iii) The membership function of the ’ quantifier is mQ1 ðxÞ ¼ 1 if x – 0 and mQ1 ð0Þ ¼ 0; and the membership function of the ; quantifier is mQl ðxÞ ¼ 0 if x – 1 and mQ1 ð1Þ ¼ 1: (i)

† Q is the quantifier most. † X ¼ {a1 ; a2 ; …; an } is the set of attributes that characterize the type of the class. † D is the set of the relevant attributes for the resemblance calculus. † A is the set of the attributes that have similar values in both objects. The membership function of these latter fuzzy sets are as follows:

mD ðai Þ ¼ pai ; ;ai [ X

ð10Þ

mA ðai Þ ¼ Sai ðo1 ; o2 Þ; ;ai [ X

ð11Þ

The next step in the resemblance calculation process is to calculate the accomplishment degree of the Type II sentence from the quantifier and the previously defined membership functions. The literature offers very different approaches to solve this kind of problems in the fuzzy

The basic idea is to consider that the fulfillment degree of the sentence is between the accomplishment degree of the extreme sentences ‘There exists a D that is A’ and ‘All Ds are A’: † The accomplishment degree of the sentence ‘There exists a D that is A’ is calculated as follows: ’x [ X; ðmD ðxÞ ^ mA ðxÞÞ ¼ max ðmD ðxÞ ^ mA ðxÞÞ x

ð14Þ

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

438

† The accomplishment degree of the sentence ‘All Ds are A’ is the next one: ;x [ X;ðmD ðxÞ! mA ðxÞÞ ¼ min ðmA ðxÞ_ð12 mD ðxÞÞ ð15Þ x

From the above Vila’s approach for the calculus of the accomplishment degree of a Type II sentence can be carried out as follows.

type choose the quantifier that must be used to compare the objects of the class. That is, the designer will give the value of gQ he wants to use to combine both extreme cases. In this way, at the same time that we give freedom to the designer, we simplify a lot the calculus of the accomplishment degree. 5.4. Complex objects: recursivity

Definition 6. (Accomplishment degree VQ ). Let D and A be two fuzzy sets with the same domain X and Q be a coherent family of quantifiers such that every Qi has associated a value gi in [0,1] verifying:

g’ ¼ 1; g; ¼ 0; ;Q; Q0 [ Q; Q X Q0 Þ ) gQ $ gQ0 : ð16Þ In these conditions, the accomplishment degree of the sentence ‘Q Ds are A’ is calculated by means of the following formulation: VQ ðA=DÞ ¼ gQ max ðmD ðxÞ ^ mA ðxÞÞ x

þ ð1 2 gQ Þ min ðmA ðxÞ _ ð1 2 mD ðxÞÞÞ x

ð17Þ

As gi values, the authors propose the use of a value based on the orness measure given by Yager in Ref. [15]. Such an approximation is calculated as follows: ð1 oQ ¼ mQ ðxÞdx ð18Þ 0

The final expression for the calculus of the accomplishment degree is: VQ ðA=DÞ ¼ oQ max ðmD ðxÞ ^ mA ðxÞÞ x

þ ð1 2 oQ Þ min ðmA ðxÞ _ ð1 2 mD ðxÞÞÞ x

ð19Þ

We propose this latter approach for the aggregation of resemblance opinions. We do so because it has similar behaviour to that of Yager when the quantifier is close to the ; quantifier (as is the case here) and is easier to calculate (the calculus of oQ has to be performed only once for each quantifier). With respect to Zadeh’s approach, this method is less strict and has a similar calculus complexity. A comparative study of the three approaches can be found in Ref. [18]. Fig. 7 summarizes the process. In order to compare objects o1 and o2 ; we first compare their attribute values, obtaining the partial resemblance opinions Sai ðo1 ; o2 Þ: Then using VQ we obtain a general resemblance measure between the two objects. In order to apply Vila’s approach to the problem of resemblance aggregation we only need to determine the semantics of the quantifier we want to use. Instead of using the same quantifier to compute resemblance in all the classes, it is reasonable to let the designer of the class

From the previous sections it may seem that the problem of object resemblance is solved: couples of attribute values are compared obtaining partial resemblance degrees, and then a final degree is obtained by means of an aggregation process. However, the high interrelation amongst data in the object-oriented data model causes complexity in the calculus of object resemblance in a given class. We refer to the complex objects case and the cycle presence in the relationships graph associated to a given schema. In order to deal with the resemblance calculus between complex objects, the only solution is to propagate the problem by means of recursivity. Suppose that we are comparing objects o1 and o2 ; and the values of a given attribute in these objects are objects o3 and o4 : To compute the resemblance between o1 and o2 ; we previously need to compute the resemblance between o3 and o4 ; and so on. The use of recursivity is not a problem unless there are cycles in the relationships graphs. For example: Let us consider the graph in Fig. 8. Suppose that we have two objects of class A; namely, oA1 and oA2 ; and we want to calculate their resemblance degree. When we analyze attribute a; we will need to study the resemblance degree between two objects of the class B; for example oB1 and oB2 : When trying to solve this latter resemblance, we will need to study the resemblance between two objects oC1 and oC2 : Finally, to know the resemblance between these objects, we will need to compute the resemblance between two objects of the initial class A: The cycle in the above example introduces the following problems: † Firstly, we may have to solve a comparison problem in the same class that was our starting point. This increases the complexity of the problem and suggests the necessity of exploring a wide data set in order to establish the objects resemblance. † Secondly, it may be that the resemblance degree of the objects we want to compare is part of the recursive tree generated in the computation. For example, if attribute c led back to object oA1 and oA2 again, we would have an infinite cycle. The first problem is unavoidable because is due to the high interrelation of data in the object-oriented model. The

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

439

Fig. 7. Dealing with complex object comparison.

second is far more dangerous because it prevent us from ending the computation. We can solve it by means of the following:

the type of a class, in accordance with their possibility of getting into a cycle within the application scheme where they are defined.

† Not propagating recursively, ignoring this in the general calculus. † Approximating the value in some way, making several iterations until reaching the final value. † Directly approximating the value with another semantically valid one.

Definition 7. (Superficial attributes). Let C be a class whose type is made up by the set of attributes StrC ¼ {a1 ; a2 ; …; an }: We define the set SStrC of superficial attributes of C as the subset of attributes of StrC that cannot get into a cycle that returns to C in the schema where the class is defined.

The first alternative (not propagating recursively) is not suitable because we may be ignoring important information. Although the second alternative (the use of an initial approximation and then iterate) is acceptable it requires a considerable calculus effort and a higher algorithmic complexity (more than one cycle may appear in a normal recursivity process). Our proposal is to use the third alternative. We can unfold the objects resemblance in two different ways: one that expresses the surface resemblance between the objects and the other based on an object exploration in depth. The first will be based on the object attributes which will not involve the cycling problem and the other will be a resemblance obtained taking into account the attributes that need recursive monitoring. The surface resemblance can be used as an approximation when, in the calculus of the second one (the real resemblance), a cycle is detected (see Fig. 9). Let us explore this in more detail. 5.4.1. Surface resemblance We are going to separate, into two different partitions, the attributes that characterize the structural component of

5.4.2. Deep resemblance: resemblance between two objects Using Definition 7 and ideas presented in the previous sections, we can define the calculus of the resemblance between two objects of a given class as follows. Definition 8. (Resemblance between two objects). Let C be a class whose type is made up by the set of attributes StrC ¼ {a1 ; a2 ; …; an }: Let o1 and o2 be two objects of C: We define the resemblance (SR) between the objects o1 and o2 as the value returned by the following function: SR : FC £OðFC Þ£OðFC Þ£PðOðFC Þ£OðFC ÞÞ£{0;1}!½0;1

Fig. 8. The problem of cycles.

N. Marı´n et al. / Information and Software Technology 45 (2003) 431–444

440

Fig. 9. Two ways of recursive call to deal with cycles.

where FC is the family of all the classes and OðFC Þ is the set of all the objects that exist in the database. The calculus of SRðC;o1 ;o2 ; V;tÞ involves the following case selections: If o1 ¼ o2 ; then: SRðC;o1 ;o2 ; V;tÞ ¼ 1

† There are two basic cases: ·When an identity equality holds between the objects. ·When a known defined resemblance relation exists in the class. † The third case uses the generalized resemblance degree to compare two fuzzy sets of objects, using recursivity in order to compare the elements of the sets. † The fourth case uses the variable V to determine the existence of a cycle. If this is the case, a recursive call is made that only focuses on the superficial attributes. † The last case is a general recursivity model, in which, according to the parameter t; the aggregation operator VQ is applied over the recursive calls using couples of attribute values. If t is equal to 1, only the surface is explored. Otherwise all attributes are studied in depth.

ð20aÞ

If exists a resemblance relation S defined in C; then: SRðC;o1 ;o2 ; V;tÞ ¼ mS ðo1 ;o2 Þ

ð20bÞ

Because the properties of the operators used in each of the basic cases are those of a resemblance relation, the operator SR is also a resemblance measure amongst objects of a given class.

If o1 and o2 are fuzzy sets then: SRðC;o1 ;o2 ; V;tÞ ¼

SRV;^

ðo1 ;o2 Þ ¼

^ðQSRV ðo2 lo1 Þ; QSRV ðo1 lo2 ÞÞ

5.5. Organization in an object oriented model ð20cÞ

If ðo1 ;o2 Þ [ V ^t – 1; then: SRðC;o1 ;o2 ; V;1Þ

ð20dÞ

Otherwise: oQ max ðpai ^SRðCai ;o1 ·ai ;o2 ·ai ;V

Lihat lebih banyak...

Complex object comparison in a fuzzy context

Descripción

Comentarios