Navigational accesses in a temporal object model

Share Embed


Descripción

Navigational Accesses in a Temporal Object Model Elisa Bertino 1

Elena Ferrari 1

Giovanna Guerrini 2

1

Dipartimento di Scienze dell'Informazione Universita degli Studi di Milano Via Comelico 39/41 - I20135 Milano, Italy e-mail: bertino,ferrarie @dsi.unimi.it f

2

g

Dipartimento di Informatica e Scienze dell'Informazione Universita di Genova Via Dodecaneso, 35 - I16146 Genova, Italy e-mail: [email protected] Technical Report 169-96

1

Abstract The management of temporal data is a crucial requirement for many data intensive applications. A considerable research e ort have been devoted in the past years to query languages for temporal data in the context of the relational model. Recently, temporal object-oriented data models have been proposed. In this paper, we focus on the navigational approach to querying object-oriented databases and show how this approach can be extended to a temporal context. In particular, we address issues related to the correctness of such navigational accesses.

2

1

Introduction

The importance of temporal data management has long been recognized by the database community and many techniques for modeling and managing temporal data have been introduced [19, 21]. Most research in temporal databases has been developed in the framework of the relational data model [8, 10, 17, 20]. However, also temporal object-oriented databases have recently received increasing attention, and several object-oriented data models have been proposed [16]. The reason for this interest is that most applications for which object-oriented database management systems are expected to provide support, exhibit some form of temporality. Examples are engineering databases, multimedia systems, and oce information systems. However, as pointed out also by Snodgrass in [16], in contrast to temporal relational data models, the speci cation of temporal object-oriented data models is in most cases informal. To overcome this drawback, we have proposed in [1] a formal temporal object-oriented data model, and we have addressed on this formal basis several issues deriving from the introduction of time in an object-oriented context. A considerable research e ort on temporal databases has been devoted to the design of temporal query languages, for both the relational and the object-oriented data model. In particular, temporal extensions of relational algebra [8, 18], relational calculus [18], and relational commercial query languages such as QUEL [15] and SQL [13, 17] have been proposed. In those temporal query languages, a relevant operation is represented by the temporal join. Indeed, in temporal relational query languages the join relational operator has two di erent avors, referred to as 2-join and timejoin in [8]. Two di erent join operators are required because in temporal data models two kinds of values (ordinary and temporal) are represented. 2-join plays the same role as in nontemporal relational data models, and allows to compare only attribute values that occur at the same point in time, whereas temporal join allows to impose conditions on times associated to tuples. Several temporal object-oriented query languages have also been proposed [16]. Many of these query languages are extensions of relational query languages, like DAPLEX [14], QUEL [12] and SQL [9], rather than of existing nontemporal object-oriented query languages. It is however important to note that object-oriented database systems [3] support a navigational approach for data access, in addition to traditional query language constructs available in relational database systems. This modality must be taken into account in designing temporal object-oriented query languages. The navigational approach is based on object identi ers and aggregation relationships: given an OID, the system directly accesses the corresponding object and navigates through objects referred to by its components. This access modality can be combined with a classical (e.g. SQL-like) access. Thus, the conditions in a query are imposed on nested attributes of the hierarchy rooted at the object under examination. Path expressions allow to conveniently describe joins, aiming at getting a component from an object. In an object-oriented query language a distinction can be made between implicit joins, corresponding to the hierarchical structure of objects, and explicit joins, analogous to the relational ones, explicitly comparing two objects. While issues related to explicit joins are quite similar to those arising in the relational context, implicit joins poses some new problems. Indeed, when the value of an object attribute is the identi er of another object, the identi er can be seen as a pointer to the referred object. Obviously, for the access to be correct, that pointer must not be dangling. In this paper, we investigate issues related to implicit joins and navigational accesses in a temporal context. Therefore, the goal of this paper is not to propose a new temporal object-oriented query language, rather is to investigate the impact of time on the peculiar features of object-oriented query languages. To this purpose, we rst introduce a formal notion of temporal path expression. Temporal path expressions are obtained as an extension of classical path expressions of objectoriented languages, in that for each attribute access a time can be speci ed, in addition to the 3

attribute name. The time can be expressed either explicitly, by specifying a time instant or a set of time instants, or implicitly, by means of a formalism to symbolically denote sets of time instants. Then, we investigate the notion of path expression correctness. As remarked by Cli ord and Croker in [8], a temporal model must enforce referential integrity constraint with respect to the temporal dimension. For example, the information that an employee worked in a division at time t, is correct if both the employee and the division existed in the database at time t. This means that some correctness conditions should be imposed on a database to ensure it satis es temporal constraints. We have proposed a notion of consistency for a temporal object-oriented database [1]. However, the consistency of the database alone is not enough to ensure that all the navigations through objects are correct. Thus, in this paper we investigate the issue of correctness of navigational accesses, and whether and how correctness can be statically veri ed. To best of our knowledge, this is the rst extensive investigation concerning navigational accesses in a temporal context and addressing the problem of a static analysis of path expressions. One of the few papers considering navigational access is the one by Cheng and Gadia [7]. Their language OOTempSQL provides a sublanguage for associative navigation relying on notions very similar to our concept of temporal expression. However, neither a formal semantics is given for the language nor correctness conditions have been stated. The remainder of the paper is organized as follows. Section 2 presents a brief overview of the T Chimera data model. Section 3 describes the formalism we use to symbolically denote a set of time instants in a temporal path expression. Section 4 formally introduces temporal path expressions and addresses the problem of path expression correctness. Section 5 deals with the static analysis of path expressions, whereas path expression equality is considered in Section 6. Finally, Section 7 concludes the paper. The paper includes two appendixes reporting the syntax of the language we use to specify boolean expressions and a sketch of the proofs of main results, respectively. 2

The Reference Temporal Ob ject-Oriented Model

T Chimera is the temporal extension of the Chimera object-oriented data model [11]. Chimera provides all concepts commonly ascribed to object-oriented data models, such as: object identity, complex objects, user-de ned operations, classes and inheritance. In the following, we denote with OI a set of object identi ers, with CI a set of class identi ers, that is, class names and with AN a set of attribute names. Moreover, V denotes the set of T Chimera legal values. In T Chimera the notion of type is supported. The existence of a nite set of basic prede ned value types is postulated, containing the types integer, real, bool, character and string. The type time is also a prede ned value type, used in the de nition of temporal types. T Chimera supports structured types such as sets, lists and records, and allows the use of class names in the de nition of structured types. In addition to the above-mentioned nontemporal types, T Chimera supports a collection of temporal types, to handle in a uniform way temporal and nontemporal domains. For each type T , a corresponding temporal type, temporal(T ), is de ned. Intuitively, instances of type temporal(T ) are partial functions from instances of type time to instances of type T . In T Chimera, temporal types can be used in the de nition of structured types. A function T 0 is de ned, which takes as argument a temporal type temporal(T ), and returns the corresponding static type T .1 We assume as the domain of the type time the domain T IME = f0,1,. . .,now, . . .g, isomorphic to the set of natural numbers IN. Symbol `0' denotes the relative beginning, while now denotes the 1

Function

T 0 is the identity function on nontemporal types. 4

current time. An interval, denoted as I = [t1; t2 ], is a set of consecutive time instants, including all time instants between t1 and t2 , t1 and t2 included. A single time instant t can be represented as the time interval [t; t], while [; ] denotes the null interval. The time dimension considered by our model is valid time. A class in T Chimera consists of two components: the signature, containing all the information for the use of the class and its instances, and the implementation, providing an implementation for the signature. A lifespan is associated with each class, representing the time interval during which the class has existed. We assume the lifespan of a class to be contiguous. The class signature contains information about the attributes (name and domain) and the methods (name and type of input and output parameters) of its instances. Attributes in T Chimera can be either temporal, immutable or nontemporal. A temporal (or historical) attribute is an attribute whose value may change over time, and whose values at di erent times are recorded in the database. An immutable attribute is an attribute whose value cannot be modi ed during the object lifetime, whereas a nontemporal (or static) attribute is an attribute whose value can change over time, but whose past values are not meaningful for the application at hand, and are thus not recorded in the database. Immutable attributes can be regarded as a particular case of temporal ones, since their value is a constant function from a temporal domain. Temporal attributes have temporal types as domains and their values are functions from the temporal domain time to the set of legal values for the attribute. Throughout the paper we represent the value of a temporal attribute of type temporal(T ) as a set of pair fh1; v1i; . . . ; hn; vn ig, where v1; . . . ; vn 2 V are legal values for type T , and 1; . . . ; n 2 T IME 2 T IME are time intervals such that the attribute assumes the value vi for each time instant in i , i = 1; . . . ; n. Given a class c 2 CI , A(c)  AN denotes the set of attributes of instances of that class, whereas dom(a; c) denotes the domain of attribute a in class c. Finally, since the objects belonging to a class vary over time, each class maintains the history of the objects instances or members of the class over time.2 The set of objects members of a class changes dynamically over time. Thus, to represent the extension of T Chimera classes, we introduce a function  : CI 2 T IME ! 2OI , assigning an extent to each class, for each instant t. For each c 2 CI ,  (c; t) is the set of the identi ers of objects that, at time t, belongs to c either as instances or as members.

Example 1 Suppose that i1; . . . ;i5

2 OI . The following are examples of T Chimera classes:

 class employee, with lif espan = [10,now], such that: { A(employee) = fname, salary, status, division, managerg { dom(name; employee) = temporal(string), dom(salary; employee) = temporal(integer), dom(status; employee) = string, dom(division; employee) = temporal(string), dom(manager; employee) = temporal(manager) {  (employee; t) = fi1 ; . . . ;i5 g 8t 2[10,now]

 class manager, with lif espan = [10,now], subclass of employee, such that: { A(manager) = A(employee) [ fdependents; official carg 2

According to the usual terminology, an object is an instance of a class c, if c is the most speci c class, in the inheritance hierarchy, to which the object belongs. If an object is an instance of a class it is also a member of all the superclasses of this class.

5

{ dom(dependents; manager) = temporal(set-of(employee)), dom(official car; manager) = string {  (manager; t) = fi2; i4g 8t 2[10,now] Instances of class employee contain an immutable attribute name, whose value never changes during the instance lifespans, a static attribute status, for which we do not keep track of the history of changes, and three temporal attributes salary, division and manager, for which we record the whole history of changes. Class manager is a subclass of employee with the additional attributes dependents and official car. 4 The notion of T Chimera object is formalized by the following de nition.

De nition 1 (Object) An object o is a 4-tuple (i; lif espan; v; class-history ), where

2 OI is the oid of o; lif espan 2 (T IME 2 T IME ) is the lifespan of o; v 2 V is a record value (a1 : v1 ; . . . ; an : vn ), where a1; . . . ; an 2 AN are the names of the attributes of o, and v1; . . . ; vn 2 V are their corresponding values; class-history is a set fh1; c1i; . . . ; hn; cn ig, where 1 ; . . . ; n 2 T IME 2 T IME are time intervals, c1 ; . . . ; cn 2 CI are class identi ers, such that ci is the class identi er of the most speci c class to which o belongs in i , 1  i  n. i

2 Example 2 Consider the classes of Example 1, and suppose that i1; . . . ;i5 are examples of T Chimera objects:

2 OI . The following

i =i1 lif espan = [20; now] v = f(name: fh[20,now],`Alan Smith'ig ), (salary: fh[20,60],15ki; h[61,now],30kig), (status: `full-time'), (division: fh[20,100],`Disks'i; h[101,now],`Printers'ig), (manager: fh[20,49],i4i; h[50,now],i2ig)g class-history = fh[20,now],employeeig i = i2 lif espan = [50,now] v = f(name: fh[50,now],`Mary Dole'ig), (salary: fh[50,now],50kig), (status: `full-time'), (division: fh[50,now],`Printers'ig), (manager: fh[50,now],i4ig), (dependents: fh[50,100],fi1, i3 gi; h[101,now],fi1,i3 ,i5ig)g class-history = fh[50,now],managerig

4

Given an oid i, we make use of function : OI 2 T IME ! CI to refer the class the object denoted by i belongs at time t, that is, (o:i; t) = c, if h; ci 2 o:class-history and t 2  . Note that

(o:i; t) is de ned for any t 2 o:lif espan. 6

We require that each object is a consistent instance of all the classes it belongs to. Our notion of consistency keeps into account that, in a temporal context, both the object state and the class an object belongs to change over time. Therefore, verifying the consistency of an object requires two steps. First the set of attributes characterizing the object for each instant t of its lifespan must be determined. Then, the correctness of their values, with respect to the most speci c class3 the object belongs at time t, must be veri ed. Note that, if we consider an instant t lesser than the current time, we are able to identify only the temporal attributes characterizing the object at time t, since for static attributes we record only their current values. Thus, for instants lesser than the current time, it only makes sense to check the correctness of the values of the temporal attributes of the objects. Moreover, at the current time also the correctness of the values of the static attributes, with respect to the most speci c class the object currently belongs, must be checked. We refer the interested reader to [1] for further details. 3

Temporal Expressions

Generally, in a nontemporal object-oriented database a query selects objects based on the evaluation of boolean expressions, involving attribute values, method invocation results, and so on. In a temporal context, queries must allow the retrieval of objects satisfying a given boolean expression (or a set of boolean expressions) for a speci c set of time instants, that can be de ned either explicitly or implicitly.

Example 3 The query: select the employees that earned more than their managers in the time interval [10,100], is an example of query in which the set of time instants in which the query condition must be veri ed is given explicitly. By contrast, the query: select the employees that work in the Printers department when it was headed by Mary Dole, is an example of query in which the set of time instants is implicitly speci ed.

4

To implicitly represent the time with respect to which a boolean expression must be evaluated, we need a formalism to symbolically denote a set of time instants. The symbolic formalism we use is similar to the one proposed by Gadia and Nair in [10], based on the notion of temporal expression. A temporal expression is a symbolic representation of a set of time instants. The main di erence between our notion of temporal expression and the one in [10] is that we allow the use of the selection operators: f irst(), last(), slice(), f irst instant(), last instant(), inst slice() in the de nition of temporal expressions (see De ntion 3 introduced later on). In the following, we use a set of disjoint intervals4 7 = f[ti,tj ], . . .,[tr ,ts ]g as a compact notation for the set of natural numbers included in these intervals. The operations of union (71 [ 72 ), intersection (71 \ 72 ), di erence (71 n 72 ), and complement (7c ) have the usual semantics of set operations. Moreover, t 2 7 is true if t is one of the natural numbers represented by 7. Finally, we de ne a projection operation 5(7; n), that takes as input a set of disjoint time intervals 7 and a natural number n. Function 5() orders the elements in 7 in increasing order, with respect to their upper bound, and returns the n-th interval in the ordering, if the cardinality of 7 is lesser than or equal to n; it returns null otherwise. For ex3 The consistency of an object is checked only with respect to its most speci c class. If an object is consistent with respect to its most speci c class, it is consistent with respect to all its superclasses. 4 Two intervals are considered disjoint if they cannot be collapsed into a single one (note that [1,2] and [3,4] are not disjoint).

7

ample, 5(f[1; 20]; [30; 40]; [70; 200]; [230; now]g; 3) = 5(f[1; 20]; [30; 40]; [230; now]; [70; 200]g; 3) = [70,200], whereas 5(f[1; 20]; [30; 40]; [70; 200]; [230; now]g; 5) = null.

In the following BE denotes the set of boolean expressions. Boolean expressions are speci ed using the language described in Appendix A. Before formally de ning the notion of temporal expression, we need to introduce the following de nition.

De nition 2 (Temporal Interpretation) Let be be a boolean expression. The temporal interpretation of be (written [[ be ]] ) is the set of instants in which be is true. 2 Example 4 The temporal interpretation of the boolean expression (salary > 20k and division = `Printers' ), when the above expression is evaluated on object i1 of Example 2, is f[101,now]g, whereas the temporal interpretation of (salary > 40k) is the empty set, as attribute salary has never reached this value during i1 lifespan. Note that the temporal interpretation of (salary>40k), when it is evaluated on object i2 , is f[50,now]g. Finally note that the interpretation of (status = `full-time') is the set f[now,now]g, on both i1 and i2 , as attribute status is static. This does not imply that attribute status has assumed a value di erent from `full-time' for the instants lesser then the current time. However, since we record the value of the attribute only at the current time, we are able to check the truth of the condition only at t =now. 4 De nition 3 (Temporal Expression) The set T E of T Chimera temporal expressions is recursively de ned as follows:

 the temporal interpretation of a boolean expression is a temporal expression: 8be 2 BE ; [[ be ]] 2 T E;  if te1 and te2 are temporal expressions, then te1 ^ te2 , te1 _ te2, te1 0 te2, :te1 are temporal expressions;

 if te is a temporal expression, then f irst(te), last(te), slice(te; n), f irst instant(te), last instant(te), inst slice(te; n) are temporal expressions.

2 Example 5 [[ salary = 20k and status = `full-time' ]] , [[ division = `Disks' ]] 0 [[ salary < 30k ]] , f irst( [[ salary > 40k or manager.name = `Mary Dole' ]] ) are examples of temporal expressions. 4 Each temporal expression uniquely denotes a set of time intervals. The semantics of a temporal expression, that is, the set of intervals it denotes, is formalized by means of function : T E ! 2T IME2T IME , de ned as follows.

De nition 4 (Temporal Expression Semantics) Let te be a temporal expression. The semantics of te, denoted as (te), is de ned as follows:

 ( [[ be ]] ) = [[ be ]] , 8be 2 BE ;  (te1 _ te2 ) = (te1) [ (te2), 8te1; te2 2 BE ;  (te1 ^ te2 ) = (te1) \ (te2), 8te1; te2 2 BE ; 8

       

(te1 0 te2 ) = (te1) n (te2), 8te1 ; te2 2 BE ; (:te) = (te)c, 8te 2 BE ; (f irst(te)) = 5( (te); 1), 8te 2 BE ; (last(te)) = 5( (te); m), 8te 2 BE , m =j te j; (slice(te; n)) = 5( (te); n), 8te 2 BE ; (f irst instant(te)) = f[t; t]g, where t = min(5( (te); 1), 8te 2 BE ; (last instant(te)) = f[t; t]g, where t = max(5( (te); m), 8te 2 BE , m =j te j; (inst slice(te; n)) = f[t; t]g, where t is the n-th instant in 1 [ . . . [ n , 1; . . . ; n are time intervals such that (te) = f1; . . . ; ng.

2 Example 6 Consider objects i1 and i2 of Example 2: ( [[ salary > 20k and division = `Printers' ]] _ [[ manager.salary > 40k ]] ) = f[101,now]g [ f[50,now]g = f[50,now]g, when the expression is evaluated on i1 . Similarly, (f irst instant( [[ manager.salary > 40k ]] ) = f[50,50]g, (last( [[ division = `Printers' or salary < 20k ]] )) = f[101; now]g. 4

In the following we refer to temporal expressions denoting a single time instant as instant-valued temporal expressions. Formally, a temporal expression te is an instant-valued temporal expression if and only if (te) = f[t,t]g.5 4

Temporal Path Expressions

Temporal path expressions are obtained as an extension of classical path expressions of objectoriented languages [4], in that for each attribute access a time can be speci ed, in addition to the attribute name. In such a way, the value of the attribute at a speci ed time is denoted. The time can be expressed either explicitly, by specifying a time instant or a set of time instants, or implicitly, by means of a temporal expression. The language to express path expressions allows the nesting of attribute accesses expressed by means of post x dot notation, but the restriction is imposed that all attribute accesses in the path expression, but the last one, are quali ed with a time instant, and not with a set of time intervals, thus denoting a nontemporal value, on which a further attribute access can be speci ed. Alternatively, if a set of time intervals were speci ed to qualify an intermediate attribute access in the path expression, that set would be seen as the set of time instants belonging to the intervals, so that the access would denote a set of nontemporal values, on which further accesses can be evaluated. The following subsections give a formal treatment of path expressions in a temporal context, and address the problem of path expression semantics and correctness. 5

The usefulness of instant-valued temporal expressions will be made clear in the following section.

9

4.1

Syntax

In this section we formally de ne the set of T Chimera path expressions. In the following we denote with V ar a set of object-denoting variables.

De nition 5 (Path Expression) The set PE of T Chimera path expressions is recursively de ned as follows:

 a simple path expression is de ned as follows: { if X 2 V ar is an object-denoting variable, then X is a simple path expression; { if e is a simple path expression, a 2 AN is an attribute name and te 2 T E is an instant-valued temporal expression, then e:a # te is a simple path expression; { if e is a simple path expression, a 2 AN is an attribute name and t 2 T IME is a time instant, then e:a # t is a simple path expression; { if e is a simple path expression and a 2 AN is an attribute name, then e:a is a simple path expression;

 a terminal path expression is de ned as follows: { if e is a simple path expression, a 2 AN is an attribute name and te 2 T E is a temporal expression, then e:a # te is a terminal path expression; { if e is a simple path expression, a 2 AN is an attribute name and 7 22T IME2T IME is a set of time intervals, then e:a # 7 is a terminal path expression; { if e is a simple path expression, a 2 AN is an attribute name, then e:a # all is a terminal path expression;

 simple path expressions and terminal path expressions are path expressions. 2 Example 7 Consider the classes of Example 1, and let X be a variable of type employee. The following are examples of path expressions: X:division # 30 X .salary # last interval(manager.name = `Mary Dole') X .manager # (f irst instant(division = `Printers')).salary.

4 4.2

Semantics

The semantics of a path expression (i.e., the value denoted by it) can only be speci ed starting from an oid-assignment, that is, a function assigning oids to object-denoting variables. However, the value denoted by a path expression depends on the temporal speci cations it contains. Consider rst the case of a path expression for which a time instant t is speci ed (either implicitly or explicitly). If the path expression is of the form e:a # t, that is, a temporal speci cation occurs in the terminal part of the path expression, then it denotes the nontemporal value o:v:a(t), where o 10

is the object identi ed by the oid to which e evaluates.6 Consider now an expression e:a, that is, an expression with no temporal speci cation in the terminal part. The most intuitive interpretation is that, if no time is speci ed, the expression is evaluated at the current time, that is, e:a is simply a shorthand for o:v:a # now. However, consider a sequence of attribute accesses composing a path, that is, suppose e:a = e0 :b # t:a. Let o be the object identi ed by the oid to which e0 evaluates. Because of the consistency of the database, o:v:b is certainly de ned at time t and so its attribute a is; by contrast, a could be unde ned at time now. Thus, when a path expression contains a temporal quali er, it is evaluated at the time denoted by the quali er till another explicit temporal quali er is encountered. Therefore the path expression X:a # t:a1: 1 1 1 :ai01 :ai # t0 :ai+1 : 1 11 :an01 :an is equivalent to X:a # t:a1 # t: 1 1 1 :ai01 # t:ai # t0 :ai+1 # t0 : 1 1 1 :an01 # t0 :an # t0 . A path expression e:a, where e does not contain any temporal speci cation, is simply a shorthand for e:a # now. Consider now the terminal path expression e:a # 7, where 7 is a set of time intervals (either explicitly or implicitly denoted). The value associated with this expression is o:v:aj7, that is, the restriction of the function which is the value of o:v:a to the time intervals in 7. However two di erent interpretations of this path expression are possible: according to the strong interpretation the denoted function must be de ned for each t 2 7, otherwise the path expression denotes no value. According to the weak interpretation, the path expression denotes a function which can be partial on 7. We assume that the default interpretation of a path expression is the weak one. However, the strong interpretation can also be required by using the alternative syntax e:a + 7. A (terminal) path expression e:a # all denotes the temporal value o:v:a whenever it is de ned. We now formalize these notions. First we introduce the time associated with a path expression e, denoted as 0(e).

De nition 6 (Time of a Path Expression) Let e be a simple path expression. Its time, denoted as 0(e), is a time instant in T IME recursively de ned as follows:

 if e = X 2 V ar, 0(e) = now;  if e = e0:a # t, t 2 T IME , 0(e) = t;  if e = e0:a # te, where te is an instant-valued temporal expression and (te) = f[t; t]g, then 

0(e) = t; if e = e0 :a, then 0(e) = 0(e0 ).

2 Example 8 Referring to Example 7: 0(X:division # 30) = 30 0(X .manager # (f irst instant(division = `Printers'))) = 101. 4

Consider now the restriction of a temporal value to a set of time intervals, as speci ed by the following de nitions. 6

For the sake of simplicity, here and in the following we represent the value of a static attribute of type T as a T , de ned only at t = now. partial function:

T IME !

11

De nition 7 (Temporal Value Weak Restriction) Let v be temporal value of type temporal(T ), and let 7 2 2T IME2T IME be a set of time time intervals. The weak restriction of v to 7, denoted as vj7 , is a function: T IME ! T such that: vj7 (t) =

(

v (t) if t 2 7 undef ined otherwise

2 De nition 8 (Temporal Value Strong Restriction) Let v be a temporal value of type temporal(T ), and let 7 2 2T IME2T IME be a set of time intervals. The strong restriction of v to 7, denoted as vjj7 , is a function T IME ! T such that: vjj7 =

(

vj7 if v (t) is de ned 8t 2 7 undef ined otherwise

2 We are now ready to de ne the semantics (the value denoted by) of a path expression e.

De nition 9 (Path Expression Semantics) Let # : V ar ! OI be an oid-assignment. The semantics of a path expression e under oid-assignment #, denoted as E [[ e ]] #, is recursively de ned as follows:

        

if X 2 V ar, E [[ X ]] # = #(X ), that is, it is the oid assigned to variable X by #; if t 2 T IME , E [[ e:a # t ]] # = E [[ e ]] #:v:a(t); if te is an instant-valued temporal expression, and (te) = f[t; t]g, E [[ e:a # te ]] # = E [[ e ]] #:v:a(t);

E [[ e:a ]] # = E [[ e ]] #:v:a(0(e)); if 7 2 2T IME2T IME , E [[ e:a # 7 ]] # = E [[ e ]] #:v:aj7; if 7 2 2T IME2T IME , E [[ e:a + 7 ]] # = E [[ e ]] #:v:ajj7; if te is a temporal expression, E [[ e:a # te ]] # = E [[ e ]] #:v:aj (te); if te is a temporal expression, E [[ e:a + te ]] # = E [[ e ]] #:v:ajj (te); E [[ e:a # all ]] # = E [[ e ]] #:v:a. 2

Example 9 Consider the path expressions of Example 7,and suppose that #(X ) =i1 , where i1 is the object of Example 2:

E [[ X:division # 30 ]] # = `Disks' E [[ X .salary # last interval(manager.name E [[ X .manager # (f irst instant(division

=

12

fh[50,60],15ki, h[61,now],30kig `Printers')).salary ]] # = fh[50,now],50kig. = `Mary Dole') ]] # =

4

Note that, according to De nition 9, E [[ e:a # all ]] # is simply a shorthand for E [[ e:a # dom(E [[ e:a ]] #) ]] #.7 Similarly, E [[ e:a ]] # is equivalent to E [[ e:a # 0(e) ]] #. Therefore in the remainder of the discussion we do not explicitly consider these cases. The following proposition relates the semantics of a path expression on a set of time intervals, either explicitly or implicitly denoted, to the semantics of the path expression on the time instants composing the set. This proposition thus characterizes nontemporal valued terminal expressions as derivable from temporal valued ones.

Proposition 1 Let e:a # 7, 7  2T IME2T IME , be a path expression. Under the weak semantics, the following relationship holds:

E [[ e:a # 7 ]] #(t) = E [[ e:a # t ]] # 8t 2 7 (3) By contrast, under the strong semantics, relationship (*) holds only if: dom(E [[ e:a + te ]] #) 6= ;:

4.3

Correctness

In this subsection we focus on conditions ensuring that a path expression e is correct, for a given assignment #. Such conditions ensure that E [[ e ]] # is de ned, that is, it denotes a value. The correctness conditions depend on the structure of the path expression. Starting from the basis of the inductive de nition of path expressions, the path expression X , with X 2 V ar, is correct for #, provided that the oid-assignment # assigns an oid to variable X . Thus, given a path expression e, we must consider oid-assignments that are de ned on variables in e. Consider now the case e:a # t, with t time instant (either implicitly or explicitly denoted). For this expression to denote a value, the object denoted by e must exist at time t, and a must be an attribute of the class to which that object belongs at t; nally, the value for a at time t must be stored. For terminal path expressions, the correctness conditions di er according to the two di erent interpretations (strong and weak) of the path expressions. In particular, the strong interpretation requires the conditions above to be satis ed for all time instants in the speci ed set of time intervals, whereas the weak interpretation only requires the existence of at least an instant in the speci ed set of intervals, in which the conditions are satis ed. The following proposition states the correctness conditions for a path expression.

Proposition 2 A path expression X:a1 # t1 :a2 # t2 : 1 1 1 :an01 assignment # if all the following conditions are satis ed:

#

tn01 :An is correct for an oid-

 X 2 dom(#);  for each i; 1  i  n 0 1: 1. ti 2 E [[ X:a1 # t1 :a2 # t2 : 1 11 :ai01 # ti01 ]] #:lif espan; 7

Here and in the following, given a function f , dom(f ) denotes the set of values on which f is de ned.

13



2. (E [[ X:a1 # t1 :a2 # t2 : 1 1 1 :ai01 # ti01 ]] #:i; ti) = ci , and ai 2 A(ci ); 3. ti 2 dom(E [[ X:a1 # t1 :a2 # t2 : 1 1 1 :ai01 # ti01 ]] #:v:ai); 4. T 0 (dom(ai; ci)) 2 CI ;

{ An = an # tn , and conditions 1. . . 3 above hold (with i = n), or { An = an # 7, and 9t 2 7 for which conditions 1. . . 3 above hold (with i = n), or { An = an + 7, and 8t 2 7 conditions 1. . . 3 above hold (with i = n).

We remark that the condition of Proposition 2 for terminal expressions under strong semantics does not require that object E [[ X:a1 # t1 :a2 # t2 : 1 1 1 :an01 # tn01 ]] # has never migrated during the intervals in 7, that is, that (E [[ X:a1 # t1 :a2 # t2 : 1 11 :an01 # tn01 ]] #:i; t) = c, with an 2 A(c), 8t 2 7. By contrast, it requires that for all the classes c^ to which E [[ X:a1 # t1 :a2 # t2 : 1 1 1 :an01 # tn01 ]] # has belonged during 7, c^ contains attribute an , that is, an 2 A(^). c The following corollary states correctness conditions for path expressions containing static attributes.

Corollary 1 Let e = X:a1 # t1 :a2 # t2 : 1 1 1 :an01 # tn01 :An be a path expression. If, 9i; 1  i  n 0 1, such that ai is a static attribute, then e is correct only if ti = now. Furthermore:

 if An = an # tn , and an is a static attribute, then e is correct only if tn = now;  if An = an # 7, and an is a static attribute, then e is correct only if now 2 7;  if An = an + 7,and an is a static attribute, then e is correct only if 7 = f[now; now]g.

Example 10 Referring to objects of Example 2, the path expressions of Example 7 are correct, whereas the following are examples of incorrect path expressions, given variable X of type employee: X:division # 5 X .status # last interval(manager.name = `Mary Dole') X .division # (f irst instant(division = `Printers')).salary.

4

The correctness test for a path expression can be performed in linear time with respect to the length of the path expression, where the length of a path expression is the number of attribute accesses (that is, \dots") appearing in it. The following proposition speci es the domains of temporal values denoted by correct path expressions. Their knowledge is relevant in order to check the correctness of subsequent uses of those values, for instance when path expressions are used in comparison formulas.

Proposition 3 Let e = X:a1 # t1 :a2 # t2 : 1 1 1 :an01 # tn01 :An be a correct path expression. Then:

 if An = an # 7, dom(E [[ e ]] #) = ft j t 2 7; X:a1 # t1:a2 # t2: 1 1 1 :an01 # tn01 :an # t is correct g;  if An = an + 7, dom(E [[ e ]] #) = 7;  if An = an # all, dom(E [[ e ]] #) = dom(E [[ X:a1 # t1:a2 # t2: 1 11 :an01 # tn01 ]] #:v:an).

14

5

Static Analysis of Path Expressions

In the previous subsection we have stated conditions ensuring that a temporal path expression is correct, that is, it denotes a value. However, these conditions can only be checked dynamically, since they depend on the speci c object assigned to the variable appearing in the path expression by the oid-assignment #. In this section, we establish conditions ensuring the correctness of a path expression for any oid-assignment #. The relevance of these conditions is that they can be checked statically, that is, at program (query) compilation time, thus allowing to ensure, by means of static checks, that a given path expression will always be correct at run-time. Our approach is related to static type checking techniques for nontemporal (database) programming languages. In such database languages, type declarations are exploited to statically check the correctness of a program, to ensure that no run-time type errors occur [6]. First, consider the problem of determining legal oid-assignments. In a (nontemporal) typed object-oriented language, a type information is associated with each variable (that is, a type is declared for the variable). We express the information that type T is declared for variable X as X 2 V arT . This type declaration implies that at run-time variable X can only be instantiated to an object member of class T (that is, instance of T or of a subclass of T ). Moving to a temporal context, we need to express information of the form X :t T , to denote that variable X can only be instantiated with an object member of T at time t, as stated by the following rule.

Rule 1 Let e be a path expression. The (temporal) type information X :t T is associated with X in e, if e has the form X:a # t:e0 and T is the type declared for X . Formally:8 e = X:a # t:e0 X 2 V arT X :t T

3

Therefore, given a variable X with type information X :t T , an oid-assignment # is legal for X if the object assigned to X by # is an instance of class T at time t.

De nition 10 (Legal Oid-Assignment) Let e be a path expression and let X :t T the type information associated with e. An oid-assignment # is legal for e if X 2 dom(#) and #(X ) 2  (T; t).

2

Figure 1 shows the typing rules for path expressions. We denote the type associated with a path expression e as type(e). Rule 1 states that the type of a path expression e = X 2 V ar is the type declared for X . Rules 2, 3 and 4 deal with path expressions on time instants, thus denoting nontemporal values, whereas the remaining rules deal with (terminal) path expressions on time intervals, thus denoting temporal values. For simplicity, throughout the paper, we have considered static attributes as a particular case of temporal attributes, whose values are de ned only at time instant now. To take into account the distinction between static and temporal attributes, the above rules should be re ned. For example, Rule 2 should be replaced by the following two rules:

(R 2a)

type(e)

=T

T

a 2 A(T ) dom(a; T ) = temporal(T 3 ) t 2 T I ME type(e:a # t) = T 0 (dom(a; T )) 2 CI

8

The meaning of the inference rule is the following: if the conditions in the rule premises (the upper part of the rule) are satis ed, then the rule consequence (the lower part of the rule) can be inferred.

15

(R 1)

X 2 V arT type(X ) = T

(R 2)

type(e) = T T 2 CI a 2 A(T ) t 2 T I ME type(e:a # t) = T 0 (dom(a; T ))

(R 3)

type(e) = T T 2 CI a 2 A(T ) te 2 T E type(e:a # te) = T 0 (dom(a; T ))

(R 4)

type(e) = T type(e:a)

(R 5) (R 6) (R 7) (R 8) (R 9)

T

2 CI

instant-valued

a 2 A(T )

= 0 (dom(a; T )) T

a 2 A(T ) dom(a; T ) = temporal(T 3 ) type(e:a # 7) = dom(a; T )

type(e)

=T

T

2 CI

type(e)

=T

T

2 CI

type(e)

=T

T

2 CI

type(e)

=T

T

2 CI

type(e)

=T

T

2 CI

7 2T IME2T IME 2

a 2 A(T ) dom(a; T ) = temporal(T 3 ) te 2 T E type(e:a # te) = dom(a; T )

a 2 A(T ) dom(a; T ) = temporal(T 3 ) type(e:a + 7) = dom(a; T )

7 2T IME2T IME 2

a 2 A(T ) dom(a; T ) = temporal(T 3 ) te 2 T E type(e:a + te) = dom(a; T )

a 2 A(T ) dom(a; T ) = temporal(T 3 ) type(e:a # all) = dom(a; T )

Figure 1: Temporal path expression typing rules

16

non instant-valued

non instant-valued

(R 2b)

type(e) = T T 2 CI a 2 A(T ) type(e:a # now) = T 0 (dom(a; T ))

and similar re nements should be done for other rules. However, there is no mean to statically check whether the only instant denoted by a temporal expression will be now. Thus, in the following, we will focus on temporal attributes. In order to apply the above typing rules, an important information is whether a temporal expression is instant-valued. In general, this property cannot be statically decided, since it depends on the content of the database. However, there are some sucient syntactic conditions on temporal expressions ensuring their instant-valuedness. In particular, temporal expressions of the form last instant(te), f irst instant(te), inst slice(te; n) are instant-valued, 8te 2 T E , 8n 2 IN. We are now able to introduce the notion of type correctness of a path expression and to relate this notion to the notion of (dynamic) correctness discussed in Section 4.3.

De nition 11 (Type Correctness) A path expression e is said to be type correct if a type for it can be deduced according to rules in Figure 1. 2 Proposition 4 Let e = X:a # t:a1 : 1 11 :an be a type correct path expression. Then, for any legal assignment #, e is correct for #. Moreover, if the type deduced for e is type(e) = T 2 CI , then E [[ e ]] # 2 (T; t).

By contrast, if a path expression e contains the speci cation of two di erent time instants, type correctness does not imply the correctness of e for any legal oid-assignment, as shown by the following example.

Example 11 Referring to classes of Example 1, consider the path expression e = X:manager # 50:division # 20, with X 2 V aremployee. The path expression is type correct, and type(e) = string. Consider the oid assignment # such that #(X ) = i1 of Example 2. The oid assignment is legal, since i1 2  (employee; 50). However, E [[ e ]] # = i2:v:division(20) is unde ned, since 20 62 i2 :lif espan. 4 For path expressions denoting a temporal value, a similar result can be obtained, under the weak interpretation, as stated by the following corollary.

Corollary 2 Let e = X:a # t:a1 : 1 11 :an # 7 be a type correct path expression, if t 2 7, then for any legal assignment #, e is correct for #.

As a particular case, note that if e = X:a # t:a1 : 1 11 an # all is a type correct path expression, then for any legal assignment #, e is correct for #. By contrast, under a strong interpretation, type correctness does not imply the correctness for any legal oid-assignment, as shown by the following example.

Example 12 Referring to classes of Example 1 and to objects of Example 2 it is easy to see that the path expression X:manager # 50:division + [40; now] is type correct but it is not correct with respect to the legal oid assignment # such that #(X ) = i1. Indeed, E [[ X:manager # 50:division ]] #jj[40;now] is unde ned, since f[40; now]g 6 dom(i2:v:division) = f[50; now]g. 4

17

6

Path Expression Equality

The value a path expression denotes can be compared (by means of a comparison operator such as =, 6=, >, j<

h

opi

::= + - * / j

j

j  j  j

j

= = j 6

j 2 j 62

j [ j \ j n

21

B

Proofs

Proof of Proposition 1

By De nition 9, [ e:a te ] # = [ e ] #:v:aj (te); by De nition 7, [ e ] #:v:aj (te)(t) = [ e ] #:v:a(t) t (te). But, by De nition 9, [ e:a t ] # = [ e ] #:v:a(t). Thus, the thesis follows. By contrast, by De nition 9, [ e:a te ] # = [ e ] #:v:ajj (te), and, by De nition 7, [ e ] #:v:ajj (te) = [ e ] #:v:aj (te) only if (te) dom( [ e ] #:v:a); otherwise, dom( [ e ] #:v:ajj (te)) = . E

#

E

E

E

#

E

E

E

8

2

E

+



E

E

E

E

;

5

Proof of Proposition 2

The proposition can be proved by induction on the structure of the path expression. Basis: path expression X , with X V ar; since X dom(#) #(X ) is de ned and so [ X ] # is. Inductive step: Given, by inductive hypothesis, the correctness of e, we prove the correctness of e:a t. [ e:a t ] # is de ned as [ e ] #:v:a(t). Conditions of Proposition 2 ensure that [ e ] #:v:a(t) is de ned. Indeed, condition 1 ensures that object [ e ] #11 exists at time t, while condition 2 ensures that in its state [ e ] #:v, the object has a value for attribute a. Condition 3 ensures that function [ e ] #:v:a is de ned at time t. Moreover, the consistency conditions on the database ([1]) ensure that the denoted value exists and that it is of the appropriate type 0 (dom(a; c)), that is, it is an oid in . Thus, condition 4 implies that the denoted value is an oid in and this allows to apply further attribute accesses to it. We now prove the correctness of the terminal expression e:a 7 given, by inductive hypothesis, the correctness of e. The terminal expression e:a 7 denotes the (temporal) value e:v:aj7 . If t 7 such that conditions 1. . .3 hold, then, for such t, e:a t is correct. Thus, e:v:a(t) is de ned and by De nition 7, since t 7, e:v:aj7 (t) = e:v:a(t) is de ned, too. Finally, we prove the correctness of the terminal expression e:a 7 given, by inductive hypothesis, the correctness of e. The terminal expression e:a 7 denotes the (temporal) value e:v:ajj7 . If t 7 conditions 1. . .3 hold, then, for all such t, e:a t is correct. Thus, e:v:a(t) is de ned t 7 and by De nition 7, e:v:ajj7 = e:v:ajj7 is de ned, too. 2

2

E

#

E

#

E

E

E

E

E

T

OI

OI

#

#

9

2

#

2

+

+

8

#

8

2

2

5

Proof of Corollary 1

If a is a static attribute, then, for any object o, dom(o:v:a) = [now; now] . f

g

5

Proof of Proposition 3

We consider each case separately: An = an 7 then, by De nition 9, [ e ] # = [ X:a1 t1 :a2 t2: :an01 tn01 ] #:v:anj7 . Moreover, by De nition 7, dom( [ X:a1 t1:a2 t2 : :an01 tn01 ] #:v:anj7) = dom( [ X:a1 t1 :a2 t2: :an01 tn01 ] #:v:an) 7. Finally, by de nition of path expression correctness, dom( [ X:a1 t1:a2 t2: :an01 tn01 ] #:v:an) = t X:a1 t1 :a2 t2: :an01 tn01:an t is correct , from which the thesis follows. An = an 7 then, by De nition 9, [ e ] # = [ X:a1 t1:a2 t2: :an01 tn01 ] #:v:anjj7. Moreover, by dom(vjj7 ) = 7. De nition 7, v 7; dom(vjj7 ) = An = an all then, by De nition 9, [ e ] # = [ X:a1 t1 :a2 t2 : :an01 tn01 ] #:v:an; thus, trivially, dom( [ e ] #) = dom( [ X:a1 t1:a2 t2 : :an01 tn01 ] #:v:an). #

E

E

111

#

#

E

#

#

#

111

#

111

#

#

E

\

111

#

f

+

E

8 8

6

#

E

#

j

#

E

#

#

#

111

#

111

#

#

#

g

; )

E

E

#

E

E

#

#

#

111

#

111

#

#

5

Proof of Proposition 4

First of all we recall that X:a t:a1: :an is simply a shorthand for X:a proposition can be proved by induction on the structure of the path expression. #

11

111

#

E [[ e ]] # is an object by inductive hypothesis.

22

t:a1

#

t: 1 1 1 :an

#

t.

The

Basis

: n = 0, that is, e = X:a t. Since e is type correct, rule 2 could be applied; thus #

X

2

V arT

T

a 2 A(T )

2 OI

(1)

Moreover, since e = X:a t and X V arT by Rule 1 X :t T , and thus, for # to be a legal oid assignment X dom(#) and #(X ) (T; t). Then, [ X ] # = #(X ) is de ned and is such that t [ X ] #:lifespan, and

( [ X ] #:i; t) = T is de ned. Since, moreover, a A(T ) conditions 1 and 2 of Proposition 2 are satis ed. Condition 3 follows from the consistency of the database, while condition 4 follows trivially from (1). Thus, e is correct for any legal assignment # by Proposition 2. Finally, [ e ] # ( 0 (dom(a; T )); t) follows from the consistency of the database. 0 Inductive step: n > 0, that is, e = e :a t. Since e is type correct, rule 2 could be applied; thus #

2

2

2



2 E



E

E

2

E

2

T

T

2 OI

#

type(e0 ) = T

a 2 A(T )

(2)

By inductive hypothesis, for any legal oid-assignment #, [ e0 ] # is an existing object and [ e0 ] # (T; t). Then: t [ e0 ] #:lifespan, and

( [ e0 ] #:i; t) = T is de ned. Since, moreover, a A(T ) conditions 1 and 2 of Proposition 2 are satis ed. Condition 3 follows from the consistency of the database at time t, while condition 4 follows trivially from (2). Thus, e is correct for any legal assignment # by Proposition 2. Finally, [ e ] # ( 0 (dom(a; T )); t) follows from the consistency of the database at time t. E



2 E



E

E

2

2

E

2

T

5

Proof of Corollary 2

Since t 7, t 7 such that e = X:a t:a1 t: :an t is correct for any legal oid assignment # (by Proposition 4), and thus the thesis follows by Proposition 2. 2

9

2

#

#

111

#

5

Proof of Proposition 5

Let us consider the rst item. By De nition 12, e1 = e2 is correct i [ e1 ] # = [ e2 ] #. By De nition 9, given a path expression e and an oid-assignment #, [ e2 ] # is always a temporal value, as we consider static values as a particular case of temporal ones, de ned only at t = now. Thus [ e ] # is a function from . Therefore, e1 = e2 only if [ e1 ] # and [ e2 ] # have the same domain. Conditions in the rst item express this requirement. Let us consider the second item. By De nition 13, e1 =ins e2 is correct if there exists an instant t such that: i) t (dom( [ e1 ] #) dom( [ e2 ] #)); ii) [ e1 ] #(t) = [ e2 ] #(t). Therefore for e1 =ins e2 to be correct, there must be an instant t belonging to the intersection of dom( [ e1 ] #) and dom( [ e2 ] #). Thus, we have proved the thesis, since conditions in the second item express the above requirement. Finally, let us consider the third item. By De nition 14, e1 =w e2 if there exist two instants t and t0 such that: i) t dom( [ e1 ] #); ii) t0 dom( [ e2 ] #); iii) [ e1 ] #(t) = [ e2 ] #(t0 ). Thus, e1 =w e2 is correct only if there exist two instants t and t0 such that t dom( [ e1 ] #) and t0 dom( [ e2 ] #). Hence, the claim holds. E

E

E

E

T I ME

E

2

E

\

E

E

E

E

E

2

E

2

E

E

2

E

E

E

2

E

5

23

Proof of Proposition 6

It is trivial to show the correctness of the rst two items. Let us consider the third item. By De nition 12, = e2 i [ e1 ] # = [ e2 ] #. By De nition 9, given a path expression e and an oid-assignment #, [ e2 ] # is a function from . If e1 = e2 are equal, it means that functions [ e1 ] # and [ e2 ] # assume the same value for each instant of their (common) domain. This implies that the restrictions of [ e1 ] # and [ e2 ] # to the same set of instants, or to the same instant, are equal too. Hence, the claim holds. Let us consider the fourth item. By De nition 15, e1 =ins e2 te, t ( (te) dom( [ e1 ] #) dom( [ e2 ] #)), such that [ e1 ] #(t) = [ e2 ] #(t). Thus, if e1 =ins e2 te, there exists t (dom( [ e1 ] #) dom( [ e2 ] #)), such that [ e1 ] #(t) = [ e2 ] #(t). This implies e1 =ins e2 . Proofs for e1 =ins e2 t and e1 =ins e2 7 are analogous. We omit the prove of the fth item as it is similar to the previous case. e1

E

E

E

T I ME

E

E

E

E

j

E

E

E

E

E

, 9

j

E

2

\

2

E

\

E

\

j

j

5

24

Lihat lebih banyak...

Comentarios

Copyright © 2017 DATOSPDF Inc.