Improving behavioral design pattern detection through model checking

Descripción

2010 14th European Conference on Software Maintenance and Reengineering

Improving Behavioral Design Pattern Detection through Model Checking Andrea De Lucia, Vincenzo Deufemia, Carmine Gravino, Michele Risi Dipartimento di Matematica e Informatica Università di Salerno Fisciano(SA), Italy {adelucia, deufemia, gravino, mrisi}@unisa.it Abstract — Recovering design pattern instances in a software system can help maintainers to understand its design and implementation. In this paper we present a fully automated design pattern recovery approach that analyzes the behavior of pattern instances both statically and dynamically. In particular, the proposed approach exploits model checking to statically verify the behavioral aspects of design pattern instances. To this end, we encode the properties defining the correct behavior of a pattern as LTL (Linear Temporal Logic) formulae and the sequence diagram representing the possible interaction traces among the objects involved in the candidate instances as PROMELA specifications. To verify whether the LTL properties are satisfied by the candidates we employ the SPIN model checking tool. The dynamic analysis of the pattern behavior is performed through a code instrumentation and monitoring phase applied on the candidate pattern instances. This phase allows us to obtain actual dynamic data during program execution, which is then used to verify its compliance to the pattern definition. The effectiveness of the proposed approach is shown by presenting and discussing the results obtained on JHotDraw and JRefactory.

strategies have their own drawback [18]. Indeed, the use of static analysis to recover dynamic data requires the investigation of the source code and the identification of the dynamic types of object references, but this is not plausible for complex systems [9]. On the other hand, the dynamic analysis allows to effectively analyze the interactions between objects, but this kind of analysis is able to represent only a part of the system’s behavior. In this paper we present a fully automated design pattern recovery approach that performs both static and dynamic analysis to verify the behavior of pattern instances. The static analysis exploits model checking to analyze the interactions among objects. In particular, the pattern instances that satisfy the structural definition of a pattern are analyzed by a model checker in order to verify whether they also satisfy the behavioral definition of the pattern. To this end, we encode the properties defining the correct behavior of a pattern as LTL (Linear Temporal Logic) formulae and the sequence diagram representing the possible interaction traces among the objects involved in the candidate instances with the PROMELA language. To verify whether the LTL properties are satisfied by the candidates we employ the SPIN model checking tool. The dynamic analysis is performed according to the approach proposed in [4]. In particular, the dynamic information is obtained through the automatic instrumentation and monitoring of the method calls involved in the candidate pattern instances identified during model checking. The obtained method trace is matched against the definitions of the pattern behaviors by using a parser able to recognize the valid sequences of messages exchanged among the pattern participants. The retrieval effectiveness of the proposed recovery approach has been assessed for the Observer, State, Strategy, Template Method, Command, and Visitor behavioral patterns by applying it on JHotDraw (release 5.1) and JRefactory (release 2.6.24). The results highlight that model checking considerably reduces the number of instances to be validated in the dynamic analysis phase. Furthermore, the comparison with other approaches applied on the same software systems highlighted that our proposal is characterized by better results. This paper extends the work in [4] in four main aspects: i) the recovery technique has been improved with a model checker able to reduce the candidate pattern instances obtained from a structural analysis phase; ii) the recovery approach have implemented and validated on more design

Keywords- Design Pattern Recovery, Model Checking, Dynamic Analysis

I.

INTRODUCTION

Design patterns are good design solutions to recurring problems and form a common vocabulary among developers [8]. Each design pattern is usually described by its intent, motivation, structural and behavioral solutions, consequences, known uses, etc. The structural and behavioral solutions are typically modeled by class and sequence diagrams, respectively. Design patterns also represent useful architectural information that can support a rapid understanding of software design and source code [8]. Thus, the recognition of design pattern instances in source code can provide reverse engineers with considerable insight on the software design. In the recent years, many approaches have been proposed to automate design pattern recovery [11][14][22][29]. One of the main challenges in the recovery of design pattern instances is the verification of their behavioral aspects. Indeed, it requires the analysis of the collaboration among objects at runtime by identifying and executing test cases on the software system. This is an extremely time consuming task due to the high number of object interactions in complex systems. Static analysis or dynamic analysis can be exploited to capture the behavior of a software system. However, both 1534-5351/10 $26.00 © 2010 IEEE DOI 10.1109/CSMR.2010.16

176

patterns (i.e., six instead of three); iii) the new approach and the tool implementing it have been evaluated on a further case study; iv) the experimental results have been compared in detail with the approaches that evaluated the same software systems. The paper is structured as follows. Section II reports on related work in the areas of design pattern recovery. The proposed design pattern recovery process and the details on model checking approach are presented in sections III and IV, respectively. Sections V reports and discusses the results of the performed case studies. Conclusion ends the paper. II.

carried out on the source code representation based on the static specification of the pattern. The dynamic analysis monitors the execution of the program and classifies the candidate pattern instances obtained by the static analysis according to their conformance to the expected pattern behavior. In particular, the employed predicates specify the relevant states and state transitions of a pattern’s behavior to detect in a concrete program run. The analysis verifies whether a concrete program run of the candidate pattern instance matches the specified pattern behavior. The approach has been assessed by recovering the instances of Observer, Composite, and Decorator from the tool implementing the approach itself. In [23] an investigation has been performed to highlight the parameters that can improve the precision of the proposed design pattern detection approach. It combines static and dynamic analyses, as well as other approaches. During static analysis data are gathered (facts) about the program and queries are performed over these facts using a tool called Crocopat [2] which uses binary decision diagrams to represent the relations. The Java front-end tool, called Recoder, is used to extract information from the source code and build an AST. In particular, data are collected for each method call and saved to a log file as tuples. For each tuple obtained from the static analysis a dynamic analyzer object is created during dynamic analysis and is used to verify violations to the dynamic protocol of the pattern. Experiments have been performed to detect instances of the Observer pattern in Java classes. The achieved results suggest that simple behavioral protocols can give very high precision when high coverage is obtained [23]. The recovery technique and corresponding tool (named DEMIMA) proposed in [11] are based on a multilayered approach able to identify idioms pertaining to the relationships between classes and design motifs characterizing the organization of the classes1. The first layer is devoted to the construction of models from source code, by using a language whose metamodel is inspired by UML [11] and includes all the elements of a Java system like class, interface, method, and so on. In the second layer programming idioms are identified, which define specific characteristics of classes or relationships between them. In this layer, the tool DEMIMA depends on the formalizations of definitions of unidirectional binary class relationships, focusing on use, association, aggregation, and composition relationships since they are used to describe design patterns [8]. The relationships are defined in terms of four properties (i.e., exclusivity, type of message receiver, lifetime, and multiplicity) that are derivable by executing a static and dynamic analysis [11], by exploiting information on the recovered UML class diagram, and by using a trace-analysis technique provided in [12] to determine information on exclusivity and lifetime properties. This dynamic analysis is influenced by the availability of unit tests to be executed. In the third layer the language used to describe the source code is also used to represent design motifs, and

RELATED WORK

Since in this paper we focus on behavioral design patterns, in the following we recall some recovery techniques and tools that exploit either dynamic analysis or a combination of static and dynamic analysis to identify behavioral design pattern instances. The approach proposed in [29] combines the static analysis defined in [21] with a dynamic analysis to recover design pattern instances from Java programs. The latter exploits automata to represent the behavioral aspects of design patterns and to validate the relevant method calls monitored at runtime. The dynamic analysis is only partially implemented since the automata are manually obtained from the input behavioral requirements. The authors did not perform a case study but they only verified the feasibility of the approach by analyzing two patterns (i.e., State and Strategy) in the Eclipse framework. Heuzeroth et al. [14] first apply static analysis to obtain a set of candidate pattern instances and then perform dynamic analysis of this set. During static analysis, source code is represented as an Abstract Syntax Tree (AST) and the patterns are defined in terms of predicate logic. The candidates for pattern instances are searched in the AST with the help of these predicates and represented as a set of tuples. During dynamic analysis the candidate tuples are monitored at runtime to verify whether they satisfy the dynamic pattern rules. The recovery technique has been evaluated for the Observer pattern on two software systems: the tool on itself and the SwingSet2 example of the JDK 1.3.1 including the javax.swing.* packages. The obtained results are encouraging only for particular design pattern definitions. The 3-step approach proposed in [18] identifies behavioral and creational design pattern instances in source code by using dynamic analysis only. The design patterns are described in terms of UML sequence diagrams. During a dynamic analysis, a UML sequence diagram is instantiated from the execution trace and then analyzed to find the pattern behavior. Thus, the problem of design pattern identification is reduced to a constraint satisfaction problem. A preliminary evaluation of the approach has been performed on JHotDraw 6.0b1 for the recovery of Visitor patterns. The approach proposed in [13] combines static and dynamic analyses to recover design pattern instances. In particular, two Prolog-based languages, namely SanD-Prolog and SanD, are used to specify predicates. The input source code is represented as a set of predicates that encode the corresponding AST. During static analysis queries are

1 It is worth noting that the authors used the term design motif to indicate design pattern and microarchitecture to refer to design pattern instance.

177

The analysis of the class diagram is performed through a visual language parser generated from a grammar modeling the structure of the patterns to be detected. Such parser reduces the problem of identifying pattern instances to the problem of recognizing subsentences in a class diagram, where a subsentence corresponds to a pattern instance. The analysis of class methods is performed by a source code analyzer that verifies the presence of the method declarations and implementations, and method invocations, based on the role of the classes involved in the pattern instances. As an example, for the Visitor pattern the visual parser identifies in the class diagram the set of classes and relationships that are shown in the class diagram of Fig. 2, while the source code analyzer verifies the presence of methods declarations and invocations described in the notes associated to the classes. Further details on the structural analysis can be found in [3].

microarchitectures similar to the specified design motifs are recovered from the model representing the source code by using explanation-based constraint programming and constraint relaxation [12]. In section V, we provide a detailed discussion of the results achieved by DEMIMA on JHotDraw 5.1 and JRefactory 2.6.24 since we also employed these systems in our case studies. In [4] we proposed a technique able to recover behavioral design pattern instances by combining structural analysis, based on visual language parsing, with dynamic analysis, based on source code instrumentation. In this paper extends the approach proposed in [4] by performing a static analysis of the behavior of candidate pattern instances through model checking. Other approaches proposed in the literature to recover design pattern instances from source code analyze behavioral aspects by performing only static analysis (e.g., [1][6][19] [22]), but due to space limitation we do not report on them. III.

OVERVIEW OF THE DESIGN PATTERN RECOVERY APPROACH

In this section we summarize our approach to identify instances of behavioral design patterns from Java programs. Fig. 1 shows the activity diagram describing the phases of the recognition process, where rectangles represent data and rounded rectangles represent phases. The Design Pattern Library stores both structural and behavioral information of the patterns. The former defines structural connections among the classes via call, delegation, or inheritance relationships. The latter defines sequences of actions and interactions of the objects, which are instances of the classes involved in the design patterns. The structure and the behavior of the patterns have been defined according to Gamma’s design pattern definitions [8]. The Structural Analysis phase identifies candidate pattern instances by exploiting structural information only. In order to detect such instances we employ the two-phase approach proposed in [3]. The Model Checking phase analyzes the source code of the pattern instances detected in the previous phase to verify whether they satisfy the behavioral definition of the pattern. Finally, the Dynamic Analysis phase checks at runtime whether the actual behavior of the recovered candidates conforms to the behavioral requirements specified in the pattern library. In the following we briefly describe each phase of the approach. The Visitor pattern will be considered as running example throughout the paper. Such behavioral pattern represents an operation to be performed on the elements of an object structure [8]. Visitor allows to define a new operation without changing the classes of the elements on which it operates.

Figure 1. The design pattern recovery process.

Figure 2. Structural information for Visitor pattern.

A. Structural analysis In the structural analysis phase instances of design patterns are identified by first analyzing the class diagram structure extracted from the input OO source code and then the method declarations and invocations within the classes.

B. Model checking In the Model Checking phase we verify whether there exists an interaction among the objects involved in the candidates that satisfies the pattern behavior specified in the

178

pattern library. This is accomplished by encoding the interactions among the objects involved in the candidates as PROMELA formal specification, which is then validated by using the SPIN model checker against the behavioral properties, expressed in terms of LTL and obtained from pattern definition. Fig. 3 illustrates the sequence diagram for the Visitor pattern. Such diagram defines the behavior of the objects involved in the pattern. The model checker verifies whether the objects involved in a candidate visitor instance are able to exchange the sequence of messages specified in the sequence diagram. As an example, it verifies that the client creates a ConcreteVisitor object and then goes through the object structure, visiting each element using the visitor. When an element is visited, it calls the visitConcreteElement(element) operation that corresponds to its class. The element supplies itself as an argument to this operation to let the visitor access its state, if necessary [8]. Traversing tree-like data structures is a typical use of the Visitor pattern: a visitor object contains some operations, and the data structure objects allow themselves to be traversed by accepting visitors.

IV. ANALYSIS OF PATTERN BEHAVIOR THROUGH MODEL CHECKING Model checking is a method to automatically verify whether a formally specified property holds in a model of a system. Thus, we can exploit model checking to statically verify whether the behavior of a pattern instance complies with the pattern definition. That is, the temporal ordering of object interactions satisfies the execution trace of the design pattern. Fig. 4 shows the process we propose to perform pattern behavior analysis through model checking. For each candidate pattern instance obtained from structural analysis, the approach generates the sequence diagram describing the interactions among the objects involved in the candidate. Such a diagram is then translated into a PROMELA formal specification which is validated by using SPIN against the behavioral properties, expressed in terms of LTL and automatically obtained from pattern definition through a control flow analysis. Notice that our approach does not directly translate Java code into a PROMELA specification by using thirdy-part tools (e.g., Java PathFinder [15]) because we apply the model checker on the sequence diagrams of the extracted pattern instances in order to limit the model checker’s state space. In the following we detail each phase of the model checking process.

Figure 3. Description of the Visitor pattern in terms of collaboration.

Figure 4. Validating behavioural pattern instances through model checking.

C. Dynamic analysis The goal of this phase is to verify whether the actual behavior of the candidates complies with the pattern definition they are instance of. This is performed in three steps. First, the bytecode of the classes involved in the candidate pattern instances is instrumented with Probekit tool [25]. The files obtained from instrumentation are executed together with the program to monitor the method calls of the candidates on a suitable test suite. Finally, the obtained method trace is validated by a parser able to recognize the sequence of method calls defining the design pattern behavior. Further details on the dynamic analysis can be found in [4].

A. Extracting Sequence Diagrams from the Source Code of the Candidate Pattern Instances During structural analysis to each candidate instance we associate the Client classes Ci and their methods mj which start the execution of the pattern. This information allows us to restrict the behavioral analysis to the classes involved in the candidate pattern instance. In particular, we perform a control flow analysis starting from each method invocation Ci.mj() to obtain the sequence diagram modeling the interactions among the objects involved in the instance. The construction of a sequence diagram is based on the generation of a call tree starting from the method invocation Ci.mj(). In particular, a data flow analysis is performed to associate to each node of the tree the class name of the object

179

specification consists of variable declarations and a process. The boolean variables keep track of the events occurring among the objects, whereas the message types in and out indicate that a method is invoked or is terminated, respectively. The process defines a sequence of states, each one representing an object of the sequence diagram. A set of decisional statements modeling the transitions among states is associated to each state. As an example, the fifth statement in state s0 models the method invocation visit(root,data) of PrettyPrintVisitor object from the LineNumberTool object. This statement has a pre-condition, that is the method getRoot() of the LineNumberTool object, which has been invoked and it is terminated. The execution of such statement activates the state s1.

on which a method is invoked and to each edge the name of the invoked method. To deal with infinite loop method invocations the tree is constructed by analyzing the source code and limiting its height to a fixed value. Successively, the tree is compacted by: (i) discarding the nodes representing the objects not involved in the pattern instance. When a node is removed from the tree its father is linked to its children, preserving the temporal order of method invocations; (ii) joining the nodes of the tree representing the same object; (iii) removing the recurrent sequence of method invocations and linking the node representing the last method invocation in the sequence to the first node of such sequence. The data structure obtained from such operations is a call graph which is then represented as a sequence diagram according to the temporal method invocation sequence. In particular, a control flow analysis is performed on the source code to identify alternatives, loops and object lifelines. As an example for the candidate Visitor instance shown in Fig. 5, we create the sequence diagram in Fig. 6 starting from method run() of class LineNumberTool. Notice that the description of the candidate instance in Fig. 5 is composed of two sections. The first is a list of involved classes in the design pattern instance, each associated with its playing role and the list of the invoked methods enclosed in the brackets {}. The second section provides the list of classes playing the Client role and the methods that invoke the objectStructure class. PATTERN Visitor ( objectStructure: SimpleNode { childrenAccept }, element: Node { childrenAccept, jjtAccept }, concreteElement: SimpleNode { childrenAccept, jjtAccept }, visitor: JavaParserVisitor { visit }, concreteVisitor: PrettyPrintVisitor { visit }) CALL_FROM_CLIENT ( LineNumberTool.run, ExtractMethodRefactoring.printFile, PrettyPrintFile.apply);

Figure 6. Sequence diagram generated from a Visitor candidate instance.

Figure 5. A candidate instance of Visitor.

Notice that in the PROMELA specification we encode the loops as alternatives because how many times a loop body is executed does not impact on the identification of design pattern instances.

B. Generating PROMELA Specification from Sequence Diagrams The SPIN model checker uses the PROMELA language for modeling the system under verification. Therefore the sequence diagrams obtained from the process described in the previous subsection have to be converted into an equivalent PROMELA specification. Note that every sequence diagram can be expressed with a PROMELA specification [20]. We have used the approach presented in [20] to map a sequence diagram into PROMELA code. Since the sequence diagram obtained from the code contains synchronous messages only, we use an unbuffered channel for PROMELA message exchange. The PROMELA specification corresponding to the sequence diagram of Fig. 6 is shown in Fig. 7. The

C. Generating LTL Formulae from Pattern Behavior Definition and Candidate Pattern Instances In order to verify whether the candidate instance behavior expressed as PROMELA code satisfies the behavioral design pattern definition, the latter has to be specified as LTL properties [20]. The LTLs are mathematical annotated formulae to make statements on a linearly progressing time. Such formulae allow to specify properties that should (not) hold in a model of a system. For our purposes, we use LTL to express an ordering between send and receive events of messages in a sequence diagram. In particular, for each behavioral pattern we have

180

particular, the LTL properties are specified by using the “never claim” construct, which is a process describing unwanted behaviors. In this way, the counterexample generated by SPIN corresponds to a correct behavior of the candidate instance.

defined an LTL formula that is automatically configured with information on candidate pattern instances. As an example, to validate the behavior of the instance shown in Fig. 5, a set of atomic propositions and an LTL formula for the Visitor pattern (shown in Fig. 8) have been instantiated according to the methods of the classes involved in the instance. In particular, the atomic propositions ts, t0,…,t4, and tf assert the state of each global boolean variable defined in the PROMELA specification. By using these propositions, we are able to check whether a method is active or not. The LTL formula at the bottom of Fig. 8 specifies the temporal sequence of messages shown in Fig. 3.

#define ts (_run == true) /* ts is true during the verification */ #define t0 (newPrettyPrintVisitor == true) #define t1 (getRoot == true) #define t2 (visit == true) #define t3 (childrenAccept == true) #define t4 (jjtAccept == true) #define tf (_end == true) /* tf is true when a final state is reached*/ tf Æ

bool _run = false, newPrettyPrintVisitor = false, getRoot = false, visit = false, childrenAccept = false, jjtAccept = false; bool _end = false; mtype = { in , out } mtype = { c_run, c_newPrettyPrintVisitor, c_getRoot, c_visit, c_childrenAccept, c_jjtAccept, c_end } chan flow[10] = [1] of { mtype }; active proctype LineNumberTool_run() {

( ts && ( t0 U t1 U t2 U t3 U ( t4 && !t2 && !t3 ) U ( t2 && !t3 ) U ( t3 ) ) )

/* eventually t0 happens */ /* eventually t1 happens */ /* eventually t2 happens */ /* eventually t3 happens */ /* t4 happens but not t2 and t3 */ /* t2 happens but not t3 */ /* t3 happens */

Figure 8. LTL formula for Visitor pattern.

atomic{ printf("Start\n"); flow[c_run]!in; goto s0; } s0: /* LineNumberTool */ if :: flow[c_run]?in Æ :: flow[c_run]?out Æ :: flow[c_newPrettyPrintVisitor]?out Æ :: flow[c_getRoot]?in Æ :: flow[c_getRoot]?out Æ :: flow[c_visit]?out Æ fi;

_run = true; flow[c_newPrettyPrintVisitor]!in; _run = false;flow[c_end]!in; flow[c_getRoot]!in; getRoot = true; flow[c_getRoot]!out; getRoot = false; flow[c_visit]!in; flow[c_run]!out;

V.

goto s1; goto end; goto s0; goto s0; goto s1; goto s0;

In order to highlight the performances of the proposed design pattern recovery approach we have performed two case studies. In particular, we have implemented in the DPRE tool [3] the proposed recovery approach for the Observer, State, Strategy, Template Method, Command, and Visitor behavioral patterns and then we have used it to analyze JHotDraw (release 5.1, 8.3 KLOC) and JRefactory (release 2.6.24, 60.5 KLOC). JHotDraw is a Java framework for drawing two-dimensional graphics in structured drawing editors [16], whereas JRefactory is a system designed to refactor or restructure Java programs [17]. In the following sections we first present and discuss the achieved results obtained on JHotDraw and JRefactory and then we compare them with those obtained by other approaches on the software systems.

s1: /* printer:PrettyPrintVisitor */ if :: flow[c_newPrettyPrintVisitor]?in Æ newPrettyPrintVisitor = true; flow[c_newPrettyPrintVisitor]!out; goto s1; :: flow[c_newPrettyPrintVisitor]?out Æ newPrettyPrintVisitor = false; flow[c_newPrettyPrintVisitor]!out; goto s0; :: flow[c_visit]?in Æ visit = true; flow[c_childrenAccept]!in; goto s2; :: flow[c_childrenAccept]?out Æ visit = false; flow[c_visit]!out; goto s0; :: flow[c_visit]?in Æ visit = true; flow[c_childrenAccept]!in; goto s3; :: flow[c_childrenAccept]?out Æ visit = false; flow[c_visit]!out; goto s3; fi; s2: /* root:SimpleNode */ if :: flow[c_childrenAccept]?in

:: flow[c_jjtAccept]?out fi; s3: /* s:SimpleNode */ if :: flow[c_jjtAccept]?in

:: flow[c_visit]?out :: flow[c_childrenAccept]?in :: flow[c_childrenAccept]?out :: flow[c_jjtAccept]?out fi; end: if :: flow[c_end]?_ fi; }

Æ

atomic { childrenAccept = true; if /* alt */ :: flow[c_jjtAccept]!in; goto s3; :: flow[c_jjtAccept]!out; goto s2; fi; } childrenAccept = false;flow[c_childrenAccept]!out;

goto s1;

Æ Æ Æ Æ

atomic { jjtAccept=true; if /* loop */ :: flow[c_visit]!in; goto s1; :: flow[c_visit]!out; goto s3; fi; } flow[c_jjtAccept]!out; childrenAccept = true; flow[c_childrenAccept]!out; childrenAccept = false; flow[c_childrenAccept]!out; jjtAccept=false; flow[c_visit]!out;

goto s2; goto s3; goto s1; goto s1;

Æ

_end = true; printf("Accept\n");

Æ

Æ

CASE STUDIES

A. Results and Discussion Table I and Table II summarize the results obtained by applying the proposed recovery process. These tables show the candidates identified by the Structural Analysis phase, the Model Checking phase, and those obtained at the end of the recovery process. First of all, we can observe that the use of a model checker for analyzing the behavior of pattern instances has allowed us to considerably reduce the number of candidates for the Dynamic Analysis. In particular, the percentages of false positives eliminated by performing model checking phase range from 25% (i.e., the Observer pattern) to 100% (i.e., Command pattern) for JHotDraw (see Table I), while these percentages for JRefactory range from 81.1% (i.e., for Observer pattern) to 99.2% (i.e., Template Method). To validate the recovery results we have manually analyzed the identified design pattern instances and computed the precision and the recall measures. The precision provides the percentage of the recovered pattern instances that are correct design pattern instances, while the recall measures the percentage of correct design pattern

Figure 7. PROMELA specification of the sequence diagram in Fig. 5.

D. Model Checking UML Sequence Diagrams. The PROMELA code describing the behavior of the pattern instance and the LTL formulae describing the behavior to be satisfied by the instance are given in input to SPIN. The automated model checker SPIN creates a finite state model of the system to be checked. Thereafter this state space is searched by an automated model checker to verify or falsify (by generating counterexamples) the LTL properties. Since we are interested to verify whether there exists a sequence of messages in a sequence diagram that satisfies the pattern behavior, we have to specify the LTL properties introduced in the previous section in negative form. In

181

We have found one instance of Template Method pattern, where a class named CH.ifa.draw.standard.AbstractFigure plays the role of the AbstractClass. The analysis of documentation has also revealed that only one instance of Command design pattern is present, where a class named CH.ifa.draw.util.Command plays the role of Command. The documentation does not report enough and clear information to determine all the correct instances of the State and Strategy design patterns. However, we have verified that it is possible to identify the classes that play the role of State and Strategy in the corresponding pattern instances (see Table V), respectively, by analyzing code and comments. Thus, we have manually verified the recovered instances of State and Strategy and determined the precision and recall statistics taking into account the information of Table V. Instances of Visitor design pattern are not present in JHotDraw and DPRE has recovered no instance of it. From Table III we can observe that we have achieved a recall value of 100% for all analyzed patterns, while for the precision values we obtained 55% for Observer, 46.51% for Strategy, 5.6% for State, 100% for Command, and 100% for Template Method.

instances that have been correctly identified [26]. The results we obtained in our case studies are shown in Table III. TABLE I. RECOVERED DESIGN PATTERN INSTANCES FOR JHOTDRAW #Observer

#Strategy

#State

#Command

#Visitor

#Template Method

20

54

54

3

5

12

12 9

45 43

45 36

1 1

2 0

4 1

53.3%

18%

17%

100%

60%

81.8%

Phases

Structural Analysis Model Checking Dynamic Analysis % false positives eliminated with model checking

TABLE II. RECOVERED DESIGN PATTERN INSTANCES FOR JREFACTORY

29

29

27

8

132

5 0

2 0

2 1

1 0

3 2

1 0

81.1%

93.1%

96.4%

96.3%

83.3%

99.2%

#Template Method

#Command

28

#Visitor

#State

% false positives eliminated with Model Checking

#Strategy

Structural Analysis Model Checking Dynamic Analysis

#Observer

Phases

TABLE IV. OBSERVER PATTERN INSTANCES FROM THE JHOTDRAW DOCUMENTATION

Subject Drawing StandardDrawing Connector Drawing Figure

We can observe that the model checker has allowed us to considerably reduce the number of candidates for the Dynamic Analysis and to eliminate many false positives. Indeed, the correct pattern instances we obtained at the end of the recovery process are the same of those we achieved without employing model checking (i.e., those achieved with the previous version of the recovery approach [4]). Regarding the case study performed on JHotDraw, to compute precision and recall we have considered as correct design pattern instances those reported in the documentation. Starting from this information we have determined the correct design pattern instances of the Observer, Command, and Template Method. Note that we have also performed an analysis of code and comments in order to verify them.

Observer DrawingChangeListener DrawingChangeListener ConnectorFigure DrawingView FigureChangeListener

TABLE V. CLASSES PLAYING THE ROLE OF STATE AND STRATEGY IN JHOTDRAW State Strategy Tool Connector PointConstrainer Locator Painter

As for the case study performed on JRefactory, the documentation explicitly specifies one instance of the pattern State (where the class org.acm.seguin.print.xml.State plays the role of State) and one instance of the pattern Visitor (where the class org.acm.seguin.parser.JavaParserVisitor plays the role of Visitor). These two instances have been recovered by our approach. We also recovered an instance of Visitor where the role of Visitor is played by the class org.acm.seguin.summary.SummaryVisitor. JRefactory documentation does not indicate instances of Observer, Command, Template Method, and Strategy. Our approach did not recover instances of these design patterns. Thus, we have decided to consider as correct the above specified design pattern instances. However, since we are not sure that the documentation indicates all the instances and we have not manually analyzed all the source code (to identify all the correct pattern instances) we do not calculate the recall statistics. This is the reason why Table III shows only the precision statistics for JRefactory.

TABLE III. PRECISION AND RECALL JHotDraw JRefactory Design Pattern Precision Recall Precision Recall 55% 100% 100% Observer 46.51% 100% 100% Strategy 5.6% 100% 100% State 100% 100% 100% Command 100% 100% 50% Visitor 100% 100% 100% Template Method -

The documentation explicitly specifies 4 Observer instances (the first four instances in Table IV) and gives indications (through comments and naming of classes) about the presence of another instance (the one reported in the fifth row of Table IV). We have considered as correct all the five instances since they satisfy the static and dynamic requirements of the Observer pattern.

182

two instances indicated as correct instances in the documentation (i.e., 15 and 16 in Table VII). Thus, this analysis has highlighted that five of nine recovered instances are correct. On the other hand, it is worth noting that we have manually verified that the nine instances satisfy the structural definition of the Observer pattern [8]. Moreover, our analysis has shown that these instances also satisfy the behavioral requirements characterizing the Observer pattern [8]. In [27] the authors performed a manual structural verification of the code and stated that the instances 10, 12, 13, and 14 in Table VII are correct. However, the manual checking of the code is very subjective and biased by the considered specification of the design patterns [7]. So, we have decided to consider as correct only the instances specified in the JHotDraw documentation.

B. Comparison with other approaches In this section we compare the results of our case studies with those presented in [11],[27],[28] and with the statistics included in the P-MARt repository [10][24]. We selected the studies that report on the instances of Observer, State, Strategy, Command, Template Method, and Visitor design patterns in JHotDraw 5.1 and JRefactory 2.6.24. We start by comparing the results on JHotDraw. Table VI shows the number of instances of Observer, Strategy, Command, and Template Method patterns specified by PMARt [24], Tsantalis et al. [27], DEMIMA [11], and DPRE as well as the number of instances that DPRE shares with [24], [27], and [11]. On the other hand, Table VII shows the Observer instances specified in [24], the ones recovered in [27], and those identified in our case study by DPRE, while Table VIII shows the instances of Template Method design pattern. Furthermore, we share with [24] and [27] the instance of Command specified in the JRefactory documentation.

TABLE VII. INSTANCES OF THE PATTERN OBSERVER FOR JHOTDRAW P-MARt [24] id Subject Observer It is 1 Figure FigureChangeListener TP 2 Drawing DrawingChangeListener TP Tsantalis et al. approach [27] id Subject Observer It is 3 CommandMenu Command FP 4 StandardDrawing DrawingChangeListener TP 5 StandardDrawingView Painter FP 6 CompositeFigure Figure FP 7 StandardDrawingView Figure FP DPRE id Subject Observer It is 8 Figure FigureChangeListener TP 9 Drawing DrawingChangeListener TP 10 CommandMenu Command FP 11 StandardDrawing DrawingChangeListener TP 12 StandardDrawingView Painter FP 13 CompositeFigure Figure FP 14 StandardDrawingView Figure FP 15 Connector ConnectorFigure TP 16 Drawing DrawingView TP

TABLE VI. NUMBER OF RECOVERED PATTERN INSTANCES IN JHOTDRAW JHotDraw 5.1

2 22 21

43 12

36 2 27

-

-

Template Method

19

Command

2 5 7 9 2 5 -

State

Strategy

P-MARt [24] Tsantalis et al. [27] DEMIMA [11] DPRE DPRE ∩ P-MARt DPRE ∩ Tsantalis et al. DPRE ∩ DEMIMA

Observer

Source

1 1* 11 1 1 1 -

2 5 31 1 1 1 -

*observe that Tsantalis et al. include Adapter and Command instances in the same set

Let us observe that Tables VII and VIII also classify the instances in true and false positives, where the former represent the recovered instances that are correct while the latter represent the recovered instances that do not comply with the pattern definition. For sake of space constraints, we do not report on all instances of State and Strategy recovered by DPRE, DEMIMA [11], and Tsantalis et al. approach [27]. Instead, in Tables IX and X we show the State and Strategy instances reported in [24] and [27], but not recovered by DPRE and we discuss them in the following. We do not report the instances recovered in [11] since they are not provided by the authors. However, we suppose that the two instances of Observer, the two instances of Template Method, and the instance of Command recovered by [11], and declared as correct, are those specified in [24] and [27]. As for the pattern Observer, our approach has identified the two instances classified as valid by P-MARt and included in the JHotDraw documentation (see Table VI). All the five instances identified in [27] have been also recovered by our approach. However, only one of them can be considered correct according to the documentation (i.e., instance 11 in Table VII) while the remaining are classified as false positives. Furthermore, our approach has recovered the other

TP = True Positive; FP = False Positive

TABLE VIII. INSTANCES OF THE PATTERN TEMPLATE METHOD FOR JHOTDRAW P-MARt [24] id AbstractClass It is 17 CH.ifa.draw.standard.AbstractFigure TP 18 CH.ifa.draw.standard.AbstractHandle FP Tsantalis et al. approach [27] id AbstractClass It is 19 CH.ifa.draw.standard.AbstractFigure TP 20 CH.ifa.draw.standard.AbstractHandle FP 21 CH.ifa.draw.standard.ActionTool FP 22 CH.ifa.draw.standard.ChangeConnectionHandle FP 23 CH.ifa.draw.util.PaletteButton FP DPRE id AbstractClass It is 24 CH.ifa.draw.standard.AbstractFigure TP

Concerning the Template Method pattern we have recovered only the instance specified in the documentation (i.e., 24 in Table VIII). P-MARt have declared another instance as correct (i.e., 18 in Table VIII) but we have manually checked it (in the source code) and we have classified it as false positive. Similarly, we have also

183

org.acm.seguin.summary.SummaryVisitor can be considered a false positive. Indeed, the class playing the role of element is not present in the documentation. Observe that DEMIMA identifies other two false positives [11].

manually checked the other three instances recovered in [27] (i.e., 21, 22, and 23 in Table VIII) and we have indicated them as false positives. TABLE IX. INSTANCES OF THE STRATEGY AND STATE PATTERNS REPORTED IN [24] AND [27] FOR JHOTDRAW AND NOT RECOVERED BY OUR APPROACH DURING STRUCTURAL ANALYSIS id 25 26 27 id 28 29 30 31 32 33

P-MARt [24] ContextStrategy/ContextState AbstractFigure DecoratorFigure ChangeConnectionHandle Tsantalis et al. [27] ContextStrategy/ContextState AbstractFigure ChangeConnectionHandle ChangeConnectionHandle AbstractConnector PaletteButton ChangeConnectionHandle

TABLE XI. NUMBER OF RECOVERED PATTERN INSTANCES IN JREFACTORY JRefactory Source

Strategy/State Connector Connector Connector

P-MARt [24] Tsantalis et al. [27] DEMIMA [11] DPRE DPRE ∩ P-MARt DPRE ∩ Tsantalis et al. DPRE ∩ DEMIMA

Strategy/State FigureChangeListener Figure ConnectionFigure Figure PaletteListener Connector

State

Visitor

2 11* 22* 1 2 1

2 2 4 2 2 2 2

*observe that they include State and Strategy instances in the same list

As for the design pattern State, Table XIII shows the instances specified by P-MARt and the instance recovered by DPRE. By analyzing the JRefactory documentation we have manually verified that the instance shared with P-MARt (i.e., 41) is correct. The other instance (i.e., 42) has to be considered a false positive since the class playing the role of State is not present in the documentation (only classes playing the role of ConcreteState have been identified).

TABLE X. INSTANCES OF THE STRATEGY/STATE PATTERNS REPORTED IN [24] AND [27] FOR JHOTDRAW 5.1 AND NOT RECOVERED BY OUR APPROACH DURING MODEL CHECKING P-MARt [24] id ContextStrategy/ContextState Strategy/State 34 ConnectionHandle Connector 35 LocatorHandle Locator Tsantalis et al. [27] id ContextStrategy/ContextState Strategy/State 36 DrawApplet Tool 37 DrawApplication Tool 38 AbstractTool DrawingView

TABLE XII. INSTANCES OF THE PATTERN VISITOR FOR JREFACTORY IDENTIFIED BY OUR APPROACH

id 39

Regarding the patterns Strategy and State, first we want to highlight that some instances specified as correct in [24] and identified in [27] do not satisfy the structural definitions we have considered in our approach. In particular, the grammar specifications we have specified for these patterns require that the classes playing the role of Context have to be concrete (i.e., they cannot be interfaces or abstract classes). Table IX (Table X, resp.) shows the instances of the Strategy and State patterns indicated as correct instances in [24] and [27] but not recognized by our approach during Structural Analysis (Modeling Checking, resp.). We have manually checked these instances and the analysis has highlighted that they are not correct. To discuss the results obtained on JRefactory in Table XI we have reported the number of instances of the State and Visitor patterns specified by P-MARt [24] and identified by Tsantalis et al. approach [27], DEMIMA [11], and DPRE as well as the number of instances that DPRE shares with [24], [27], and [11]. We have recovered the two instances of Visitor that are also specified in P-MARt [24] and recovered by Tsantalis et al. approach [27] and DEMIMA [11] (reported in Table XII). Even if in [11] the recovered pattern instances are not provided we suppose as correct the two instances of Visitor specified in [24]. The analysis of the documentation and the manual check of the source code have highlighted that the instance where the role of Visitor is played by the class

40

P-MARt [24], Tsantalis et al. [27], and DPRE Visitor Element org.acm.seguin.parser. org.acm.seguin.parser.Node JavaParserVisitor org.acm.seguin.summary. org.acm.seguin.summary. SummaryVisitor Summary

It is TP FP

Tsantalis et al. have recovered eleven instances of State/Strategy patterns and declared that one is false positive [27]. The analysis of these instances have revealed that they do not include those shown in Table XIII and they are not specified in the JRefactory documentation. Concerning the results of DEMIMA [11], 22 instances of State/Strategy patterns are recovered but only two are declared as correct (and we suppose they are those specified in P-MARt). TABLE XIII. INSTANCES OF THE PATTERN STATE FOR JREFACTORY IDENTIFIED BY OUR APPROACH

id 41 42

id 43

P-MARt [24] ContextState State org.acm.seguin.print.xml. org.acm.seguin.print.xml.State XMLLinePrinter org.acm.seguin.summary. FileSummary DPRE ContextState State org.acm.seguin.print.xml. org.acm.seguin.print.xml.State XMLLinePrinter

It is TP FP

It is TP

Regarding the remaining considered design patterns, the approaches proposed in [11] and [27] have also recovered 49 and 17 Template Method instances, respectively. However, as mentioned in the previous section the JRefactory

184

documentation does not report on these instances. Thus, we can conclude that our approach is characterized by a better precision than [11] and [27]. VI.

[8] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Menlo Park, CA, 1995. [9] C. Ghezzi, M. Jazayeri, and D. Mandrioli. Fundamentals of Software Engineering. Prentice-Hall, NJ, USA, 1991. [10] Y.-G. Guéhéneuc, “P-MARt: Pattern-like Micro Architecture Repository”, in Proc. of the 1st EuroPLoP Focus Group on Pattern Repositories, July 2007. [11] Y.-G. Guéhéneuc, G. Antoniol, “DeMIMA: A Multilayered Approach for Design Pattern Identification”, IEEE Transactions on Softw. Engineer., 34(5), 2008, pp. 667–684. [12] Y.-G. Gueheneuc, R. Douence, and N. Jussien, “No Java without Caffeine – A Tool for Dynamic Analysis of Java Programs,” in Proc. of Conference on Automated Software Engineering, W. Emmerich and D. Wile, eds., pp. 117–126. [13] D. Heuzeroth, S. Mandel, and W. Lowe, “Generating Design Pattern Detectors from Pattern Specifications”, in Proc. of the International Conference on Automated Software Engineering (ASE’03), 2003, pp. 245–248. [14] D. Heuzeroth, T. Holl, G. Högström, and W. Löwe, “Automatic Design Pattern Detection”, in Proc. of International Workshop on Program Comprehension (IWPC’03), 2003, pp.94–103. [15] http://babelfish.arc.nasa.gov/trac/jpf [16] JHotDraw, http://www.jhotdraw.org [17] JRefactory: http://jrefactory.sourceforge.net [18] J. Ka-Yee Ng and Y.-G. Guéhéneuc, “Identification of Behavioral and Creational Design Patterns through Dynamic Analysis”, Proc. of the Workshop on Program Comprehension through Dynamic Analysis (PCODA’07), 2007, pp. 34–42. [19] C. Kramer, L. Prechelt, “Design Recovery by Automated Search for Structural Design Patterns in Object Oriented Software”, Proc. of Working Conference on Reverse Engineering (WCRE’96), 1996, pp. 208–215. [20] S. Leue and P. B. Ladkin, “Implementing and Verifying MSC Specifications using PROMELA/XSPIN”, Proc. of the 2nd SPIN workshop, Sept. 1996. [21] J. Niere, W. Shafer, J. P. Wadsack, L. Wendehals, J. Walsh, “Towards Pattern-based Design Recovery”, Proc. of International Conference on Software Engineering (ICSE’02), 2002, pp. 338–348. [22] R. Olsson, N. Shi, “Reverse Engineering of Design Patterns from Java Source Code”, Proc. of IEEE/ACM International Conference on Automated Software Engineering (ASE'06), 2006, pp. 123–134. [23] N. Pettersson, “Measuring Precision for Static and Dynamic Design Pattern Recognition as a Function of Coverage”, Proc. of Workshop on Dynamic Analysis (WODA’05), 2005, pp. 43–49. [24] P-MARt: http://ptidej.dyndns.org/downloads/pmart/ [25] Probekit. Creating Custom Profilers with Probekit, http://www.eclipse.org/tptp/platform/documents/probekit/pro bekit.html. [26] G. Salton and M. McGill, Introduction to Modern Information Retrieval, McGraw Hill, 1983. [27] N. Tsantalis, A. Chatzigeorgiou, G. Stephanides, S. T. Halkidis, “Design Pattern Detection using Similarity Scoring”, IEEE Trans. on Software Engineering, 32(11), 2006, pp. 896–909. [28] N. Tsantalis, A. Chatzigeorgiou, G. Stephanides, S. T. Halkidis, Design Pattern Detection, http://java.uom.gr/~nikos/pattern-detection.html, 2006. [29] L. Wendehals, A. Orso, "Recognizing Behavioral Patterns at Runtime using Finite Automata," in Proc. of Workshop on Dynamic Systems Analysis, (WODA’06), 2006, pp.33–40.

CONCLUSION

In this paper we present a design pattern recovery approach that performs both static and dynamic analysis to verify the behavior of pattern instances. The former exploits model checking to statically verify the behavioral aspects of design pattern instances with the aim of reducing the candidate pattern instances to be given in input to the latter. The dynamic analysis is performed through the automatic instrumentation and monitoring of the method calls involved in the candidate pattern instances. Two case studies have been performed by applying the recovery process on JHotDraw and JRefactory in order to assess its retrieval effectiveness. The results are interesting and encourage us to further investigate the proposed design pattern recovery approach (also considering larger software systems). Indeed, we have obtained good results in terms of precision and recall and the model checking phase has allowed us to eliminate many false positives, thus reducing the effort needed to select the test cases that verify the behavior of the candidate pattern instances. Furthermore, the comparison with other approaches applied on the same software systems has highlighted that our proposal is characterized by better results. Regarding scalability issues, we want to remark that the static analysis of the pattern behavior is limited to the recovered candidate instances. Thus, the use of model checking does not introduce scalability issues as the size of systems increases. REFERENCES [1] Z. Balanyi and R. Ferenc, “Mining Design Patterns from C++ Source Code”, Proc. of International Conference on Software Maintenance (ICSM’03), 2003, pp. 305–314. [2] D. Beyer, C. Lewerentz, “CrocoPat: Efficient Pattern Analysis in Object-Oriented Programs”, in Proc. of the IEEE International Workshop on Program Comprehension (IWPC’03), 2003, pp. 294–295. [3] A. De Lucia, V. Deufemia, C. Gravino, M. Risi, “Design Pattern Recovery through Visual Language Parsing and Source Code Analysis”, Journal of Systems & Software, 18(7), 2009, 1177–1193. [4] A. De Lucia, V. Deufemia, C: Gravino, M. Risi, “Behavioral Pattern Identification through Visual Language Parsing and Code Instrumentation”, Proc. of European Conference on Software Maintenance and Reengineering (CSMR’09), 2009, pp. 99–108. [5] J. Dong, Y. Zhao, and Y. Sun, “A Matrix-Based Approach to Recovering Design Patterns”, IEEE Transactions on Systems, Man, and Cybernetics — Part A: Systems and Humans, 39(6), (2009), pp. 1271–1282. [6] R. Ferenc, A. Beszedes, L. J. Fülöp, J. Lelle, ”Design Pattern Mining Enhanced by Machine Learning”, Proc. of the International Conference on Software Maintenance (ICSM’05), 2005, pp. 295–304. [7] L. J. Fülöp, T. Gyovai, R. Ferenc, “Evaluating C++ Design Pattern Miner Tools”, Proc. of the International Workshop on Source Code Analysis and Manipulation (SCAM’06), 2006, pp. 127–138.

185

Lihat lebih banyak...

Improving behavioral design pattern detection through model checking

Descripción

Comentarios