Generic causal probabilistic networks: A solution to a problem of transferability in medical decision support


Computer Methods and Programs in Biomedicine 89 (2008) 189–201

journal homepage: www.intl.elsevierhealth.com/journals/cmpb

Karsten Jensen ∗, Steen Andreassen

Center for Model-based Medical Decision Support, Aalborg University, Denmark

Article info

Article history: Received 15 January 2007. Received in revised form 24 October 2007. Accepted 31 October 2007.

Keywords: Probabilistic network; Decision support systems; Generic model; Transferability

Abstract

Causal probabilistic networks provide a natural framework for representation of medical knowledge, allowing clinical experts to encode assumptions about causal dependencies between stochastic variables. Application in medical decision support has produced promising results. However, model features and parameters may vary geo- or demographically. Therefore methods are needed that allow for easy adjustment of the model to a change in conditions. We present a method to represent causal probabilistic networks generically that maximizes the transferability of a model's relevance and completeness, when moved from one environment to another, and illustrate application of the method with an example from a medical decision support system.

© 2007 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

The aim of medical decision support is to assist the clinical decision maker in the endeavour of maximizing the quality of health care. The support can include management of a decision space with multiple dimensions like diagnosis, prognosis, risk, cost, therapy planning and retrieval and processing of relevant information. In recent years a probabilistic modeling technology, known as Bayesian networks or belief networks [1], has been used in connection with decision theory [2] to manage some of these dimensions, in particular diagnosis [3–8] and therapy planning [9,10]. A Bayesian network represents dependencies between a set of stochastic variables and the term causal probabilistic network (CPN) is often used to denote instances of Bayesian networks, where the dependencies are causal by nature [11].



A recurring theme in the context of medical decision support systems is the problem of transferability [12–15]. A system developed at one geographical site may not be transferable to another location. The problem of transferability in the context of medical decision support involves more than the usual problem of physical portability of a computer program: it also involves resolving differences in medical definitions, varying standards of practice and differences in patient populations [16,17]. The problem of transferability in medical decision support would be minimized if clinical practice were based on universal standards, agreed upon by all practitioners. However, attempts to introduce standard practices based on clinical evidence, the philosophy known as evidence-based medicine [18], have not been embraced by the medical community without controversy [19]. Even proponents of evidence-based medicine have acknowledged that clinical practice requires the consideration of values, both patient and professional, prior to arriving at medical decisions. Thus, evidence-based medicine has been redefined as the integration of best research evidence with clinical expertise and patient values [20,21]. The fact that preferences of professionals and patients affect the decision making process in clinical practice represents a challenge for knowledge engineers engaged in the construction of medical decision support systems, since it adds to the complexity of the problem of transferability.

The aim of the study documented in the present paper was to investigate whether a parameterized feature-based procedural modeling approach, developed to handle problems of transferability in computer-aided design (CAD), could be adapted to the context of CPN models in medical decision support. More precisely, we wanted to examine whether the relevance of a particular CPN model, representing attributes of a neurophysiological nerve conduction study, could be easily adjusted to a change in conditions affecting the numbers, the state space and the structure of the model, without compromising the completeness of the model.

The next section will provide the background information needed to appreciate the relevance of the problem and the proposed solution. The background section includes:
• A review of the use of CPN models in medical decision support.
• A review of the problem of transferability as it presents itself in CPN modeling in general and in a CPN-based decision support system for neurophysiology in particular.
• A review of how problems of transferability are solved in CAD.

In the third section we propose a new conceptual and technical framework needed to address the problem of transferability in the context of CPN models and demonstrate how to use a prototype implementation of the methodology. A subsequent discussion leads to the conclusion that the demonstrated examples prove the feasibility of the proposed approach.

Corresponding author. Tel.: +45 9635 7463. E-mail address: [email protected] (K. Jensen). 0169-2607/$ – see front matter © 2007 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.cmpb.2007.10.015

2. Background

2.1. Causal probabilistic networks

A CPN is a compact representation of a joint probability distribution over a set of stochastic variables, $P(x_1, \ldots, x_n)$, utilizing that a joint probability distribution may be decomposed into a product of conditional probability distributions [22],

$$P(x_1, \ldots, x_n) = \prod_{j} P(x_j \mid \mathrm{parents}_{x_j}), \qquad (1)$$

where $\mathrm{parents}_{x_j}$ represents the direct causal influences determining the state of $x_j$. The elements of the modeling language are a directed acyclic graph (DAG), embodying the causal structure of the model, and a collection of local conditional probability distributions, one for each variable, encoding the plausibility of causal influence for each configuration of states in the parent variables (Fig. 1).

Fig. 1 – A small fragment of munin, a diagnostic system for neurophysiology, representing plausible propositions about the effect of a local nerve lesion on the anatomical structure of a nerve. Each node represents a stochastic variable associated with a prior or conditional probability table. Directed links represent causal interactions. The network fragment has been simplified for the purpose of illustration.

The increasing popularity of CPNs in medical decision support derives from a number of advantageous features inherent in the semantics of Bayesian probability theory. For example Pearl, in a general comparison with extensional modeling systems, showed that intensional probabilistic models are superior in bidirectional and non-modular inferences and in treatment of correlated sources of evidence [22]. The strength of Bayesian methods originates from their capacity to encode prior information in terms of likelihood, conditioning, relevance and causation. However, in the 1970s and early 1980s Bayesian methods were considered impractical for more complex domains, the exception being the so-called naive-Bayes model, which applies Bayes' theorem with strong (naive) independence assumptions. This condition changed when Kim and Pearl found an efficient algorithm for inference in a probabilistic graph without cycles [23]. Further progress was later made by Lauritzen and Spiegelhalter, who described an efficient inference algorithm for probabilistic graphical structures with some kinds of cycles [24].

CPNs were introduced to the medical domain by Andreassen et al. [25], and subsequently Heckerman et al. [4,26]. The latter group, in their Pathfinder project, designed for diagnosis of lymph-node diseases, compared the performance of different inference methods, including rule-based reasoning, Dempster–Shafer theory and a simple naive-Bayes model, and found that naive-Bayes, in spite of the overly simplifying assumption that the set of disorders is exhaustive and mutually exclusive, provided the greatest diagnostic accuracy of the compared methods [4,26,27]. Their subsequent analysis of the results led to the conclusion that the other inference schemes considered encode even stronger assumptions of conditional independence than the naive-Bayes model when there are more than two mutually exclusive and exhaustive diseases in a domain [4,27]. They also considered fuzzy decision theory, but rejected it because different general pathologists and experts in hematopathology disagreed on fuzzy descriptions of the problem domain. Finally they investigated QMR and demonstrated that QMR's ad hoc scoring scheme was isomorphic to the odds-likelihood updating scheme [4,28]. Consequently they built a CPN with some cycles, but with the naive-Bayes assumption embodied in a single discrete stochastic variable representing a set of exhaustive and mutually exclusive disorders.
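The factorization in Eq. (1) can be illustrated with a minimal Java sketch for a fragment with two parent variables and one child variable, loosely modeled on Fig. 1. The class name, the reduced state spaces and all probabilities below are assumptions made for illustration only; they are not taken from munin.

```java
import java.util.Arrays;

/** Sketch of Eq. (1) for a two-parent, one-child fragment. All numbers are invented. */
final class ChainRuleSketch {
    // Hypothetical prior over severity; state order: No, Moderate, Severe.
    static final double[] P_SEVERITY = {0.90, 0.07, 0.03};
    // Hypothetical prior over two illustrative pathology states.
    static final double[] P_PATHOLOGY = {0.60, 0.40};
    // Hypothetical CPT P(loss | severity, pathology); loss states: No, Mild, Severe.
    static final double[][][] P_LOSS = {
        {{1.00, 0.00, 0.00}, {1.00, 0.00, 0.00}},   // severity = No
        {{0.30, 0.60, 0.10}, {0.50, 0.40, 0.10}},   // severity = Moderate
        {{0.05, 0.35, 0.60}, {0.10, 0.50, 0.40}}    // severity = Severe
    };

    /** Joint probability as the product of the local distributions, as in Eq. (1). */
    static double joint(int severity, int pathology, int loss) {
        return P_SEVERITY[severity] * P_PATHOLOGY[pathology]
                * P_LOSS[severity][pathology][loss];
    }

    /** Marginal P(loss), obtained by summing the joint over both parents. */
    static double marginalLoss(int loss) {
        double sum = 0.0;
        for (int s = 0; s < P_SEVERITY.length; s++) {
            for (int p = 0; p < P_PATHOLOGY.length; p++) {
                sum += joint(s, p, loss);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] marginal = new double[3];
        for (int l = 0; l < marginal.length; l++) {
            marginal[l] = marginalLoss(l);
        }
        System.out.println(Arrays.toString(marginal));
    }
}
```

Because every local table is normalized, the joint distribution built this way sums to one without any global normalization step, which is what makes the local specification of a CPN practical.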
The first CPN-based diagnostic system utilizing the inference algorithm by Spiegelhalter and Lauritzen, and removing the naive-Bayes assumption, thereby allowing multiple diseases to be diagnosed, was munin, a diagnostic system for neurophysiology [3,25]. A simplified example of a small fragment of the munin network is shown in Fig. 1. The parent nodes represent a local nerve lesion being described by three different pathological conditions and three degrees of severity. The child node represents five degrees of a pathophysiological condition known as ‘axonal loss’, signifying a loss of nerve fibers that may be the effect of a local nerve lesion. Each parent node is associated with a discrete prior probability distribution, and the child node is associated with a conditional probability table containing a discrete probability distribution over the states of the child node for each configuration of states in the parent nodes. The early version of the network was a ‘nanohuman’ network representing only three diseases, 15 clinical findings from a single muscle and 9 pathophysiological variables relating the disorders and the clinical findings. The network was later extended to become a ‘microhuman’ network representing 22 diseases and 186 findings associated with 6 muscles and 8 nerves [29,30]. Including the pathophysiological attributes of the anatomical elements, the ‘microhuman’ network presently contains around 1100 stochastic variables [31]. An evaluation of munin demonstrated that within its


Fig. 2 – An overview over some important transferability factors in medical decision support. Factors within the scope of the present study are in boldface.

limited anatomy the diagnostic performance of the system was at the same level as that of an experienced neurophysiologist [32]. Present work on the munin network revolves around expanding the model to encompass all relevant parts of the peripheral neuromuscular system. This means that the model must be scaled up by a factor of at least 10, to include more than 11,000 variables, at least 200 diseases and no fewer than 2000 findings. This must be done in a manner that maximizes the economy of the specification and the transferability of the model between different neurophysiological laboratories. Here we only consider maximization of transferability.

2.2. The problem of transferability in medical decision support

The term transferability has been defined as the degree to which a system retains its usefulness and reliability when applied in different organizational environments, where the concept of usefulness has dimensions like accessibility, relevance and completeness, and reliability denotes the system's correctness, robustness and sensitivity [14]. Fig. 2 illustrates how the usefulness of a transferred medical decision support system may be evaluated across two categories of transferability factors, those concerned with the medical domain and those concerned with the information technology utilized by the system [14]. Important domain factors are medical epidemiology, terminology and methodology. Among the most important technology factors are methods of knowledge acquisition and representation [14]. The scope of the study documented here is confined to investigating a system's transferability of usability in the case of varying epidemiology and methodology, with the aim of demonstrating how a proper choice of a knowledge representation technique can maximize the relevance and completeness of the system at a local site.

2.2.1. Transferability of causal probabilistic networks

The large majority of published applications of probabilistic graphical models consist of a fixed set of stochastic variables, obtained by some combination of expert judgment and learning from observation, representing a unique joint probability distribution. However, since this joint probability distribution is dependent on a frame of prior background knowledge, the model is no longer valid if the frame of background knowledge changes. To overcome this deficiency early designs adopted an approach known as the fixed parameterized model approach [33]. In this approach quantitative parameters of the model are


Fig. 3 – The munin fragment from Fig. 1 extended with first-order logical symbols used to represent general propositions about the effect of a local nerve lesion on nerve segments located distally to the site of the lesion. distalTo(X) is a hypothetical Boolean function evaluating to true if a given nerve segment is distal to a particular local nerve lesion.

allowed to vary, in order to cover a family of related decision situations. The fixed parameterized model approach can be exemplified by treat, a recently developed, and clinically tested, decision support system for antibiotic treatment [10]. In the design of treat the problem of transferability was encountered on different levels. At the level of user interaction some information is specific to countries, hospitals or hospital departments, including language, names, available antibiotics and antibiotics used for testing of susceptibilities [34]. However, the problem of transferability also involves the basic reasoning of the system. For example the prevalence of different pathogens, and their associated susceptibility to antibiotics, differ from region to region and from hospital to hospital, and even different units in the same hospital have distinct patterns of susceptibility [35]. These transferability factors are all medical domain issues related to variation of epidemiology, terminology and methodology between different locations. To resolve these differences the treat system contains a number of calibration databases [34]. In the treat system calibration data are inserted into a CPN with a fixed structure and state space to generate different instances of the CPN, each one being calibrated to a particular location. However, more than a decade ago it was argued that a fixed parameterized model is not sufficient to account for context-sensitive variation implying changes in the structure of graphical probabilistic and decision models, and acknowledgement of this fact spurred a new line of research in automatic construction of graphical models from knowledge bases [33]. This research effort, also known as knowledge-based model construction, was stimulated by the observation that probabilistic graphical models offer no means to capture general knowledge about probabilistic relationships across classes of events [36].
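The calibration step of the fixed parameterized model approach can be sketched as follows. This is a minimal Java illustration under stated assumptions: the site names, pathogen names, case counts and the `calibratedPrior` method are hypothetical and do not describe the actual calibration databases of treat. The point is only that the structure and state space stay fixed while the numbers are replaced per location.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Sketch of the fixed parameterized model approach: one structure, site-specific numbers. */
final class CalibrationSketch {
    // Fixed state space shared by every site (hypothetical pathogen names).
    static final String[] PATHOGENS = {"pathogenA", "pathogenB", "pathogenC"};

    // Hypothetical calibration database: site -> observed case counts per pathogen.
    static final Map<String, int[]> CALIBRATION = new HashMap<>();
    static {
        CALIBRATION.put("siteX", new int[] {50, 30, 20});
        CALIBRATION.put("siteY", new int[] {10, 10, 80});
    }

    /** Builds the prior table for one site; structure and state space never change. */
    static double[] calibratedPrior(String site) {
        int[] counts = CALIBRATION.get(site);
        if (counts == null || counts.length != PATHOGENS.length) {
            throw new IllegalArgumentException("No usable calibration data for " + site);
        }
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double[] prior = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            prior[i] = (double) counts[i] / total;   // relative frequency as the prior
        }
        return prior;
    }

    public static void main(String[] args) {
        System.out.println("siteX: " + Arrays.toString(calibratedPrior("siteX")));
        System.out.println("siteY: " + Arrays.toString(calibratedPrior("siteY")));
    }
}
```

A site that needed an additional pathogen state, or an additional variable, could not be served by this scheme, which is the limitation the surrounding text attributes to fixed parameterized models.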
As work in knowledge-based model construction progressed, interest gradually shifted to the relation between probabilistic representation languages and classical first-order logic. In recent years there have been many proposals for first-order Bayesian networks (e.g. [37–41]). In a related line of research concepts and techniques from the object-oriented programming paradigm have been invoked, to extend Bayesian networks with general knowledge representation capabilities [42–45].

The basic idea in first-order Bayesian network formalisms is the extension of Bayesian networks with first-order logical expressions, using existential and universal quantifiers to represent generalized plausible propositions. An example is shown in Fig. 3. Here the fragment of the munin network in Fig. 1 is shown with some additional features expressing first-order logical relationships between the nodes of the network fragment. The rectangles in the figure are plates, invented by Buntine to represent classes of entities in a graphical model [41,46]. Each plate is labelled with the name of the entity class and each conditional probability table is associated with a logical expression related to the set of plausible propositions defined by the table. It is assumed that instances of the entity class Local Nerve Lesion have the same prevalence, encoded as the probability for having a 'Moderate' or 'Severe' local nerve lesion (Fig. 3). It is furthermore assumed that there is a function, distalTo(X), attributed to the entity class Nerve Segment, the truth-value of which can be determined from a search in an anatomical database. The value of the function evaluates to true if a given nerve segment is distal to an instance of the entity class Local Nerve Lesion, affecting the nerve containing the nerve segment. Using the extra knowledge representation capabilities provided by first-order logic, it becomes possible to represent general conditional probabilistic assertions related to a class of local nerve lesions and nerve segments. The network fragment shown in Fig. 3 may be instantiated by a program that will create all instances of the network fragment for which the general probabilistic assertions are plausible.

The example shown here illustrates a characteristic feature predominant in all first-order and object-oriented extensions to Bayesian networks known to the authors. They all allow


Fig. 4 – A typical nerve conduction study where a nerve is stimulated electrically at the wrist and the nerve signal is recorded subsequently in digit 2. LNLW is local nerve lesion at the wrist, length is the length of the nerve segment, AMP and CV are amplitude and conduction velocity (mean (S.D.)) of the nerve signal in a normal population.

a fixed parameterized network fragment to be reused in multiple contexts; however, parameterization is confined to the naming of particular instances of the general class of network fragments. While some of these formalisms have been designed with the intention of providing means to optimize the ability of a model to account for context variation, this restriction, often imposed deliberately to obtain a declarative modeling framework (e.g. [42]), severely impairs the transferability, or reusability, of models created in this type of modeling framework. For example a particular local nerve lesion of the peripheral nerve system may have variable prevalences at different locations, either due to epidemiological factors or because of different reasons for referral. More seriously, a class of related entities may have variation in both state space and topology when represented in a CPN. This more complex problem of transferability has been encountered in the design of the munin CPN.
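The kind of instantiation program described for Fig. 3 can be sketched in Java as follows. The `Segment` and `Lesion` records, the index-based stand-in for the anatomical database and the string naming of instances are all assumptions introduced for illustration; `distalTo` here is only a placeholder for the hypothetical Boolean function of Fig. 3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of plate-style instantiation: one fixed fragment, many named instances. */
final class PlateInstantiationSketch {
    /** Hypothetical anatomical record; index grows from proximal to distal along one nerve. */
    static final class Segment {
        final String name;
        final int index;
        Segment(String name, int index) { this.name = name; this.index = index; }
    }

    /** Hypothetical lesion record; firstDistalIndex is the first segment beyond the lesion site. */
    static final class Lesion {
        final String name;
        final int firstDistalIndex;
        Lesion(String name, int firstDistalIndex) {
            this.name = name;
            this.firstDistalIndex = firstDistalIndex;
        }
    }

    /** Placeholder for the distalTo(X) lookup, here a comparison of positions. */
    static boolean distalTo(Segment segment, Lesion lesion) {
        return segment.index >= lesion.firstDistalIndex;
    }

    /** Creates one named fragment instance per (lesion, distal segment) pair. */
    static List<String> instantiate(List<Lesion> lesions, List<Segment> segments) {
        List<String> instances = new ArrayList<>();
        for (Lesion lesion : lesions) {
            for (Segment segment : segments) {
                if (distalTo(segment, lesion)) {
                    instances.add(lesion.name + " -> AxonalLoss(" + segment.name + ")");
                }
            }
        }
        return instances;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
                new Segment("elbow-wrist", 0), new Segment("wrist-digit2", 1));
        List<Lesion> lesions = Arrays.asList(new Lesion("LNLW", 1));
        for (String instance : instantiate(lesions, segments)) {
            System.out.println(instance);
        }
    }
}
```

Only the names of the instances vary in this sketch; the probability tables and state spaces of every instance would be identical, which is the restriction the text identifies.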

2.2.2. A problem of transferability in munin

The munin CPN represents the mapping of local and systemic neuromuscular disorders into structural and functional changes of the neuromuscular system. Furthermore it represents the translation of the structural and functional changes into clinically observable variables monitored in electrodiagnostic studies. In particular the part of the CPN associated with nerve conduction studies poses a challenging problem of transferability. Electrical stimulation of a nerve sends an


impulse along the nerve, which may be recorded to assess conduction characteristics of the nerve. In a typical nerve conduction study a nerve, for example the median nerve, is stimulated electrically at the wrist and the nerve signal is recorded distally at a digit (e.g. digit 2), as shown in Fig. 4. A distorted nerve signal recording may signify the presence of a local nerve lesion at the wrist. To interpret the finding the neurophysiologist must know the mean and standard deviation of the amplitude and conduction velocity of the nerve signal in a normal patient, and the length of the nerve segment. The stimulation points used for electrical stimulation define a set of nerve segments, and in the munin CPN each nerve segment is represented by a CPN fragment. Fig. 5 shows a CPN fragment representing a nerve segment. The CPN fragment contains stochastic variables representing:
• A local nerve lesion affecting the nerve segment.
• Structural and functional changes of the nerve segment (pathophysiology).
• Amplitude and conduction velocity of the nerve signal conducted by the nerve segment.

When a CPN fragment is created to model the attributes of a nerve conduction study a number of complicating transferability issues must be addressed:

• Most tests employed in electrodiagnostic medicine, including nerve conduction studies, depend on normative data for their interpretation [47].
• Reference values used to assess abnormalities are sensitive to height and, to a lesser degree, to age and temperature [48].
• Different laboratories may use different stimulation and recording techniques [49].
• Differences in the techniques applied in different laboratories may induce a bias in the comparison with normative data, so each laboratory is encouraged to develop its own normal ranges, using standardized methods, to minimize the bias [49].

The demo- and geographic variation in normative data can be handled using the fixed parameterized model approach, with calibration data being compiled into a local instance of the network. But the variable professional preferences in the

Fig. 5 – A fragment of the munin network representing attributes of a nerve conduction study in the wrist-digit 2 nerve segment of the median nerve. Each node represents an attribute of the nerve segment. The network fragment has been simplified for the purpose of illustration.


specification of nerve conduction studies imply that munin must represent nerve conduction studies generically, in a way that allows the model to be adjusted to a range of specifications. For example, the lack of universal standards for specification of stimulation points in a nerve conduction study implies adjustments in both the state space and topology of munin:

(1) The state space of some variables in munin is dependent on physical parameters that are continuous by nature. These variables are discretized, using methods for nonlinear sampling of continuous stochastic variables [50,51]. To minimize the load of inference, the number of discrete states in each variable must be minimized accordingly. However, to maintain a minimal number of states, the state space of each instance of these variables must be adjusted to the physical parameters they depend on, for example the length of a nerve segment. Because the length of a nerve segment is defined by the specification of stimulation points, the state space of these variables must be parameterized for munin to be adjustable to a range of possible specifications of nerve conduction studies.

(2) The topology of munin is partly determined by the number of nerve segments to be represented in the model, since each nerve segment must be represented by a CPN fragment in the model. This number in turn depends on the specification of stimulation points. Variable preferences may also affect the number of local nerve lesions that must be represented in a particular nerve segment. Consequently the topology of munin must also be generic for the model to be adaptable to a range of possible specifications of stimulation points in a nerve conduction study.

We investigated whether it is possible to represent a CPN generically in such a way that the state space and topology of the model can be parameterized to account for variable professional preferences.
More precisely, the study sought an answer to the question: Can the relevant part of munin be specified generically in a way that allows particular instances of the model to be created from an arbitrary specification of stimulation points and relevant normative data in a series of nerve conduction studies?
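The two dependencies described in points (1) and (2) can be sketched in Java as follows. This is an illustration under stated assumptions, not the method used in munin: stimulation points are given as hypothetical distances in millimetres along a nerve, the velocity range is invented, and plain uniform binning stands in for the nonlinear sampling methods of [50,51].

```java
import java.util.Arrays;

/** Sketch: stimulation points determine both topology and state space. */
final class StimulationPointSketch {
    /** Consecutive stimulation points (mm along the nerve) define the nerve segments. */
    static double[] segmentLengths(double[] pointsMm) {
        if (pointsMm.length < 2) {
            throw new IllegalArgumentException("Need at least two stimulation points");
        }
        double[] lengths = new double[pointsMm.length - 1];
        for (int i = 0; i < lengths.length; i++) {
            lengths[i] = pointsMm[i + 1] - pointsMm[i];
        }
        return lengths;
    }

    /**
     * Bin edges (ms) of a latency variable, scaled to the segment length.
     * Uniform bins are an assumption made here for simplicity.
     */
    static double[] latencyBinEdges(double lengthMm, double slowestMps,
                                    double fastestMps, int states) {
        if (states < 1 || slowestMps <= 0 || fastestMps <= slowestMps) {
            throw new IllegalArgumentException("Invalid velocity range or state count");
        }
        double min = lengthMm / fastestMps;   // mm divided by m/s gives ms
        double max = lengthMm / slowestMps;
        double step = (max - min) / states;
        double[] edges = new double[states + 1];
        for (int i = 0; i <= states; i++) {
            edges[i] = min + i * step;
        }
        return edges;
    }

    public static void main(String[] args) {
        // Hypothetical specification: three stimulation points give two segments.
        double[] lengths = segmentLengths(new double[] {0.0, 140.0, 340.0});
        for (double length : lengths) {
            System.out.println(length + " mm: "
                    + Arrays.toString(latencyBinEdges(length, 20.0, 70.0, 5)));
        }
    }
}
```

Adding or moving a stimulation point changes the number of segments (topology) and the bin edges of each latency variable (state space) while the number of states per variable stays fixed, which is the behaviour a fixed parameterized model cannot provide.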

2.3. The problem of transferability in computer-aided design

In computer-aided design the problem of transferability is mainly related to the exchange of model data between different CAD systems. The data exchange problem is addressed by an international standard, ISO 10303 or STEP ('STandard for the Exchange of Product model data') [52]. The standard provides a system-independent format for the transmission of data in computer-interpretable form between different CAD systems, or between CAD and other computer-based engineering systems. The initial release of STEP was aimed entirely at the exchange of explicit models, defined in terms of geometry and, possibly, additional topological information

providing connectivity relationships between geometric elements. In recent years application of STEP has revealed a problem that was not accounted for in the early versions of the standard. Standards for explicit model translation do not ensure the transfer of model behaviour, implying that a transferred model cannot necessarily be modified [53]. Put another way, explicit model transfers do not capture the method by which the model was originally constructed, or what is often referred to as design intent [54]. In CAD systems, design intent is represented by model features, model parameters, model constraints and the model's history of construction. Therefore present work, in relation to exchange of CAD models, is aimed at the extension of STEP to permit exchange of parameterized feature-based procedural representations between different CAD systems [53]. Pratt et al. give the following definitions for two of the defining attributes of design intent [53]:
• Features are high-level geometric constructs used during the design process to create shape configurations in the model that are usually related to the intended functionality of the designed product.
• Parameters are values of quantities in the model (typically dimensions) that may be regarded as variables for purposes of editing the model.

In the next section these attributes of design intent will be redefined in the context of CPN models in medical decision support. However, the aim of standardization in CAD systems is to ensure the invariance of design intent when a model is transferred from one system to another. The scope of the present study is to investigate whether it is possible to formulate a generic model in a way that allows particular instances of the model to reflect variation of design intent, due to variable professional preferences, when a model is transferred from one location to another.
Thus, the basic lesson that will be drawn from the experiences accumulated in CAD design, is the definition of design intent in terms of a model’s construction history, features and parameters.

2.3.1. Design intent in CPN modeling

As mentioned in the previous section design intent reflects the method by which a model was originally constructed, indicating what is variable in the model, and what is fixed. When considering the procedure of constructing a CPN it is possible to uncover elements of the model that are parallels to features and parameters in CAD models. This similarity is not as surprising as it may seem. An explicit CAD model is defined in terms of topological relationships between geometric elements. Likewise one can define an explicit CPN model in terms of topological relationships between probabilistic elements. In the context of CPN models the following informal definitions are used.
• Features are classes of CPN fragments representing high-level information entities that are recognizable to the designer of a CPN-based system. In a medical diagnostic system typical features are local and systemic disorders or anatomical, physiological and observational units. Examples of features in the munin system are local nerve lesion, nerve, nerve segment, pathophysiology of a nerve segment and conduction velocity of a nerve signal. A feature is associated with a parameterized construction procedure. An instance of a feature may be composed from instances of other features. The specification of a nerve conduction study (Fig. 4), represented as a CPN, is an example of a composite feature.
• Parameters are instances of features or quantities associated with instances of features, e.g. the length of a nerve segment. The state space of a stochastic variable contained in an instance of a feature may also be a parameter. For example a relevant set of latencies of the nerve signal may be passed as a parameter to the construction procedure for a nerve segment feature.
• A stochastic parameter is a probability distribution over the state space of a stochastic variable. An example is the probabilities over the state space of the node 'Severity' in Fig. 1.

2.3.2. Features and object-orientation

Readers who are familiar with object-oriented methodologies may recognize that the relation of a parameterized family of features to its created examples can be seen as analogous to the class-instance relation in object-oriented languages. The classical extensional view of this relation defines a class as a generalization of a set of objects with common properties. However, object-oriented languages often apply an intensional, or feature-based, definition, emphasizing a class as a descriptor and constructor of objects [55,56]. In addition to classification, object-oriented languages often utilize a number of techniques aimed at maximizing the adaptability of software solutions to varying conditions. The basic elements of this set of tools are class inheritance, object composition and generics [57]:
• Class inheritance allows a new class of objects to be defined economically by extension of other class definitions. It would be possible to incorporate the notion of inheritance into a feature-based approach to CPN modeling, but this option has not been explored in the present study.
• Object composition is an alternative to class inheritance. Here new functionality is obtained by assembling simpler objects into more complex aggregates. The definition of CPN features given above allows an instance of a feature to be composed from instances of other features, in analogy to object composition.
• Generics is a technique that allows a designer to define a functionality without specifying all the data types it uses. The unspecified data types may be given as parameters at the point of use. For example, a List class can be parameterized by the type of elements it contains. This technique can be used in the feature-based approach to CPN modeling suggested here, as it is possible to parameterize the state space of the variables contained in a feature.

Thus, the procedural feature-based approach to CPN modeling may be formulated in an object-oriented terminology.
This observation has been utilized in the implementation of presented ideas to be described in the next section.
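The analogy can be made concrete with a small sketch. It is written in Python for brevity, although the prototype itself is a JAVA preprocessor, and it is only an illustration under stated assumptions: the classes Node, Feature and SeverityFeature, and the state names, are hypothetical and are not taken from the implementation. A feature class plays the role of the parameterized family, each object is a created example, the state space is supplied as a parameter at the point of use, and a larger feature is composed from instances of smaller ones.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A stochastic variable with a name and a discrete state space."""
    name: str
    states: tuple


@dataclass
class Feature:
    """Hypothetical base class: a named fragment of a CPN."""
    prefix: str
    nodes: list = field(default_factory=list)
    parts: list = field(default_factory=list)  # composed sub-features

    def all_nodes(self):
        """Nodes of this feature followed by those of its composed parts."""
        collected = list(self.nodes)
        for part in self.parts:
            collected.extend(part.all_nodes())
        return collected


class SeverityFeature(Feature):
    """Feature class whose state space is a parameter, like a generic type."""

    def __init__(self, prefix, states):
        super().__init__(prefix)
        self.nodes.append(Node(f"{prefix}.SEVERITY", tuple(states)))


# Two instances of the same feature class with different state spaces
# (the state names are placeholders).
coarse = SeverityFeature("LNLW", ["no", "yes"])
fine = SeverityFeature("LNLP", ["no", "mild", "moderate", "severe"])

# Object composition: a larger feature assembled from instances of others.
segment = Feature("SEGMENT", parts=[coarse, fine])
names = [node.name for node in segment.all_nodes()]
```

Here `names` lists the nodes contributed by both composed instances, while each instance keeps the state space it was constructed with.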


3. A method for generic representation of causal probabilistic networks

The analysis described in previous sections has defined some design objectives to be implemented in a working computer system:

(1) The system must allow a designer to compose a generic CPN as a structure of parameterized features, each feature being associated with a pertinent construction procedure.
(2) The system must allow a user to create an instance of the generic CPN that is adjusted to a set of preferences. The preferences must be expressed in terms of:
    (a) A structured list of features to be represented in the CPN.
    (b) A set of parameters needed for the construction procedure of each selected feature.

3.1. System description

A working prototype system has been implemented to demonstrate the feasibility of the suggested method for generic representations of CPN models in medical decision support systems. The system is implemented on a Windows PC (2.8 GHz Pentium(R) 4 CPU) as a JAVA preprocessor to Hugin Expert, a commercially available tool for construction and application of probabilistic graphical models [58]. Hugin Expert, among other things, has a JAVA API for the decision engine, containing functionality needed for construction and manipulation of Bayesian networks. The designed JAVA preprocessor extends the functionality provided by the Hugin Expert JAVA API with additional functionality incorporating the design objectives described in the previous section. The added functionality includes five new JAVA packages, a structured parameter format and a library of probability models (Fig. 6).

The Probability Library is a library of probabilistic models. The probabilistic models are stored in text format, either as conditional probability tables, or as parameterized models for construction of conditional probability tables. Examples of how to translate various parameterized probability models into conditional probability tables are given by Andreassen [50] and Olesen and Andreassen [51].

A Structured Parameter is a text string specifying the features to be represented in an instance of a generic CPN model. A simple example of a Structured Parameter, representing a Local Nerve Lesion feature, is shown in Fig. 7.

The Feature Parser contains classes that create the internal representation of a Structured Parameter provided by a user to create an instance of the generic CPN. Other classes in this package create an internal representation of the Probability Library.

The Feature Library is a container for a set of construction procedures that must be associated with the features of a generic CPN. The construction procedures must be implemented as JAVA constructors.
Fig. 6 – An overview of the JAVA packages and specification formats implemented by the authors and extending the functionality of the Hugin Expert JAVA API for construction and manipulation of CPNs and other belief networks. Arrows show the dependencies between the elements of the design.

To represent a CPN generically, using the software packages, a designer must formulate a construction procedure for each class of features in the model. The construction procedure is composed of a header and a parameterized procedure. A header for a Local Nerve Lesion feature is shown in Fig. 8. The header declares stable properties that are shared by all instances of the feature, for example stochastic variables (nodes), variable names, state space and a prefix to variable names, the value of which will be determined by a parameter to the parameterized procedure. The construction procedures in the Feature Library all depend on functionality provided by the Feature Builder package.

The Feature Builder contains the classes providing the basic functionality needed to compose an instance of a generic CPN from a set of parameterized features, including:

(1) Subclasses extending classes in the Hugin API representing different kinds of stochastic variables in a CPN. The added functionality is, for example, needed to impose constraints on the number of parents of a given node.
(2) Classes for internal representations of Structured Parameter objects and stochastic parameters. The stochastic parameters are stored in the Probability Library and are passed by reference to the construction procedures in the Feature Library.
(3) A superclass for all features contained in the Feature Library. The superclass provides the functionality needed to manipulate instances of the different features when an instance of the generic CPN is composed.

The Feature Builder depends on functionality from two other packages, the Table Generator and the State Space Generator. The Table Generator contains functionality extending a table generator facility provided by the Hugin Expert JAVA API. A table generator generates conditional probability tables from parametric probabilistic models. The table generator used here is based on a design originally developed by Olesen and Andreassen for an earlier preprocessor to Hugin Expert [51]. The State Space Generator is a list of parameterized mathematical models needed to generate the state space of a discrete stochastic variable.
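The parent constraint mentioned in item (1) can be illustrated with a minimal sketch. It is written in Python and does not use the Hugin API; the class ConstrainedNode, its methods and the state names are hypothetical. It shows why such a constraint matters: the number of entries in a node's conditional probability table is the product of its own number of states and the number of states of each parent.

```python
from math import prod


class ConstrainedNode:
    """Hypothetical discrete node that refuses more than max_parents parents."""

    def __init__(self, name, states, max_parents):
        self.name = name
        self.states = list(states)
        self.max_parents = max_parents
        self.parents = []

    def add_parent(self, parent):
        """Add a parent, or raise ValueError if the limit is already reached."""
        if len(self.parents) >= self.max_parents:
            raise ValueError(
                f"{self.name}: at most {self.max_parents} parents allowed"
            )
        self.parents.append(parent)

    def table_size(self):
        """Number of entries in the conditional probability table."""
        return len(self.states) * prod(len(p.states) for p in self.parents)


# Placeholder nodes; names and state labels are made up for illustration.
cause_a = ConstrainedNode("A", ["s0", "s1", "s2", "s3"], max_parents=0)
cause_b = ConstrainedNode("B", ["s0", "s1", "s2"], max_parents=0)
finding = ConstrainedNode("F", ["s0", "s1", "s2", "s3"], max_parents=2)
finding.add_parent(cause_a)
finding.add_parent(cause_b)
size = finding.table_size()  # 4 * 4 * 3 entries
```

A third call to `finding.add_parent` would raise a `ValueError`, which keeps the table from growing beyond what the designer allowed.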

3.1.1. Parameterization of conditional probabilities

Fig. 7 – A Structured Parameter representing a local nerve lesion at the wrist (LNLW). The keyword ‘table’ is followed by a reference to a probability table contained in the Probability Library.

Fig. 8 – A header for the construction procedure of the generic feature, Local Nerve Lesion. The header declares two CPN nodes, two generic node names and a parameterized prefix to be appended to generic node names, when an instance of the generic feature is created. In addition, state names are declared for each of the two nodes.

Fig. 9 – A construction procedure for a Local Nerve Lesion feature.

Fig. 10 – The output from a parameterized construction procedure for a Local Nerve Lesion feature with laboratory prevalences inserted. These are different from the population prevalences used in Figs. 1 and 3.

Fig. 9 shows a construction procedure for the feature, Local Nerve Lesion. The construction procedure creates two nodes shared by all instances of the feature, names them and inserts state space and conditional probabilities. It takes a ParameterObject as a parameter. A ParameterObject is a class of objects defined in the Feature Builder package. It is an internal representation of a Structured Parameter, and an object of this class has a number of methods for retrieval of the information needed to create an instance of the feature. For example, the method getName is needed to append a prefix to the node names, and setData inserts conditional probabilities passed by reference with the ParameterObject. Fig. 7 shows a structured list of parameters for a local nerve lesion, LNLW (Local Nerve Lesion at the Wrist). The parameters to the construction procedure for Local Nerve Lesion are the name of the nerve lesion, LNLW, and two references to the Probability Library, LNLW.SEVERITY and LNLW.PATHOLOGY, used by setData to retrieve the conditional probabilities for the two nodes. Fig. 10 shows the output of the construction procedure in Fig. 9, with specific laboratory prevalences encoded into the prior probabilities of the node LNLW.SEVERITY.
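The flow described above can be sketched as follows. This is not the procedure of Fig. 9: it is a Python approximation under stated assumptions, with get_name and set_data standing in for getName and setData, a dictionary standing in for the Probability Library, and placeholder state names and probabilities that carry no clinical meaning. Only the prefix LNLW and the two library references LNLW.SEVERITY and LNLW.PATHOLOGY come from the text.

```python
class ParameterObject:
    """Hypothetical stand-in for the internal form of a Structured Parameter."""

    def __init__(self, name, tables, library):
        self._name = name        # prefix for node names, e.g. "LNLW"
        self._tables = tables    # generic node name -> library reference
        self._library = library  # library reference -> probability table

    def get_name(self):
        return self._name

    def set_data(self, node, generic_name):
        """Look up the referenced table and attach it to the node."""
        node["table"] = self._library[self._tables[generic_name]]


# Header: properties shared by every instance of the feature.
GENERIC_NODES = {
    "SEVERITY": ["state_a", "state_b"],   # placeholder state names
    "PATHOLOGY": ["state_c", "state_d"],
}


def build_local_nerve_lesion(param):
    """Create the feature's nodes, prefix their names and insert tables."""
    nodes = {}
    for generic_name, states in GENERIC_NODES.items():
        node = {"name": f"{param.get_name()}.{generic_name}", "states": states}
        param.set_data(node, generic_name)
        nodes[node["name"]] = node
    return nodes


library = {
    "LNLW.SEVERITY": [0.5, 0.5],   # placeholder numbers, not clinical data
    "LNLW.PATHOLOGY": [0.5, 0.5],
}
lnlw = ParameterObject(
    "LNLW",
    {"SEVERITY": "LNLW.SEVERITY", "PATHOLOGY": "LNLW.PATHOLOGY"},
    library,
)
fragment = build_local_nerve_lesion(lnlw)
```

Swapping the library entries, for example to encode laboratory rather than population prevalences, changes the tables of the created nodes without touching the construction procedure.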

3.1.2. Parameterization of state space

The state space of some variables in MUNIN depends on physiological parameters that may vary between different instances of a feature. An example is the state vector, $\mathrm{CV}_{\mathrm{slow}}$, representing measured conduction velocities of a nerve signal in a nerve segment. The values this state vector can assume are calculated by the following mathematical function contained in the State Space Generator:

$$\mathrm{CV}_{\mathrm{slow}} = \frac{\mathit{length}}{t_{\mathrm{delay}} + (\mathit{length}/\mathrm{CV})} \qquad (2)$$

where length is the length of a given nerve segment and CV is the mean value of the conduction velocity of the nerve signal in a normal population. $t_{\mathrm{delay}}$ is a state vector [0, 0.75, 1.6, 7.0] ms, representing different degrees of delay of the nerve signal, due to a demyelinating nerve lesion. To illustrate how the state vector $\mathrm{CV}_{\mathrm{slow}}$ depends on length and CV, consider the following two examples:

(1) A nerve conduction study with length = 270 mm and CV = 64 m/s. In this case the state vector $\mathrm{CV}_{\mathrm{slow}}$ becomes [64, 54, 46, 24] m/s.
(2) A nerve conduction study where length = 140 mm and CV = 60 m/s. In this case the state vector $\mathrm{CV}_{\mathrm{slow}}$ becomes [60, 46, 36, 15] m/s.

To account for this variability, CV and length must be contained in the Structured Parameter passed as a ParameterObject to the construction procedure of the generic feature Nerve Segment, as shown in Fig. 11. The construction procedure must then call the pertinent function contained in the State Space Generator with the correct set of arguments. It is the responsibility of the designer to provide the mathematical functions in the State Space Generator.

Fig. 11 – A Structured Parameter for the construction procedure of a Nerve Segment feature. Wrist-digit 2 is the name of the nerve segment. The length of the nerve segment is 165 mm. CV and AMP are normative data (mean (S.D.)) for conduction velocity and amplitude of a nerve signal conducted by the wrist-digit 2 nerve segment in a normal patient.

3.1.3. Feature composition

In a model of a nerve conduction study some relevant features are Nerve Segment, Local Nerve Lesion, Pathophysiology of a nerve segment and Findings related to a nerve segment. The features can be organized in a compositional hierarchy. In Fig. 12 the hierarchical organization is illustrated by a graph. Each node in the graph represents a construction procedure, and arrows signify the asymmetric relationship between two construction procedures at different levels of the compositional hierarchy. For each nerve segment, defined by a stimulation point, the construction procedure for a Nerve Segment must be called. The construction procedure for Nerve Segment in turn must call the construction procedures for Local Nerve Lesion, Pathophysiology and Findings. The output of the composite construction procedure depends on the ParameterObject passed to the top level construction procedure for the Nerve Segment feature, where it is decomposed and distributed to lower level construction procedures. Fig. 13 illustrates the relation between the Structured Parameter in Fig. 11 and an instance of a generic CPN fragment representing a nerve conduction study of a single nerve segment. The Structured Parameter is passed as a ParameterObject to a hierarchy of construction procedures creating an instance of a Nerve Segment feature.

To represent all of MUNIN generically it is necessary to specify construction procedures for other features. For example, instances of a Nerve feature must be composed by calling the construction procedure for Nerve Segment iteratively. Subsequently the construction procedure of Nerve must create the necessary links between the instances of Nerve Segment and add additional links from features representing systemic nerve disorders. However, the principles for specification of higher order features and construction procedures are identical to those already illustrated.

The operations needed in the construction procedures are somewhat trivial. Apart from the ones shown in Fig. 9 they are operations needed to add links between nodes, and control structures necessary to iterate or branch, depending on the arguments to the construction procedure.

Fig. 12 – Graph illustrating a compositional hierarchy of construction procedures. The top level construction procedure for the feature Nerve Segment must call the construction procedures for each of the features, Local Nerve Lesion, Pathophysiology and Findings and add the necessary links between instances of the features.
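The composition described above, including the call into the State Space Generator from Section 3.1.2, can be sketched in a few lines. This is a Python illustration under stated assumptions, not the prototype's code: the function names, the dictionary used as parameter, the node names SEG.CV and SEG.FINDING and the direction of the links are all hypothetical. Only Eq. (2), the delay vector and the numbers of example (1) are taken from the text.

```python
T_DELAY_MS = [0.0, 0.75, 1.6, 7.0]  # delay states quoted in the text, in ms


def cv_slow(length_mm, cv_m_per_s):
    """Eq. (2). Lengths in mm and times in ms give velocities in m/s."""
    transit_ms = length_mm / cv_m_per_s
    return [length_mm / (delay + transit_ms) for delay in T_DELAY_MS]


def build_local_nerve_lesion(lesion_name):
    return {"nodes": [f"{lesion_name}.SEVERITY", f"{lesion_name}.PATHOLOGY"]}


def build_pathophysiology(segment, length_mm, cv_m_per_s):
    # Hypothetical node whose state space is generated rather than fixed.
    return {"nodes": [f"{segment}.CV"], "states": cv_slow(length_mm, cv_m_per_s)}


def build_findings(segment):
    return {"nodes": [f"{segment}.FINDING"]}


def build_nerve_segment(param):
    """Top-level procedure: decompose the parameter, delegate, add links."""
    segment = param["name"]
    lesion = build_local_nerve_lesion(param["lesion"])
    patho = build_pathophysiology(segment, param["length_mm"], param["cv_m_per_s"])
    findings = build_findings(segment)
    links = [
        (lesion["nodes"][1], patho["nodes"][0]),    # lesion -> pathophysiology
        (patho["nodes"][0], findings["nodes"][0]),  # pathophysiology -> finding
    ]
    nodes = lesion["nodes"] + patho["nodes"] + findings["nodes"]
    return {"nodes": nodes, "links": links, "cv_states": patho["states"]}


# Values from example (1) in Section 3.1.2; the segment name is made up.
fragment = build_nerve_segment(
    {"name": "SEG", "lesion": "LNLW", "length_mm": 270, "cv_m_per_s": 64}
)
rounded = [round(v) for v in fragment["cv_states"]]
```

With the values of example (1), `rounded` reproduces the state vector [64, 54, 46, 24] m/s given in the text. Because mm divided by ms equals m/s, no unit conversion is needed as long as lengths and delays are kept in those units.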

3.1.4. Parameterization of structure

In Fig. 13 it was illustrated how the specification of a nerve conduction study, passed as a ParameterObject to the construction procedure for Nerve Segment, yields a CPN fragment representing attributes of a particular nerve segment. Among the attributes was a common local nerve lesion, LNLW. A pragmatic neurophysiologist may only want to represent this nerve lesion in the wrist-digit segment, since it accounts for the large majority of nerve lesions in this nerve segment. However, a completist may wish to have the rare local nerve lesion LNLP (Local Nerve Lesion at the Palm) represented in the model. This is done by adding the nerve lesion to the specification of the nerve conduction study and passing it as a ParameterObject to the construction procedure for Nerve Segment. This construction procedure must then retrieve the list of ParameterObjects representing local nerve lesions, and for each nerve lesion call the construction procedure for Local Nerve Lesion. The output will be a CPN fragment representing a nerve segment with two local nerve lesions, as shown in Fig. 14.
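How the list of lesions determines the topology can be sketched as follows. Again this is a Python illustration with hypothetical names: the functions, the segment name SEG and the shared child node SEG.PATHOPHYSIOLOGY are assumptions made for the example, and only the lesion names LNLW and LNLP come from the text.

```python
def build_local_nerve_lesion(lesion_name):
    """Nodes contributed by one instance of the Local Nerve Lesion feature."""
    return [f"{lesion_name}.SEVERITY", f"{lesion_name}.PATHOLOGY"]


def build_nerve_segment(segment, lesion_names):
    """Call the lesion procedure once per listed lesion and link the results."""
    target = f"{segment}.PATHOPHYSIOLOGY"  # hypothetical shared child node
    nodes, links = [target], []
    for lesion_name in lesion_names:
        lesion_nodes = build_local_nerve_lesion(lesion_name)
        nodes.extend(lesion_nodes)
        links.append((lesion_nodes[-1], target))
    return nodes, links


# Same construction procedure, two different specifications.
pragmatic = build_nerve_segment("SEG", ["LNLW"])
completist = build_nerve_segment("SEG", ["LNLW", "LNLP"])
```

The two calls differ only in their argument, yet they produce fragments with different numbers of nodes and links, which is the sense in which the structure itself is parameterized.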

Fig. 13 – An instance of a generic Nerve Segment feature is created from a compositional hierarchy of parameterized construction procedures, taking a program representation of a Structured Parameter as argument.

Fig. 14 – An instance of a Nerve Segment feature created from a Structured Parameter containing two instances of a Local Nerve Lesion feature.

4. Discussion

The study presented here has demonstrated that it is possible to represent a CPN generically in a way that allows specific clinical preferences to be encoded into particular instances of the CPN, under the condition that the different clinical preferences are expressed in terms of a structured list of parameters. The structure of the list of parameters must be recognizable to a set of parameterized construction procedures, one for each feature of the domain being modeled. The variation in clinical preferences may affect both the state space and the structure of the CPN.

The proposed method has been illustrated by an example from neurophysiology, but only the construction procedures in the Feature Library, the State Space Generator, the Structured Parameter, and of course the Probability Library are related to the specifics of neurophysiology. Thus, it is possible to use the prototype system described above to create CPN models for other domains by specifying the needed parameters, state space generators and construction procedures for another set of features.

A distinguishing feature of the proposed methodology is that it supports variation of design intent between different localities by parameterizing the construction history of the model. The construction history can be represented explicitly because the representation is procedural, in contrast with the large number of declarative modeling techniques extending graphical models with first-order, general, knowledge capabilities. The parameterization of the construction history allows variable local preferences to be incorporated into the model at the point of use. Although the prototype system has been implemented in an object-oriented language, the methodology could also be expressed within other programming paradigms based on, for example, first-order logic.

The main contribution of the proposed method, compared to previous first-order extensions of belief networks, is that it allows all attributes of a CPN to be parameterized: conditional probabilities, state space, topology and naming of network fragments and nodes. No other method of CPN representation known to the authors supports parameterization along all these dimensions simultaneously. Of course, having extra degrees of freedom also introduces a number of complicating issues that must be addressed in the design of the construction procedures. An example is control of the number of parents of any given node, which is needed to control the size of the state space, and thus the efficiency of inference.

It is a relevant question whether the complexity of adapting a CPN from one place to another should lead one to abandon the modeling language as the basis for a medical decision support system. However, we believe that the factors complicating the transferability problem for a CPN are the same factors that render the modeling technique useful for problems with high dimensionality, e.g., diagnostic problems. The high degree of accuracy of inference that can be obtained with a CPN depends on a high capacity of the modeling language to encode specific prior information. But the more specific prior information we encode into the model, the less likely it is that the encoded information is transferable, whereas more general information is easier to transfer. In our view, this will be the case for any modeling technique with equivalent expressive power and accuracy of inference. Thus, in constructing an inference model for a medical decision support system, one may have to consider a trade-off between the need for accuracy of inference and the need to minimize the complexity of the problem of transferability.

5. Conclusion

To summarize, a new method for representation of generic CPNs has been proposed as a solution to some hitherto unsolved problems of transferability in CPN-based medical decision support. The proposed methodology has been implemented in a prototype software system, and application of the system has been demonstrated on a particular CPN-based medical decision support system. The illustrated examples have demonstrated the feasibility of the proposed approach.

References

[1] F.V. Jensen, Bayesian Networks and Decision Graphs, Springer Verlag, New York, 2001.
[2] J.O. Berger, Statistical Decision Theory and Bayesian Analysis, second ed., Springer-Verlag, New York, 1985.
[3] S. Andreassen, F.V. Jensen, S.K. Andersen, B. Falck, U. Kjærulff, M. Woldbye, A.R. Sørensen, A. Rosenfalck, F. Jensen, MUNIN – an expert EMG assistant, in: J.E. Desmedt (Ed.), Computer-aided Electromyography and Expert Systems, vol. 2, Elsevier, Amsterdam, 1989, pp. 255–277.
[4] D. Heckerman, E. Horvitz, B. Nathwani, Towards normative expert systems: Part 1. The Pathfinder project, Methods Inform. Med. 31 (1992) 90–105.
[5] P. Haddawy, C.E. Kahn, M. Butarbutar Jr., A Bayesian network model for radiological diagnosis and procedure selection: work-up of suspected gallbladder disease, Med. Phys. 21 (7) (1994) 1185–1192.
[6] P.W. Hamilton, N. Anderson, P.H. Bartels, D. Thompson, Expert system support using Bayesian belief networks in the diagnosis of fine needle aspiration biopsy specimens of the breast, J. Clin. Path. 47 (4) (1994) 329–336.
[7] R. Montironi, P.H. Bartels, D. Thompson, M. Scarpelli, P.W. Hamilton, Prostatic intraepithelial neoplasia (PIN). Performance of Bayesian belief network for diagnosis and grading, J. Pathol. 177 (2) (1995) 153–162.
[8] C.E. Kahn, L.M. Roberts, K.A. Shaffer, P. Haddawy Jr., Construction of a Bayesian network for mammographic diagnosis of breast cancer, Comput. Biol. Med. 27 (1) (1997) 19–29.
[9] O.K. Hejlesen, S. Andreassen, R. Hovorka, D.A. Cavan, DIAS – the diabetes advisory system: an outline of the system and the evaluation results obtained so far, Comput. Methods Programs Biomed. 54 (1/2) (1997) 49–58.
[10] M. Paul, S. Andreassen, E. Tacconelli, A.D. Nielsen, N. Almanasreh, U. Frank, R. Cauda, L. Leibovici, Improving empirical antibiotic treatment using TREAT, a computerized decision support system: cluster randomized trial, J. Antimicrob. Chemother. 58 (2006) 1238–1245.
[11] S. Andreassen, Medical Decision Support Systems, second ed., Aalborg University Press, Aalborg, Denmark, 2001.
[12] R.J. Zagoria, J.A. Reggia, Transferability of medical decision support systems based on Bayesian classification, Med. Decis. Making 3 (4) (1983) 501–509.
[13] G. Lindberg, R. Seensalu, L.H. Nilsson, P. Forsell, L. Kagar, R.P. Knill-Jones, Transferability of a computer system for medical history taking and decision support in dyspepsia. A comparison of indicants for peptic ulcer disease, Scand. J. Gastroenterol. Suppl. 128 (1987) 190–196.
[14] J. Nolan, P. McNair, J. Brender, Factors influencing the transferability of medical decision support systems, Int. J. Biomed. Comput. 27 (1) (1991) 7–26.
[15] T. Schioler, J. Talmon, J. Nolan, P. McNair, Information technology factors in transferability of knowledge based systems in medicine, Artif. Intell. Med. 6 (2) (1994) 189–201.
[16] J.A. Reggia, Computer-assisted medical decision making: a critical review, Annal. Biomed. Eng. 9 (5–6) (1981) 605–619.
[17] D.J. Spiegelhalter, R.P. Knill-Jones, Statistical and knowledge-based approaches to clinical decision-support systems, with an application in gastroenterology, J. R. Stat. Soc. 147 (1) (1984) 35–77.
[18] Evidence-based Medicine Working Group, Evidence-based medicine. A new approach to teaching the practice of medicine, J. Am. Med. Assoc. 268 (1992) 2420–2425.
[19] M.R. Tonelli, The limits of evidence-based medicine, Respir. Care 46 (12) (2001) 1435–1440.
[20] D.L. Sackett, W.M.C. Rosenberg, J.A.M. Gray, R.B. Haynes, W.S. Richardson, Evidence based medicine: what it is and what it isn’t, Br. Med. J. 312 (1996) 71–72.
[21] F.G. Smith, J.L. Tong, J.E. Smith, Evidence-based medicine, continuing education in anaesthesia, Crit. Care Pain 6 (4) (2006) 148–151.
[22] J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann Publishers, Inc., San Mateo, California, 1988.
[23] J.H. Kim, J. Pearl, A computational model for combined causal and diagnostic reasoning in inference systems, in: Proceedings of the IJCAI-83, 1983, pp. 190–193.
[24] S.L. Lauritzen, D.J. Spiegelhalter, Local computations with probabilities on graphical structures and their application to expert systems (with discussion), J. R. Stat. Soc. Ser. B 50 (1988) 157–224.
[25] S. Andreassen, M. Woldbye, B. Falck, S.K. Andersen, MUNIN – a causal probabilistic network for interpretation of electromyographic findings, in: Proceedings of the Tenth International Joint Conference on Artificial Intelligence, 1987, pp. 366–372.
[26] D. Heckerman, E.H. Shortliffe, From certainty factors to belief networks, Artif. Intell. Med. 4 (1) (1992) 35–52.
[27] D. Heckerman, An empirical comparison of three inference methods, in: Proceedings of the Fourth Annual Conference on Uncertainty in Artificial Intelligence, 1990, pp. 283–302.
[28] D. Heckerman, R.A. Miller, Towards a better understanding of the INTERNIST-1 knowledge base, in: R. Salamon, B. Blum, M. Jorgenson (Eds.), Proceedings of MEDINFO 86, 1986, pp. 22–26.
[29] K.G. Olesen, U. Kjærulff, F. Jensen, F.V. Jensen, B. Falck, S. Andreassen, S.K. Andersen, A MUNIN network for the median nerve – a case study on loops, Appl. Artif. Intell. 3 (1989) 385–403.
[30] S. Andreassen, B. Falck, K.G. Olesen, Diagnostic function of the microhuman prototype of the expert system – MUNIN, Electroencephalogr. Clin. Neurophysiol. 85 (1992) 143–157.
[31] K.G. Olesen, S. Andreassen, M. Soujanen, Modularizing inference in large causal probabilistic networks, Int. J. Intell. Syst. 18 (2003) 179–191.
[32] S. Andreassen, A. Rosenfalck, B. Falck, K.G. Olesen, S.K. Andersen, Evaluation of the diagnostic performance of the expert EMG assistant MUNIN, Electroencephalogr. Clin. Neurophysiol. 101 (1996) 129–144.
[33] M.P. Wellman, J.S. Breese, R.P. Goldman, From knowledge bases to decision models, Knowl. Eng. Rev. 7 (1992) 35–53.
[34] S. Andreassen, L. Leibovici, M. Paul, A.D. Nielsen, A. Zalounina, L.E. Kristensen, K. Falborg, B. Kristensen, U. Frank, H.C. Schønheyder, A probabilistic network for fusion of data and knowledge in clinical microbiology, in: D. Husmeier, R. Dybowsky, S. Roberts (Eds.), Probabilistic Modeling in Bioinformatics and Medical Informatics, Springer, London, 2005, pp. 451–452.
[35] L. Leibovici, M. Fishman, H.C. Schønheyder, C. Riekehr, K. Kristensen, I. Shraga, S. Andreassen, A causal probabilistic network for optimal treatment of bacterial infections, IEEE Trans. Knowl. Data Eng. 12 (2000) 517–528.
[36] J.S. Breese, R.P. Goldman, M.P. Wellman, Introduction to the special section on knowledge-based construction of probabilistic and decision models, IEEE Trans. Syst. Man Cybernet. 24 (11) (1994) 1577–1579.
[37] D. Poole, Probabilistic horn abduction and Bayesian networks, Artif. Intell. 64 (1) (1993) 81–129.
[38] P. Haddawy, Generating Bayesian networks from probability logic knowledge bases, in: Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, 1994, pp. 262–269.
[39] M. Jaeger, Relational Bayesian networks, in: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997, pp. 266–273.
[40] D. Koller, Probabilistic relational models, in: S. Džeroski, P. Flach (Eds.), Proceedings of the Ninth International Workshop on Inductive Logic Programming, vol. 1634 of Lecture Notes in Artificial Intelligence, 1999, pp. 3–13 (Invited paper).
[41] D. Heckerman, C. Meek, D. Koller, Probabilistic entity-relationship models, PRMs, and plate models, in: Working Notes of the ICML-2004 Workshop on Statistical Relational Learning and Connections to Other Fields, 2004, pp. 55–60.
[42] D. Koller, A. Pfeffer, Object-oriented Bayesian networks, in: D. Geiger, P. Shenoy (Eds.), Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence, 1997, pp. 302–313.
[43] K.B. Laskey, S.M. Mahoney, Network fragments: representing knowledge for construction of probabilistic models, in: D. Geiger, P. Shenoy (Eds.), Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence, 1997, pp. 334–341.
[44] O. Bangsø, P.H. Wuillemin, Top-down construction and repetitive structures representation in Bayesian networks, in: J. Etheredge, B. Manaris (Eds.), Proceedings of the Thirteenth International Florida Artificial Intelligence Research Society Conference, 2000, pp. 282–286.
[45] O. Bangsø, Object oriented Bayesian networks, Ph.D. thesis, Aalborg University, Aalborg, Denmark, 2004.
[46] W.L. Buntine, Operations for learning with graphical models, J. Artif. Intell. Res. 2 (1994) 159–225.
[47] L.J. Dorfman, L.R. Robinson, AAEM Minimonograph #47: normative data in electrodiagnostic medicine, Muscle Nerve 20 (1) (1997) 4–14.
[48] W.W. Campbell, L.R. Robinson, Deriving reference values in electrodiagnostic medicine, Muscle Nerve 16 (4) (1993) 424–428.
[49] J. Kimura, Electrodiagnosis in Diseases of Nerve and Muscle: Principles and Practice, Oxford University Press, Oxford, 2001.
[50] S. Andreassen, Knowledge representation by extended linear model, in: E. Keravnou (Ed.), Deep Models for Medical Knowledge Engineering, Elsevier, Amsterdam, 1992, pp. 129–145.
[51] K.G. Olesen, S. Andreassen, Specification of models in large expert systems based on causal probabilistic networks, Artif. Intell. Med. 5 (1993) 269–281.
[52] M.J. Pratt, Introduction to ISO 10303 – the STEP standard for product data exchange, J. Comput. Inform. Sci. Eng. 1 (1) (2001) 102–103.
[53] M.J. Pratt, B.D. Anderson, T. Ranger, Towards the standardized exchange of parameterized feature-based CAD models, Comput.-Aid. Design 37 (12) (2005) 1251–1265.
[54] C. Shih, B. Anderson, A design/constraint model to capture design intent, in: Proceedings of the Fourth ACM Symposium on Solid Modeling and Applications, 1997, pp. 255–264.
[55] G. Booch, Object-Oriented Analysis and Design with Applications, Benjamin/Cummings, Redwood City, CA, 1994.
[56] B. Meyer (Ed.), Object-Oriented Software Construction, second ed., Prentice Hall, Santa Barbara, 1997.
[57] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, Boston, 1995.
[58] A.L. Madsen, F. Jensen, U.B. Kjærulff, M. Lang, The Hugin tool for probabilistic graphical models, Int. J. Artif. Intell. Tools 14 (3) (2005) 507–543.
