Water reservoir control under economic, social and environmental constraints


Automatica 44 (2008) 1595–1607 www.elsevier.com/locate/automatica

Water reservoir control under economic, social and environmental constraints☆

Andrea Castelletti∗, Francesca Pianosi, Rodolfo Soncini-Sessa

Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milan, Italy

Received 31 March 2007; received in revised form 28 March 2008; accepted 31 March 2008. Available online 5 May 2008

Abstract

Although great progress has been made in the last 40 years, efficient operation of water reservoir systems still remains a very active research area. The combination of multiple water uses, non-linearities in the model and in the objectives, strong uncertainties in inputs and high dimensional state make the problem challenging and intriguing. The purpose of this paper is to review, in a strict Control Theory perspective, recent and significant advances in designing management policies for water reservoir networks, under economic, social and environmental constraints. A general and thorough problem formulation is provided, along with a description of traditional solution techniques, their limitations and possible alternative approaches. © 2008 Elsevier Ltd. All rights reserved.

Keywords: Stochastic control; Nonlinear control; Multiobjective optimisation; Multipurpose water reservoirs; Uncertain dynamic systems

1. Introduction

Accounting for almost 20% of the World’s electrical output, hydropower is currently the World’s largest renewable source of electricity. Although it typically costs more per kWh than burning coal, oil or natural gas, hydropower is basically a non-polluting way of producing electricity: hydroplants do not emit any of the standard atmospheric pollutants, such as carbon dioxide or sulphur dioxide, that contribute to global warming and acid rain. It can therefore be a valuable contribution to meeting the Kyoto requirements on carbon emission reduction.¹ In addition, plant operating costs are usually low because there are no fuel costs, and maintenance requirements are minimal: hydropower is essentially inflation proof.

☆ This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Editor Alain Haurie.

∗ Corresponding author. Tel.: +39 2 23999601; fax: +39 2 23999611. E-mail addresses: [email protected] (A. Castelletti), [email protected] (F. Pianosi), [email protected] (R. Soncini-Sessa).

¹ This assertion is questioned by a few recent studies, e.g. Fearnside (2004), on large tropical reservoirs. They suggested that decaying vegetation, submerged by flooding, may give off quantities of greenhouse gases equivalent to those from other sources of electricity, and that consequently hydropower would not, after all, be a panacea for climate change. This is still the focus of active debate (Rosa, Santos, Matvienko, Santos, & Sikar, 2004).

0005-1098/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2008.03.003

However, hydropower is not without economic, social and environmental impacts. By significantly altering water levels and downstream water flow patterns, the normal operation of hydropower storage systems, namely reservoir networks, can have very negative effects on a range of economic interests (irrigated agriculture, fisheries, forestry, etc.), on the local human population (potable water supply, flooding, navigation and recreation activities) and on the flora and fauna that inhabit the surrounding areas and downstream water bodies. Typically, hydropower-irrigation conflicts may arise as a consequence of the difference in seasonal timing of power demand patterns and irrigation water needs, the former having their highest peak in winter and the latter having their greatest value in the growing summer season. Another usual effect of the spring-to-winter water volume reallocation operated by hydropower reservoirs is the increased risk of flooding on reservoir shores during the spring, snow-melt driven, floods. Superimposed on these seasonal conflicts are short-term, even hourly, fluctuations in downstream flows, in response to changing daily demands for hydropower (usually due to hydropeaking), which inevitably result in a conflict between hydropower and the downstream environment: the existing flow pattern of the river is disrupted, and along with this, all the habitats and species that depend on those patterns are endangered.


Fortunately, a change in water use priority away from power generation exclusively towards a multi-purpose, multi-stakeholder and integrated perspective (see GWP-Global Water Partnership (2000)) has been underway in many parts of the world for several decades, driven by a growing consciousness of potentially disastrous effects of a global ‘water crisis’ (Brown, 2001) and the increasing opposition worldwide to new large storage projects (McCully, 2001). In many systems (see for instance Kotchen, Moore, Lupi, and Rutherford (2006) and references therein), current regulation agreements are being relicensed with explicit inclusion of economic, social and environmental constraints on hydropower operations, e.g. by imposing regulation ranges and minimum environmental flows on downstream rivers. The existence of multiple water uses, often unplanned in the original design of the hydropower storage system, greatly complicates its current operation: the potential number of operational alternatives multiplies and existing conflicts and competitions make it more complicated to decide which one to adopt. This partly explains why many systems worldwide are failing to produce the level of performance for which they were designed (World Commission on Dams, 2000). Although only 10% of the world's hydropower potential is currently being exploited (Khagram, 2004), the combination of few additional sites, huge construction costs, reduction of financial support from the World Bank and the above-mentioned changing water use priorities sharply limits the potential for expansion. Attention must therefore focus on efficient and effective operation of existing multipurpose reservoir networks, with the aim of maximizing their performance with respect to all the water uses involved.
This requires the adoption of a rational approach that considers all the economic, social and environmental interests in a fully integrated manner and allows for systematically selecting more efficient (in the Pareto sense) operational alternatives with respect to these interests. The problem of finding operational alternatives for efficiently managing water reservoir networks has fascinated analysts since the pioneering work of Rippl (1883). However, only towards the end of the 50s was it understood (Maas et al., 1962), under the influence of the budding research areas of Control Theory and Operations Research, that the rational approach to the problem is to formulate operational alternatives in the form of feedback control policies. When this new approach was introduced, the design of a policy appeared, from the system analysts’ viewpoint, a well-posed problem, where the only difficulties were of a computational nature, due to limitations in speed and memory size of the computers available. In the 70s, thanks to a rapid increase in computer performance, more and more complex algorithms were developed (among the others, see Heidari, Chow, Kokotovic, and Meredith (1971), Sniedovich (1979), Su and Deininger (1974), Tauxe, Inman, and Mades (1979), Yeh (1985) and references therein). These algorithms claimed to solve different problems, but actually solved different formulations of the same problem: a Stochastic Optimal Control Problem for a periodic system.

The problem still remains intellectually challenging despite being extensively studied in the last few decades, since its formulation and solution pose a number of intriguing difficulties. Specifically:

(1) Multiple and conflicting interests (objectives). The keystone hypothesis upon which the simple approach of the 70s was based proves unfounded: policy design is not well structured and fully rational. As a consequence, the concept of optimality must be replaced by that of Pareto efficiency: policies have to be designed through the formulation of a multi-objective (MO) problem, whose solution is no longer a mere technical exercise but requires consideration of the preference structure of the parties involved. To separate technical issues from preference aspects, the most commonly adopted strategy is to reduce the MO problem to a set of parametric single-objective (SO) optimal control problems. The solution, as the parameter varies, provides the Pareto Efficient Decision Set from which the Pareto Frontier is derived. Solution techniques from Control Theory are used to solve the SO problems, while the choice of a point on the Frontier requires the adoption of Decision Making methods and, in the presence of multiple decision makers, Negotiation Theory. In this paper we will consider only aspects pertaining to Control Theory, i.e. the formulation and solution of the SO control problem.

(2) The model of the system to be controlled is highly non-linear.

(3) Objective functions are usually non-linear and strongly asymmetric.

(4) Strong uncertainties (e.g. of the inflow) affect the system and cannot be neglected.

(5) The problem is usually formulated over an infinite horizon, since the lifetime of the water system under examination cannot be arbitrarily limited.

(6) Different formulations of the problem are possible: aggregation of step-cost functions over time can be performed by using summation and/or other operators, such as the max; the criterion for filtering uncertainties depends on the risk aversion of the parties involved; the infinite horizon can be managed in different ways, depending on the predominant characteristics of the objectives (economic or physical) and on whether or not transient periods have to be considered.

(7) The policy actuator is a human being, i.e. the policy does not directly control the penstock sluice gates, but proposes a control decision to the regulator. Moreover, according to World Commission on Dams (2000), the ‘non-prescriptive’ nature of the policy is the key to its acceptance by the regulator. Therefore set-valued policies, providing a set of equivalent controls, are preferred to the point-valued policies traditionally employed to control electromechanical systems.

The purpose of this paper is to review, in a strict Control Theory perspective, the most recent advances in designing management policies for water reservoir networks under


economic, social and environmental constraints. Emphasis is given to the technical implications that the very nature of water reservoir systems has for problem formulation and solution, with the aim of clarifying why many traditional control techniques, widely and successfully applied in other fields, are here inapplicable or strongly limited. Note that the term reservoir network is used here to refer to a physical network of catchments, reservoirs, power plants, and other users (including environmental services) that are hydraulically interconnected. Clearly, as a hydropower generator, the reservoir network may be electrically interconnected with other generators within an energy market, but these interactions will not be discussed in this paper. Many authors (see among others Breton, Haurie, and Kalocsai (1978), Thompson, Davison, and Rasmussen (2004) and references therein) discuss this topic; however, they usually consider hydropower production only, while here the truly multi-objective nature of the problem is recognized.

The paper is organized as follows. Section 2 introduces models of water system components, formulates the policy design problem as an MO problem and, finally, transforms it into an SO problem. Section 3 shows how Stochastic Dynamic Programming (SDP) is the most natural algorithm for solving the design problem and why, on the contrary, the well known and appealing LQG approach is totally unsuited. Section 4 presents different approaches that have been proposed to cope with the computational complexity of SDP; attention is given to the particular implication of each of them in the context considered here. Finally, the concluding section offers remarks on the limits of this review and on its neighbouring areas.

2. Formulation of policy design problems

2.1. Model of the system

The system under study is composed of N reservoirs that drain water from M catchments.
Reservoirs are connected with each other and with water users like, for example, power plants or irrigation districts, by a network of natural and artificial canals. Even if the physical processes that are involved in the system are obviously time-continuous, the model is time-discrete since the decision time-step is such. Here we will briefly discuss models of reservoirs, catchments and water users with the aim of highlighting those characteristics that influence the formulation and solution of the control problem, while we will take for granted models of canals, diversion dams and junctions. For a more detailed discussion of the model of the water system, see Soncini-Sessa, Castelletti, and Weber (2007).

2.1.1. Reservoirs

The model of the j-th water reservoir is based on the mass balance equation

$$ s_{t+1}^j = s_t^j + q_{t+1}^j - r_{t+1}^j \tag{1a} $$

where $s_t^j$ is the storage in the j-th reservoir at time $t$, $q_{t+1}^j$ is the inflow volume in the time interval $[t, t+1)$ and $r_{t+1}^j$ is the release in the same interval. Other terms like direct precipitation on the reservoir, infiltration and evaporation have been neglected but they can be added to the mass balance when necessary. The time subscript of each variable denotes the time instant at which it assumes a deterministic value, e.g. lake storage is measured at each time $t$ and thus is denoted with $s_t$, while inflow in the interval $[t, t+1)$ is denoted with $q_{t+1}$ since it can be deterministically known only at the end of the interval.

Inflow $q_{t+1}^j$ is the outflow of a drainage network fed by the releases $r_{t+1}^i$ ($i = 1, \ldots$, $i \neq j$) of the upstream reservoirs (if any) and by the outflows $a_{t+1}^k$ ($k = 1, \ldots$) from natural and uncontrolled catchments; the latter are described in the following section.

Release $r_{t+1}^j$ is a function of the control variable $u_t^j$ (which is the release decision made at time $t$ for reservoir j), of the storage $s_t^j$ and of the inflow $q_{t+1}^j$

$$ r_{t+1}^j = R_t^j(s_t^j, u_t^j, q_{t+1}^j). $$

The function $R_t^j(\cdot)$ is called the release function and it is a non-linear periodic function of the following form

$$ R_t^j(s_t^j, u_t^j, q_{t+1}^j) = \begin{cases} v_t^j(s_t^j, q_{t+1}^j) & \text{if } u_t^j < v_t^j(s_t^j, q_{t+1}^j) \\ V_t^j(s_t^j, q_{t+1}^j) & \text{if } u_t^j > V_t^j(s_t^j, q_{t+1}^j) \\ u_t^j & \text{otherwise} \end{cases} \tag{2} $$

where $v_t^j(\cdot)$ and $V_t^j(\cdot)$ are the minimum and maximum releases that can be produced in the time interval $[t, t+1)$ by keeping all the sluice gates completely closed and completely open respectively. These two functions are computed by integrating, over the interval $[t, t+1)$, the continuous-time mass balance equation of the j-th reservoir

$$ \frac{ds^j}{d\zeta} = q^j(\zeta) - r^j(\zeta) \tag{3} $$

where the instantaneous inflow $q^j(\zeta)$ is supposed to be constant and equal to $q_{t+1}^j/\Delta$, $\Delta$ being the length of the modelling step, and the instantaneous outflow $r^j(\zeta)$ is given by the minimum $N^{\min,j}(s^j(\zeta))$ or maximum $N^{\max,j}(s^j(\zeta))$ storage-discharge relation of the reservoir sluice gates and spillway (Piccardi & Soncini-Sessa, 1991). Introduction of the release function thus allows for the inclusion of physical constraints into the model, which makes it possible for the actual release $r_{t+1}^j$ to differ

from the release decision $u_t^j$, e.g. when the available water is not sufficient to realize the decision or when a spill takes place.

2.1.2. Uncontrolled catchments

Physically-based models for the description of outflows from natural, uncontrolled catchments are difficult to include in the formulation of the reservoir control problem: the state of such models, in fact, is usually not observable and, more importantly, it can be very large (thousands of variables when the model is spatially-distributed), thus leading to computational problems. Therefore simple statistical models are generally adopted; for example, outflow $a_{t+1}^k$ from the k-th uncontrolled catchment can be assumed to be a cyclostationary, lognormal, stochastic process with periodic mean and standard deviation $\mu_t^k$ and $\sigma_t^k$ and its dynamics be described as

$$ a_{t+1}^k = \exp\left( y_{t+1}^k \cdot \sigma_t^k + \mu_t^k \right) \tag{4a} $$

$$ A^k(z^{-1})\, y_{t+1}^k = \varepsilon_{t+1}^k \tag{4b} $$

where $A^k$ is a polynomial in the backward shift operator $z^{-1}$ and $\varepsilon_{t+1}^k$ is a zero mean Gaussian white noise with constant variance.

2.1.3. Water users and environmental, social and economic constraints

The presence of various water users and other social and environmental interests can be formalized either by defining step-cost functions associated with the system’s transitions or by imposing constraints on some variables. Given the variety of issues that can be considered in reservoir control problems, it is not possible to make a general discussion of the topic; in the following we will thus provide only some examples for the most common water users and environmental constraints.

A safeguard of the river and riparian ecosystem downstream from the j-th reservoir can be guaranteed by introducing a minimum environmental flow (MEF) constraint on reservoir release, while conflict and competition among water users can be mitigated by imposing a regulation range on reservoir storage. These constraints are accounted for by suitably modifying the minimum and maximum instantaneous storage-discharge relations $N^{\min,j}(\cdot)$ and $N^{\max,j}(\cdot)$ that are used in the computation of the minimum and maximum release volumes $v_t^j(\cdot)$ and $V_t^j(\cdot)$ (Section 2.1.1). For example, let $\tilde{q}_t^j$ be the MEF value for the time interval $[t, t+1)$ and $(s_t^{\min,j}, s_t^{\max,j})$ the regulation range for the same interval. Then, the minimum instantaneous storage-discharge relation is modified as

$$ \tilde{N}_t^{\min,j}(s^j(\zeta), q_{t+1}^j) = \begin{cases} \dfrac{\tilde{q}_t^j}{\Delta} & \text{if } N^{\min,j}(s^j(\zeta)) \le \min\left\{ \dfrac{\tilde{q}_t^j}{\Delta}, \dfrac{q_{t+1}^j}{\Delta} \right\} \\ N^{\max,j}(s^j(\zeta)) & \text{if } s^j(\zeta) > s_t^{\max,j} \\ N^{\min,j}(s^j(\zeta)) & \text{otherwise.} \end{cases} \tag{5} $$
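The way the storage-discharge relations enter the release function of Section 2.1.1 can be illustrated with a minimal Python sketch. It integrates Eq. (3) with a plain Euler scheme to obtain the minimum and maximum release volumes, then applies Eq. (2) and the mass balance (1a). The linear relations `n_min` and `n_max`, the number of sub-steps and all numerical values are hypothetical and chosen only for illustration; they are not taken from the paper, and the modified relation (5) is deliberately not implemented.

```python
def integrate_release(s0, q, discharge, delta=1.0, n_sub=1000):
    """Euler integration of ds/dz = q/delta - N(s) over one modelling step.

    Returns the volume released in [t, t+1) when the instantaneous
    outflow follows the storage-discharge relation `discharge`.
    """
    dz = delta / n_sub
    s, released = s0, 0.0
    for _ in range(n_sub):
        inflow = (q / delta) * dz
        # the outflow cannot exceed the water available in the sub-step
        out = min(discharge(s) * dz, s + inflow)
        s += inflow - out
        released += out
    return released


def release(s, u, q, n_min, n_max):
    """Release function, Eq. (2): clip the decision u to [v, V]."""
    v = integrate_release(s, q, n_min)  # all gates closed
    V = integrate_release(s, q, n_max)  # all gates open
    return min(max(u, v), V)


def mass_balance(s, u, q, n_min, n_max):
    """Mass balance, Eq. (1a), using the actual release."""
    r = release(s, u, q, n_min, n_max)
    return s + q - r, r


# Hypothetical storage-discharge relations (illustrative only).
def n_min(s):
    # gates closed: only the spillway is active, above storage 100
    return 0.5 * max(s - 100.0, 0.0)


def n_max(s):
    # gates open: gate outflow plus the same spillway term
    return 0.2 * max(s, 0.0) + 0.5 * max(s - 100.0, 0.0)


# A feasible decision is released as is; an excessive one is clipped to V.
next_storage, actual_release = mass_balance(50, 5, 10, n_min, n_max)
clipped_release = release(50, 20, 10, n_min, n_max)
```

A MEF or a regulation range would act in this sketch only through the relation passed as `n_min` (or `n_max`), which is the mechanism the text describes.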

Interests of the hydropower company owning the j-th plant are described by introducing a step-cost function that expresses lost revenue in the interval $[t, t+1)$

$$ l_{t+1}^j = \vartheta_t^j \left( w_t^j - G_{t+1}^j \right)^+ \tag{6a} $$

where $\vartheta_t^j$ is the price of electricity, averaged over interval $[t, t+1)$, $w_t^j$ is the demand for electricity, $G_{t+1}^j$ is electricity production, and the operator $z^+ = \max(z, 0)$ is used. Demand for electricity and its price can be either computed using complex dynamic models (see for example Thompson et al. (2004)) or simply specified as a given scenario. Production $G_{t+1}^j$ is computed as

$$ G_{t+1}^j = \eta^j\, q_{t+1}^{d,j}\, H_t^j \tag{6b} $$

where $\eta^j$ is a unit conversion and efficiency factor, $q_{t+1}^{d,j}$ is the flow in the penstock and $H_t^j$ is the hydraulic head. When the reservoir itself is the pondage of the plant, $H_t^j$ depends on the level of the water surface in the reservoir and thus on its storage $s_t^j$. The flow $q_{t+1}^{d,j}$ does not always coincide with reservoir release due to the presence of a minimum $q^{\min,j}$ and maximum $q^{\max,j}$ flow turbinable in the plant and/or a MEF $\tilde{q}_t^j$

$$ q_{t+1}^{d,j} = \begin{cases} \min\left\{ (r_{t+1}^j - \tilde{q}_t^j)^+,\; q^{\max,j} \right\} & \text{if } (r_{t+1}^j - \tilde{q}_t^j)^+ \ge q^{\min,j} \\ 0 & \text{otherwise.} \end{cases} \tag{6c} $$

The interests of the l-th irrigation district may be described by introducing a step-cost function that expresses the supply deficit

$$ d_{t+1}^l = \left( w_t^l - q_{t+1}^{d,l} \right)^+ \tag{7} $$

where $q_{t+1}^{d,l}$ is the flow supplied to the irrigation district and $w_t^l$ is its water demand. The latter can be specified as a given scenario or it can be the output of a dynamical model of the crop’s growth (Wallach, Makowski, & Jones, 2006).

2.1.4. Global model

The model of the water system is obtained by suitably aggregating models of reservoirs, catchments, water users, canals, diversion dams and junctions that compose it. The result is a discrete-time, periodic, non-linear, stochastic (or uncertain) system of the form

$$ \mathbf{x}_{t+1} = f_t(\mathbf{x}_t, \mathbf{u}_t, \boldsymbol{\varepsilon}_{t+1}) \tag{8} $$

where $\mathbf{x}_t \in \mathbb{R}^{n_x}$, $\mathbf{u}_t \in \mathbb{R}^{n_u}$ and $\boldsymbol{\varepsilon}_t \in \mathbb{R}^{n_\varepsilon}$ are the state, control and disturbance vectors. The state is composed of state variables of the N reservoirs, i.e. their storage, state variables of the M catchments, and, when applicable, the state of the canals and water users

$$ \mathbf{x}_t = [s_t^1, \ldots, s_t^N;\; y_t^1, \ldots, y_{t-p_1}^1;\; \ldots;\; y_t^M, \ldots, y_{t-p_M}^M, \ldots]^T \tag{9} $$

where $p_k$ is the order of polynomial $A^k(z^{-1})$ in Eq. (4b). The control vector is composed of N release decisions for N reservoirs, $\mathbf{u}_t = [u_t^1, \ldots, u_t^N]^T$. The disturbance vector is composed of M random disturbances that appear in models of uncontrolled catchments and any other random variable that could be used to describe random terms in the reservoir mass balance equation (e.g. evaporation, infiltration, etc.) or in the model of canals and water users. For example, if uncontrolled catchments are described with models of the form (4b) and no other disturbance affects the water system, the disturbance vector is given by

$$ \boldsymbol{\varepsilon}_{t+1} = [\varepsilon_{t+1}^1, \ldots, \varepsilon_{t+1}^M]^T. $$

Depending on how scalar disturbances have been modelled, the disturbance vector $\boldsymbol{\varepsilon}_{t+1}$ is either uncertain or stochastic and is described in terms of a membership-set $\Xi_t$ or a pdf $\phi_t(\cdot)$ respectively. At each time $t$, either $\Xi_t$ or $\phi_t(\cdot)$ may be a function of the state and control at the same time

$$ \boldsymbol{\varepsilon}_{t+1} \sim \phi_t(\,\cdot\,|\mathbf{x}_t, \mathbf{u}_t) \quad \text{or} \quad \boldsymbol{\varepsilon}_{t+1} \in \Xi_t(\mathbf{x}_t, \mathbf{u}_t). \tag{10} $$
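As a concrete instance of the global model (8), the following Python sketch assembles a single reservoir fed by a single catchment, so that the state is the pair (storage, log-inflow residual) and the disturbance is scalar. Everything numerical here is an assumption made for illustration: the period `T`, the periodic parameters `MU` and `SIGMA`, the first-order polynomial with coefficient `ALPHA`, the capacity `S_MAX`, and the crude minimum and maximum releases (forced spill above capacity, and all available water) that stand in for the integration of Eq. (3).

```python
import math
import random

T = 12  # period of the cyclostationary model, e.g. months (hypothetical)

# Hypothetical periodic mean and standard deviation of the log-inflow.
MU = [2.0 + 0.5 * math.cos(2 * math.pi * t / T) for t in range(T)]
SIGMA = [0.3] * T
ALPHA = 0.6    # A(z^-1) = 1 - ALPHA z^-1, i.e. order p = 1 (assumption)
S_MAX = 200.0  # storage capacity; water above it is spilled (assumption)


def transition(t, x, u, eps):
    """One step of x_{t+1} = f_t(x_t, u_t, eps_{t+1}) with x = (s, y)."""
    s, y = x
    y_next = ALPHA * y + eps                         # Eq. (4b), AR(1)
    a = math.exp(y_next * SIGMA[t % T] + MU[t % T])  # Eq. (4a)
    v = max(s + a - S_MAX, 0.0)  # minimum release: forced spill
    V = s + a                    # maximum release: all available water
    r = min(max(u, v), V)        # release function, Eq. (2)
    s_next = s + a - r           # mass balance, Eq. (1a)
    return (s_next, y_next), r


def simulate(x0, policy, horizon, seed=0):
    """Simulate the controlled system under a feedback law u = policy(t, x)."""
    rng = random.Random(seed)
    x, trajectory = x0, []
    for t in range(horizon):
        u = policy(t, x)
        x, r = transition(t, x, u, rng.gauss(0.0, 1.0))
        trajectory.append((x[0], r))
    return trajectory


# Constant release decision of 12 volume units per step, two periods.
trajectory = simulate((100.0, 0.0), lambda t, x: 12.0, 2 * T)
```

Because the release function clips infeasible decisions, the simulated storage stays within its physical bounds whatever value the policy proposes.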

2.2. The control problem

For each of the m issues present in the system (water demand for hydropower production and/or irrigation, flood control, respect of environmental quality standards, etc.) an objective function $J^i$ (with $i = 1, \ldots, m$) can be defined to express the cost paid by the i-th sector over the time horizon $[0, h]$,

$$ J^i = \Psi_{\boldsymbol{\varepsilon}_1, \ldots, \boldsymbol{\varepsilon}_h} \Big[ \Phi\big( g_0^i(\mathbf{x}_0, \mathbf{u}_0, \boldsymbol{\varepsilon}_1), \ldots, g_{h-1}^i(\mathbf{x}_{h-1}, \mathbf{u}_{h-1}, \boldsymbol{\varepsilon}_h), g_h^i(\mathbf{x}_h) \big) \Big] \tag{11} $$

where $g_t^i(\cdot)$ for $t = 0, \ldots, h-1$ are step-cost functions associated with transitions from $t$ to $t+1$, $g_h^i(\cdot)$ is a penalty function over the final state, $\Phi$ is an operator for aggregation over time and $\Psi$ is a statistic used to filter the disturbance. Examples of step-cost function are Eqs. (6a) and (7). Common choices for aggregation over time are the sum ($\Phi = \Sigma$) and the maximum ($\Phi = \max$). As for the filtering operator, the expected value is often used ($\Psi = E$); however the maximum ($\Psi = \max$) is preferred when stakeholders are risk averse (Orlovski, Rinaldi, & Soncini-Sessa, 1983, 1984; Soncini-Sessa, Zuleta, & Piccardi, 1991). Only two combinations of these operators are of interest for practical applications: $\Psi = E$ and $\Phi = \Sigma$ (the so-called Laplace problem) and both $\Psi$ and $\Phi$ equal to the maximum operator (Wald problem). Thus in the following we will consider only these two cases.

At each time step, the release decision for each reservoir is given by the control law

$$ \mathbf{u}_t = m_t(\mathbf{x}_t). \tag{12} $$

The scope of the control problem is to define the sequence of control laws $m_t(\cdot)$ over the horizon $[0, h-1]$, i.e. the release policy

$$ p = [m_0(\cdot), \ldots, m_{h-1}(\cdot)]. \tag{13} $$

Therefore the multi-objective (MO) control problem is formulated as

$$ \min_p \left[ J^1, J^2, \ldots, J^m \right] \tag{14} $$

subject to the constraints (8), (10), (12) and (13) and given $\mathbf{x}_0$. The pdf formulation in Eq. (10) is used when the filtering criterion $\Psi$ in Eq. (11) is the expected value; the membership-set formulation is used when $\Psi$ is the maximum. Note that the control variable is unconstrained because infeasible decisions are transformed into feasible releases by the release function (2) in the reservoir model.

As anticipated in the introduction, an ‘optimal’ solution to the control problem, i.e. a policy $p^*$ that minimizes all the objectives, does not generally exist. In a multi-objective framework, the solution is constituted by the set $\mathcal{P}$ of Pareto efficient policies (see for instance Miettinen (1999)). Each efficient policy in $\mathcal{P}$ can be computed by solving the following single objective (SO) optimal control problem

$$ \min_p J \tag{15} $$

subject to the constraints (8), (10), (12) and (13) and given $\mathbf{x}_0$, where $J$ is derived from $J^1, J^2, \ldots, J^m$ with a suitable method (see for instance Lotov, Bushenkov, and Kamenev (2004)) and is of the form

$$ J = \Psi_{\boldsymbol{\varepsilon}_1, \ldots, \boldsymbol{\varepsilon}_h} \Big[ \Phi\big( g_0(\mathbf{x}_0, \mathbf{u}_0, \boldsymbol{\varepsilon}_1), \ldots, g_{h-1}(\mathbf{x}_{h-1}, \mathbf{u}_{h-1}, \boldsymbol{\varepsilon}_h), g_h(\mathbf{x}_h) \big) \Big] \tag{16} $$

where $g_t(\cdot)$ and $g_h(\cdot)$ are the aggregate step-cost and penalty functions obtained from $g_t^i(\cdot)$ and $g_h^i(\cdot)$ (with $i = 1, \ldots, m$) according to the aggregation method used to trace back the MO problem to an SO problem. The choice of the method is constrained by the formulation of the problem that has been adopted, and in particular by the choice of the filtering operator $\Psi$. Note that the number of efficient alternatives is generally infinite and thus only a finite subset of $\mathcal{P}$ can actually be computed.

In the context of environmental system management, the choice of the length of the time horizon and of the penalty function $g_h(\mathbf{x}_h)$ is critical since the lifetime of the system is obviously infinite. Therefore it is more convenient to use an infinite horizon and let $g_h(\cdot) = 0$. If the model of the system and all the step-cost functions are cyclostationary with period $T$, the problem is well-posed and its solution is a periodic policy. The SO problem over an infinite horizon is thus formulated as

$$ \min_p \lim_{h \to \infty} J \tag{17} $$

subject to (8), (10) and (12), given $\mathbf{x}_0$ and

$$ p = [m_0(\cdot), \ldots, m_{T-1}(\cdot)] \tag{18} $$

instead of (13). Note that if the aggregation over time consists of summing the step-costs, i.e. $\Phi = \Sigma$ in Eq. (16), the objective function must be adjusted in order to avoid divergence because it is not guaranteed that the controlled system will converge to a stable cycle where all costs are zero. To overcome this difficulty, the objective function can be defined as the Total Discounted Cost (TDC)

$$ J = \lim_{h \to \infty} \Psi_{\boldsymbol{\varepsilon}_1, \ldots, \boldsymbol{\varepsilon}_h} \left[ \sum_{t=0}^{h} \gamma^t g_t(\mathbf{x}_t, \mathbf{u}_t, \boldsymbol{\varepsilon}_{t+1}) \right] \tag{19} $$

with $0 < \gamma < 1$, or as the Average Expected Value (AEV)

$$ J = \lim_{h \to \infty} \Psi_{\boldsymbol{\varepsilon}_1, \ldots, \boldsymbol{\varepsilon}_h} \left[ \frac{1}{h+1} \sum_{t=0}^{h} g_t(\mathbf{x}_t, \mathbf{u}_t, \boldsymbol{\varepsilon}_{t+1}) \right]. \tag{20} $$

The TDC form gives more weight to short-term, transient conditions and is well suited for expressing economic costs. The AEV, instead, gives emphasis to steady-state conditions and is more suitable when social or environmental costs are considered.
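The difference between the formulations discussed above can be made tangible with a short Python sketch that evaluates them on simulated step-cost sequences. It is only an illustration under stated assumptions: the expected value is approximated by the sample mean over a handful of equally likely scenarios, the limits in Eqs. (19) and (20) are truncated to a finite horizon, and the cost values are invented.

```python
from statistics import mean


def laplace_cost(scenarios):
    """Psi = E, Phi = sum: sample mean of the summed step-costs."""
    return mean(sum(costs) for costs in scenarios)


def wald_cost(scenarios):
    """Psi = max, Phi = max: largest step-cost over the worst scenario."""
    return max(max(costs) for costs in scenarios)


def total_discounted_cost(costs, gamma):
    """Finite-horizon truncation of the sum in Eq. (19)."""
    return sum(gamma ** t * g for t, g in enumerate(costs))


def average_cost(costs):
    """Finite-horizon truncation of the average in Eq. (20)."""
    return sum(costs) / len(costs)


# Invented step-cost sequences, one list per disturbance scenario.
scenarios = [[0.0, 2.0, 1.0], [4.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

expected_total = laplace_cost(scenarios)
worst_case = wald_cost(scenarios)

# The same cost paid early or late: discounting tells them apart,
# averaging does not.
early, late = [4.0, 0.0, 0.0], [0.0, 0.0, 4.0]
tdc_early = total_discounted_cost(early, 0.5)
tdc_late = total_discounted_cost(late, 0.5)
```

In this sketch the discounted cost of the early sequence is four times that of the late one, while their averages coincide, which mirrors the remark that TDC emphasises transient conditions and AEV steady-state ones.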


3. Solving the control problem

3.1. Stochastic dynamic programming

Stochastic Dynamic Programming (SDP) (Bellman, 1957) appears to be the most suitable method for solving problem (15). The first application of (deterministic) dynamic programming to water systems management is probably due to Hall and Buras (1961). Since then, the method has been applied with success to the control of reservoirs, especially for hydropower production (see, among others, Esogbue (1989), Fults and Hancock (1972), Hall, Butcher, and Esogbue (1968), Heidari et al. (1971), Trott and Yeh (1973), Turgeon (1980)). Beginning in the early 1980s, interest also spread in the stochastic version of dynamic programming for the control of multi-purpose reservoirs and networks of reservoirs (see the reviews Yakowitz (1982), Yeh (1985) and the contributions Gilbert and Shane (1982), Hooper, Georgakakos, and Lettenmaier (1991), Read (1989), Tejada-Guibert, Johnson, and Stedinger (1995), Vasiliadis and Karamouz (1994)). Note that here, under the name of stochastic dynamic programming, we also consider the extension proposed by Piccardi (1993a,b) to the uncertain case.

One of the reasons for the success of SDP lies in its wide applicability. In fact, the only conditions required for its application are: (1) inputs in the model be either controls or random disturbances, i.e. it is not possible to consider uncontrolled, exogenous, deterministic variables whose values are known in real time (e.g. rainfall measures); (2) the membership-set or the pdf of the disturbance vector be of the form (10), i.e. that the disturbance process be independent in time or that, at time $t$, any dependency on the past could be completely accounted for by the value of the state at the same time; and (3) step-cost functions $g_t(\cdot)$ only depend on variables defined for the same time interval.
The Bellman equation for the SO finite horizon optimal control problem (15) is (Bertsekas, 1976; Piccardi, 1993a)

$$ H_t(\mathbf{x}_t) = \min_{\mathbf{u}_t} \Psi_{\boldsymbol{\varepsilon}_{t+1}} \Big[ \Phi\big( g_t(\mathbf{x}_t, \mathbf{u}_t, \boldsymbol{\varepsilon}_{t+1}), H_{t+1}(\mathbf{x}_{t+1}) \big) \Big] \tag{21} $$

where $H_t(\cdot)$ is the optimal cost-to-go for the aggregate objective and only the following combinations of operators $\Phi$ and $\Psi$ are considered

$$ \Phi[v, w] = v + w \ \text{ and } \ \Psi = E $$

$$ \Phi[v, w] = \max\{v, w\} \ \text{ and } \ \Psi = \max. $$

The solution is computed by initializing $H_h(\mathbf{x}_h)$ with $g_h(\mathbf{x}_h)$ and recursively computing $H_t(\mathbf{x}_t)$ with Eq. (21). Once the optimal cost-to-go has been computed for all time instants $t = h-1, \ldots, 0$, the optimal control law at any time $t$ is derived as

$$ m_t(\mathbf{x}_t) = \arg\min_{\mathbf{u}_t} \Psi_{\boldsymbol{\varepsilon}_{t+1}} \Big[ \Phi\big( g_t(\mathbf{x}_t, \mathbf{u}_t, \boldsymbol{\varepsilon}_{t+1}), H_{t+1}(\mathbf{x}_{t+1}) \big) \Big]. \tag{22} $$

Thus it is a look-up table in which each state value $\mathbf{x}_t$ is associated with the optimal control value $\mathbf{u}_t$.

Practical computation of (21) requires that the sets $S_{x_t}$, $S_{u_t}$ and $S_{\varepsilon_t}$ of state, control and disturbance variables be finite at each time $t$. If this is not the case, the sets $S_{x_t}$, $S_{u_t}$ and $S_{\varepsilon_t}$ must be discretized and the model replaced by the corresponding automaton. Uniform discretization is suitable when no information is available about the form of the optimal cost-to-go function $H_t(\cdot)$. Intuition is confirmed by some numerical analysis results (Cervellera & Muselli, 2004), which show that the error in estimation of $H_t(\cdot)$, given the values that it assumes in $P$ points $(\mathbf{x}_t^i, H_t(\mathbf{x}_t^i))$ with $\mathbf{x}_t^i \in S_{x_t}$, is proportional to an index, called the discrepancy index, which expresses the minimum density of the points $\mathbf{x}_t^i$ among all subsets of $S_{x_t}$. For fixed $P$, uniform discretization has a low discrepancy index and thus produces a low estimation error. However, when adopting a uniform grid, $P = N_{x_t}^{n_x}$ and thus the number of points $P$ cannot be increased continuously and the distance between two successive values of $P$ increases exponentially with $n_x - 1$. Methods have been developed (Fang & Wang, 1994; Niederreiter, 1992) to iteratively produce non-uniform discretizations whose discrepancy index decreases polynomially with $P$ (low-discrepancy sequences).

When an infinite horizon is considered, the idea is still to recursively solve Eq. (21); however the algorithm is started at time $t = 0$ with a suitable initialization for $H_0(\mathbf{x}_0)$ and it continues backwards in time until the optimal cost-to-go function converges to a periodic function of period $T$. Initialization can be arbitrarily chosen when $\Psi = E$, while it must be equal to

$$ H_0(\mathbf{x}_0) = \inf_{\mathbf{x}_t \in S_{x_t},\, \mathbf{u}_t \in S_{u_t},\, \boldsymbol{\varepsilon}_{t+1} \in S_{\varepsilon_{t+1}}} g_t(\mathbf{x}_t, \mathbf{u}_t, \boldsymbol{\varepsilon}_{t+1}) $$

when $\Psi = \max$. If the TDC formulation (19) is used, the operator $\Phi[\cdot, \cdot]$ in the Bellman equation (21) must be defined as $\Phi[v, w] = v + \gamma w$, which guarantees that $H_t(\cdot)$ does not diverge. If instead the AEV formulation (20) is used, it is not possible to avoid divergence of $H_t(\cdot)$ if it is recursively computed with Eq. (21). To overcome this difficulty the idea is to replace $H_t(\mathbf{x}_t)$ with the difference between $H_t(\mathbf{x}_t)$ and the cost-to-go $H_t(\bar{\mathbf{x}}_t)$ of a reference state $\bar{\mathbf{x}}_t$. Based on this idea, the Successive Approximation Algorithm (ASA) has been proposed for both the stationary (White, 1963) and the cyclostationary (Su & Deininger, 1972) cases. Asymptotic convergence of both algorithms is guaranteed under suitable conditions (see Bertsekas (1976) for the stochastic case, Piccardi (1993a) for the uncertain one) which are always satisfied by real world water systems.

The main limit of SDP is its computational complexity. Let $N_{x_t}$, $N_{u_t}$ and $N_{\varepsilon_t}$ be the number of elements, along each dimension, in the discretized state, control and disturbance sets $S_{x_t} \subset \mathbb{R}^{n_x}$, $S_{u_t} \subset \mathbb{R}^{n_u}$ and $S_{\varepsilon_t} \subset \mathbb{R}^{n_\varepsilon}$: the recursive resolution of (21) for $K$ iteration steps (with $K = h$ if the optimization horizon is finite and $K = kT$ if the horizon is infinite, where $T$ is the period and $k$ is usually lower than ten) requires

$$ K \cdot \left( N_{x_t}^{n_x} \cdot N_{u_t}^{n_u} \cdot N_{\varepsilon_t}^{n_\varepsilon} \right) \tag{23} $$

evaluations of the operator $\Phi[\cdot, \cdot]$ in (21). Eq. (23) shows the so-called curse of dimensionality, i.e. exponential growth of computational complexity with the state and control dimension.
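The backward recursion (21)-(22) on finite sets, and the count (23), can be sketched in a few lines of Python. The sketch below is not the authors' implementation: the function names, the toy reservoir (five storage levels, three release decisions, two equally likely inflows, a squared-deficit cost with demand 2) and the grid sizes used in the count are all hypothetical and serve only to show the structure of the algorithm for the Laplace and Wald formulations.

```python
def sdp(states, controls, disturbances, step_cost, transition,
        horizon, final_cost, risk_averse=False):
    """Backward recursion (21)-(22) on finite sets.

    disturbances: list of (value, probability) pairs.
    risk_averse=False -> Phi = sum, Psi = E   (Laplace problem)
    risk_averse=True  -> Phi = max, Psi = max (Wald; probabilities unused)
    Returns the cost-to-go at t = 0 and one look-up table per time step.
    """
    H = {x: final_cost(x) for x in states}
    policy = []
    for t in reversed(range(horizon)):
        H_new, m_t = {}, {}
        for x in states:
            best_u, best_val = None, float("inf")
            for u in controls:
                vals = []
                for e, p in disturbances:
                    g = step_cost(t, x, u, e)
                    h_next = H[transition(t, x, u, e)]
                    vals.append(max(g, h_next) if risk_averse
                                else p * (g + h_next))
                val = max(vals) if risk_averse else sum(vals)
                if val < best_val:
                    best_u, best_val = u, val
            H_new[x], m_t[x] = best_val, best_u
        H = H_new
        policy.insert(0, m_t)
    return H, policy


def n_evaluations(K, N_x, n_x, N_u, n_u, N_eps, n_eps):
    """Number of evaluations of Phi according to Eq. (23)."""
    return K * N_x ** n_x * N_u ** n_u * N_eps ** n_eps


# Toy single-reservoir automaton (all values hypothetical).
states = list(range(5))               # storage levels 0..4
controls = list(range(3))             # release decisions 0..2
disturbances = [(0, 0.5), (2, 0.5)]   # inflow values and probabilities


def toy_release(x, u, e):
    return min(u, x + e)              # cannot release more than available


def toy_transition(t, x, u, e):
    return min(x + e - toy_release(x, u, e), 4)  # excess water is spilled


def toy_cost(t, x, u, e):
    return (2 - toy_release(x, u, e)) ** 2       # squared supply deficit


H0, policy = sdp(states, controls, disturbances, toy_cost, toy_transition,
                 horizon=12, final_cost=lambda x: 0.0)
```

With 20 state, 10 control and 10 disturbance values per dimension and K = 365, `n_evaluations` gives 730,000 evaluations for one reservoir with one control and one disturbance, and about 2.9 × 10^12 when all three dimensions are raised to three.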


It follows that SDP cannot be applied to water systems where the number of reservoirs is greater than a few units.

3.2. Set-valued control policy

Note that Eq. (22) might have more than one solution. If this is the case, the set $M_t$ of all solutions of (22) can be computed. This set contains all the equivalent optimal controls and is a function of the state; it is thus a set-valued control law, and the sequence $P = [M_0(\cdot), \ldots, M_{h-1}(\cdot)]$ can be proved to be the optimal set-valued policy. Aufiero, Soncini-Sessa, and Weber (2001, 2002) prove that $P$ is the ‘largest’ set-valued policy that solves problem (15). Determining the general set-valued policy requires almost the same computing time as determining a point-valued policy, and it can prove to be much more effective for a reservoir control problem. In fact, uniqueness of the solution is not only unnecessary, since the control is supposed to be implemented by a human regulator, but it is not even desirable: leaving the regulator the possibility of choosing a control in $M_t$ is preferable, since in this way (s)he can consider other information that is available when the release decision is taken (e.g. down-time periods of some plant) but that has not been included in the model of the system when formulating the control problem. Adopting a set-valued policy approach also turns out to be particularly useful when some priority among the objectives can be established a priori (e.g. according to national regulations). In this event, the optimal control problem (15) can be reformulated by decomposing it into a hierarchy of $q$ (with $q \le m$) single- and/or multi-objective subproblems (lexicographic approach), each of which is formulated by taking as its feasible control set the optimal set-valued policy obtained by solving the problem at the next higher level in the hierarchy. A numerical implementation of the lexicographic approach can be found in Weber, Rizzoli, Soncini-Sessa, and Castelletti (2002).
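As an illustration of the two ideas above, the following sketch builds a set-valued control law from a table of costs and then refines it with a lower-priority objective, in the spirit of the lexicographic approach. It is not the implementation of the cited works: the cost tables stand in for the quantity minimised in Eq. (22), and all names, states and numbers are hypothetical.

```python
def set_valued_law(costs, tol=1e-9):
    """Map each state to the set M(state) of all cost-minimising controls."""
    law = {}
    for state, cost_by_control in costs.items():
        best = min(cost_by_control.values())
        law[state] = {u for u, c in cost_by_control.items() if c <= best + tol}
    return law


def lexicographic_refine(law, secondary_costs, tol=1e-9):
    """Within each set M(state), keep only the controls that are also optimal
    for a lower-priority objective."""
    refined = {}
    for state, controls in law.items():
        best = min(secondary_costs[state][u] for u in controls)
        refined[state] = {u for u in controls
                          if secondary_costs[state][u] <= best + tol}
    return refined


# Hypothetical costs (primary objective) for two storage states and three releases.
primary = {"low": {0: 3.0, 10: 3.0, 20: 5.0},
           "high": {0: 4.0, 10: 2.5, 20: 2.5}}
# Hypothetical costs for a lower-priority objective.
secondary = {"low": {0: 1.0, 10: 0.5, 20: 0.2},
             "high": {0: 1.0, 10: 0.5, 20: 0.2}}

M = set_valued_law(primary)
M_refined = lexicographic_refine(M, secondary)
```

Here `M` leaves the regulator two equivalent releases in each state, while `M_refined` narrows the choice using the second objective only among controls that are already optimal for the first.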


3.3. Linear Quadratic Gaussian control

If the system were linear and the cost function quadratic, the well known results of Linear Quadratic Gaussian (LQG) control could be used to solve the optimal control problem. Some authors (see for instance McLaughlin and Velasco (1990), Ozelkan, Galambosi, Fernandes, and Duckstein (1997), Wasimi and Kitanidis (1983)) have followed this approach. In order to obtain a linear model, they simplify the reservoir’s mass balance equation by making the release coincide with the release decision and express it in terms of deviations of storage, control/release and inflow from some pre-computed nominal values. In order to obtain a quadratic cost function, for any $t$ they either replace the step-cost function $g_t(\cdot)$ with its second-order Taylor expansion around the same nominal values (McLaughlin & Velasco, 1990) or directly define it as a linear combination of the squared deviations of storage and control/release (Ozelkan et al., 1997). As a nominal trajectory of the inflow, they assume its cyclostationary mean, computed over past measurements. As for the storage and control/release trajectories, Ozelkan et al. (1997) use the cyclostationary mean values observed in the past, while McLaughlin and Velasco (1990) suggest using the trajectories obtained by simulating the system under a simple operating rule and the nominal inflow trajectory.

The advantages of this approach are the well known advantages of the LQG control scheme. In particular, it does not require discretization and does not suffer from the curse of dimensionality. On the other hand, it requires introducing a number of strong approximations and a priori assumptions which compromise the optimality of the solution. First, the deviation from the nominal inflow value is assumed to be a zero-mean Gaussian noise, either white or modelled as an autoregressive process. However, in most cases this assumption is rather unrealistic, since deviations above the mean value are much greater than deviations below it. Second, linearization of the reservoir model implies that unintentional spills do not occur and that storage is unbounded both above and below. Thus, the policy obtained by solving the LQG problem can suggest infeasible controls; if the difficulty is overcome by adopting the nearest feasible control, the resulting policy is sub-optimal. Third, in most cases formulating the step-cost function $g_t(\cdot)$ as a linear combination of the squared deviations of the state and control/release from nominal values is too rough an approximation, since $g_t(\cdot)$ is derived from step-costs $g_t^i(\cdot)$ that are in general strongly asymmetrical (see, for example, Eqs. (6) and (7)). Finally, the policy obtained with the LQG approach aims at maintaining the system on the nominal trajectory, but the latter is assumed a priori and thus is not the optimal trajectory according to the problem objective, as defined by Eq. (16). The point is that finding the trajectory to be followed is the very purpose of the problem.

4. Reducing computational complexity

Many approaches have been proposed to partially remedy the computational complexity of SDP, e.g. coarse grid approximation, the use of Lagrange multipliers, approximation with Legendre polynomials (Bellman & Dreyfus, 1962; Kaufmann & Cruon, 1967; Larson, 1968) and techniques for particular problem formulations (Luenberger, 1971; Wong & Luenberger, 1968). However, these methods were conceived mainly for deterministic problems and are thus of limited interest for the optimal control of reservoir networks, where the impact of uncertain inputs, especially those due to uncontrolled catchments, cannot be neglected. In the following sections we present other approaches that have been proposed to overcome the curse of dimensionality. They can be classified according to the strategy adopted for reducing the problem complexity: reducing the degrees of freedom of the control problem (Section 4.1) or modifying the model (Section 4.2).


4.1. Reducing degrees of freedom of the problem

In order to reduce the complexity of the problem by acting on its degrees of freedom, two approaches can be followed: fixing a priori the form of the optimal cost-to-go function, as discussed in Section 4.1.1, or directly fixing the form of the control law, as discussed in Section 4.1.2.

4.1.1. Fixed-class optimal cost-to-go

Instead of computing the exact value of $H_t(\cdot)$ for $N_{x_t}$ state values, the idea is to evaluate it at a smaller number ($\tilde{N}_{x_t} < N_{x_t}$) of points and then interpolate these points with a function of a fixed class. Eq. (21) must therefore be replaced by

$$\hat{H}_t(x_t) = \min_{u_t} \Psi_{\varepsilon_{t+1}} \Big[ \Phi\big[ g_t(x_t, u_t, \varepsilon_{t+1}),\, \tilde{H}_{t+1}(x_{t+1}) \big] \Big] \qquad (24)$$

where $\tilde{H}_{t+1}(\cdot)$ is an estimate of the optimal cost-to-go $H_{t+1}(\cdot)$. This estimate is derived from the $\tilde{N}_{x_{t+1}}$ evaluations of $\hat{H}_{t+1}(\cdot)$ made at the previous step, by interpolating the points $\{(x_{t+1}^i, \hat{H}_{t+1}(x_{t+1}^i));\ i = 1, \ldots, \tilde{N}_{x_{t+1}}\}$ with a fixed-class function. As for the choice of the latter, different classes have been proposed, e.g. linear polynomials (Bellman, Kabala, & Kotkin, 1963; Tsitsiklis & Van Roy, 1996), cubic Hermite polynomials (Foufoula-Georgiou & Kitanidis, 1988) and splines (Johnson, Stedinger, Shoemaker, Li, & Tejada-Guibert, 1993). However, the most successful choice (Castelletti, de Rigo, Rizzoli, Soncini-Sessa, & Weber, 2005, 2007) appears to be that of neural networks, which leads to the so-called Neural Stochastic Dynamic Programming (NSDP) approach. NSDP can be used for both finite and infinite horizons, except for the AEV formulation, since in this case the convergence of the solution algorithm is not guaranteed. As for the other formulations, Bertsekas and Tsitsiklis (1996) proved that, under broad hypotheses, the solution $\tilde{H}_\cdot(\cdot)$ is guaranteed to lie in a bounded neighbourhood of the exact solution $H_\cdot(\cdot)$. A numerical implementation is described in Castelletti et al. (2007). Finally, note that the computing time is reduced because the term $N_{x_t}$ in (23) is reduced; however, exponential growth with the state dimension $n_x$ is not avoided. This is why, with currently available computing power, NSDP can be used when $n_x$ is, indicatively, of the order of ten units at most (Sharma, Jha, & Naresh, 2004). However, some recent experiments (Baglietto, Cervellera, Sanguineti, & Zoppoli, 2006; Cervellera, Chen, & Wen, 2006) have demonstrated that coupling NSDP and state discretization with low-discrepancy sequences allows problems (on a finite or receding horizon) with an even higher state dimension to be solved (30 state variables in Cervellera et al. (2006)).

4.1.2. Fixed-class policy

Assume that, for any $t$, the control law belongs to a given class of functions $\{m(\cdot; \theta_t)\}$, where $\theta_t$ is a vector of unknown parameters. Then the optimal control problem (over a finite horizon) can be formulated as

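$$\min_{\theta_0, \ldots, \theta_{h-1}} \; \Psi_{\varepsilon_1, \ldots, \varepsilon_h} \Big[ \Phi\big( g_0(x_0, u_0, \varepsilon_1), \ldots, g_{h-1}(x_{h-1}, u_{h-1}, \varepsilon_h), g_h(x_h) \big) \Big]$$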

subject to the constraints (8) and (10), with $x_0$ given and $u_t = m(x_t; \theta_t)$. The same could be done for an infinite horizon cyclostationary problem, where the unknown would be the sequence $[\theta_0, \ldots, \theta_{T-1}]$. The clear advantage of this approach is that the optimal control problem is reduced to an optimization problem that can be solved by means of classical Mathematical Programming techniques (see among others Guariso, Rinaldi, and Soncini-Sessa (1985), Orlovski et al. (1984), where complete numerical implementations are also presented) or more recent soft computing optimization approaches such as genetic algorithms (see among others Momtahen and Dariane (2007) and references therein) or ant colony optimization (Jalali, Afshar, & Mariño, 2006). Its limit is that the results depend on the choice of the class of functions (e.g. linear, piecewise linear, fuzzy rule base, etc.) to which the control law belongs and, obviously, optimality cannot be guaranteed. Regulation practice often provides indications for this choice: a review of fixed-class approaches based on empirical experience can be found in Oliveira and Loucks (1997). Alternatively, universal approximators (e.g. neural networks) can be used; particularly promising is the approach recently proposed by Baglietto et al. (2006), which uses feedforward neural networks to approximate the control law and a stochastic approximation algorithm to optimize the network parameters (see also the extension proposed by Pianosi and Soncini-Sessa (2008)).

4.2. Modifying the water system model

A radical solution for overcoming the curse of dimensionality is to modify the water system model and reduce the dimensionality of its state. The first applications of this idea date back to the work of Turgeon (1981), who proposed modifying the topology of reservoir networks in order to reduce the number of storage variables.
The idea is to replace the $n$-reservoir control problem with $n$ subproblems, each considering two reservoirs: one of the actual reservoirs plus an equivalent reservoir that accounts for all the downstream storages. With this approach, the overall computing time for the solution of the problem grows linearly with $n$. Other authors that have followed this idea are Saad, Turgeon, Bigras, and Duquette (1994), who propose the aggregation of the whole reservoir network into a single storage unit, and Archibald, McKinnon, and Thomas (1997), who suggest a decomposition technique where each subproblem includes an actual reservoir and two equivalent reservoirs for the upstream and downstream storages respectively. With the latter technique, the computational complexity is reduced to a quadratic function of the state dimension. These approaches have proved to be of value in practical applications, but no optimality property can be proved. Another approach that suitably exploits particular topological structures has been proposed by Delebecque and Quadrat (1978). They apply singular perturbation and averaging techniques to a large hydropower reservoir network composed of several valleys with a large seasonal reservoir at the top end and a number of smaller weekly reservoirs downstream. The simplification is operated at the valley level, while the problem for the whole hydropower network is formulated as a SO, multi-DMs problem and solved using team theory.

Another possibility is to reduce the state of the system by eliminating the model (4) of uncontrolled catchments, thus obtaining the so-called reduced model of the water system. In Section 4.2.1 an on-line approach is presented in which outflows from uncontrolled catchments are considered among the system’s disturbances and their dynamics are accounted for by solving the problem on-line and updating the disturbance membership-sets or pdfs with real-time information. This information is collected into a vector $I_t$, which includes not only the current state of the uncontrolled catchments but also any variable that is useful for predicting their future outflows, for example precipitation or snow-cover measurements. In Section 4.2.2 an off-line partial model-free approach is presented that allows the models of the uncontrolled catchments to be eliminated completely; in this case, the release policy is composed of control laws whose arguments also include some of the components of the information vector $I_t$.

4.2.1. On-line approach

The idea is as follows: the models of the uncontrolled catchments are eliminated and their outflows $a_{t+1}^k$, $k = 1, \ldots, M$, are included among the disturbances of the reduced water system model. This is possible because these subsystems are not influenced by the control $u_t$. By doing so, the number of components in the state vector (9) is reduced, since the components $y_{t-i}^k$ do not appear. At each time $t$, an on-line optimal control problem over a finite horizon $[t, t+h]$ is formulated and solved.
For each time $\tau$ in the finite horizon $[t, t+h]$, the membership-set $\Xi_\tau$ or pdf $\phi_\tau(\cdot)$ of the disturbance is provided by a dynamic predictor that uses all the information $I_t$ available at time $t$. Once the on-line problem has been solved, only the control for the first time step $[t, t+1)$ is actually applied and, at time $t+1$, a new problem is formulated over the horizon $[t+1, t+1+h]$ with new membership-sets or pdfs for the disturbances, based on $I_{t+1}$ (receding horizon principle). As anticipated, the information vector $I_t$ contains the state of the catchments at time $t$ and may also contain uncontrolled exogenous variables, for example measurements of precipitation or snow-cover, that are significant for the prediction of the catchment outflow. In other words, on-line updating of outflow membership-sets or pdfs can be based on a model more sophisticated than model (4). In most cases, in fact, the description of the uncontrolled catchment provided by model (4) is a rough approximation, but it cannot be improved due to the need to limit the state dimension in the off-line solution with SDP. The on-line problem can be formulated as:

1. A deterministic open-loop control problem

$$\min_{u_t, \ldots, u_{t+h-1}} \Phi\big( g_t(\tilde{x}_t, u_t, \bar{\varepsilon}_{t+1}), \ldots, g_{t+h}(\tilde{x}_{t+h}) \big)$$

subject to

$$\tilde{x}_{\tau+1} = \tilde{f}_\tau(\tilde{x}_\tau, u_\tau, \bar{\varepsilon}_{\tau+1}), \qquad \tau = t, \ldots, t+h-1$$

$$\tilde{x}_t \ \text{given}$$

where $\tilde{x}_\tau$ is the reduced state vector, $\tilde{f}(\cdot)$ is the corresponding state transition function and, for each $\tau = t, \ldots, t+h-1$, $\bar{\varepsilon}_{\tau+1}$ is the expected or maximum value of $\varepsilon_{\tau+1}$ based on $\phi_\tau(\cdot|I_t)$ or $\Xi_\tau(I_t)$.

2. A stochastic open-loop control problem

$$\min_{u_t, \ldots, u_{t+h-1}} \; \Psi_{\varepsilon_{t+1}, \ldots, \varepsilon_{t+h}} \Big[ \Phi\big( g_t(\tilde{x}_t, u_t, \varepsilon_{t+1}), \ldots, g_{t+h}(\tilde{x}_{t+h}) \big) \Big]$$

subject to

$$\tilde{x}_{\tau+1} = \tilde{f}_\tau(\tilde{x}_\tau, u_\tau, \varepsilon_{\tau+1}) \qquad (25a)$$

$$\varepsilon_{\tau+1} \sim \phi_\tau(\cdot|I_t) \quad \text{or} \quad \varepsilon_{\tau+1} \in \Xi_\tau(I_t) \qquad (25b)$$

$$\tau = t, \ldots, t+h-1 \qquad (25c)$$

$$\tilde{x}_t \ \text{given.} \qquad (25d)$$

3. A stochastic closed-loop control problem

$$\min_{p} \; \Psi_{\varepsilon_{t+1}, \ldots, \varepsilon_{t+h}} \Big[ \Phi\big( g_t(\tilde{x}_t, u_t, \varepsilon_{t+1}), \ldots, g_{t+h}(\tilde{x}_{t+h}) \big) \Big]$$

subject to (25) and

$$u_\tau = m_\tau(\tilde{x}_\tau), \qquad \tau = t, \ldots, t+h-1$$

$$p = [m_t(\cdot), \ldots, m_{t+h-1}(\cdot)].$$

Problem 1 is referred to by Bertsekas (1976) as the Naive Feedback Control (NFC) problem, problem 2 as Open-Loop Feedback Control (OLFC) and problem 3 as Partial Open-Loop Feedback Control (POLFC). Problems 1 and 2 can be solved by means of Mathematical Programming techniques, while problem 3 is solved by means of SDP. For all problems, one of the main difficulties is the choice of the penalty function $g_h(\cdot)$, which influences both the performance of the closed loop scheme and its stability (Mayne, Rawlings, Rao, & Scokaert, 2000). One possibility (Nardini, Piccardi, & Soncini-Sessa, 1994) is to let $g_h(\cdot)$ be equal to the optimal cost-to-go $H_h(\cdot)$ obtained by solving an off-line infinite horizon problem with the reduced model and a trivial predictor, i.e. with an a priori pdf or membership-set for the description of the disturbance. However, since solving the latter problem requires SDP, this approach can be followed only if the reservoir network is composed of few reservoirs. An application of the POLFC scheme to a real world case study can be found in Castelletti, de Rigo, Soncini-Sessa, Tepsich, and Weber (2008).

As for optimality, it is well known from the certainty equivalence principle that, in the linear-quadratic case, the solution of problem 1 coincides with the optimal solution of the off-line closed-loop problem with the complete model (8)–(10), i.e. of the original problem. Bertsekas (1976) proved that, independently of the form of the model, the solution of problem 3 cannot be worse than the solution of the off-line open-loop problem with the complete model. As for the other problems, it is reasonable to expect that the performance of their solutions improves when passing from problem 1 to problem 3, but there exist cases in which the solution to problem 2 is better than that of problem 3.
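The receding horizon scheme built on problem 1 can be sketched as follows for a single toy reservoir. This is only an illustration under strong assumptions that are not in the paper: the mass balance, the squared-deficit step cost, the zero terminal penalty, the brute-force enumeration used in place of a Mathematical Programming solver, and every name and number are hypothetical.

```python
from itertools import product


def step(storage, release, inflow, capacity=100.0):
    """Toy mass balance: the release is limited by the available water and
    storage above capacity is spilled."""
    actual = min(release, storage + inflow)
    return min(storage + inflow - actual, capacity), actual


def plan(storage, forecast, controls, demand):
    """Solve the deterministic open-loop problem by enumerating all control
    sequences over the forecast horizon (terminal penalty set to zero)."""
    best_cost, best_seq = float("inf"), None
    for seq in product(controls, repeat=len(forecast)):
        s, cost = storage, 0.0
        for u, a in zip(seq, forecast):
            s, r = step(s, u, a)
            cost += max(demand - r, 0.0) ** 2  # squared supply deficit
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def receding_horizon(storage, inflows, predictor, controls, demand, h):
    """At each step: forecast, re-plan, apply only the first control."""
    releases = []
    for t, inflow in enumerate(inflows):
        forecast = predictor(t, h)
        u = plan(storage, forecast, controls, demand)[0]
        storage, r = step(storage, u, inflow)
        releases.append(r)
    return releases, storage


inflows = [0.0, 0.0]
perfect = lambda t, h: inflows[t:t + h]  # stand-in for a dynamic predictor
releases, final_storage = receding_horizon(
    10.0, inflows, perfect, (0.0, 5.0, 10.0), demand=10.0, h=2)
```

In this dry scenario the planner spreads the deficit over the two steps rather than meeting the full demand at once, which is the hedging behaviour a convex deficit cost is meant to induce. Replacing `perfect` with a predictor that is refreshed from new information at every step gives the on-line updating described above.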


Finally, note that the Extended Linear Quadratic Gaussian (ELQG) approach proposed by Georgakakos (1989) and Georgakakos and Marks (1987), which has been widely adopted and recognized in reservoir management practice, is an algorithm for the resolution of problem 2. Because of its name, it is sometimes incorrectly cited as a variation of the traditional LQG approach introduced in Section 3.3. On the contrary, it was proposed for, and is suited to, constrained, non-linear models and on-line resolution. The underlying idea is to simulate the system subject to a control trajectory $u_t^{(i)}, \ldots, u_{t+h-1}^{(i)}$, obtain the corresponding trajectory of the expected value of the state, linearize the model around this trajectory and apply the Newton method to obtain a new control trajectory $u_t^{(i+1)}, \ldots, u_{t+h-1}^{(i+1)}$, iterating until convergence is reached. The method allows for the introduction of reliability constraints over the state and the control; the former are accounted for by increasing the value of the cost function when the constraints are violated, the latter by projecting the Newton descent direction onto the feasible control set.

4.2.2. Off-line partial model-free approach

The only way to use the reduced model of the system also in off-line policy design, without resorting to the unrealistic assumption that outflows from uncontrolled catchments are purely random disturbances, is to use a solution approach based on Reinforcement Learning (see Barto and Sutton (1998), Kaelbling, Littman, and Moore (1996)). With this approach, the control law depends on the reduced state vector $\tilde{x}_t$ and on a reduced information vector $\tilde{I}_t$, made up of those components of the information vector $I_t$ that the Analyst considers to play a key role in the outflow formation process.
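As a concrete reference for the kind of learning rule involved, the following is a minimal tabular sketch of a Q-learning update (Watkins & Dayan, 1992) written for cost minimisation. It is not the algorithm of the cited works: the time index is dropped, a discounted cost with factor `gamma` is assumed, each table key is a hypothetical (reduced state, reduced information) pair, and the transitions are invented for illustration.

```python
def q_update(Q, key, control, cost, next_key, controls, alpha=0.5, gamma=0.9):
    """One Q-learning step for a cost-minimisation problem:
    Q(key, u) <- Q(key, u) + alpha * (cost + gamma * min_u' Q(next_key, u') - Q(key, u))."""
    best_next = min(Q.get((next_key, u), 0.0) for u in controls)
    old = Q.get((key, control), 0.0)
    Q[(key, control)] = old + alpha * (cost + gamma * best_next - old)
    return Q[(key, control)]


def greedy_control(Q, key, controls):
    """Control with the smallest Q-factor for the given (state, information) key."""
    return min(controls, key=lambda u: Q.get((key, u), 0.0))


# Hypothetical recorded transitions:
# ((storage class, information), release, observed cost, (next storage class, next information)).
transitions = [
    (("low", "dry"), 10, 4.0, ("low", "dry")),
    (("low", "dry"), 0, 1.0, ("low", "dry")),
]
controls = (0, 10)
Q = {}
for key, u, cost, next_key in transitions:
    q_update(Q, key, u, cost, next_key, controls)
```

Feeding the update with recorded series instead of live experiments is what makes such a scheme usable off-line; the table is only ever touched through observed quadruples, so no model of the information dynamics is needed.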
Reinforcement Learning is based on the idea of designing the policy through a trial-and-error learning process, in which model-based estimates of the system transitions are replaced by direct observations of the real system evolution (model-free). Precisely, alternative controls are tried on-line, the corresponding effects on the system outputs are directly observed and the Q-factor is updated (Q-learning by Watkins and Dayan (1992)). The latter is somewhat analogous to the optimal cost-to-go and is associated with the quadruple $(t, \tilde{x}_t, u_t, \tilde{I}_t)$. Unfortunately, on-line experiments cannot be performed on real world reservoirs, as this may result in unacceptable social costs (all the controls, even those producing disastrous effects, have to be tried, as these can only be evaluated ex post) and the learning process would take too much time. To overcome this hurdle a mixed, partial model-free approach has recently been proposed (Castelletti, Corani, Rizzoli, Soncini-Sessa, & Weber, 2001). It combines the model-free approach of Q-learning with SDP-based off-line policy design. In the learning process, recorded time series of $(\tilde{I}_t, \tilde{I}_{t+1}, a_{t+1}^k)$ are used as if they were produced on-line by nature, while the other parts of the water system are described with the reduced model.

5. Concluding remarks

Although the problem of designing efficient water reservoir management policies has been extensively studied in recent years in many disciplines, ranging from Hydrology through Decision Theory to Electrical Engineering, it is still a very intriguing research theme. This paper reviewed some of the recent and, in the authors’ opinion, more significant advances in policy design from a Control Theory perspective. The focus was mainly on the implications that the very nature of storage systems has for the formulation and solution of the control problem.

The problem has many other facets that have not been dealt with in the paper but are worth mentioning. When new water reservoir networks are being planned, the control problem discussed in the paper has to be nested in a mathematical programming problem whose arguments are the planning variables (e.g. number and capacity of reservoirs).

In normal, real-time management of water reservoir networks, a number of changes may occur in system conditions (e.g. down-time periods for some hydropower units, irrigation canals under maintenance, etc.) that could not be accounted for in designing the off-line policy and thus require it to be modified. This topic is not dealt with in the paper, but the on-line approaches discussed in Section 4.2.1 are well suited for this purpose, as they can be viewed as adaptive control schemes. Each time relevant changes occur in the system, one can switch from the off-line policy to an on-line policy, computed as explained in that section, and then re-adopt the off-line policy once normal system conditions are restored.

In a multipurpose and multistakeholder context, the choice of the policy to adopt within the set of efficient policies is the final step of a complex, often recursive, decision making process that involves many different phases: from stakeholder analysis, through system model identification and the policy design itself, to comparison and negotiation of efficient policies.
Activities within these phases involve skills from Systems and Control Theory as well as Decision Making, Hydrology, Sociology and Alternative Dispute Resolution, and require full stakeholder involvement and integration among the different and disparate issues. They have to be organized in a procedure (Castelletti & Soncini-Sessa, 2006) and supported by proper computer tools, namely MultiObjective Decision Support Systems (see among others Liu and Stewart (2004), Nandalal and Simonovic (2002), Salewicz and Nakayama (2004), Soncini-Sessa, Rizzoli, Villa, and Weber (1999)).

Acknowledgment

Partially supported by FONDAZIONE CARIPLO TWOLE2004.

References

Archibald, T. W., McKinnon, K. I. M., & Thomas, L. C. (1997). An aggregate stochastic dynamic programming model of multireservoir systems. Water Resources Research, 33(2), 333–340. Aufiero, A., Soncini-Sessa, R., & Weber, E. (2001). Set-valued control laws in minmax control problem. In Proceedings of IFAC workshop on modelling and control in environmental issues.

Aufiero, A., Soncini-Sessa, R., & Weber, E. (2002). Set-valued control laws in TEV-DC control problem. In Proceedings of 15th IFAC world congress on automatic control. Baglietto, M., Cervellera, C., Sanguineti, M., & Zoppoli, R. (2006). Water reservoirs management under uncertainty by approximating networks and learning from data. In Topics on system analysis and integrated water resource management. Amsterdam: Elsevier. Barto, A., & Sutton, R. (1998). Reinforcement learning: An introduction. Boston: MIT Press. Bellman, R. E. (1957). Dynamic programming. Princeton: Princeton University Press. Bellman, R. E., & Dreyfus, S. (1962). Applied dynamic programming. Princeton: Princeton University Press. Bellman, R. E., Kabala, R., & Kotkin, B. (1963). Polynomial approximation a new computational technique in dynamic programming. Mathematics of Computation, 17(8), 155–161. Bertsekas, D. P. (1976). Dynamic programming and stochastic control. New York: Academic Press. Bertsekas, D. P., & Tsitsiklis, J. N. (1996). Neuro-dynamic programming. Boston: Athena Scientific. Breton, A., Haurie, A., & Kalocsai, R. (1978). Efficient management of interconnected power systems: A game theoretic approach. Automatica, 14, 443–452. Brown, L. R. (2001). How water scarcity will shape the new century. Water Science and Technology, 43(4), 17–22. Castelletti, A., Corani, G., Rizzoli, A. E., Soncini-Sessa, R., & Weber, E. (2001). A reinforcement learning approach for the operational management of a water system. In Proceedings of IFAC workshop modelling and control in environmental issues. Yokohama: Elsevier. Castelletti, A., de Rigo, D., Rizzoli, A. E., Soncini-Sessa, R., & Weber, E. (2005). An improved technique for neuro-dynamic programming applied to the efficient and integrated water resources management. In 16th IFAC world congress. Castelletti, A., de Rigo, D., Rizzoli, A. E., Soncini-Sessa, R., & Weber, E. (2007).
Neuro-dynamic programming for designing water reservoir network management policies. Control Engineering Practice, 15(8), 1001–1011. Castelletti, A., de Rigo, D., Soncini-Sessa, R., Tepsich, L., & Weber, E. (2008). On-line design of water reservoir policies based on inflow prediction. In 17th IFAC world congress. Castelletti, A., & Soncini-Sessa, R. (2006). A procedural approach to strengthening integration and participation in water resource planning. Environmental Modelling & Software, 21(10), 1455–1470. Cervellera, C., Chen, V. C. P., & Wen, A. (2006). Optimization of a large-scale water reservoir network by stochastic dynamic programming with efficient state space discretization. European Journal of Operational Research, 171(3), 1139–1151. Cervellera, C., & Muselli, M. (2004). Deterministic design for neural network learning: An approach based on discrepancy. IEEE Transaction on Neural Networks, 15(3), 533–544. Delebecque, F., & Quadrat, J. P. (1978). Contribution of stochastic control singular perturbation averaging and team theories to an example of large-scale systems: The management of hydropower production. IEEE Transaction on Automatic Control, 23(2), 209–222. Esogbue, A. O. (1989). Dynamic programming and water resources: Origins and interconnections. In Dynamic programming for optimal water resources systems analysis. Englewood Cliffs: Prentice-Hall. Fang, K. T., & Wang, Y. (1994). Number-theoretic methods in statistics. London: Chapman & Hall. Fearnside, P. M. (2004). Greenhouse gas emissions from hydroelectric dams: Controversies provide a springboard for rethinking a supposedly ‘clean’ energy source. Climatic Change, 66, 1–8. Foufoula-Georgiou, E., & Kitanidis, P. K. (1988). Gradient dynamic programming for stochastic optimal control of multidimensional water resources systems. Water Resources Research, 24, 1345–1359. Fults, D. M., & Hancock, L. F. (1972). Optimal operations models for ShastaTrinity system. 
Journal of the Hydraulic Division ASCE, 98, 1497–1514.


Georgakakos, A. P. (1989). Extended Linear Quadratic Gaussian (ELQG) control: Further extensions. Water Resources Research, 25(2), 191–201.
Georgakakos, A. P., & Marks, D. H. (1987). A new method for real-time operation of reservoir systems. Water Resources Research, 23(7), 1376–1390.
Gilbert, K. C., & Shane, R. M. (1982). TVA hydroscheduling model: Theoretical aspects. Journal of Water Resources Planning and Management, ASCE, 108(1), 21–36.
Guariso, G., Rinaldi, S., & Soncini-Sessa, R. (1985). A decision support system for water management: The Lake Como case study. European Journal of Operational Research, 21, 295–306.
GWP-Global Water Partnership (2000). Integrated water resources management. TAC Background paper 4, GWP Secretariat, Stockholm.
Hall, W. A., & Buras, N. (1961). The dynamic programming approach to water resources development. Journal of Geophysical Research, 66(2), 510–520.
Hall, W. A., Butcher, W. S., & Esogbue, A. (1968). Optimization of the operation of a multi-purpose reservoir by dynamic programming. Water Resources Research, 4(3), 471–477.
Heidari, M., Chow, V. T., Kokotovic, P. V., & Meredith, D. (1971). Discrete differential dynamic programming approach to water resources systems optimisation. Water Resources Research, 7(2), 273–282.
Hooper, E. R., Georgakakos, A. P., & Lettenmaier, D. P. (1991). Optimal stochastic operation of Salt River Project, Arizona. Journal of Water Resources Planning and Management, ASCE, 117(5), 556–587.
Jalali, M. R., Afshar, A., & Mariño, M. A. (2006). Reservoir operation by ant colony optimization algorithms. Iranian Journal of Science & Technology, 30, 107–117.
Johnson, S. A., Stedinger, J. R., Shoemaker, C., Li, Y., & Tejada-Guibert, J. A. (1993). Numerical solution of continuous-state dynamic programs using linear and spline interpolation. Operations Research, 41, 484–500.
Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
Kaufmann, A., & Cruon, R. (1967). Dynamic programming. New York: Academic Press.
Khagram, S. (2004). Dams and development: Transnational struggles for water and power. Ithaca: Cornell University Press.
Kotchen, M. J., Moore, M. R., Lupi, F., & Rutherford, E. S. (2006). Environmental constraints on hydropower: An ex post benefit-cost analysis of dam relicensing in Michigan. Land Economics, 82(3), 384–403.
Larson, R. E. (1968). State incremental dynamic programming. New York: American Elsevier.
Liu, D., & Stewart, T. J. (2004). Object-oriented decision support system modelling for multicriteria decision making in natural resource management. Computers & Operations Research, 31, 985–999.
Lotov, A. V., Bushenkov, V. A., & Kamenev, G. K. (2004). Interactive decision maps approximation and visualization of Pareto frontier. Heidelberg: Springer-Verlag.
Luenberger, D. G. (1971). Cyclic dynamic programming: A procedure for problems with fixed delay. Operations Research, 19(4), 1101–1110.
Maas, A., Hufschmidt, M. M., Dorfman, R., Thomas, H. A., Marglin, S. A., & Fair, G. M. (1962). Design of water resource systems. Boston, MA: Harvard University Press.
Mayne, D. Q., Rawlings, J. B., Rao, C. V., & Scokaert, P. O. M. (2000). Constrained model predictive control: Stability and optimality. Automatica, 36, 789–814.
McCully, P. (2001). Silenced rivers. London: Zed Books.
McLaughlin, D., & Velasco, H. L. (1990). Real-time control of a system of large hydropower reservoirs. Water Resources Research, 26(4), 623–635.
Miettinen, K. (1999). Nonlinear multiobjective optimization. Dordrecht: Kluwer Academic Publishers.
Momtahen, Sh., & Dariane, A. B. (2007). Direct search approaches using genetic algorithms for optimization of water reservoir operating policies. Journal of Water Resources Planning and Management, 133(3), 202–209.
Nandalal, K. D. W., & Simonovic, S. P. (2002). State of the art report on system analysis methods for resolution of conflicts in water resources management. Div. of Water Sciences, UNESCO.
Nardini, A., Piccardi, C., & Soncini-Sessa, R. (1994). A decomposition approach to suboptimal control of discrete-time systems. Optimal Control Applications and Methods, 15(1), 1–12.


Niederreiter, H. (1992). Random number generation and quasi-Monte Carlo methods. Philadelphia: SIAM.
Oliveira, R., & Loucks, D. P. (1997). Operating rules for multireservoir systems. Water Resources Research, 33(4), 839–852.
Orlovski, S., Rinaldi, S., & Soncini-Sessa, R. (1983). A min max approach to storage control problems. Applied Mathematics and Computation, 12(2–3), 237–254.
Orlovski, S., Rinaldi, S., & Soncini-Sessa, R. (1984). A min max approach to reservoir management. Water Resources Research, 20(11), 1506–1514.
Ozelkan, E. C., Galambosi, A., Fernandes, E., & Duckstein, L. (1997). Linear quadratic dynamic programming for water reservoir management. Applied Mathematical Modelling, 21, 591–598.
Pianosi, F., & Soncini-Sessa, R. (2008). Extended Ritz method for reservoir management over an infinite horizon. In 17th IFAC world congress.
Piccardi, C. (1993a). Infinite-horizon minimax control with pointwise cost function. Journal of Optimization Theory and Applications, 78, 317–336.
Piccardi, C. (1993b). Infinite-horizon periodic minimax control problem. Journal of Optimization Theory and Applications, 79, 397–404.
Piccardi, C., & Soncini-Sessa, R. (1991). Stochastic dynamic programming for reservoir optimal control: Dense discretization and inflow correlation assumption made possible by parallel computing. Water Resources Research, 27(5), 729–741.
Read, E. G. (1989). A dual approach to stochastic dynamic programming for reservoir release scheduling. In Dynamic programming for optimal water resources systems analysis (pp. 361–372). Englewood Cliffs: Prentice-Hall.
Rippl, W. (1883). The capacity of storage reservoirs for water supply. Minutes of Proceedings, Institution of Civil Engineers, 71, 270–278.
Rosa, L. P., Santos, M. A., Matvienko, B., Santos, E. O., & Sikar, E. (2004). Greenhouse gas emissions from hydroelectric reservoirs in tropical regions. Climatic Change, 66, 9–21.
Saad, M., Turgeon, A., Bigras, P., & Duquette, R. (1994). Learning disaggregation technique for the operation of long-term hydroelectric power systems. Water Resources Research, 30(11), 3195–3203.
Salewicz, K. A., & Nakayama, M. (2004). Development of a web-based decision support system (DSS) for managing large international rivers. Global Environmental Change, 14, 25–37.
Sharma, V., Jha, R., & Naresh, R. (2004). Optimal multi-reservoir network control by two-phase neural network. Electric Power Systems Research, 68, 221–228.
Sniedovich, M. (1979). Reliability-constrained reservoir control problems: 1. Methodological issues. Water Resources Research, 15(6), 1574–1582.
Soncini-Sessa, R., Castelletti, A., & Weber, E. (2007). Integrated and participatory water resources management. Theory. Amsterdam: Elsevier.
Soncini-Sessa, R., Rizzoli, A. E., Villa, L., & Weber, E. (1999). TwoLe: A software tool for planning and management of water reservoir networks. Hydrological Sciences Journal, 44(4), 619–631.
Soncini-Sessa, R., Zuleta, J., & Piccardi, C. (1991). Remarks on the application of a risk-averse approach to the management of El-Carrizal reservoir. Advances in Water Resources, 13(2), 76–84.
Su, Y. S., & Deininger, R. A. (1972). Generalization of White’s method of successive approximations. Operations Research, 20(2), 318–326.
Su, Y. S., & Deininger, R. A. (1974). Modeling regulation of Lake Superior under uncertainty of future water supplies. Water Resources Research, 10(1), 11–25.
Tauxe, G. V., Inman, R. R., & Mades, D. M. (1979). Multiobjectives dynamic programming with application to a reservoir. Water Resources Research, 15(6), 1403–1408.
Tejada-Guibert, J. A., Johnson, S. A., & Stedinger, J. R. (1995). The value of hydrologic information in stochastic dynamic programming models of a multireservoir system. Water Resources Research, 31(10), 2571–2579.
Thompson, M., Davison, M., & Rasmussen, H. (2004). Valuation and optimal operation of electric power plants in competitive markets. Operations Research, 52(4), 546–562.
Trott, W. J., & Yeh, W. (1973). Optimization of multiple reservoir systems. Journal of the Hydraulics Division, ASCE, 99, 1865–1884.
Tsitsiklis, J. N., & Van Roy, B. (1996). Feature-based methods for large scale dynamic programming. Machine Learning, 22, 59–94.

Turgeon, A. (1980). Optimal operation of multi-reservoir power systems with stochastic inflows. Water Resources Research, 16(2), 275–283.
Turgeon, A. (1981). A decomposition method for the long-term scheduling of reservoirs in series. Water Resources Research, 17(6), 1565–1570.
Vasiliadis, H. V., & Karamouz, M. (1994). Demand-driven operation of reservoirs using uncertainty-based optimal operating policies. Journal of Water Resources Planning and Management, ASCE, 120(1), 101–114.
Wallach, D., Makowski, D., & Jones, J. (2006). Working with dynamic crop models. Evaluation, analysis, parameterization, and applications. Amsterdam: Elsevier.
Wasimi, S. A., & Kitanidis, P. K. (1983). Real-time forecasting and daily operation of a multireservoir system during floods by Linear Quadratic Gaussian control. Water Resources Research, 19(6), 1511–1522.
Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3–4), 279–292.
Weber, E., Rizzoli, A. E., Soncini-Sessa, R., & Castelletti, A. (2002). Lexicographic optimisation for water resources planning: The case of Lake Verbano, Italy. In A. E. Rizzoli & A. J. Jakeman (Eds.), Integrated assessment and decision support, proceedings of 1st biennial meeting of IEMSS.
White, D. J. (1963). Dynamic programming, Markov chains, and the method of successive approximations. Journal of Mathematical Analysis and Applications, 6, 373–376.
Wong, P. J., & Luenberger, D. G. (1968). Reducing the memory requirements of dynamic programming. Operations Research, 16(6), 1115–1125.
World Commission on Dams (2000). Dams and development: A new framework for decision-making. London, UK: Earthscan Publications Ltd.
Yakowitz, S. (1982). Dynamic programming applications in water resources. Water Resources Research, 18(4), 673–696.
Yeh, W. (1985). Reservoir management and operations models: A state of the art review. Water Resources Research, 21(12), 1797–1818.

Andrea Castelletti was born in Genova in 1974. He received an MS degree in Environmental Engineering and a Ph.D. in Information Engineering from Politecnico di Milano, Italy, in 1999 and 2005. Since 2006 he has been Assistant Professor of Modelling and Control of Environmental Systems at the same university, and since 2008 Honorary Research Fellow at the Centre for Water Research of the University of Western Australia. His main research interests focus on participatory and integrated modelling and control of environmental systems, namely water resource systems, and Decision Support System design. He has co-authored two international books on integrated water resource management and more than 20 papers in international journals and conference proceedings. He is currently a member of the IFAC Technical Committee on Modelling and Control of Environmental Systems (TC 8.3).

Francesca Pianosi was born in Milan in 1980. She received an MS degree in Environmental Engineering from Politecnico di Milano, Italy, in 2004 and a Ph.D. in Information Engineering in 2008. Her research interests focus on modelling and control of environmental systems, in particular time series analysis and stochastic optimal control of water systems. She is co-author of an international book on integrated water resource management.

Rodolfo Soncini-Sessa was born in 1948 in Milano. In 1972 he received a Master in Electronic Engineering from the Politecnico di Milano, Italy. He was associate professor of Modelling and Control of Natural Resources (1982–1986, Politecnico di Milano) and full professor of Automatic Control (1986–1990, Università di Brescia), before becoming full professor of Natural Resources Management at the Politecnico di Milano in 1990. He has been invited to the International Institute for Applied System Analysis (IIASA) in Austria for several research periods. His main research interests are the Design of Decision Support Systems (DSS) for integrated and participatory decision making in the field of water resources, with attention to both quality and quantity of the water. He is chair of the Technical Committee on Modelling & Control of Environmental Systems of IFAC, and on the Editorial Boards of Water International and Journal of Environmental Modelling and Software. He is author or co-author of several books and many papers.
