Nonclassical control problems and Stackelberg games


IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. AC-24, NO. 2, APRIL 1979, p. 155

Nonclassical Control Problems and Stackelberg Games

GEORGE P. PAPAVASSILOPOULOS, STUDENT MEMBER, IEEE, AND JOSE B. CRUZ, JR., FELLOW, IEEE

Abstract—A nonclassical control problem, where the control depends on state and time, and its partial derivative with respect to the state appears in the state equation and in the cost function, is analyzed. Stackelberg dynamic games which lead to such nonclassical control problems are considered and studied.

Manuscript received May 5, 1978; revised November 16, 1978. Paper recommended by D. Šiljak, Chairman of the Large Scale Systems, Differential Games Committee. This work was supported in part by the National Science Foundation under Grant ENG-74-20091, in part by the U.S. Department of Energy, Electric Energy Systems Division, under Contract ERDA EX-76-C-01-2088, and in part by the Joint Services Electronics Program under Contract DAAB-07-72-C-0259. The authors are with the Decision and Control Laboratory, Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801.

INTRODUCTION

HIERARCHICAL and large-scale systems have received considerable attention during the last few years, firstly because of their importance in engineering, economics, and other areas, and secondly because of the increased capability of computer facilities [13], [14]. An important characteristic of many large-scale systems is the presence of many decision makers with different and usually conflicting goals. The existence of many decision makers who interact through the system and have different goals may be an inherent property of the system under consideration (e.g., a market situation), or may be simply the result of modeling the system as such (e.g., a large system decomposed into subsystems for calculation purposes). Differential games are useful in modeling and studying dynamic systems where more than one decision maker is involved. Most of the questions posed in the area of the classical control problem may be considered in a game situation, but their resolution is generally more difficult. In addition, many questions can be posed in a game framework which are meaningless or trivial in a classical control problem framework. The superior conceptual wealth of game over control problems, which makes them potentially much more applicable, counterbalances the additional difficulties encountered in their solution.

A particular class of games are the so-called Stackelberg differential games [1]-[8]. Stackelberg games provide a natural formalism for describing systems which operate on many different levels with a corresponding hierarchy of decisions. The mathematical definition of a general two-level Stackelberg game is as follows. Let U, V be two sets and J_1, J_2 two real-valued functions

    J_i: U × V → R,    i = 1, 2.    (1)

We consider the set-valued mapping T

    T: U → V,    u ↦ Tu ⊆ V    (2)

defined by

    Tu = { v | v = arg inf [ J_2(u, ṽ); ṽ ∈ V ] }.    (3)

Clearly Tu = ∅ if the inf in definition (3) is not achieved. We also consider the minimization problem

    inf J_1(u, v)    subject to: u ∈ U, v ∈ Tu,    (4)

where we use the usual convention J_1(u, v) = +∞ if v ∈ Tu = ∅.
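When U and V are finite, the reaction map (3) and the bilevel problem (4) can be computed by direct enumeration. The following Python sketch is illustrative only: the payoff tables and names are hypothetical, chosen by us to exercise the definitions, and are not from the paper.

```python
# Hypothetical two-player game on finite strategy sets (illustration only).
U = ["u1", "u2"]          # leader's strategy set
V = ["v1", "v2", "v3"]    # follower's strategy set

J1 = {("u1", "v1"): 4, ("u1", "v2"): 1, ("u1", "v3"): 5,
      ("u2", "v1"): 2, ("u2", "v2"): 3, ("u2", "v3"): 6}
J2 = {("u1", "v1"): 2, ("u1", "v2"): 2, ("u1", "v3"): 7,
      ("u2", "v1"): 1, ("u2", "v2"): 4, ("u2", "v3"): 5}

def T(u):
    """Reaction set Tu = argmin over v of J2(u, v) -- cf. (3)."""
    best = min(J2[(u, v)] for v in V)
    return [v for v in V if J2[(u, v)] == best]

# Leader minimizes J1 over u in U, v in Tu -- cf. (4).
# (Ties in a non-singleton Tu are resolved in the leader's favor here.)
star = min(((u, v) for u in U for v in T(u)), key=lambda pair: J1[pair])
print("Stackelberg pair:", star, "J1 =", J1[star])
```

Note that Tu need not be a singleton (here T("u1") contains two elements), which is exactly why the convention attached to (4) matters.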

Definition: A pair (u*, v*) ∈ U × V is called a Stackelberg equilibrium pair if (u*, v*) solves (4). The sets U and V are called the leader's and follower's strategy spaces, respectively.

The game situation described by the mathematical formulation above is as follows. The follower tries to minimize his cost function J_2 for a given choice of u ∈ U by the leader. The leader, knowing the follower's rationale, wishes to announce a u* such that the follower's reaction v* to this given u* will result in the minimum possible J_1(u*, v*). The general N-level Stackelberg game is defined analogously.

Stackelberg differential games were first introduced and studied in the engineering literature in [2] and further studied in [3]-[8]. They are mathematically formalized as follows:

    ẋ(t) = f(x(t), ū(t), v̄(t), t),    x(t_0) = x_0

    J_i(u, v) = g_i(x(t_f), t_f) + ∫_{t_0}^{t_f} L_i(x(t), ū(t), v̄(t), t) dt,    i = 1, 2    (5)

where f, g_i, L_i are appropriately defined functions. Also, u ∈ U, v ∈ V, where U, V are appropriately defined function spaces and ū(t), v̄(t) are the values of u and v, respectively, at time t, i.e., ū(t) = u|_t, v̄(t) = v|_t.

0018-9286/79/0400-0155$00.75 © 1979 IEEE

The type of strategy spaces U and V which were considered and

treated successfully in the previous literature were the spaces of piecewise continuous functions of time. In this case, the problem of deriving necessary conditions for the Stackelberg differential game with fixed time interval and initial condition x_0 falls within the area of classical control. Thus, variational techniques can be used in a straightforward manner. The case where the strategy spaces are spaces of functions whose values at instant t depend on the current state x(t) and time t, i.e., ū(t) = u|_t = u(x(t), t), v̄(t) = v|_t = v(x(t), t), was not treated. This case results in a nonclassical control problem because ∂u/∂x appears in the follower's necessary conditions. Since the follower's necessary conditions are seen as state differential equations by the leader, the presence of ∂u/∂x in them makes the leader face a nonclassical control problem. In the present paper, the nonclassical control problem arising from the consideration of the above strategy spaces is embedded in a more general class of nonclassical control problems; see (20)-(22). The characteristics of this general class of problems are the following: 1) each of the components u^i of the control m-vector u depends on the current time t and on a given function of the current state and time, i.e., u^i|_t = u^i(h^i(x(t), t), t); 2) the state equation and the cost functional depend on the first-order partial derivative of u with respect to the state x. The vector-valued functions h^i may represent outputs or measurements available to the ith "subcontroller" in a decentralized control setting. The only restriction to be imposed on h^i is that it be twice continuously differentiable with respect to x. This allows for a quite large class of h^i's, which can model output feedback or open-loop control laws. It can also model mixed cases of open-loop and output feedback control laws, where an output is available only during certain intervals of time.
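The role played by ∂u/∂x can be checked numerically. In the sketch below (a hypothetical scalar system of our choosing, not from the paper), a feedback law and an open-loop replay of its values produce exactly the same trajectory, yet any term involving u_x separates the two problems:

```python
# Illustrative sketch (hypothetical scalar system): replaying the *values*
# of a feedback law open-loop leaves the trajectory unchanged, but any
# dependence on u_x distinguishes the two settings.

def run(ctrl, n=2000, dt=1e-3, x0=1.0):
    """Euler-integrate xdot = -x + u; cost accumulates (x^2 + u^2 + ux^2) dt.
    ctrl(k, x) returns the control value u and its state-slope ux."""
    x, cost, xs = x0, 0.0, []
    for k in range(n):
        u, ux = ctrl(k, x)
        cost += (x * x + u * u + ux * ux) * dt
        xs.append(x)
        x += (-x + u) * dt
    return xs, cost

# 1) Feedback law u(x) = -0.5 x, so u_x = -0.5.
fb_traj, fb_cost = run(lambda k, x: (-0.5 * x, -0.5))

# 2) Open-loop replay of the same control values: u_x is now 0.
u_seq = [-0.5 * x for x in fb_traj]
ol_traj, ol_cost = run(lambda k, x: (u_seq[k], 0.0))

assert fb_traj == ol_traj       # identical state motion
print(fb_cost - ol_cost)        # ~ 0.5, i.e. the integral of ux^2 dt
```

With a classical cost (no u_x term) the two runs would be indistinguishable; the gap printed above is created entirely by the u_x-dependent term, which is the structural feature of the problems studied here.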
The appearance of the partial derivative of u with respect to x prohibits the restriction of the admissible controls to those which are functions of time only. It will become clear that the extension of our results to the case where higher-order partial derivatives of u with respect to x, up to order N, appear is straightforward. This case is of interest in hierarchical systems, since it arises, for example, in an N-level Stackelberg game where the players use control values dependent on the current state and time. Although the bulk of the analysis provided in this paper concerns continuous-time problems, the corresponding discrete-time results can be derived in a very similar manner.

The structure of the present paper is as follows. In Section I a two-level Stackelberg differential game is introduced for a fixed time interval [t_0, t_f] and initial condition x(t_0) = x_0. The leader's and follower's strategies are functions of the current state and time. This game leads to the consideration of a nonclassical control problem, which is studied in Section II. In Section III we use the results of Section II to study further the game of Section I, and in particular we work out a linear-quadratic Stackelberg game. In Section IV the relation of the Stackelberg game introduced in Section I to the principle of


optimality is investigated. Finally, we have a Conclusions section and two Appendices.

Notation and Abbreviations

R^n: n-dimensional real Euclidean space with the Euclidean metric; ‖·‖ denotes the Euclidean norm for vectors and the sup norm for matrices; ′ denotes transposition for vectors and matrices. For a function f: R^n → R^m we say that f ∈ C^k if f has continuous mixed partial derivatives of order k. For f: R^n → R, ∇f is considered an n × 1 column vector and f_xx denotes the Hessian of f. For f: R^n → R^m, ∇f is considered an n × m matrix (Jacobian). For f: R^n × R^k → R^m, where x ∈ R^n, y ∈ R^k, f(x, y) ∈ R^m, we denote by ∂f/∂x or f_x or ∇_x f the Jacobian matrix of the partial derivatives of f with respect to x, considered as an n × m matrix. w.r. to: with respect to; w.l.o.g.: without loss of generality; n.b.d.: neighborhood.

I. A STACKELBERG GAME

In this section we introduce a two-level Stackelberg game and show how it leads us to the consideration of a nonclassical control problem. This nonclassical control problem falls into the general class to be considered in Section II. Let

    U = { u | u: R^n × [t_0, t_f] → R^{m_1}; u(x, t) ∈ R^{m_1} for x ∈ R^n and t ∈ [t_0, t_f]; u_x(x, t) exists and u(x, t), u_x(x, t) are continuous in x and piecewise continuous in t }    (6)

    V = { v | v: [t_0, t_f] → R^{m_2}; v is piecewise continuous in t }.    (7)

Consider the dynamic system

    ẋ(t) = f(x(t), ū(t), v̄(t), t),    x(t_0) = x_0,    t ∈ [t_0, t_f]    (8)

and the functionals

    J_1(u, v) = g(x(t_f)) + ∫_{t_0}^{t_f} L(x(t), ū(t), v̄(t), t) dt    (9)

    J_2(u, v) = h(x(t_f)) + ∫_{t_0}^{t_f} M(x(t), ū(t), v̄(t), t) dt    (10)

where u ∈ U, v ∈ V, x is the state of the system, assumed to be a continuous function of t and piecewise in C¹ w.r. to t, x: [t_0, t_f] → R^n, and the functions f: R^n × R^{m_1} × R^{m_2} × [t_0, t_f] → R^n, g, h: R^n → R, L, M: R^n × R^{m_1} × R^{m_2} × [t_0, t_f] → R are in C¹ w.r. to the x, u, v arguments and continuous in t. The u and v are called strategies and are chosen from U and V, which are called the strategy spaces, by the two players, the leader and the follower, respectively. With the given definitions, for each choice of u and v, the behavior of the dynamic system is unambiguously determined,

assuming of course that for the selected pair (u, v) the solution of the differential equation (8) exists over [t_0, t_f]. Let us assume that a Stackelberg equilibrium pair (u*, v*) ∈ U × V exists. For fixed u ∈ U, Tu is determined by the minimization problem

    minimize J_2(u, v)    subject to: v ∈ V,
    ẋ = f(x, u(x, t), v, t),    x(t_0) = x_0,    t ∈ [t_0, t_f]    (11)

and thus, applying the minimum principle, we conclude that for v ∈ V to be in Tu there must exist a function p: [t_0, t_f] → R^n such that

    ẋ = f(x, u, v, t),    x(t_0) = x_0    (12a)
    -ṗ = M_x + u_x M_u + (f_x + u_x f_u) p,    p(t_f) = h_x(x(t_f))    (12b)
    M_v + f_v p = 0.    (12c)

We further assume that U is properly topologized. Conditions (12) define a set-valued mapping T′: U → V. By the nature of the defined U and V and the fact that (12) are necessary but not sufficient conditions, it is easily proven that

    i) Tu ⊆ T′u
    ii) J_2(u, v′) ≥ J_2(u, v)    ∀ v′ ∈ T′u, v ∈ Tu
    iii) T′u* ∩ Tu* ⊇ {v*} ≠ ∅.

Notice that J_2(u, v) takes one value for given u and any v ∈ Tu, while J_2(u, v′), v′ ∈ T′u, does not necessarily do so. We assume now the following.

Assumption A:

    J_1(u, v′) ≥ J_1(u, v)    for v′ ∈ T′u, v ∈ Tu, u ∈ U*_1    (13)

where U*_1 is a n.b.d. of u* in U. For Assumption A to hold it suffices, for example, that T = T′ on U*_1.¹ We conclude that if Assumption A holds, then u* is a local minimum of the problem

    minimize J_1(u, v)    subject to: u ∈ U, v ∈ T′u    (14)

or equivalently

    minimize J_1(u, v)    subject to: u ∈ U, v ∈ V,
    ẋ = f(x, u, v, t)    (15a)
    -ṗ = M_x + u_x M_u + (f_x + u_x f_u) p    (15b)
    M_v + f_v p = 0    (15c)
    x(t_0) = x_0,    p(t_f) = h_x(x(t_f)).    (15d)

The problem (14) is a nonclassical control problem of the type to be considered in the next section, since the partial derivative of the control u w.r. to x appears in the constraints (15), which play the role of the system differential equations and state-control constraints, with new state (x′, p′)′. Notice that the leader uses only x(t) and t in evaluating u(x(t), t) and not the whole state (x′, p′)′; i.e., the value of u at time t is composed in a partial feedback form with respect to the state (x′, p′)′ (recall the output feedback in contrast to the state feedback control laws). If one were concerned with a Stackelberg game composed of N (>2) hierarchical decision levels [7], [8], then the leader would face a nonclassical control problem where the (N−1)th partial of u with respect to x would appear.

We will assume that the state-control constraint (15c) can be solved for v over the whole domain of interest to give

    v = S(x, p, u, t)    (16)

where S is continuous and in C¹ w.r. to x and p. This assumption holds in many cases, as for example in the linear-quadratic case to be considered in the next section. In any case, direct handling of the constraint (15c) by appending it, or assumption of its solvability in v, does not seem to be the core of the matter from a game point of view. However, the following remark is pertinent here. Assume that we allow v ∈ V̄, where

    V̄ = { v | v: R^n × [t_0, t_f] → R^{m_2}; v(x, t) piecewise continuous in t and Lipschitzian in x, where x ∈ R^n and t ∈ [t_0, t_f] }    (17)

instead of v ∈ V. The assumption of solvability of (15c) will again give

    v(x, t) = S(x, p, u, t).    (18)

Since v(x, t) will be substituted in the rest of (15) with S(x, p, u, t) from (18), the leader will be faced with exactly the same problem as after substituting v̄(t) with S from (16). Therefore, no additional difficulty arises if one allows V̄ instead of V and assumes solvability of (15c). Substituting v from (16) into (15) yields the problem (19), where L̄, F̄_1, F̄_21, F̄_22 stand for the resulting composite functions. Problem (19) is a nonclassical control problem like the one treated in Section II, where (x′, p′)′ is the state of the system.

Besides the procedure described above, which leads to the consideration of the problem (19), there are other cases in which such problems arise. For example, in a control problem where the state x is available, stochastic disturbances are present, and the time interval [t_0, t_f] is very large, synthesis of the control law as a function of x and t is preferable over a synthesis not using x (open loop). In addition, u_x might be penalized in the cost function or be subjected to bounds of the form |u_x(x(t), t)| ≤ K, t ∈ [t_0, t_f], where K > 0 is a constant.

¹See Appendix A.
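The solvability assumption (16) is easy to see in a linear-quadratic setting. In the sketch below, the matrices and the quadratic form of M are hypothetical, chosen by us for illustration: with dynamics linear in v and M containing ½ v′R₂v with R₂ > 0, the stationarity condition (15c) is linear in v, and S can be written down explicitly.

```python
import numpy as np

# Hypothetical linear-quadratic data (names ours, for illustration):
# dynamics f = A x + B1 u + B2 v; follower integrand M contains 0.5 v'R2 v,
# with R2 positive definite, so M_v = R2 v and f_v' p = B2' p.
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])
B2 = np.array([[0.0], [1.0]])
R2 = np.array([[2.0]])

x = np.array([1.0, -1.0])
p = np.array([0.5, 2.0])          # follower's adjoint at some instant

# Condition (15c):  R2 v + B2' p = 0   =>   v = S(x, p, u, t) = -R2^{-1} B2' p
v = -np.linalg.solve(R2, B2.T @ p)
print(v)  # prints [-1.]

# Sanity check: v is a stationary (and, since R2 > 0, minimizing) point
# of the v-dependent Hamiltonian terms 0.5 v'R2 v + p'B2 v.
def ham_v(w):
    return 0.5 * w @ R2 @ w + p @ B2 @ w

for dw in (np.array([0.1]), np.array([-0.1])):
    assert ham_v(v + dw) > ham_v(v)
```

Here S does not depend on x or u at all because the hypothetical M couples v to nothing else; in general S(x, p, u, t) inherits whatever cross-terms M and f contain.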

II. A NONCLASSICAL PROBLEM

The problem is to find u = (u¹, …, u^m)′, u^i: R^{q_i} × [t_0, t_f] → R, i = 1, …, m, such that u^i(h^i(x, t), t) exists and u^i(h^i(x, t), t), u^i_x(h^i(x, t), t) are continuous in x and piecewise continuous in t, for x ∈ R^n, t ∈ [t_0, t_f], i = 1, …, m, so as to minimize J(u). We denote by U_m the set of all such u's.² Therefore, the problem under investigation is

    minimize J(u)    subject to: u ∈ U_m and (20).    (22)

This problem is posed for a fixed time interval [t_0, t_f] and a fixed initial condition x(t_0) = x_0. Therefore, the solution u*, if it exists, will in general be a function of t_0, t_f, x_0, in addition to being a function of h(x, t), t, but we do not show this dependence on t_0, t_f, x_0 explicitly. We use the notation

    u = (u¹, …, u^m)′, an m × 1 vector;    u_x = [u¹_x : ⋯ : u^m_x], an n × m matrix;    f_u = ∂f/∂u, an m × n matrix.

It should be pointed out that the arguments used in classical control theory for showing that, for the fixed initial point case, it is irrelevant for the optimal trajectory and cost whether the control value at time t is composed by using x(t) and t or only t,³ do not apply here in general. If u|_t = u(t), t ∈ [t_0, t_f], then u_x ≡ 0, and this changes the structure of problem (22). Consideration of variations of u_x is also needed, and this was where the previous researchers stopped; see [4]. This problem is successfully treated here. We provide two different ways of doing that, the first of which is based on an extension (Lemma 2.1) of the so-called "fundamental lemma" in the Calculus of Variations (see [12]).

²The restriction h ∈ C² w.r. to x is somewhat strong. For example, the case h(x, t) = x if t_0 ≤ t ≤ t_1, h(x, t) = 0 if t_1 < t ≤ t_f, i.e., the state is available only during a part of [t_0, t_f], is not included. Nonetheless, it can be approximated arbitrarily closely by a C² function, like any function which is only piecewise C². Consequently, from an engineering point of view, h ∈ C² w.r. to x is not a serious restriction.

³This holds if 1) the set of the admissible closed-loop control laws contains the set of the admissible open-loop control laws and 2) if u* is an optimal closed-loop control law generating an optimal trajectory x*(t), then u*(t) ≜ u*(x*(t), t) is an admissible open-loop control law.


The following theorem provides necessary conditions for a function u ∈ U_m to be a solution to the problem (22) in a local sense (we assume that U_m is properly topologized). It is assumed in this theorem that the optimum u* has strong differentiability properties, an assumption which will be relaxed later, in Theorem 2.1. The proof of this theorem is based on the following lemma.

Lemma 2.1: Let M: [t_0, t_f] → R^m, N_i: [t_0, t_f] → R^n, i = 1, …, m, and y: [t_0, t_f] → R^n be continuous functions such that

    ∫_{t_0}^{t_f} [ M(t)′φ(y(t), t) + Σ_{i=1}^m N_i(t)′ ∇_y φ^i(y(t), t) ] dt = 0

for every continuous function φ: R^n × [t_0, t_f] → R^m, φ = (φ¹, …, φ^m)′, where φ is in C¹ w.r. to y. Then M, N_1, …, N_m are identically zero on [t_0, t_f].

Proof of Lemma 2.1: The choice φ_i = (0, …, 0, φ^i, 0, …, 0)′, φ^i: [t_0, t_f] → R, φ^i continuous in t, i = 1, …, m, yields M ≡ 0 on [t_0, t_f]. Since M ≡ 0, the choice φ̃_i = (0, …, 0, φ^i, 0, …, 0)′ with φ^i = y′ψ_i, ψ_i: [t_0, t_f] → R^n continuous in t, results in ∫_{t_0}^{t_f} N_i(t)′ψ_i(t) dt = 0 for every such ψ_i, and thus N_i ≡ 0 on [t_0, t_f] is proven in the same way as M ≡ 0 was proven. ∎

The conclusion of the above lemma holds even if φ^i(x, t) is restricted to be a polynomial in x with nonnegative integer powers, since the polynomials are dense in the space of measurable functions on …

Consider the problem

    minimize J̄(u, v¹, …, v^m) = g(x(t_f)) + ∫_{t_0}^{t_f} L(x, u, v, ∇_x h¹(x, t)v¹, …, ∇_x h^m(x, t)v^m, t) dt

    subject to: ẋ = f(x, u, v, ∇_x h¹(x, t)v¹, …, ∇_x h^m(x, t)v^m, t),    x(t_0) = x_0,    t ∈ [t_0, t_f],
    u ∈ U_m,    v^i ∈ U_1,    i = 1, …, m.    (29)
Clearly, if J̄* and J* are the infima of (22) and (29), respectively, it will be J̄* …

… ≥ 0, i = 1, …, m, (69) then (61) holds again. The idea behind the condition λ₂ = 0 on [t_0, t_f] is that the leader is not really constrained by the follower's adjoint equation, and therefore the leader's problem, being independent of the follower's problem, becomes a team control problem. In conclusion, a necessary condition for the principle of optimality to hold for the Stackelberg games of Section I (and III) is that the leader's problem is actually a team control problem. But for a control problem with fixed initial conditions, the principle of optimality does hold. We thus have the "if and only if" statement: the principle of optimality holds for the problems of Section I (and III) (see (6)-(10), (16) and (42)-(44), respectively) if and only if the leader's problem is a team control problem for both the leader and the follower.

V. CONCLUSIONS

In the present paper, a nonclassical control problem motivated by Stackelberg games was introduced and analyzed. Problems of this type arise in the study of hierarchical systems and take into account several information patterns that might be available to the controllers. Two different approaches were presented. The first uses variational techniques, while the second reduces the nonclassical problem to a classical one. The nonexistence of closed-loop control laws for this problem was shown. The nonuniqueness of the solution of this problem was considered and explained. The results obtained for this nonclassical control problem were used to study a Stackelberg differential game where the players have current state information only ((x(t), t)). Necessary conditions that the optimal strategies must satisfy were derived. The inapplicability of dynamic programming to Stackelberg dynamic

games was explained and discussed. The singular character of the leader's problem was proven, and the nonuniqueness of his strategies was proven and characterized. In particular, it was shown that commitment of the leader to an affine time-varying strategy does not induce any change to the optimal costs and trajectory. A linear-quadratic Stackelberg game was also worked out as a specific application.

We end by outlining certain generalizations of the work presented here. We consider, first, the discrete-time versions. Consider the dynamic system

    x_{k+1} = f(x_k, u¹(h¹(x_k, k), k), …, u^m(h^m(x_k, k), k), u¹_x(h¹(x_k, k), k), …, u^m_x(h^m(x_k, k), k), k),    x_0 given,    k = 0, …, N−1

and the cost

    J(u) = g(x_N) + Σ_{k=0}^{N−1} L(x_k, u¹(h¹(x_k, k), k), …, u^m(h^m(x_k, k), k), u¹_x(h¹(x_k, k), k), …, u^m_x(h^m(x_k, k), k), k).

The proof of the corresponding Theorem 2.1 is straightforward. An immediate consequence is that the restriction

    u^i(h^i(x_k, k), k) = A^i_k h^i(x_k, k) + B^i_k,    i = 1, …, m

where A^i_k, B^i_k are matrices, does not induce any loss of generality as far as the optimal cost and trajectory are concerned [compare to (31)]. Clearly Proposition 2.1 carries over, too. A discrete-time version of the Stackelberg game of Section I can be defined (see [8]) and analyzed similarly to Section III. Several information patterns can be exploited by employing different h^i's (see [8]). The restriction of the leader to affine strategies can also be imposed in the discrete case. The linear-quadratic discrete analog of problem (42)-(44) can also be worked out in a similar way.

The case where higher-order partial derivatives of u w.r. to x appear in (20) and (21) can be treated, and all the analysis of Section II carries over. One should assume higher-order differentiability of the functions involved. Lemma 2.1 can easily be extended to the case where higher orders of partials of φ w.r. to y appear, making the proof of the corresponding Theorem 2.1′ possible. We can also restrict u^i to a polynomial form in terms of the h^i's. The analog of Theorem 2.1 can be easily stated and proven, and Proposition 2.1 also carries over. Finally, an N-level Stackelberg game where on each i-level (i = 1, …, N) n_i followers operate (u^i_1, …, u^i_{n_i}), play Nash (or Pareto) among them, and ū^i_j|_t = u^i_j(h^i_j(x, t), t), j = 1, …, n_i, i = 1, …, N, with given h^i_j and fixed x_0, t_0, t_f, can be easily treated by using the analysis for the nonclassical control problem supplied here.
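The discrete-time system above can be simulated directly. In the minimal sketch below, f, L, and the affine coefficients are hypothetical, chosen by us for illustration; it is scalar (m = 1) with h(x, k) = x, so u_k = A_k x_k + B_k and the slope A_k plays the role of u_x in both the dynamics and the cost:

```python
# Minimal sketch of the discrete-time system above (scalar, m = 1;
# f, L and the affine coefficients a, b are hypothetical).
# With h(x, k) = x, the control is u_k = a_k x_k + b_k and u_x = a_k.

N = 5
a = [0.1 * k for k in range(N)]        # slopes  A_k  (hypothetical)
b = [1.0 for _ in range(N)]            # offsets B_k  (hypothetical)

def f(x, u, ux, k):
    return 0.9 * x + u - 0.2 * ux      # dynamics depend on u_x, as in (20)

def L(x, u, ux, k):
    return x * x + u * u + ux * ux     # stage cost depends on u_x, as in (21)

x, J = 1.0, 0.0
for k in range(N):
    u, ux = a[k] * x + b[k], a[k]      # affine strategy and its slope
    J += L(x, u, ux, k)
    x = f(x, u, ux, k)
J += x * x                             # terminal cost g(x_N) = x_N^2
print(J)
```

Re-running with all a_k = 0 reproduces an open-loop strategy (u_x ≡ 0), which changes both the trajectory and the cost; this is the discrete analog of the observation that u_x cannot simply be suppressed in problems of the class (20)-(22).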

APPENDIX A

In this Appendix we give certain conditions under which Assumption A (Section I) holds.

Lemma A.1: Let U_1 be a subset of U [see (6)], defined as

    U_1 = { u ∈ U | u(x, t) = C(t)x + D(t), where the m_1 × n matrix C(t) and the m_1 × 1 vector D(t) are piecewise continuous functions of time over [t_0, t_f] }.    (A-1)

Then it holds:

    inf[ J_1(u, v); u ∈ U_1, v ∈ Tu ] ≥ inf[ J_1(u, v); u ∈ U, v ∈ Tu ]
        ≥ inf[ J_1(u, v); u ∈ U, v ∈ T′u ]
        = inf[ J_1(u, v); u ∈ U_1, v ∈ T′u ].    (A-2)

Proof: The inequalities follow from the facts U_1 ⊆ U and Tu ⊆ T′u ∀ u ∈ U. The last equality is obvious in the light of (31) and the proof of Theorem 2.1. ∎

An immediate conclusion of Lemma A.1 is that if

    inf[ J_1(u, v); u ∈ U_1, v ∈ Tu ] = inf[ J_1(u, v); u ∈ U_1, v ∈ T′u ]    (A-3)

holds, then Assumption A holds (with U*_1 = U). For (A-3) to hold, it suffices that the first-order necessary conditions for the follower's problem are also sufficient, for each fixed u ∈ U_1. More specifically, for fixed C(t), D(t) as in definition (A-1), we consider the problem

    minimize h(x(t_f)) + ∫_{t_0}^{t_f} M(x, C(t)x + D(t), v, t) dt
    subject to: v ∈ V,
    ẋ = f(x, C(t)x + D(t), v, t),    x(t_0) = x_0,    t ∈ [t_0, t_f]    (A-4)

and seek conditions under which the first-order necessary conditions for an optimal v* for problem (A-4) [see (15b)-(15d)] are also sufficient. Such conditions can be found in [15, ch. 5-2]. We formalize this discussion in the following proposition.

Proposition A.1: If for each u ∈ U_1 the first-order necessary conditions (15b)-(15d) for problem (A-4) are also sufficient, then Assumption A holds.

The discussion in the present Appendix generalizes clearly to the case where each u^i depends on h^i(x, t) instead of x, and to the case where different U_1's are considered; see for example Proposition 2.1 ii). As an example where Proposition A.1 can be applied, we consider the linear-quadratic game of Section III. Then [15, Theorem 5, p. 341] in conjunction with Proposition A.1 yields that if Q_2 > 0, R_2 > 0, R_21 > 0, K_2f > 0, then Assumption A holds.

APPENDIX B

Proof of Theorem 2.1′: Let g ≡ 0 w.l.o.g. (see [10]). Consider a function φ ∈ U_m, φ = (φ¹, …, φ^m), which has the same continuity and differentiability properties as u*. Such a φ will be called admissible. Using the known theorems on the dependence of solutions of differential equations on parameters, we conclude that for ε ∈ R, ε sufficiently small, u* + εφ gives rise to a trajectory {(x(ε, t), t) | t ∈ [t_0, t_f]}, x(0, t) = x*(t), and that x(ε, t) is in C¹ w.r. to ε. Direct calculation yields the variational equation satisfied by z(t) = ∂x(ε, t)/∂ε:

    …    (B-1)

We set

    A(t) = f_x + u_x f_u + …    (B-3)
    B_1(t) = f_u    (B-4)
    B̄_i(t) = f_{u^i_x} ∇_x h^i,    i = 1, …, m    (B-5)

where A, B_1, B̄_i are evaluated at t, x*, u*, u*_x, and thus, for ε = 0, (B-1) can be written as

    ż = Az + B_1 φ + Σ_{i=1}^m B̄_i φ^i_x,    z(t_0) = 0.    (B-6)

For fixed φ we consider

    J(ε) = J(u* + εφ).

Since J(ε) is in C¹ w.r. to ε and u* is a local optimum, it must hold that

    dJ(ε)/dε |_{ε=0} = 0.

Direct calculation yields

    …    (B-7)

Setting

    Γ(t) = L_x + u_x L_u + …    (B-8)
    Λ_1(t) = L_u    (B-9)
    Λ̄_i(t) = L_{u^i_x} ∇_x h^i,    i = 1, …, m    (B-10)

with Γ, Λ_1, Λ̄_i evaluated at x*, u*, u*_x, we conclude from (B-7)-(B-10) that

    ∫_{t_0}^{t_f} [ Γz + Λ_1 φ + Σ_{i=1}^m Λ̄_i φ^i_x ] dt = 0.    (B-11)

Therefore, (B-11) must hold for every admissible φ. Let Φ(t, τ) be the transition matrix of A(t). Let also φ̂(t) denote the vector (φ¹(h¹(x*(t), t), t), …, φ^m(h^m(x*(t), t), t))′ and φ̂^i_x(t) the vector ∂φ^i(h^i(x*(t), t), t)/∂x. Then from (B-6) we obtain

    z(t) = ∫_{t_0}^{t} Φ(t, τ) [ B_1(τ)φ̂(τ) + Σ_{i=1}^m B̄_i(τ)φ̂^i_x(τ) ] dτ,    t ∈ [t_0, t_f]    (B-12)

and substituting in (B-11) we obtain

    …    (B-13)

Let χ_{[a,b]} denote the indicator function of [a, b] ⊆ [t_0, t_f]. We can interchange the order of integration in (B-13), since the integrated quantities are bounded on [t_0, t_f] × [t_0, t_f] (Fubini's theorem). Using the fact χ_{[t_0,t]}(τ) = χ_{[τ,t_f]}(t), we have successively

    p′(τ) = ∫_τ^{t_f} Γ(t)Φ(t, τ) dt    (B-15)

    ∫_{t_0}^{t_f} [ [p′(τ)B_1(τ) + Λ_1(τ)]φ̂(τ) + Σ_{i=1}^m [p′(τ)B̄_i(τ) + Λ̄_i(τ)]φ̂^i_x(τ) ] dτ = 0.    (B-16)

Applying Lemma 2.1 to (B-16), we obtain

    p′(τ)B_1(τ) + Λ_1(τ) = 0    on [t_0, t_f]    (B-17)
    p′(τ)B̄_i(τ) + Λ̄_i(τ) = 0    on [t_0, t_f],    i = 1, …, m.    (B-18)

Using (B-4), (B-5) and (B-9), (B-10) in (B-17), (B-18), we have equivalently (25) and (26). Differentiation of (B-15) and use of (B-3) and (B-8) give the equivalent to (B-15)

    -ṗ = …,    p(t_f) = 0.

The assumption g ≡ 0 is removed in the known way, resulting in (27). ∎

REFERENCES

[1] H. von Stackelberg, The Theory of the Market Economy. Oxford, England: Oxford Univ. Press, 1952.
[2] C. I. Chen and J. B. Cruz, Jr., "Stackelberg solution for two-person games with biased information patterns," IEEE Trans. Automat. Contr., vol. AC-17, pp. 791-798, Dec. 1972.
[3] M. Simaan and J. B. Cruz, Jr., "On the Stackelberg strategy in nonzero-sum games," J. Optimiz. Theory Appl., vol. 11, pp. 533-555, May 1973.
[4] M. Simaan and J. B. Cruz, Jr., "Additional aspects of the Stackelberg strategy in nonzero-sum games," J. Optimiz. Theory Appl., vol. 11, pp. 613-626, June 1973.
[5] D. Castanon and M. Athans, "On stochastic dynamic Stackelberg strategies," Automatica, vol. 12, pp. 177-183, 1976.
[6] J. V. Medanic, "Closed-loop Stackelberg strategies in linear-quadratic problems," in Proc. 1977 JACC, San Francisco, CA, June 1977, pp. 1324-1329.
[7] B. F. Gardner, Jr. and J. B. Cruz, Jr., "Feedback Stackelberg strategies for M-level hierarchical games," IEEE Trans. Automat. Contr., vol. AC-23, pp. 489-491, June 1978.
[8] J. B. Cruz, Jr., "Leader-follower strategies for multilevel systems," IEEE Trans. Automat. Contr., vol. AC-23, pp. 244-255, Apr. 1978.
[9] D. J. Bell and D. H. Jacobson, Singular Optimal Control Problems. New York: Academic, 1975.
[10] L. D. Berkovitz, Optimal Control Theory. New York: Springer-Verlag, 1974.
[11] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control. New York: Springer-Verlag, 1975.
[12] G. A. Bliss, Lectures on the Calculus of Variations. Chicago, IL: Univ. of Chicago Press, 1946.
[13] Y. C. Ho and S. K. Mitter, Eds., Directions in Large-Scale Systems. New York: Plenum, 1976.
[14] G. Guardabassi and A. Locatelli, Eds., "Large-scale systems, theory and applications," in Proc. IFAC Symp., Udine, Italy, June 1976.
[15] E. B. Lee and L. Markus, Foundations of Optimal Control Theory. New York: Wiley, 1967.

George P. Papavassilopoulos (S'79) was born in Athens, Greece, on August 29, 1952. He received the Diploma in mechanical and electrical engineering from the National Technical University of Athens, Greece, in 1975 and the M.S.E.E. from the University of Illinois, Urbana-Champaign, in 1977. He is currently completing the Ph.D. degree in electrical engineering at the University of Illinois. Since 1975 he has been a Research Assistant at the Coordinated Science Laboratory, University of Illinois. His current research interests are in control of large scale systems, differential games, and algorithms. Mr. Papavassilopoulos is a member of the Technical Chamber of Greece.

Jose B. Cruz, Jr. (S'56-M'57-SM'61-F'68) was born in the Philippines in 1932. He received the B.S.E.E. degree (summa cum laude) from the University of the Philippines, Diliman, in 1953, the S.M. degree from the Massachusetts Institute of Technology, Cambridge, in 1956, and the Ph.D. degree from the University of Illinois, Urbana, in 1959, all in electrical engineering. From 1953 to 1954 he taught at the University of the Philippines. He was a Research Assistant in the M.I.T. Research Laboratory of Electronics, Cambridge, from 1954 to 1956. Since 1956 he has been with the Department of Electrical Engineering, University of Illinois, where he was an Instructor until 1959, an Assistant Professor from 1959 to 1961, an Associate Professor from 1961 to 1965, and Professor since 1965. Also, he is currently a Research Professor at the Coordinated Science Laboratory, University of Illinois, where he is Director of the Decision and Control Laboratory. In 1964 he was a Visiting Associate Professor at the University of California, Berkeley, and in 1967 he was an Associate Member of the Center for Advanced Studies, University of Illinois. In the Fall of 1973 he was a Visiting Professor at M.I.T. and at Harvard University. His areas of research are multiperson control of multiple goal systems, decentralized control of large scale systems, sensitivity analysis, and stochastic control of systems with uncertain parameters. He is the coauthor of three books, editor and coauthor of two books, and the author or coauthor of 120 papers. He is the President for 1979 of the IEEE Control Systems Society. He previously served the Society as a member of the Administrative Committee, Chairman of the Linear Systems Committee, Chairman of the Awards Committee, Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, member of the Information Dissemination Committee, Chairman of the Finance Committee, General Chairman of the 1975 IEEE Conference on Decision and Control, and Vice President for Financial and Administrative Activities. He served the IEEE Circuits and Systems Society as Associate Editor for Systems in 1962-1964. At the Institute level, he served as a member of the IEEE Fellow Committee, member of the IEEE Educational Activities Board, and Chairman of a committee for the revision of the IEEE Guidelines for ECPD Accreditation of Electrical Engineering Curricula in the United States. Presently he is a member of the Meetings Committee of the IEEE Technical Activities Board and a member of the IEEE Education Medal Committee. He is also an Associate Editor of the Journal of the Franklin Institute. In 1972 he received the Curtis W. McGraw Research Award of the American Society for Engineering Education. In 1968 he was elected a Fellow of the IEEE for "significant contributions in circuit theory and the sensitivity analysis of control systems." He has been elected to membership in Phi Kappa Phi, Sigma Xi, and Eta Kappa Nu. He is listed in American Men and Women of Science, Who's Who in America, and Who's Who in Engineering. He is a Registered Professional Engineer in Illinois.

Closed-Loop Stackelberg Strategies with Applications in the Optimal Control of Multilevel Systems

T. BAŞAR AND H. SELBUZ

Abstract—This paper develops a new approach to obtain the closed-loop Stackelberg (CLS) solution of an important class of two-person nonzero-sum dynamic games characterized by linear state dynamics and quadratic cost functionals. The new technique makes use of an important property of nonunique representations of a closed-loop strategy, and it relates the CLS solution to a particular representation of the optimal solution of a team problem. It is shown that, under certain conditions, the CLS strategies for the leader are linear and of the one-step memory type, while those of the follower can be realized in linear feedback form. Exact expressions are given for the optimal coefficient matrices involved, which can be determined recursively. These results are then extended to multilevel discrete-time control of linear-quadratic systems which are characterized by one central controller and K second-level controllers. Conditions are obtained under which a one-step memory strategy of the central controller forces the other controllers to a team-optimal solution, while each one of the K second-level controllers is in fact minimizing its own cost function.

Manuscript received March 8, 1978; revised November 16, 1978. Paper recommended by D. Šiljak, Chairman of the Large Scale Systems, Differential Games Committee. T. Başar is with the Applied Mathematics Division, Marmara Research Institute, Gebze-Kocaeli, Turkey, on leave at the Department of Applied Mathematics, Twente University of Technology, Enschede, The Netherlands. H. Selbuz is with the Applied Mathematics Division, Marmara Research Institute, Gebze-Kocaeli, Turkey.

I. INTRODUCTION

THE Stackelberg solution concept, first introduced in economics in the 1930's within the context of static economic competition [1], has entered the control literature through the works of Chen, Cruz, and Simaan ([2], [3], [4]), who have utilized its dynamic version within the context of hierarchical decision making. The dynamic Stackelberg solution is mostly appropriate in nonzero-sum two-person dynamic (differential) games when one of the players (the leader) has the ability (or enough power) to enforce his strategy on the other player (the follower). Hence, as opposed to the Nash equilibrium solution concept, the roles of the players are not symmetric in this case. In deterministic differential games, when the players have access to only open-loop information, it is relatively simpler to obtain the necessary conditions that Stackelberg strategies should satisfy, and it is even possible to obtain the optimal strategies explicitly in the case of linear-quadratic (LQ) problems [2], [5]. However, in a

0018-9286/79/0400-0166$00.75 © 1979 IEEE
