On bilevel programming, Part I: General nonlinear cases


Mathematical Programming 70 (1995) 47-72

On bilevel programming, Part I: general nonlinear cases ¹

James E. Falk *, Jiming Liu

Department of Operations Research, School of Engineering and Applied Science, The George Washington University, Washington, DC 20052, USA

Received 8 June 1993; revised manuscript received 15 August 1994

* Corresponding author, e-mail: [email protected].
¹ This research is sponsored by the Office of Naval Research under contract N00014-89-J-1537.
0025-5610 © 1995 - The Mathematical Programming Society, Inc. All rights reserved

Abstract

This paper is concerned with general nonlinear nonconvex bilevel programming problems (BLPP). We derive necessary and sufficient conditions at a local solution and investigate stability and sensitivity analysis at a local solution of the BLPP. We then explore an approach in which a bundle method is used in the upper-level problem with subgradient information from the lower-level problem. Two algorithms are proposed to solve the general nonlinear BLPP and are shown to converge to regular points of the BLPP under appropriate conditions. The theoretical analysis conducted in this paper seems to indicate that a sensitivity-based approach is rather promising for solving the general nonlinear BLPP.

Keywords: Bilevel programming; Nonlinear nonconvex; Nondifferentiable optimization; Economic planning; Sensitivity analysis

1. Introduction

This paper is concerned with the general nonlinear nonconvex bilevel programming problem (BLPP), which consists of two parts, an upper and a lower part. The upper-level problem is defined as

$$\min_y \; F(x(y), y), \tag{1}$$

where (the set-valued function) $x(y)$ is implicitly defined by the lower-level problem

$$\min_x \; f(x, y), \quad \text{s.t.} \;\; (x, y) \in S, \tag{2}$$

where the feasible set $S$ is defined by $S = \{(x, y) : g(x, y) \leq 0\}$.

[...]

... if D-detector$(y) > 0$, then $x(\cdot)$ is twice continuously differentiable near $y$. Thus the function D-detector$(y)$ should be a good candidate as a detector for the twice differentiability of $x(y)$ at $y$. To make the best use of the global stability of bundle methods and the local fast convergence of Newton-type methods, we propose the following adaptive method for solving the BLPP, based on the idea of the D-detector. In this paper we concentrate on quasi-Newton methods, since this class of Newton-type methods is popular in practice, although second-order information could also be utilized; see Remark 2.5. The basic idea is to embed a quasi-Newton method with line search into a bundle method whenever this is justified. To simplify our presentation, we shall first specify some additional subroutines, and then describe the main steps of the algorithm.
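Throughout, the upper-level method works with the implicit objective $PF(y) = F(x(y), y)$. To make this structure concrete, the following minimal Python sketch evaluates $PF$ by solving the lower-level problem (2) numerically for each fixed $y$; the particular functions F, f, g and the use of scipy.optimize.minimize are illustrative assumptions, not the paper's data.

import numpy as np
from scipy.optimize import minimize

# Illustrative problem data (assumptions): any smooth F, f, g with
# S = {(x, y): g(x, y) <= 0} would serve the same purpose.
def F(x, y): return np.sum(y**2) + np.sum(x**2)          # upper-level objective
def f(x, y): return np.sum((x - y)**2)                   # lower-level objective
def g(x, y): return x - 2.0                              # componentwise x_i <= 2

def lower_level(y, x0):
    # Solve min_x f(x, y) s.t. g(x, y) <= 0 for fixed y, i.e. compute x(y).
    cons = {"type": "ineq", "fun": lambda x: -g(x, y)}   # scipy expects fun >= 0
    return minimize(lambda x: f(x, y), x0, constraints=cons).x

def PF(y, x0=np.zeros(2)):
    # Implicit upper-level objective PF(y) = F(x(y), y).
    return F(lower_level(y, x0), y)

print(PF(np.array([1.0, 3.0])))                          # 15.0: here x(y) = (1, 2)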

The usual line-search procedure for differentiable unconstrained optimization will be adapted to our problem. The resulting line search is a finite procedure for semismooth functions; see [22].
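The D-detector itself is defined in an earlier section of the paper. As a hedged illustration only, the sketch below assumes it returns the strict-complementarity margin at a KKT point of (2); this quantity is positive exactly when SCS holds, which is consistent with the behavior described above, but the assumed form is not the paper's definition.

import numpy as np

def d_detector(u, g_val):
    # Assumed form of D-detector (an illustration, not the paper's definition):
    # u     -- KKT multipliers of the inequality constraints (u >= 0),
    # g_val -- constraint values g(x, y) at the lower-level solution (<= 0).
    # Under complementarity, one of u_i and -g_i is zero for each i; SCS asks
    # that the other be positive, so the margin below is positive iff SCS holds.
    return float(np.min(np.maximum(u, -g_val)))

# Example: constraint 1 is active with a positive multiplier, constraint 2 is
# inactive, so SCS holds and the margin clears a small threshold.
print(d_detector(np.array([0.7, 0.0]), np.array([0.0, -1.2])) >= 1e-6)  # True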

Deri. This routine computes the derivative of the function $PF$ in the case where D-detector$(y) > 0$. Given input $(x, u, v, y)$, Deri first calculates $\nabla_y x(\cdot)$ at $y$ (see Proposition 2.4), and then computes and outputs

$$\nabla_y PF(y) = \frac{\partial F}{\partial y} + \frac{\partial F}{\partial x} \nabla_y x(y).$$

Line-search-D. An adapted version of the usual (Wolfe-Powell) line search for differentiable unconstrained optimization, used when D-detector$(y) > 0$. Let $\beta, \rho$ be given scalars with $\rho \in (0, \frac{1}{2})$, $\beta \in (\rho, 1)$. Given $y$, a direction $d$ and the directional derivative $DPF(y; d)$, find $\lambda$ (by a golden section search or an interpolation-extrapolation procedure) such that

$$PF(y + \lambda d) \leq PF(y) + \rho \lambda \, DPF(y; d), \qquad DPF(y + \lambda d; d) \geq \beta \, DPF(y; d).$$

Always try $\lambda = 1$ first. Update $y = y + \lambda d$.
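A minimal sketch of Line-search-D follows. The bracketing step doubles or bisects $\lambda$ rather than using the golden-section or interpolation-extrapolation procedure mentioned above, and PF and DPF are assumed to be supplied as callables.

def line_search_d(PF, DPF, y, d, rho=0.25, beta=0.75, max_iter=50):
    # Wolfe-Powell line search along a descent direction d (DPF(y, d) < 0).
    # Returns a step lam with
    #   PF(y + lam*d) <= PF(y) + rho*lam*DPF(y, d)   (sufficient decrease)
    #   DPF(y + lam*d, d) >= beta*DPF(y, d)          (curvature condition)
    # Requires rho in (0, 1/2) and beta in (rho, 1).
    f0, g0 = PF(y), DPF(y, d)
    lo, hi = 0.0, float("inf")
    lam = 1.0                                  # always try the unit step first
    for _ in range(max_iter):
        if PF(y + lam * d) > f0 + rho * lam * g0:
            hi = lam                           # sufficient decrease fails: shrink
        elif DPF(y + lam * d, d) < beta * g0:
            lo = lam                           # curvature fails: lengthen
        else:
            return lam                         # both Wolfe conditions hold
        lam = 2.0 * lam if hi == float("inf") else 0.5 * (lo + hi)
    return lam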

BFGS. The BFGS updating formula. Specifically, on input $(H_{k-1}, \delta_k, \gamma_k)$, where $H_{k-1}$ is a matrix that is the previous approximation of the inverse of the Hessian matrix and $\delta_k, \gamma_k$ are vectors, it outputs $H_k$ as follows:

$$H_k = \left(I - \frac{\delta_k \gamma_k^{\mathsf T}}{\gamma_k^{\mathsf T} \delta_k}\right) H_{k-1} \left(I - \frac{\gamma_k \delta_k^{\mathsf T}}{\gamma_k^{\mathsf T} \delta_k}\right) + \frac{\delta_k \delta_k^{\mathsf T}}{\gamma_k^{\mathsf T} \delta_k}.$$
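In code, the update reads as follows. The curvature safeguard is an added precaution, not part of the paper's statement of the formula.

import numpy as np

def bfgs(H_prev, delta, gamma):
    # BFGS update of the inverse-Hessian approximation. The output satisfies
    # the secant condition H_k @ gamma = delta.
    dg = float(delta @ gamma)
    if dg <= 0.0:              # safeguard (an assumption): skip the update
        return H_prev          # when the curvature condition fails
    I = np.eye(len(delta))
    V = I - np.outer(delta, gamma) / dg
    return V @ H_prev @ V.T + np.outer(delta, delta) / dg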

We describe the algorithm in pseudo-language as follows.

Adaptive Leader Predominate Algorithm

Main step 0. Initialization:
(i) Choose a starting semi-local solution $(x^1, y^1)$ and parameters $T > 0$, $0 < m_1 < m_2 < 1$, $0 < m_3 < 1$, $u > 0$, $\varepsilon \geq 0$, and an upper bound $J_{\max} \geq 3$ for $|J_k|$. Compute $PF(y^1)$ and $\xi^1 \in \partial PF(y^1)$, put $w^1 := y^1$, $J_1 := \{1\}$, and set I-old = I-cur = $I(x^1, y^1)$, D-status = 0 and $k := 1$;
(ii) Choose scalars $\beta, \rho$ for the quasi-Newton line search with $\rho \in (0, \frac{1}{2})$, $\beta \in (\rho, 1)$; choose a positive scalar called D-threshold and arbitrary symmetric, positive definite matrices $H_0^I$ for each possible index set $I \subset \{1, \ldots, p\}$.

Main step 1. Inner Iteration:
L1: calculate D-detector$(y^k)$; let $I$ = I-cur;
L2: if D-detector$(y^k) \geq$ D-threshold and I-old = $I$, then
L3:   if D-status = 1, then
L4:     let $\delta_k = y^k - y^{k-1}$; $\gamma_k = \nabla_y PF(y^k) - \nabla_y PF(y^{k-1})$; $H_I$ = BFGS$(H_I, \delta_k, \gamma_k)$;
L5:   endif;
L6:   compute $d^k = -H_I \nabla_y PF(y^k)$;
L7:   perform a line search using Line-search-D with inputs $(x^k, y^k)$, $d^k$, $\nabla_y PF(y^k)$ and $\nabla_y x(y^k)$; obtain output $(x^{k+1}, y^{k+1})$; set I-old = I-cur; I-cur = $I(x^{k+1}, y^{k+1})$;
L8:   if $PF$ is differentiable at $y^{k+1}$, then
L9:     calculate $\nabla_y PF(y^{k+1})$ using Deri;
L10:    if $\|\nabla_y PF(y^{k+1})\| \leq \varepsilon$, then [...]

[...] but the twice differentiability of $x(\cdot)$ may fail to hold outside a small neighborhood of $y^k$; or if I-old $\neq$ I-cur, then the current point and the last point are not on the same piece, and the second-order derivatives may in general be discontinuous somewhere in between. Therefore, in either case it seems reasonable to use the bundle method (lines 19-24). Note that the use of the quasi-Newton method in our algorithm is conservative, i.e., it occurs only when the last two iterations lie on the same piece.

Lines 3-6 compute a direction by a quasi-Newton method. Line 3 checks the value of the variable D-status to determine the differentiability at the last point. If it holds, the matrix $H_I$ stores information about the inverse of the Hessian on the current piece, built up from previous iterations, and line 4 uses the BFGS updating formula to obtain a new matrix. Line 6 computes a direction. Line 7 performs a line search, obtains a new estimate, and updates the trace of the pieces. Line 8 tests the differentiability of $PF$ at $y^{k+1}$. If it is differentiable, line 9 calculates the gradient and line 10 tests the optimality conditions. If the optimality conditions are satisfied, stop; otherwise line 13 sets D-status = 1 to keep track of the differentiability status. If $PF$ is not differentiable, line 16 sets D-status = 0, calculates a subgradient, and exits the inner iteration. Lines 19-24 perform the inner iteration of a bundle algorithm. Line 20 checks the optimality conditions and line 23 records that a bundle method was used.

It is expected that the adaptive leader predominate algorithm has the global convergence property in the setting of Theorem 4.2, since it was designed intentionally to combine the advantages of a bundle method and of a quasi-Newton method with line search. Furthermore, if problem (2) happens to satisfy the nondegeneracy conditions stated in Proposition 2.4 at the limit of the optimizing sequence, the adaptive leader predominate algorithm will evidently use the quasi-Newton method in the final stages of the iterative process, and the local convergence rate will then be q-superlinear.
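The inner iteration of lines 1-7, together with the switch to the bundle branch, can be summarized in the structural sketch below, using the bfgs and line_search_d routines sketched earlier. It compresses the differentiability bookkeeping of lines 8-16 into D-status; bundle_step and active_set stand in for the paper's bundle inner iteration and the index-set map $I(x, y)$, and these names, like the state dictionary (initialized with H equal to the identity per piece, mirroring $H_0^I$), are assumptions.

def adaptive_step(y_prev, y, state, PF, grad_PF, subgrad_PF,
                  d_detector, active_set, bundle_step, D_THRESHOLD=1e-6):
    # One pass of the inner iteration. state carries the per-piece
    # inverse-Hessian estimate H, the index set I_old, and D_status.
    I_cur = active_set(y)                        # trace of the current piece
    if d_detector(y) >= D_THRESHOLD and state["I_old"] == I_cur:
        if state["D_status"] == 1:               # last point smooth too (line 3):
            delta, gamma = y - y_prev, grad_PF(y) - grad_PF(y_prev)
            state["H"] = bfgs(state["H"], delta, gamma)       # line 4
        d = -state["H"] @ grad_PF(y)             # quasi-Newton direction (line 6)
        lam = line_search_d(PF, lambda z, v: grad_PF(z) @ v, y, d)
        y_new = y + lam * d                      # line 7
        state["D_status"] = 1
    else:                                        # lines 19-24: bundle branch
        y_new = bundle_step(y, subgrad_PF(y))
        state["D_status"] = 0
    state["I_old"] = I_cur                       # update the trace of pieces
    return y_new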


Theorem 4.3. In the setting of Theorem 4.2, let $\{(x^k, y^k)\}$ be the sequence generated by the adaptive leader predominate algorithm from an arbitrary starting semi-local solution $(x^1, y^1)$. Then the sequence $\{(x^k, y^k)\}$ has at least one accumulation point, say $(x^*, y^*)$.
(a) If there are no accumulation points at which SCS holds, then $(x^*, y^*)$ is a regular point of the first category for the BLPP.
(b) If SCS holds at $x^*$ for (2) with $y = y^*$ and $\nabla_y^2 PF(y^*)$ is positive definite, then $(x^*, y^*)$ is a regular point of the second category for the BLPP and $\{(x^k, y^k)\}$ converges q-superlinearly to $(x^*, y^*)$, i.e.,

$$\lim_{k \to \infty} \frac{\|(x^{k+1}, y^{k+1}) - (x^*, y^*)\|_2}{\|(x^k, y^k) - (x^*, y^*)\|_2} = 0. \tag{19}$$

Proof. As we saw before, the adaptive leader predominate algorithm is a hybrid method which adaptively embeds the quasi-Newton method, based on the BFGS updating formula with line search, into a bundle method. Since the damping part of the quasi-Newton method used here is essentially consistent with the line-search procedure for bundle methods (see [22]), the adaptive leader predominate algorithm is a descent method. By the assumption that the level set $\{(x, y) : F(x, y) \leq \ldots\}$ [...] there exists $K > 0$ such that for all iterations with $k \geq K$ the adaptive leader predominate algorithm actually becomes the quasi-Newton method with line search. Thus, from the well-known result of Powell concerning the global convergence of the quasi-Newton method with line search (see [16] for details), we conclude that $\{y^k\}$ is q-superlinearly convergent to $y^*$ in the sense that

$$\lim_{k \to \infty} \frac{\|y^{k+1} - y^*\|_2}{\|y^k - y^*\|_2} = 0. \tag{20}$$

Since $x(y)$ is locally Lipschitzian near $y^*$, say with modulus $L$, we have $\|(x^{k+1}, y^{k+1}) - (x^*, y^*)\|_2 \leq (L + 1)\|y^{k+1} - y^*\|_2$ while $\|(x^k, y^k) - (x^*, y^*)\|_2 \geq \|y^k - y^*\|_2$, so (20) implies (19). It is also easy to see that in this case $(x^*, y^*)$ is a regular point of the second category for the BLPP. □

Remark 4.4. Careful readers may notice that there is another logical possibility besides cases (a) and (b) in Theorem 4.3, namely the case where the sequence $\{(x^k, y^k)\}$ has an accumulation point $(x^*, y^*)$ such that SCS holds at $x^*$ for (2) with $y = y^*$ but $\nabla_y^2 PF(y^*)$ is not positive definite. The stationarity of $(x^*, y^*)$ in this case is not clear to the authors and needs to be investigated further.


In closing this section, we make two remarks on the leader predominate and adaptive leader predominate algorithms. First, note that in Propositions 2.4 and 2.7 the calculation of the derivative information explicitly involves the exact KKT triple of (2) for some fixed $y$. In reality, however, it is often impossible to obtain the exact solution. Recently, Fiacco and Liu [14,15] developed a technique to approximate the sensitivity information by way of general iterative methods. The results obtained there show that the derivative information from the lower-level problem (2) can actually be computed from the information generated in the normal course of solving (2). Second, Felgenhauer [11] recently showed that the quasi-Newton method based on the BFGS updating formula with line search is stable in the sense that the gradients involved need not be computed exactly. The conclusions obtained in [11] indicate that the adaptive leader predominate algorithm is a stable algorithm.

5. Examples

In this section we present two examples to explain how our algorithms work. The first example is from [8]. The upper-level problem is

$$\min_y \; y_1^2 - 2y_1 + y_2^2 - 2y_2 + x_1(y)^2 + x_2(y)^2,$$

where $x(y)$ is defined by the lower-level problem

$$\min_x \; (x_1 - y_1)^2 + (x_2 - y_2)^2, \qquad \text{s.t.} \;\; 0.5 \leq x_i \ldots$$
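As an illustration, since the lower-level problem minimizes the distance to $y$ over a box, $x(y)$ and hence $PF(y)$ can be evaluated in closed form. The upper bound of the box is truncated in the source, so the value 1.5 below is a hypothetical placeholder.

import numpy as np

UPPER = 1.5   # hypothetical: the upper bound of the box is cut off in the text

def x_of_y(y):
    # The lower level minimizes ||x - y||^2 over the box, so x(y) is simply
    # the projection of y onto [0.5, UPPER]^2.
    return np.clip(y, 0.5, UPPER)

def PF(y):
    # Implicit upper-level objective of the first example.
    x = x_of_y(y)
    return y[0]**2 - 2*y[0] + y[1]**2 - 2*y[1] + x[0]**2 + x[1]**2

# PF is piecewise smooth; the pieces correspond to which box faces are active.
print(PF(np.array([1.0, 1.0])))   # interior piece: x(y) = y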