Two complexity results on c-optimality in experimental design


Comput Optim Appl (2012) 51:1397–1408 DOI 10.1007/s10589-010-9377-8

Michal Černý · Milan Hladík

Received: 11 April 2010 / Published online: 25 November 2010 © Springer Science+Business Media, LLC 2010

Abstract Finding a c-optimal design of a regression model is a basic optimization problem in statistics. We study the computational complexity of the problem in the case of a finite experimental domain. We formulate a decision version of the problem and prove its NP-completeness. We provide examples of computationally complex instances of the design problem, motivated by cryptography. The problem, being NP-complete, is then relaxed; we prove that a decision version of the relaxation, called approximate c-optimality, is P-complete. We derive an equivalence theorem for linear programming: we show that the relaxed c-optimality is equivalent (in the sense of many-one LOGSPACE-reducibility) to general linear programming.

Keywords Linear programming · Optimal design · NP-completeness · P-completeness

M. Černý, Department of Econometrics, University of Economics, Prague, Winston Churchill Square 4, 130 00 Prague, Czech Republic, e-mail: [email protected]

M. Hladík, Faculty of Mathematics and Physics, Department of Applied Mathematics, Charles University, Prague, Malostranské nám. 25, Prague, Czech Republic, e-mail: [email protected]

1 Introduction

The linear regression model E(y) = Xβ describes the response of the dependent variable y as a linear function of the regressors. The matrix X is called the design matrix of the regression model. The vector of regression parameters β = (β1, ..., βm)ᵀ is unknown and is to be estimated from the observed data (X, y). Under traditional

assumptions on the error terms, the ordinary least squares (OLS) estimator

  β̂ = (β̂1, ..., β̂m)ᵀ = (XᵀX)⁻¹Xᵀy

of β has desirable statistical properties: it is unbiased and, in the class of unbiased linear estimators, it is the best estimator with respect to variance.

If we can choose the values of the regressors, i.e. if we can control the design matrix X, then there is a naturally associated optimization problem: how shall the design matrix be chosen so that the estimate β̂ of the unknown value of β is "as good as possible"? Usually there exist restrictions on the choice of the design matrix:

(i) A set X ⊆ Rᵐ, called the experimental domain, is given and each row xᵀ of X has to fulfill x ∈ X.
(ii) The number of rows is (or may be) restricted to a number N; this corresponds to the situation that our budget is limited to making N observations only.

The statement "to choose X so that the estimator β̂ is as good as possible" may be formalized in different ways and there is a rich theory on this issue (Atkinson et al. [2]; Pázman [9]). Recall that the quality of an unbiased estimator is usually measured in terms of its variance. The covariance matrix of the OLS estimator β̂ is given by σ²(XᵀX)⁻¹, where σ² is the variance of the dependent variable. The covariance matrix defines a so-called confidence ellipse of the form

  E := {b : (b − β̂)ᵀXᵀX(b − β̂) ≤ κ};

with an appropriate choice of the scaling constant κ, the ellipse has the property that β ∈ E with a given probability.

Observe that var(β̂i), where var stands for variance, is proportional to the i-th diagonal entry of (XᵀX)⁻¹. Minimizing the variances of the estimators of all regression parameters simultaneously is a multi-objective optimization problem. There are different ways to convert it into a single-objective problem.
For instance:

– D-optimality is the minimization of det(XᵀX)⁻¹; it corresponds to minimizing the volume of E;
– E-optimality is the minimization of the maximal eigenvalue of (XᵀX)⁻¹; it corresponds to minimizing the length of the longest semiaxis of E;
– A-optimality is the minimization of the trace of (XᵀX)⁻¹; it corresponds to minimizing the average of var(β̂1), ..., var(β̂m).

We study the criterion known as c-optimality. Let c ∈ Rᵐ be a fixed non-zero vector. Then the OLS estimator of cᵀβ is cᵀβ̂ and its variance is proportional to

  N · cᵀ(XᵀX)⁻¹c,   (1)

where N stands for the number of observations. The number (1) is called the c-variance of X and is denoted varc(X). A c-optimal matrix is any matrix X respecting given restrictions of the type (i) and (ii) and minimizing varc(X).

The c-optimality criterion is used if our aim is to estimate either a single regression parameter or a linear combination of regression parameters. Indeed, if we set e.g.


cᵀ = (1, 0, ..., 0), then we are designing the matrix X so that the variance of β̂1 is minimized; or, in other words, we are designing the experiment to be able to estimate the unknown value β1 as precisely as possible. If we choose cᵀ = (1/m, ..., 1/m), we are estimating the average value of the regression coefficients. If we are designing the matrix X for estimation of the Cobb-Douglas production function

  ln Y = β1 ln F1 + β2 ln F2 + · · · + βm−1 ln Fm−1 + βm,

where Y is output and F1, ..., Fm−1 are production factors, then the choice cᵀ = (1, ..., 1, 0) corresponds to the estimation of returns to scale. In general, if we interpret the value f(x) = xᵀβ as the mean response of the system described by the regression model under the conditions x, then the choice c = x allows us to design the model for estimation of the mean response f(x). Of course, this is interesting if x ∉ X.

The computational complexity of the problem depends on the experimental domain X. We study the case of a finite X. This case occurs if we can control the experimental conditions only in discrete steps, if the domain X is so intricate that only a description of X in terms of a (sufficiently dense) grid is available, or if the domain points have been generated by some kind of nondeterministic or random process.
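As a toy numerical illustration (the design matrices and vectors below are made up, not taken from the paper), the c-variance (1) of a concrete design matrix can be computed directly; note that different choices of c may rank the same pair of designs differently:

```python
import numpy as np

def var_c(X, c):
    """c-variance of a design matrix X, as in (1): N * c^T (X^T X)^{-1} c."""
    N = X.shape[0]
    return float(N * c @ np.linalg.inv(X.T @ X) @ c)

# Two hypothetical 4-observation designs for m = 2 parameters.
X1 = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
X2 = np.array([[1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [0.0, 1.0]])

e1 = np.array([1.0, 0.0])     # estimate beta_1 alone
avg = np.array([0.5, 0.5])    # estimate the average of beta_1 and beta_2

print(var_c(X1, e1), var_c(X2, e1))    # 2.0 versus 16/11: X2 wins for e1
print(var_c(X1, avg), var_c(X2, avg))  # 1.0 versus 5/11: X2 wins again
```

Here (X1ᵀX1) is diagonal, so its inverse is immediate; for X2 the off-diagonal correlation between the two columns happens to help both criteria.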

2 Problem statement

Let the experimental domain X = {x1, ..., xk} be fixed. Let ξ = (ξ1, ..., ξk)ᵀ be a vector satisfying Σ_{i=1}^k ξi = 1, ξ ≥ 0. Any such vector is called an approximate design, or design for short. It has the meaning that we shall perform 100ξi % of the measurements of the dependent variable y in the point xi (i = 1, ..., k) of the domain X. If a natural number N (standing for the number of observations) is given, and all the numbers Nξ1, ..., Nξk are integral, we call the approximate design ξ an N-exact design. The N-exact design determines the N-row design matrix X (up to a permutation of rows) in the sense that

  |{j ∈ {1, ..., N} : the j-th row of X is xiᵀ}| = Nξi

holds for all i = 1, ..., k. For an approximate design ξ and a vector c denote

  varc(ξ) := cᵀ (Σ_{i=1}^k ξi · xi xiᵀ)⁻¹ c,   (2)

where (·)⁻¹ stands for the matrix (pseudo)inverse. This number is called the c-variance of ξ; observe that for an N-exact design ξ, the definition (2) is consistent with (1). It may happen that varc(ξ) = ∞.

For a given N-exact design ξ, it holds that var(cᵀβ̂) = (σ²/N) · varc(ξ), where β̂ = (XᵀX)⁻¹Xᵀy with the N-row design matrix X determined by ξ. So varc(ξ) represents the multiplicative contribution of the structure of the design ξ to the variance of the estimator cᵀβ̂. Moreover, varc(ξ) is independent of N and σ². This makes varc(ξ) a natural measure of the quality of the design ξ.
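Definition (2) translates directly into code; a minimal sketch (with hypothetical domain points), using the pseudoinverse as in the definition:

```python
import numpy as np

def var_c_design(xs, xi, c):
    """c-variance (2) of an approximate design xi over domain points xs."""
    # Information matrix: sum of xi_i * x_i x_i^T over the domain points.
    M = sum(w * np.outer(x, x) for w, x in zip(xi, xs))
    return float(c @ np.linalg.pinv(M) @ c)

xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
c = np.array([1.0, 1.0])
xi = [0.25, 0.25, 0.5]      # a 4-exact design: N * xi is integral for N = 4

v = var_c_design(xs, xi, c)
print(v)                    # 1.6
```

For this 4-exact design, (2) agrees with (1) computed from the corresponding 4-row design matrix, in line with the stated consistency of the two definitions.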


We call an N-exact design ξ c-optimal if for any N-exact design ξ′ it holds that varc(ξ′) ≥ varc(ξ).

For complexity-theoretic considerations, we assume that all the vectors x1, ..., xk, c are rational. A non-zero rational number is represented in the usual way as [±, numerator, denominator], where the numerator and denominator are written down in binary in coprime form.

In statistics, the optimization version of the problem is essential:

Exact optimal design—optimization version. Given a rational experimental domain X, a rational vector c and a non-zero natural number N, find a c-optimal N-exact design ξ over the domain X.

For complexity-theoretic classification, we need a decision version of the problem. The natural decision version is as follows:

Exact optimal design—decision version. Given a rational experimental domain X, a rational vector c, a natural number N and a positive rational number S², does there exist an N-exact design ξ over the domain X satisfying varc(ξ) ≤ S²? Or, equivalently: is the c-variance of the c-optimal N-exact experimental design ξ over X at most S²?

Let us call this problem EOD. The exact version of the problem reflects the situation that our budget is limited to N observations. We shall also investigate the approximate (or: asymptotic) version of the problem. This version relaxes the N-exactness requirement; it corresponds to the situation that we can make any finite number of observations.

Approximately optimal design—decision version. Given a rational experimental domain X, a rational vector c and a positive rational number S², is there an integer N and an N-exact experimental design ξ over X satisfying varc(ξ) ≤ S²? Or: if we are not restricted by the number of observations, is it possible to design an exact experiment with c-variance at most S²?

Let us call this problem AOD. For the sake of completeness, let us also state the optimization version.
A design ξ is called approximately c-optimal if it is N-exact for some N, and for any N′ and any N′-exact design ξ′ it holds that varc(ξ′) ≥ varc(ξ).

Approximately optimal design—optimization version. Given a rational experimental domain X and a rational vector c, find an approximately c-optimal design ξ over the domain X.

The existence of an approximately c-optimal design will be investigated in Sect. 4.

Remark 1 The term approximately c-optimal design is used in the statistical literature (see Harman and Jurík [6]) and reflects the fact that N can be so large that in practice so many observations can rarely be obtained. If we are able to make only N′ < N observations, then the vector N′ξ need not be integral, and ξ is approximate in the sense that the entries of N′ξ have to be rounded. More on the rounding of approximate designs can be found in Pukelsheim and Rieder [10].
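On tiny instances, the decision problem EOD can be settled by brute force over all N-exact designs; an illustrative (exponential-time, made-up) sketch, not a practical algorithm:

```python
import itertools
import numpy as np

def eod(xs, c, N, S2):
    """Decide EOD naively: enumerate all N-exact designs xi = counts / N."""
    k, m = len(xs), len(c)
    best = np.inf
    for counts in itertools.product(range(N + 1), repeat=k):
        if sum(counts) != N:
            continue
        M = sum(n / N * np.outer(x, x) for n, x in zip(counts, xs))
        if np.linalg.matrix_rank(M) < m:
            continue                  # skip singular information matrices
        best = min(best, float(c @ np.linalg.inv(M) @ c))
    return best <= S2

# Domain {e1, e2}, c = e1, N = 2: the only nonsingular 2-exact design is
# (1/2, 1/2), with c-variance 2.
xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
c = np.array([1.0, 0.0])
print(eod(xs, c, 2, 2.0), eod(xs, c, 2, 1.9))
```

The loop visits all count vectors summing to N, which is exponential in k; this is exactly the kind of behavior the NP-completeness result of Theorem 2 says we cannot expect to avoid in general.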

3 Complexity-theoretic classification of c-optimality

We will prove two theorems:

Theorem 1 AOD is P-complete.

Theorem 2 EOD is NP-complete.

By P-completeness we mean completeness for the class P, the class of sets decidable in Turing polynomial time, under LOGSPACE-computable many-one reductions. Recall that the class of LOGSPACE many-one reductions is the class of functions computable in Turing logarithmic space. More on P-completeness is found in Greenlaw et al. [5].

Remark 2 Assuming P ≠ NP, the exact version of the problem has no polynomial-time algorithm. Any NP-set can be regarded as a particular case of the exact experimental design problem. NP-completeness theory provides plenty of computationally hard instances of NP-complete problems; so we obtain a set of hard testing instances for any (heuristic) algorithm used in practical statistics (see Atkinson et al. [2]). Such instances of the design problem are constructed in the proof of Theorem 2.

Let us present an example motivated by cryptography. The security of the famous RSA cryptographic protocol (for details see [8]), and of several other methods in cryptography, relies on the belief that the following problem is computationally difficult: let q be an n-bit integer whose prime decomposition is of the form q = p1·p2, where p1, p2 are distinct primes. Given q, find p1 and p2. We show that this problem is equivalent to solving a particular experimental design problem. It is easy to write down a boolean formula f(x1, ..., xn; y1, ..., yn; z1, ..., zn) (possibly with further variables) which is true if and only if x = y · z, y ≠ 1 and z ≠ 1, where x1, ..., xn are the bits of the natural number x, y1, ..., yn are the bits of the natural number y and z1, ..., zn are the bits of the natural number z. Now consider the formula fq(y; z) := f(q; y; z), where the bits of q are written into f as constants and y and z are free variables.
It is clear that fq has exactly two satisfying assignments: (y; z) = (p1; p2) and (y; z) = (p2; p1). In the proof of Theorem 2 we convert fq into an instance of the design problem such that from the optimal design it is possible to read out the two prime factors p1 and p2 of q.

It is possible to argue that the proof of Theorem 2, where such instances are constructed, produces design problems that are artificial from the statistical point of view. In complexity theory such a situation often occurs: in the large space of all admissible instances of a problem we search for the complexity core, but the core instances are "untypical" for the theory (e.g. statistics) that motivated the formulation of that problem. So an important question for further study is: is it possible to define, in an exact sense, what a "natural" instance of the design problem is? Having seen that there exist "unnatural" instances, it would be valuable to define a property capturing the substance of the problem that is relevant for statistics and ruling out unnaturalness. Of course, further questions come to mind: is it computationally feasible to distinguish between the "natural" and the "unnatural" instances? Is this property syntactic, in the sense that it can be observed from the form of the vector c and the experimental domain? Finally, is the restriction of the general problem to the "natural" instances polynomial-time solvable, or is it again NP-complete? Such considerations should clarify what constitutes the intrinsic hardness of the problem: whether its complexity is induced by instances that a statistician would never think of, or whether there are statistically relevant instances that are as hard as the problem in general.

Remark 3 The approximate version is polynomial-time solvable. This result might seem surprising, as the definition of AOD only shows that AOD is recursively enumerable (or: Σ1 in the arithmetical hierarchy). The P-completeness result states that the problem AOD is as hard as any problem in P; for example, the detection of primality may be regarded as a particular instance of AOD (see Agrawal et al. [1]). Recall that NC, "Nick's Class", is the class of problems decidable by uniform polynomial-sized circuits of polylogarithmic depth (Papadimitriou [8]); such problems are sometimes called fast computable in parallel. Many problems are known to be fast computable in parallel, for instance determinant computation or the solution of systems of linear equations. Under the broadly accepted conjecture P ≠ NC, the P-completeness result implies that the approximate design problem is not fast computable in parallel. So we cannot expect that the problem could be solved with parallel systems much faster than with a sequential computer.

4 Reduction to (mixed) linear programming

The following theorem was proved in [6]. (Our formulation is slightly different from the theorem in the cited paper; however, it is easily seen that our formulation is a direct consequence of the original statement.)

Theorem 3 Let N be a natural number. Consider the mixed program

  max w subject to Ξ(y − z) = wc, 1ᵀ(y + z) = 1,
  y ≥ 0, z ≥ 0, w ≥ 0, N(y + z) integral,   (3)

where Ξ is the matrix with columns x1, ..., xk. If y, z, w is any optimal solution of (3), then ξ := y + z is the N-exact c-optimal experimental design and w⁻² is its c-variance.

We get an easy consequence:

Two complexity results on c-optimality in experimental design

1403

Lemma 1 If y, z, w is any optimal solution of

  max w subject to Ξ(y − z) = wc, 1ᵀ(y + z) = 1,
  y ≥ 0, z ≥ 0, w ≥ 0,   (4)

then the vector ξ := y + z is an approximately c-optimal experimental design and w⁻² is its c-variance.

Proof We have to show that (i) ξ is N-exact for some N and (ii) for any N′ and any N′-exact design ξ′ it holds that varc(ξ′) ≥ varc(ξ). By assumption, the linear program (4) is rational, and hence has a rational optimum y, z, w. Thus, for some N, the vector N·(y + z) is integral. This proves (i). The statement (ii) follows from the fact that any feasible solution of (3) is also a feasible solution of (4). □

Remark 4 As the approximate design problem is reducible to linear programming, it is reasonable to try to exploit the linear-programming formulation when new algorithms are constructed. An example of such treatment is the simplex-based algorithm for the design problem presented in [6]. However, Theorem 1 shows the limits of this approach: it follows that any simplex-based algorithm for the approximate design problem suffers from the same problems as any simplex-based method; namely, it runs in exponential time in the worst case (unless a polynomial-time version of the simplex method is ever found).
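Lemma 1 makes the approximate problem directly solvable with an off-the-shelf LP solver. A sketch of (4), assuming scipy is available (the variable ordering y, z, w and all names are our own):

```python
import numpy as np
from scipy.optimize import linprog

def approx_c_optimal(xs, c):
    """Solve (4): max w s.t. Xi(y - z) = w*c, 1^T (y + z) = 1, y, z, w >= 0.
    Returns the design xi = y + z and its c-variance w**(-2)."""
    Xi = np.column_stack(xs)            # m x k, columns are the domain points
    m, k = Xi.shape
    obj = np.zeros(2 * k + 1)
    obj[-1] = -1.0                      # maximize w == minimize -w
    A_eq = np.zeros((m + 1, 2 * k + 1))
    A_eq[:m, :k] = Xi                   # Xi * y
    A_eq[:m, k:2 * k] = -Xi             # - Xi * z
    A_eq[:m, -1] = -c                   # - w * c
    A_eq[m, :2 * k] = 1.0               # 1^T (y + z) = 1
    b_eq = np.append(np.zeros(m), 1.0)
    res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    y, z, w = res.x[:k], res.x[k:2 * k], res.x[-1]
    return y + z, w ** -2.0

xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
xi, v = approx_c_optimal(xs, np.array([1.0, 1.0]))
```

For the domain {e1, e2} and c = (1, 1)ᵀ, the optimum puts weight 1/2 on each point with varc(ξ) = 4, matching a direct minimization of 1/ξ1 + 1/ξ2 over the simplex.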

5 Proof of Theorem 1

We will show that any linear program is reducible to the form (4) using a LOGSPACE-computable reduction. We will use the fact that linear programming is solvable in polynomial time (this is the well-known result of Khachyian [7]) and that, by Dobkin et al. [3], it is P-complete under LOGSPACE-reductions. This proves Theorem 1. We also get the following:

Corollary 1 Linear programming and AOD are equivalent under LOGSPACE-computable many-one reductions.

Let (·)ij denote the (i, j)-th component of a matrix and (·)i the i-th component of a vector. The bitsize(r) of a rational number r is the number of bits necessary to write down the numerator and the denominator in binary (assuming they are coprime). The bitsize(Z) of a rational matrix Z is Σ_{i,j} bitsize((Z)ij).

Lemma 2 If B is an integral matrix of full row rank and b is an integral vector, then there exists a constant L such that if the system of linear equations Bv = b has a non-negative solution, then it has a non-negative rational solution v such that for any i the inequality (v)i ≤ L holds. Moreover, bitsize(L) = O(bitsize([B, b])) and L may be found from [B, b] in Turing computation space O(log bitsize([B, b])).


Proof If Z is a square integral matrix, then bitsize(det Z) ≤ bitsize(Z) (see Schrijver [11], p. 29). If the system Bv = b has a non-negative solution, then for some non-negative solution v′ it holds that each non-zero entry of v′ may be computed using Cramer's Rule as det B′/det B″, where B′ is a submatrix of [B, b] (with columns possibly permuted) and B″ is a submatrix of B. Hence the bit sizes of both B′ and B″ are bounded by bitsize([B, b]), and so are the bit sizes of their determinants. Thus bitsize(det B′/det B″) = O(bitsize([B, b])). LOGSPACE-computability is obvious. □

We have to consider a decision version of linear programming. The P-complete decision version of linear programming is: given a rational matrix B and a rational vector b, is the system Bv = b, v ≥ 0 feasible? By rationality, we may assume that all numbers in B and b are integers.

Lemma 3 The decision problem of the feasibility of a general linear program in the form Bv = b, v ≥ 0 is reducible to the decision of the feasibility of

  Du = d, 1ᵀu = 1, u ≥ 0.   (5)

Proof By Lemma 2 we know that the general program Bv = b, v ≥ 0 is feasible iff the program

  Bv = b, v ≤ L · 1, v ≥ 0   (6)

is feasible and, of course, iff the program

  Bv = b, v ≤ nL · 1, v ≥ 0   (7)

is feasible, where n denotes the number of columns of B. Consider the program

  (B 0)(v′; z) = (1/(nL)) b, 1ᵀv′ + z = 1, v′ ≥ 0, z ≥ 0,   (8)

where (v′; z) stacks the vector v′ and the scalar z; this program is in the desired form (5). If v′, z is a solution to (8), then v := nLv′ is a solution to (7) and hence also (6) is feasible. And vice versa: if v is a solution to (6), then v′ := (1/(nL))v, z := 1 − 1ᵀv′ is a solution to (8), as (v′)i ≤ 1/n for any i = 1, ..., n. □

Lemma 4 The decision problem of the feasibility of (5) is reducible to (4).

Proof Consider the linear program

  max w subject to [D; 1ᵀ](y − z) = w·[d; 1],
  1ᵀy + 1ᵀz = 1, y ≥ 0, z ≥ 0, w ≥ 0,   (9)

where [D; 1ᵀ] denotes the matrix D with the additional row 1ᵀ appended and [d; 1] the vector d with the entry 1 appended. This problem is in the form (4) and is always feasible. We claim that (5) is feasible if and only if the optimal value of (9) is w ≥ 1. If (5) has a solution u, then y := u, z := 0, w := 1 is a solution to (9) and hence the maximum is ≥ 1. If the optimal value


of w is ≥ 1, then there is a feasible solution with w = 1 (by convexity, as the program is feasible for w = 0 too), and then u := y is a solution to (5). This follows from the fact that if w = 1 then z = 0, due to the presence of both equations 1ᵀy + 1ᵀz = 1 and 1ᵀy − 1ᵀz = 1 in the system. □

Observe that in the construction in the proof of the Lemma we always have w ≤ 1. This follows from 1ᵀy ≤ 1, 1ᵀz ≥ 0 and w = 1ᵀy − 1ᵀz ≤ 1. Thus w ≥ 1 iff w = 1. This shows that the problem remains P-complete even if we strengthen the condition w ≥ 0 in (4) to the form 1 ≥ w ≥ 0.

The proof of Theorem 1 is now straightforward. If we ask the AOD-question whether there exists a design over the domain formed by the columns of the matrix [D; 1ᵀ] (that is, D with the row 1ᵀ appended) having [d; 1]-variance at most one, this is equivalent to the feasibility of the original linear program. It is clear that the transformation can be done in logarithmic space and that it results in only a linear increase in the size of the instance. (Note that by introduction of a new variable it is sufficient to write down the factor 1/(nL) in (8) once only.) So any algorithm for the design problem (4) may be used for any linear program. □
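The normalization step of Lemma 3 is purely mechanical; a small sketch (the bound L is taken as given, and all names are our own):

```python
import numpy as np

def to_simplex_form(B, b, L):
    """Rewrite Bv = b, v >= 0 (with some solution bounded entrywise by L, as
    guaranteed by Lemma 2) into the form (5): Du = d, 1^T u = 1, u >= 0.
    Follows (8): scale v = nL * v' and append a slack column for z."""
    m, n = B.shape
    D = np.hstack([B, np.zeros((m, 1))])   # extra zero column for the slack z
    d = b / (n * L)
    return D, d

B = np.array([[1.0, 1.0]])
b = np.array([1.0])
D, d = to_simplex_form(B, b, L=1.0)
```

Feasibility is preserved: whenever v solves the original system with (v)i ≤ L, the vector u = (v/(nL), 1 − 1ᵀv/(nL)) is non-negative, sums to one, and solves Du = d. For the toy system above, u = (0.5, 0, 0.5) works.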

6 Proof of Theorem 2

By Theorem 3, the exact c-optimal experimental design is equivalent to the mixed linear integer program

  max w subject to Ξ(y − z) = wc, 1ᵀ(y + z) = 1,
  N·(y + z) integral, y ≥ 0, z ≥ 0, w ≥ 0,   (10)

where the columns of Ξ are the points of the experimental domain and N is the number of measurements of the dependent variable to be made. For the optimal w, the number w⁻² is the c-variance of the optimal N-exact experimental design. From this formulation it is clear that EOD ∈ NP (as both linear and integer programming are).

We will show that EOD is NP-complete by a reduction from the following version of the Boolean Satisfiability Problem: given a boolean formula F in conjunctive normal form, where each clause contains exactly three literals and no variable occurs twice in any clause, is F satisfiable? It is well known that this problem is NP-complete (Garey and Johnson [4]; Papadimitriou [8]).

Let F in the described form be given; let ζ1, ..., ζn denote its variables and C1, ..., Ck its clauses. Let Neg(Ci) denote the number of negative literals in the clause Ci; by assumption, Neg(Ci) ∈ {0, 1, 2, 3}. For a literal λ, let nsign(λ) denote 1 if the literal is negative and −1 if the literal is positive. For each clause Ci (i = 1, ..., k), denote by λi1, λi2, λi3 its three literals.

First we shall consider a linear program with variables x = (x1, ..., xn)ᵀ, τ = (τ1, ..., τn)ᵀ and σ = (σ1, ..., σk)ᵀ. For each clause Ci we add the equation

  nsign(λi1)·xi1 + nsign(λi2)·xi2 + nsign(λi3)·xi3 + σi = (Neg(Ci) − 1)/(kn);

e.g. if C6 = (x1 ∨ ¬x3 ∨ ¬x5) then the equation is −x1 + x3 + x5 + σ6 = 1/(kn). Note that this is equivalent to

  x1 + (1/(kn) − x3) + (1/(kn) − x5) ≥ 1/(kn)   (11)

if we interpret σ6 ≥ 0 as the slack variable for the inequality. Then, for each i = 1, ..., n, we add

  xi + τi = 1/(kn)

and, finally,

  x ≥ 0, τ ≥ 0, σ ≥ 0.

Call this linear program (L1); it is in the form Bv = b, v ≥ 0, where v consists of the variables x, τ, σ.

Lemma 5
(a) Any solution of (L1) satisfies, for all i ∈ {1, ..., n} and j ∈ {1, ..., k},

  xi ∈ [0, 1/(kn)], τi ∈ [0, 1/(kn)], σj ∈ [0, 2/(kn)].

(b) If F is satisfiable and ζ1, ..., ζn is a satisfying valuation, let xi := 0 if ζi = FALSE and let xi := 1/(kn) if ζi = TRUE. Then there exist unique values of σ and τ such that x, σ, τ is a solution to (L1).
(c) If there exists a solution to (L1) such that xi ∈ {0, 1/(kn)} for all i = 1, ..., n, then F is satisfiable.

The proof is straightforward: from (11) it is apparent that the value xi = 1/(kn) represents the value ζi = TRUE and the value xi = 0 represents the value ζi = FALSE. The right-hand side of (11) states that the clause evaluates to TRUE. □
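The clause constraints can be sanity-checked numerically. A toy sketch (our own encoding of literals as (index, is_negative) pairs) evaluating the left-hand side of the clause inequalities of (L1) in the {0, 1/(kn)} representation:

```python
def clause_value(clause, x, step):
    """Sum of literal values as on the left-hand side of (11):
    x_i for a positive literal, step - x_i for a negative one."""
    return sum((step - x[i]) if neg else x[i] for i, neg in clause)

# Toy formula with n = 3 variables and k = 2 three-literal clauses:
# (z1 or not z2 or z3) and (not z1 or z2 or not z3)
clauses = [[(0, False), (1, True), (2, False)],
           [(0, True), (1, False), (2, True)]]
k, n = len(clauses), 3
step = 1.0 / (k * n)           # the value 1/(kn) representing TRUE

def satisfied(assignment):
    x = [step * a for a in assignment]
    return all(clause_value(cl, x, step) >= step - 1e-12 for cl in clauses)

print(satisfied([1, 0, 0]), satisfied([0, 1, 0]))
```

A clause holds exactly when the sum of its literal values reaches 1/(kn), mirroring the usual "at least one literal true" reading of a CNF clause.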

The presence of fractions of the form ·/(kn) is essential for the inequality

  1ᵀv = Σ_{i=1}^n (xi + τi) + Σ_{j=1}^k σj ≤ 1/k + 2/n ≤ 1.   (∗)

We may assume that k and n are large enough so that (∗) holds. If we add a slack variable t ≥ 0 to this inequality, then the following linear program (L2) is equivalent to (L1):

  B′v′ = b, 1ᵀv′ = 1, v′ ≥ 0,


where B′ := (B 0) and v′ := (v; t). Now set N := kn and consider the following mixed program (L3):

  max w subject to [B′; 1ᵀ](y − z) = w·[b; 1], 1ᵀ(y + z) = 1,
  N·(y + z) integral, y ≥ 0, z ≥ 0, w ≥ 0,   (12)

where [B′; 1ᵀ] denotes B′ with the row 1ᵀ appended and [b; 1] denotes b with the entry 1 appended.

The system (L3 ) is in the form (10). Lemma 6 The optimal value w of (L3 ) is at least one if and only if F is satisfiable. Proof It is easy to see, as in the proof of Theorem 1, that the maximum w cannot exceed 1. Thus we claim that the maximum w equals one if and only if F is satisfiable. We may regard the components of y as (x + , τ + , σ + , t + ). If F is satisfiable then, by Lemma 5(b), (L1 ) is satisfiable  and all values  of xi , τi and σj are integer multiples of N1 . Hence also t = 1 − i (xi + τi ) − j σj is a non-negative integer multiple of N1 . Let x, τ , σ , t be this feasible solution of (L2 ). If we define y as x + := x,

τ + := τ ,

σ + := σ ,

t + := t

and we set z := 0 and w = 1, we get a solution to (L3 )—all of the numbers are integer multiples of N1 and hence the integrality restrictions are met. If the maximum of (L3 ) is w = 1 then z = 0 (as in the proof of Theorem 1) and by the integrality restrictions, all the components of x + are integral multiples of N1 . However, by Lemma 5(a), (x + )i ∈ [0, N1 ] for all i, and hence the only possible values  are (x + )i = 0 or (x + )i = N1 . By Lemma 5(c), this means that F is satisfiable. As the construction of (L3 ) can be done in polynomial time, the proof of Theorem 2 is complete.  Remark 5 An approximately c-optimal design is N -exact for some N . Let us show an easy estimate for N . We may assume that all the domain vectors x i and the vector c are integral. Fix any optimal basis of (4) and let B be the columns of the matrix 

Ξ 1T

−Ξ 1T

−c 0



corresponding to non-zero variables in that basis. By Cramer’s rule, any non-zero variable may be computed as a fraction with the term det B in denominator. This implies that if N is a multiple of | det B| then N ξi is integral for all i. Hence it suffices to make N = | det B| observations in order the approximate experiment be exact. This is interesting in particular if the experimental domain is formed of integral and small numbers and if the dimension is low compared to the number of points in the experimental domain.


References

1. Agrawal, M., Kayal, N., Saxena, N.: PRIMES is in P. Ann. Math. 160, 781–793 (2004)
2. Atkinson, A., Donev, A., Tobias, R.: Optimum Experimental Designs with SAS. Oxford University Press, Oxford (2007)
3. Dobkin, D., Lipton, R., Reiss, S.: Linear programming is log-space hard for P. Inf. Process. Lett. 8(2), 96–97 (1979)
4. Garey, M., Johnson, D.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
5. Greenlaw, R., Hoover, H., Ruzzo, W.: Limits to Parallel Computation: P-Completeness Theory. Oxford University Press, Oxford (1995)
6. Harman, R., Jurík, T.: Computing c-optimal experimental designs using the simplex method of linear programming. Comput. Stat. Data Anal. 53, 247–254 (2008)
7. Khachyian, L.: A polynomial algorithm for linear programming. Dokl. Soviet Acad. Sci. 244(5), 1093–1096 (1979)
8. Papadimitriou, C.: Computational Complexity. Addison-Wesley, Longman (1995)
9. Pázman, A.: Foundations of Optimum Experimental Design. Reidel, Dordrecht (1986)
10. Pukelsheim, F., Rieder, S.: Efficient rounding in approximate designs. Biometrika 79, 763–770 (1992)
11. Schrijver, A.: Theory of Linear and Integer Programming. Wiley, New York (1986)
