-
Notifications
You must be signed in to change notification settings - Fork 2
/
kkt.tex
233 lines (162 loc) · 13.6 KB
/
kkt.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
\documentclass{article}
\usepackage{graphicx}
\usepackage{amssymb,amsmath,latexsym}
\usepackage{listings,microtype}
\usepackage{geometry}
\geometry{letterpaper, margin=1in}
\lstset{
basicstyle=\ttfamily,
columns=fullflexible,
frame=single,
breaklines=true,
%postbreak=\mbox{\textcolor{red}{$\hookrightarrow$}\space},
}
\usepackage{comment}
\title{Developing accurate KKT formulation for NLP systems}
\begin{document}
\section{Introduction}
Solving constrained non linear optimization problems begins with identification of the Karush-Kuhn-Tucker conditions. This is due to the fact that for a differentiable convex system where constraint qualification holds, the solution of NLP is the solution to the KKT system and vise versa. The KKT conditions enables solving the NLP model as an MCP model by representing the KKT conditions as complimentarily equations. Thus, it is often advantageous to explicitly model the KKT conditions and solve them as a Mixed Complementarity Problem (MCP), as described in the PATH Solver Manual (/link) using the example of a Linear Transportation model. The traditional model of fixed demand and price is made more realistic by making the variables endogenous, i.e. the price of commodity is market affects the demand of the commodity, and the model is solved as a complimentarily problem. At times the complexity of the model causes errors in formulation of KKT conditions, leading to incorrect or infeasible solutions. Finding the source of the error can be a particularly tedious task. Thus, in this article, we propose a systematic method for identification of errors in the explicit KKT conditions for solving an NLP. .
We begin by understanding the definition of complementarity. If a function $F(z) : {\!R}^n \mapsto {\!R}^n$, lower bounds $ l \in { \!R \cup {-\infty}}^n$ and upper bounds $ u \in { \!R \cup {\infty}}^n$ has a solution vector $z \in \!R$ such that for each $ i \in {1,...,n}$, one of the following three conditions hold :
\centerline{$F_{i}(z) = 0$ and $ l_i \leq z_i \leq u_i $ or}
\centerline{$F_{i}(z) > 0$ and $ z_i = l_i$ or }
\centerline{ $F_{i}(z) < 0$ and $ z_i = u_i$ }
then function $F$ is complementary to the variable $z$ and its bounds. This is written in compact form as
\centerline{ $ F(z) \perp L \leq z \leq U $ }
\par
\noindent Where the symbol $\perp$ means "perpendicular to". From point of view of optimization, if $F(z)$ is non zero (non-binding constraint), changes in $z$ may further optimize the objective function until the constraint becomes binding. If $F(z)$ is binding, no changes in $z$ would further enhance the objective function, causing the marginal to be zero.
\noindent Consider the generalized NLP formulation below.
\begin{equation}
\begin{aligned}
& \min
& & f(x) \\
& \text{s.t.} & & g_{i}(x) \leqslant 0 & i = 1,2...m \\
& & & h_{k}(x) = 0 & k = 1,2...p \\
& & & d_{l}(x) \geqslant =0 & l = 1,2...q \\
& & & L \leq x \leq Z \\
& & & f(x): {\!R}^n \mapsto \!R , g(x): {\!R}^n \mapsto {\!R}^m\\
& & & h(x): {\!R}^n \mapsto {\!R}^ p , d(x): {\!R}^n \mapsto {\!R}^ p\\
\end{aligned}
\end{equation}
\noindent The Lagrange function and KKT conditions for the formulation above in the complementarity form are written as \\
\begin{equation}
\begin{aligned}
& L(x,u,v,w) = f(x) - <u,g(x)> - <v,h(x)> - <w,d(x)> \\
& \bigtriangledown_x L \perp L_x \leq x \leq U_x \\
& - \bigtriangledown_u L \perp u \geq 0 \\
& -\bigtriangledown_v L \perp v free \\
& -\bigtriangledown_w L \perp w \leq 0 \\
\end{aligned}
\end{equation}
\noindent It should be noted that the gradient of Lagrangian w.r.t to the marginals from NLP result in the original inequality and equality constraints of the NLP.
%--------------------------------------%--------------------------------------%--------------------------------------%--------------------------------------
\begin{comment}
\begin{equation}
\begin{aligned}
& \bigtriangledown{f(\hat{x})} - \sum_{i=1}^{m} u_{i} \bigtriangledown{g_{i}(\hat{x})}
- \sum_{k=1}^{p} v_{i} \bigtriangledown{h_{k}(\hat{x})} - \sum_{l=1}^{q} w_{i} \bigtriangledown{d_{l}(\hat{x})} = 0 \\
\\
& h_{k}(\hat{x}) = 0 k = 1,2...p \\
g_{i}(\hat{x}) \leq 0& i = 1,2...m \\ d_{l}(\hat{x}) \geq =0 & l = 1,2...q
\\
and,\\
<u_{i},g_{i}(x)> = 0 \\ <v_{i},h_{k}(x)> =0 \\ <w_{l},d_{l}(x)> =0
\end{aligned}
\end{equation}
where $<u_{i},g_{i}(x)> = 0$ represent the complimentarily condition and variables u, v, and w represent the marginals of the respective constraint. It is often written as
$g_{i}(x) \perp L \leq u \leq U $
where symbol $\perp $(referred to as perpendicular to) indicates pair-wise complementarity between the function g() and variable u and its bounds. The complimentarity condition essentially
\end{comment}
%--------------------------------------%--------------------------------------%--------------------------------------%--------------------------------------
\noindent Thus an NLP can be solved as an MCP using the KKT conditions. However, formulating the KKT conditions might prove difficult for complex systems, and require verification of their accuracy. In the following sections, we provide the framework for modeling the KKT conditions by using concept of dummy complementarity equations in the KKT formulation. The process includes the following steps:
\begin{enumerate}
\item Solve NLP formulation without explicit KKT conditions, and save the results
\item Restart the problem as an MCP with a system of dummy KKT conditions using solution from step 1 as initial point. The MCP should start at the solution itself
\item Replace one dummy equation at a time with one KKT condition. Resolve the NLP and save the results
\item Restart the problem as MCP. If MCP iteration count is $>$ 0, there exists a problem with the KKT condition which was introduced.
\end{enumerate}
\noindent Follow steps 3 and 4 until all KKT conditions have been successfully incorporated.
\section{Example : Maximum Revenue- NLP formulation}
Consider a simple case of a steel factory trying to maximize the revenue under budget constraints, with man-hours(h) and raw materials steel (s) as the decision variables. The revenue is a function of decision variables given by
\centerline{$R(h,s) = 200 h^{(2/3)}s^{(1/3)} $ }
\bigbreak
\noindent where, budget = \$ 20,000
cost of manpower = \$ 20 /hr
cost of raw material = \$ 170 / tonn
\noindent In it's standard form, the model can be written as :
\begin{equation}
\begin{aligned}
& \min
& & R(h,s) = - 200 h^{2/3}s^{1/3} \\
& \text{s.t.} & & 20h + 170s \leq 20000 \\
& & & L< h,s < U \\
\end{aligned}
\end{equation}
The GAMS program below gives the optimum values of \lstinputlisting{codes/factors.txt}
\lstinputlisting {codes/sd_nlp.gms}
\noindent The above model is then written in form of an MCP, by explicitly adding the KKT conditions to the model. For the given system, the KKT conditions are given as
\begin{equation}
\begin{aligned}
L = & - 200 h^{2/3}s^{1/3} - con1_m [ 20h + 170 s - 20000] \\
\bigtriangledown _h L = & - 200* (2/3) h^{(-1/3)} s^{(1/3)} - con1\_m*(20) \\
\bigtriangledown _s L: & - 200 * (1 / 3) h^{(2/3)} *(1/3) * s^{(-2/3)} - con1\_m*(170) \\
\bigtriangledown _{con1_m} L : & 20*h + 170 * s - 20000 =0 \\
\end{aligned}
\end{equation}
\noindent It should be noted that the third KKT equation, result of differentiation w.r.t to the marginal, is the inequality budget constraint from original NLP model. The above equations are solved as an MCP model as shown below using the results saved in the initial run. In the model below, an error has been made in one of the KKT conditions. Execution of the model terminates due to the abort statement, with declaration \textit{'we did not start at the solution'} as shown in the subsequent log.
\lstinputlisting{codes/sd_kkt.gms}
\lstinputlisting{codes/sd_kkt.log}
\noindent The absence of \textit{abort} statement results in a new solution to the MCP where the results do not match the desired values from the original NLP formulation, also indicating error in one of the KKT conditions. From simple observation, we can see that an error was made in typing out the multiplier for variable \textit{con1\_m} in equation \textit{dLdh}, which when corrected provides results consistent with the NLP formulation.
However, mere observation does not help with larger, more complex systems. A strategy, as described in Section 1 is required to effectively correct the model. In the given case, the MCP formulation can be first solved using dummy KKT conditions described below. These conditions can then be replaced one at a time with the real KKT equations derived earlier.
\lstinputlisting{codes/sd_kkt_dummy.gms}
\subsection{Rules for writing dummy KKT conditions}
Writing dummy condition, as straightforward as it may seem requires consideration of few salient points to avoid errors from the GAMS compiler.
\begin{enumerate}
\item Number of dummy equations must equal number of marginal variables from the NLP formulation
\item Each dummy KKT condition must be accompanied by its differentiating variable (of Lagrangian)
\item The value to differentiating variable is fixed at the marginal values provided by NLP solution
\item Each KKT condition must be modeled as 'complimentary to' the differentiation variable for the condition. i.e. $dLdh \perp h$, and is modeled as $dLdh.h$ in the model statement
\item Each variable representing the original constraint marginal (e.g con1\_m) in the MCP model must appear in one of the KKT conditions.
\end{enumerate}
Thus, all variables representing the marginals of NLP constraints can be incorporated in any one of the dummy equation. This equation should be replaced by KKT condition at the very end. However, in the current case, variable 'con1\_m' , representing marginal of constraint 'con1' from NLP formulation is part of both the KKT conditions. Thus, though incorporated in equation dLds , the order of replacing the dummy equation with KKT condition is irrelevant in the given case, but can be better understood in the example described in subsequent section
\section{KKT Development for complex systems}
In this section, we implement the proposed methodology and learnings from previous example to solve a more complex NLP problem as MCP. The NLP example has been designed as a minimization model, which helps maintaining the sign convention. The complexity applies the methodology over range of equations and contains leads/lags in the formulation. The NLP Model and the output is given below.
\lstinputlisting{codes/EX2.gms}
\lstinputlisting{codes/EX2.txt}
As seen in the model, the problem consists of five constraints and one objective variable. We define multiplier variables per the convention for each equations as follows:
\renewcommand\labelitemi{\tiny$\bullet$}
\begin{itemize}
\item $p(i)$ for equation $sssdef(i)$ , free variable initialized at $sssdef.m(i)$
\item $q(j)$ for equation $tttdef(i)$ , free variable initialized at $tttdef(i)$
\item $r1$ for equation $esum$ , positive variable initialized at $esum.m$
\item $r2$ for equation $ssum$ , positive variable initialized at $Ssum.m$
\item $s1$ for equation $allbnd$ , negative variable initialized at $allbnd.m$
\end{itemize}
The Lagrangian ($L$)of the function is given as
\begin{equation}
L (sss,ttt,x,p,q,r1,r2,s) = obj - \sum_{i} p(i) . sssdef(i) - \sum_{j} q(j) . tttdef(j) - r1 . esum - r2 . ssum - s1 . allbnd
\end{equation}
The gradients of Lagrangian with respect to the marginals are
\begin{equation}
\begin{aligned}
\\
& dLdttt(j) = \sum_{i} p(i)A(i,j) - q(j) - s1 \\
\\
& dLdsss(i) = 2(sss(i) - s0(i)) - p(i) - r2 - s1 , \quad r2 \forall i2 \in i \\
\\
& dLdx(j) = c(j) + 4q(j) + q(j-1) - r1 \exp^{(x(j)-1)} - s1 , \quad \exp^{(x(j)-1)} \forall j2 \in j \\
\\
\end{aligned}
\end{equation}
The gradients of Lagrangian with respect to the variables are the constraints themselves.
We save the results from the NLP formulation, and restart the following code with dummy KKT conditions shown in the code below. Since $\bigtriangledown L $ w.r.t to the variables are constraints themselves, no dummy equations are needed. Additionally, they need not be modeled with the other dummy equations under consideration.
\lstinputlisting{codes/EX2KKT0.gms}
The above model exits at iteration 0 with message "solution found' as shown by the log below.
\lstinputlisting{codes/EX2KKT0.log}
\subsection{Replacing Dummy Equations}
The dummy KKT equations representing gradient w.r.t the marginals are replaced one at a time per the process described in previous section. We first replace the equation $dLdttt(j)$ with the real KKT equation as shown in the model below (using restart feature). Also, the associated complementarity variable $ttt(i)$ is no longer fixed, which is the key to switching dummy equation from inactive to active. Both the dummy equation and the has been commented. The output log return a normal completion, as shown indicating that we started at the solution, i.e. the KKT equation for $dLttt(i)$ is correct.
It should be noted that the multiplier $p(i)$, defined in equation $dLdsss(i)$ is now removed as it is already part of the model in the new $dLdttt(j)$ equation. However, keeping p(i) in the dummy equation would not affect the solution in any way because the multiplier $sss(j)$ for the equation $dLdsss(i)$ is still fixed, allowing the dummy equation to take any value in the LHS (attribute to =n= equality).
\lstinputlisting{codes/EX2KKT1.gms}
\lstinputlisting{codes/EX2KKT1.log}
The remaining dummy equations are removed per the process described above, to obtain the final MCP model shown below. The model is initialized by the results obtained from solving it as an NLP.
\lstinputlisting{codes/EX2KKT3.gms}
\end{document}