Dakota Reference Manual  Version 6.4
Large-Scale Engineering Optimization and Uncertainty Analysis
 All Pages
correction


Correction approaches for surrogate models

Specification

Alias: none

Argument(s): none

Default: no surrogate correction

Required/Optional Description of Group Dakota Keyword Dakota Keyword Description
Required
(Choose One)
correction order (Group 1) zeroth_order

Specify that truth values must be matched.

first_order

Specify that truth values and gradients must be matched.

second_order

Specify that truth values, gradients and Hessians must be matched.

Required
(Choose One)
correction type (Group 2) additive

Additive correction factor for local surrogate accuracy

multiplicative

Multiplicative correction factor for local surrogate accuracy.

combined

Multipoint correction for a hierarchical surrogate

Description

Some of the surrogate model types support the use of correction factors that improve the local accuracy of the surrogate models.

The correction specification specifies that the approximation will be corrected to match truth data, either matching truth values in the case of zeroth_order matching, matching truth values and gradients in the case of first_order matching, or matching truth values, gradients, and Hessians in the case of second_order matching. For additive and multiplicative corrections, the correction is local in that the truth data is matched at a single point, typically the center of the approximation region. The additive correction adds a scalar offset (zeroth_order), a linear function (first_order), or a quadratic function (second_order) to the approximation to match the truth data at the point, and the multiplicative correction multiplies the approximation by a scalar (zeroth_order), a linear function (first_order), or a quadratic function (second_order) to match the truth data at the point. The additive first_order case is due to[57] and the multiplicative first_order case is commonly known as beta correction[40]. For the combined correction, the use of both additive and multiplicative corrections allows the satisfaction of an additional matching condition, typically the truth function values at the previous correction point (e.g., the center of the previous trust region). The combined correction is then a multipoint correction, as opposed to the local additive and multiplicative corrections. Each of these correction capabilities is described in detail in[22].

The correction factors force the surrogate models to match the true function values and possibly true function derivatives at the center point of each trust region. Currently, Dakota supports either zeroth-, first-, or second-order accurate correction methods, each of which can be applied using either an additive, multiplicative, or combined correction function. For each of these correction approaches, the correction is applied to the surrogate model and the corrected model is then interfaced with whatever algorithm is being employed. The default behavior is that no correction factor is applied.

The simplest correction approaches are those that enforce consistency in function values between the surrogate and original models at a single point in parameter space through use of a simple scalar offset or scaling applied to the surrogate model. First-order corrections such as the first-order multiplicative correction (also known as beta correction[14]) and the first-order additive correction[57] also enforce consistency in the gradients and provide a much more substantial correction capability that is sufficient for ensuring provable convergence in SBO algorithms. SBO convergence rates can be further accelerated through the use of second-order corrections which also enforce consistency in the Hessians[22], where the second-order information may involve analytic, finite-difference, or quasi-Newton Hessians.

Correcting surrogate models with additive corrections involves

\begin{equation} \hat{f_{hi_{\alpha}}}({\bf x}) = f_{lo}({\bf x}) + \alpha({\bf x}) \end{equation}

where multifidelity notation has been adopted for clarity. For multiplicative approaches, corrections take the form

\begin{equation} \hat{f_{hi_{\beta}}}({\bf x}) = f_{lo}({\bf x}) \beta({\bf x}) \end{equation}

where, for local corrections, $\alpha({\bf x})$ and $\beta({\bf x})$ are first or second-order Taylor series approximations to the exact correction functions:

\begin{eqnarray} \alpha({\bf x}) & = & A({\bf x_c}) + \nabla A({\bf x_c})^T ({\bf x} - {\bf x_c}) + \frac{1}{2} ({\bf x} - {\bf x_c})^T \nabla^2 A({\bf x_c}) ({\bf x} - {\bf x_c}) \\ \beta({\bf x}) & = & B({\bf x_c}) + \nabla B({\bf x_c})^T ({\bf x} - {\bf x_c}) + \frac{1}{2} ({\bf x} - {\bf x_c})^T \nabla^2 B({\bf x_c}) ({\bf x} - {\bf x_c}) \end{eqnarray}

where the exact correction functions are

\begin{eqnarray} A({\bf x}) & = & f_{hi}({\bf x}) - f_{lo}({\bf x}) \\ B({\bf x}) & = & \frac{f_{hi}({\bf x})}{f_{lo}({\bf x})} \end{eqnarray}

Refer to[22] for additional details on the derivations.

A combination of additive and multiplicative corrections can provide for additional flexibility in minimizing the impact of the correction away from the trust region center. In other words, both additive and multiplicative corrections can satisfy local consistency, but through the combination, global accuracy can be addressed as well. This involves a convex combination of the additive and multiplicative corrections:

\[ \hat{f_{hi_{\gamma}}}({\bf x}) = \gamma \hat{f_{hi_{\alpha}}}({\bf x}) + (1 - \gamma) \hat{f_{hi_{\beta}}}({\bf x}) \]

where $\gamma$ is calculated to satisfy an additional matching condition, such as matching values at the previous design iterate.

It should be noted that in both first order correction methods, the function $\hat{f}(x)$ matches the function value and gradients of $f_{t}(x)$ at $x=x_{c}$. This property is necessary in proving that the first order-corrected SBO algorithms are provably convergent to a local minimum of $f_{t}(x)$. However, the first order correction methods are significantly more expensive than the zeroth order correction methods, since the first order methods require computing both $\nabla f_{t}(x_{c})$ and $\nabla f_{s}(x_{c})$. When the SBO strategy is used with either of the zeroth order correction methods, or with no correction method, convergence is not guaranteed to a local minimum of $f_{t}(x)$. That is, the SBO strategy becomes a heuristic optimization algorithm. From a mathematical point of view this is undesirable, but as a practical matter, the heuristic variants of SBO are often effective in finding local minima.

Usage guidelines

  • Both the additive zeroth_order and multiplicative zeroth_order correction methods are "free" since they use values of $f_{t}(x_{c})$ that are normally computed by the SBO strategy.
  • The use of either the additive first_order method or the multiplicative first_order method does not necessarily improve the rate of convergence of the SBO algorithm.
  • When using the first order correction methods, the gradient-related response keywords must be modified to allow either analytic or numerical gradients to be computed. This provides the gradient data needed to compute the correction function.
  • For many computationally expensive engineering optimization problems, gradients often are too expensive to obtain or are discontinuous (or may not exist at all). In such cases the heuristic SBO algorithm has been an effective approach at identifying optimal designs[35].