Dakota Reference Manual
Version 6.12
Explore and Predict with Confidence

Optimization algorithms work to minimize (or maximize) an objective function, typically calculated by the user simulation code, subject to constraints on design variables and responses. Available approaches in Dakota include welltested, proven gradientbased, derivativefree local, and global methods for use in science and engineering design applications. Dakota also offers more advanced algorithms, e.g., to manage multiobjective optimization or perform surrogatebased minimization. This chapter summarizes optimization problem formulation, standard algorithms available in Dakota (mostly through included thirdparty libraries, see Section 6.5 of[5]), some advanced capabilities, and offers usage guidelines.
This section provides a basic introduction to the mathematical formulation of optimization, problems. The primary goal of this section is to introduce terms relating to these topics, and is not intended to be a description of theory or numerical algorithms. For further details, consult[7] ,[32],[39],[66], and [87].
A general optimization problem is formulated as follows:
where vector and matrix terms are marked in bold typeface. In this formulation, is an ndimensional vector of realvalued design variables or design parameters. The ndimensional vectors, and , are the lower and upper bounds, respectively, on the design parameters. These bounds define the allowable values for the elements of , and the set of all allowable values is termed the design space or the parameter space. A design point or a sample point is a particular set of values within the parameter space.
The optimization goal is to minimize the objective function, , while satisfying the constraints. Constraints can be categorized as either linear or nonlinear and as either inequality or equality. The nonlinear inequality constraints}, , are ``2sided,'' in that they have both lower and upper bounds, and , respectively. The nonlinear equality constraints, , have target values specified by . The linear inequality constraints create a linear system , where is the coefficient matrix for the linear system. These constraints are also 2sided as they have lower and upper bounds, and , respectively. The linear equality constraints create a linear system , where is the coefficient matrix for the linear system and are the target values. The constraints partition the parameter space into feasible and infeasible regions. A design point is said to be feasible if and only if it satisfies all of the constraints. Correspondingly, a design point is said to be infeasible if it violates one or more of the constraints.
Many different methods exist to solve the optimization problem given in Section 6.1 of[5], all of which iterate on in some manner. That is, an initial value for each parameter in is chosen, the response quantities, , , , are computed, often by running a simulation, and some algorithm is applied to generate a new that will either reduce the objective function, reduce the amount of infeasibility, or both. To facilitate a general presentation of these methods, three criteria will be used in the following discussion to differentiate them: optimization problem type, search goal, and search method.
The optimization problem type can be characterized both by the types of constraints present in the problem and by the linearity or nonlinearity of the objective and constraint functions. For constraint categorization, a hierarchy of complexity exists for optimization algorithms, ranging from simple bound constraints, through linear constraints, to full nonlinear constraints. By the nature of this increasing complexity, optimization problem categorizations are inclusive of all constraint types up to a particular level of complexity. That is, an unconstrained problem has no constraints, a boundconstrained problem has only lower and upper bounds on the design parameters, a linearlyconstrained problem has both linear and bound constraints, and a nonlinearlyconstrained problem may contain the full range of nonlinear, linear, and bound constraints. If all of the linear and nonlinear constraints are equality constraints, then this is referred to as an equalityconstrained problem, and if all of the linear and nonlinear constraints are inequality constraints, then this is referred to as an inequalityconstrained problem. Further categorizations can be made based on the linearity of the objective and constraint functions. A problem where the objective function and all constraints are linear is called a linear programming (LP) problem. These types of problems commonly arise in scheduling, logistics, and resource allocation applications. Likewise, a problem where at least some of the objective and constraint functions are nonlinear is called a nonlinear programming (NLP) problem. These NLP problems predominate in engineering applications and are the primary focus of Dakota.
The search goal refers to the ultimate objective of the optimization algorithm, i.e., either global or local optimization. In global optimization, the goal is to find the design point that gives the lowest feasible objective function value over the entire parameter space. In contrast, in local optimization, the goal is to find a design point that is lowest relative to a ``nearby'' region of the parameter space. In almost all cases, global optimization will be more computationally expensive than local optimization. Thus, the user must choose an optimization algorithm with an appropriate search scope that best fits the problem goals and the computational budget.
The search method refers to the approach taken in the optimization algorithm to locate a new design point that has a lower objective function or is more feasible than the current design point. The search method can be classified as either gradientbased or nongradientbased. In a gradientbased algorithm, gradients of the response functions are computed to find the direction of improvement. Gradientbased optimization is the search method that underlies many efficient local optimization methods. However, a drawback to this approach is that gradients can be computationally expensive, inaccurate, or even nonexistent. In such situations, nongradientbased search methods may be useful. There are numerous approaches to nongradientbased optimization. Some of the more well known of these include pattern search methods (nongradientbased local techniques) and genetic algorithms (nongradientbased global techniques).
Because of the computational cost of running simulation models, surrogatebased optimization (SBO) methods are often used to reduce the number of actual simulation runs. In SBO, a surrogate or approximate model is constructed based on a limited number of simulation runs. The optimization is then performed on the surrogate model. Dakota has an extensive framework for managing a variety of local, multipoint, global, and hierarchical surrogates for use in optimization. Finally, sometimes there are multiple objectives that one may want to optimize simultaneously instead of a single scalar objective. In this case, one may employ multiobjective methods that are described in Section 6.3.1 of[5].
This overview of optimization approaches underscores that no single optimization method or algorithm works best for all types of optimization problems. Section 6.4 of[5] offers guidelines for choosing a Dakota optimization algorithm best matched to your specific optimization problem.
Dakota's input commands permit the user to specify twosided nonlinear inequality constraints of the form , as well as nonlinear equality constraints of the form . Some optimizers (e.g., npsol_
, optpp_
, soga
, and moga
methods) can handle these constraint forms directly, whereas other optimizers (e.g., asynch_pattern_search
, dot_
, and conmin_
, mesh_adaptive_search
) require Dakota to perform an internal conversion of all constraints to onesided inequality constraints of the form . In the latter case, the twosided inequality constraints are treated as and and the equality constraints are treated as and . The situation is similar for linear constraints: asynch_pattern_search
, npsol_
, optpp_
, soga
, and moga
methods support them directly, whereas dot_
and conmin_
methods do not. For linear inequalities of the form and linear equalities of the form , the nonlinear constraint arrays in dot_
and conmin_
methods are further augmented to include and in the inequality case and and in the equality case. Awareness of these constraint augmentation procedures can be important for understanding the diagnostic data returned from the dot_
and conmin_
methods. Other optimizers fall somewhere in between. nlpql_
methods support nonlinear equality constraints and nonlinear onesided inequalities , but does not natively support linear constraints. Constraint mappings are used with NLPQL for both linear and nonlinear cases. Most coliny_
methods now support twosided nonlinear inequality constraints and nonlinear constraints with targets, but do not natively support linear constraints.
When gradient and Hessian information is used in the optimization, derivative components are most commonly computed with respect to the active continuous variables, which in this case are the continuous design variables. This differs from parameter study methods (for which all continuous variables are active) and from nondeterministic analysis methods (for which the uncertain variables are active). Refer to Chapter 11 of[5] for additional information on derivative components and active continuous variables.
This section summarizes the optimization methods available in Dakota. We group them according to search method and search goal and establish their relevance to types of problems. For a summary of this discussion, see Section 6.4 of[5].
Gradientbased optimizers are best suited for efficient navigation to a local minimum in the vicinity of the initial point. They are not intended to find global optima in nonconvex design spaces. For global optimization methods, see Section 6.2.3 of[5]. Gradientbased optimization methods are highly efficient, with the best convergence rates of all of the local optimization methods, and are the methods of choice when the problem is smooth, unimodal, and wellbehaved. However, these methods can be among the least robust when a problem exhibits nonsmooth, discontinuous, or multimodal behavior. The derivativefree methods described in Section 6.2.2 of[5] are more appropriate for problems with these characteristics.
Gradient accuracy is a critical factor for gradientbased optimizers, as inaccurate derivatives will often lead to failures in the search or premature termination of the method. Analytic gradients and Hessians are ideal but often unavailable. If analytic gradient and Hessian information can be provided by an application code, a full Newton method will achieve quadratic convergence rates near the solution. If only gradient information is available and the Hessian information is approximated from an accumulation of gradient data, the superlinear convergence rates can be obtained. It is most often the case for engineering applications, however, that a finite difference method will be used by the optimization algorithm to estimate gradient values. Dakota allows the user to select the step size for these calculations, as well as choose between forwarddifference and centraldifference algorithms. The finite difference step size should be selected as small as possible, to allow for local accuracy and convergence, but not so small that the steps are ``in the noise.'' This requires an assessment of the local smoothness of the response functions using, for example, a parameter study method. Central differencing will generally produce more reliable gradients than forward differencing but at roughly twice the expense.
Gradientbased methods for nonlinear optimization problems can be described as iterative processes in which a sequence of subproblems, usually which involve an approximation to the full nonlinear problem, are solved until the solution converges to a local optimum of the full problem. The optimization methods available in Dakota fall into several categories, each of which is characterized by the nature of the subproblems solved at each iteration.