
Model Commands


Model Description

The model specification in a DAKOTA input file specifies the components to be used in constructing a particular model instance. This specification selects a Model from the model hierarchy, which includes SingleModel, DataFitSurrModel, HierarchSurrModel, and NestedModel derived classes. Depending on the type of derived model, different sub-specifications are needed to construct different components of the model. In all cases, however, the model provides the logical unit for determining how a set of variables is mapped into a set of responses in support of an iterative method.

Several examples follow. The first example shows a minimal specification for a single model.

model,					        \
	single
This example does not provide any pointers and therefore relies on the default behavior of constructing the model with the last variables, interface, and responses specifications parsed. This is also the default model specification, for the case where no model specifications are provided by the user.

The next example displays a surrogate model specification which selects a quadratic polynomial from among the global approximation methods. It uses a pointer to a design of experiments method for generating the data needed for building the global approximation, reuses any old data available for the current approximation region, and employs the first-order multiplicative approach to correcting the approximation at the center of the current approximation region.

model,					        \
	id_model = 'M1'				\
	variables_pointer = 'V1'		\
	responses_pointer = 'R1'		\
	surrogate global			\
	  quadratic polynomial			\
	  dace_method_pointer = 'DACE'		\
	  reuse_samples region			\
	  correction multiplicative first_order
This example demonstrates the use of identifiers and pointers. It provides the optional model independent specifications for model identifier, variables pointer, and responses pointer (see Model Independent Controls) as well as model dependent specifications for global surrogates (see Global approximations).

Finally, an advanced nested model example would be

model,					        			\
	id_model = 'M1'							\
	variables_pointer = 'V1'					\
	responses_pointer = 'R1'					\
	nested								\
	  optional_interface_pointer = 'OI1'				\
	    optional_interface_responses_pointer = 'OIR1'		\
	  sub_method_pointer = 'SM1'					\
	    primary_variable_mapping   = ''  ''  'X'     'Y'		\
	    secondary_variable_mapping = ''  ''  'mean'  'mean'		\
	    primary_response_mapping   = 1. 0. 0. 0. 0. 0. 0. 0. 0.	\
	    secondary_response_mapping = 0. 0. 0. 1. 3. 0. 0. 0. 0.	\
					 0. 0. 0. 0. 0. 0. 1. 3. 0.
This example also supplies model independent controls for model identifier, variables pointer, and responses pointer (see Model Independent Controls), and supplies model dependent controls for specifying details of the nested mapping (see Nested Model Controls).

Model Specification

As alluded to in the examples above, the model specification has the following structure:
model, 						\
	<model independent controls>		\
	<model selection>			\
	  <model dependent controls>

The <model independent controls> are those controls which are valid for all models. Referring to dakota.input.spec, these controls are defined externally from and prior to the model selection blocks. The model selection blocks are all required group specifications separated by logical ORs, where the model selection must be single OR surrogate OR nested. If a surrogate model is specified, a secondary selection must be made for its type: global, multipoint, local, or hierarchical. The <model dependent controls> are those controls which are only meaningful for a specific model. These controls are defined within each model selection block. Defaults for model independent and model dependent controls are defined in DataModel. The following sections provide additional detail on the model independent controls followed by the model selections and their corresponding model dependent controls.

Model Independent Controls

The model independent controls include a model identifier string, pointers to variables and responses specifications, and a model type specification. The model identifier string is supplied with id_model and is used to provide a unique identifier string for use within method specifications (refer to model_pointer in Method Independent Controls).

The type of model can be single, nested, or surrogate. Each of these model specifications supports variables_pointer and responses_pointer strings for identifying the variables and responses specifications used in constructing the model (by cross-referencing with id_variables and id_responses strings from particular variables and responses keyword specifications). These pointers are valid for each model type since each model contains a set of variables that is mapped into a set of responses -- only the specifics of the mapping differ. Additional pointers are used for each model type for constructing the components of the variable to response mapping. As a strategy specification identifies one or more methods and a method specification identifies a model, a model specification identifies variables, interface, and responses specifications. This top-down flow specifies all of the object interrelationships.

For each of these pointer specifications, if a pointer string is specified and no corresponding id string is available, DAKOTA will exit with an error message. If the pointer is optional and no pointer string is specified, then the last specification parsed will be used. It is appropriate to omit optional cross-referencing whenever the relationships are unambiguous due to the presence of only one specification.
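The pointer resolution rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not DAKOTA's implementation: the function name resolve_pointer, the list-of-tuples layout for parsed specifications, and the placeholder specification contents are all assumptions made for the example.

```python
def resolve_pointer(pointer, specs):
    """Return the specification that a pointer string refers to.

    specs   : list of (id_string, spec) tuples in parse order (hypothetical layout)
    pointer : the pointer string, or None when an optional pointer is omitted
    """
    if not specs:
        raise ValueError("no specifications were parsed")
    if pointer is None:
        # Optional pointer omitted: fall back to the last specification parsed.
        return specs[-1][1]
    for spec_id, spec in specs:
        if spec_id == pointer:
            return spec
    # Pointer supplied but no matching id string: this is the error case.
    raise LookupError(f"no specification with id '{pointer}'")


# Two hypothetical variables specifications, in parse order.
variables_specs = [("V1", {"label": "first"}), ("V2", {"label": "second"})]

print(resolve_pointer("V1", variables_specs))   # explicit cross-reference
print(resolve_pointer(None, variables_specs))   # default: last one parsed
```

With only one specification in the list, the explicit and default lookups return the same object, which is why optional cross-referencing can be omitted in that case.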

Table 6.1 provides the specification detail for the model independent controls involving identifiers, model type controls, and pointers.

Table 6.1 Specification detail for the model independent controls: identifiers, model types, and pointers
Description            Keyword                      Associated Data  Status          Default
Model set identifier   id_model                     string           Optional        method use of last model parsed
Model type             single | surrogate | nested  none             Required group  N/A (single if no model specification)
Variables set pointer  variables_pointer            string           Optional        model use of last variables parsed
Responses set pointer  responses_pointer            string           Optional        model use of last responses parsed

Single Model Controls

In the single model case, a single interface is used to map the variables into responses. The optional interface_pointer specification identifies this interface by cross-referencing with the id_interface string input from a particular interface keyword specification.

Table 6.2 provides the specification detail for single models.

Table 6.2 Specification detail for single models
Description            Keyword            Associated Data  Status    Default
Interface set pointer  interface_pointer  string           Optional  model use of last interface parsed

Surrogate Model Controls

In the surrogate model case, the specification first allows a mixture of surrogate and actual response mappings through the use of the optional id_surrogates specification. This identifies the subset of the response functions by number that are to be approximated (the default is all functions). The valid response function identifiers range from 1 through the total number of response functions (see Function Specification). Next, the specification selects a global, multipoint, local, or hierarchical approximation. Table 6.3 provides the specification detail for surrogate models.
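As a rough illustration of how id_surrogates mixes surrogate and actual response mappings, the following Python sketch selects, per response function, either an approximation or the truth function. The function evaluate_mixed and the toy response functions are hypothetical and only mirror the numbering rule described above (identifiers run from 1 through the number of response functions, and the default is to approximate all of them).

```python
def evaluate_mixed(x, truth_fns, surrogate_fns, id_surrogates=None):
    """Evaluate each response with its surrogate or its truth function.

    id_surrogates holds 1-based response function numbers to approximate;
    None mimics the default of approximating every response function.
    """
    n = len(truth_fns)
    if id_surrogates is None:
        ids = set(range(1, n + 1))
    else:
        ids = set(id_surrogates)
    invalid = sorted(i for i in ids if not 1 <= i <= n)
    if invalid:
        raise ValueError(f"id_surrogates out of range 1..{n}: {invalid}")
    return [
        surrogate_fns[k](x) if (k + 1) in ids else truth_fns[k](x)
        for k in range(n)
    ]


# Hypothetical two-response problem in which only response 2 is approximated.
truth = [lambda x: x * x, lambda x: x + 1.0]
surrogate = [lambda x: 0.0, lambda x: 100.0]
mixed = evaluate_mixed(3.0, truth, surrogate, id_surrogates=[2])
print(mixed)  # [9.0, 100.0]
```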

Table 6.3 Specification detail for the surrogate models
Description               Keyword                                     Associated Data   Status          Default
Surrogate response ids    id_surrogates                               list of integers  Optional        All response functions are approximated
Surrogate type selection  global | multipoint | local | hierarchical  none              Required group  N/A

Each of these surrogate types provides an approximate representation of a "truth" model which is used to perform the parameter to response mappings. This approximation is built and updated using data from the truth model. This data is generated in some cases using a design of experiments iterator applied to the truth model (global approximations with a dace_method_pointer). In other cases, truth model data from a single point (local, hierarchical approximations), from a few previously evaluated points (multipoint approximations), or from the restart database (global approximations with reuse_samples) can be used. Surrogate models are used extensively in the surrogate-based optimization strategy (see SurrBasedOptStrategy and Surrogate-based Optimization (SBO) Commands), in which the goals are to reduce expense by minimizing the number of truth function evaluations and to smooth out noisy data with a global data fit. However, the use of surrogate models is not restricted in any way to optimization techniques, and in fact, the uncertainty quantification methods and optimization under uncertainty strategy are other primary users.

The following sections present the global, multipoint, local, and hierarchical specification groups in further detail.

Global approximations

The global surrogate model specification requires the specification of one of the following approximation methods: neural_network, polynomial, mars, hermite, gaussian_process, or kriging. These specifications invoke a layered perceptron artificial neural network approximation, a polynomial regression approximation, a multivariate adaptive regression spline approximation, a Hermite polynomial approximation, a Gaussian process approximation, or a kriging interpolation approximation, respectively. In the polynomial case, the order of the polynomial (linear, quadratic, or cubic) must be specified, and in the kriging case, a vector of correlations can be optionally specified in order to bypass the internal kriging calculations of correlation coefficients. Note that the Gaussian process approximation is new, and currently always invokes an internal optimization procedure to determine the correlation coefficients.

For each of the global surrogates, dace_method_pointer, reuse_samples, correction, and use_gradients can be optionally specified. The dace_method_pointer specification points to a design of experiments iterator which can be used to generate truth model data for building a global data fit. The reuse_samples specification can be used to employ old data (either from previous function evaluations performed in the run or from function evaluations read from a restart database or text file) in the building of new global approximations. The default is no reuse of old data (since this can induce directional bias), and the settings of all, region, and samples_file result in reuse of all available data, reuse of all data available in the current trust region, and reuse of all data from a specified text file, respectively. The combination of new build data from dace_method_pointer and old build data from reuse_samples must be sufficient for building the global approximation. If not enough data is available, the system will abort with an error message. Both dace_method_pointer and reuse_samples are optional specifications, which gives the user maximum flexibility in using design of experiments data, restart/text file data, or both.
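The way new and reused build data combine can be sketched as follows. This is a simplified illustration under stated assumptions, not DAKOTA code: gather_build_data is a hypothetical helper, the samples_file setting is left out, the trust region is modeled as simple lower and upper bounds, and min_points stands in for whatever minimum the chosen approximation type requires.

```python
def gather_build_data(new_samples, old_samples, reuse=None,
                      lower=None, upper=None, min_points=1):
    """Combine new design-of-experiments samples with reused old samples.

    Each sample is a (point, response) pair. reuse is None (default, no
    reuse), "all", or "region"; lower/upper bound the current region when
    reuse == "region".
    """
    if reuse is None:
        reused = []
    elif reuse == "all":
        reused = list(old_samples)
    elif reuse == "region":
        reused = [
            (pt, resp) for pt, resp in old_samples
            if all(lo <= xi <= hi for xi, lo, hi in zip(pt, lower, upper))
        ]
    else:
        raise ValueError(f"unsupported reuse setting: {reuse!r}")
    data = list(new_samples) + reused
    if len(data) < min_points:
        # Mirrors the abort described above when the data is insufficient.
        raise RuntimeError("not enough data to build the global approximation")
    return data


old = [([0.0, 0.0], 1.0), ([2.0, 2.0], 5.0)]
new = [([0.5, 0.5], 1.5)]
in_region = gather_build_data(new, old, reuse="region",
                              lower=[-1.0, -1.0], upper=[1.0, 1.0])
print(len(in_region))  # 2: one new sample plus one old sample inside the region
```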

The correction specification specifies that the approximation will be corrected to match truth data, either matching truth values in the case of zeroth_order matching, matching truth values and gradients in the case of first_order matching, or matching truth values, gradients, and Hessians in the case of second_order matching. For additive and multiplicative corrections, the correction is local in that the truth data is matched at a single point, typically the center of the approximation region. The additive correction adds a scalar offset (zeroth_order), a linear function (first_order), or a quadratic function (second_order) to the approximation to match the truth data at the point, and the multiplicative correction multiplies the approximation by a scalar (zeroth_order), a linear function (first_order), or a quadratic function (second_order) to match the truth data at the point. The additive first_order case is due to [Lewis and Nash, 2000] and the multiplicative first_order case is commonly known as beta correction [Haftka, 1991]. For the combined correction, the use of both additive and multiplicative corrections allows the satisfaction of an additional matching condition, typically the truth function values at the previous correction point (e.g., the center of the previous trust region). The combined correction is then a multipoint correction, as opposed to the local additive and multiplicative corrections. Each of these correction capabilities is described in detail in [Eldred et al., 2004a].
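The first-order additive and multiplicative corrections can be illustrated with a short Python sketch. It uses one common form that is consistent with the description above: the additive correction adds a linear model of the difference (truth minus approximation), and the multiplicative correction scales the approximation by a linear model of the ratio (truth over approximation), both expanded about the correction point. The function names and the toy truth and approximation functions are assumptions for illustration; see [Eldred et al., 2004a] for the formulation actually used.

```python
def additive_first_order(approx, truth_val, truth_grad, approx_val, approx_grad, xc):
    """corrected(x) = approx(x) + A(xc) + grad A(xc) . (x - xc), A = truth - approx."""
    a0 = truth_val - approx_val
    a1 = [t - a for t, a in zip(truth_grad, approx_grad)]

    def corrected(x):
        return approx(x) + a0 + sum(g * (xi - ci) for g, xi, ci in zip(a1, x, xc))
    return corrected


def multiplicative_first_order(approx, truth_val, truth_grad, approx_val, approx_grad, xc):
    """corrected(x) = approx(x) * (B(xc) + grad B(xc) . (x - xc)), B = truth / approx."""
    b0 = truth_val / approx_val  # requires a nonzero approximation value at xc
    b1 = [(t * approx_val - truth_val * a) / approx_val ** 2
          for t, a in zip(truth_grad, approx_grad)]

    def corrected(x):
        return approx(x) * (b0 + sum(g * (xi - ci) for g, xi, ci in zip(b1, x, xc)))
    return corrected


# Toy problem (hypothetical): truth and approximation evaluated at the center xc.
truth = lambda x: x[0] ** 2 + 2.0 * x[1] + 1.0
approx = lambda x: x[0] + x[1] + 1.0
xc = [1.0, 2.0]
truth_val, truth_grad = truth(xc), [2.0 * xc[0], 2.0]
approx_val, approx_grad = approx(xc), [1.0, 1.0]

add_corr = additive_first_order(approx, truth_val, truth_grad, approx_val, approx_grad, xc)
mult_corr = multiplicative_first_order(approx, truth_val, truth_grad, approx_val, approx_grad, xc)
print(add_corr(xc), mult_corr(xc), truth(xc))  # all three agree at the center
```

Both corrected functions reproduce the truth value and gradient at xc; they differ away from that point, which is where the choice between additive and multiplicative forms matters.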

Finally, the use_gradients flag specifies a future capability for the use of gradient data in the global approximation builds. This capability is currently supported in SurrBasedOptStrategy, SurrogateDataPoint, and Approximation::build(), but is not yet supported in any global approximation derived class redefinitions of Approximation::find_coefficients(). Tables 6.4 and 6.5 summarize the global approximation specifications.

Table 6.4 Specification detail for global approximations: global approximation type
Description                               Keyword           Associated Data             Status                              Default
Global approximations                     global            none                        Required group (1 of 4 selections)  N/A
Artificial neural network                 neural_network    none                        Required (1 of 6 selections)        N/A
Polynomial                                polynomial        linear | quadratic | cubic  Required group (1 of 6 selections)  N/A
Multivariate adaptive regression splines  mars              none                        Required (1 of 6 selections)        N/A
Hermite polynomial                        hermite           none                        Required (1 of 6 selections)        N/A
Gaussian process                          gaussian_process  none                        Required (1 of 6 selections)        N/A
Kriging interpolation                     kriging           none                        Required group (1 of 6 selections)  N/A
Kriging correlations                      correlations      list of reals               Optional                            internally computed correlations

Table 6.5 Specification detail for global approximations: build and correction controls
Description                                          Keyword              Associated Data                              Status          Default
Design of experiments method pointer                 dace_method_pointer  string                                       Optional        no design of experiments data
Sample reuse in global approximation builds          reuse_samples        all | region | samples_file                  Optional group  no sample reuse
Surrogate correction approach                        correction           additive or multiplicative or combined,      Optional group  no surrogate correction
                                                                          zeroth_order or first_order or second_order
Use of gradient data in global approximation builds  use_gradients        none                                         Optional        gradient data not used in global approximation builds

Multipoint approximations

Multipoint approximations use data from previous design points to improve the accuracy of local approximations. Currently, the Two-point Adaptive Nonlinearity Approximation (TANA-3) method of [Xu and Grandhi, 1998] is supported. This method requires response value and gradient information from two points, and uses a first-order Taylor series if only one point is available. The truth model to be used to generate the value/gradient data used in the approximation is identified through the required actual_model_pointer specification. Table 6.6 summarizes the multipoint approximation specifications.

Table 6.6 Specification detail for multipoint approximations
Description                                 Keyword               Associated Data  Status                              Default
Multipoint approximation                    multipoint            none             Required group (1 of 4 selections)  N/A
Two-point adaptive nonlinear approximation  tana                  none             Required                            N/A
Pointer to the truth model specification    actual_model_pointer  string           Required                            N/A

Local approximations

Local approximations use value, gradient, and possibly Hessian data from a single point to form a series expansion for approximating data in the vicinity of this point. The currently available local approximation is the taylor_series selection. The order of the Taylor series may be either first-order or second-order, which is automatically determined from the gradient and Hessian specifications in the responses specification (see Gradient Specification and Hessian Specification) for the truth model.
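A minimal Python sketch of such a series expansion follows. The helper taylor_series is hypothetical; passing a Hessian yields the second-order form and omitting it yields the first-order form, loosely mirroring how the order follows from the derivative data available for the truth model.

```python
def taylor_series(x0, f0, grad, hessian=None):
    """Build a first- or second-order Taylor approximation about x0.

    f0, grad, and hessian are the truth value, gradient, and (optionally)
    Hessian at the expansion point x0.
    """
    def approx(x):
        d = [xi - ci for xi, ci in zip(x, x0)]
        value = f0 + sum(g * di for g, di in zip(grad, d))
        if hessian is not None:
            n = len(d)
            value += 0.5 * sum(d[i] * hessian[i][j] * d[j]
                               for i in range(n) for j in range(n))
        return value
    return approx


# Toy truth function f(x) = x0^2 + 3*x1, expanded about (1, 1).
x0 = [1.0, 1.0]
first = taylor_series(x0, 4.0, [2.0, 3.0])
second = taylor_series(x0, 4.0, [2.0, 3.0], [[2.0, 0.0], [0.0, 0.0]])
print(first([2.0, 0.0]), second([2.0, 0.0]))  # 3.0 4.0
```

Because the toy function is quadratic, the second-order expansion reproduces it exactly, while the first-order expansion is only accurate near the expansion point.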

The truth model to be used to generate the value/gradient/Hessian data used in the series expansion is identified through the required actual_model_pointer specification. The use of a model pointer (as opposed to an interface pointer) allows additional flexibility in defining the approximation. In particular, the derivative specification for the truth model may differ from the derivative specification for the approximation, and the truth model results being approximated may involve a model recursion (e.g., the values/gradients from a nested model). Table 6.7 summarizes the local approximation interface specifications.

Table 6.7 Specification detail for local approximations
Description                               Keyword               Associated Data  Status                              Default
Local approximation                       local                 none             Required group (1 of 4 selections)  N/A
Taylor series local approximation         taylor_series         none             Required                            N/A
Pointer to the truth model specification  actual_model_pointer  string           Required                            N/A

Hierarchical approximations

Hierarchical approximations use corrected results from a low fidelity model as an approximation to the results of a high fidelity "truth" model. These approximations are also known as model hierarchy, multifidelity, variable fidelity, and variable complexity approximations. The required low_fidelity_model_pointer specification points to the low fidelity model specification. This model is used to generate low fidelity responses which are then corrected and returned to an iterator. The required high_fidelity_model_pointer specification points to the specification for the high fidelity truth model. This model is used only for verifying low fidelity results and updating low fidelity corrections. The correction specification specifies which correction technique will be applied to the low fidelity results in order to match the high fidelity results at one or more points. In the hierarchical case (as compared to the global case), the correction specification is required, since the omission of a correction technique would effectively eliminate the purpose of the high fidelity model. If it is desired to use a low fidelity model without corrections, then a hierarchical approximation is not needed and a single model should be used. Refer to Global approximations for additional information on available correction approaches. Table 6.8 summarizes the hierarchical approximation specifications.
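The division of labor between the two models can be sketched as below. This is a deliberately simplified illustration, not the HierarchSurrModel implementation: the class HierarchicalSurrogate is hypothetical, the models are scalar functions of one variable, and only a zeroth-order additive correction is applied.

```python
class HierarchicalSurrogate:
    """Low fidelity model plus a zeroth-order additive correction (sketch)."""

    def __init__(self, low_fidelity, high_fidelity):
        self.low = low_fidelity
        self.high = high_fidelity
        self.offset = 0.0
        self.high_evals = 0  # counts the expensive truth evaluations

    def update_correction(self, center):
        # The high fidelity model is used only here, to update the correction.
        self.high_evals += 1
        self.offset = self.high(center) - self.low(center)

    def __call__(self, x):
        # The iterator only ever sees corrected low fidelity results.
        return self.low(x) + self.offset


# Hypothetical cheap and expensive models.
low = lambda x: x * x
high = lambda x: x * x + 0.5 * x + 1.0

model = HierarchicalSurrogate(low, high)
model.update_correction(2.0)
print(model(2.0), high(2.0))  # 6.0 6.0: the values match at the correction point
print(model(3.0), high(3.0))  # 11.0 11.5: they drift apart away from it
print(model.high_evals)       # 1
```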

Table 6.8 Specification detail for hierarchical approximations
Description                                        Keyword                      Associated Data                              Status                              Default
Hierarchical approximation                         hierarchical                 none                                         Required group (1 of 4 selections)  N/A
Pointer to the low fidelity model specification    low_fidelity_model_pointer   string                                       Required                            N/A
Pointer to the high fidelity model specification   high_fidelity_model_pointer  string                                       Required                            N/A
Surrogate correction approach                      correction                   additive or multiplicative or combined,      Required group                      N/A
                                                                                zeroth_order or first_order or second_order

Nested Model Controls

In the nested model case, a sub_method_pointer must be provided in order to specify the nested iterator, and optional_interface_pointer and optional_interface_responses_pointer provide an optional group specification for the optional interface portion of nested models (where optional_interface_pointer points to the interface specification and optional_interface_responses_pointer points to a responses specification describing the data to be returned by this interface). This interface is used to provide non-nested data, which is then combined with data from the nested iterator using the primary_response_mapping and secondary_response_mapping inputs (see mapping discussion below).

Table 6.9 provides the specification detail for nested model pointers.

Table 6.9 Specification detail for nested models
Description                                             Keyword                               Associated Data  Status          Default
Interface set pointer                                   optional_interface_pointer            string           Optional group  no optional interface
Responses pointer for nested model optional interfaces  optional_interface_responses_pointer  string           Optional        reuse of top-level responses specification
Sub-method pointer for nested models                    sub_method_pointer                    string           Required        N/A

Nested models may employ mappings for both the variable inputs to the sub-model and the response outputs from the sub-model. In the former case, the primary_variable_mapping and secondary_variable_mapping specifications are used to map from the top-level variables into the sub-model variables, and in the latter case, the primary_response_mapping and secondary_response_mapping specifications are used to map from the sub-model responses back to the top-level responses. For the variable mappings, the primary and secondary specifications provide lists of strings which are used to target active sub-model variables and their distribution parameters, respectively. The primary strings are matched to variable labels such as 'cdv_1' (either user-supplied or default labels), and the secondary strings are matched to distribution parameters such as 'mean' or 'std_deviation' (the singular form of the uncertain distribution parameter keywords, lacking the prepended distribution type identifier). Both specifications are optional, which is designed to support three possibilities:

  1. If both primary and secondary variable mappings are specified, then an active top-level variable value will be inserted into the identified sub-model distribution parameter (the secondary mapping) for the identified active sub-model variable (the primary mapping).
  2. If a primary mapping is specified but a secondary mapping is not, then an active top-level variable value will be inserted into the identified active sub-model variable value (the primary mapping).
  3. If a primary mapping is not specified, then an active top-level variable value will be added as an inactive sub-model variable, augmenting the active sub-model variables (note: if a secondary mapping is specified in this case, it will be ignored).

These different variable mapping possibilities may be used in any combination by employing empty strings ('') for particular omitted mappings (the number of strings in user-supplied primary and secondary variable mapping specifications must equal the number of active top-level variables).
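The three variable mapping possibilities can be sketched in Python as follows. The function map_variables and its return layout are assumptions made for illustration; only the decision logic (both mappings, primary only, or no primary) follows the description above.

```python
def map_variables(top_values, primary, secondary):
    """Classify each active top-level variable by its mapping strings.

    Returns three collections:
      param_inserts : {(sub_model_label, distribution_parameter): value}  (case 1)
      value_inserts : {sub_model_label: value}                            (case 2)
      augmented     : values added as inactive sub-model variables        (case 3)
    """
    if not (len(top_values) == len(primary) == len(secondary)):
        raise ValueError("mapping lengths must equal the number of "
                         "active top-level variables")
    param_inserts, value_inserts, augmented = {}, {}, []
    for value, p, s in zip(top_values, primary, secondary):
        if p and s:
            param_inserts[(p, s)] = value   # case 1: insert into a distribution parameter
        elif p:
            value_inserts[p] = value        # case 2: insert into the variable value
        else:
            augmented.append(value)         # case 3: any secondary string is ignored
    return param_inserts, value_inserts, augmented


# Four top-level variables with hypothetical values.
params, values, extra = map_variables(
    [1.0, 2.0, 3.0, 4.0],
    ['', '', 'X', 'Y'],
    ['', '', 'mean', 'mean'],
)
print(params)  # {('X', 'mean'): 3.0, ('Y', 'mean'): 4.0}
print(values)  # {}
print(extra)   # [1.0, 2.0]
```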

For the response mappings, the primary and secondary specifications provide real-valued multipliers to be applied to sub-iterator response results. The sub-iterator response results are defined as follows for different sub-iterator types:

The primary values map sub-iterator response results into top-level objective functions, least squares terms, or generic response functions, depending on the declared top-level response set. The secondary values map sub-iterator response results into top-level nonlinear inequality and equality constraints. Refer to NestedModel::response_mapping() for additional details.

An example of variable and response mappings is provided below:

primary_variable_mapping   = ''  ''  'X'     'Y'	\
secondary_variable_mapping = ''  ''  'mean'  'mean'	\
primary_response_mapping   = 1. 0. 0. 0. 0. 0. 0. 0. 0.	\
secondary_response_mapping = 0. 0. 0. 1. 3. 0. 0. 0. 0.	\
			     0. 0. 0. 0. 0. 0. 1. 3. 0.
The variable mappings correspond to 4 top-level variables, the first two of which augment the active sub-model variables as inactive sub-model variables (option 3 above) and the latter two of which are inserted into the mean distribution parameters of active sub-model variables 'X' and 'Y' (option 1 above). The response mappings correspond to 9 sub-iterator response functions (e.g., a set of UQ final statistics for 3 response functions, each with a mean, a standard deviation, and a reliability level). The primary response mapping maps the first sub-iterator response function (mean) into a single objective function, least squares term, or generic response function (as dictated by the top-level response specification), and the secondary response mapping maps the fourth sub-iterator response function plus 3 times the fifth sub-iterator response function (mean plus 3 standard deviations) into one top-level nonlinear constraint and the seventh sub-iterator response function plus 3 times the eighth sub-iterator response function (mean plus 3 standard deviations) into another top-level nonlinear constraint (these top-level nonlinear constraints may be inequality or equality, as dictated by the top-level response specification).
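The arithmetic behind this example can be checked with a small Python sketch that treats each mapping as rows of multipliers applied to the nine sub-iterator results. The helper apply_mapping and the sample statistics are hypothetical, and any contribution from an optional interface is left out.

```python
def apply_mapping(mapping_rows, sub_results):
    """Return one weighted sum of sub-iterator results per mapping row."""
    return [sum(c * r for c, r in zip(row, sub_results)) for row in mapping_rows]


primary_response_mapping = [
    [1., 0., 0., 0., 0., 0., 0., 0., 0.],
]
secondary_response_mapping = [
    [0., 0., 0., 1., 3., 0., 0., 0., 0.],
    [0., 0., 0., 0., 0., 0., 1., 3., 0.],
]

# Hypothetical statistics: (mean, standard deviation, reliability level)
# for each of 3 sub-model response functions.
stats = [10.0, 1.0, 2.5,
         20.0, 2.0, 3.0,
         30.0, 0.5, 1.5]

objective = apply_mapping(primary_response_mapping, stats)
constraints = apply_mapping(secondary_response_mapping, stats)
print(objective)    # [10.0]        mean of the first response
print(constraints)  # [26.0, 31.5]  mean + 3 standard deviations for responses 2 and 3
```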

Table 6.10 provides the specification detail for the model dependent controls involving nested model mappings.

Table 6.10 Specification detail for the model dependent controls: nested model mappings
Description                                    Keyword                     Associated Data  Status    Default
Primary variable mappings for nested models    primary_variable_mapping    list of strings  Optional  augmentation of sub-model variables (no insertion)
Secondary variable mappings for nested models  secondary_variable_mapping  list of strings  Optional  primary mappings into sub-model variables are value-based
Primary response mappings for nested models    primary_response_mapping    list of reals    Optional  no sub-iterator contribution to primary functions
Secondary response mappings for nested models  secondary_response_mapping  list of reals    Optional  no sub-iterator contribution to secondary functions



Generated on Fri Oct 13 19:34:25 2006 for DAKOTA by  doxygen 1.4.1