Variables Commands

Variables Description

The variables section in a DAKOTA input file specifies the parameter set to be iterated by a particular method. This parameter set is made up of design, uncertain, and state variables. Design variables can be continuous or discrete and consist of those variables which an optimizer adjusts in order to locate an optimal design. Each of the design parameters can have an initial point and a descriptive tag. Continuous and discrete range types include lower and upper bounds, and discrete set types include the admissible set values.

Uncertain variables may be categorized as either aleatory or epistemic and either continuous or discrete. Continuous aleatory uncertain variables include normal, lognormal, uniform, loguniform, triangular, exponential, beta, gamma, gumbel, frechet, weibull, and histogram bin distributions. Discrete aleatory uncertain variables include poisson, binomial, negative binomial, geometric, hypergeometric, and histogram point distributions. In addition to aleatory uncertain variables defined by probability distributions, DAKOTA also supports epistemic uncertain variables that are non-probabilistic. The interval type specification is a continuous epistemic type that supports both simple bounded intervals as well as basic probability assignment (BPA) belief structures, where a BPA defines the uncertainty in a variable through providing one or more intervals in which the variable may lie along with varying levels of belief for each interval.

Each uncertain variable specification contains descriptive tags and most contain, either explicitly or implicitly, distribution lower and upper bounds. Distribution lower and upper bounds are explicit portions of the normal, lognormal, uniform, loguniform, triangular, and beta specifications, whereas they are implicitly defined for histogram bin, histogram point, and interval variables (from the extreme values within the bin/point/interval specifications) as well as for binomial (0 to num_trials) and hypergeometric (0 to min(num_drawn, num_selected)) variables. When used with design of experiments and multidimensional parameter studies, distribution bounds are also inferred for normal and lognormal (if optional bounds are unspecified) as well as for exponential, gamma, gumbel, frechet, weibull, poisson, negative binomial, and geometric (which have no bounds specifications); these bounds are [0, $\mu + 3 \sigma$] for exponential, gamma, frechet, weibull, poisson, negative binomial, geometric, and unspecified lognormal, and [$\mu - 3 \sigma$, $\mu + 3 \sigma$] for gumbel and unspecified normal. For other types of parameter studies (vector and centered), an inferred initial starting point is needed for the uncertain variables. All uncertain variables are initialized to their means for these studies, where mean values for bounded normal and bounded lognormal may additionally be repaired to satisfy any specified distribution bounds, mean values for discrete integer range distributions are rounded down to the nearest integer, and mean values for discrete set distributions are rounded to the nearest set value.
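The inferred-bounds rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not DAKOTA's implementation: the helper name `inferred_bounds`, the type strings, and the argument layout are hypothetical, and the caller is assumed to supply the distribution mean and standard deviation.

```python
# Illustrative sketch only; `inferred_bounds` is a hypothetical helper,
# not a DAKOTA function.

# Types whose inferred range starts at zero, per the rule in the text
# (lognormal applies only when its optional bounds are unspecified).
ZERO_LOWER = {"exponential", "gamma", "frechet", "weibull", "poisson",
              "negative_binomial", "geometric", "lognormal"}
# Types whose inferred range is symmetric about the mean
# (normal applies only when its optional bounds are unspecified).
SYMMETRIC = {"gumbel", "normal"}


def inferred_bounds(dist_type, mean, std_dev):
    """Return (lower, upper) inferred from the mean and standard deviation."""
    upper = mean + 3.0 * std_dev
    if dist_type in ZERO_LOWER:
        return (0.0, upper)
    if dist_type in SYMMETRIC:
        return (mean - 3.0 * std_dev, upper)
    raise ValueError(f"no inferred-bounds rule for {dist_type!r}")


# An exponential variable with beta = 2 has mean 2 and standard deviation 2.
print(inferred_bounds("exponential", 2.0, 2.0))  # (0.0, 8.0)
# An unbounded normal variable with mean 1 and standard deviation 1.
print(inferred_bounds("normal", 1.0, 1.0))       # (-2.0, 4.0)
```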

In addition to tags and bounds specifications, each distribution type has its own parameters: normal variables include mean and standard deviation specifications; lognormal variables include lambda and zeta, mean and standard deviation, or mean and error factor specifications; triangular variables include mode specifications; exponential variables include beta specifications; beta, gamma, gumbel, frechet, and weibull variables include alpha and beta specifications; histogram bin variables include abscissa and either ordinate or count specifications; poisson variables include lambda specifications; binomial and negative binomial variables include probability per trial and number of trials specifications; geometric variables include probability per trial specifications; hypergeometric variables include the specification of the total population, selected population, and number drawn; histogram point variables include abscissa and count specifications; and interval variables include basic probability assignments per interval.

State variables can be continuous or discrete and consist of "other" variables which are to be mapped through the simulation interface. Each state variable specification can have an initial state and descriptors. Continuous and discrete range types include lower and upper bounds, and discrete set types include the admissible set values. State variables provide a convenient mechanism for parameterizing additional model inputs, such as mesh density, simulation convergence tolerances and time step controls, and can be used to enact model adaptivity in future strategy developments.

The ordering of variables is important, and a consistent ordering is employed throughout the DAKOTA software. It is the same ordering as shown in dakota.input.summary and as presented in the outline of this chapter. That ordering can be summarized as continuous followed by discrete integer followed by discrete real within each of the following types: design, aleatory uncertain, epistemic uncertain, and state. Ordering of variable types below this granularity (e.g., from normal to histogram bin within continuous aleatory uncertain) is defined somewhat arbitrarily, but is enforced consistently throughout the code.

Several examples follow. In the first example, two continuous design variables are specified:

variables,
	continuous_design = 2
	  initial_point    0.9    1.1
	  upper_bounds     5.8    2.9
	  lower_bounds     0.5   -2.9
	  descriptors   'radius' 'location'

In the next example, defaults are employed. In this case, initial_point will default to a vector of 0. values, upper_bounds will default to vector values of DBL_MAX (the maximum number representable in double precision for a particular platform, as defined in the platform's float.h C header file), lower_bounds will default to a vector of -DBL_MAX values, and descriptors will default to a vector of 'cdv_i' strings, where i ranges from one to two:

variables,
	continuous_design = 2

In the following example, the syntax for specifying correlated normal and lognormal uncertain variables is shown. One normal and one lognormal uncertain variable are completely specified by their means and standard deviations. In addition, the dependence structure between the two variables is specified using the uncertain_correlation_matrix.

variables,
        normal_uncertain    =  1
          means             =  1.0
          std_deviations    =  1.0
          descriptors       =  'TF1n'
        lognormal_uncertain =  1
          means             =  2.0
          std_deviations    =  0.5
          descriptors       =  'TF2ln'
        uncertain_correlation_matrix =  1.0 0.2
                                        0.2 1.0

An example of the syntax for a state variables specification follows:

variables,
        continuous_state = 1
          initial_state       4.0
          lower_bounds        0.0
          upper_bounds        8.0
          descriptors        'CS1'
        discrete_state_range = 1
          initial_state       104
          lower_bounds        100
          upper_bounds        110
          descriptors        'DS1'

And in a more advanced example, a variables specification containing a set identifier, continuous and discrete design variables, normal and uniform uncertain variables, and continuous and discrete state variables is shown:

variables,
	id_variables = 'V1'
	continuous_design = 2
	  initial_point    0.9    1.1
	  upper_bounds     5.8    2.9
	  lower_bounds     0.5   -2.9
	  descriptors   'radius' 'location'
	discrete_design_range = 1
	  initial_point    2
	  upper_bounds     3
	  lower_bounds     1
	  descriptors   'material'
	normal_uncertain = 2
	  means          =  248.89, 593.33
	  std_deviations =   12.4,   29.7
	  descriptors    =  'TF1n'   'TF2n'
	uniform_uncertain = 2
	  lower_bounds =  199.3,  474.63
	  upper_bounds =  298.5,  712.
	  descriptors  =  'TF1u'   'TF2u'
	continuous_state = 2
	  initial_state = 1.e-4  1.e-6
	  descriptors   = 'EPSIT1' 'EPSIT2'
	discrete_state_set_int = 1
	  initial_state = 100
	  set_values    = 100 212 375
	  descriptors   = 'load_case'

Refer to the DAKOTA Users Manual [Adams et al., 2010] for discussion on how different iterators view these mixed variable sets.

Variables Specification

The variables specification has the following structure:

variables,
	<set identifier>
	<continuous design variables specification>
	<discrete design range variables specification>
	<discrete design set integer variables specification>
	<discrete design set real variables specification>
	<normal uncertain variables specification>
	<lognormal uncertain variables specification>
	<uniform uncertain variables specification>
	<loguniform uncertain variables specification>
	<triangular uncertain variables specification>
	<exponential uncertain variables specification>
	<beta uncertain variables specification>
	<gamma uncertain variables specification>
	<gumbel uncertain variables specification>
	<frechet uncertain variables specification>
	<weibull uncertain variables specification>
	<histogram bin uncertain variables specification>
	<poisson uncertain variables specification>
	<binomial uncertain variables specification>
	<negative binomial uncertain variables specification>
	<geometric uncertain variables specification>
	<hypergeometric uncertain variables specification>
	<histogram point uncertain variables specification>
	<uncertain correlation specification>
	<interval uncertain variables specification>
	<continuous state variables specification>
	<discrete state range variables specification>
	<discrete state set integer variables specification>
	<discrete state set real variables specification>

Referring to dakota.input.summary, it is evident from the enclosing brackets that the set identifier specification, the uncertain correlation specification, and each of the variables specifications are all optional. The set identifier and uncertain correlation are stand-alone optional specifications, whereas the variables specifications are optional group specifications, meaning that the group can either appear or not as a unit. If any part of an optional group is specified, then all required parts of the group must appear.

The optional status of the different variable type specifications allows the user to specify only those variables which are present (rather than explicitly specifying that the number of a particular type of variables is zero). However, at least one type of variables that are active for the iterator in use must have nonzero size or an input error message will result. The following sections describe each of these specification components in additional detail.

Variables Set Identifier

The optional set identifier specification uses the keyword id_variables to input a unique string for use in identifying a particular variables set. A model can then identify the use of this variables set by specifying the same string in its variables_pointer specification (see Model Independent Controls). For example, a model whose specification contains variables_pointer = 'V1' will use a variables specification containing the set identifier id_variables = 'V1'.

If the id_variables specification is omitted, a particular variables set will be used by a model only if that model omits specifying a variables_pointer and if the variables set was the last set parsed (or is the only set parsed). In common practice, if only one variables set exists, then id_variables can be safely omitted from the variables specification and variables_pointer can be omitted from the model specification(s), since there is no potential for ambiguity in this case. Table 7.1 summarizes the set identifier inputs.

Table 7.1 Specification detail for set identifier

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| Variables set identifier | id_variables | string | Optional | use of last variables parsed |

Design Variables

Design variable types include continuous real, discrete range of integer values (contiguous integers), discrete set of integer values, and discrete set of real values. Within each optional design variables specification group, the number of variables is always required. The following Tables 7.2 through 7.5 summarize the required and optional specifications for each design variable subtype. The initial_point specifications provide the point in design space from which an iterator is started and default to either zeros (continuous and discrete range) or middle values (discrete sets). The descriptors specifications supply strings which will be replicated through the DAKOTA output to identify the numerical values for these parameters; these default to numbered strings.

For continuous and discrete range variables, the lower_bounds and upper_bounds restrict the size of the feasible design space and are frequently used to prevent nonphysical designs. Default values are positive and negative machine limits for upper and lower bounds (+/- DBL_MAX, INT_MAX, INT_MIN from the float.h and limits.h system header files). As for linear and nonlinear inequality constraint bounds (see Method Independent Controls and Objective and constraint functions (optimization data set)), a nonexistent upper bound can be specified by using a value greater than the "big bound size" constant (1.e+30 for continuous variables, 1.e+9 for discrete integer variables) and a nonexistent lower bound can be specified by using a value less than the negation of these constants (-1.e+30 for continuous, -1.e+9 for discrete integer), although not all optimizers currently support this feature (e.g., DOT and CONMIN will treat these large bound values as actual variable bounds, but this should not be problematic in practice).

Continuous Design Variables

Table 7.2 Specification detail for continuous design variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| Continuous design variables | continuous_design | integer | Optional group | no continuous design variables |
| Initial point | initial_point | list of reals | Optional | vector values = 0. (repaired to bounds, if required) |
| Lower bounds | lower_bounds | list of reals | Optional | vector values = -DBL_MAX |
| Upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Scaling types | scale_types | list of strings | Optional | vector values = 'none' |
| Scales | scales | list of reals | Optional | vector values = 1. (no scaling) |
| Descriptors | descriptors | list of strings | Optional | vector of 'cdv_i' where i = 1,2,3,... |

For continuous variables, the scale_types specification includes strings specifying the scaling type for each component of the continuous design variables vector in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in scale_types may be selected from 'none', 'value', 'auto', or 'log', to select no, characteristic value, automatic, or logarithmic scaling, respectively. If a single string is specified it will apply to all components of the continuous design variables vector. Each entry in scales may be a user-specified nonzero real characteristic value to be used in scaling each variable component. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto' and 'log'. If a single real value is specified it will apply to all components of the continuous design variables vector.

Discrete Design Range Variables

Table 7.3 Specification detail for discrete design range variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| Discrete design range variables | discrete_design_range | integer | Optional group | no discrete design variables |
| Initial point | initial_point | list of integers | Optional | vector values = 0 (repaired to bounds, if required) |
| Lower bounds | lower_bounds | list of integers | Optional | vector values = INT_MIN |
| Upper bounds | upper_bounds | list of integers | Optional | vector values = INT_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'ddriv_i' where i = 1,2,3,... |

Discrete Design Integer Set Variables

Discrete set variables are specified with an integer list specifying how many set members there are for each variable and a list of integer or real set values for discrete_design_set_integer (Table 7.4) and discrete_design_set_real (Table 7.5), respectively.

Table 7.4 Specification detail for discrete design set of integer variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| Discrete design set of integer variables | discrete_design_set_integer | integer | Optional group | no discrete design set of integer variables |
| Initial point | initial_point | list of integers | Optional | middle set values (mean indices, rounded down) |
| Number of values for each variable | num_set_values | list of integers | Optional | equal distribution |
| Set values | set_values | list of integers | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'ddsiv_i' where i = 1,2,3,... |

Discrete Design Real Set Variables

Table 7.5 Specification detail for discrete design set of real variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| Discrete design set of real variables | discrete_design_set_real | integer | Optional group | no discrete design set of real variables |
| Initial point | initial_point | list of reals | Optional | middle set values (mean indices, rounded down) |
| Number of values for each variable | num_set_values | list of integers | Optional | equal distribution |
| Set values | set_values | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'ddsrv_i' where i = 1,2,3,... |

Aleatory Uncertain Variables

Aleatory uncertain variables involve continuous or discrete probability distribution specifications. Continuous probability distributions include normal, lognormal, uniform, loguniform, triangular, exponential, beta, gamma, gumbel, frechet, weibull, and histogram bin distributions. Discrete probability distributions include poisson, binomial, negative binomial, geometric, hypergeometric, and histogram point distributions. Each of these specifications is an optional group specification.

These specifications of probability distributions directly support the use of probabilistic uncertainty quantification methods such as sampling, reliability, and stochastic expansion methods. However, the inclusion of lower and upper distribution bounds for all uncertain variable types (either explicitly defined, implicitly defined, or inferred; see Variables Description) also allows the use of these variables within methods that rely on a bounded region to define a set of function evaluations (i.e., design of experiments and some parameter study methods). Each distribution also provides optional uncertain variable descriptors (default values are numbered strings) that supply identifiers that help associate the numerical values with the uncertain parameters as they appear within the DAKOTA output. Tables 7.6 through 7.23 summarize the details of the aleatory uncertain variable specifications.

Normal Distribution

Within the normal uncertain optional group specification, the number of normal uncertain variables, the means, and standard deviations are required specifications, and the distribution lower and upper bounds and variable descriptors are optional specifications. The normal distribution is widely used to model uncertain variables such as population characteristics. It is also used to model the mean of a sample: as the sample size becomes very large, the Central Limit Theorem states that the mean becomes approximately normal, regardless of the distribution of the original variables.

The density function for the normal distribution is:

\[f(x) = \frac{1}{\sqrt{2\pi}\sigma_N} e^{-\frac{1}{2}\left(\frac{x-\mu_N}{\sigma_N}\right)^2}\]

where $\mu_N$ and $\sigma_N$ are the mean and standard deviation of the normal distribution, respectively.

Note that if you specify bounds for a normal distribution, the sampling occurs from the underlying distribution with the given mean and standard deviation, but samples are not taken outside the bounds (see "bounded normal" distribution type in [Wyss and Jorgensen, 1998]). This can result in the mean and the standard deviation of the sample data being different from the mean and standard deviation of the underlying distribution. For example, if you are sampling from a normal distribution with a mean of 5 and a standard deviation of 3, but you specify bounds of 1 and 7, the resulting mean of the samples will be around 4.3 and the resulting standard deviation will be around 1.6. This is because you have bounded the original distribution significantly, and asymmetrically, since 7 is closer to the original mean than 1.
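The figures quoted in this example can be checked with the standard closed-form moments of a truncated normal distribution. The sketch below assumes that the bounded normal behaves as a plain truncation of the underlying normal (values outside the bounds are simply never drawn); the function name `bounded_normal_moments` is hypothetical and is not a DAKOTA routine.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bounded_normal_moments(mean, std_dev, lower, upper):
    """Mean and standard deviation of a normal truncated to [lower, upper]."""
    a = (lower - mean) / std_dev
    b = (upper - mean) / std_dev
    mass = _cdf(b) - _cdf(a)           # probability retained inside the bounds
    shift = (_pdf(a) - _pdf(b)) / mass
    trunc_mean = mean + std_dev * shift
    trunc_var = std_dev ** 2 * (
        1.0 + (a * _pdf(a) - b * _pdf(b)) / mass - shift ** 2
    )
    return trunc_mean, math.sqrt(trunc_var)


# Mean 5, standard deviation 3, bounds of 1 and 7, as in the text.
m, s = bounded_normal_moments(5.0, 3.0, 1.0, 7.0)
print(f"{m:.1f} {s:.1f}")  # 4.3 1.6
```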

Table 7.6 Specification detail for normal uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| normal uncertain variables | normal_uncertain | integer | Optional group | no normal uncertain variables |
| normal uncertain means | means | list of reals | Required | N/A |
| normal uncertain standard deviations | std_deviations | list of reals | Required | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Optional | vector values = -DBL_MAX |
| Distribution upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'nuv_i' where i = 1,2,3,... |

Lognormal Distribution

If the logarithm of an uncertain variable X has a normal distribution, that is $\log X \sim N(\lambda,\zeta)$, where $\lambda$ and $\zeta$ are the mean and standard deviation of the underlying normal distribution, then X is distributed with a lognormal distribution. The lognormal is often used to model the time to perform some task. It can also be used to model variables which are the product of a large number of other quantities, by the Central Limit Theorem. Finally, the lognormal is used to model quantities which cannot have negative values. Within the lognormal uncertain optional group specification, the number of lognormal uncertain variables and one of three parameter pairs (means and standard deviations, means and error factors, or lambdas and zetas) must be specified, and the distribution lower and upper bounds and variable descriptors are optional specifications. These distribution bounds can be used to truncate the tails of lognormal distributions, which, as for the bounded normal, can result in the mean and the standard deviation of the sample data being different from the mean and standard deviation of the underlying distribution (see "bounded lognormal" and "bounded lognormal-n" distribution types in [Wyss and Jorgensen, 1998]).

For the lognormal variables, one may specify either the mean $\mu$ and standard deviation $\sigma$ of the actual lognormal distribution, the mean $\mu$ and error factor $\epsilon$ of the actual lognormal distribution, or the mean $\lambda$ ("lambda") and standard deviation $\zeta$ ("zeta") of the underlying normal distribution. The conversion equations from lognormal mean $\mu$ and either lognormal error factor $\epsilon$ or lognormal standard deviation $\sigma$ to the mean $\lambda$ and standard deviation $\zeta$ of the underlying normal distribution are as follows:

\[\zeta = \frac{\ln(\epsilon)}{1.645}\]

\[\zeta^2 = \ln\left(\frac{\sigma^2}{\mu^2} + 1\right)\]

\[\lambda = \ln(\mu) - \frac{\zeta^2}{2}\]

Conversions from $\lambda$ and $\zeta$ back to $\mu$ and $\epsilon$ or $\sigma$ are as follows:

\[\mu = e^{\lambda + \frac{\zeta^2}{2}}\]

\[\sigma^2 = e^{2\lambda + \zeta^2}(e^{\zeta^2} - 1)\]

\[\epsilon = e^{1.645\zeta}\]
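The conversions above can be exercised with a short round trip. The following Python sketch transcribes the six equations directly; the function names `lognormal_to_normal` and `normal_to_lognormal` are hypothetical helpers chosen for illustration and do not correspond to DAKOTA keywords or routines.

```python
import math


def lognormal_to_normal(mean, std_dev=None, error_factor=None):
    """Return (lambda, zeta) of the underlying normal distribution.

    Exactly one of std_dev or error_factor must be given.
    """
    if (std_dev is None) == (error_factor is None):
        raise ValueError("give exactly one of std_dev or error_factor")
    if std_dev is not None:
        zeta = math.sqrt(math.log(std_dev ** 2 / mean ** 2 + 1.0))
    else:
        zeta = math.log(error_factor) / 1.645
    lam = math.log(mean) - 0.5 * zeta ** 2
    return lam, zeta


def normal_to_lognormal(lam, zeta):
    """Return (mean, std_dev, error_factor) of the lognormal distribution."""
    mean = math.exp(lam + 0.5 * zeta ** 2)
    variance = math.exp(2.0 * lam + zeta ** 2) * (math.exp(zeta ** 2) - 1.0)
    error_factor = math.exp(1.645 * zeta)
    return mean, math.sqrt(variance), error_factor


# Round trip for a lognormal with mean 2.0 and standard deviation 0.5.
lam, zeta = lognormal_to_normal(2.0, std_dev=0.5)
mean, std_dev, err = normal_to_lognormal(lam, zeta)
print(round(mean, 6), round(std_dev, 6))  # 2.0 0.5
```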

The density function for the lognormal distribution is:

\[f(x) = \frac{1}{\sqrt{2\pi}\zeta x} e^{-\frac{1}{2}\left(\frac{\ln x-\lambda}{\zeta}\right)^2}\]

Table 7.7 Specification detail for lognormal uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| lognormal uncertain variables | lognormal_uncertain | integer | Optional group | no lognormal uncertain variables |
| lognormal uncertain means | means | list of reals | Required (1 of 3 selections) | N/A |
| lognormal uncertain standard deviations | std_deviations | list of reals | Required (1 of 3 selections) | N/A |
| lognormal uncertain error factors | error_factors | list of reals | Required (1 of 3 selections) | N/A |
| lognormal uncertain lambdas | lambdas | list of reals | Required (1 of 3 selections) | N/A |
| lognormal uncertain zetas | zetas | list of reals | Required (1 of 3 selections) | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Optional | vector values = 0. |
| Distribution upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'lnuv_i' where i = 1,2,3,... |

Uniform Distribution

Within the uniform uncertain optional group specification, the number of uniform uncertain variables and the distribution lower and upper bounds are required specifications, and variable descriptors is an optional specification. The uniform distribution has the density function:

\[f(x) = \frac{1}{U_U-L_U}\]

where $U_U$ and $L_U$ are the upper and lower bounds of the uniform distribution, respectively. The mean of the uniform distribution is $\frac{U_U+L_U}{2}$ and the variance is $\frac{(U_U-L_U)^2}{12}$. Note that this distribution is a special case of the more general beta distribution.

Table 7.8 Specification detail for uniform uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| uniform uncertain variables | uniform_uncertain | integer | Optional group | no uniform uncertain variables |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'uuv_i' where i = 1,2,3,... |

Loguniform Distribution

If the logarithm of an uncertain variable X has a uniform distribution, that is $\log X \sim U(\ln L_{LU}, \ln U_{LU})$, then X is distributed with a loguniform distribution with lower bound $L_{LU}$ and upper bound $U_{LU}$. Within the loguniform uncertain optional group specification, the number of loguniform uncertain variables and the distribution lower and upper bounds are required specifications, and variable descriptors is an optional specification. The loguniform distribution has the density function:

\[f(x) = \frac{1}{x(\ln U_{LU} - \ln L_{LU})}\]
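As a quick check of this density, the Python sketch below evaluates it and pairs it with the corresponding inverse cumulative distribution function, which follows from integrating the density. It assumes that the bounds are bounds on X itself, as the density implies; the function names are hypothetical and chosen only for illustration.

```python
import math


def loguniform_pdf(x, lower, upper):
    """Loguniform density on [lower, upper], zero elsewhere."""
    if x < lower or x > upper:
        return 0.0
    return 1.0 / (x * (math.log(upper) - math.log(lower)))


def loguniform_inverse_cdf(u, lower, upper):
    """Map u in [0, 1] to x such that ln x is uniform on [ln lower, ln upper]."""
    return lower * (upper / lower) ** u


lower, upper = 1.0, 100.0

# The median sits at the geometric mean of the bounds.
print(loguniform_inverse_cdf(0.5, lower, upper))  # 10.0

# Midpoint-rule integration of the density should give approximately 1.
n = 100000
h = (upper - lower) / n
total = sum(loguniform_pdf(lower + (i + 0.5) * h, lower, upper) for i in range(n)) * h
print(abs(total - 1.0) < 1.0e-6)  # True
```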

Table 7.9 Specification detail for loguniform uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| loguniform uncertain variables | loguniform_uncertain | integer | Optional group | no loguniform uncertain variables |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'luuv_i' where i = 1,2,3,... |

Triangular Distribution

The triangular distribution is often used when one does not have much data or information, but does have an estimate of the most likely value and the lower and upper bounds. Within the triangular uncertain optional group specification, the number of triangular uncertain variables, the modes, and the distribution lower and upper bounds are required specifications, and variable descriptors is an optional specification.

The density function for the triangular distribution is:

\[f(x) = \frac{2(x-L_T)}{(U_T-L_T)(M_T-L_T)}\]

if $L_T\leq x \leq M_T$, and

\[f(x) = \frac{2(U_T-x)}{(U_T-L_T)(U_T-M_T)}\]

if $M_T\leq x \leq U_T$, and 0 elsewhere. In these equations, $L_T$ is the lower bound, $U_T$ is the upper bound, and $M_T$ is the mode of the triangular distribution.
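The piecewise density can be written out as a small Python function. This is a minimal sketch of the formula above, with a hypothetical function name and made-up example bounds; it also confirms numerically that the density integrates to one and peaks at $2/(U_T-L_T)$.

```python
def triangular_pdf(x, lower, mode, upper):
    """Triangular density with the given lower bound, mode, and upper bound."""
    if x < lower or x > upper:
        return 0.0
    if x <= mode:
        if mode == lower:
            # Degenerate left edge: the density starts at its peak.
            return 2.0 / (upper - lower)
        return 2.0 * (x - lower) / ((upper - lower) * (mode - lower))
    return 2.0 * (upper - x) / ((upper - lower) * (upper - mode))


lower, mode, upper = 0.0, 1.0, 4.0

# The peak value at the mode is 2 / (upper - lower).
print(triangular_pdf(mode, lower, mode, upper))  # 0.5

# Midpoint-rule integration over the support should give approximately 1.
n = 4000
h = (upper - lower) / n
area = sum(triangular_pdf(lower + (i + 0.5) * h, lower, mode, upper) for i in range(n)) * h
print(abs(area - 1.0) < 1.0e-9)  # True
```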

Table 7.10 Specification detail for triangular uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| triangular uncertain variables | triangular_uncertain | integer | Optional group | no triangular uncertain variables |
| triangular uncertain modes | modes | list of reals | Required | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'tuv_i' where i = 1,2,3,... |

Exponential Distribution

The exponential distribution is often used for modeling failure rates. Within the exponential uncertain optional group specification, the number of exponential uncertain variables and the beta parameters are required specifications, and variable descriptors is an optional specification.

The density function for the exponential distribution is given by:

\[f(x) = \frac{1}{\beta} e^{\frac{-x}{\beta}}\]

where $\mu_{E} = \beta$ and $\sigma^2_{E} = \beta^2$. Note that this distribution is a special case of the more general gamma distribution.

Table 7.11 Specification detail for exponential uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| exponential uncertain variables | exponential_uncertain | integer | Optional group | no exponential uncertain variables |
| exponential uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'euv_i' where i = 1,2,3,... |

Beta Distribution

Within the beta uncertain optional group specification, the number of beta uncertain variables, the alpha and beta parameters, and the distribution upper and lower bounds are required specifications, and the variable descriptors is an optional specification. The beta distribution can be helpful when the actual distribution of an uncertain variable is unknown, but the user has a good idea of the bounds, the mean, and the standard deviation of the uncertain variable. The density function for the beta distribution is

\[f(x)= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\frac{(x-L_B)^{\alpha-1}(U_B-x)^{\beta-1}}{(U_B-L_B)^{\alpha+\beta-1}}\]

where $\Gamma(\alpha)$ is the gamma function. The leading ratio of gamma functions is the reciprocal of the beta function, $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$. To calculate the mean and standard deviation from the alpha, beta, upper bound, and lower bound parameters of the beta distribution, the following expressions may be used.

\[\mu_B = L_B+\frac{\alpha}{\alpha+\beta}(U_B-L_B)\]

\[\sigma_B^2 =\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}(U_B-L_B)^2\]

Solving these for $\alpha$ and $\beta$ gives:

\[\alpha = (\mu_B-L_B)\frac{(\mu_B-L_B)(U_B-\mu_B)-\sigma_B^2}{\sigma_B^2(U_B-L_B)}\]

\[\beta = (U_B-\mu_B)\frac{(\mu_B-L_B)(U_B-\mu_B)-\sigma_B^2}{\sigma_B^2(U_B-L_B)}\]

Note that the uniform distribution is a special case of this distribution for parameters $\alpha = \beta = 1$.
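When only the bounds, mean, and standard deviation are known, the expressions above give the alpha and beta parameters directly. The Python sketch below transcribes both directions and checks them with a round trip; the function names and the example values are hypothetical and are not taken from DAKOTA.

```python
def beta_moments(alpha, beta, lower, upper):
    """Return (mean, variance) of a beta distribution on [lower, upper]."""
    span = upper - lower
    mean = lower + alpha / (alpha + beta) * span
    variance = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0)) * span ** 2
    return mean, variance


def beta_parameters(mean, variance, lower, upper):
    """Return (alpha, beta) matching the given mean, variance, and bounds."""
    common = ((mean - lower) * (upper - mean) - variance) / (variance * (upper - lower))
    if common <= 0.0:
        raise ValueError("variance is too large for these bounds and mean")
    return (mean - lower) * common, (upper - mean) * common


# Round trip: alpha = 2, beta = 5 on the interval [0, 10].
mean, variance = beta_moments(2.0, 5.0, 0.0, 10.0)
alpha, beta = beta_parameters(mean, variance, 0.0, 10.0)
print(round(alpha, 6), round(beta, 6))  # 2.0 5.0

# With alpha = beta = 1 the variance equals the uniform value (U - L)^2 / 12.
print(abs(beta_moments(1.0, 1.0, 0.0, 10.0)[1] - 100.0 / 12.0) < 1.0e-12)  # True
```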

Table 7.12 Specification detail for beta uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| beta uncertain variables | beta_uncertain | integer | Optional group | no beta uncertain variables |
| beta uncertain alphas | alphas | list of reals | Required | N/A |
| beta uncertain betas | betas | list of reals | Required | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'buv_i' where i = 1,2,3,... |

Gamma Distribution

The gamma distribution is sometimes used to model time to complete a task, such as a repair or service task. It is a very flexible distribution. Within the gamma uncertain optional group specification, the number of gamma uncertain variables and the alpha and beta parameters are required specifications, and variable descriptors is an optional specification.

The density function for the gamma distribution is given by:

\[f(x) = \frac{{x}^{\alpha-1}{e}^{\frac{-x}{\beta}}}{\beta^{\alpha}\Gamma(\alpha)}\]

where $\mu_{GA} = \alpha\beta$ and $\sigma^2_{GA} = \alpha\beta^2$. Note that the exponential distribution is a special case of this distribution for parameter $\alpha = 1$.
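Inverting the two moment expressions gives $\alpha = \mu_{GA}^2/\sigma_{GA}^2$ and $\beta = \sigma_{GA}^2/\mu_{GA}$, which is convenient when only a mean and variance are available. The Python sketch below illustrates this and checks the exponential special case; the function names are hypothetical, and the density is simply treated as zero for non-positive x to keep the example short.

```python
import math


def gamma_pdf(x, alpha, beta):
    """Gamma density; non-positive x is treated as zero density in this sketch."""
    if x <= 0.0:
        return 0.0
    return x ** (alpha - 1.0) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))


def gamma_moments(alpha, beta):
    """Return (mean, variance) for the given alpha and beta."""
    return alpha * beta, alpha * beta ** 2


def gamma_parameters(mean, variance):
    """Return (alpha, beta) matching the given mean and variance."""
    return mean ** 2 / variance, variance / mean


# Round trip through the moment expressions.
print(gamma_parameters(*gamma_moments(3.0, 2.0)))  # (3.0, 2.0)

# With alpha = 1 the density reduces to the exponential density with the same beta.
x, beta = 1.5, 2.0
print(abs(gamma_pdf(x, 1.0, beta) - math.exp(-x / beta) / beta) < 1.0e-15)  # True
```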

Table 7.13 Specification detail for gamma uncertain variables

| Description | Keyword | Associated Data | Status | Default |
|---|---|---|---|---|
| gamma uncertain variables | gamma_uncertain | integer | Optional group | no gamma uncertain variables |
| gamma uncertain alphas | alphas | list of reals | Required | N/A |
| gamma uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'gauv_i' where i = 1,2,3,... |

Gumbel Distribution

Within the gumbel optional uncertain group specification, the number of gumbel uncertain variables, and the alpha and beta parameters are required specifications. The Gumbel distribution is also referred to as the Type I Largest Extreme Value distribution. The distribution of maxima in sample sets from a population with a normal distribution will asymptotically converge to this distribution. It is commonly used to model demand variables such as wind loads and flood levels.

The density function for the Gumbel distribution is given by:

\[f(x) = \alpha e^{-\alpha(x-\beta)} \exp\left(-e^{-\alpha(x-\beta)}\right)\]

where $\mu_{GU} = \beta + \frac{0.5772}{\alpha}$ and $\sigma_{GU} = \frac{\pi}{\sqrt{6}\alpha}$.
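Because both moment relations are explicit, they can be inverted to obtain alpha and beta from a target mean and standard deviation. The sketch below assumes the parameterization given above; the helper names are hypothetical and not part of DAKOTA.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (0.5772 in the text)


def gumbel_moments(alpha, beta):
    """Return (mean, std) of the Gumbel distribution."""
    mean = beta + EULER_GAMMA / alpha
    std = math.pi / (math.sqrt(6.0) * alpha)
    return mean, std


def gumbel_params_from_moments(mean, std):
    """Invert the moment relations to recover (alpha, beta)."""
    alpha = math.pi / (math.sqrt(6.0) * std)
    beta = mean - EULER_GAMMA / alpha
    return alpha, beta


# Illustrative target moments
alpha, beta = gumbel_params_from_moments(10.0, 2.0)
print(alpha, beta)
print(gumbel_moments(alpha, beta))
```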

Table 7.14 Specification detail for gumbel uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| gumbel uncertain variables | gumbel_uncertain | integer | Optional group | no gumbel uncertain variables |
| gumbel uncertain alphas | alphas | list of reals | Required | N/A |
| gumbel uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'guuv_i' where i = 1,2,3,... |

Frechet Distribution

Within the frechet uncertain optional group specification, the number of frechet uncertain variables and the alpha and beta parameters are required specifications. The Frechet distribution is also referred to as the Type II Largest Extreme Value distribution. The distribution of maxima in sample sets from a population with a lognormal distribution will asymptotically converge to this distribution. It is commonly used to model non-negative demand variables.

The density function for the Frechet distribution is given by:

\[f(x) = \frac{\alpha}{\beta}(\frac{\beta}{x})^{\alpha+1}e^{-(\frac{\beta}{x})^\alpha}\]

where $\mu_F = \beta\Gamma(1-\frac{1}{\alpha})$ and $\sigma_F^2 = \beta^2[\Gamma(1-\frac{2}{\alpha})-\Gamma^2(1-\frac{1}{\alpha})]$.
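The gamma-function arguments in these expressions must be positive, so the mean is finite only for $\alpha > 1$ and the variance only for $\alpha > 2$. A minimal sketch that evaluates the moments under that restriction follows; the function name is hypothetical and not part of DAKOTA.

```python
import math


def frechet_moments(alpha, beta):
    """Return (mean, variance) of the Frechet distribution; needs alpha > 2."""
    if alpha <= 2:
        raise ValueError("the variance is finite only for alpha > 2")
    g1 = math.gamma(1.0 - 1.0 / alpha)
    g2 = math.gamma(1.0 - 2.0 / alpha)
    return beta * g1, beta ** 2 * (g2 - g1 ** 2)


# Illustrative parameters
mean, var = frechet_moments(4.0, 1.0)
print(mean, var)
```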

Table 7.15 Specification detail for frechet uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| frechet uncertain variables | frechet_uncertain | integer | Optional group | no frechet uncertain variables |
| frechet uncertain alphas | alphas | list of reals | Required | N/A |
| frechet uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'fuv_i' where i = 1,2,3,... |

Weibull Distribution

The Weibull distribution is commonly used in reliability studies to predict the lifetime of a device. Within the weibull uncertain optional group specification, the number of weibull uncertain variables and the alpha and beta parameters are required specifications. The Weibull distribution is also referred to as the Type III Smallest Extreme Value distribution. It is also used to model capacity variables such as material strength.

The density function for the weibull distribution is given by:

\[f(x) = \frac{\alpha}{\beta} \left(\frac{x}{\beta}\right)^{\alpha-1} e^{-\left(\frac{x}{\beta}\right)^{\alpha}}\]

where $\mu_W = \beta \Gamma(1+\frac{1}{\alpha})$ and $\sigma_W = \sqrt{\frac{\Gamma(1+\frac{2}{\alpha})}{\Gamma^2(1+\frac{1}{\alpha})} - 1} \mu_W$.
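These moment expressions are straightforward to evaluate with the standard-library gamma function. The sketch below uses a hypothetical helper name that is not part of DAKOTA; with $\alpha = 1$ the result matches an exponential distribution, whose mean and standard deviation both equal $\beta$.

```python
import math


def weibull_moments(alpha, beta):
    """Return (mean, std) of the Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / alpha)
    g2 = math.gamma(1.0 + 2.0 / alpha)
    mean = beta * g1
    std = math.sqrt(g2 / g1 ** 2 - 1.0) * mean
    return mean, std


print(weibull_moments(1.0, 3.0))  # exponential case: mean and std both equal beta
print(weibull_moments(2.0, 1.0))
```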

Table 7.16 Specification detail for weibull uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| weibull uncertain variables | weibull_uncertain | integer | Optional group | no weibull uncertain variables |
| weibull uncertain alphas | alphas | list of reals | Required | N/A |
| weibull uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'wuv_i' where i = 1,2,3,... |

Histogram Bin Distribution

Histogram uncertain variables are typically used to model a set of empirical data. A bin histogram is a continuous aleatory distribution that allows the user to specify bins of non-zero width (where the uncertain variable may lie) along with the relative frequencies that are associated with each bin.

Within the histogram bin uncertain optional group specification, the number of histogram bin uncertain variables is a required specification, the number of pairs is an optional key for apportionment of abscissas/ordinates/counts, specification of abscissas and either ordinates or counts is required, and the variable descriptors are an optional specification. When using a histogram bin variable, one must define at least one bin (with two bounding value pairs).

The abscissas specification defines abscissa values ("x" coordinates) for the PDF of each histogram variable. When paired with counts, the specifications provide sets of (x,c) pairs for each histogram variable, where c defines a count (i.e., a frequency or relative probability) associated with a bin. If using bins of unequal width and specification of probability densities is more natural, then the counts specification can be replaced with an ordinates specification ("y" coordinates) in order to support interpretation of the input as (x,y) pairs defining the profile of a "skyline" PDF. Conversion between the two specifications is straightforward: a count/frequency is a cumulative probability quantity defined from the product of the ordinate density value and the x bin width. Thus, in the case of bins of equal width, ordinate and count specifications are equivalent. In addition, ordinates and counts may be relative values; it is not necessary to scale them as all user inputs will be normalized.

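To fully specify a bin-based histogram with n bins (potentially of unequal width), n+1 (x,c) or (x,y) pairs must be specified. As the example below illustrates, each x value marks the left boundary of a bin and carries the count (or ordinate) for that bin, and the final pair marks the right end of the last bin and carries a count (or ordinate) of zero.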

The num_pairs specification provides for the proper association of multiple sets of (x,c) or (x,y) pairs with individual histogram variables. For example, in the following specification

histogram_bin_uncertain = 2
  num_pairs =        3           4
  abscissas =  5  8 10 .1 .2 .3 .4
  counts    = 17 21  0 12 24 12  0

num_pairs associates the first 3 (x,c) pairs from abscissas and counts ((5,17),(8,21),(10,0)) with one bin-based histogram variable, where one bin is defined between 5 and 8 with a count of 17 and another bin is defined between 8 and 10 with a count of 21. The following set of 4 (x,c) pairs ((.1,12),(.2,24),(.3,12),(.4,0)) defines a second bin-based histogram variable containing three equal-width bins with counts 12, 24, and 12 (middle bin is twice as probable as the other two).
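The apportionment and normalization described above can be sketched as follows. This is a hypothetical helper, not DAKOTA's own implementation: it splits the flat lists by num_pairs, normalizes the counts within each variable, and divides each bin probability by the bin width to obtain a density.

```python
def split_bin_histograms(num_pairs, abscissas, counts):
    """Split flat lists into per-variable bins.

    Returns one list per variable of (left, right, probability, density) tuples.
    """
    if sum(num_pairs) != len(abscissas) or len(abscissas) != len(counts):
        raise ValueError("num_pairs must account for every abscissa/count pair")
    variables, start = [], 0
    for n in num_pairs:
        xs = abscissas[start:start + n]
        cs = counts[start:start + n]
        start += n
        total = sum(cs[:-1])  # the final pair only closes the last bin
        bins = []
        for left, right, c in zip(xs[:-1], xs[1:], cs[:-1]):
            prob = c / total
            bins.append((left, right, prob, prob / (right - left)))
        variables.append(bins)
    return variables


# The two-variable specification from the example above
hists = split_bin_histograms([3, 4],
                             [5, 8, 10, .1, .2, .3, .4],
                             [17, 21, 0, 12, 24, 12, 0])
for bins in hists:
    print(bins)
```

For the second variable the three bin probabilities come out as 0.25, 0.5, and 0.25, matching the statement that the middle bin is twice as probable as the other two.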

Table 7.17 Specification detail for histogram bin uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| histogram bin uncertain variables | histogram_bin_uncertain | integer | Optional group | no histogram bin uncertain variables |
| key to apportionment among bin-based histogram variables | num_pairs | list of integers | Optional | equal distribution |
| sets of abscissas for bin-based histogram variables | abscissas | list of reals | Required | N/A |
| sets of ordinates for bin-based histogram variables | ordinates | list of reals | Required (1 of 2 selections) | N/A |
| sets of counts for bin-based histogram variables | counts | list of reals | Required (1 of 2 selections) | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'hubv_i' where i = 1,2,3,... |

Poisson Distribution

The Poisson distribution is used to predict the number of discrete events that happen in a given time interval. The expected number of occurrences in the time interval is $\lambda$, which must be a positive real number. For example, if events occur on average 4 times per year and we are interested in the distribution of events over six months, $\lambda$ would be 2. However, if we were interested in the distribution of events occurring over 5 years, $\lambda$ would be 20.

The density function for the Poisson distribution is given by:

\[f(x) = \frac{\lambda^x e^{-\lambda}}{x!}\]

where $\lambda$ is the frequency of events happening, and x is the number of events that occur. The poisson distribution returns samples representing number of occurrences in the time period of interest.
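A short sketch makes the role of $\lambda$ concrete. It evaluates the standard Poisson mass function $\lambda^x e^{-\lambda}/x!$ for the six-month case discussed above ($\lambda = 2$); the function name is hypothetical and not part of DAKOTA.

```python
import math


def poisson_pmf(x, lam):
    """Probability of observing exactly x events when lam are expected."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


lam = 4 * 0.5  # 4 events per year on average, observed over half a year
for x in range(5):
    print(x, poisson_pmf(x, lam))
```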

Table 7.18 Specification detail for poisson uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| poisson uncertain variables | poisson_uncertain | integer | Optional group | no poisson uncertain variables |
| poisson uncertain lambdas | lambdas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'puv_i' where i = 1,2,3,... |

Binomial Distribution

The binomial distribution is typically used to predict the number of failures (or defective items or some type of event) in a total of n independent tests or trials, where each trial has the probability p of failing or being defective. Each particular test can be considered as a Bernoulli trial.

The density function for the binomial distribution is given by:

\[f(x) = \left(\begin{array}{c}n\\x\end{array}\right){p^x}{(1-p)^{(n-x)}}\]

where p is the probability of failure per trial and n is the number of trials.
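The mass function can be evaluated directly with `math.comb` (Python 3.8 or later). The sketch below uses a hypothetical function name, not a DAKOTA routine, and illustrative values for n and p.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x failures in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.1  # illustrative: 10 trials, 10% failure probability per trial
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(probs[0], probs[1])
print(sum(x * q for x, q in enumerate(probs)))  # mean, approximately n * p
```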

Table 7.19 Specification detail for binomial uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| binomial uncertain variables | binomial_uncertain | integer | Optional group | no binomial uncertain variables |
| binomial uncertain prob_per_trial | prob_per_trial | list of reals | Required | N/A |
| binomial uncertain num_trials | num_trials | list of integers | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'biuv_i' where i = 1,2,3,... |

Negative Binomial Distribution

The negative binomial distribution is typically used to predict the number of times to perform a test to have a total of n successes, where each test has a probability p of success.

The density function for the negative binomial distribution is given by:

\[f(x) = \left(\begin{array}{c}{n+x-1}\\{x}\end{array}\right){p^n}{(1-p)^{x}}\]

where p is the probability of success per trial, n is the number of successful trials, and x is the number of failed trials observed before the n-th success.
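In the form written above, the argument x of the density counts the failed trials that precede the n-th success, so its mean is $n(1-p)/p$. A minimal sketch with a hypothetical function name (not part of DAKOTA):

```python
import math


def negative_binomial_pmf(x, n, p):
    """Probability of x failures before the n-th success."""
    return math.comb(n + x - 1, x) * p ** n * (1 - p) ** x


n, p = 3, 0.5  # illustrative: wait for 3 successes, each with probability 0.5
probs = [negative_binomial_pmf(x, n, p) for x in range(200)]
print(probs[0])  # 0.125, i.e. three successes in a row
print(sum(x * q for x, q in enumerate(probs)))  # approximately n * (1 - p) / p
```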

Table 7.20 Specification detail for negative binomial uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| negative binomial uncertain variables | negative_binomial_uncertain | integer | Optional group | no negative binomial uncertain variables |
| negative binomial uncertain success prob_per_trial | prob_per_trial | list of reals | Required | N/A |
| negative binomial uncertain success num_trials | num_trials | list of integers | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'nbuv_i' where i = 1,2,3,... |

Geometric Distribution

The geometric distribution represents the number of successful trials that might occur before a failure is observed.

The density function for the geometric distribution is given by:

\[f(x) = {p}{(1-p)^{x}}\]

where p is the probability of failure per trial.
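For this density the mean number of successful trials before the first failure is $(1-p)/p$. The sketch below, with hypothetical function names that are not part of DAKOTA, evaluates the mass function and compares a truncated sum against that closed form.

```python
def geometric_pmf(x, p):
    """Probability of x successful trials before the first failure."""
    return p * (1 - p) ** x


def geometric_mean(p):
    """Mean number of successful trials before the first failure."""
    return (1 - p) / p


p = 0.2  # illustrative failure probability per trial
print(geometric_pmf(0, p), geometric_pmf(3, p))
print(geometric_mean(p), sum(x * geometric_pmf(x, p) for x in range(500)))
```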

Table 7.21 Specification detail for geometric uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| geometric uncertain variables | geometric_uncertain | integer | Optional group | no geometric uncertain variables |
| geometric uncertain prob_per_trial | prob_per_trial | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'geuv_i' where i = 1,2,3,... |

Hypergeometric Distribution

The hypergeometric distribution is used to define the number of failures (or the number of successes; the number of some type of event) in a set of tests that has a known proportion of failures. The hypergeometric is often described using an urn model. For example, say we have a total population containing N balls, and we know that m of the balls are white and the remaining balls are green. If we draw n balls from the urn without replacement, the hypergeometric distribution describes the distribution of the number of white balls drawn from the urn.

The density function for the hypergeometric distribution is given by:

\[f(x) = \frac{\left(\begin{array}{c}m\\x\end{array}\right)\left(\begin{array}{c}{N-m}\\{n-x}\end{array}\right)}{\left(\begin{array}{c}N\\n\end{array}\right)}\]

where N is the total population, m is the number of items in the selected population (e.g. the number of white balls in the full urn of N items), and n is the number of balls drawn.
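The urn model translates directly into code. In the sketch below the function name is hypothetical (not a DAKOTA routine), the argument names mirror the keywords total_population, selected_population, and num_drawn, and the numeric values are illustrative.

```python
import math


def hypergeometric_pmf(x, total_population, selected_population, num_drawn):
    """Probability of drawing x white balls without replacement."""
    N, m, n = total_population, selected_population, num_drawn
    # x cannot exceed the draws or the white balls, nor leave more
    # non-white draws than there are non-white balls in the urn
    if x < max(0, n - (N - m)) or x > min(n, m):
        return 0.0
    return math.comb(m, x) * math.comb(N - m, n - x) / math.comb(N, n)


N, m, n = 20, 7, 5  # 20 balls, 7 white, 5 drawn
probs = [hypergeometric_pmf(x, N, m, n) for x in range(n + 1)]
print(probs)
print(sum(x * q for x, q in enumerate(probs)))  # mean, approximately n * m / N
```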

Table 7.22 Specification detail for hypergeometric uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| hypergeometric uncertain variables | hypergeometric_uncertain | integer | Optional group | no hypergeometric uncertain variables |
| hypergeometric uncertain total_population | total_population | list of integers | Required | N/A |
| hypergeometric uncertain selected_population | selected_population | list of integers | Required | N/A |
| hypergeometric uncertain num_drawn | num_drawn | list of integers | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'hguv_i' where i = 1,2,3,... |

Histogram Point Distribution

As mentioned above, histogram uncertain variables are typically used to model a set of empirical data. A point histogram is a discrete aleatory distribution that allows the user to specify a set of real-valued points and associated frequency values.

Point histograms are similar to Discrete Design Real Set Variables and Discrete State Real Set Variables, but differ in the inclusion of information on the relative probabilities of observing the different values within the set.

Within the histogram point uncertain optional group specification, the number of histogram point uncertain variables is a required specification, the number of pairs is an optional key for apportionment of abscissas and counts, the sets of abscissas and counts are required, and the variable descriptors are optional. When using a histogram point variable, one must define at least one set of abscissa/count pairs. As for Histogram Bin Distribution, the abscissas specifications define abscissa values ("x" coordinates) for the PDF of each histogram variable. When paired with counts, the specifications provide sets of (x,c) pairs for each histogram variable where c defines a count (i.e., a frequency or relative probability) associated with a point.

To fully specify a point-based histogram with n points, n (x,c) pairs (note that (x,c) and (x,y) are equivalent in this case) must be specified, where x is the point value and c is the corresponding count for that value.

The num_pairs specification provides for the proper association of multiple sets of (x,c) or (x,y) pairs with individual histogram variables. For example, in the following specification,

histogram_point_uncertain = 2
  num_pairs =   2           3  
  abscissas = 3 4 100 200 300
  counts    = 1 1   1   2   1

num_pairs associates the (x,c) pairs ((3,1),(4,1)) with one point-based histogram variable (where the values 3 and 4 are equally probable) and associates the (x,c) pairs ((100,1),(200,2),(300,1)) with a second point-based histogram variable (where the value 200 is twice as probable as either 100 or 300).
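The same apportionment can be sketched in code. This hypothetical helper is not DAKOTA's own parser; it splits the flat lists by num_pairs and normalizes each variable's counts into probabilities.

```python
def split_point_histograms(num_pairs, abscissas, counts):
    """Return one {value: probability} dict per point-based histogram variable."""
    if sum(num_pairs) != len(abscissas) or len(abscissas) != len(counts):
        raise ValueError("num_pairs must account for every abscissa/count pair")
    variables, start = [], 0
    for n in num_pairs:
        xs = abscissas[start:start + n]
        cs = counts[start:start + n]
        start += n
        total = sum(cs)
        variables.append({x: c / total for x, c in zip(xs, cs)})
    return variables


# The two-variable specification from the example above
print(split_point_histograms([2, 3], [3, 4, 100, 200, 300], [1, 1, 1, 2, 1]))
# [{3: 0.5, 4: 0.5}, {100: 0.25, 200: 0.5, 300: 0.25}]
```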

Table 7.23 Specification detail for histogram point uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| histogram point uncertain variables | histogram_point_uncertain | integer | Optional group | no histogram point uncertain variables |
| key to apportionment among point-based histogram variables | num_pairs | list of integers | Optional | equal distribution |
| sets of abscissas for point-based histogram variables | abscissas | list of reals | Required | N/A |
| sets of counts for point-based histogram variables | counts | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'hupv_i' where i = 1,2,3,... |

Correlations

Aleatory uncertain variables may have correlations specified through use of an uncertain_correlation_matrix specification. This specification is generalized in the sense that its specific meaning depends on the nondeterministic method in use. When the method is a nondeterministic sampling method (i.e., sampling), then the correlation matrix specifies rank correlations [Iman and Conover, 1982]. When the method is instead a reliability (i.e., local_reliability or global_reliability) or stochastic expansion (i.e., polynomial_chaos or stoch_collocation) method, then the correlation matrix specifies correlation coefficients (normalized covariance) [Haldar and Mahadevan, 2000]. In either of these cases, specifying the identity matrix results in uncorrelated uncertain variables (the default). The matrix input should be symmetric and have all $n^2$ entries where n is the total number of aleatory uncertain variables (all normal, lognormal, uniform, loguniform, triangular, exponential, beta, gamma, gumbel, frechet, weibull, histogram bin, poisson, binomial, negative binomial, geometric, hypergeometric, and histogram point specifications, in that order). Table 7.24 summarizes the specification details:

Table 7.24 Specification detail for aleatory uncertain correlations

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| correlations in aleatory uncertain variables | uncertain_correlation_matrix | list of reals | Optional | identity matrix (uncorrelated) |
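Since the input must be symmetric and contain all $n^2$ entries, it can help to validate a matrix before writing it out as a flat list of reals. The sketch below is a hypothetical pre-processing helper, not part of DAKOTA; it flattens row by row, which for a symmetric matrix yields the same list as flattening column by column.

```python
def flatten_correlation_matrix(matrix, tol=1e-12):
    """Validate a correlation matrix and return its n*n entries as a flat list."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    for i in range(n):
        if abs(matrix[i][i] - 1.0) > tol:
            raise ValueError("diagonal entries must equal 1")
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                raise ValueError("matrix must be symmetric")
            if not -1.0 <= matrix[i][j] <= 1.0:
                raise ValueError("correlations must lie in [-1, 1]")
    return [value for row in matrix for value in row]


# Two correlated variables and one independent variable (illustrative values)
corr = [[1.0, 0.5, 0.0],
        [0.5, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
print("uncertain_correlation_matrix =", *flatten_correlation_matrix(corr))
```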

Epistemic Uncertain Variables

In addition to continuous and discrete aleatory probability distributions, DAKOTA provides support for epistemic uncertainties through its interval variable specification. This is not a probability distribution; rather, it specifies a set of belief structures based on intervals that may be contiguous, overlapping, or disjoint. It is used in specifying the inputs necessary for an epistemic uncertainty analysis using Dempster-Shafer theory of evidence.

Interval Uncertain Variable

The interval uncertain variable is NOT a probability distribution. Although it may seem similar to a histogram, the interpretation of this uncertain variable is different. It is used in epistemic uncertainty analysis, where one is trying to model uncertainty due to lack of knowledge. In DAKOTA, epistemic uncertainty analysis is performed using Dempster-Shafer theory of evidence. In this approach, one does not assign a probability distribution to each uncertain input variable. Rather, one divides each uncertain input variable into one or more intervals. The input parameters are only known to occur within intervals: nothing more is assumed. Each interval is defined by its upper and lower bounds, and a Basic Probability Assignment (BPA) associated with that interval. The BPA represents a probability of that uncertain variable being located within that interval. The intervals and BPAs are used to construct uncertainty measures on the outputs called "belief" and "plausibility." Belief represents the smallest possible probability that is consistent with the evidence, while plausibility represents the largest possible probability that is consistent with the evidence. For more information about the Dempster-Shafer approach, see the nondeterministic evidence method, evidence, in the Methods section of this Reference manual. As an example, in the following specification:

interval_uncertain = 2
  num_intervals   =                       3               2
  interval_probs  =     0.2     0.5     0.3     0.4     0.6
  interval_bounds = 2.0 2.5 4.0 5.0 4.5 6.0 1.0 5.0 3.0 5.0

there are 2 interval uncertain variables. The first one is defined by three intervals, and the second by two intervals. The three intervals for the first variable have basic probability assignments of 0.2, 0.5, and 0.3, respectively, while the basic probability assignments for the two intervals for the second variable are 0.4 and 0.6. The basic probability assignments for each interval variable must sum to one. The interval bounds for the first variable are [2, 2.5], [4, 5], and [4.5, 6], and the interval bounds for the second variable are [1.0, 5.0] and [3.0, 5.0].
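The bookkeeping in this example can be sketched as follows. The helper is hypothetical and not DAKOTA's own parser: it apportions the flat lists by num_intervals, reads the bounds as consecutive (lower, upper) pairs, and checks that each variable's basic probability assignments sum to one.

```python
def split_intervals(num_intervals, interval_probs, interval_bounds, tol=1e-9):
    """Return one list of (lower, upper, bpa) tuples per interval variable."""
    if sum(num_intervals) != len(interval_probs):
        raise ValueError("num_intervals must account for every probability")
    if 2 * len(interval_probs) != len(interval_bounds):
        raise ValueError("each interval needs a lower and an upper bound")
    variables, start = [], 0
    for n in num_intervals:
        probs = interval_probs[start:start + n]
        flat = interval_bounds[2 * start:2 * (start + n)]
        start += n
        if abs(sum(probs) - 1.0) > tol:
            raise ValueError("BPAs for each variable must sum to one")
        cells = []
        for k, bpa in enumerate(probs):
            lower, upper = flat[2 * k], flat[2 * k + 1]
            if lower > upper:
                raise ValueError("lower bound exceeds upper bound")
            cells.append((lower, upper, bpa))
        variables.append(cells)
    return variables


# The two-variable specification from the example above
for cells in split_intervals([3, 2],
                             [0.2, 0.5, 0.3, 0.4, 0.6],
                             [2.0, 2.5, 4.0, 5.0, 4.5, 6.0, 1.0, 5.0, 3.0, 5.0]):
    print(cells)
```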

Note that the intervals can be overlapping or disjoint. Table 7.25 summarizes the specification details for the interval_uncertain variable.

Table 7.25 Specification detail for interval uncertain variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| interval uncertain variables | interval_uncertain | integer | Optional group | no interval uncertain variables |
| number of intervals defined for each interval variable | num_intervals | list of integers | Required group | None |
| basic probability assignments per interval | interval_probs | list of reals | Required group. Note that the probabilities per variable must sum to one. | None |
| bounds per interval | interval_bounds | list of reals | Required group. Specify bounds as (lower, upper) per interval, per variable | None |
| Descriptors | descriptors | list of strings | Optional | vector of 'iuv_i' where i = 1,2,3,... |

State Variables

State variables provide a convenient mechanism for managing additional model parameterizations such as mesh density, simulation convergence tolerances, and time step controls. Types include continuous real, discrete range of integer values (contiguous integers), discrete set of integer values, and discrete set of real values. Within each optional state variables specification group, the number of variables is always required. Tables 7.26 through 7.29 summarize the required and optional specifications for each state variable subtype. The initial_state specifications provide the initial values for the state variables which will be passed through to the simulator (e.g., in order to define parameterized modeling controls). The remaining specifications are analogous to those for Design Variables.

Continuous State Variables

Table 7.26 Specification detail for continuous state variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| Continuous state variables | continuous_state | integer | Optional group | No continuous state variables |
| Initial states | initial_state | list of reals | Optional | vector values = 0. (repaired to bounds, if required) |
| Lower bounds | lower_bounds | list of reals | Optional | vector values = -DBL_MAX |
| Upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'csv_i' where i = 1,2,3,... |

Discrete State Range Variables

Table 7.27 Specification detail for discrete state range variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| Discrete state range variables | discrete_state_range | integer | Optional group | No discrete state variables |
| Initial states | initial_state | list of integers | Optional | vector values = 0 (repaired to bounds, if required) |
| Lower bounds | lower_bounds | list of integers | Optional | vector values = INT_MIN |
| Upper bounds | upper_bounds | list of integers | Optional | vector values = INT_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'dsriv_i' where i = 1,2,3,... |

Discrete State Integer Set Variables

Table 7.28 Specification detail for discrete state set of integer variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| Discrete state set of integer variables | discrete_state_set_integer | integer | Optional group | no discrete state set of integer variables |
| Initial state | initial_state | list of integers | Optional | middle set values (mean indices, rounded down) |
| Number of values for each variable | num_set_values | list of integers | Optional | equal distribution |
| Set values | set_values | list of integers | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'dssiv_i' where i = 1,2,3,... |

Discrete State Real Set Variables

Table 7.29 Specification detail for discrete state set of real variables

| Description | Keyword | Associated Data | Status | Default |
| --- | --- | --- | --- | --- |
| Discrete state set of real variables | discrete_state_set_real | integer | Optional group | no discrete state set of real variables |
| Initial state | initial_state | list of reals | Optional | middle set values (mean indices, rounded down) |
| Number of values for each variable | num_set_values | list of integers | Optional | equal distribution |
| Set values | set_values | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'dssrv_i' where i = 1,2,3,... |




Generated on 8 Feb 2012 for DAKOTA by  doxygen 1.6.1