Table of Contents
- Introduction
- Using the New QOI Dialog
- Anchor Text
- Anchor Text with Occurrence Count
- Anchor Text with Leading Text
- Anchor Text with Trailing Text
Introduction
QOI (pronounced "koi") is a library shipped with the Dakota GUI that extracts quantities of interest (QOIs) from unstructured text. This is particularly useful for scraping information from text files without the need to write complex regular expressions.
Most running processes output a stream of text, such as a log file or the console output of a command-line process. This output can vary from run to run: the body of text may be shorter or longer, and text may appear in non-deterministic order, depending on the process. For that reason, it's usually unwise to extract a QOI from an output text file based on absolute character position. However, a QOI will usually occur in the same location relative to the surrounding text. More often than not, nearby labels (or "key text") that describe our QOIs can be used to find the QOIs themselves.
For instance, you might expect the following text to appear in your output:
MASS = 0.001
If you wanted to extract the "0.001" value shown above, you would only need to know the location of the "MASS" label preceding it. From there you could deduce the location of the value, provided you make some simple assumptions about the relationship between "MASS" and "0.001" (for instance, that an equals sign appears between the two).
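The idea of locating a value relative to its label can be sketched in a few lines of Python. This is only an illustration of the principle, not the QOI library's code: the function name `value_after_label` and the assumption that an equals sign separates label and value are hypothetical.

```python
def value_after_label(text, label, separator="="):
    """Return the first token after `label` and `separator`, or None if absent."""
    for line in text.splitlines():
        # Split the line into the part before and the part after the separator.
        left, sep, right = line.partition(separator)
        if sep and left.strip() == label:
            tokens = right.split()
            if tokens:
                return tokens[0]
    return None


# The label is found wherever it appears, regardless of absolute position.
log = "step 12 complete\nMASS = 0.001\nstep 13 complete\n"
print(value_after_label(log, "MASS"))
```

Because the lookup depends only on the label, adding or removing unrelated lines before "MASS" does not change the result.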
There are many different types of QOI extractors you can define using the "New QOI" dialog.
Using the New QOI Dialog
Currently, the QOI dialog can be accessed in two places:
- The New Script-Based Dakota Driver wizard, on the post-processing page.
- In Next-Gen Workflow, via the qoiExtractor node.
In both cases, to open the "New QOI" dialog, simply click on "Add QOI Extractor" and provide a name for your new QOI extractor. You will then be presented with the "New QOI" dialog.
The general idea behind this dialog is to import a blob of unstructured text into the area on the left, and then apply various QOI extractors to that text one at a time, until you find an extractor that does what you want it to do.
On the left side of the dialog...
- Open File (the folder button): This allows you to import a text file (for example, a saved log file) into the sample text area.
- Paste (the clipboard button): If you have text on the clipboard, click this button to paste it into the sample text area.
- Sample Text Area: This is where you place your example unstructured text to guide you in defining a QOI extractor. Think of it as a sandbox for trying out different QOI extractors.
On the right side of the dialog...
- QOI Extractor Dropdown: This dropdown contains all QOI extractors available for you to use. The following sections of this manual page describe each QOI extractor in detail.
- Extract: Click this button to apply your configured QOI extractor to the unstructured text in the sample text area.
- "Extracted result appears here" text box: As the default text implies, after you click the "Extract" button to apply your configured QOI extractor, the resulting extracted QOI appears here. This gives you immediate feedback as to whether your QOI extractor is doing what you want it to do.
Anchor Text
"Anchor Text" is the most straightforward type of QOI extractor. If your QOI is near an anchoring piece of text (or "key text") use the fields in this group to define the distance between the anchor text and the QOI. The fields shown here are laid out like a spoken sentence to help you conceptualize what the QOI extractor will do.
For example, let's take a look at a snippet from the log file that Dakota's classic cantilever beam simulation will output when run:
--------------------------------------------------------------------------------
Output Section: MASS
Based on user inputs, estimated cantilever beam mass is:
2.89351852e+01 (lb)
Let's say "2.89351852e+01" is the QOI we're interested in. This number is the mass of the cantilever beam; we know that because the text "MASS" appears two lines before the QOI. Therefore, "MASS" should be our key text. Because "2.89351852e+01" sits two lines below "MASS", our Anchor Text QOI extractor expression should read as follows:
Get 1 field(s) that are 2 line(s) after the key text MASS
This is the most straightforward way to get the value. But there's more than one way to skin this cat. You could also set the following expression:
Get 1 field(s) that are 10 field(s) after the key text MASS
This works because "2.89351852e+01" is the tenth word (or "field") after "MASS". This approach is less preferable than the first because it is less intuitive. However, something like it would be required if there were no line breaks between "MASS" and "2.89351852e+01".
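To make the two expressions concrete, here is a minimal Python sketch of how line-based and field-based offsets might be resolved. It is not the QOI library's implementation. The function names are hypothetical, fields are assumed to be whitespace-delimited, and the snippet is assumed to break into lines with "MASS" two lines above the value.

```python
def fields_after_lines(text, key, lines_after, count=1):
    """Take `count` fields from the line `lines_after` lines below the key text."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if key in line:
            target = i + lines_after
            if target < len(lines):
                return lines[target].split()[:count]
            return []
    return []


def fields_after_fields(text, key, fields_after, count=1):
    """Take `count` fields starting at the `fields_after`-th field after the key text."""
    pos = text.find(key)
    if pos < 0:
        return []
    # Everything after the key text, split on whitespace (line breaks included).
    tokens = text[pos + len(key):].split()
    start = fields_after - 1
    return tokens[start:start + count]


snippet = (
    "Output Section: MASS\n"
    "Based on user inputs, estimated cantilever beam mass is:\n"
    "2.89351852e+01 (lb)\n"
)
print(fields_after_lines(snippet, "MASS", 2))    # 2 line(s) after MASS
print(fields_after_fields(snippet, "MASS", 10))  # 10 field(s) after MASS
```

Both calls select the same value in this sketch. The field-based version does not depend on line breaks at all, which is why it still works on text that has been flattened onto a single line.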
Anchor Text with Occurrence Count
"Anchor Text with Occurrence Count" works the same as "Anchor Text", but accounts for the possibility that your key text appears in the unstructured text more than once, and that you may not want the first occurrence of the key text.
For example, consider again the log snippet from Dakota's classic cantilever beam simulation:
--------------------------------------------------------------------------------
Output Section: MASS
Based on user inputs, estimated cantilever beam mass is:
2.89351852e+01 (lb)
Let's pretend that "MASS" appears four times prior to this text snippet; therefore, this snippet of text is the fifth occurrence of "MASS", which is the one we're interested in. To get it using "Anchor Text with Occurrence Count", you would write:
Get 1 field(s) that are 2 line(s) after occurrence number 5 of the key text MASS
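The effect of the occurrence number can be sketched in Python as follows. This is an illustrative approximation, not the QOI library's code: the function name `fields_after_occurrence` is hypothetical, occurrences are assumed to be counted from 1 in reading order, and the sample log is made up for the example.

```python
def fields_after_occurrence(text, key, occurrence, lines_after, count=1):
    """Take `count` fields from the line `lines_after` lines below the
    `occurrence`-th appearance (1-based) of the key text."""
    lines = text.splitlines()
    seen = 0
    for i, line in enumerate(lines):
        # A single line may contain the key text more than once.
        seen += line.count(key)
        if seen >= occurrence:
            target = i + lines_after
            if target < len(lines):
                return lines[target].split()[:count]
            return []
    return []


# Hypothetical log in which "MASS" appears five times; only the last is wanted.
values = ("1.0", "2.0", "3.0", "4.0", "2.89351852e+01")
log = "".join(f"Output Section: MASS\nmass is:\n{v} (lb)\n" for v in values)

print(fields_after_occurrence(log, "MASS", 5, 2))
```

Note that the match is case-sensitive in this sketch, so the lowercase "mass" on the second line of each block is not counted as an occurrence.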
Anchor Text with Leading Text
"Anchor Text with Leading Text" allows you to bifurcate a body of text around an anchor (or "key text"). Using the anchor, you can preserve everything that precedes the anchor, and throw away everything that follows.
This type of QOI extractor can be extremely useful in conjunction with a workflow built in Next-Gen Workflow, by chaining multiple qoiExtractor nodes together. For example, the first qoiExtractor node could use "Anchor Text with Leading Text" to downsize a log file to a relevant portion; then subsequent qoiExtractor nodes can operate on the smaller subsection of the log file.
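As a rough Python analogy, keeping the leading text amounts to splitting once at the anchor and keeping the first half. This sketch is not the QOI library's code. The function name `leading_text` is hypothetical, and two behaviors are assumptions made for illustration: the anchor itself is excluded from the result, and an empty string is returned when the anchor is not found.

```python
def leading_text(text, key):
    """Return everything before the first occurrence of `key`."""
    head, sep, _tail = text.partition(key)
    # `sep` is empty when the key text was not found.
    return head if sep else ""


# Hypothetical log: keep only the solver output that precedes the end marker.
log = "iteration 1\nresidual 0.5\nEND OF SOLVE\ncleanup messages\n"
trimmed = leading_text(log, "END OF SOLVE")
print(trimmed)
```

A later extraction step would then operate on `trimmed` rather than on the full log, which mirrors the chaining of qoiExtractor nodes described above.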
Anchor Text with Trailing Text
"Anchor Text with Trailing Text" allows you to bifurcate a body of text around an anchor (or "key text"). Using the anchor, you can preserve everything that follows the anchor, and throw away everything that precedes it.
This type of QOI extractor can be extremely useful in conjunction with a workflow built in Next-Gen Workflow, by chaining multiple qoiExtractor nodes together. For example, the first qoiExtractor node could use "Anchor Text with Trailing Text" to downsize a log file to a relevant portion; then subsequent qoiExtractor nodes can operate on the smaller subsection of the log file.
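As a rough Python analogy, keeping the trailing text amounts to splitting once at the anchor and keeping the second half. This sketch is not the QOI library's code. The function name `trailing_text` is hypothetical, and two behaviors are assumptions made for illustration: the anchor itself is excluded from the result, and an empty string is returned when the anchor is not found.

```python
def trailing_text(text, key):
    """Return everything after the first occurrence of `key`."""
    _head, sep, tail = text.partition(key)
    # `sep` is empty when the key text was not found.
    return tail if sep else ""


# Hypothetical log: drop the setup chatter, then read a value from what remains.
log = "setup noise\nOutput Section: MASS\nmass is:\n2.89351852e+01 (lb)\n"
trimmed = trailing_text(log, "Output Section: MASS")

# Second stage of the chain: take the third whitespace-delimited field.
value = trimmed.split()[2]
print(value)
```

The second stage only ever sees the trimmed text, so it can use a simple offset instead of searching the whole log.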