SST command preprocessor

Entering Commands
Output redirection and pipes
SST Macros
Control Loops
IF statement
FOR and WHILE loops
BREAK and CONTINUE
FOREACH loops
Nesting Loops
Order of processing
Escaping Metacharacters
Example

Entering commands

Whenever SST expects you to enter something from the keyboard, it will issue a prompt. The usual prompt is the letters SST followed by a '>' followed by a number which corresponds to the number of commands SST has executed in this session. It might look like this:

SST3>

Whenever you see a prompt on the screen you can enter input.

Commands entered from the keyboard are parsed by the SST parser only after a carriage return has been pressed. Lines may be edited before being sent to the command processor using the standard editing keys of the operating system. Input may be continued past the end of a line by ending the line with a backslash (\). The backslash and the following carriage return will be ignored by SST.

Comments can be entered by using the REM command followed by the comment, or by using the comment character, the pound sign (#). Any characters between a comment character and the end of a line will be ignored. Unlike the REM command, the comment character can appear after the end of a valid SST command on the same line. Comment characters are particularly useful when output is being spooled to a file (see the SPOOL command) and inside command files (see below).

A command can be aborted at any stage by sending an interrupt. On most systems hitting the control key and the letter C at the same time will work.

Output redirection and pipes

Output from an SST command can be sent directly to a file by using the output redirection character, >. The command

cova var[sr pop15] > cova.out

will put the output from the COVA command in the file cova.out in the current directory. SST output can be appended to a file by using >> filename.

Under the unix operating system, output from SST commands can also be piped to unix programs using the output pipe character, |. The syntax for piping output is sst-command | unix-command. The portion of the line after the | character is executed as a unix command and the SST output is sent to that command using the usual unix piping mechanism.

SST Macros

SST provides a macro facility that can substantially reduce the amount of typing required for entering commands, variable lists, and other user input. The simplest use of a macro is to define a variable list. Suppose for example that you expect to use the variables "pop15", "pop75" and "dpi" repeatedly in some subops. Define a variable list "vlist" using the MACRO command:

macro vlist { pop15 pop75 dpi }

Whenever you want to use this list of variables in a command, you just ask SST to "expand" the macro. For example:

reg dep[sr] ind[one @vlist]

is equivalent to:

reg dep[sr] ind[one pop15 pop75 dpi]

The "@" symbol tells SST to expand the macro, i.e., to substitute the text of the macro in the command.
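
The substitution can be pictured with a short Python sketch. It models only the behavior described above; the function name expand_macros and the macro table are illustrative assumptions and are not part of SST.

```python
import re

# Hypothetical macro table: macro name -> body text (as defined with MACRO).
macros = {"vlist": "pop15 pop75 dpi"}

def expand_macros(line, table):
    """Replace each @name with the macro body; unknown names are left alone."""
    def substitute(match):
        name = match.group(1)
        return table.get(name, match.group(0))
    return re.sub(r"@([A-Za-z_][A-Za-z0-9_]*)", substitute, line)

expanded = expand_macros("reg dep[sr] ind[one @vlist]", macros)
print(expanded)
```

The printed line is the REG command with the variable list written out in full.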

More complicated macros may involve parameter lists and multiple commands. For example, it is possible to define a macro to perform two stage least squares. Suppose that we are interested in estimating an equation with one endogenous and two exogenous variables and that we have several additional exogenous variables to be used as instruments. In the first stage of two stage least squares, we regress the included endogenous variable on all the exogenous variables and save the predicted values. In the second stage we regress the dependent variable on the predicted values from the first stage regression and the included exogenous variables. To define a macro "ls_2s" which performs this operation in one shot, type

macro ls_2s(y1, y2, x1, x2, z_vars) {
    reg dep[y2] ind[x1 x2 z_vars] pred[y2hat]
    reg dep[y1] ind[y2hat x1 x2] coef[b2sls]
    set e2sls = y1 - b2sls(1)*y2 - b2sls(2)*x1 - b2sls(3)*x2
    calc stdev(e2sls)
}

The last two commands are necessary to calculate the standard error of the regression for two stage least squares (the estimated standard errors from the second stage regression need to be multiplied by the ratio of the last value calculated and the reported standard error of the regression from the second stage printout). To use the "ls_2s" macro when "price" and "quantity" are the endogenous variables, "one" and "weather" are the included exogenous variables and "populat" and "dpi" are the excluded exogenous variables type:

@ls_2s(quantity, price, one, weather, populat dpi)

and the entire sequence of commands will be executed. Note that the macro arguments are delimited by commas, making it possible to assign lists of variables (separated by spaces) to the macro dummy arguments (as in the case of z_vars).
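
How comma-delimited arguments bind to dummy arguments can be sketched in Python. This is a model under stated assumptions, not SST's implementation: the helper call_macro is hypothetical, and matching whole identifiers (so that y2 does not alter y2hat) is our assumption about how the substitution should behave.

```python
import re

def call_macro(params, body, arg_text):
    """Bind comma-separated arguments to dummy names and substitute them in the body."""
    args = [a.strip() for a in arg_text.split(",")]
    if len(args) != len(params):
        raise ValueError("argument count mismatch")
    binding = dict(zip(params, args))
    # Assumption: replace whole identifiers only, so y2 does not clobber y2hat.
    pattern = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
    return pattern.sub(lambda m: binding.get(m.group(0), m.group(0)), body)

params = ["y1", "y2", "x1", "x2", "z_vars"]
body = "reg dep[y2] ind[x1 x2 z_vars] pred[y2hat]"
first_stage = call_macro(params, body, "quantity, price, one, weather, populat dpi")
print(first_stage)
```

Because the arguments are split on commas only, the last argument carries the two-variable list "populat dpi" into the position of z_vars.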

A macro name must be a valid SST identifier of any length. The braces surrounding the macro can be omitted if the entire macro body consists of a single statement appearing on the same line as the MACRO command.

When macros are entered from the terminal, SST changes the prompt string to indicate that we are in the body of a macro definition. The prompt changes from the usual "SST" prompt to a "1" followed by a right angle bracket (">"):

SST1> macro listmac {
1> #List all currently defined macros
1> list macro
1> }

SST2>

Control Loops

In batch files or macros, you may want to execute some commands one or more times while some condition holds. SST provides some control structures with the IF, WHILE, FOR and FOREACH commands that allow you to write programs within SST.

IF statement

The IF statement is the most straightforward of the SST control structures. The syntax for the IF statement is the word "if" followed by a logical expression enclosed in parentheses followed by a body of SST commands. The body for the IF statement can either be a single command on the same line as the IF keyword or a set of commands enclosed in braces. The IF statement could be used when, for example, you want to re-run a regression with a variable deleted if its coefficient in the initial regression was less than some predetermined value:

reg dep[y] ind[x1 x2 x3] coef[b1]
if (abs(b1(3)) < 1.0) {
reg dep[y] ind[x1 x2]
}

The second REG command (with x1 and x2 as independent variables) will only be executed if the coefficient of x3 in the first regression (b1(3)) is less than one in absolute value. The above IF statement could have been written on a single line without the braces since it only contained a single statement in the body:

if (abs(b1(3)) < 1.0) reg dep[y] ind[x1 x2]

The logical expression for the IF statement can be any valid CALC expression. If it evaluates to exactly zero it is considered to be false, otherwise it is true.
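
The truth rule, and the effect of where the comparison sits relative to abs(), can be checked with a little Python arithmetic. This is a sketch only: the helper is_true and the coefficient value are hypothetical, and we assume, as in C, that a comparison yields 1 or 0.

```python
def is_true(value):
    """Mirror the stated rule: exactly zero is false, anything else is true."""
    return value != 0

b3 = -2.5  # hypothetical coefficient value

# Assumption: a comparison yields 1 (true) or 0 (false), as in C.
inside = abs(int(b3 < 1.0))    # comparison taken inside abs(): abs(1) = 1
outside = int(abs(b3) < 1.0)   # abs() taken first: 2.5 < 1.0 gives 0

print(is_true(inside), is_true(outside))
```

Only the second form tests whether the coefficient is less than one in absolute value.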

The ELSE statement is also supported. It allows you to provide an alternate action if the logical expression of an IF statement is false. Following the ELSE keyword by another IF statement allows simple case-type statements to be set up:

if (abs(b1(3)) < 1.0) {
    reg dep[y] ind[x1 x2]
} else if (abs(b1(2)) < 1.0) {
    reg dep[y] ind[x1 x3]
} else {
    reg dep[y] ind[x2 x3]
}

If you are going to use an ELSE statement you must enclose the IF-body and the ELSE-body in braces, and the ELSE statement must appear on the same line as the closing brace of the IF-body.

FOR and WHILE loops

The FOR and WHILE loops allow you to run a set of SST commands while some condition holds. They are modeled after the for and while loops of the C programming language. The WHILE loop has the same syntax as the IF statement except that the body of the loop is repeated as long as the logical expression is true. Execution proceeds as follows: SST evaluates the logical expression; if it is true, it performs the specified set of commands; otherwise it proceeds to subsequent commands (outside the body of the loop). After the first pass through the set of commands, it reevaluates the logical expression. Again, if it is true it performs the commands. If the expression is false, it proceeds to the commands following the loop.

As an example, suppose you would like to estimate a regression ten times, each time adding an additional observation to the sample used to compute the regression. This can be done using a WHILE loop as follows:

calc counter = 1
while (counter <= 10) {
    reg dep[y] ind[x1 x2] if[obsno < 20+counter]
    calc counter = counter + 1
}

Like the IF statement, the control expression for the WHILE statement can be any valid CALC expression. If the expression evaluates to exactly zero the loop is not executed; otherwise it is.

The FOR loop is very similar to the WHILE loop except that the initialization and re-initialization commands are written in the loop header, separated from the control expression by semicolons. So the previous example could be written with a FOR loop as:

for (calc counter = 1; counter <= 10; calc counter = counter + 1) {
    reg dep[y] ind[x1 x2] if[obsno < 20+counter]
}
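
The correspondence between the two loop forms, and the sample each pass uses, can be traced in Python. This is a sketch under assumptions: the function sample_sizes is hypothetical, and we assume obsno numbers observations starting from 1.

```python
def sample_sizes(first=1, last=10, base=20):
    """Return the number of observations used on each pass of the loop."""
    sizes = []
    counter = first                 # initialization
    while counter <= last:          # control expression
        # Assumption: obsno starts at 1, so if[obsno < base+counter]
        # keeps observations 1 .. base+counter-1.
        sizes.append(base + counter - 1)
        counter = counter + 1       # re-initialization
    return sizes

sizes = sample_sizes()
print(sizes)
```

Under these assumptions the ten regressions use 20, 21, ..., 29 observations.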

BREAK and CONTINUE

The BREAK and CONTINUE commands are used to alter the execution of loops. Normally, a loop executes all statements in the body until the logical condition is no longer true. Sometimes, we wish to jump to the next iteration of a loop without executing the rest of the body. This is accomplished using the CONTINUE command:

for (calc counter = 1; counter <= 10; calc counter = counter + 1) {
    # Don't bother regressing if the new data is missing
    if ( miss(y[20+counter]) ) continue
    reg dep[y] ind[x1 x2] if[obsno < 20+counter]
}

Other times, we wish to terminate a loop prematurely. Although this can often be done by setting a variable to a value such that the control expression is no longer true, a much cleaner method is to use the BREAK command:

for (calc counter = 1; counter <= 10; calc counter = counter + 1) {
    # Don't bother regressing if the new data is missing
    if ( miss(y[20+counter]) ) continue
    # Don't go past the end of our range
    if ( 20+counter > maxobs ) break

    reg dep[y] ind[x1 x2] if[obsno < 20+counter]
}
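
The effect of CONTINUE and BREAK on which passes actually run can be modeled in Python. This is only a sketch: the function passes_run and the data are hypothetical, missing values are represented by None, and the range test is placed first here so that the Python list index stays valid.

```python
def passes_run(y, maxobs, last=10, base=20):
    """Return the counters for which the regression would actually run."""
    ran = []
    for counter in range(1, last + 1):
        index = base + counter          # observation number being added
        if index > maxobs:
            break                       # leave the loop entirely
        if y[index - 1] is None:        # y is 0-based here; None marks missing
            continue                    # skip to the next counter
        ran.append(counter)
    return ran

y = [1.0] * 25
y[22 - 1] = None                        # observation 22 is missing
ran = passes_run(y, maxobs=25)
print(ran)
```

Counter 2 is skipped because observation 22 is missing, and the loop ends at counter 6 because observation 26 lies past the end of the data.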

FOREACH loops

FOREACH loops allow a group of commands to be executed once for each instance of an index variable. For example

foreach (i; 85 86 totl) {
    read file[data$i] to[x1_$i x2_$i]
}

is equivalent to:

read file[data85] to[x1_85 x2_85]
read file[data86] to[x1_86 x2_86]
read file[datatotl] to[x1_totl x2_totl]
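
The same unrolling can be reproduced with a few lines of Python. The generator foreach below is a hypothetical model of the substitution, not SST code; it uses plain text replacement, which is enough for this example but would not distinguish $i from a longer name that begins with i.

```python
def foreach(index, values, body):
    """Yield one copy of the body per value, with $index replaced by the value."""
    for value in values:
        yield body.replace("$" + index, value)

commands = list(foreach("i", ["85", "86", "totl"],
                        "read file[data$i] to[x1_$i x2_$i]"))
for command in commands:
    print(command)
```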

Whenever "$i" is encountered in the body it is replaced by the current value of the index variable, i. Any name may be used for the index variable; like all SST names, it must be a valid SST identifier. The list of values that the index takes on should be separated by commas or spaces (just like arguments to the TO and VAR subops). SST wildcard characters are also allowed in the FOREACH value list. The index variable will be set to each variable name that matches the wildcard specification, just as if the variable names had been entered by hand and separated by spaces.

As another example, consider running a regression on each of the variables in the bkw data set using "sr" as the independent variable. This sequence of commands could be executed using the following FOREACH loop:

foreach (dep_variable; pop* dpi deldpi) {
    reg ind[sr] dep[$dep_variable]
    rem pause while we read the answer
    pause
}

Another expansion applied to FOREACH argument lists, which is used in several examples below, is for ranges. It is often the case that we wish to run a FOREACH loop over all integers from some starting value to some ending value. Rather than type all these integers in by hand, we can have SST perform the expansion by entering "{start-stop}" where start and stop are two integers with start < stop. Although this expansion works in several places within SST (most notably the VAR and TO subops) we shall only be concerned with its use in FOREACH argument lists.

SST1> foreach (i; {1-4}) echo $i
1
2
3
4
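
The range expansion itself is easy to model. The Python function expand_ranges below is a hypothetical sketch of the rule as described (integers only, start < stop), not the SST implementation.

```python
import re

def expand_ranges(text):
    """Expand each {start-stop} into the integers start..stop, space separated."""
    def expand(match):
        start, stop = int(match.group(1)), int(match.group(2))
        return " ".join(str(n) for n in range(start, stop + 1))
    return re.sub(r"\{(\d+)-(\d+)\}", expand, text)

expanded = expand_ranges("{1-4}")
print(expanded)
```

The expanded text is then treated like any other space-separated FOREACH value list.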

Nesting loops

It is possible to nest FOR, WHILE and FOREACH loops and IF statements. If you are entering a nested set of loops from the terminal, SST updates the prompt to help you keep track of the current nesting level. The following example will demonstrate the use of nested loops and the SST prompting mechanism:

SST1> foreach (year; 85 86 87) {
1>   foreach (month; jan feb mar apr may jun) {
2>       load file[$month$year]
2>   }
1> }

SST2>
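
To see the order in which a nested loop of this kind issues its commands, the two indices can be enumerated in Python. This is an illustrative sketch assuming the years 85, 86 and 87; the inner (month) index varies fastest.

```python
years = ["85", "86", "87"]
months = ["jan", "feb", "mar", "apr", "may", "jun"]

# The outer loop fixes the year; the inner loop runs through every month.
files = [month + year for year in years for month in months]

print(len(files), files[0], files[6], files[-1])
```

Eighteen LOAD commands are generated in all, beginning with jan85 and ending with jun87.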

Order of processing

The SST HISTORY, MACRO and ARRAY mechanisms can be thought of as simple filters taking lines as inputs and outputting expanded lines. Each filter is applied sequentially: the output of the history filter is fed to the macro filter and then to the array filter. Some filters only act on lines read from the terminal while others act on all lines executed by SST.

The history filter is applied to all lines that are input from the terminal. Any history references are replaced by the appropriate command from the history list. Commands executed from command files are not passed through the history filter. Commands executed within loops, IF statements and macros that were entered from the terminal are only passed through the history filter as they are defined. So including a history reference within a FOREACH loop will cause the history reference to be expanded when the body is defined, not as it is executing.

The macro and array filters are applied to all lines as they are executed. The filters are applied one character at a time: a character is first checked to see if it is the macro metacharacter. If it is, then we expand the macro. Next we check for the array metacharacter and expand the array reference if it exists. When we have finished both the macro and array expansions we move on to the next character. This continues until we have processed the entire line.

Since the macro filter is applied before the array filter, including a macro reference in an array reference can cause problems. For example, suppose that the array routine is defined to be @say_hello and the macro say_hello echoes a greeting to the terminal. We might expect that if we type $routine SST would execute the macro say_hello. But if we follow through the steps that SST performs we see that something different happens:

Input String    After MACRO filter      After ARRAY filter      Output
------------    ------------------      ------------------      ------

$routine        $routine                @say_hello              @
say_hello       say_hello               say_hello               s
...

The problem here is that we do not go back to check if the variable expanded into a macro reference. Although this may seem like a pitfall, it rarely happens in practice.

How about going the other way? Is it okay to define a macro which contains a variable reference? The answer is yes. Let's look at another example. This time suppose the macro expand_myvar is defined as $myvar and the array myvar has the value hello world. If we pass the string @expand_myvar through the macro and array filters we get:

Input           After Macro filter      After Array filter      Output
-----           ------------------      ------------------      ------

@expand_myvar   $myvar                  hello world             h
ello world      ello world              ello world              e
...

This now behaves more as we expected. Generally you will use variables inside of macros much more often than the other way around and SST supports this usage.
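
Both traces can be reproduced with a small Python model of the two filters. It is a sketch of the procedure described in this section, not SST's actual code: the function names, the tables and the body of the say_hello macro are hypothetical, and only "@" and "$" are treated as metacharacters.

```python
import re

NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def expand_at(line, pos, meta, table):
    """If line[pos] is `meta` followed by a known name, splice in its text."""
    if pos < len(line) and line[pos] == meta:
        match = NAME.match(line, pos + 1)
        if match and match.group(0) in table:
            return line[:pos] + table[match.group(0)] + line[match.end():]
    return line

def run_filters(line, macros, arrays):
    """Apply the macro filter, then the array filter, one character at a time."""
    output = []
    pos = 0
    while pos < len(line):
        line = expand_at(line, pos, "@", macros)   # macro filter first
        line = expand_at(line, pos, "$", arrays)   # then the array filter
        if pos < len(line):
            output.append(line[pos])               # emit and move on
        pos += 1
    return "".join(output)

# An array that expands to a macro reference: the macro is never expanded.
first = run_filters("$routine", {"say_hello": "echo hello"},
                    {"routine": "@say_hello"})
# A macro that expands to an array reference: both expansions take place.
second = run_filters("@expand_myvar", {"expand_myvar": "$myvar"},
                     {"myvar": "hello world"})
print(first)
print(second)
```

The first call leaves @say_hello unexpanded because the macro test at that position has already been made by the time the array filter produces the "@".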

Another point which often leads to confusion is that macros and arrays are expanded only during execution, i.e., they are not expanded when a loop or macro is being defined. If they were, then setting up a simple FOREACH loop would not work. If we tried to enter

foreach (i; 1 2 3) {
    echo $i
}

and array expansion were applied to the body as it was defined then $i would be immediately replaced by its current value. But we want to wait until the ECHO command is actually executed before expanding $i, so that the foreach loop can change its value. For this reason the macro and array filters are applied only when a command is actually executed. There are a few other times when macro and array expansions are turned off. The most notable of these is the CONFIG PROMPT command (described later).

One final note should be made on the subject of macro recursion. SST allows macros to contain references to other macros. It is tempting to write recursive routines using the macro mechanism. This usage should be avoided! Since macros are implemented as simple text substitution, using macros recursively is not the same as calling a subroutine recursively. If you wish to implement a recursive algorithm, you should do so using the command file facility.

Escaping metacharacters

Sometimes we wish to use one of the SST metacharacters without forcing a history, macro or array expansion to take place. A common example is if we wish to load the file $data. Typing LOAD FILE[$data] will cause SST to look for an array named data. To get around this we use the escape character--a backslash (\). Placing an escape character in front of a metacharacter will cause the metacharacter to be sent through without the corresponding filter applied. So in the example above we would use LOAD FILE[\$data] to access the file $data. The backslash and the dollar sign are read by SST as a single dollar sign and no array expansion is performed.

If we want to use a filename beginning with the escape character itself, we would type \\. So to access the file \$data, for example, we would need to use LOAD FILE[\\\$data] (the first backslash escapes the second backslash and the third backslash escapes the dollar sign).
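
The way the backslashes pair up can be checked with a short Python sketch. The function unescape is a hypothetical model of the escape rule only; it does not perform any expansion, and the history metacharacter is left out of the metacharacter set because this section does not name it.

```python
def unescape(text, metachars="$@\\"):
    """Drop each escaping backslash and keep the character it protects."""
    output = []
    pos = 0
    while pos < len(text):
        char = text[pos]
        if char == "\\" and pos + 1 < len(text) and text[pos + 1] in metachars:
            output.append(text[pos + 1])   # keep the protected character
            pos += 2
        else:
            output.append(char)
            pos += 1
    return "".join(output)

print(unescape(r"\$data"))
print(unescape(r"\\\$data"))
```

Reading left to right, each backslash consumes exactly one following character, which is why three backslashes are needed to produce one backslash and one dollar sign.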

Example - stepwise regression

To illustrate some of the finer points of the SST command language, we will analyze a fairly complex command file which calculates a stepwise regression.

1   rem stepwise.cmd - calculate a stepwise regression
2   rem Usage:  stepwise(dvar, ivar_list)
3   rem                  $1    $2
4
5   rem dvar is the dependent variable
6   rem ivar_list is a list of independent variables
7
8   # Create a local list of independent and dependent variables
9   if ($#argv < 1) {
10      # No dependent variable - read one from the terminal
11      echo Dependent variable:
12      array dvar = $<
13  } else {
14      # Use the supplied arguments
15      array dvar = $1
16  }
17
18  if ($#argv < 2) {
19      # No independent variables - read them from the terminal
20      echo Independent variables:
21      array ind_list = ( $< )
22  } else {
23      # Use the supplied arguments
24      array ind_list = ($2)
25  }
26
27  # Don't print any output until we are ready
28  config out[off]
29
30  while (1) {
31
32      # Perform a regression on remaining independent variables
33      reg dep[$dvar] ind[$ind_list] coef[b_tmp] covmat[c_tmp]
34      if (!$status) goto error
35
36      # Create an empty array to hold significant variables
37      array sig_list
38
39      # Go through each of the independent variables
40      foreach (i; {1-$#ind_list}) {
41
42          # Check the dependence on the dependent variable
43          if (b_tmp($i) / sqrt(c_tmp($i,$i)) > 1.96 ) {
44              # Append the variable to the list of significant vars
45              array sig_list = ($sig_list, $ind_list[$i])
46          }
47      }
48
49      # See if we have eliminated any variables this iteration
50      # or if we are completely out of variables
51      if ($#ind_list == $#sig_list | $#sig_list == 0) {
52          # Break out of the while loop and exit
53          break
54      } else {
55          # Use the remaining variables the next iteration
56          array ind_list = ( $sig_list )
57      }
58  }
59
60  config out[on]                              #Turn output back on
61  if( $#sig_list != 0 ) {
62      # show the results
63      reg dep[$dvar] ind[$sig_list]
64  } else {
65      echo no significant independent variables
66  }
67
68  # Get rid of internal variables
69  del mat[b_tmp, c_tmp]
70  del array[dvar ind_list sig_list]
71  exit 1
72
73  error:
74  # An error occurred someplace - clean up and leave
75
76  echo Fatal error:  array dump follows\:
77
77  # Define a macro to dump the value of a variable
78  macro dump_array(name) {
79      echo "   $name = ":
80      if ($?name) {
81          echo $name
82          del array[name]
83      }
84  }
85
86  @dump_array(dvar)
87  @dump_array(ind_list)
88  @dump_array(sig_list)
89
90  exit 0

The first seven lines of the command file describe the file and its usage. These comments are included since they allow you to figure out how to use the command file without having to read through the entire file.

Next we check the arguments and prompt the user for any missing arguments (lines 8-25). We use the fact that the arguments are stored in the array "argv" to determine how many arguments were passed to the routine. If the first or second arguments are missing, we print a prompt and read a line of input from the terminal ("$<").

Line 28 turns off all output so that we can find the significant variables without cluttering the screen with output.

Lines 30-58 define the main loop. In this loop we perform a regression on the independent variables and determine which variables are significant. The significant variables are used in the next iteration as the independent variables. At line 33 we perform the regression. We immediately check the status variable (line 34) to make sure the regression was successful. If it wasn't, we jump to the end of the file and print some diagnostics.

To create a list of significant variables we start with an empty array (line 37) and add the names of variables which are determined to be significant by the test in line 43. To add an element to an existing array we construct an array with multiple elements (signaled by the parentheses) and define those elements to be the previous elements of "sig_list" (separated by spaces) and the element of "ind_list" that satisfied our test. So if "ind_list" started out as "(pop15 pop75 dpi)", "sig_list" was "(pop15)" and "pop75" satisfied the criterion, the array statement in line 45, after expansions have been performed, would be:

array sig_list = (pop15 pop75)

Thus we have appended pop75 to sig_list. Notice that the index of the foreach statement (line 40) runs from 1 to the number of variables in the array "ind_list", so we check each element of ind_list.

After we have constructed the significant list, we check to see if it is time to break out of the main loop (started on line 30). There are two conditions which may signal that we are done: all the variables were significant or none of the variables were significant. If either of these cases is true (line 51) we break from the loop. Otherwise we replace the independent variable list with the significant variable list. We must use parentheses in line 56 to set "ind_list" to an array of values instead of a single value.

When we reach line 60 we have completed calculating the stepwise regression and it is time to display the results and clean up. If there were any significant variables we print the regression on them; otherwise we echo a message stating that no significant variables were found. Lines 68-71 clean up the temporary matrices and arrays that we have created, and we finish by exiting with a value of 1.

The end of the command file (lines 73-90) contains a small set of commands for printing out the status of the command file when an error has occurred. This is done by defining a macro which will print out the value of an array (if it exists) and delete it. We invoke this macro for each of the arrays used in the file. The matrices b_tmp and c_tmp are not cleaned up since they may not exist (if the first REG command generated an error). There is currently no way to determine if an SST variable exists.
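
Stripped of its input handling and cleanup, the main loop of the command file follows a simple elimination scheme, which can be sketched in Python. This is a model only: the function stepwise and the callback t_stats are hypothetical, with t_stats standing in for the REG command and the ratio computed in line 43.

```python
def stepwise(variables, t_stats, threshold=1.96):
    """Repeatedly keep only variables whose t statistic exceeds the threshold.

    `t_stats` is a hypothetical callback: given a list of variable names it
    returns one t statistic per name (standing in for the REG command).
    """
    current = list(variables)
    while True:
        stats = t_stats(current)
        significant = [v for v, t in zip(current, stats) if t > threshold]
        # Stop when nothing was eliminated or nothing is left.
        if len(significant) == len(current) or not significant:
            return significant
        current = significant

def fake_t_stats(names):
    """Made-up statistics: pop75 is only significant while dpi is included."""
    table = {"pop15": 3.0, "pop75": 2.5 if "dpi" in names else 1.0, "dpi": 0.5}
    return [table[name] for name in names]

kept = stepwise(["pop15", "pop75", "dpi"], fake_t_stats)
print(kept)
```

With these made-up statistics the first pass drops dpi, the second drops pop75, and the third pass keeps everything that remains, so the loop stops with pop15 alone.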