SST Expressions


Data Types

SST supports three basic data types: observation vectors, matrices and scalars.

An observation vector is a list of values that has a sample vector associated with it indicating if each element in the list is valid or missing. A data element which is not present in the list is considered missing. The value of an observation is stored as a single precision number. Observation vectors are the main type of data used in SST and are used by most SST commands. The SST manual refers to observation vectors as data; we shall use the abbreviation obsv (pronounced obs-vee) throughout the rest of the documentation.

SST is also capable of manipulating matrix data. Matrices are stored as a two dimensional array of double precision values. There can be no missing data in a matrix. Several SST commands create matrix data using optional subops; the COEF subop to the REG command is one example.

The simplest SST data type is a scalar. A scalar is a single double precision value. It cannot be missing. Examples of scalar data include constants; scalar variables, which are created with the CALC command; and individual elements of matrices and obsv's. SST also treats 1x1 matrices as scalars. Although very few SST commands reference scalars directly, they are very useful in the overall SST environment.

Scalar expressions

A scalar expression is an expression which involves operations on scalar data (usually constants and scalar variables). Scalar expressions are evaluated in SST using the CALC command. For example, to find the value of 1/7 we would issue the following command:

SST1> CALC 1/7
     0.14286

SST2>

SST prints the answer, 0.14286, on the next line and then prompts us for the next line of input. All common arithmetic operations are allowed on scalar data. These operations are summarized in the table at the end of this section. The precedence of the operators is the same as that of C and FORTRAN. Parentheses may be used to alter the order of operations, as in the following example:

SST3> CALC 1 * (2+3)
          5.00000
SST4>

Another set of operations uses relational operators, which determine order relationships between operands. The simplest relational operator is the == operator which checks to see if two operands have the same value. Relational operators return logical values: 0 is used for false and 1 is used for true. If we had a scalar variable x whose value was 7 and we checked it to see if its value is 10, the value returned would be 0, or false:

SST4> CALC x == 10
          0.00000
SST5>

Since all data is stored within SST as floating point data the relational operators that include checking for equality use the following formula to determine if x and y are equal:

                                 |x - y|
x == y          <=>             ---------  <  0.000001
                                |x| + |y|

This check is used only if x and y are not both identically zero (in which case x == y is true).
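The equality rule above can be sketched in Python. This is only an illustration of the formula, not SST's implementation; the function name sst_equal and the tol parameter are names introduced here for the example.

```python
def sst_equal(x: float, y: float, tol: float = 0.000001) -> bool:
    """Approximate equality using the relative-difference rule described above."""
    # Both operands identically zero: equal by definition (avoids 0/0).
    if x == 0 and y == 0:
        return True
    # Difference scaled by the combined magnitude of the operands.
    return abs(x - y) / (abs(x) + abs(y)) < tol


print(sst_equal(7, 10))           # False
print(sst_equal(1.0, 1.0000001))  # True
print(sst_equal(0.0, 1e-12))      # False
```

Note that under this rule a value of exactly zero compares equal only to another exact zero, since the ratio is 1 whenever just one operand is zero.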

To assign values to a variable we use the equals sign in the same way as in C and FORTRAN. The expression to the right of the equals sign is evaluated and the variable listed to the left of the equals sign is given that value. The command

SST5> CALC x = 2+5

SST6>

will give x the value 7. If the variable on the left side of the equals sign does not exist it will be created, otherwise its value will be overwritten. Note that when an expression is an assignment its value is not printed.

Several other assignment operations are provided by SST. Variables can be incremented and decremented by using the ++ and -- operators. To increment the value x we would issue the command

SST7> CALC x++
SST8>

If x previously had the value 7 its new value would be 8. The increment and decrement operators can be placed either before or after the variable name (the significance of its position will be illustrated shortly). The variable which is being incremented or decremented must already exist; it will not be created.

Another set of assignment operations allows an operation to be performed on a variable and the result stored back under that variable name. To make this more clear, consider incrementing the variable x by y+1. One way to do this would be to use the command

CALC x = x + y + 1

Another way to perform this same operation is with the += operator:

CALC x += y + 1

The entire expression on the right side of += is evaluated and added to x. The result is stored back into x. This shorthand will work with subtraction, multiplication, division and all other arithmetic operations. Note, however, that it will not work with the logical operators. As with the increment and decrement operators, the variable to the left of the += operator must already exist.

Although no results are printed for assignment operations, the operation does have a value which can be used. For example if you issued the command

CALC y = x += 5

SST would increment x by 5 and then assign the new value of x to y (notice that the assignment operators are right associative -- they are evaluated from the right end of the expression). This type of operation will work with the equals sign or any of the combined operate/assign operators (+=, -=, etc). In particular it is legal to assign the same value to several variables with a statement:

CALC x = y = z = 1.5

The increment and decrement operators behave slightly differently depending on where the ++ or -- is placed with respect to the variable. If ++ is placed before a variable name then the variable is incremented and the new value is used for subsequent calculations. If ++ is placed after the variable name then the variable is incremented but the old value of the variable (before it was incremented) is used as the result of the ++ operation. Consider the following commands:

CALC y = ++x
CALC y = x++

If the initial value of x were 5 then in the first statement x would become 6 and y would also be assigned the value 6 (if y didn't exist it would be created). In the second statement, x would still become 6 but y would be assigned the value 5, the value of x before it was incremented.
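The difference between the two placements can be modeled in Python, which has no ++ operator of its own. This is a sketch only: the dictionary env stands in for SST's variable table, and the function names are hypothetical.

```python
def pre_increment(env: dict, name: str) -> float:
    """Model of ++x: increment first, then yield the new value."""
    env[name] += 1  # raises KeyError if the variable does not already exist
    return env[name]


def post_increment(env: dict, name: str) -> float:
    """Model of x++: yield the old value, then increment."""
    old = env[name]
    env[name] += 1
    return old


env = {"x": 5.0}
env["y"] = pre_increment(env, "x")   # CALC y = ++x
print(env)  # {'x': 6.0, 'y': 6.0}

env = {"x": 5.0}
env["y"] = post_increment(env, "x")  # CALC y = x++
print(env)  # {'x': 6.0, 'y': 5.0}
```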

SST also supports a conditional operator. This operator allows you to select one of two possible values based on a logical expression. It has the syntax:

logical-expr ? true-expr : false-expr

The result of the operation is as follows: if the logical expression is nonzero (true) the result is set to the value of true-expr; otherwise the result is the value of false-expr. The precedence of the conditional operator is lower than that of all operators except the assignment operators and the comma, so the following statement will set y to be 1/x if x is non-zero, and 0 otherwise:

CALC y = x != 0 ? 1/x : 0

To allow for sequential evaluation of expressions, SST uses the comma. Expressions separated by a comma are evaluated in the order in which they appear. The value of the entire expression is equal to the last expression evaluated. Using some of the operators previously introduced, we might use the following command:

CALC z = (y > 3 ? y -= 3 : y += x, x)

This expression is equivalent to the following code fragment:

IF (y > 3) THEN y = y - 3 ELSE y = y + x
z = x

Note that since the comma has lower precedence than =, it is necessary to use parentheses to assign a variable to the result of a sequential expression.

The following table summarizes the operators supported by SST. A few of the operations listed here will be introduced in subsequent sections. The operators are listed in order of precedence, from highest to lowest; parentheses may be used to override this order.

Operators           Associativity           Comments
----------          -------------           ---------
()                  left to right           Parentheses (for grouping)
! ++ -- - '         right to left           Unary operators
* / \ % .* ./ .\    left to right           Multiplication, division
+ -                 left to right           Addition, subtraction
< > <= >=           left to right           Relative ordering
== !=               left to right           Equality
&& (or &)           left to right           Logical AND
|| (or |)           left to right           Logical OR
?:                  right to left           Conditional expression
= += -= etc         right to left           Assignment
,                   left to right           Sequential evaluation

The SST expression syntax was derived from the C programming language. The differences between SST expressions and C expressions are few. SST does not support the bitwise operators &, |, ^ or ~, nor does it support the shift operators, << and >>. Since the bitwise operators are not supported, the logical operations && and || may be abbreviated & and | respectively. SST supports four additional operators as well, ^, \, %, and !:

a ^ b           raises a to the b power
a \ b           divides b by a (left division).
a % b           a modulo b (the fractional remainder from dividing a by b);
                to avoid conflicts with the history operator, always use
                blank space around the modulo operator
!a              negation for a logical expression

Obsv expressions

Most of the data that SST deals with is stored as observation vectors, or obsv's, in which the data is a list of values. The values may be marked as missing, meaning they are not valid and have no value. The SET command allows us to modify the values of existing variables or create new variables using arithmetic operations.

The operations supported for obsv's include all operations available on scalars. An obsv expression is evaluated by observation, so if we have an observation vector y, the command

SET x = y + 1

will add one to each observation of y and assign the result to x. The value of y remains unchanged.

If any of the operands of an operation have missing observations then those observations will be marked as missing in the result. As an example consider the following data set in which MD signifies 'missing data':

Obsno            sr             pop15
   1:      11.43000          29.35000
   2:      12.07000          23.32000
   3:            MD          23.80000
   4:       5.75000                MD
   5:      12.88000          42.19000
   6:       8.79000          31.72000
   7:            MD                MD
   8:      11.90000          44.75000
   9:       4.98000          46.64000
  10:      10.78000          47.64000

If we issued the command

SET x = sr + pop15

we would get as a result:

Obsno            sr             pop15                 x
   1:      11.43000          29.35000          40.78000
   2:      12.07000          23.32000          35.39000
   3:            MD          23.80000                MD
   4:       5.75000                MD                MD
   5:      12.88000          42.19000          55.07000
   6:       8.79000          31.72000          40.51000
   7:            MD                MD                MD
   8:      11.90000          44.75000          56.65000
   9:       4.98000          46.64000          51.62000
  10:      10.78000          47.64000          58.42000
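The propagation of missing data through an arithmetic operation can be sketched in Python. Here None stands in for MD, and add_obsv is a hypothetical helper written for this illustration; neither is part of SST.

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def add_obsv(a, b):
    """Add two observation vectors element by element, propagating MD."""
    return [MD if (u is MD or v is MD) else u + v for u, v in zip(a, b)]


sr = [11.43, 12.07, MD, 5.75, 12.88, 8.79, MD, 11.90, 4.98, 10.78]
pop15 = [29.35, 23.32, 23.80, MD, 42.19, 31.72, MD, 44.75, 46.64, 47.64]

x = add_obsv(sr, pop15)
# An observation is missing in x whenever it is missing in either operand.
print([i + 1 for i, v in enumerate(x) if v is MD])  # [3, 4, 7]
```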

Relational operators also operate by observation. Just as with arithmetic operators, if one or both of the operands is missing, the result is marked as missing.

Obsv's may be used as the logical or controlling expression in conditional statements. SST will automatically test each observation in the observation vector for fulfillment of the test condition, thereby avoiding the use of time-consuming and complicated FOR loops to carry out this process. Thus, using data from the previous example, we could say:

SET x = sr > 10 ? 0 : pop15

meaning that for every observation of the obsv sr which is greater than 10, we will set the corresponding observation in the obsv x equal to 0. For every observation in sr not greater than 10, the corresponding observation in x will be set to the value for pop15. We would get the following results:

Obsno            sr             pop15                 x
   1:      11.43000          29.35000           0.00000
   2:      12.07000          23.32000           0.00000
   3:            MD          23.80000                MD
   4:       5.75000                MD                MD
   5:      12.88000          42.19000           0.00000
   6:       8.79000          31.72000          31.72000
   7:            MD                MD                MD
   8:      11.90000          44.75000           0.00000
   9:       4.98000          46.64000          46.64000
  10:      10.78000          47.64000           0.00000

This use of a conditional statement involving the entire observation vector is the easiest way to execute a command on individual elements of an obsv.
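The by-observation conditional can be modeled in Python as follows. This is a sketch under stated assumptions: None stands in for MD, the helper names are hypothetical, and we assume that only the selected branch is examined for missing data (the text does not say what happens when the branch that is not selected is missing).

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def greater_than(a, threshold):
    """Per-observation a > threshold, returning 1 or 0 and propagating MD."""
    return [MD if v is MD else (1 if v > threshold else 0) for v in a]


def cond_obsv(test, if_true, if_false):
    """Per-observation model of `test ? if_true : if_false`."""
    out = []
    for t, a, b in zip(test, if_true, if_false):
        if t is MD:
            out.append(MD)  # a missing test makes the result missing
        else:
            # Assumption: only the selected branch decides the result.
            out.append(a if t != 0 else b)
    return out


sr = [11.43, 12.07, MD, 5.75, 12.88, 8.79, MD, 11.90, 4.98, 10.78]
pop15 = [29.35, 23.32, 23.80, MD, 42.19, 31.72, MD, 44.75, 46.64, 47.64]

x = cond_obsv(greater_than(sr, 10), [0.0] * len(sr), pop15)
print(x)  # [0.0, 0.0, None, None, 0.0, 31.72, None, 0.0, 46.64, 0.0]
```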

The Expression Sample Vector

Like many SST commands, the expression evaluation commands allow the user to specify subsets of the data that are to be worked with using the RANGE command and the IF and OBS subops. The RANGE command sets SST's global sample vector. The global sample vector marks certain observations as missing until another RANGE statement is issued. The IF and OBS subops, which are allowed on the SET command but not the MATRIX or CALC commands, can further mask out observations for a single command. The IF subop takes as its argument a logical expression. The expression is evaluated and converted to an obsv. Missing elements or valid elements with values of zero are marked as missing in the local sample vector.

The expression routines combine the global sample vector and the local sample vector to determine which observations will be used in obsv calculations. Only observations which are not marked as missing in either the global or local sample vectors will be used in obsv calculations. We shall refer to this combined sample vector as the expression sample vector.

In the creation of new variables, observations that are masked by the expression sample vector are marked as missing in the new variable. If a variable on the left side of an assignment operator already exists, only observations that are not masked by the expression sample vector are overwritten in that variable. If we wish to mark the masked observations as missing when an existing variable is overwritten, we can use the RP subop to set them. The following example illustrates the effect of the RP subop:

SET x = -1; if[obsno != 3 && obsno != 6]        # Setup x
SET y = z = obsno                               # Create y and z
SET y = x; if[obsno % 2 == 0]                   # No RP subop
SET z = x; if[obsno % 2 == 0] RP                # RP subop

Obsno             x                 y                 z
   1:      -1.00000           1.00000                MD
   2:      -1.00000          -1.00000          -1.00000
   3:            MD           3.00000                MD
   4:      -1.00000          -1.00000          -1.00000
   5:      -1.00000           5.00000                MD
   6:            MD                MD                MD
   7:      -1.00000           7.00000                MD
   8:      -1.00000          -1.00000          -1.00000
   9:      -1.00000           9.00000                MD
  10:      -1.00000          -1.00000          -1.00000
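The rules for overwriting an existing variable under an expression sample vector can be sketched in Python. None stands in for MD, and masked_assign is a hypothetical helper written for this illustration; it follows the rules stated above and is not SST's implementation.

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def masked_assign(target, values, keep, rp=False):
    """Assign values to an existing obsv under an expression sample vector.

    keep[i] is True when observation i+1 is NOT masked. Without rp, masked
    observations retain their old value; with rp they become MD.
    """
    out = []
    for old, new, k in zip(target, values, keep):
        if k:
            out.append(new)
        else:
            out.append(MD if rp else old)
    return out


obsno = list(range(1, 11))
x = [MD if n in (3, 6) else -1.0 for n in obsno]  # x is missing at 3 and 6
even = [n % 2 == 0 for n in obsno]                # mask from if[obsno % 2 == 0]

y = masked_assign([float(n) for n in obsno], x, even)           # no RP subop
z = masked_assign([float(n) for n in obsno], x, even, rp=True)  # RP subop
print(y)  # [1.0, -1.0, 3.0, -1.0, 5.0, None, 7.0, -1.0, 9.0, -1.0]
print(z)  # [None, -1.0, None, -1.0, None, None, None, -1.0, None, -1.0]
```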

Matrix expressions

Matrix expressions are evaluated with the MATRIX command. Matrices can be created by running an SST command such as REG with the COEF subop, or by entering them explicitly. An explicit matrix is created by surrounding a list of elements by braces, {} -- note that this is a change from SST 1.1 which used angle brackets. Column elements are separated by commas and the semicolon is used to indicate the end of a row. For example, the command

MATRIX {1, 2, 3; 4, 5, 6; 7, 8, 9}

will print

             [  1]             [  2]             [  3]
[  1]      1.00000           2.00000           3.00000
[  2]      4.00000           5.00000           6.00000
[  3]      7.00000           8.00000           9.00000

As with the CALC command, if an assignment is not specified the result of a matrix expression is printed on the terminal.

SST supports the usual operations between matrices. Addition, subtraction and multiplication of matrices are denoted by +, - and *. The operations are performed whenever the matrices have the proper dimensions. There are two matrix division symbols, \ and /. If A and B are matrices then A\B and B/A correspond to left and right multiplication of B by the inverse of A. In general A\B denotes the solution X to the equation A*X = B and B/A denotes the solution to X*A = B. Left division A\B is defined whenever B has as many rows as A. If A is square, it is factored using Gaussian elimination. The factors are then used to solve the equations A*X[:,j] = B[:,j] where B[:,j] denotes the jth column of B. The result is a matrix X with the same dimensions as B. If A is not square it is inverted in a least squares sense using pseudo inverses. Right division operates similarly.
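To make the meaning of left division concrete, the following Python sketch solves A*X = B for a square A by Gaussian elimination with partial pivoting. It is an illustration only: the function name left_divide is hypothetical, the elimination is carried out on an augmented matrix rather than by storing factors as SST is described as doing, and the non-square (least squares) case is not handled.

```python
def left_divide(A, B):
    """Solve A*X = B for square A by Gaussian elimination with partial pivoting.

    A is n x n and B is n x m, both given as lists of rows; returns X (n x m).
    """
    n = len(A)
    if any(len(row) != n for row in A) or len(B) != n:
        raise ValueError("A must be square and B must have as many rows as A")
    m = len(B[0])
    # Work on an augmented copy [A | B] so the inputs are left untouched.
    aug = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]

    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("A is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + m):
                aug[r][c] -= f * aug[col][c]

    # Back substitution, one right-hand-side column at a time.
    X = [[0.0] * m for _ in range(n)]
    for j in range(m):
        for i in range(n - 1, -1, -1):
            s = aug[i][n + j] - sum(aug[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = s / aug[i][i]
    return X


A = [[2.0, 1.0], [1.0, 3.0]]
B = [[3.0], [5.0]]
X = left_divide(A, B)  # models A\B; X is approximately [[0.8], [1.4]]
```

The result X has the same dimensions as B, and multiplying A by X reproduces B up to rounding error.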

It is also possible to obtain element-by-element multiplication and division. If A and B have the same dimensions, A .* B denotes the matrix whose elements are simple products of the individual elements of A and B. Multiplication and division of matrices by a scalar also use the operators *, / and \. In addition, scalar expressions can use the .*, ./ and .\ operators (they are equivalent to *, / and \ respectively).

One additional unary operator is available for use with matrices -- the transpose operator, which looks like an apostrophe, ('). The transpose operator should follow a matrix expression and will cause the rows and columns of the expression to be switched. Thus if we had assigned the value of our explicit matrix from above to the variable x we could set y to the transpose of x with the following command:

MATRIX y = x'

Relational operators are not supported for matrices, since determining whether the matrix A is "less than" the matrix B depends on what notion of "less than" you wish to use.

Assignment operators are similar to those used for obsv's. Simple assignment and assignment operations (+=, -=, ..., .*=, ./=, .\=) are all supported. The increment and decrement operators cannot be used with matrices.

The conditional operator can be used in matrix expressions, but only if the logical expression evaluates to a scalar.

Subscripting

Individual elements and groups of elements within matrices and obsv's may be referenced by enclosing the subscripts in square brackets, separated by a comma in the case of matrices. For example A[3,3] refers to the element in the third row, third column of a matrix A, while x[4] refers to the fourth element of the observation vector x. For convenience, matrices for which one of the dimensions is 1 can also be referenced using a single subscript. To specify a range of possible subscripts the : can be used within the subscript - x[j:k] would specify elements j, j+1, ..., k of the obsv (or one-dimensional matrix) x. If the colon alone is specified, all valid indices are used. This provides a convenient way to access an entire row or column of a matrix -- A[:,j] is the same as the jth column of A.
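Readers used to zero-based languages may find it helpful to see the subscript notation translated. The Python sketch below maps SST's 1-based, inclusive subscripts onto Python lists; the helper names are hypothetical and exist only for this illustration.

```python
def element(x, i):
    """x[i] in SST notation (numbering starts at 1)."""
    return x[i - 1]


def obs_range(x, j, k):
    """x[j:k] in SST notation: elements j, j+1, ..., k (both ends included)."""
    return x[j - 1:k]


def column(A, j):
    """A[:,j] in SST notation: the jth column of a matrix stored as rows."""
    return [row[j - 1] for row in A]


x = [10.0, 20.0, 30.0, 40.0, 50.0]
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(element(x, 4))       # 40.0
print(obs_range(x, 2, 4))  # [20.0, 30.0, 40.0]
print(column(A, 2))        # [2.0, 5.0, 8.0]
print(A[3 - 1][3 - 1])     # 9.0, i.e. A[3,3]
```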

Subscripts can be used both on the right side and on the left side of an assignment operator (=, +=, -=, etc). In either case only the selected elements of the matrix or obsv are used in the expression.

Obsv's provide one other method of subscripting, called indexing. Instead of using a constant or a range as a subscript, another obsv expression can be used as an index. For each observation, the corresponding observation of the referenced obsv is then used as a subscript. The subscript value is converted to an integer by dropping the fractional portion of the data. Consider the following command

SET z = x[y]

applied to the following data (shown with the result):

Obsno             x                 y                 z
   1:       1.00000           5.00000             7.00000
   2:       2.00000           3.00000                  MD
   3:            MD           1.70000             1.00000
   4:       4.00000                MD                  MD
   5:       7.00000           6.00000                  MD

Notice how out of range subscripts become missing in the result. Indexing cannot be used on the left hand side of an assignment operator.
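The indexing rules (truncate the index, treat missing or out of range indexes as missing) can be sketched in Python. None stands in for MD and index_obsv is a hypothetical helper, not an SST function.

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def index_obsv(x, y):
    """Model of `SET z = x[y]`: use each observation of y as a subscript into x."""
    out = []
    for idx in y:
        if idx is MD:
            out.append(MD)       # a missing index gives a missing result
            continue
        i = int(idx)             # drop the fractional portion
        out.append(x[i - 1] if 1 <= i <= len(x) else MD)
    return out


x = [1.0, 2.0, MD, 4.0, 7.0]
y = [5.0, 3.0, 1.7, MD, 6.0]
print(index_obsv(x, y))  # [7.0, None, 1.0, None, None]
```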

Two variants of indexing are lagging and leading a variable. It may be desirable to shift the elements of an obsv with respect to the observation number. In other words we might wish to set y[1] to x[2] and y[2] to x[3] and so on. Shifting an observation in this manner is called leading a variable. This can be accomplished by the following command

SET y = x[+1]

Any integer can be used to vary the amount of shifting that occurs. We can also lag a variable by replacing the plus sign with a minus sign. The command

SET y = x[-5]

would set y[6] to x[1], y[7] to x[2] and so on. The values for y[1] to y[5], which would correspond to x[-4] to x[0], are marked as missing just as in the case of indexes that are out of range.
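Leading and lagging amount to shifting the subscript by a fixed offset. A minimal Python sketch, assuming the result has the same number of observations as x, with None standing in for MD and shift as a hypothetical helper name:

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def shift(x, k):
    """Model of x[+k] (lead) or x[-k] (lag): y[i] = x[i + k], MD when out of range."""
    n = len(x)
    return [x[i + k] if 0 <= i + k < n else MD for i in range(n)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(shift(x, +1))  # [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, None]
print(shift(x, -5))  # [None, None, None, None, None, 1.0, 2.0]
```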

SST version 1.1 used parentheses for subscripts and leads/lags of obsvs. This is no longer supported.

Element assignment

Subscripted names can appear on the left side of an equals sign to set the value of a part of an obsv or matrix. For matrices the variable must exist and the range of subscripts must not exceed the size of the matrix. Additionally, the size of the expression on the right side of the equals sign must match the size of the subscripted name on the left side of the equals sign. If A is a 5x5 matrix then the following assignments are acceptable:

MATRIX A[1,4] = 7+3                     #first row, fourth col
MATRIX A[1:2, 1:2] = {1, 0; 0, 1}       #upper corner
MATRIX A[:, 2:3] = {X; Y}               #second and third cols

An assignment of the form

MATRIX A[1:2, 1:2] = 1

will generate an error because the left side is a 2x2 matrix and the right side is a scalar (or 1x1 matrix).

Obsv element assignment has the same basic syntax but differs slightly in operation. Obsv's are created or extended if the variable or referenced observation does not exist. Newly created observations that are not part of the subscript range are marked as missing data. Thus to create an obsv x with the first 4 observations missing and the fifth observation equal to one, we use:

SET x[5] = 1;

Ranges can also be used (using the :). No size checking is performed -- observations on the right side of the equals sign that are outside the range of the subscript are ignored. It is possible, then, to set the 7th and 8th observations of x to the same values as the 7th and 8th observations of y:

SET x[7:8] = y;

The combination of the two SET statements gives the following values for x (assume that y was an obsv filled with -1):

1:            MD                MD                MD
4:            MD           1.00000                MD
7:      -1.00000          -1.00000
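The creation and extension of an obsv by element assignment can be modeled in Python. This is a sketch under stated assumptions: None stands in for MD, set_range is a hypothetical helper, a constant right side is represented as a list filled with that constant, and right-side observations that do not exist are assumed to be treated as missing.

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def set_range(x, j, k, values):
    """Model of `SET x[j:k] = values` for obsv's (1-based, inclusive).

    x is extended with MD if it is shorter than k; only observations j..k
    are taken from values, and anything outside that range is ignored.
    """
    out = list(x) + [MD] * max(0, k - len(x))
    for i in range(j, k + 1):
        # Assumption: observations beyond the end of values are missing.
        out[i - 1] = values[i - 1] if i <= len(values) else MD
    return out


x = set_range([], 5, 5, [1.0] * 5)  # SET x[5] = 1
y = [-1.0] * 8                      # an obsv filled with -1
x = set_range(x, 7, 8, y)           # SET x[7:8] = y
print(x)  # [None, None, None, None, 1.0, None, -1.0, -1.0]
```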

Mixed expressions

The SST expression evaluator allows operands of different types to be combined in a single expression. If operands to an operator all have the same data type then the operation is performed as usual. If the operands have different data types then the operands are converted to the default data type for the command.

In SET mode operands are converted to obsv's in a mixed operation. For constants this means that the constant becomes an obsv which has both a constant value and a sample vector identical to the expression sample vector. Matrices are converted to obsv's by stacking the columns of the matrix one on top of the other and marking all observations as valid.

In MATRIX mode all mixed operands are converted to matrices. Scalars are treated as 1x1 matrices (note however that several operators support the use of scalars -- most notably *, / and \). Obsv's are converted to a column vector by ignoring all missing data items.

Mixed expressions in CALC mode are treated as obsv expressions. When the final value of a CALC expression is evaluated only the first observation of the result is used. Note that this differs from SET mode and MATRIX mode treatment of mixed expressions.

The user can force data conversions by using the var() and mat() functions described later. This is useful when the default action is not desired.
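The two conversions between matrices and obsv's described above can be sketched in Python. None stands in for MD, matrices are stored as lists of rows, and the helper names are hypothetical (they are not the SST var() and mat() functions).

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def matrix_to_obsv(A):
    """Stack the columns of A one on top of the other; every value is valid."""
    rows, cols = len(A), len(A[0])
    return [A[r][c] for c in range(cols) for r in range(rows)]


def obsv_to_column(x):
    """Convert an obsv to a column vector (n x 1), ignoring missing items."""
    return [[v] for v in x if v is not MD]


print(matrix_to_obsv([[1.0, 2.0], [3.0, 4.0]]))  # [1.0, 3.0, 2.0, 4.0]
print(obsv_to_column([5.0, MD, 7.0]))            # [[5.0], [7.0]]
```

Because missing items are dropped, the column vector produced from an obsv may have fewer rows than the obsv has observations.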

Predefined variables

SST defines a number of variables which can be used in expressions. These include:

obsno           the number of the observation being evaluated
nobs            number of observations in the expression sample vector
maxobs          number of observations set by the RANGE command
MD              missing data
brnd            binomial (0 or 1) random variable
urnd            uniform random variable
nrnd            normal random variable
PI              The constant PI
systime         The current time (cumulative seconds since 1/1/1970)
_ser            Standard error of the last REG command
_llk            Log likelihood of the last MLE command
_               the last result calculated by SST

The obsno variable can be used in a number of ways. For example, one alternative to the normal syntax for lagging and leading variables is to use obsno. x[-5] is exactly equivalent to x[obsno-5] (in fact this is how x[-5] is implemented). More complicated renumbering can also be performed. The command

SET y = x[obsno*2]

would set the elements of y to the even numbered observations of x.

The value of the systime variable is an integer which represents the number of seconds elapsed since a fixed date. The reference date varies from system to system, but is often 1/1/1970.

The last result variable, _, returns the value of the last expression which was evaluated by SST. This variable can be used to break a long expression into several shorter ones without the use of temporary variables. Only the SET, CALC and MATRIX commands can change the last result variable.

Internal functions

SST also defines a large number of functions that can be used in expressions. To call a function, use the function name followed by an argument list contained in parentheses. To calculate the square root of 5, for example, we might issue the following CALC command:

CALC sqrt(5)

Expressions may also be included in arguments. In this case the expressions are evaluated before being passed to the function. The command

CALC power(x+y, x-y)

is equivalent to

CALC (x+y)^(x-y)

The functions used above are scalar functions: they take as arguments a list of numbers and return a single number. When a scalar function is applied to an obsv it is applied for each observation. Applying scalar functions to matrices is not allowed.

Another type of function available in SST takes a list of values as an argument and returns a scalar. An example of such a function is sum(), which takes the sum of the elements which it is passed. These vector functions work on all data types (although using them on scalar data is not particularly useful). One important point is that obsv's passed to a vector function are affected by the expression sample vector. So the following sequence of commands:

SET one = 1; OBS[1-10]
SET x = sum(one); IF[obsno % 2 == 0] OBS[1-10]

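would result in a vector x whose even-numbered observations all have the value 5 (the sum of one over the five even-numbered observations from 1 to 10); the odd-numbered observations are masked by the IF subop.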

SST provides another method to limit the observations used in a vector function: a logical expression can be supplied as a second argument. In this case the vector function is applied only to observations that are valid in the expression sample vector AND the corresponding element in the logical expression is valid and nonzero. So we could perform the same sum as above with the command

SET x = sum(one, obsno % 2 == 0); OBS[1-10]

It is important to remember that the logical expression does not override the expression sample vector -- it can mask observations not masked by the expression sample vector but it cannot unmask observations. As a rule, observations which are marked as missing in the global sample vector are never used for calculations.
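The interaction between the expression sample vector and the optional logical argument can be sketched in Python. None stands in for MD, and sum_obsv is a hypothetical helper that follows the rules stated above; it is not SST's sum() function.

```python
MD = None  # stand-in for SST's missing-data marker (an assumption of this sketch)


def sum_obsv(x, sample, cond=None):
    """Sum x over observations that are valid, unmasked, and pass cond.

    sample[i] is True when observation i+1 is in the expression sample
    vector; cond, if given, holds per-observation logical values (MD or 0
    excludes the observation).
    """
    total = 0.0
    for i, v in enumerate(x):
        if v is MD or not sample[i]:
            continue  # cond can never unmask these observations
        if cond is not None and (cond[i] is MD or cond[i] == 0):
            continue
        total += v
    return total


obsno = list(range(1, 11))
sample = [True] * 10                            # OBS[1-10], nothing masked
even = [1 if n % 2 == 0 else 0 for n in obsno]  # obsno % 2 == 0

print(sum_obsv([1.0] * 10, sample, even))                 # 5.0
print(sum_obsv([float(n) for n in obsno], sample, even))  # 30.0
```

Summing a vector of ones counts the selected observations, while summing the observation numbers themselves gives the sum of the even numbers from 1 to 10.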

Another class of functions defined by SST are matrix functions. These functions generally convert all arguments to matrices and return a matrix. They are all reasonably straightforward with the exception of the SVD function. The SVD function is an example of a function which returns multiple values (it is the only such function currently defined in SST). Under normal operation the SVD function returns a vector of the singular values of the matrix passed as its argument. We may also wish, however, to get the transformation matrices which form the complete singular value decomposition (A = U * D * V'). To do this in SST we use a multiple value assignment:

MATRIX {U, D, V} = svd(A)

This returns the full singular value decomposition of A in the matrices U, D, and V. If the svd function is used in an expression or in a simple assignment then the primary return value of svd() is used (the D matrix).

Below is a list of the functions available in SST:

Functions that return variables:
exp(x)                  e^x
log(x)                  natural log of x
sqrt(x)                 square root of x
sin(x)                  sin of x (x in radians)
cos(x)                  cos of x
tan(x)                  tangent of x
cumnorm(x)              cumulative normal of x
invnorm(x)              inverse cumulative normal of x
abs(x)                  absolute value of x
bvnorm(h,k,r)           bivariate normal probability that standardized
                        normal variates with correlation r exceed (h,k)
phi(x)                  normal probability density at x
floor(x)                largest integer less than or equal to x
ceil(x)                 smallest integer greater than or equal to x
cumchi(x,df,{nc})       cumulative chi-square of x, degrees of freedom df,
                        noncentrality parameter nc
cumf(x,m,n,{nc})        cumulative F-probability of x, degrees of freedom
                        m,n, and noncentrality parameter nc
cumt(x,m,{nc})          cumulative Student's t probability of x, degrees
                        of freedom m, noncentrality parameter nc
erf(x)                  error function of x
mixchi(x, w1, ..., wn)  cumulative cdf for mixture of n chi squares, each
                        with one degree of freedom, mixed with weights
                        w1,...,wn.
nchi(x,w1,..,wn,df1,..,dfn,nc1,..,ncn)
                        cumulative cdf for mixture of n chi squares,
                        with degrees of freedom dfi and noncentrality
                        parameters nci
invchi(p,df)            Percentage point of central chi-square, degrees of
                        freedom df
invt(p,df)              Percentage point of central t, degrees of
                        freedom df

invf(p,m,n)             Percentage point of central F, degrees of
                        freedom m,n

asin(x)                 arcsin of x (result in radians)
acos(x)                 arccos of x
atan(x)                 arctan of x
gamma(x)                complete gamma function
incgam(y,x)             incomplete gamma function, argument y, parameter x
rmills(x)               inverse Mills ratio: phi(x)/cumnorm(-x)
power(x, y)             equivalent to x^y
srnd(seed)              seed the random number generator; returns old seed
brnd                    binomial (0 or 1) random number generator
nrnd                    standard normal random number generator
urnd                    uniform random number generator
vmin(x,y,...,z)         minimum of the variables x,y,...,z
vmax(x,y,...,z)         maximum of the variables x,y,...,z
rsum(x)                 running sum

Functions that return scalars:
min(x)                  minimum element in x
max(x)                  maximum element in x
sum(x)                  sum of the values of x
mean(x)                 mean of x
stddev(x)               standard deviation of x
median(x)               median of x
quantile(x, q)          quantile of x (q=0.5 => median), giving the value
                        of x such that a proportion q of the observations
                        are lower 
nrm(x)                  2-norm of x

Functions that operate on or return matrices:
(functions marked with a * are not implemented)
B = inv(A)              inverse of A (using LU algorithm)
B = ginv(A)             generalized inverse of A (Moore-Penrose)
[U,D,V] = svd(A)        singular value decomposition of a mxk matrix A,
                        A = UDV', where U is mxk, D and V are kxk,
                        U and V are column orthonormal, D is diagonal
B = A' or 
B = transpose(A)        transpose of A
y = tr(A)               returns trace of A
y = det(A)              returns determinant of A
B = chol(A)             Cholesky decomposition of positive definite A,
                        A = B'B, B upper triangular
B = diag(x)             returns diagonal matrix with vector x on diagonal
y = vec(A)              returns vector containing rows of A, in sequence
y = vech(A)             returns vector containing upper triangle of A,
                        row by row
y = vecd(A)             returns vector containing the diagonal of A
B = cumsum(A)           returns matrix containing the cumulative sums of
                        the elements in the columns of A
B = mat(x)              converts variable to a column matrix
x = var(A)              converts column matrix to a variable
B = col(x,y,..,z)       converts variables to a matrix with columns
                        x,y,...,z
B = [A,C,...,D]         horizontally concatenates matrices
B = [A;C;...;D]         vertically concatenates matrices
B = kron(A,C)           Kronecker product of A and C
B = ones(n,m)           returns an nxm array of ones
B = zeros(n,m)          returns an nxm array of zeros
B = eye(n) or B = eye(n,m)
                        returns an nxn or nxm matrix with ones down the
                        diagonal, zeros elsewhere
B = toeplitz(x)         returns a band matrix with the first element of x
                        down the diagonal, the second element of x one
                        place off the diagonal, etc.
y = size(A)             returns a vector of length 1 for a variable and
                        length 2 for a matrix, with y[1] equal to the
                        number of rows of A and y[2] equal to the number
                        of columns of A
B = reshape(A,n,m)      reshapes A into a matrix of dimension nxm, taking
                        the elements row-by-row
y = maxindc(A)          returns a vector of indices of the rows of A that
                        contain the maximum elements in each column.
B = submat(A,x,y)       returns a submatrix of A with rows corresponding
                        to the positive elements of the vector x, and
                        columns corresponding to positive elements of the
                        vector y
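
The layout conventions of vec, vech, toeplitz, reshape and submat can be sketched in Python with nested lists. This is an illustration only, not SST code, and it rests on several assumptions: toeplitz builds a symmetric band matrix, reshape both reads and fills row by row, and the vectors passed to submat act as selectors in which a positive entry keeps the corresponding row or column.

```python
def vec(A):
    """Rows of A laid end to end."""
    return [value for row in A for value in row]

def vech(A):
    """Upper triangle of A (diagonal included), row by row."""
    return [A[i][j] for i in range(len(A)) for j in range(i, len(A[i]))]

def toeplitz(x):
    """Assumed symmetric: element x[k] sits k places off the diagonal."""
    n = len(x)
    return [[x[abs(i - j)] for j in range(n)] for i in range(n)]

def reshape(A, n, m):
    """Assumed: read A row by row, then fill the n x m result row by row."""
    flat = vec(A)
    if len(flat) != n * m:
        raise ValueError("element count does not match n*m")
    return [flat[r * m:(r + 1) * m] for r in range(n)]

def submat(A, x, y):
    """Assumed: positive entries of x and y select rows and columns of A."""
    rows = [i for i, flag in enumerate(x) if flag > 0]
    cols = [j for j, flag in enumerate(y) if flag > 0]
    return [[A[i][j] for j in cols] for i in rows]
```

With A = [[1,2,3],[4,5,6],[7,8,9]], vech(A) returns the six elements on or above the diagonal, and submat(A, [1,0,1], [0,1,1]) keeps rows 1 and 3 and columns 2 and 3.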


Type cast functions
mat(x)                  Convert x to a matrix
var(x)                  Convert x to an obsv
scalar(x)               Convert x to a scalar

Miscellaneous functions
miss(x1,x2,...,xn)      Return 1 for an observation if any of the
                        variables x1,...,xn are missing data, 0 otherwise
col(x1, x2, ..., xn)    Combine several obsv into a matrix
mavg(x, start, stop)    Running average of an obsv; e.g., mavg(x,-1,0) is
                        a moving average of the current and one lag
                        observation; mavg(x,-nobs,0) is a cumulative
                        average from the start; and mavg(x,0,nobs) is a
                        cumulative average to the end
index(x)                Return the rank of each observation in a vector
unique(x)               Return a sorted list of the unique values of x.
                        [list, rank] = unique(x) generates both the values
                        and the rank index for a variable
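
The window arguments of mavg and the two outputs of unique can be modelled as follows. This Python sketch is not SST code. It assumes that start and stop are offsets relative to the current observation, that a window running past either end of the data is truncated (which is consistent with mavg(x,-nobs,0) being a cumulative average), that an empty window yields a missing value, and that the rank index is the 1-based position of each observation's value in the sorted list.

```python
def mavg(x, start, stop):
    """Average of x over offsets start..stop around each observation."""
    n = len(x)
    out = []
    for t in range(n):
        lo = max(0, t + start)        # truncate at the first observation
        hi = min(n - 1, t + stop)     # truncate at the last observation
        window = x[lo:hi + 1]
        # None stands in for missing data when no observation is in range.
        out.append(sum(window) / len(window) if window else None)
    return out

def unique_with_rank(x):
    """Sorted distinct values of x, plus each observation's position in them."""
    values = sorted(set(x))
    position = {value: i + 1 for i, value in enumerate(values)}
    return values, [position[value] for value in x]
```

For x = [1, 2, 3, 4], mavg(x, -1, 0) averages each observation with its predecessor, so the first element is just x[0] because no lag is available there.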

User functions

Users can define their own functions using the DEFINE command. The syntax for the command is

DEFINE function(arg1, arg2, ...) = expression

The expression must be free of syntax errors, but variables and other functions that are referenced by the expression need not exist until it is actually executed. When the defined function is called, the dummy arguments (arg1, arg2, etc.) will be replaced by the values of the expressions used in their place in the function call. A user function that standardizes an expression by subtracting its mean and then dividing by its standard deviation is given below:

DEFINE stand(x) = (x - mean(x)) / stddev(x)

To standardize the expression x+y and place the result in z:

SET z = stand(x+y)

Multiple arguments may be used in the same way as in calls to internal functions. Recursive function calls are not allowed.
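
To make the argument substitution concrete, here is a rough Python model of the stand example. It is not SST code: obsv's are represented as plain lists, and we assume stddev is the sample (n-1) standard deviation, which the text does not state. The point is that the argument expression x+y is evaluated first and its value is then used wherever the dummy argument appears.

```python
import statistics

def stand(x):
    """Model of DEFINE stand(x) = (x - mean(x)) / stddev(x)."""
    m = statistics.mean(x)
    s = statistics.stdev(x)   # assumption: sample standard deviation
    return [(value - m) / s for value in x]

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 2.0, 4.0, 4.0]

# Model of SET z = stand(x+y): the sum is formed observation by
# observation, then passed in as the value of the dummy argument.
z = stand([a + b for a, b in zip(x, y)])
```

Under these assumptions z has mean zero and unit standard deviation, whatever the scale of x+y.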

Options to the DEFINE command can be specified by separating optional subops from the expression with a semicolon. Currently only the TIME subop is allowed.

Error handling

There are several types of errors that can occur when evaluating an expression.

A syntax error occurs when the expression does not follow the rules for a valid arithmetic expression. When SST discovers a syntax error it prints an error message and indicates the location where the error occurred. No further processing is performed.

If a variable or function is not found, an error message indicating the unknown variable or function is printed and processing is aborted.

If the expression is syntactically correct and has no unresolved variable or function references then execution proceeds. If an error occurs during execution (such as out of memory or divide by zero) the evaluation is once again aborted. Any assignments that depended on the expression being evaluated are not performed.

This behavior is slightly modified when observation vectors are being used in a calculation. If a math error occurs (as opposed to a memory or other fatal error) an error message is printed, but execution is not stopped. Instead, the observation which generated the error is marked as missing and evaluation proceeds to the next valid observation. Any assignments that depended on the expression being evaluated are performed as usual.
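
The difference between the two behaviors can be sketched in Python. This is a model rather than SST code: None stands in for missing data (MD), the helper name safe_apply is hypothetical, and we assume that an observation with a missing input is skipped and stays missing in the result.

```python
import math

MD = None  # stand-in for SST's missing-data marker

def safe_apply(func, *columns):
    """Apply func observation by observation, marking math errors as missing."""
    out = []
    for obs, args in enumerate(zip(*columns), start=1):
        if any(a is MD for a in args):
            out.append(MD)            # assumed: missing inputs are skipped
            continue
        try:
            out.append(func(*args))
        except (ArithmeticError, ValueError) as err:
            # A math error is reported, but evaluation continues.
            print(f"math error at observation {obs}: {err}")
            out.append(MD)
    return out

ratio = safe_apply(lambda a, b: a / b, [1.0, 2.0, 3.0], [2.0, 0.0, 4.0])
roots = safe_apply(math.sqrt, [4.0, -1.0, MD, 9.0])
```

Here the division by zero at the second observation and the square root of a negative number each produce a message and a missing value, while the remaining observations are computed and assigned as usual. A memory error would not be caught and would still stop the evaluation.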

Using Interactive Mode

All of the expression evaluation commands support an interactive mode. To enter interactive mode, simply issue the SET, CALC or MATRIX command with no arguments. You will be prompted for input lines until you enter the keyword `quit' (this can be abbreviated as `q'). Each input line will be evaluated as if it had been preceded by the corresponding evaluation command.

All three commands will print the value of an expression if the last operation in the expression is not an assignment. We have already seen the output in CALC and MATRIX mode. In SET mode the output is a compact listing of the elements of the expression:

SST8> SET

SET9> y
     1:       1.00000          -1.00000           3.00000
     4:      -1.00000           5.00000                MD
     7:       7.00000          -1.00000           9.00000
    10:      -1.00000

SST10>

The number in the leftmost column is the observation number of the first element printed on that line.

Using expressions in variable lists

In many commands an expression can be used in place of a variable name in the VAR subop. This allows the use of an expression in a command without having to explicitly create a temporary variable. To use an expression in the VAR subop the expression must be enclosed in parentheses:

SCAT VAR[x y (x+y)]

All expressions are converted to obsv's before use. This allows a constant to be treated as an obsv with a constant value. For example, to perform a simple regression of the variable x on a constant and obsno we could use the command

REG DEP[x] IND[(1) (obsno)]
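
What the conversion buys us can be sketched in Python. This is an illustration, not SST code and not a description of how REG is implemented: we assume obsno is the observation number 1, 2, ..., n, the helper names as_obsv and fit_constant_and_trend are hypothetical, and the fit is an ordinary least squares solution of the two normal equations.

```python
def as_obsv(value, nobs):
    """Expand a scalar to a constant column; pass a sequence through."""
    if isinstance(value, (int, float)):
        return [float(value)] * nobs
    return [float(v) for v in value]

def fit_constant_and_trend(x):
    """Least squares fit of x on the columns (1) and (obsno)."""
    n = len(x)
    const = as_obsv(1, n)                   # the expression (1)
    obsno = as_obsv(range(1, n + 1), n)     # assumed meaning of (obsno)
    scc = sum(c * c for c in const)
    sct = sum(c * t for c, t in zip(const, obsno))
    stt = sum(t * t for t in obsno)
    scx = sum(c * v for c, v in zip(const, x))
    stx = sum(t * v for t, v in zip(obsno, x))
    det = scc * stt - sct * sct
    intercept = (scx * stt - sct * stx) / det
    slope = (scc * stx - sct * scx) / det
    return intercept, slope

intercept, slope = fit_constant_and_trend([3.0, 5.0, 7.0, 9.0])
```

Because the constant is expanded to a full column before the fit, it enters the calculation exactly like any other regressor.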

Expressions cannot be used in places where the variable name is required for proper operation. For example, the SAVE command (which saves variables to a file) cannot be used with expressions.

