SST supports three basic data types: observation vectors, matrices and scalars.
An observation vector is a list of values that has a sample vector associated with it indicating if each element in the list is valid or missing. A data element which is not present in the list is considered missing. The value of an observation is stored as a single precision number. Observation vectors are the main type of data used in SST and are used by most SST commands. The SST manual refers to observation vectors as data; we shall use the abbreviation obsv (pronounced obs-vee) throughout the rest of the documentation.
SST is also capable of manipulating matrix data. Matrices are stored as a two dimensional array of double precision values. There can be no missing data in a matrix. Several SST commands create matrix data using optional subops; the COEF subop to the REG command is one example.
The simplest SST data type is a scalar. A scalar is a single double precision value. It cannot be missing. Examples of scalar data include constants; scalar variables, which are created with the CALC command; and individual elements of matrices and obsv's. SST also treats 1x1 matrices as scalars. Although very few SST commands reference scalars directly, they are very useful in the overall SST environment.
A scalar expression is an expression which involves operations on scalar data (usually constants and scalar variables). Scalar expressions are evaluated in SST using the CALC command. For example, to find the value of 1/7 we would issue the following command:
SST1> CALC 1/7
0.14286
SST2>
SST prints the answer, 0.14286, on the next line and then prompts us for the next line of input. All common arithmetic operations are allowed on scalar data. These operations are summarized in the table at the end of this section. The precedence of the operators is the same as that of C and FORTRAN. Parentheses may be used to alter the order of operations, as in the following example:
SST3> CALC 1 * (2+3)
5.00000
SST4>
Another set of operations uses relational operators, which determine order relationships between operands. The simplest relational operator is the == operator, which checks whether two operands have the same value. Relational operators return logical values: 0 is used for false and 1 is used for true. If we had a scalar variable x whose value was 7 and we checked whether its value is 10, the value returned would be 0, or false:
SST4> CALC x == 10
0.00000
SST5>
Since all data is stored within SST as floating point data, the relational operators that include a check for equality use the following formula to determine whether x and y are equal:
x == y  <=>  |x - y| / (|x| + |y|) < 0.000001
This check is used only if x and y are not both identically zero (in which case x == y is true).
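As a rough illustration, the equality rule can be sketched in Python. The function name sst_equal and the tol parameter are hypothetical names chosen for this sketch; only the formula and the 0.000001 threshold come from the text above.

```python
def sst_equal(x: float, y: float, tol: float = 0.000001) -> bool:
    """Relative-difference equality test, following the formula in the text.

    sst_equal and tol are illustrative names, not part of SST.
    """
    if x == 0.0 and y == 0.0:
        # Both operands identically zero: treated as equal without the ratio test.
        return True
    return abs(x - y) / (abs(x) + abs(y)) < tol


print(sst_equal(7.0, 10.0))       # the x == 10 example from the text
print(sst_equal(1.0, 1.0000001))  # values differing far below the tolerance
print(sst_equal(0.0, 0.0))        # the special case
```

One consequence of the relative form of the test is that a nonzero value never compares equal to zero, however small it is, because the ratio is then exactly 1.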
To assign values to a variable we use the equals sign in the same way as in C and FORTRAN. The expression to the right of the equals sign is evaluated and the variable listed to the left of the equals sign is given that value. The command
SST5> CALC x = 2+5
SST6>
will give x the value 7. If the variable on the left side of the equals sign does not exist it will be created; otherwise its value will be overwritten. Note that when an expression is an assignment its value is not printed.
Several other assignment operations are provided by SST. Variables can be incremented and decremented by using the ++ and -- operators. To increment the value of x we would issue the command
SST7> CALC x++
SST8>
If x previously had the value 7 its new value would be 8. The increment and decrement operators can be placed either before or after the variable name (the significance of the position will be illustrated shortly). The variable which is being incremented or decremented must already exist; it will not be created.
Another set of assignment operations allows an operation to be performed on a variable and the result stored back under that variable name. To make this clearer, consider incrementing the variable x by y+1. One way to do this would be to use the command
CALC x = x + y + 1
Another way to perform this same operation is with the += operator:
CALC x += y + 1
The entire expression on the right side of += is evaluated and added to x. The result is stored back into x. This shorthand will work with subtraction, multiplication, division and all other arithmetic operations. Note, however, that it will not work with the logical operators. As with the increment and decrement operators, the variable to the left of the += operator must already exist.
Although no results are printed for assignment operations, the operation does have a value which can be used. For example if you issued the command
CALC y = x += 5
SST would increment x by 5 and then assign the new value of x to y (notice that the assignment operators are right associative -- they are evaluated from the right end of the expression). This type of operation will work with the equals sign or any of the combined operate/assign operators (+=, -=, etc). In particular it is legal to assign the same value to several variables with a single statement:
CALC x = y = z = 1.5
The increment and decrement operators behave slightly differently depending on where the ++ or -- is placed with respect to the variable. If ++ is placed before a variable name then the variable is incremented and the new value is used for subsequent calculations. If ++ is placed after the variable name then the variable is incremented but the old value of the variable (before it was incremented) is used as the result of the ++ operation. Consider the following commands, each taken separately:
CALC y = ++x
CALC y = x++
If the initial value of x were 5 then in the first statement x would become 6 and y would also be assigned the value 6 (if y didn't exist it would be created). In the second statement, again starting from x equal to 5, x would still become 6 but y would be assigned the value 5, the value of x before it was incremented.
SST also supports a conditional operator. This operator allows you to select one of two possible values based on a logical expression. It has the syntax:
logical-expr ? true-expr : false-expr
The result of the operation is as follows: if the logical expression is nonzero (true) the result is set to the value of true-expr; otherwise the result is the value of false-expr. The precedence of the conditional operator is lower than that of all operators except the assignment operators and the comma, so the following statement will set y to be 1/x if x is non-zero, and 0 otherwise:
CALC y = x != 0 ? 1/x : 0
To allow for sequential evaluation of expressions, SST uses the comma. Expressions separated by a comma are evaluated in the order in which they appear. The value of the entire expression is equal to the last expression evaluated. Using some of the operators previously introduced, we might use the following command:
CALC z = (y > 3 ? y -= 3 : y += x, x)
This expression is equivalent to the following code fragment:
IF (y > 3) THEN
    y = y - 3
ELSE
    y = y + x
ENDIF
z = x
Note that since the comma has lower precedence than =, it is necessary to use parentheses to assign the result of a sequential expression to a variable.
The following table summarizes the order of operations supported by SST. A few of the operations listed here will be introduced in subsequent sections. The operations are listed in their order of precedence within SST unless overridden by the use of parentheses.
Operators            Associativity    Comments
---------            -------------    --------
()                   left to right    Parentheses (for grouping)
! ++ -- - '          right to left    Unary operators
* / \ % .* ./ .\     left to right    Multiplication, division
+ -                  left to right    Addition, subtraction
< > <= >=            left to right    Relative ordering
== !=                left to right    Equality
&& (or &)            left to right    Logical AND
|| (or |)            left to right    Logical OR
?:                   right to left    Conditional expression
= += -= etc          right to left    Assignment
,                    left to right    Sequential evaluation
The SST expression syntax was derived from the C programming language. The differences between SST expressions and C expressions are few. SST does not support the bitwise operators &, |, ^ or ~, nor does it support the shift operators, << and >>. Since the bitwise operators are not supported, the logical operations && and || may be abbreviated & and | respectively. SST supports four additional operators as well, ^, \, %, and !:
a ^ b    raises a to the b power
a \ b    divides b by a (left division)
a % b    a modulo b (the fractional remainder from dividing a by b); to avoid conflicts with the history operator, always use blank space around the modulo operator
!a       negation for a logical expression
Most of the data that SST deals with is stored as observation vectors, or obsv's, in which the data is a list of values. The values may be marked as missing, meaning they are not valid and have no value. The SET command allows us to modify the values of existing variables or create new variables using arithmetic operations. The operations supported for obsv's include all operations available on scalars. An obsv expression is evaluated by observation, so if we have an observation vector y, the command
SET x = y + 1
will add one to each observation of y and assign the result to x. The value of y remains unchanged.
If any of the operands of an operation have missing observations then those observations will be marked as missing in the result. As an example consider the following data set in which MD signifies 'missing data':
Obsno           sr       pop15
    1:    11.43000    29.35000
    2:    12.07000    23.32000
    3:          MD    23.80000
    4:     5.75000          MD
    5:    12.88000    42.19000
    6:     8.79000    31.72000
    7:          MD          MD
    8:    11.90000    44.75000
    9:     4.98000    46.64000
   10:    10.78000    47.64000
If we issued the command
SET x = sr + pop15
we would get as a result:
Obsno           sr       pop15           x
    1:    11.43000    29.35000    40.78000
    2:    12.07000    23.32000    35.39000
    3:          MD    23.80000          MD
    4:     5.75000          MD          MD
    5:    12.88000    42.19000    55.07000
    6:     8.79000    31.72000    40.51000
    7:          MD          MD          MD
    8:    11.90000    44.75000    56.65000
    9:     4.98000    46.64000    51.62000
   10:    10.78000    47.64000    58.42000
Relational operators also operate by observation. Just as with arithmetic operators, if one or both of the operands is missing, the result is marked as missing.
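The by-observation rule for missing data can be sketched in Python as follows. This is only an illustration of the semantics described above, not SST code: None stands in for MD, and the names Obsv, add_obsv and greater_obsv are hypothetical.

```python
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD (missing data)


def add_obsv(a: Obsv, b: Obsv) -> Obsv:
    """Add two obsv's by observation; a missing operand gives a missing result."""
    return [None if x is None or y is None else x + y for x, y in zip(a, b)]


def greater_obsv(a: Obsv, b: Obsv) -> Obsv:
    """Compare by observation, returning 1.0 (true), 0.0 (false) or None (MD)."""
    return [None if x is None or y is None else float(x > y) for x, y in zip(a, b)]


# First five observations of the sr / pop15 data set shown above.
sr = [11.43, 12.07, None, 5.75, 12.88]
pop15 = [29.35, 23.32, 23.80, None, 42.19]

x = add_obsv(sr, pop15)
flags = greater_obsv(pop15, sr)
print(x)
print(flags)
```

Observations 3 and 4 come out missing in both results because one of the two operands is missing there.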
Obsv's may be used as the logical or controlling expression in conditional statements. SST will automatically test each observation in the observation vector for fulfillment of the test condition, thereby avoiding the use of time-consuming and complicated FOR loops to carry out this process. Thus, using data from the previous example, we could say:
SET x = sr > 10 ? 0 : pop15
meaning that for every observation of the obsv sr which is greater than 10, we will set the corresponding observation in the obsv x equal to 0. For every observation in sr not greater than 10, the corresponding observation in x will be set to the value for pop15. We would get the following results:
Obsno           sr       pop15           x
    1:    11.43000    29.35000     0.00000
    2:    12.07000    23.32000     0.00000
    3:          MD    23.80000          MD
    4:     5.75000          MD          MD
    5:    12.88000    42.19000     0.00000
    6:     8.79000    31.72000    31.72000
    7:          MD          MD          MD
    8:    11.90000    44.75000     0.00000
    9:     4.98000    46.64000    46.64000
   10:    10.78000    47.64000     0.00000
This use of a conditional statement involving the entire observation vector is the easiest way to execute a command on individual elements of an obsv.
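A minimal Python sketch of the by-observation conditional is given below. The helper names are hypothetical, None stands in for MD, and one point is an assumption of the sketch rather than something the text states: only the branch actually selected for an observation is checked for missing data.

```python
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD


def greater_than(a: Obsv, threshold: float) -> Obsv:
    """By-observation test a > threshold; a missing operand gives a missing result."""
    return [None if v is None else float(v > threshold) for v in a]


def conditional(test: Obsv, if_true: Obsv, if_false: Obsv) -> Obsv:
    """By-observation version of  test ? if_true : if_false."""
    result: Obsv = []
    for t, a, b in zip(test, if_true, if_false):
        if t is None:
            result.append(None)               # missing test gives a missing result
        else:
            result.append(a if t != 0 else b)  # assumption: only the chosen branch matters
    return result


sr = [11.43, 12.07, None, 5.75, 12.88, 8.79, None, 11.90, 4.98, 10.78]
pop15 = [29.35, 23.32, 23.80, None, 42.19, 31.72, None, 44.75, 46.64, 47.64]
zeros = [0.0] * len(sr)  # the constant 0 widened to an obsv

x = conditional(greater_than(sr, 10), zeros, pop15)
print(x)
```

The loop inside conditional() is the explicit per-observation iteration that the SET command performs implicitly.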
Like many SST commands, the expression evaluation commands allow the user to specify subsets of the data that are to be worked with using the RANGE command and the IF and OBS subops. The RANGE command sets SST's global sample vector. The global sample vector marks certain observations as missing until another RANGE statement is issued. The IF and OBS subops, which are allowed on the SET command but not the MATRIX or CALC commands, can further mask out observations for a single command. The IF subop takes as its argument a logical expression. The expression is evaluated and converted to an obsv. Missing elements or valid elements with values of zero are marked as missing in the local sample vector.
The expression routines combine the global sample vector and the local sample vector to determine which observations will be used in obsv calculations. Only observations which are not marked as missing in either the global or local sample vectors will be used in obsv calculations. We shall refer to this combined sample vector as the expression sample vector.
In the creation of new variables, observations that are masked by the expression sample vector are marked as missing in the new variable. If a variable on the left side of an assignment operator already exists, only observations that are not masked by the expression sample vector are overwritten in that variable. If we wish to mark the masked observations as missing when an existing variable is overwritten, we can use the RP subop to set them. The following example illustrates the effect of the RP subop:
SET x = -1; if[obsno != 3 && obsno != 6]    # Setup x
SET y = z = obsno                            # Create y and z
SET y = x; if[obsno % 2 == 0]                # No RP subop
SET z = x; if[obsno % 2 == 0] RP             # RP subop

Obsno            x           y           z
    1:    -1.00000     1.00000          MD
    2:    -1.00000    -1.00000    -1.00000
    3:          MD     3.00000          MD
    4:    -1.00000    -1.00000    -1.00000
    5:    -1.00000     5.00000          MD
    6:          MD          MD          MD
    7:    -1.00000     7.00000          MD
    8:    -1.00000    -1.00000    -1.00000
    9:    -1.00000     9.00000          MD
   10:    -1.00000    -1.00000    -1.00000
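The overwrite rules described above can be sketched in Python. This is an illustration under stated assumptions, not SST's implementation: None stands in for MD, the function name assign and its parameters are hypothetical, and the mask list plays the role of the expression sample vector (True means the observation is used).

```python
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD


def assign(existing: Optional[Obsv], values: Obsv, mask: List[bool], rp: bool = False) -> Obsv:
    """Assign values to a variable under a sample mask.

    Masked observations keep their old value, unless the variable is new
    (existing is None) or rp is True, in which case they become missing.
    """
    result: Obsv = []
    for i, (value, used) in enumerate(zip(values, mask)):
        if used:
            result.append(value)
        elif rp or existing is None:
            result.append(None)
        else:
            result.append(existing[i])
    return result


obsno = list(range(1, 11))
x = [None if n in (3, 6) else -1.0 for n in obsno]  # x is -1 except at observations 3 and 6
y = [float(n) for n in obsno]
z = [float(n) for n in obsno]
even = [n % 2 == 0 for n in obsno]

y = assign(y, x, even)           # no RP: odd observations keep their old values
z = assign(z, x, even, rp=True)  # RP: odd observations become missing
print(y)
print(z)
```

Note that observation 6 is missing in both results: it passes the mask, but the value copied from x is itself missing.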
Matrix expressions are evaluated with the MATRIX command. Matrices can be created by running an SST command such as REG with the COEF subop, or by entering them explicitly. An explicit matrix is created by surrounding a list of elements by braces, {} -- note that this is a change from SST 1.1, which used angle brackets. Column elements are separated by commas and the semicolon is used to indicate the end of a row. For example, the command
MATRIX {1, 2, 3; 4, 5, 6; 7, 8, 9}
will print
           [ 1]        [ 2]        [ 3]
[ 1]    1.00000     2.00000     3.00000
[ 2]    4.00000     5.00000     6.00000
[ 3]    7.00000     8.00000     9.00000
As with the CALC command, if an assignment is not specified the result of a matrix expression is printed on the terminal.
SST supports the usual operations between matrices. Addition, subtraction and multiplication of matrices are denoted by +, - and *. The operations are performed whenever the matrices have the proper dimensions. There are two matrix division symbols, \ and /. If A and B are matrices then A\B and B/A correspond to left and right multiplication of B by the inverse of A. In general A\B denotes the solution X to the equation A*X = B and B/A denotes the solution to X*A = B. Left division A\B is defined whenever B has as many rows as A. If A is square, it is factored using Gaussian elimination. The factors are then used to solve the equations A*X[:,j] = B[:,j] where B[:,j] denotes the jth column of B. The result is a matrix X with the same dimensions as B. If A is not square it is inverted in a least squares sense using pseudo inverses. Right division operates similarly.
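To make the meaning of left division concrete, here is a small Python sketch that solves A*X = B for a square A by Gaussian elimination. It covers the square case only, the function name left_divide is hypothetical, and the use of partial pivoting is an assumption of the sketch; the text does not say which pivoting strategy SST uses.

```python
from typing import List

Matrix = List[List[float]]


def left_divide(A: Matrix, B: Matrix) -> Matrix:
    """Return X such that A*X = B, for square nonsingular A (illustrative A\\B)."""
    n = len(A)
    m = len(B[0])
    # Work on an augmented copy [A | B] so the inputs are left untouched.
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting (an assumption of this sketch): pick the largest pivot.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + m):
                M[r][c] -= factor * M[col][c]
    # Back substitution, one column of B at a time: A*X[:,j] = B[:,j].
    X = [[0.0] * m for _ in range(n)]
    for j in range(m):
        for i in range(n - 1, -1, -1):
            s = M[i][n + j] - sum(M[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = s / M[i][i]
    return X


A = [[2.0, 1.0], [1.0, 3.0]]
B = [[3.0], [5.0]]
X = left_divide(A, B)
print(X)
```

The result has the same dimensions as B, matching the description above; each column of X is solved independently against the same eliminated form of A.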
It is also possible to obtain element-by-element multiplication and division. If A and B have the same dimensions, A .* B denotes the matrix whose elements are the products of the corresponding elements of A and B. Multiplication and division of matrices by a scalar also use the operators *, / and \. In addition, scalar expressions can use the .*, ./ and .\ operators (they are equivalent to *, / and \ respectively).
One additional unary operator is available for use with matrices -- the transpose operator, which looks like an apostrophe ('). The transpose operator should follow a matrix expression and will cause the rows and columns of the expression to be switched. Thus if we had assigned the value of our explicit matrix from above to the variable x we could set y to the transpose of x with the following command:
MATRIX y = x'
Relational operators are not supported for matrices, since determining whether the matrix A is "less than" the matrix B depends on what notion of "less than" you wish to use.
Assignment operators are similar to those used for obsv's. Simple assignment and assignment operations (+=, -=, ..., .*=, ./=, .\=) are all supported. The increment and decrement operators cannot be used with matrices.
The conditional operator can be used in matrix expressions, but only if the logical expression evaluates to a scalar.
Individual elements and groups of elements within matrices and obsv's may be referenced by enclosing the subscripts in square brackets, separated by a comma in the case of matrices. For example A[3,3] refers to the third row, third column of a matrix A, while x[4] refers to the fourth element of the observation vector x. For convenience, matrices for which one of the dimensions is 1 can also be referenced using a single subscript. To specify a range of possible subscripts the : can be used within the subscript: x[j:k] would specify elements j, j+1, ..., k of the obsv (or one-dimensional matrix) x. If the colon alone is specified, all valid indices are used. This provides a convenient way to access an entire row or column of a matrix -- A[:,j] is the same as the jth column of A.
Subscripts can be used both on the right side and on the left side of an assignment operator (=, +=, -=, etc). In either case only the selected elements of the matrix or obsv are used in the expression.
Obsv's provide one other method of subscripting, called indexing. Instead of using a constant or a range as a subscript, another obsv expression can be used as an index. For each observation, the corresponding observation of the referenced obsv is then used as a subscript. The subscript value is converted to an integer by dropping the fractional portion of the data. Consider the following command
SET z = x[y]
applied to the following data (shown with the result):
Obsno            x           y           z
    1:     1.00000     5.00000     7.00000
    2:     2.00000     3.00000          MD
    3:          MD     1.70000     1.00000
    4:     4.00000          MD          MD
    5:     7.00000     6.00000          MD
Notice how out of range subscripts become missing in the result. Indexing cannot be used on the left hand side of an assignment operator.
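The indexing rule can be sketched in Python using the same data. The function name index_obsv is hypothetical and None stands in for MD; the sketch follows the rules stated above (1-based subscripts, fractional part dropped, out-of-range or missing subscripts give missing results).

```python
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD


def index_obsv(x: Obsv, idx: Obsv) -> Obsv:
    """Evaluate x[idx] by observation, using 1-based observation numbers."""
    result: Obsv = []
    for k in idx:
        if k is None:
            result.append(None)          # missing subscript
            continue
        i = int(k)                       # drop the fractional portion
        if 1 <= i <= len(x):
            result.append(x[i - 1])      # may itself be missing
        else:
            result.append(None)          # out-of-range subscript
    return result


x = [1.0, 2.0, None, 4.0, 7.0]
y = [5.0, 3.0, 1.7, None, 6.0]
z = index_obsv(x, y)
print(z)
```

Observation 2 is missing because x[3] is missing, observation 4 because the subscript itself is missing, and observation 5 because the subscript 6 is out of range.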
Two variants of indexing are lagging and leading a variable. It may be desirable to shift the elements of an obsv with respect to the observation number. In other words we might wish to set y[1] to x[2] and y[2] to x[3] and so on. Shifting an observation in this manner is called leading a variable. This can be accomplished by the following command
SET y = x[+1]
Any integer can be used to vary the amount of shifting that occurs. We can also lag a variable by replacing the plus sign with a minus sign. The command
SET y = x[-5]
would set y[6] to x[1], y[7] to x[2] and so on. The values for y[1] to y[5], which would correspond to x[-4] to x[0], are marked as missing just as in the case of indexes that are out of range.
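Leads and lags can be sketched in Python with a single helper. The name shift is hypothetical and None stands in for MD; a positive k corresponds to x[+k] (a lead) and a negative k to x[-k] (a lag).

```python
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD


def shift(x: Obsv, k: int) -> Obsv:
    """Return y with y[i] = x[i + k]; references outside x become missing."""
    n = len(x)
    return [x[i + k] if 0 <= i + k < n else None for i in range(n)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
lead = shift(x, +1)   # like SET y = x[+1]
lag = shift(x, -5)    # like SET y = x[-5]
print(lead)
print(lag)
```

With eight observations, the lag by 5 leaves the first five results missing and places x[1], x[2] and x[3] in observations 6 to 8.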
SST version 1.1 used parentheses for subscripts and leads/lags of obsvs. This is no longer supported.
Subscripted names can appear on the left side of an equals sign to set the value of a part of an obsv or matrix. For matrices the variable must exist and the range of subscripts must not exceed the size of the matrix. Additionally, the size of the expression on the right side of the equals sign must match the size of the subscripted name on the left side of the equals sign. If A is a 5x5 matrix then the following assignments are acceptable:
MATRIX A[1,4] = 7+3                  #first row, fourth col
MATRIX A[1:2, 1:2] = {1, 0; 0, 1}    #upper corner
MATRIX A[:, 2:3] = {X; Y}            #second and third cols
An assignment of the form
MATRIX A[1:2, 1:2] = 1
will generate an error because the left side is a 2x2 matrix and the right side is a scalar (or 1x1 matrix).
Obsv element assignment has the same basic syntax but differs slightly in operation. Obsv's are created or extended if the variable or referenced observation does not exist. Newly created observations that are not part of the subscript range are marked as missing data. Thus to create an obsv x with the first 4 observations missing and the fifth observation equal to one, we use:
SET x[5] = 1;
Ranges can also be used (using the :). No size checking is performed -- observations on the right side of the equals sign that are outside the range of the subscript are ignored. It is possible, then, to set the 7th and 8th observations of x to the same values as the 7th and 8th observations of y:
SET x[7:8] = y;
The combination of the two SET statements gives the following values for x (assume that y was an obsv filled with -1):
 1:          MD          MD          MD
 4:          MD     1.00000          MD
 7:    -1.00000    -1.00000
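The two SET statements above can be mimicked with a short Python sketch. The function name assign_range is hypothetical and None stands in for MD. Treating a right-hand observation that does not exist as missing is an assumption of the sketch; the text does not cover that case.

```python
from typing import List, Optional, Union

Obsv = List[Optional[float]]  # None stands in for MD


def assign_range(x: Optional[Obsv], start: int, stop: int,
                 rhs: Union[float, Obsv]) -> Obsv:
    """Set observations start..stop (1-based, inclusive) of x from rhs.

    x is created or extended with missing values as needed. rhs is either a
    scalar or an obsv aligned by observation number.
    """
    result: Obsv = list(x) if x is not None else []
    if len(result) < stop:
        result.extend([None] * (stop - len(result)))
    for obs in range(start, stop + 1):
        if isinstance(rhs, (int, float)):
            result[obs - 1] = float(rhs)
        elif obs <= len(rhs):
            result[obs - 1] = rhs[obs - 1]
        else:
            result[obs - 1] = None  # assumption: no such observation on the right
    return result


y = [-1.0] * 10
x = assign_range(None, 5, 5, 1)   # like SET x[5] = 1;
x = assign_range(x, 7, 8, y)      # like SET x[7:8] = y;
print(x)
```

Only observations 7 and 8 of y are copied; the rest of y is ignored, and observation 6 of x stays missing because it was created only to extend the vector.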
The SST expression evaluator allows operands of different types to be combined in a single expression. If operands to an operator all have the same data type then the operation is performed as usual. If the operands have different data types then the operands are converted to the default data type for the command.
In SET mode operands are converted to obsv's in a mixed operation. For constants this means that the constant becomes an obsv which has both a constant value and a sample vector identical to the expression sample vector. Matrices are converted to obsv's by stacking the columns of the matrix one on top of the other and marking all observations as valid.
In MATRIX mode all mixed operands are converted to matrices. Scalars are treated as 1x1 matrices (note however that several operators support the use of scalars -- most notably *, / and \). Obsv's are converted to a column vector by ignoring all missing data items.
Mixed expressions in CALC mode are treated as obsv expressions. When the final value of a CALC expression is evaluated only the first observation of the result is used. Note that this differs from SET mode and MATRIX mode treatment of mixed expressions.
The user can force data conversions by using the var() and mat() functions described later. This is useful when the default action is not desired.
SST defines a number of variables which can be used in expressions. These include:
obsno      the number of the observation being evaluated
nobs       number of observations in the expression sample vector
maxobs     number of observations set by the RANGE command
MD         missing data
brnd       binomial (0 or 1) random variable
urnd       uniform random variable
nrnd       normal random variable
PI         the constant PI
systime    the current time (cumulative seconds since 1/1/1970)
_ser       standard error of the last REG command
_llk       log likelihood of the last MLE command
_          the last result calculated by SST
The obsno variable can be used in a number of ways. For example, one alternative to the normal syntax for lagging and leading variables is to use obsno. x[-5] is exactly equivalent to x[obsno-5] (in fact this is how x[-5] is implemented). More complicated renumbering can also be performed. The command
SET y = x[obsno*2]
would set the elements of y to the even numbered observations of x.
The value of the systime variable is an integer which represents the number of seconds elapsed since a fixed date. The reference date varies from system to system, but is often 1/1/1970.
The last result variable, _, returns the value of the last expression which was evaluated by SST. This variable can be used to break a long expression into several shorter ones without the use of temporary variables. Only the SET, CALC and MATRIX commands can change the last result variable.
SST also defines a large number of functions that can be used in expressions. To call a function, use the function name followed by an argument list contained in parentheses. To calculate the square root of 5, for example, we might issue the following CALC command:
CALC sqrt(5)
Expressions may also be included in arguments. In this case the expressions are evaluated before being passed to the function. The command
CALC power(x+y, x-y)
is equivalent to
CALC (x+y)^(x-y)
The functions used above are scalar functions: they take as arguments a list of numbers and return a single number. When a scalar function is applied to an obsv it is applied for each observation. Applying scalar functions to matrices is not allowed.
Another type of function available in SST takes a list of values as an argument and returns a scalar. An example of such a function is sum(), which takes the sum of the elements which it is passed. These vector functions work on all data types (although using them on scalar data is not particularly useful). One important point is that obsv's passed to a vector function are affected by the expression sample vector. So the following sequence of commands:
SET one = 1; OBS[1-10]
SET x = sum(one); IF[obsno % 2 == 0] OBS[1-10]
would result in a vector x whose valid observations all have the value 5 (the sum of one over the five even-numbered observations from 1 to 10).
SST provides another method to limit the observations used in a vector function: a logical expression can be supplied as a second argument. In this case the vector function is applied only to observations that are valid in the expression sample vector AND for which the corresponding element in the logical expression is valid and nonzero. So we could perform the same sum as above with the command
SET x = sum(one, obsno % 2 == 0); OBS[1-10]
It is important to remember that the logical expression does not override the expression sample vector -- it can mask observations not masked by the expression sample vector but it cannot unmask observations. As a rule, observations which are marked as missing in the global sample vector are never used for calculations.
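How the sample vector and the optional logical argument combine can be sketched in Python. The function name vsum and its parameters are hypothetical; sample plays the role of the expression sample vector (True means usable) and None stands in for MD. Skipping missing data values inside the sum is an assumption of the sketch.

```python
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD


def vsum(x: Obsv, sample: List[bool], logical: Optional[Obsv] = None) -> float:
    """Sum x over observations allowed by the sample vector and the logical argument."""
    total = 0.0
    for i, value in enumerate(x):
        if not sample[i] or value is None:
            continue  # masked by the sample vector, or missing data
        if logical is not None and (logical[i] is None or logical[i] == 0):
            continue  # logical argument is missing or false for this observation
        total += value
    return total


obsno = list(range(1, 11))
one = [1.0] * 10
even = [float(n % 2 == 0) for n in obsno]

everything = [True] * 10
without_obs2 = [n != 2 for n in obsno]  # observation 2 masked by the sample vector

s_all = vsum(one, everything, even)
s_masked = vsum(one, without_obs2, even)
print(s_all, s_masked)
```

The second call shows the rule stated above: the logical argument is true for observation 2, but it cannot unmask an observation that the sample vector has excluded.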
Another class of functions defined by SST are matrix functions. These functions generally convert all arguments to matrices and return a matrix. They are all reasonably straightforward with the exception of the SVD function. The SVD function is an example of a function which returns multiple values (it is the only such function currently defined in SST). Under normal operation the SVD function returns a vector of the singular values of the matrix passed as its argument. We may also wish, however, to get the transformation matrices which form the complete singular value decomposition (A = U * D * V'). To do this in SST we use a multiple value assignment:
MATRIX {U, D, V} = svd(A)
This returns the full singular value decomposition of A in the matrices U, D, and V. If the svd function is used in an expression or in a simple assignment then the primary return value of svd() is used (the D matrix).
Below is a list of the functions available in SST:
Functions that return variables:
exp(x)                  e^x
log(x)                  natural log of x
sqrt(x)                 square root of x
sin(x)                  sin of x (x in radians)
cos(x)                  cos of x
tan(x)                  tangent of x
cumnorm(x)              cumulative normal of x
invnorm(x)              inverse cumulative normal of x
abs(x)                  absolute value of x
bvnorm(h,k,r)           bivariate normal probability that standardized normal variates with correlation r exceed (h,k)
phi(x)                  normal probability density at x
floor(x)                largest integer less than or equal to x
ceil(x)                 smallest integer greater than or equal to x
cumchi(x,df,{nc})       cumulative chi-square of x, degrees of freedom df, noncentrality parameter nc
cumf(x,m,n,{nc})        cumulative F-probability of x, degrees of freedom m,n, and noncentrality parameter nc
cumt(x,m,{nc})          cumulative Student's t probability of x, degrees of freedom m, noncentrality parameter nc
erf(x)                  error function of x
mixchi(x, w1, ..., wn)  cdf for mixture of n chi squares, each with one degree of freedom, mixed with weights w1,...,wn
nchi(x,w1,..,wn,df1,..,dfn,nc1,..,ncn)  cdf for mixture of n chi squares, with degrees of freedom dfi and noncentrality parameters nci
invchi(p,df)            percentage point of central chi-square, degrees of freedom df
invt(p,df)              percentage point of central t, degrees of freedom df
invf(p,m,n)             percentage point of central F, degrees of freedom m,n
asin(x)                 arcsin of x (result in radians)
acos(x)                 arccos of x
atan(x)                 arctan of x
gamma(x)                complete gamma function
incgam(y,x)             incomplete gamma function, argument y, parameter x
rmills(x)               inverse Mills ratio: phi(x)/cumnorm(-x)
power(x, y)             equivalent to x^y
srnd(seed)              seed the random number generator; returns old seed
brnd                    binomial (0 or 1) random number generator
nrnd                    standard normal random number generator
urnd                    uniform random number generator
vmin(x,y,...,z)         minimum of the variables x,y,...,z
vmax(x,y,...,z)         maximum of the variables x,y,...,z
rsum(x)                 running sum

Functions that return scalars:
min(x)                  minimum element in x
max(x)                  maximum element in x
sum(x)                  sum of the values of x
mean(x)                 mean of x
stddev(x)               standard deviation of x
median(x)               median of x
quantile(x, q)          quantile of x (q=0.5 => median), giving the value of x such that a proportion q of the observations are lower
nrm(x)                  2-norm of x

Functions that operate on or return matrices:
(functions marked with a * are not implemented)
B = inv(A)              inverse of A (using LU algorithm)
B = ginv(A)             generalized inverse of A (Moore-Penrose)
{U,D,V} = svd(A)        singular value decomposition of a mxk matrix A, A = UDV', where U is mxk, D and V are kxk, U and V are column orthonormal, D is diagonal
B = A' or B = transpose(A)  transpose of A
y = tr(A)               returns trace of A
y = det(A)              returns determinant of A
B = chol(A)             Cholesky decomposition of positive definite A, A = B'B, B upper triangular
B = diag(x)             returns diagonal matrix with vector x on diagonal
y = vec(A)              returns vector containing rows of A, in sequence
y = vech(A)             returns vector containing upper triangle of A, row by row
y = vecd(A)             returns vector containing the diagonal of A
B = cumsum(A)           returns matrix containing the cumulative sums of the elements in the columns of A
B = mat(x)              converts variable to a column matrix
x = var(A)              converts column matrix to a variable
B = col(x,y,..,z)       converts variables to a matrix with columns x,y,...,z
B = {A,C,...,D}         horizontally concatenates matrices
B = {A;C;...;D}         vertically concatenates matrices
B = kron(A,C)           Kronecker product of A and C
B = ones(n,m)           returns a nxm array of ones
B = zeros(n,m)          returns a nxm array of zeros
B = eye(n) or B = eye(n,m)  returns a nxn or nxm matrix with ones down the diagonal, zeros elsewhere
B = toeplitz(x)         returns a band matrix with the first element of x down the diagonal, the second element of x one place off the diagonal, etc.
y = size(A)             returns a vector of length 1 for a variable and length 2 for a matrix, with y[1] equal to the number of rows of A and y[2] equal to the number of columns of A
B = reshape(A,n,m)      reshapes A into a matrix of dimension nxm, taking the elements row-by-row
y = maxindc(A)          returns a vector of indices of the rows of A that contain the maximum elements in each column
B = submat(A,x,y)       returns a submatrix of A with rows corresponding to the positive elements of the vector x, and columns corresponding to positive elements of the vector y

Type cast functions:
mat(x)                  convert x to a matrix
var(x)                  convert x to an obsv
scalar(x)               convert x to a scalar

Miscellaneous functions:
miss(x1,x2,...,xn)      return 1 for an observation if any of the variables x1,...,xn are missing data, 0 otherwise
col(x1, x2, ..., xn)    combine several obsv's into a matrix
mavg(x, start, stop)    running average of an obsv; e.g., mavg(x,-1,0) is a moving average of the current and one lag observation; mavg(x,-nobs,0) is a cumulative average from the start; and mavg(x,0,nobs) is a cumulative average to the end
index(x)                return the rank of each observation in a vector
unique(x)               return a sorted list of the unique values of x
{list, rank} = unique(x)  generates both the values and the rank index for a variable
Users can define their own functions using the DEFINE command. The syntax for the command is
DEFINE function(arg1, arg2, ...) = expression
The expression must be free of syntax errors but variables and other functions that are referenced by the expression need not exist until it is actually executed. When the defined function is called, the dummy arguments (arg1, arg2, etc) will be replaced by the values of the expressions used in their place in the function call. A user function to standardize an expression by deviating it from its mean and then dividing by its standard deviation is given below:
DEFINE stand(x) = (x - mean(x)) / stddev(x)
To standardize the expression x+y and place the result in z:
SET z = stand(x+y)
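For comparison, the same standardization can be sketched in Python. None stands in for MD, and two points are assumptions of the sketch rather than documented SST behavior: missing observations are skipped when computing the mean and standard deviation, and the sample (n-1) standard deviation is used, since the text does not say which divisor stddev() applies.

```python
import statistics
from typing import List, Optional

Obsv = List[Optional[float]]  # None stands in for MD


def stand(x: Obsv) -> Obsv:
    """Standardize x: subtract the mean, divide by the standard deviation."""
    valid = [v for v in x if v is not None]
    mean = statistics.mean(valid)
    sd = statistics.stdev(valid)  # assumption: sample (n-1) standard deviation
    return [None if v is None else (v - mean) / sd for v in x]


x = [1.0, 2.0, None]
y = [0.0, 0.0, 3.0]
# x + y by observation, with missing operands giving missing results.
total = [None if a is None or b is None else a + b for a, b in zip(x, y)]

z = stand([1.0, 2.0, 3.0])
z_missing = stand(total)
print(z)
print(z_missing)
```

As in the SET example, the argument expression is evaluated first and the function body is then applied to its value.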
Multiple arguments may be used in the same way as internal function calls. Recursive function calls are not allowed.
Options to the DEFINE command can be specified by separating optional subops from the expression with a semicolon. Currently only the TIME subop is allowed.
There are several types of errors that can occur when evaluating an expression.
A syntax error occurs when the expression does not follow the rules for a valid arithmetic expression. When SST discovers a syntax error it prints an error message and indicates the location where the error occurred. No further processing is performed.
If a variable or function is not found, an error message indicating the unknown variable or function is printed and processing is aborted.
If the expression is syntactically correct and has no unresolved variable or function references then execution proceeds. If an error occurs during execution (such as out of memory or divide by zero) the evaluation is once again aborted. Any assignments that depended on the expression being evaluated are not performed.
This behavior is slightly modified when observation vectors are being used in a calculation. If a math error occurs (as opposed to a memory or other fatal error) an error message is printed, but execution is not stopped. Instead, the observation which generated the error is marked as missing and evaluation proceeds to the next valid observation. Any assignments that depended on the expression being evaluated are performed as usual.
All of the expression evaluation commands support an interactive mode. To enter interactive mode, simply issue the SET, CALC or MATRIX command with no arguments. You will be prompted for input lines until you enter the keyword `quit' (this can be abbreviated as `q'). Each input line will be evaluated as if it had been preceded by the corresponding evaluation command.
All three commands will print the value of an expression if the last operation in the expression is not an assignment. We have already seen the output in CALC and MATRIX mode. In SET mode the output is a compact listing of the elements of the expression:
SST8> SET
SET9> y
 1:     1.00000    -1.00000     3.00000
 4:    -1.00000     5.00000          MD
 7:     7.00000    -1.00000     9.00000
10:    -1.00000
SST10>
The number in the leftmost column is the observation number of the next element.
In many commands an expression can be used in place of a variable name in the VAR subop. This allows the use of an expression in a command without having to explicitly create a temporary variable. To use an expression in the VAR subop the expression must be enclosed in parentheses:
SCAT VAR[x y (x+y)]
All expressions are converted to obsv's before use. This allows constants to be treated as an obsv with constant value. For example, to perform a simple regression of the variable x on a constant and the observation number we could use the command
REG DEP[x] IND[(1) (obsno)]
Expressions cannot be used in places where the variable name is required for proper operation. For example the SAVE command (which saves variables to a file) cannot be used with expressions.