Level                  Filters
-----                  -------
command preprocessor   history, macros, arrays
scanner                comments, output redirection
expansion              wildcards, ranges, sets
The command preprocessor and scanner filters are covered elsewhere; this section concerns the expansion level filters only. The expansion filters are a set of filters which process the list of variable names used by certain commands and subops. The wildcard filter substitutes wildcard symbols (`*' or `?') with all possible matching variable names. The range filter selects existing variables from a variable range. The set expansion filter can be used as a shorthand for specifying a number list.
The expansion filters are only used inside those subops which take a list of variables as their argument:
CENSOR WEIGHT KEY IV BY DEP IND VAR PROB RSD SRSD PRED HAT TO IVALT MODEL
The expansion filters also process the FOREACH and ARRAY commands. The list of values for the index of a FOREACH statement is passed through the expansion filters before being broken up into individual words. Similarly, the string to the right of the equals sign in an ARRAY command is sent through the expansion filters before being parsed into a list of words.
The wildcard characters, `*' and `?', allow variable name substitution in a manner similar to MSDOS and UNIX filename substitution. If a word contains wildcard characters then the word is expanded to a list of matching variable names (separated by spaces). The following rules apply toward determining which variables match a wildcard string:
`*' matches any string of zero or more characters.
`?' matches exactly one character.
It is an error if a wildcard expression does not match any variables.
Consider the following sample list of variables:
pop15 pop45 pop75 pop105
The following examples illustrate some of the valid wildcard expansions:
pop1*   --> pop15 pop105
pop1?   --> pop15
pop1?*  --> pop15 pop105
pop?5   --> pop15 pop45 pop75
pop*5   --> pop15 pop45 pop75 pop105
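The expansions listed above can be reproduced outside SST. The following Python sketch is an illustration only, not SST's own matching code. It assumes that `*' matches zero or more characters and `?' exactly one, which agrees with the examples, and it borrows fnmatch.fnmatchcase from the Python standard library (which also gives `[' a special meaning that this manual does not describe for SST). The function name expand_wildcard is hypothetical.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern, variables):
    """Return the variables that match pattern, in their original order."""
    matches = [name for name in variables if fnmatchcase(name, pattern)]
    if not matches:
        # The manual treats a wildcard expression with no matches as an error.
        raise ValueError(f"no variables match {pattern!r}")
    return matches


variables = ["pop15", "pop45", "pop75", "pop105"]
for pattern in ["pop1*", "pop1?", "pop1?*", "pop?5", "pop*5"]:
    print(pattern, "-->", " ".join(expand_wildcard(pattern, variables)))
```

Raising an exception on an empty result mirrors the rule that an unmatched wildcard expression is an error rather than an empty list.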
Range expansion operates on strings of the form
stemXXX-stemYYY
where XXX and YYY are two positive integers such that XXX <= YYY. SST replaces such strings with a list of all variables that have the prefix stem followed by a number in the range XXX to YYY (inclusive).
Using our previous set of variables the following expansions hold:
pop10-pop75  --> pop15 pop45 pop75
pop1-pop200  --> pop15 pop45 pop75 pop105
Stem matching is performed using the wildcard matching algorithm so the characters `*' and `?' can appear in the stem:
p*1-p*100 --> pop15 pop45 pop75
It is an error if the stems are not identical or if no variables in the range exist.
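The range rules can be sketched in Python as follows. This is an illustration under stated assumptions, not SST's implementation: it assumes that a variable's number is its entire trailing run of digits (so pop105 carries the number 105, never 5), and it uses fnmatch.fnmatchcase to stand in for the wildcard matching of stems. The function name expand_range is hypothetical.

```python
import re
from fnmatch import fnmatchcase


def expand_range(text, variables):
    """Expand 'stemXXX-stemYYY' to the existing variables in that range."""
    parsed = re.fullmatch(r"(.*\D)(\d+)-(.*\D)(\d+)", text)
    if parsed is None:
        raise ValueError(f"{text!r} is not a variable range")
    stem, low, other_stem, high = parsed.groups()
    if stem != other_stem:
        raise ValueError("the two stems are not identical")
    low, high = int(low), int(high)
    if low > high:
        raise ValueError("range start exceeds range stop")

    matches = []
    for name in variables:
        # Assumption: the number is the whole trailing run of digits.
        split = re.fullmatch(r"(.*\D)(\d+)", name)
        if split is None:
            continue
        prefix, number = split.groups()
        if fnmatchcase(prefix, stem) and low <= int(number) <= high:
            matches.append(name)
    if not matches:
        raise ValueError(f"no variables exist in the range {text!r}")
    return matches


variables = ["pop15", "pop45", "pop75", "pop105"]
print(" ".join(expand_range("p*1-p*100", variables)))
```

Only variables that already exist are returned, which is why pop10-pop75 yields three names rather than sixty-six.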
A set is a string of the form {set_description} (the braces are required). The set description is one of two things: a list of words separated by commas, or a range of the form start-stop (where start and stop are integers with start <= stop).
In either case the string expands to all the possible values of the set expression. The following examples illustrate the use of set expansion:
pop{15,17,99}  --> pop15 pop17 pop99
pop{15-20}     --> pop15 pop16 pop17 pop18 pop19 pop20
In addition, a head string and tail string may be concatenated with the set for more powerful expansions:
he{5,l,ar}d --> he5d held heard
Unlike wildcard and range expansions, the expanded words do not have to already exist as variables. The set description may also contain other sets:
{a,b{1,2,3},c} --> a b1 b2 b3 c
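Set expansion, including head and tail strings and nested sets, can be sketched recursively. The Python below is a minimal illustration, not SST's code. The names split_top_level and expand_set are hypothetical, and the handling of several sets in one word (each expanded in turn) is an assumption that the manual does not state.

```python
import re


def split_top_level(text):
    """Split text on commas that are not inside nested braces."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def expand_set(word):
    """Expand the first set in word, then recurse into members and tail."""
    start = word.find("{")
    if start == -1:
        return [word]
    depth = 0
    for i in range(start, len(word)):
        if word[i] == "{":
            depth += 1
        elif word[i] == "}":
            depth -= 1
            if depth == 0:
                end = i
                break
    else:
        raise ValueError("missing }")
    head, body, tail = word[:start], word[start + 1:end], word[end + 1:]

    bounds = re.fullmatch(r"(\d+)-(\d+)", body)
    if bounds:
        low, high = int(bounds.group(1)), int(bounds.group(2))
        if low > high:
            raise ValueError("set start exceeds set stop")
        items = [str(n) for n in range(low, high + 1)]
    else:
        # A member may itself contain a set, so expand each one.
        items = [m for part in split_top_level(body) for m in expand_set(part)]
    return [head + item + rest for item in items for rest in expand_set(tail)]


print(" ".join(expand_set("{a,b{1,2,3},c}")))
```

Note that no variable list is consulted: as the text says, the expanded words do not have to exist as variables.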
Most of the commands that use the expansion filters treat the expanded string as a list of words (often variable names). In SST a list is a string of words separated by either spaces or commas. Commas and spaces inside nested parentheses and braces are not considered word separators. One consequence of this definition of a word is that the expansion filters take a list of words and generate another list of words.
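The definition of a word given above can be illustrated with a short Python sketch. It is not SST's parser: the name split_words is hypothetical, and treating a run of adjacent separators as a single separator is an assumption.

```python
def split_words(text):
    """Split a list into words at top-level spaces and commas."""
    words, current, depth = [], [], 0
    for ch in text:
        if ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch in " ," and depth == 0:
            # A separator outside all parentheses and braces ends the word.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


print(split_words("a, b{1,2} (x y) c"))
```

The comma inside b{1,2} and the space inside (x y) are kept because the nesting depth is greater than zero when they are read.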
The grammar for the expansion routines might look something like:
list:   list ' ' word
        list ',' word

word:   any number of non-special characters
        word '*' word                   # wildcard strings
        word '?' word
        word int '-' word int           # variable range
        word '{' list '}' word          # set expansion
        word '{' int '-' int '}' word
        '(' expression ')'              # parenthetical expression
There are two mechanisms available for stopping the action of the
special characters in the expansion filters. The first is the use of
the backslash character, `\'. The backslash character has the effect
of stopping the default action of any single character (this holds
for all of SST, not just the expansion routines). Thus we could use a
backslash to include a dash in a FOREACH
command:
foreach (i; cmd\-file1 cmd\-file2) { run $i }
Without the backslash, the expansion routines would treat the dash as part
of a variable range. Since cmd-file1 is not a valid variable range an
error message would be printed out. Backslashes can also be used to
include spaces and commas within a word. We might want to do this if we
want to run a FOREACH
loop over a set of variable pairs:
foreach (i; a\ b, b\ c, c\ d) { reg ind[x] dep[$i] }
This loop would execute the commands
reg ind[x] dep[a b]
reg ind[x] dep[b c]
reg ind[x] dep[c d]
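The effect of the backslash on word splitting can be sketched in Python. This is a simplified illustration, not SST's implementation: the set of characters treated as special (SPECIAL) is an assumption, the name split_escaped is hypothetical, and the sketch resolves escapes in a single pass rather than in the several processing stages that SST uses.

```python
# Assumption: these are the characters with a special meaning to the filters.
SPECIAL = set(' ,*?-{}()"\\')


def split_escaped(text):
    """Split on spaces and commas, honoring backslash escapes."""
    words, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            following = text[i + 1]
            if following in SPECIAL:
                # An escaped special character becomes an ordinary one.
                current.append(following)
            else:
                # Before a regular character the backslash passes through.
                current.append(ch + following)
            i += 2
            continue
        if ch in " ,":
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        words.append("".join(current))
    return words


print(split_escaped(r"a\ b, b\ c, c\ d"))
```

Each escaped space stays inside its word, so the loop index takes the three values "a b", "b c" and "c d".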
For compatibility with MSDOS, where a backslash is often the directory separator, backslashes in front of regular characters are passed through without modification. To enter a single backslash in front of a special character without disabling the effect of that character precede the backslash with another backslash. Thus
\\{a,b,c} --> \a \b \c
Another method of temporarily turning off the expansion filters is the use of double quotes. If a string which is passed through the expansion filters is enclosed in double quotes all characters inside the quotes lose any special meaning. In particular:
To enclose a quote in a quoted string we precede it with a backslash. All other backslashes are treated just like a regular (non-special) character (i.e. they are passed through the filter unchanged). Our previous example of multiple regressions using pairs of variables could have been implemented as follows:
foreach (i; "a b", "b c", "c d") { reg ind[x] dep[$i] }
foreach (i; \(1 2 3) echo $i
The backslash in front of the open parenthesis is intended to stop the scanner from miscounting the nesting level. If we try this, however, we get the following error message:
Error: missing )
The problem is that `\(' is converted by the scanner to `(' and is treated as a normal character (i.e. it does not increment the nesting level). When this is passed to the expansion filters they receive the following string:
(1 2 3
The expansion filters will read the unmatched open parenthesis (remember that a string contained in parentheses is considered a word) and issue an error message.
Another attempt at a solution might be to use double or triple backslashes: "\\(1 2 3" or "\\\(1 2 3". However, if we were to convert double backslashes to a single backslash at each level, then in order to enter a backslash in a FOREACH statement we would need to type 8 backslashes. For this reason SST does not convert double backslashes until the very last stage of processing.