Sequential Analysis Programs

 1) For all programs, the data are assumed to be a series
    of integer codes with values ranging from "1" to whatever value the
    user specifies in the "ncodes" computation at the start of the program.
    Code values lower than 1 will produce errors.  It is best not
    to use high code values if there are many lower values with
    no data.  This is because the programs produce matrices
    of the size 1-to-ncodes (e.g., 1 to 6).  Large matrices
    based on absent values for particular integers increase
    computation time unnecessarily, and result in unattractive output.
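
    As a rough illustration of how a 1-to-ncodes matrix is filled from a
    series of integer codes, here is a minimal Python sketch (not one of
    the distributed programs).  The function name "transition_counts" is
    hypothetical; the short sequence is the first ten codes of the Table 1
    example.  Note that the row and column for the unused code 2 stay empty.

```python
def transition_counts(codes, ncodes, lag=1):
    """Count lag-`lag` transitions between integer codes 1..ncodes."""
    for code in codes:
        if not 1 <= code <= ncodes:
            raise ValueError(f"code {code} is outside 1..{ncodes}")
    # freqs[i][j] counts how often code j+1 follows code i+1 at the given lag.
    freqs = [[0] * ncodes for _ in range(ncodes)]
    for given, target in zip(codes, codes[lag:]):
        freqs[given - 1][target - 1] += 1
    return freqs


# First ten codes of the Table 1 example, used here as a short sequence.
codes = [3, 5, 3, 4, 4, 6, 3, 4, 4, 1]
freqs = transition_counts(codes, ncodes=6)
for row in freqs:
    print(row)
```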

 2) The "labels" compute statement at the start of the programs
    (e.g., compute labels={"Code 1", ... in SPSS) permits users
    to provide labels for their chosen code values.  The labels 
    will appear on the output matrices and may facilitate
    interpretation of the results.

 3) The significance levels for z values/adjusted residuals are
    based on the standard normal cumulative distribution function.  
    See Bakeman & Quera (1995) for cautionary advice on their
    interpretation.
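
    The significance computation mirrors the expression used in the
    SPSS and SAS snippets further below, (1 - cdfnorm(abs(z))) * tailed.
    A minimal Python sketch, assuming "tailed" is 1 or 2 (the helper
    names are hypothetical):

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_significance(z, tailed=2):
    """Significance level for a z value; tailed is 1 or 2."""
    return (1.0 - normal_cdf(abs(z))) * tailed


p_two = z_significance(1.96, tailed=2)
p_one = z_significance(1.96, tailed=1)
print(round(p_two, 4), round(p_one, 4))
```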

 4) See Bakeman & Quera (1995, p. 282) and Bakeman & Gottman 
    (1997, p. 144) for advice on the number of events required
    for proper interpretation of the results.

 5) Expected frequencies: formula (1) from Bakeman & Quera (1995, p. 274)
    is used when consecutive codes may repeat; iterative proportional
    fitting is used when consecutive codes may not repeat.
    These expected frequencies (see the "expfreq" matrix)
    are used in the computations for LRX2 and the adjusted residuals.  
    A second matrix of expected frequencies ("et") is slightly
    different and is used in the computations for z values and
    for Wampold's transformed kappa.  The "expfreq" matrix is the
    output matrix in the SEQUENTIAL and SEQGROUPS programs.
    The "et" matrix may be viewed by inserting a PRINT statement
    at the relevant locations.
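
    For the case where consecutive codes may repeat, the computations
    can be sketched in Python as follows.  This sketch assumes that
    formula (1) is the usual (row total x column total) / N expectation,
    that the adjusted residual is (observed - expected) divided by
    sqrt(expected * (1 - row proportion) * (1 - column proportion)), and
    that LRX2 is the likelihood-ratio chi-square.  Check these against the
    cited sources before relying on them.  The function name is
    hypothetical, and -9999 marks values that cannot be computed.

```python
import math


def expected_and_adjusted(freqs):
    """Expected frequencies, adjusted residuals and LRX2 for a square table."""
    k = len(freqs)
    n = sum(map(sum, freqs))
    row = [sum(r) for r in freqs]
    col = [sum(c) for c in zip(*freqs)]
    # Assumed form of formula (1): row total * column total / N.
    expfreq = [[row[i] * col[j] / n for j in range(k)] for i in range(k)]
    adjres = [[-9999.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            var = expfreq[i][j] * (1 - row[i] / n) * (1 - col[j] / n)
            if var > 0:  # otherwise leave the -9999 sentinel in place
                adjres[i][j] = (freqs[i][j] - expfreq[i][j]) / math.sqrt(var)
    # Likelihood-ratio chi-square; empty cells contribute nothing.
    lrx2 = 2 * sum(
        freqs[i][j] * math.log(freqs[i][j] / expfreq[i][j])
        for i in range(k)
        for j in range(k)
        if freqs[i][j] > 0
    )
    return expfreq, adjres, lrx2


expfreq, adjres, lrx2 = expected_and_adjusted([[10, 20], [30, 40]])
```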

 6) Maximum likelihood estimation of the expected cell frequencies using
    iterative proportional fitting (Wickens, 1989, pp. 107-112) is used
    to estimate the expected frequencies when consecutive codes may
    not repeat.  The maximum number of iterations has been set at 100,
    and the convergence level has been set at .0001.  If for some reason
    convergence does not occur with your data after 100 iterations, then
    try increasing the number of iterations (e.g., change 100 to 200 on 
    the IPFLOOP), or use a less stringent convergence criterion
    (e.g., .001 instead of .0001).
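
    The general idea of iterative proportional fitting with structural
    zeros can be sketched in Python as below.  This is an illustration
    under stated assumptions, not a transcription of the IPFLOOP code:
    the starting values are the 0/1 pattern itself, rows and columns are
    rescaled in turn, and convergence is judged by the largest remaining
    difference between fitted and observed row totals.  The function and
    argument names are hypothetical; the defaults follow the limits
    quoted above (100 iterations, .0001).

```python
def ipf_expected(freqs, onezero, max_iter=100, tol=0.0001):
    """Fit expected frequencies that match the observed margins,
    keeping cells with onezero == 0 at exactly zero."""
    k = len(freqs)
    row_tot = [sum(r) for r in freqs]
    col_tot = [sum(c) for c in zip(*freqs)]
    est = [[float(v) for v in r] for r in onezero]
    for iteration in range(1, max_iter + 1):
        # Rescale each row to the observed row total.
        for i in range(k):
            s = sum(est[i])
            if s > 0:
                est[i] = [v * row_tot[i] / s for v in est[i]]
        # Rescale each column to the observed column total.
        for j in range(k):
            s = sum(est[i][j] for i in range(k))
            if s > 0:
                for i in range(k):
                    est[i][j] *= col_tot[j] / s
        # Columns now match exactly; check how far the rows have drifted.
        diff = max(abs(sum(est[i]) - row_tot[i]) for i in range(k))
        if diff < tol:
            return est, iteration
    raise RuntimeError("no convergence; raise max_iter or loosen tol")


# Hypothetical data in which codes never follow themselves.
freqs = [[0, 10, 5], [8, 0, 7], [6, 9, 0]]
onezero = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
est, iterations = ipf_expected(freqs, onezero)
```

    Raising max_iter or loosening tol in this sketch corresponds to the
    two remedies suggested above for data that fail to converge.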

 7) The phrases, "when adjacent codes may not repeat" and "when
    consecutive codes may not follow one another" are commonly
    used in the literature to refer to structural zeros in a
    transitional frequency matrix.  However, "structural zeros"
    may also occur for other reasons (i.e., they are not restricted to
    the main diagonal of the transitional frequency matrix).
    By default, the present programs assign structural zeros to the
    main diagonal when users specify ADJACENT = 0 at the start
    of a program.  However, the programs have been designed to
    process structural zeros occurring anywhere in the data.
    If your data involve structural zeros occurring in places other
    than along the main diagonal, then scroll down to the commands for
    the ONEZERO matrix and follow the instructions.  The instructions
    merely specify that a matrix of 1s and 0s must be entered
    for your data.
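
    To make the idea of a matrix of 1s and 0s concrete, here is a small
    Python sketch.  The helper names ("make_onezero", "extra_zeros",
    "violations") are hypothetical and are not part of the programs; the
    sketch only shows a pattern with structural zeros on the main
    diagonal plus one off-diagonal structural zero, and a check that the
    observed frequencies respect it.

```python
def make_onezero(ncodes, adjacent=0, extra_zeros=()):
    """Return a pattern matrix: 1 = transition possible, 0 = structural zero.

    adjacent=0 puts structural zeros on the main diagonal; extra_zeros is
    a list of (given, target) code pairs that can never occur.
    """
    onezero = [[1] * ncodes for _ in range(ncodes)]
    if adjacent == 0:
        for i in range(ncodes):
            onezero[i][i] = 0
    for given, target in extra_zeros:
        onezero[given - 1][target - 1] = 0
    return onezero


def violations(freqs, onezero):
    """List (given, target) cells with observations in a structural zero."""
    k = len(freqs)
    return [
        (i + 1, j + 1)
        for i in range(k)
        for j in range(k)
        if onezero[i][j] == 0 and freqs[i][j] > 0
    ]


onezero = make_onezero(3, adjacent=0, extra_zeros=[(1, 3)])
bad = violations([[0, 4, 1], [2, 0, 3], [5, 1, 0]], onezero)
print(onezero)  # [[0, 1, 0], [1, 0, 1], [1, 1, 0]]
print(bad)      # [(1, 3)]
```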

 8) Transformed kappa and the associated z values should not be 
    interpreted when consecutive codes may not repeat 
    (Wampold & Margolin, 1982, p. 756).

 9) -9999 or 9999 are printed when values cannot be computed, usually
    because a computation would involve division by zero.

10) The computation and printing of unwanted values can be suppressed
    by converting the relevant commands into comments within the
    programs (i.e., by using "*" in SPSS, and "/*  */" in SAS).

11) It is recommended that trial runs of permutation tests be
    first attempted using small numbers of permutations per 
    block (e.g., 10) and blocks of permutations (e.g., 3).  
    Once you are familiar with the operation of the programs, 
    enter the desired values and let the analyses begin.  
    (If you are using the SPSS or SAS versions, give them lots
    of time; e.g., go for a long lunch and have a look when you return.)

12) The confidence intervals for the permutation tests are based on
    the standard normal distribution.  Confidence intervals are
    not computed or printed for the permutation tests of significance
    when fewer than two blocks of permutations are requested.
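
    The notes above do not give the interval formula.  One plausible
    reading, sketched here in Python purely as an assumption, is that the
    significance level is estimated once per block of permutations and
    the interval is the mean across blocks plus or minus a standard
    normal critical value times the standard error of that mean.  This
    would also explain why at least two blocks are needed.  The function
    name and the 1.96 default are hypothetical.

```python
import math
import statistics


def block_confidence_interval(block_pvalues, z_crit=1.96):
    """Normal-theory interval for the mean of per-block p values.

    Returns None when fewer than two blocks are supplied, because the
    between-block standard deviation is then undefined.
    """
    if len(block_pvalues) < 2:
        return None
    mean_p = statistics.mean(block_pvalues)
    se = statistics.stdev(block_pvalues) / math.sqrt(len(block_pvalues))
    return mean_p - z_crit * se, mean_p + z_crit * se


interval = block_confidence_interval([0.04, 0.06, 0.05])
single = block_confidence_interval([0.05])
```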

13) The one-tailed permutation tests of significance are based on
    whether particular frequencies for the shuffled codes are greater
    than (if the signs of the effects for the actual data are positive)
    or less than (if the signs of the effects for the actual data
    are negative) the corresponding frequencies for the original
    data.  The same principle is used in the computations for
    two-tailed tests, with one additional feature: the frequency
    values to be surpassed for effects in the direction opposite
    to that of the original effect are determined by computing
    additional mock "observed" values that are equally distant from the
    expected values, but in the opposite direction of the original effect.
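
    The one- and two-tailed logic for a single cell can be sketched in
    Python as follows, for data in which codes may repeat.  This is an
    illustration and not the programs' code: the function names are
    hypothetical, the expected value is assumed to be (row total x
    column total) / N, and strict comparisons are used because the text
    above says "greater than" and "less than".  The mock "observed"
    value is taken to be 2 * expected - observed, i.e., the same
    distance from the expected value on the opposite side.

```python
import random


def count_transition(codes, given, target, lag=1):
    """Number of times `target` follows `given` at the given lag."""
    return sum(1 for a, b in zip(codes, codes[lag:]) if a == given and b == target)


def permutation_pvalues(codes, given, target, expected, nperms=1000, seed=1):
    """Return (one_tailed, two_tailed) permutation p values for one cell."""
    rng = random.Random(seed)
    observed = count_transition(codes, given, target)
    mock = 2 * expected - observed  # mirror image of observed around expected
    positive = observed >= expected
    same_dir = 0
    opposite_dir = 0
    shuffled = list(codes)
    for _ in range(nperms):
        rng.shuffle(shuffled)
        freq = count_transition(shuffled, given, target)
        if positive:
            if freq > observed:
                same_dir += 1
            elif freq < mock:
                opposite_dir += 1
        else:
            if freq < observed:
                same_dir += 1
            elif freq > mock:
                opposite_dir += 1
    return same_dir / nperms, (same_dir + opposite_dir) / nperms


codes = [1, 2, 1, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2, 1, 2, 1]
n_pairs = len(codes) - 1
row_total = codes[:-1].count(1)  # how often code 1 is the "given" code
col_total = codes[1:].count(2)   # how often code 2 is the "target" code
expected = row_total * col_total / n_pairs
one_tailed, two_tailed = permutation_pvalues(codes, 1, 2, expected)
```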

14) The Fortran 77 versions of the programs are stand-alone programs
    that do not require access to IMSL routines.  They also have their
    own random number generators (see the RNG FUNCTION at the end of
    the programs).  See Onghena (1993, Beh. Res. Meths. Instr. & Comp.,
    25, p. 384) and Brysbaert (1991, Beh. Res. Meths. Instr. & Comp.,
    23, p. 45) for references.  The results from permutation tests
    using the Fortran 77 programs will sometimes differ slightly
    from the results from the SAS and SPSS programs due to differences
    in the random number generation programs, and in the number
    of decimal places used in these programs.  However, the various
    results become increasingly similar as the number of
    permutations increases.  The three kinds of programs (F77, SPSS
    and SAS) give exactly the same results when the same random numbers
    are used.

15) For very large numbers of permutations (e.g., 10 blocks of 10,000
    permutations) the programs may produce an error message, such as
    the following Fortran 77 message:

    *** TERMINATING a.out
    *** Received signal 11 (SIGSEGV)
    Segmentation Fault (core dumped)

    You can get around the problem by trying a different seed,
    such as 97323435.

16) The code-sequence shuffling algorithm for permutation tests
    when adjacent codes cannot repeat has a loop limit
    of 10,000.  This means that up to 10,000 attempts will be
    made to ensure that consecutive codes do not repeat in
    the shuffled data.  For most data sets the loop will be
    activated very few times, although it is conceivable that
    peculiar code sequences may sometimes require more passes
    through the loop.  In any case, the set maximum value of
    10,000 (see the LIMIT computation in the programs) is likely
    to be far more than what will ever be needed.  However, if
    nonadjacent codes are mistakenly specified when the original
    data consisted of codes that can repeat, then it is possible
    that there will be many passes through the loop and the
    computations will take very long (and the results will
    obviously be inappropriate due to the specification error).
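
    A capped repair loop of this general kind can be sketched in Python
    as below.  The programs' actual shuffling algorithm is not described
    here, so the strategy shown (shuffle once, then swap an offending
    code with a randomly chosen position until no adjacent repeats
    remain) is an assumption made for illustration; the function name is
    hypothetical and "limit" plays the role of LIMIT.

```python
import random


def shuffle_no_repeats(codes, rng, limit=10_000):
    """Shuffle codes so that no two adjacent codes are equal.

    Returns (shuffled, attempts).  Raises RuntimeError if the limit is
    reached, e.g. when the data cannot be arranged without repeats.
    """
    seq = list(codes)
    rng.shuffle(seq)
    for attempts in range(limit + 1):
        repeats = [i for i in range(1, len(seq)) if seq[i] == seq[i - 1]]
        if not repeats:
            return seq, attempts
        if attempts == limit:
            break
        # Swap one offending code with a randomly chosen position.
        i = rng.choice(repeats)
        j = rng.randrange(len(seq))
        seq[i], seq[j] = seq[j], seq[i]
    raise RuntimeError(f"adjacent repeats remain after {limit} attempts")


data = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4]
shuffled, attempts = shuffle_no_repeats(data, random.Random(7))
```

    With data such as [1, 1, 1, 2], where repeats are unavoidable, the
    sketch runs to the limit and then fails, which parallels the long
    run times described above for a mistaken nonadjacent specification.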
    
17) Analyses involving large transitional frequency matrices
    (i.e., when there are many possible code values) may produce
    output matrices that are difficult to read.  This is because
    the rows of large, square matrices may require more than one
    line each of printed output.  Alternative, easier-to-read
    output in these cases can be obtained by transforming the
    square output matrices into output in which each row contains
    results for single cells.  Here are examples of commands for
    doing so:

* SPSS commands for printing the results by cell.
compute out  = make(ncodes*ncodes, 7, -9999).
loop #i = 1 to ncodes.
loop #j = 1 to ncodes.
compute out((ncodes*(#i-1)+#j),:) =
  { #i,#j,freqs(#i,#j),expfreq(#i,#j),p(#i,#j),zadjres(#i,#j),pzadjres(#i,#j) }.
end loop.
end loop.
print out  /format="f6.2".

/* SAS commands for printing the results by cell */
out  = j(ncodes*ncodes, 7, -9999);
do i = 1 to ncodes;
do j = 1 to ncodes;
out[(ncodes*(i-1)+j),] =
  ( i || j || freqs[i,j] || expfreq[i,j] || p[i,j] ||
    zadjres[i,j] || pzadjres[i,j] );
end;
end;
print out;


18) The following commands are for computing and printing 
    Gottman-Allison-Liker z values.  Insert them in the (3)
    appropriate locations in SEQUENTIAL or SEQGROUPS if you
    wish to obtain these values.  Gottman-Allison-Liker 
    z values and adjusted residuals will often, but not always,
    be very similar (see Bakeman & Gottman, 1997, p. 110).

* SPSS commands for Gottman-Allison-Liker z values.
compute zgal     = make(ncodes,ncodes,-9999).
compute pzgal    = make(ncodes,ncodes,1).

* Gottman-Allison-Liker z values & sig levels.
do if ( (et(#i,#j)*(1-pnr(#j,1))*(1-pnr(#i,1))) > 0).
compute zgal(#i,#j)=(freqs(#i,#j)-et(#i,#j))
   /sqrt(et(#i,#j)*(1-pnr(#j,1))*(1-pnr(#i,1))).
compute pzgal(#i,#j) = (1 - cdfnorm(abs(zgal(#i,#j))) ) * tailed.
end if.

print zgal  /format "f6.3" 
 /title="Gottman-Allison-Liker z Scores"/cnames=b/rnames=b.
print pzgal /format "f5.4" 
 /title="Significance Levels for the Gottman-Allison-Liker z Scores"
 /cnames=b/rnames=b.


/* SAS commands for Gottman-Allison-Liker z values */
zgal     = j(ncodes,ncodes,-9999);
pzgal    = j(ncodes,ncodes,1);

/* Gottman-Allison-Liker z values & sig levels */
if ( (et[i,j]*(1-pnr[j,1])*(1-pnr[i,1])) > 0)   then do;
zgal[i,j]=(freqs[i,j]-et[i,j])/sqrt(et[i,j]*(1-pnr[j,1])*(1-pnr[i,1]));
pzgal[i,j] = (1 - probnorm(abs(zgal[i,j])) ) * tailed;
end;

print, "Gottman-Allison-Liker z Scores", zgal[rowname=b colname=b format=6.3];
print, "Significance Levels for the Gottman-Allison-Liker z Scores",
        pzgal[rowname=b colname=b format=5.4];



18) The following commands are for computing and printing Sackett
    z values.  Insert them in the (3) appropriate locations in 
    SEQUENTIAL or SEQGROUPS if you wish to obtain these values.

* SPSS commands for Sackett z values.
compute zs       = make(ncodes,ncodes,-9999).
compute pzs      = make(ncodes,ncodes,1).

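* Sackett z values & sig levels, see Bakeman & Gottman, 1997, p. 109.
* Cells with a zero denominator keep their initial -9999 value.
do if ( (et(#i,#j)*(1-pnr(#j,1))) > 0).
compute zs(#i,#j)=(freqs(#i,#j)-et(#i,#j))/sqrt(et(#i,#j)*(1-pnr(#j,1))).
compute pzs(#i,#j) = (1 - cdfnorm(abs(zs(#i,#j))) ) * tailed.
end if.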

print zs  /format "f6.3" /title="Sackett z Scores"/cnames=b/rnames=b.
print pzs /format "f5.4"
     /title="Significance Levels for the Sackett z Scores"/cnames=b/rnames=b.


/* SAS commands for Sackett z values */
zs       = j(ncodes,ncodes,-9999);
pzs      = j(ncodes,ncodes,1);

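/* Sackett z values & sig levels, see Bakeman & Gottman, 1997, p. 109. */
/* Cells with a zero denominator keep their initial -9999 value. */
if ( (et[i,j]*(1-pnr[j,1])) > 0 )   then do;
zs[i,j]=(freqs[i,j]-et[i,j])/sqrt(et[i,j]*(1-pnr[j,1]));
pzs[i,j] = (1 - probnorm(abs(zs[i,j])) ) * tailed;
end;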

print, "Sackett z Scores", zs[rowname=b colname=b format=6.3];
print, "Significance Levels for the Sackett z Scores",
        pzs[rowname=b colname=b format=5.4];



19) SAS commands corresponding to the SPSS commands in Table 1:

/*  Table 1: SAS commands */
options nocenter nodate nonumber  linesize=90;  title;
proc iml;
/* data = {3,5,3,4,4,6,3,4,4,1, ... ,2,6,1}. */
/* Enter the number of codes here. */
ncodes = 6;
/* Enter the lag number for the analyses here. */
lag = 1;
freqs = j(ncodes,ncodes,0);
do c = 1 to nrow(data);
if ( c + lag <= nrow(data) )   then
freqs[(data[c,1]),(data[(c+lag),1])] =
freqs[(data[c,1]),(data[(c+lag),1])]  + 1;
end;
/* Transforming the transitional frequency matrix for further analyses */
freqsout = j((ncodes*ncodes),3,-9999);
do i = 1 to ncodes;
do j = 1 to ncodes;
freqsout[(ncodes*(i-1)+j),] = ( i || j || freqs[i,j] );
end;
end;
create freqdata from freqsout[colname={"given" "target" "freq"}] ;
append from freqsout;
quit;
proc freq data=freqdata; weight freq; tables given * target / chisq;
run;
/* Sampling zeros must be set to very small numbers or else they
   are treated as structural zeros by SAS */
data freqdata;
 set freqdata;
 if freq = 0  then freq = 1e-20;
run;
proc catmod data=freqdata; weight freq ;
  model given * target = _response_  / noresponse noparm pred=freq prob;
  loglin given target;
endsas;


20) Use the following commands to produce transitional frequency
    matrices from two parallel streams of data.  For example, codes
    may be available for both husband and wife for each of a series
    of intervals, as in the example below.  This produces 4 transition
    matrices.

* SPSS commands for two parallel streams of data.
matrix.
* data from Faraone & Dorfman, 1987, Psych. Bull, 101, p. 313;  0=1, 1=2.
compute data = {2,1; 2,2; 1,2; 1,2; 1,1; 1,1; 2,2; 2,2; 1,2; 1,1;
                2,2; 1,2; 2,2; 1,2; 1,1; 1,2; 1,2; 1,2; 2,1; 2,2 }.

* Enter labels for the codes (5 characters maximum), if desired, here.
compute labels={"Code 1","Code 2","Code 3","Code 4","Code 5","Code 6","Code 7",
  "Code 8","Code 9","Code 10","Code 11","Code 12","Code 13","Code 14","Code 15"}.

* Enter the number of codes here.
compute ncodes = 2.

* Enter the lag number for the analyses here.
compute lag = 1.

compute freqs11 = make(ncodes,ncodes,0).
compute freqs12 = freqs11.
compute freqs21 = freqs11.
compute freqs22 = freqs11.
loop #c = 1 to nrow(data).
do if ( #c + lag le nrow(data) ).
* stream 1 to stream 2.
compute freqs12((data(#c,1)),(data((#c+lag),2))) =
        freqs12((data(#c,1)),(data((#c+lag),2)))  + 1.
* stream 2 to stream 1.
compute freqs21((data(#c,2)),(data((#c+lag),1))) =
        freqs21((data(#c,2)),(data((#c+lag),1)))  + 1.
* stream 1 to stream 1.
compute freqs11((data(#c,1)),(data((#c+lag),1))) =
        freqs11((data(#c,1)),(data((#c+lag),1)))  + 1.
* stream 2 to stream 2.
compute freqs22((data(#c,2)),(data((#c+lag),2))) =
        freqs22((data(#c,2)),(data((#c+lag),2)))  + 1.
end if.
end loop.
compute b = labels(1,1:ncodes).
compute bb = { b,"Totals"}.
print freqs11/title="Cell Frequencies for Stream 1 to Stream 1"/cnames=bb/rnames=bb.
print freqs22/title="Cell Frequencies for Stream 2 to Stream 2"/cnames=bb/rnames=bb.
print freqs12/title="Cell Frequencies for Stream 1 to Stream 2"/cnames=bb/rnames=bb.
print freqs21/title="Cell Frequencies for Stream 2 to Stream 1"/cnames=bb/rnames=bb.
end matrix.


/* SAS commands for two parallel streams of data. */
proc iml;
/* data from Faraone & Dorfman, 1987, Psych. Bull, 101, p. 313;  0=1, 1=2. */
data = {2 1, 2 2, 1 2, 1 2, 1 1, 1 1, 2 2, 2 2, 1 2, 1 1, 2 2, 1 2, 2 2,
        1 2, 1 1, 1 2, 1 2, 1 2, 2 1, 2 2 };

/* Enter labels for the codes (5 characters maximum), if desired, here. */
labels={Code1 Code2 Code3 Code4   Code5  Code6  Code7
        Code8 Code9 Code10 Code11 Code12 Code13 Code14 Code15};

/* Enter the number of codes here. */
ncodes = 2;

/* Enter the lag number for the analyses here. */
lag = 1;

freqs11 = j(ncodes,ncodes,0);
freqs12 = freqs11;
freqs21 = freqs11;
freqs22 = freqs11;
do c = 1 to nrow(data);
if ( c + lag <= nrow(data) )  then do;
/* stream 1 to stream 2. */
freqs12[(data[c,1]),(data[(c+lag),2])] =
freqs12[(data[c,1]),(data[(c+lag),2])]  + 1;
/* stream 2 to stream 1. */
freqs21[(data[c,2]),(data[(c+lag),1])] =
freqs21[(data[c,2]),(data[(c+lag),1])]  + 1;
/* stream 1 to stream 1. */
freqs11[(data[c,1]),(data[(c+lag),1])] =
freqs11[(data[c,1]),(data[(c+lag),1])]  + 1;
/* stream 2 to stream 2. */
freqs22[(data[c,2]),(data[(c+lag),2])] =
freqs22[(data[c,2]),(data[(c+lag),2])]  + 1;
end;
end;
b = labels[1,1:ncodes];
bb = ( b || {Totals});
print, "Cell Frequencies for Stream 1 to Stream 1",
        freqs11[rowname=b colname=b format=7.0];
print, "Cell Frequencies for Stream 2 to Stream 2",
        freqs22[rowname=b colname=b format=7.0];
print, "Cell Frequencies for Stream 1 to Stream 2",
        freqs12[rowname=b colname=b format=7.0];
print, "Cell Frequencies for Stream 2 to Stream 1",
        freqs21[rowname=b colname=b format=7.0];
quit;


21) Commands for saving matrix output data:

*SPSS commands.

* Use the SAVE command to save matrix output data in an SPSS system file;
  simply substitute the name of the matrix you would like to save in the place
  of "kappa", and provide a valid file name for your system.
save kappa / outfile ='filename'.

* Use the next command to make a selected output data matrix the "active" file in SPSS.
* save kappa / outfile = * .

* Use the WRITE command to save matrix output data in an external raw data file;
  simply substitute the name of the matrix you would like to save in the place
  of "kappa", and provide a valid external file name for your system.

write kappa / outfile = 'filename'
            / field = 1 to 80 by 8 / mode=rectangular.


22) The output looks and reads much better if printing of
    the commands is suppressed.

23) The programs have been extensively tested for errors.  
    However, the author bears no responsibility for errors
    that may emerge from use of the programs.