Sequential Analysis Programs

1) For all programs, the data are assumed to be a series of integer codes with values ranging from 1 to whatever value the user specifies in the "ncodes" computation at the start of the program. Code values lower than 1 will produce errors. It is best not to use high code values if there are many lower values with no data, because the programs produce square matrices whose rows and columns run from 1 to ncodes (e.g., 1 to 6). Rows and columns for code values that never occur increase computation time unnecessarily and clutter the output.

2) The "labels" compute statement at the start of the programs (e.g., compute labels={"Code 1", ... in SPSS) permits users to provide labels for their chosen code values. The labels will appear on the output matrices and may facilitate interpretation of the results.

3) The significance levels for z values/adjusted residuals are based on the standard normal cumulative distribution function. See Bakeman & Quera (1995) for cautionary advice on their interpretation.

4) See Bakeman & Quera (1995, p. 282) and Bakeman & Gottman (1997, p. 144) for advice on the number of events required for proper interpretation of the results.

5) Expected frequencies: formula (1) from Bakeman & Quera (1995, p. 274) is used when consecutive codes may repeat; iterative proportional fitting is used when consecutive codes may not repeat. These expected frequencies (see the "expfreq" matrix) are used in the computations for LRX2 and the adjusted residuals. A second matrix of expected frequencies ("et") is slightly different and is used in the computations for z values and for Wampold's transformed kappa. The "expfreq" matrix is the output matrix in the SEQUENTIAL and SEQGROUPS programs. The "et" matrix may be viewed by inserting a PRINT statement at the relevant locations.

6) Maximum likelihood estimation of the expected cell frequencies using iterative proportional fitting (Wickens, 1989, pp. 107-112) is used to estimate the expected frequencies when consecutive codes may not repeat. The maximum number of iterations has been set at 100, and the convergence level has been set at .0001. If convergence does not occur with your data after 100 iterations, then try increasing the number of iterations (e.g., change 100 to 200 on the IPFLOOP), or use a less stringent convergence criterion (e.g., .001 instead of .0001).

7) The phrases "when adjacent codes may not repeat" and "when consecutive codes may not follow one another" are commonly used in the literature to refer to structural zeros in a transitional frequency matrix. However, structural zeros may also occur for other reasons (i.e., they are not restricted to the main diagonal of the transitional frequency matrix). By default, the present programs assign structural zeros to the main diagonal when users specify ADJACENT = 0 at the start of a program. However, the programs have been designed to process structural zeros occurring anywhere in the data. If your data involve structural zeros occurring in places other than along the main diagonal, then scroll down to the commands for the ONEZERO matrix and follow the instructions. The instructions merely specify that a matrix of 1s and 0s must be entered for your data.

8) Transformed kappa and the associated z values should not be interpreted when consecutive codes may not repeat (Wampold & Margolin, 1982, p. 756).

9) -9999 or 9999 are printed when values cannot be computed, usually because a computation would involve division by zero.

10) The computation and printing of unwanted values can be suppressed by converting the relevant commands into comments within the programs (i.e., by using "*" in SPSS, and "/* */" in SAS).

11) It is recommended that trial runs of permutation tests be first attempted using small numbers of permutations per block (e.g., 10) and blocks of permutations (e.g., 3).
Once you are familiar with the operation of the programs, enter the desired values and let the analyses begin. (If you are using the SPSS or SAS versions, give them lots of time; e.g., go for a long lunch and have a look when you return.)

12) The confidence intervals for the permutation tests are based on the standard normal distribution. Confidence intervals are not computed or printed for the permutation tests of significance when fewer than two blocks of permutations are requested.

13) The one-tailed permutation tests of significance are based on whether particular frequencies for the shuffled codes are greater than (if the signs of the effects for the actual data are positive) or less than (if the signs of the effects for the actual data are negative) the corresponding frequencies for the original data. The same principle is used in the computations for two-tailed tests, with one additional feature: the frequency values to be surpassed for effects in the direction opposite to that of the original effect are determined by computing additional mock "observed" values that are equally distant from the expected values, but in the opposite direction of the original effect.

14) The Fortran 77 versions of the programs are stand-alone programs that do not require access to IMSL routines. They also have their own random number generators (see the RNG FUNCTION at the end of the programs). See Onghena (1993, Beh. Res. Meths. Instr. & Comp., 25, p. 384) and Brysbaert (1991, Beh. Res. Meths. Instr. & Comp., 23, p. 45) for references. The results from permutation tests using the Fortran 77 programs will sometimes differ slightly from the results from the SAS and SPSS programs due to differences in the random number generation programs, and in the number of decimal places used in these programs. However, the various results become increasingly similar as the number of permutations increases.
The three kinds of programs (F77, SPSS and SAS) give exactly the same results when the same random numbers are used.

15) For very large numbers of permutations (e.g., 10 blocks of 10,000 permutations) the programs may produce an error message, such as the following Fortran 77 message:

*** TERMINATING a.out
*** Received signal 11 (SIGSEGV)
Segmentation Fault (core dumped)

You can get around the problem by trying a different seed, such as 97323435.

16) The code-sequence shuffling algorithm for permutation tests when adjacent codes cannot repeat has a loop limit of 10,000. This means that up to 10,000 attempts will be made to ensure that consecutive codes do not repeat in the shuffled data. For most data sets the loop will be activated very few times, although it is conceivable that peculiar code sequences may sometimes require more passes through the loop. In any case, the set maximum value of 10,000 (see the LIMIT computation in the programs) is likely to be far more than what will ever be needed. However, if nonadjacent codes are mistakenly specified when the original data consisted of codes that can repeat, then there may be many passes through the loop and the computations will take a very long time (and the results will be inappropriate because of the specification error).

17) Analyses involving large transitional frequency matrices (i.e., when there are many possible code values) may produce output matrices that are difficult to read. This is because the rows of large, square matrices may require more than one line each of printed output. Alternative, easier-to-read output in these cases can be obtained by transforming the square output matrices into output in which each row contains the results for a single cell. Here are examples of commands for doing so:

* SPSS commands for printing the results by cell.
compute out = make(ncodes*ncodes, 7, -9999).
loop #i = 1 to ncodes.
loop #j = 1 to ncodes.
compute out((ncodes*(#i-1)+#j),:) = { #i,#j,freqs(#i,#j),expfreq(#i,#j),p(#i,#j),
   zadjres(#i,#j),pzadjres(#i,#j) }.
end loop.
end loop.
print out /format="f6.2".

/* SAS commands for printing the results by cell */
out = j(ncodes*ncodes, 7, -9999);
do i = 1 to ncodes;
do j = 1 to ncodes;
out[(ncodes*(i-1)+j),] = ( i || j || freqs[i,j] || expfreq[i,j] || p[i,j] || zadjres[i,j] || pzadjres[i,j] );
end;
end;
print out;

18a) The following commands are for computing and printing Gottman-Allison-Liker z values. Insert them in the three appropriate locations in SEQUENTIAL or SEQGROUPS if you wish to obtain these values. Gottman-Allison-Liker z values and adjusted residuals will often, but not always, be very similar (see Bakeman & Gottman, 1997, p. 110).

* SPSS commands for Gottman-Allison-Liker z values.
compute zgal = make(ncodes,ncodes,-9999).
compute pzgal = make(ncodes,ncodes,1).

* Gottman-Allison-Liker z values & sig levels.
do if ( (et(#i,#j)*(1-pnr(#j,1))*(1-pnr(#i,1))) > 0).
compute zgal(#i,#j)=(freqs(#i,#j)-et(#i,#j))
   /sqrt(et(#i,#j)*(1-pnr(#j,1))*(1-pnr(#i,1))).
compute pzgal(#i,#j) = (1 - cdfnorm(abs(zgal(#i,#j))) ) * tailed.
end if.

print zgal /format "f6.3" /title="Gottman-Allison-Liker z Scores"/cnames=b/rnames=b.
print pzgal /format "f5.4"
   /title="Significance Levels for the Gottman-Allison-Liker z Scores" /cnames=b/rnames=b.

/* SAS commands for Gottman-Allison-Liker z values */
zgal = j(ncodes,ncodes,-9999);
pzgal = j(ncodes,ncodes,1);

/* Gottman-Allison-Liker z values & sig levels */
if ( (et[i,j]*(1-pnr[j,1])*(1-pnr[i,1])) > 0) then do;
zgal[i,j]=(freqs[i,j]-et[i,j])/sqrt(et[i,j]*(1-pnr[j,1])*(1-pnr[i,1]));
pzgal[i,j] = (1 - probnorm(abs(zgal[i,j])) ) * tailed;
end;

print, "Gottman-Allison-Liker z Scores", zgal[rowname=b colname=b format=6.3];
print, "Significance Levels for the Gottman-Allison-Liker z Scores", pzgal[rowname=b colname=b format=5.4];

18b) The following commands are for computing and printing Sackett z values.
Insert them in the three appropriate locations in SEQUENTIAL or SEQGROUPS if you wish to obtain these values.

* SPSS commands for Sackett z values.
compute zs = make(ncodes,ncodes,-9999).
compute pzs = make(ncodes,ncodes,1).

* Sackett z values & sig levels, see Bakeman & Gottman, 1997, p 109.
* The DO IF guard leaves -9999 in place when the denominator would be zero.
do if ( (et(#i,#j)*(1-pnr(#j,1))) > 0).
compute zs(#i,#j)=(freqs(#i,#j)-et(#i,#j))/sqrt(et(#i,#j)*(1-pnr(#j,1))).
compute pzs(#i,#j) = (1 - cdfnorm(abs(zs(#i,#j))) ) * 2.
end if.

print zs /format "f6.3" /title="Sackett z Scores"/cnames=b/rnames=b.
print pzs /format "f5.4" /title="Significance Levels for the Sackett z Scores"/cnames=b/rnames=b.

/* SAS commands for Sackett z values */
zs = j(ncodes,ncodes,-9999);
pzs = j(ncodes,ncodes,1);

/* Sackett z values & sig levels, see Bakeman & Gottman, 1997, p 109. */
/* The guard leaves -9999 in place when the denominator would be zero; */
/* the significance level is two-tailed, as in the SPSS commands.      */
if ( (et[i,j]*(1-pnr[j,1])) > 0) then do;
zs[i,j]=(freqs[i,j]-et[i,j])/sqrt(et[i,j]*(1-pnr[j,1]));
pzs[i,j] = (1 - probnorm(abs(zs[i,j])) ) * 2;
end;

print, "Sackett z Scores", zs[rowname=b colname=b format=6.3];
print, "Significance Levels for the Sackett z Scores", pzs[rowname=b colname=b format=5.4];

19) SAS commands corresponding to the SPSS commands in Table 1:

/* Table 1: SAS commands */
options nocenter nodate nonumber linesize=90; title;
proc iml;
/* data = {3,5,3,4,4,6,3,4,4,1, ... ,2,6,1}. */
/* Enter the number of codes here. */
ncodes = 6;
/* Enter the lag number for the analyses here. */
lag = 1;
freqs = j(ncodes,ncodes,0);
do c = 1 to nrow(data);
if ( c + lag <= nrow(data) ) then
   freqs[(data[c,1]),(data[(c+lag),1])] = freqs[(data[c,1]),(data[(c+lag),1])] + 1;
end;
/* Transforming the transitional frequency matrix for further analyses */
freqsout = j((ncodes*ncodes),3,-9999);
do i = 1 to ncodes;
do j = 1 to ncodes;
freqsout[(ncodes*(i-1)+j),] = ( i || j || freqs[i,j] );
end;
end;
create freqdata from freqsout[colname={"given" "target" "freq"}] ;
append from freqsout;
quit;

proc freq data=freqdata;
weight freq;
tables given * target / chisq;
run;

/* Sampling zeros must be set to very small numbers or else they are treated as structural zeros by SAS */
data freqdata;
set freqdata;
if freq = 0 then freq = 1e-20;
run;

proc catmod data=freqdata;
weight freq ;
model given * target = _response_ / noresponse noparm pred=freq prob;
loglin given target;
endsas;

20) Use the following commands to produce transitional frequency matrices from two parallel streams of data. For example, codes may be available for both husband and wife for each of a series of intervals, as in the example below. The commands produce four transition matrices.

* SPSS commands for two parallel streams of data.
matrix.
* data from Faraone & Dorfman, 1987, Psych. Bull, 101, p. 313; 0=1, 1=2.
compute data = {2,1; 2,2; 1,2; 1,2; 1,1; 1,1; 2,2; 2,2; 1,2; 1,1;
   2,2; 1,2; 2,2; 1,2; 1,1; 1,2; 1,2; 1,2; 2,1; 2,2 }.
* Enter labels for the codes (5 characters maximum), if desired, here.
compute labels={"Code 1","Code 2","Code 3","Code 4","Code 5","Code 6","Code 7",
   "Code 8","Code 9","Code 10","Code 11","Code 12","Code 13","Code 14","Code 15"}.
* Enter the number of codes here.
compute ncodes = 2.
* Enter the lag number for the analyses here.
compute lag = 1.
compute freqs11 = make(ncodes,ncodes,0).
compute freqs12 = freqs11.
compute freqs21 = freqs11.
compute freqs22 = freqs11.
loop #c = 1 to nrow(data).
do if ( #c + lag le nrow(data) ).
* stream 1 to stream 2.
compute freqs12((data(#c,1)),(data((#c+lag),2))) = freqs12((data(#c,1)),(data((#c+lag),2))) + 1.
* stream 2 to stream 1.
compute freqs21((data(#c,2)),(data((#c+lag),1))) = freqs21((data(#c,2)),(data((#c+lag),1))) + 1.
* stream 1 to stream 1.
compute freqs11((data(#c,1)),(data((#c+lag),1))) = freqs11((data(#c,1)),(data((#c+lag),1))) + 1.
* stream 2 to stream 2.
compute freqs22((data(#c,2)),(data((#c+lag),2))) = freqs22((data(#c,2)),(data((#c+lag),2))) + 1.
end if.
end loop.
compute b = labels(1,1:ncodes).
compute bb = { b,"Totals"}.
* The matrices are ncodes by ncodes with no totals, so the prints use b (as in the SAS commands).
print freqs11/title="Cell Frequencies for Stream 1 to Stream 1"/cnames=b/rnames=b.
print freqs22/title="Cell Frequencies for Stream 2 to Stream 2"/cnames=b/rnames=b.
print freqs12/title="Cell Frequencies for Stream 1 to Stream 2"/cnames=b/rnames=b.
print freqs21/title="Cell Frequencies for Stream 2 to Stream 1"/cnames=b/rnames=b.
end matrix.

/* SAS commands for two parallel streams of data. */
proc iml;
/* data from Faraone & Dorfman, 1987, Psych. Bull, 101, p. 313; 0=1, 1=2. */
data = {2 1, 2 2, 1 2, 1 2, 1 1, 1 1, 2 2, 2 2, 1 2, 1 1,
        2 2, 1 2, 2 2, 1 2, 1 1, 1 2, 1 2, 1 2, 2 1, 2 2 };
/* Enter labels for the codes (5 characters maximum), if desired, here. */
labels={Code1 Code2 Code3 Code4 Code5 Code6 Code7 Code8 Code9 Code10 Code11 Code12 Code13 Code14 Code15};
/* Enter the number of codes here. */
ncodes = 2;
/* Enter the lag number for the analyses here. */
lag = 1;
freqs11 = j(ncodes,ncodes,0);
freqs12 = freqs11;
freqs21 = freqs11;
freqs22 = freqs11;
do c = 1 to nrow(data);
if ( c + lag <= nrow(data) ) then do;
/* stream 1 to stream 2. */
freqs12[(data[c,1]),(data[(c+lag),2])] = freqs12[(data[c,1]),(data[(c+lag),2])] + 1;
/* stream 2 to stream 1. */
freqs21[(data[c,2]),(data[(c+lag),1])] = freqs21[(data[c,2]),(data[(c+lag),1])] + 1;
/* stream 1 to stream 1. */
freqs11[(data[c,1]),(data[(c+lag),1])] = freqs11[(data[c,1]),(data[(c+lag),1])] + 1;
/* stream 2 to stream 2. */
freqs22[(data[c,2]),(data[(c+lag),2])] = freqs22[(data[c,2]),(data[(c+lag),2])] + 1;
end;
end;
b = labels[1,1:ncodes];
bb = ( b || {Totals});
print, "Cell Frequencies for Stream 1 to Stream 1", freqs11[rowname=b colname=b format=7.0];
print, "Cell Frequencies for Stream 2 to Stream 2", freqs22[rowname=b colname=b format=7.0];
print, "Cell Frequencies for Stream 1 to Stream 2", freqs12[rowname=b colname=b format=7.0];
print, "Cell Frequencies for Stream 2 to Stream 1", freqs21[rowname=b colname=b format=7.0];
quit;

21) Commands for saving matrix output data:

* SPSS commands.
* Use the SAVE command to save matrix output data in an SPSS system file; simply substitute the name of the matrix you would like to save in the place of "kappa", and provide a valid file name for your system.
save kappa / outfile ='filename'.
* Use the next command to make a selected output data matrix the "active" file in SPSS.
* save kappa / outfile = * .
* Use the WRITE command to save matrix output data in an external raw data file; simply substitute the name of the matrix you would like to save in the place of "kappa", and provide a valid external file name for your system.
write kappa / outfile = 'filename' / field = 1 to 80 by 8 / mode=rectangular.

22) The output looks and reads much better if printing of the commands is suppressed.

23) The programs have been extensively tested for errors. However, the author bears no responsibility for errors that may emerge from use of the programs.
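
Illustration for notes 1, 17 and 19. For readers who want to check the lagged counting logic outside SPSS or SAS, the following is a minimal Python sketch of the same idea: count transitions from the code at position c to the code at position c + lag, then list the square matrix one cell per row. It is not part of the distributed programs. The function names transition_counts and by_cell are hypothetical, and the example sequence is only the first ten codes visible in the (elided) data line of note 19.

```python
def transition_counts(codes, ncodes, lag=1):
    """Count lagged transitions among integer codes 1..ncodes.

    Row index = given code, column index = target code (both stored 0-based).
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    if any(c < 1 or c > ncodes for c in codes):
        # Mirrors note 1: values outside 1..ncodes are not allowed.
        raise ValueError("codes must lie between 1 and ncodes")
    freqs = [[0] * ncodes for _ in range(ncodes)]
    # Pair each code with the code `lag` positions later.
    for given, target in zip(codes, codes[lag:]):
        freqs[given - 1][target - 1] += 1
    return freqs


def by_cell(freqs):
    """Flatten a square matrix into (given, target, freq) rows, one per cell."""
    return [(i + 1, j + 1, f)
            for i, row in enumerate(freqs)
            for j, f in enumerate(row)]


# Hypothetical short example: the first ten codes shown in note 19.
codes = [3, 5, 3, 4, 4, 6, 3, 4, 4, 1]
freqs = transition_counts(codes, ncodes=6, lag=1)
rows = by_cell(freqs)
```

The row for cell (i, j) sits at position ncodes*(i-1)+j in the flattened list, the same indexing used by the "out" and "freqsout" matrices in the SPSS and SAS commands.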
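
Illustration for notes 5 to 7. The sketch below shows, in Python, one common way iterative proportional fitting can be carried out when a 1/0 matrix marks the structural zeros: start from the 1/0 matrix, then alternately rescale rows and columns to the observed totals until the cell estimates stop changing. It is an assumption-laden illustration, not the code used in the programs. The function name ipf_expected is hypothetical, and the stopping rule (largest absolute change in any cell below .0001, at most 100 iterations) is one plausible reading of the settings described in note 6.

```python
def ipf_expected(freqs, onezero, max_iter=100, tol=1e-4):
    """Expected frequencies by iterative proportional fitting.

    freqs   : square list of lists of observed transition counts
    onezero : same shape, 1 = cell allowed, 0 = structural zero
    Returns (expected, converged).
    """
    n = len(freqs)
    row_tot = [sum(freqs[i]) for i in range(n)]
    col_tot = [sum(freqs[i][j] for i in range(n)) for j in range(n)]
    # Starting values: 1 in every allowed cell, 0 in every structural zero.
    est = [[float(v) for v in row] for row in onezero]
    for _ in range(max_iter):
        prev = [row[:] for row in est]
        # Step 1: rescale each row to its observed row total.
        for i in range(n):
            s = sum(est[i])
            if s > 0:
                est[i] = [v * row_tot[i] / s for v in est[i]]
        # Step 2: rescale each column to its observed column total.
        for j in range(n):
            s = sum(est[i][j] for i in range(n))
            if s > 0:
                for i in range(n):
                    est[i][j] *= col_tot[j] / s
        # Assumed convergence rule: largest absolute change in any cell.
        change = max(abs(est[i][j] - prev[i][j])
                     for i in range(n) for j in range(n))
        if change < tol:
            return est, True
    return est, False


# Hypothetical 3-code example in which consecutive codes never repeat,
# so the structural zeros lie on the main diagonal.
observed = [[0, 1, 1],
            [4, 0, 2],
            [6, 3, 0]]
onezero = [[0, 1, 1],
           [1, 0, 1],
           [1, 1, 0]]
expected, converged = ipf_expected(observed, onezero)
```

The example counts were chosen so that they already fit the no-repeat model exactly, which means the fitted values should reproduce the observed table closely. Structural zeros elsewhere in the matrix are handled the same way, by placing 0s in the corresponding cells of the 1/0 matrix.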
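
Illustration for note 3 and the Gottman-Allison-Liker and Sackett commands. The following Python sketch restates the two z formulas from those commands for a single cell, together with the significance level taken from the standard normal cumulative distribution function. It is only a reading aid. The function names are hypothetical, and it assumes that the "pnr" vector in the commands holds the proportion of events carrying each code, which the notes do not state explicitly. The input values are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_values(freq, expected, p_given, p_target):
    """Return (Sackett z, Gottman-Allison-Liker z) for one cell.

    p_given and p_target are ASSUMED to be the proportions of events with
    the given (row) code and the target (column) code, i.e. pnr(i) and pnr(j).
    None is returned where the denominator would be zero (the programs
    print -9999 in that situation).
    """
    sackett_var = expected * (1.0 - p_target)
    gal_var = sackett_var * (1.0 - p_given)
    sackett = (freq - expected) / sqrt(sackett_var) if sackett_var > 0 else None
    gal = (freq - expected) / sqrt(gal_var) if gal_var > 0 else None
    return sackett, gal


def significance(z, tailed=2):
    """Significance level from the standard normal CDF, times 1 or 2 tails."""
    return (1.0 - NormalDist().cdf(abs(z))) * tailed


# Made-up cell: observed 10, expected 6.25, both code proportions .25.
zs, zgal = z_values(freq=10, expected=6.25, p_given=0.25, p_target=0.25)
p_two_tailed = significance(zgal, tailed=2)
```

Because the Gottman-Allison-Liker denominator includes the extra factor (1 - p_given), its z value is never smaller in absolute size than the Sackett z value for the same cell.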