* Moderated Regression For A 2-way Interaction Between Continuous Variables.


**************** START OF TRIAL-RUN DATA *****************************.
* The following commands generate artificial data that can be used for
* a trial-run of the program. Just run this whole file.

new file.
input program.
loop #a=1 to 200.
compute idv = normal (1).
compute moder = normal (1).
compute e = uniform (1).
end case.
end loop.
end file.
end input program.
compute dv = 8.7*idv + -1.49*moder + 1.23*(idv* moder) + e*30.
descriptives var = all.

**************** END OF TRIAL-RUN DATA ************************************.


set mxloops=90000 printback=off width=80 seed = 1953125.
matrix.

* Specify the data to be analyzed on the following GET statement, as in the example below;
* "FILE = * " will use the currently active SPSS data set;
* "FILE = C:\filename" will use the specified SPSS data file on your computer;
* On the GET statement, the data matrix must be named DATA (as in the example);
* Enter the names of the variables to be analyzed from your data set after "VAR = ";
* The order of the variable names must be: IDV, Moderator, DV,
* with a comma between each variable name.
* Without "VAR =", GET reads every variable in the file in file order, which is
* correct only when the file holds exactly IDV, Moderator, DV in that order;
* in the trial-run data the third variable is e, not dv.
get data / file = * / var = idv, moder, dv.
*get data / file = * .

* Specify the number of chunks for the IDV.
compute chunkIDV = 7 .

* Specify the number of chunks for the MOD.
compute chunkMOD = 7 .

* Specify the method of determining highest & lowest values for the IDV & MOD
* for an optimal design
* Enter 1 to let the program do it automatically
* Enter 2 to use your own preferred values.
compute optdes = 1.

* If you entered 2 on the above "optdes =" statement, then enter your preferred
* values on the next four statements:
* Enter the lowest value for the IDV in an optimal design.
compute lowIDV = 1.
* Enter the highest value for the IDV in an optimal design.
compute highIDV = 5.
* Enter the lowest value for the MOD in an optimal design.
compute lowMOD = 1.
* Enter the highest value for the MOD in an optimal design.
compute highMOD = 5.

* Specify the # of randomized data sets for the randomization
* test of statistical significance. Recommendations:
* use 100 data sets for a trial run, but use 1000 or more for final results.
compute permutes = 100.

* End of required user specifications.


compute idv = data(:,1).
compute moder = data(:,2).
compute dv = data(:,3).

print /title="Moderated Regression For A 2-way Interaction Between Continuous Variables:".

* sample size.
compute bigN = nrow(idv).

* centering the IDV & MOD.
compute idv = idv - (csum(idv)/bigN).
compute moder = moder - (csum(moder)/nrow(moder)).

* 2-D frequency distribution.
compute IDVmax = cmax(idv) + .00000001.
compute IDVmin = cmin(idv) - .00000001.
compute MODmax = cmax(moder) + .00000001.
compute MODmin = cmin(moder) - .00000001.
compute increm1 = (IDVmax - IDVmin) / chunkIDV.
compute increm2 = (MODmax - MODmin) / chunkMOD.
compute freqs = make(chunkIDV, chunkMOD, 0).
loop #luper = 1 to bigN.
compute IDVchun1 = ( idv(#luper,1) - IDVmin ) / increm1 .
do if ( (IDVchun1 - trunc(IDVchun1)) > 0).
compute IDVchunk = trunc(IDVchun1) + 1.
else.
compute IDVchunk = trunc(IDVchun1).
end if.
compute MODchun1 = ( moder(#luper,1) - MODmin ) / increm2 .
do if ( (MODchun1 - trunc(MODchun1)) > 0).
compute MODchunk = trunc(MODchun1) + 1.
else.
compute MODchunk = trunc(MODchun1).
end if.
compute freqs( IDVchunk, MODchunk ) = freqs( IDVchunk, MODchunk ) + 1.
end loop.

* Moderated Regression.
compute x = idv &* moder.
compute datam = {idv, moder, x, dv}.
* mean, sd, & correlation matrix (Bernstein, p. 77-79).
compute rawsp = t(datam) * datam .
compute rsums = t(csum(datam)).
compute mn = t(rsums) / bigN.
compute corsp = rawsp - (1/bigN) * (rsums) * t(rsums) .
compute vcv = corsp * (1/(bigN-1)).
compute sd = t(sqrt(diag(vcv))).
compute d = inv(mdiag(sqrt(diag(vcv)))).
compute cr = d * vcv * d.
compute beta = inv(cr(1:3,1:3)) * cr(1:3,4).
compute b = (sd(1,4) &/ sd(1,1:3)) &* t(beta).
compute a = mn(1,4) - rsum( mn(1,1:3) &* b ) .
compute r2all = t(beta) * cr(1:3,4).
compute r2main = t(inv(cr(1:2,1:2))*cr(1:2,4)) * cr(1:2,4).
compute r2chXn = r2all - r2main.
compute F = (r2all-r2main) / ((1-r2all)/(bigN-3-1)).
compute dferror = bigN - 3 - 1.
compute pF = 1 - fcdf(F,1,dferror).

* Moderated Regression including quadratic terms.
compute idvq = idv &* idv.
compute modq = moder &* moder.
compute datamq = { idv, moder, idvq, modq, x, dv }.
* mean, sd, & correlation matrix (Bernstein, p. 77-79).
compute rawspq = t(datamq) * datamq .
compute rsumsq = t(csum(datamq)).
compute mnq = t(rsumsq) / bigN.
compute corspq = rawspq - (1/bigN) * (rsumsq) * t(rsumsq) .
compute vcvq = corspq * (1/(bigN-1)).
compute sdq = t(sqrt(diag(vcvq))).
compute dq = inv(mdiag(sqrt(diag(vcvq)))).
compute crq = dq * vcvq * dq.
compute betaq = inv(crq(1:5,1:5)) * crq(1:5,6).
compute bq = (sdq(1,6) &/ sdq(1,1:5)) &* t(betaq).
compute aq = mnq(1,6) - rsum( mnq(1,1:5) &* bq ) .
compute r2allq = t(betaq) * crq(1:5,6).
compute r2mainq = t(inv(crq(1:4,1:4))*crq(1:4,6)) * crq(1:4,6).
compute r2chXnq = r2allq - r2mainq.
compute Fq = (r2allq-r2mainq) / ((1-r2allq)/(bigN-5-1)).
compute dferrorq = bigN - 5 - 1.
compute pFq = 1 - fcdf(Fq,1,dferrorq).

* f-squared = the proportion of systematic variance accounted for by the effect
* relative to unexplained variance in the criterion (A & W, 1991, p. 157).
compute fsquare = (r2all - r2main) / (1 - r2all).
compute fsquareq = (r2allq - r2mainq) / (1 - r2allq).

* mse Darlington p 121.
compute xx = { make(bigN,1,1), idv, moder, x }.
compute sse = t(dv) * dv - t({ a; t(b)}) * t(xx) * dv.
compute mse = sse / ( bigN - 3 - 1).

* Vxzxz = the residual variance of the product XZ after controlling for X & Z (p. 378).
* 2 ways of computing Vxzxz
* method 1 = the MSE that results from regressing XZ on X & Z (p. 381)
* method 2 = using the full mod reg equation, =
* MSE / (n * stand. error of the estimate for the product&**2) (p. 381).

* method 2: the full mod reg equation approach (yields slightly different results in unusual situations).
* Cohen & Cohen, 2003, p. 632.
compute Rijm1 = inv(cr(1:3,1:3)).
compute SExn = ( sd(1,4) &/ sd(1,3) ) * sqrt( (1 - r2all) / ( bigN - 3 - 1) ) * sqrt( Rijm1(3,3) ).
compute Vxzxz2 = mse / ( ( bigN - 3 - 1) * SExn&**2 ).

* method 1: the approach using the MSE that results from regressing XZ on X & Z.
compute datam2 = {idv, moder, x}.
* mean, sd, & correlation matrix (Bernstein, p. 77-79).
compute rawsp2 = t(datam2) * datam2 .
compute rsums2 = t(csum(datam2)).
compute mn2 = t(rsums2) / bigN.
compute corsp2 = rawsp2 - (1/bigN) * (rsums2) * t(rsums2) .
compute vcv2 = corsp2 * (1/(bigN-1)).
compute sd2 = t(sqrt(diag(vcv2))).
compute d2 = inv(mdiag(sqrt(diag(vcv2)))).
compute cr2 = d2 * vcv2 * d2.
compute beta2 = inv(cr2(1:2,1:2)) * cr2(1:2,3).
compute b2 = (sd2(1,3) &/ sd2(1,1:2)) &* t(beta2).
compute a2 = mn2(1,3) - rsum( mn2(1,1:2) &* b2 ).
compute xx2 = { make(bigN,1,1), idv, moder }.
compute sse2 = t(x) * x - t({ a2; t(b2)}) * t(xx2) * x.
* mse Darlington p 121.
compute mse2 = sse2 / ( bigN - 2 -1).
compute Vxzxz1 = mse2.

* PRE, & Relative efficiency / optimal design values.

do if ( optdes = 1).
* Using the mean scores in the top and bottom chunks, based on the specified chunk sizes.
compute idvtop = -9999.
compute idvbot = -9999.
compute modtop = -9999.
compute modbot = -9999.
loop #luper = 1 to bigN.
do if ( idv(#luper,1) > (IDVmax - increm1) ).
compute idvtop = { idvtop ; idv(#luper,1) }.
end if.
do if ( idv(#luper,1) < (IDVmin + increm1) ).
compute idvbot = { idvbot ; idv(#luper,1) }.
end if.
do if ( moder(#luper,1) > (MODmax - increm2) ).
compute modtop = { modtop ; moder(#luper,1) }.
end if.
do if ( moder(#luper,1) < (MODmin + increm2) ).
compute modbot = { modbot ; moder(#luper,1) }.
end if.
end loop.
compute idvtop = idvtop(2:nrow(idvtop),1).
compute idvbot = idvbot(2:nrow(idvbot),1).
compute modtop = modtop(2:nrow(modtop),1).
compute modbot = modbot(2:nrow(modbot),1).
compute highIDV = csum(idvtop) / nrow(idvtop).
compute lowIDV = csum(idvbot) / nrow(idvbot).
compute highMOD = csum(modtop) / nrow(modtop).
compute lowMOD = csum(modbot) / nrow(modbot).
end if.

* maximum value of Vxzxz (p. 381, 383).
compute maxVxzxz = (( (highIDV - lowIDV) / 2 ) &**2 ) * (( (highMOD - lowMOD) / 2 ) &**2 ).

* Relative Efficiency .
compute releffic = Vxzxz1 / maxVxzxz.

* PRE squared partial correlation = proportional reduction in error (p 377, 384).
* = the model improvement due to adding the product term.
compute PRE = 1 / ( 1+ ( mse / (b(1,3)&**2 * Vxzxz1) ) ).

* PRE for an optimal design p 384.
compute PRE2 = 1 / ( 1+ ( mse / (b(1,3)&**2 * maxVxzxz) ) ).

print bigN / title="Sample Size".
print mn(1,1) / format "f9.3"/title="IDV Mean: ".
print sd(1,1) /format "f9.3" /title="IDV Standard Deviation:".
print IDVmin /format "f9.3" /title="IDV Lowest Score:".
print IDVmax /format "f9.3" /title="IDV Highest Score:".
print mn(1,2) /format "f9.3" /title="MOD Mean: ".
print sd(1,2) /format "f9.3" /title="MOD Standard Deviation:".
print MODmin /format "f9.3" /title="MOD Lowest Score:".
print MODmax /format "f9.3" /title="MOD Highest Score:".
print chunkIDV /title="Specified # of chunks for the IDV".
print chunkMOD /title="Specified # of chunks for the MOD".
print lowIDV /format "f9.3" /title="IDV Low Value for an Optimal Design:".
print highIDV /format "f9.3" /title="IDV High Value for an Optimal Design:".
print lowMOD /format "f9.3" /title="MOD Low Value for an Optimal Design:".
print highMOD /format "f9.3" /title="MOD High Value for an Optimal Design:".
print freqs /title="Joint Frequencies of the IDV & MOD".
print Vxzxz1 /format "f9.3" /title="Vxzxz1 (residual variance of the product term)".
print Vxzxz2 /format "f9.3" /title="Vxzxz2 (residual variance of the product term)".
print maxVxzxz /format "f9.3" /title="maxVxzxz (maximum possible value of Vxzxz)".
print releffic /format "f9.3" /title="Relative Efficiency (Vxzxz / maxVxzxz)".
print PRE /format "f9.3" /title="PRE (Proportional Reduction in Error)".
print PRE2 /format "f9.3" /title="PRE for the optimal design".

print /space = 2 /title='Regression results for the equation that DOES NOT include quadratic terms:'.
print b(1,1) /format "f9.3" /title="Slope coefficient for the IDV:".
print b(1,2) /format "f9.3" /title="Slope coefficient for the MOD:".
print b(1,3) /format "f9.3" /title="Slope coefficient for the product term:".
print a /format "f9.3" /title="Intercept:".
print r2chXn /format "f9.3" /title="R-squared change for the interaction term".
print fsquare /format "f9.3" /title="f-squared effect size for the interaction term".
print F /format "f9.3" /title="F value for the interaction term".
print pF /format "f9.3" /title="Significance level of F (tabled value)".
print permutes /format "f9.3" /title="Specified number of randomized data sets".

* Analyses of the randomized data.
compute counter = 0.
compute counterq = 0.
loop #nperms = 1 to permutes.

* The raw data permutations are based on column-wise random shufflings
* of the values in the raw data matrix using Castellan's (1992,
* BRMIC, 24, 72-77) algorithm; The distributions of the original
* raw variables are exactly preserved in the shuffled versions used
* in the parallel analyses.
* data matrix to permute.
compute permdata = { idv, moder, dv }.
loop #luper = 1 to (bigN -1).
compute k1 = trunc( (bigN - #luper + 1) * uniform(1,1) + 1 ) + #luper - 1.
compute k2 = trunc( (bigN - #luper + 1) * uniform(1,1) + 1 ) + #luper - 1.
compute k3 = trunc( (bigN - #luper + 1) * uniform(1,1) + 1 ) + #luper - 1.
compute d1 = permdata(#luper,1).
compute d2 = permdata(#luper,2).
compute d3 = permdata(#luper,3).
compute permdata(#luper,1) = permdata(k1,1).
compute permdata(#luper,2) = permdata(k2,2).
compute permdata(#luper,3) = permdata(k3,3).
compute permdata(k1,1) = d1.
compute permdata(k2,2) = d2.
compute permdata(k3,3) = d3.
end loop.
compute idv3 = permdata(:,1).
compute moder3 = permdata(:,2).
compute dv3 = permdata(:,3).

* Moderated Regression.
compute x3 = idv3 &* moder3.
compute datam3 = {idv3, moder3, x3, dv3}.
compute rawsp3 = t(datam3) * datam3 .
compute rsums3 = t(csum(datam3)).
compute corsp3 = rawsp3 - (1/bigN) * (rsums3) * t(rsums3) .
compute vcv3 = corsp3 * (1/(bigN-1)).
compute sd3 = t(sqrt(diag(vcv3))).
compute d3 = inv(mdiag(sqrt(diag(vcv3)))).
compute cr3 = d3 * vcv3 * d3.
compute beta3 = inv(cr3(1:3,1:3)) * cr3(1:3,4).
compute r2all3 = t(beta3) * cr3(1:3,4).
compute r2main3 = t(( inv(cr3(1:2,1:2))*cr3(1:2,4))) * cr3(1:2,4).
compute F3 = (r2all3-r2main3) / ((1-r2all3)/(bigN-3-1)).
do if (F3 >= F).
compute counter = counter + 1.
end if.

* Moderated Regression including quadratic terms.
compute idvq3 = idv3 &* idv3.
compute modq3 = moder3 &* moder3.
compute datamq3 = { idv3, moder3, idvq3, modq3, x3, dv3 }.
* mean, sd, & correlation matrix (Bernstein, p. 77-79).
compute rawspq3 = t(datamq3) * datamq3 .
compute rsumsq3 = t(csum(datamq3)).
compute mnq3 = t(rsumsq3) / bigN.
compute corspq3 = rawspq3 - (1/bigN) * (rsumsq3) * t(rsumsq3) .
compute vcvq3 = corspq3 * (1/(bigN-1)).
compute sdq3 = t(sqrt(diag(vcvq3))).
compute dq3 = inv(mdiag(sqrt(diag(vcvq3)))).
compute crq3 = dq3 * vcvq3 * dq3.
compute betaq3 = inv(crq3(1:5,1:5)) * crq3(1:5,6).
compute bq3 = (sdq3(1,6) &/ sdq3(1,1:5)) &* t(betaq3).
compute aq3 = mnq3(1,6) - rsum( mnq3(1,1:5) &* bq3 ) .
compute r2allq3 = t(betaq3) * crq3(1:5,6).
compute r2mainq3 = t(inv(crq3(1:4,1:4))*crq3(1:4,6)) * crq3(1:4,6).
compute r2chXnq3 = r2allq3 - r2mainq3.
compute Fq3 = (r2allq3-r2mainq3) / ((1-r2allq3)/(bigN-5-1)).
do if (Fq3 >= Fq).
compute counterq = counterq + 1.
end if.

end loop.

* significance level computation from Noreen (1989, p. 56).
compute siglevel = (counter + 1) / (permutes + 1).
compute siglevq = (counterq + 1) / (permutes + 1).
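* Illustrative addition, not part of the original program: with the Noreen
* formula above, the smallest randomization significance level that can be
* obtained is 1 / (permutes + 1), e.g., 1/101 = .0099 when permutes = 100;
* "minsig" is a hypothetical variable name introduced only for this sketch.
compute minsig = 1 / (permutes + 1).
print minsig /format "f9.4" /title="Smallest attainable randomization p value (illustrative):".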
print siglevel /format "f9.3" /title="Significance level of F (randomization test):".

print /space = 2 /title='Regression results for the equation that INCLUDES quadratic terms:'.
print bq(1,1) /format "f9.3" /title="Slope coefficient for the IDV:".
print bq(1,2) /format "f9.3" /title="Slope coefficient for the MOD:".
print bq(1,3) /format "f9.3" /title="Slope coefficient for the IDV quadratic term:".
print bq(1,4) /format "f9.3" /title="Slope coefficient for the MOD quadratic term:".
print bq(1,5) /format "f9.3" /title="Slope coefficient for the product term:".
print aq /format "f9.3" /title="Intercept:".
print r2chXnq /format "f9.3" /title="R-squared change for the interaction term".
print fsquareq /format "f9.3" /title="f-squared effect size for the interaction term".
print Fq /format "f9.3" /title="F value for the interaction term".
print pFq /format "f9.3" /title="Significance level of F (tabled value)".
print permutes /format "f9.3" /title="Specified number of randomized data sets".
print siglevq /format "f9.3" /title="Significance level of F (randomization test):".

* saving data for plot.
compute freqsout = { -9999 , -9999 , -9999 }.
loop #luper = 1 to chunkIDV.
loop #lupec = 1 to chunkMOD.
compute freqsout = { freqsout ; #luper , #lupec , freqs(#luper,#lupec) }.
end loop.
end loop.
compute freqsout = freqsout(2:nrow(freqsout),:).
*save freqsout /outfile=* / var = IDV, MOD, Freqs .

end matrix.

* The SPSS GRAPH command has few options for controlling the graphs,
* and does not produce 3-D bar charts or histograms; Enter the "freqsout"
* data into a more flexible graphing program, such as DeltaGraph, for better displays.
*GRAPH /scatterplot(xyz)= IDV with Freqs WITH MOD /title= 'Joint Distribution of the Predictor & Moderator Variables'.
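
* Optional cross-check: a minimal sketch, not part of the original program.
* It assumes the active data set still contains variables named idv, moder,
* and dv (as in the trial-run data); brk_all, m_idv, m_moder, c_idv, c_moder,
* and c_xn are hypothetical names introduced here for illustration.
* The predictors are mean-centered before the product is formed, as in the
* matrix program, so the R-squared change and F change for the second block
* should agree with the matrix results for the equation without quadratic
* terms, provided the matrix program analyzed these same three variables.
compute brk_all = 1.
aggregate /outfile=* mode=addvariables /break=brk_all /m_idv = mean(idv) /m_moder = mean(moder).
compute c_idv = idv - m_idv.
compute c_moder = moder - m_moder.
compute c_xn = c_idv * c_moder.
execute.
regression /statistics = coeff r anova change /dependent = dv /method = enter c_idv c_moder /method = enter c_xn.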