Date: Mon, 6 Feb 2012 12:30:52 -0800
From: Bruce Weaver
Sender: "SPSSX(r) Discussion"
Subject: Re: Interesting bit of code ;-)

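Here's a small improvement to the NUMERIC command used to generate the indicator variables: the TO keyword can also generate the interaction names, so the twelve A*B indicators need not be typed out one by one.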

NUMERIC A1 TO A3 B1 TO B4 A1B1 TO A1B4 A2B1 TO A2B4 A3B1 TO A3B4 (F1).
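
To make the effect of that change concrete, here is a minimal, self-contained sketch of the 3x4 example with the shortened NUMERIC command dropped in. It assumes the same toy data as the quoted post below, and it hard-codes 4 as the number of B levels where the quoted version uses an aggregated maxB variable. The order in which NUMERIC creates the variables matters, because the later A1B1 TO A3B4 reference on VECTOR follows file order.

```spss
* Sketch only: toy 3x4 data, same as in the quoted post.
DATA LIST FREE / a b (2F1).
BEGIN DATA
1 1 1 2 1 3 1 4
2 1 2 2 2 3 2 4
3 1 3 2 3 3 3 4
END DATA.

* On NUMERIC, TO generates the names (A1B1, A1B2, A1B3, A1B4, and so on).
NUMERIC A1 TO A3 B1 TO B4 A1B1 TO A1B4 A2B1 TO A2B4 A3B1 TO A3B4 (F1).

* On RECODE and VECTOR, TO means consecutive variables in file order.
RECODE A1 TO A3B4 (ELSE=0).
VECTOR AV = A1 TO A3 / BV = B1 TO B4 / ABV = A1B1 TO A3B4.
COMPUTE AV(a) = 1.
COMPUTE BV(b) = 1.
* Assumption: B has 4 levels, hard-coded here instead of using maxB.
COMPUTE ABV((a - 1)*4 + b) = 1.
LIST.
```

For the case a=2, b=3 the index works out to (2-1)*4+3 = 7, and the seventh element of ABV is A2B3.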

Bruce Weaver wrote
>
> Yes, it becomes more transparent the longer one looks at it. I was just
> thinking about how to extend it to the case of a factorial design (only
> for folks with old versions that won't support the Python-based dummy
> variable generator, of course). Something like this, I suppose.
>
> * Generate A and B variables for a 3x4 factorial design.
>
> data list free / a b (2f1).
> begin data
> 1 1 1 2 1 3 1 4
> 2 1 2 2 2 3 2 4
> 3 1 3 2 3 3 3 4
> end data.
>
> AGGREGATE
>  /OUTFILE=* MODE=ADDVARIABLES
>  /BREAK=
>  /maxB 'Max value of B'=MAX(B).
> FORMATS a b maxB (f1.0).
>
> * Now generate indicator variables for A, B, and A*B .
>
> NUMERIC A1 TO A3 B1 TO B4
>  A1B1 A1B2 A1B3 A1B4
>  A2B1 A2B2 A2B3 A2B4
>  A3B1 A3B2 A3B3 A3B4 (F1).
> RECODE A1 TO A3B4 (ELSE=0). /* Initialize all indicators to 0.
> VECTOR AV = A1 TO A3 / BV = B1 TO B4 / ABV = A1B1 TO A3B4 .
> COMPUTE AV(A) = 1.
> COMPUTE BV(B) = 1.
> COMPUTE ABV((A-1)*maxB+B) = 1. /* Note the use of maxB here .
> LIST A B A1 to A3B4.
>
> David Marso wrote
>>
>> Yeah, BUT I find the following to be as transparent as glass ;-).
>> ONE compute per case rather than 50 and NO logical comparison required.
>> NUMERIC state_01 TO state_50 (F1).
>> RECODE state_01 TO state_50 (ELSE=0).
>> VECTOR state_dummy=state_01 TO state_50.
>> COMPUTE state_dummy(state)=1.
>>
>> "p.s. - I'd never noticed that one can use PRINT like that on END REPEAT.
>> Thanks for educating me (once again)."
>> Glad to be of service!
>> --
>> ** You can also use:
>> END REPEAT NOPRINT (but WTF? there for the sake of completeness???).
>> --
>>
>> Bruce Weaver wrote
>>>
>>> You're right about efficiency. But unless one is working with a HUGE
>>> data file, the difference will likely be imperceptible (to the human
>>> eye, at least). And in that case, transparency should trump efficiency.
>>>
>>> I feel like I'm stealing material from Art Kendall here. ;-)
>>>
>>> p.s. - I'd never noticed that one can use PRINT like that on END REPEAT.
>>> Thanks for educating me (once again).
>>>
>>> David Marso wrote
>>>>
>>>> OTOH It is less efficient than the VECTOR approach.
>>>> N computes rather than 1.
>>>> I'm not sure about the RECODE WRT processing efficiency. My point was
>>>> the interesting way that RECODE allows multiple vars to be created from
>>>> one variable and the ability to subsequently recode these new variables
>>>> in the single recode statement.
>>>> --
>>>> Remember DO REPEAT cycles through the entire list of stand in
>>>> 'variables'.
>>>> --
>>>> NUMERIC dx1 TO dx4 (F1).
>>>>
>>>> DO REPEAT dx = dx1 to dx4 / # = 1 to 4 .
>>>> - COMPUTE dx = g4 EQ #.
>>>> END REPEAT PRINT.
>>>>
>>>> 18 0 +COMPUTE DX1 = G4 EQ 1
>>>> 19 0 +COMPUTE DX2 = G4 EQ 2
>>>> 20 0 +COMPUTE DX3 = G4 EQ 3
>>>> 21 0 +COMPUTE DX4 = G4 EQ 4
>>>>
>>>> LIST.
>>>>
>>>> Bruce Weaver wrote
>>>>>
>>>>> I like DO-REPEAT for this task. It's very transparent, I think, and
>>>>> not significantly more likely to cause RSI than the other methods you
>>>>> show . ;-)
>>>>>
>>>>> DATA LIST FREE / g4 (F1).
>>>>> BEGIN DATA
>>>>> 1 2 3 4
>>>>> END DATA.
>>>>>
>>>>> NUMERIC dx1 TO dx4 (F1).
>>>>> DO REPEAT dx = dx1 to dx4 / # = 1 to 4 .
>>>>> - COMPUTE dx = g4 EQ #.
>>>>> END REPEAT.
>>>>> LIST.
>>>>>
>>>>> David Marso wrote
>>>>>>
>>>>>> DUMMY Variables...
>>>>>> --------------------
>>>>>> *G4 exists in the data file and has integer values between 1 and 4.
>>>>>> **COMMENTS??**.
>>>>>> RECODE G4 (1=1) INTO DX1
>>>>>>  / G4 (2=1) INTO DX2
>>>>>>  / G4 (3=1) INTO DX3
>>>>>>  / DX1 DX2 DX3 (MISSING=0).
>>>>>>
>>>>>> OTOH: The following is more concise with large number of groups ;-)
>>>>>>
>>>>>> NUMERIC DX1 TO DX4 (F1).
>>>>>> RECODE DX1 TO DX4 (ELSE=0).
>>>>>> VECTOR DX=DX1 TO DX4.
>>>>>> COMPUTE DX(G4)=1.
>>>>>>
>>>>>> The following IMNSHO is abysmal.
>>>>>> DO IF G4=1.
>>>>>> + COMPUTE DX1=1.
>>>>>> ELSE IF G4=2.
>>>>>> + COMPUTE DX2=1.
>>>>>> ELSE IF G4=3.
>>>>>> + COMPUTE DX3=1.
>>>>>> END IF.
>>>>>> RECODE DX1 TO DX4 (MISSING=0)(ELSE=COPY).