| Date: | Fri, 2 Dec 2005 17:18:36 -0600 |
| Reply-To: | Vishal Dave <VishalDave@Affina.com> |
| Sender: | "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU> |
| From: | Vishal Dave <VishalDave@Affina.com> |
| Subject: | Re: COUNTING CONSECUTIVES NUMBERS IN A SET OF VARIABLES |
| Content-Type: | text/plain; charset="iso-8859-1" |
Another way to solve this problem is to define a loop that keeps track of a pattern counter. For example, if the loop finds a run of two equal values, it sets b2=1, and if it finds another run of the same length, it adds 1 to b2.
I am not sure whether this works for all possible patterns of observations (2**5 = 32), but the code below should work.
Vishal.
****************************************************************
*List data.
DATA LIST LIST/var1 TO var5 (5 F8.0).
BEGIN DATA
1 1 1 1 1
1 0 1 1 0
1 0 1 1 1
1 1 0 1 1
1 1 1 1 0
1 0 0 0 1
0 0 0 1 0
0 1 0 1 1
END DATA.
SET MPRINT = ON PRINTBACK ON.
*Define the vector to compare each variable.
DEFINE !vars(nbvars=!TOKENS(1) /v1=!TOKENS(1) /v2=!TOKENS(1))
VECTOR v=!v1 TO !v2 /#val(!nbvars).
NUMERIC b1 TO b5.
VECTOR b=b1 TO b5.
LOOP #cnt=1 TO !nbvars.
- COMPUTE #val(#cnt) = v(#cnt).
- COMPUTE b(#cnt) = 0.
END LOOP.
*Look for runs of equal consecutive values and increment the counter for that run length (i.e., b1, b2, b3...).
LOOP #i = 1 TO !nbvars-1.
DO IF (#i=1).
- COMPUTE #k=1.
- COMPUTE #l=1.
- LOOP #j = #i +1 TO !nbvars IF (#val(#i) EQ v(#j)).
- COMPUTE #k = #k +1.
- COMPUTE #l = #j.
- END LOOP.
- COMPUTE b(#k) = b(#k) + 1.
ELSE IF((#val(#i) NE v(#i+1)) OR (#val(#i) NE v(#i-1))).
- COMPUTE #k=1.
- COMPUTE #l=1.
- LOOP #j = #i +1 TO !nbvars IF (#val(#i) EQ v(#j)).
- COMPUTE #k = #k +1.
- COMPUTE #l = #j.
- END LOOP.
- COMPUTE b(#k) = b(#k) + 1.
END IF.
END LOOP IF (#l EQ !nbvars).
*b1 from the counter variable is not correct, but the counts for the other patterns are.
*Calculate b1 from the remaining values of b(i).
COMPUTE #m=0.
LOOP #i=2 TO !nbvars.
- COMPUTE #m = #m + #i*b(#i).
END LOOP.
COMPUTE #cnt2=1.
COMPUTE b(#cnt2) = !nbvars - #m.
!ENDDEFINE.
*Call the macro.
!vars nbvars=5 v1=var1 v2=var5.
EXECUTE.
*******************************************************************
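As a cross-check outside SPSS, here is a minimal Python sketch of what the macro appears to compute, as I read it: for each case, b(k) is the number of runs of exactly k equal consecutive values. Note that this reading counts runs of 0s as well as runs of 1s. The function name run_length_counts is hypothetical and not part of the syntax above.

```python
from itertools import groupby


def run_length_counts(values):
    """Return [b1, ..., bn], where bk counts the runs of exactly k equal values.

    Assumption: runs of 0s are counted as well as runs of 1s, which is
    how the macro above appears to behave.
    """
    counts = [0] * len(values)
    for _, run in groupby(values):
        length = sum(1 for _ in run)  # length of this run of equal values
        counts[length - 1] += 1
    return counts


# The sample cases from the DATA LIST block above.
sample = [
    [1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
]

for row in sample:
    print(row, run_length_counts(row))
```

Comparing this listing with b1 TO b5 after running the macro is one way to spot patterns where the two disagree.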
VD> -----Original Message-----
VD> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
VD> Marks, Jim
VD> Sent: Friday, December 02, 2005 10:26 AM
VD> To: SPSSX-L@LISTSERV.UGA.EDU
VD> Subject: Re: COUNTING CONSECUTIVES NUMBERS IN A SET OF VARIABLES
VD>
VD> Here is a solution that can be extended to more variables (I am working
VD> on a similar problem, but I need the length of all sequences).
VD>
VD> To extend the number of variables you will need to edit these commands--
VD> COMPUTE max_seq
VD> MATCH FILES (RENAME and DROP)
VD> VECTOR
VD> LOOP
VD> RECODE, RENAME VARIABLES, VARIABLE LABELS
VD>
VD> If your data file extends to more cases, you will need to edit the DO
VD> REPEAT command.
VD>
VD> Note this solution requires an ID variable in your original file (and a
VD> saved copy of that file).
VD>
VD> This is tested:
VD>
VD> ** SAMPLE data.
VD>
VD> DATA LIST LIST/id (f8.0) v1 TO v5 (5 F8.0).
VD> BEGIN DATA
VD> 1 1 0 1 1 0
VD> 2 1 1 1 1 1
VD> 3 1 0 1 1 1
VD> 4 1 1 0 1 1
VD> 5 1 1 1 1 0
VD> 6 1 0 0 1 0
VD> 7 0 1 1 1 0
VD> 8 0 0 0 0 0
VD> END DATA.
VD>
VD> SAVE OUTFILE = 'c:\spss-test\conseq.sav'.
VD>
VD> ** we use flip so we can sum the lags
VD> ** and determine the largest consecutive sequence.
VD>
VD> FLIP.
VD>
VD> DO REPEAT a = var001 to var008.
VD>
VD> DO IF $casenum GT 2 AND a = 1.
VD> COMPUTE a= a+ lag(a).
VD> END IF.
VD>
VD> END REPEAT.
VD>
VD> FLIP.
VD>
VD> COMPUTE max_seq = MAX(v1 to v5).
VD>
VD> ** join the largest sequence variable back to the original file.
VD>
VD> MATCH FILES FILE = *
VD> /RENAME (case_lbl v1 v2 v3 v4 v5 = d0 d1 d2 d3 d4 d5)
VD> /FILE = 'c:\spss-test\conseq.sav'
VD> /BY id
VD> /DROP d0 d1 d2 d3 d4 d5.
VD>
VD> ** now calculate the "b" variables.
VD> VECTOR b(5f8.0).
VD> LOOP #i = 1 TO 5.
VD> COMPUTE b(#i) = max_seq = (#i).
VD> END LOOP.
VD> IF max_seq = 0 b1 = 1.
VD>
VD> RECODE b1 to b5 (SYSMIS = 0).
VD>
VD> ** requested variable names and labels (for clarity).
VD> RENAME VARIABLES (b1 b2 b4 b5 = b5 b4 b2 b1).
VD> VARIABLE LABELS
VD> b1 '5 1 values'
VD> b2 '4 1 values'
VD> b3 '3 1 values'
VD> b4 '2 1 values'
VD> b5 '1 1 value (or no 1 values)'
VD> .
VD> EXECUTE.
VD>
VD> --jim
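For readers following Jim's FLIP and LAG approach, here is a minimal Python sketch of the same idea, under my reading of his syntax: keep a running count that grows on each 1 and resets on each 0, take its maximum, and flag the matching "b" variable using his final labels (b1 = five 1s, down to b5 = one 1 or none). The function names are hypothetical and are not part of his code.

```python
def longest_run_of_ones(values):
    """Longest run of consecutive 1s, using a running count that resets at 0."""
    running = 0
    longest = 0
    for v in values:
        running = running + 1 if v == 1 else 0
        longest = max(longest, running)
    return longest


def flags_from_longest_run(values):
    """Return [b1, ..., bn] with a single 1 marking the longest run of 1s.

    Assumption: b1 stands for a run of n ones and bn for a run of one,
    and a case with no 1s at all is flagged like a case with a single 1.
    """
    n = len(values)
    longest = max(longest_run_of_ones(values), 1)
    flags = [0] * n
    flags[n - longest] = 1
    return flags


for row in ([1, 0, 1, 1, 0], [1, 1, 1, 1, 1], [0, 0, 0, 0, 0]):
    print(row, longest_run_of_ones(row), flags_from_longest_run(row))
```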
VD> -----Original Message-----
VD> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
VD> Marta García-Granero
VD> Sent: Friday, December 02, 2005 1:55 AM
VD> To: SPSSX-L@LISTSERV.UGA.EDU
VD> Subject: Re: COUNTING CONSECUTIVES NUMBERS IN A SET OF VARIABLES
VD>
VD> Hi Victor
VD>
VD> First of all, your coding scheme would leave out (b1 to b5 missing)
VD> those cases with only one '1' among the 5 variables, because you reserve
VD> b5=1 only for the particular case where v1 to v5 are all set to 0. What
VD> happens to those cases with only one '1' among the 5 variables?
VD>
VD> I'm sure there is a more elegant and simple approach, but this one works
VD> (not easily transformed to more than 5 variables, sorry). I have listed
VD> the whole set of patterns (32: 2**5) and sorted them taking into account
VD> whether they had 5 '1', 4 consecutive '1', and so on. I have assigned
VD> b5=1 to cases with 5 '0' or only one '1'. Change it if you don't like it
VD> (change the "ELSE." to "ELSE IF pattern EQ '00000'.", and add a new "ELSE."
VD> at the end to give another solution to all patterns with only one '1').
VD>
VD> DATA LIST LIST/v1 TO v5 (5 F8.0).
VD> BEGIN DATA
VD> 1 0 1 1 0
VD> 1 1 1 1 1
VD> 1 0 1 1 1
VD> 1 1 0 1 1
VD> 1 1 1 1 0
VD> END DATA.
VD>
VD> STRING pattern(A5).
VD> COMPUTE pattern =
VD> STRING((10**4)*v1+(10**3)*v2+(10**2)*v3+10*v4+v5,'N5.0').
VD>
VD> DO IF pattern EQ '11111'.
VD> - COMPUTE b1=1. /* Five '1' *.
VD> - COMPUTE b2=0.
VD> - COMPUTE b3=0.
VD> - COMPUTE b4=0.
VD> - COMPUTE b5=0.
VD> ELSE IF ANY(pattern,'11110','01111').
VD> - COMPUTE b1=0.
VD> - COMPUTE b2=1. /* Four consecutive '1' *.
VD> - COMPUTE b3=0.
VD> - COMPUTE b4=0.
VD> - COMPUTE b5=0.
VD> ELSE IF ANY(pattern,'00111','01110','11100','11101','10111').
VD> - COMPUTE b1=0.
VD> - COMPUTE b2=0.
VD> - COMPUTE b3=1. /* Three consecutive '1' *.
VD> - COMPUTE b4=0.
VD> - COMPUTE b5=0.
VD> ELSE IF ANY(pattern,'00011','00110','01100','11000','01011','11010',
VD> '10110','01101','10011','11001').
VD> - COMPUTE b1=0.
VD> - COMPUTE b2=0.
VD> - COMPUTE b3=0.
VD> - COMPUTE b4=1. /* Only two consecutive '1' *.
VD> - COMPUTE b5=0.
VD> ELSE IF pattern EQ '11011'.
VD> - COMPUTE b1=0.
VD> - COMPUTE b2=0.
VD> - COMPUTE b3=0.
VD> - COMPUTE b4=2. /* Particular case where b4=2 *.
VD> - COMPUTE b5=0.
VD> ELSE. /*All patterns not listed explicitly above have no consecutive '1' *.
VD> - COMPUTE b1=0.
VD> - COMPUTE b2=0.
VD> - COMPUTE b3=0.
VD> - COMPUTE b4=0.
VD> - COMPUTE b5=1.
VD> END IF.
VD> EXECUTE .
VD> FORMAT b1 TO b5 (F8.0).
VD>
VD> Regards,
VD> Marta mailto:biostatistics@terra.es
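The pattern lists in Marta's ANY() calls can be cross-checked by enumerating all 32 patterns and grouping them by their longest run of consecutive 1s. This is only a sketch in Python, not part of her syntax, and the names longest_run_of_ones and groups are hypothetical.

```python
from itertools import product


def longest_run_of_ones(bits):
    """Longest run of consecutive 1s in a sequence of 0/1 values."""
    running = 0
    longest = 0
    for b in bits:
        running = running + 1 if b == 1 else 0
        longest = max(longest, running)
    return longest


# Group every 5-digit 0/1 pattern by its longest run of 1s.
groups = {}
for bits in product((0, 1), repeat=5):
    pattern = "".join(str(b) for b in bits)
    groups.setdefault(longest_run_of_ones(bits), []).append(pattern)

# One line per run length: the length, how many patterns, and the patterns.
for length in sorted(groups, reverse=True):
    print(length, len(groups[length]), " ".join(groups[length]))
```

The groups for run lengths 5, 4 and 3 should match the first three branches of the DO IF, the group for length 2 should match the fourth branch plus the special case '11011', and the groups for lengths 1 and 0 together are what falls through to the final ELSE.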