LISTSERV at the University of Georgia
Date:         Fri, 6 Aug 2010 13:52:01 -0700
Reply-To:     Tufayel Chowdhury <tufayel_02@yahoo.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Tufayel Chowdhury <tufayel_02@yahoo.com>
Subject:      Re: grouping and sequencing cases
In-Reply-To:  <SNT110-W260A27E713EC53D625E2C8FA900@phx.gbl>

Hi Ruben,

Yes, it was wonderful; I've learned a lot. As to $casenum=12, my syntax gives the accurate result (type_b = 2). I checked it a couple of times. The syntax also works fine with my dataset. But thanks a lot: I'm pretty new to syntax, and yours helped a great deal.

Regards Tufayel

________________________________
From: Ruben van den Berg <ruben_van_den_berg@hotmail.com>
To: SPSSX-L@LISTSERV.UGA.EDU
Sent: Thu, August 5, 2010 3:56:26 AM
Subject: Re: grouping and sequencing cases

Dear Tufayel,

Your syntax is lovely. I completely forgot about CREATE, and I didn't even know FIRST and LAST were functions in AGGREGATE!

However, when I ran it, the variable TYPE_B you created did not correspond to the variable TYPE in your test data. For $casenum=12, TYPE=2, but your syntax rendered TYPE_B=1. I'll paste the entire syntax below; I suffixed 'your' variables with _T (from Tufayel ;-)).

Thanks for the lovely teamwork!

Ruben van den Berg
Consultant Models & Methods
TNS NIPO
Email: ruben.van.den.berg@tns-nipo.com
Mobiel: +31 6 24641435
Telefoon: +31 20 522 5738
Internet: www.tns-nipo.com

data list free/personID location tourID type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
dataset name d1.

*Create visit.

compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.

*Compute tourID.

compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.

*********Scratch copy of data.

dataset copy d2.
dataset activate d2.

aggregate /outfile * overwrite=yes mode addvariables /break personid /maxvis=max(visit).
execute.

select if location ne 3.

compute type_B=0.
if visit=maxvis and location=1 and lag(location)=1 type_b=1.
if visit=maxvis and location=1 and lag(location)=2 type_b=2.
if visit=maxvis and location=2 and lag(location)=1 type_b=2.
if visit=maxvis and location=2 and lag(location)=2 type_b=3.

sort cases personid(a)visit(d).

compute newcount=1.
if personID=lag(personID) newcount=lag(newcount)+1.
compute t1=lag(type_b).
execute.
if t1 gt 0 type_b=t1.
execute.

aggregate /outfile * overwrite=yes mode addvariables /break personid /maxnewcount=max(newcount).

loop #i=3 to maxnewcount.
if newcount=#i and location=1 and lag(location)=1 type_b=1.
if newcount=#i and location=1 and lag(location)=2 type_b=2.
if newcount=#i and location=2 and lag(location)=1 type_b=2.
if newcount=#i and location=2 and lag(location)=2 type_b=3.
end loop.

sort cases personid visit.

match files file * /keep personid visit type_b.
execute.

match files file d1 /file d2 /by personid visit.
execute.

dataset close all.
dataset name d1.

if mis (type_b) type_b=lag(type_b).
execute.

compute check=type-type_b.
descriptives check.

delete variables check.

*********Tufayel solution.

*Make a little change in tourID.
compute tourID_T=1.
do if (location=1 or location=2) and (location ne lag(location)).
compute tourID_T=lag(tourID_T)+1.
else.
compute tourID_T=lag(tourID_T).
end if.
execute.

*Group the tours.
compute group=1.
if tourID = lag(tourID) group = lag(group).
if tourID <> lag(tourID) group = 1+ lag(group).
EXECUTE.

*Compute the first location of the tour.
aggregate /outfile * overwrite=yes mode addvariables /break group /firstlocation = first(location).
EXECUTE.

*Compute the last location of the tour.
create leadlocation = lead(location,1).
EXECUTE.
aggregate /outfile * overwrite=yes mode addvariables /break group /lastlocation = last(leadlocation).
EXECUTE.

*Compute type.
compute type_T = 0.
if firstlocation = 1 and lastlocation = 1 type_T = 1.
if firstlocation = 1 and lastlocation = 2 type_T = 2.
if firstlocation = 2 and lastlocation = 1 type_T = 2.
if firstlocation = 2 and lastlocation = 2 type_T = 3.
EXECUTE.

compute check=type_b-type_T.
exe.
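
One possible reading of the disagreement over case 12, offered as a hedged sketch rather than a confirmed diagnosis: in the combined syntax above the recomputed variable is named tourID_T, yet the "Group the tours" step still reads tourID, which in this file is the per-person column from the test data. In the original posting the recomputed variable overwrote tourID, so the grouping ran on a tourID that ignores personID. The Python model below is not SPSS; it assumes that LAG, CREATE ... LEAD and AGGREGATE FIRST/LAST behave as described in the comments, and the function names are made up for illustration.

```python
# Location and per-person tourID columns of the 22 test cases, in file order.
LOCATIONS = [1, 3, 3, 1, 3, 2, 3, 3, 3, 2, 3, 1, 1, 3, 2, 3, 3, 1, 2, 3, 3, 2]
TOUR_ID_DATA = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 1, 1, 2, 2, 2, 3, 1, 1, 1, 2]

TYPE_OF = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}


def recomputed_tour_id(locs):
    """Model of the recomputed tourID: no personID test, so it runs across persons."""
    ids = [1]
    for prev, cur in zip(locs, locs[1:]):
        # A new tour starts at an anchor (1 or 2) that differs from the previous row.
        ids.append(ids[-1] + 1 if cur in (1, 2) and cur != prev else ids[-1])
    return ids


def type_from_groups(locs, tour_ids):
    """Group consecutive equal tourID values, then type each group by its first
    location and by the location of the row that follows its last row."""
    groups = [1]
    for prev, cur in zip(tour_ids, tour_ids[1:]):
        groups.append(groups[-1] + (cur != prev))
    lead = locs[1:] + [None]          # assumed LEAD(location, 1); last row is missing
    first, last = {}, {}
    for g, loc, ld in zip(groups, locs, lead):
        first.setdefault(g, loc)      # assumed FIRST: first value in the break group
        if ld is not None:
            last[g] = ld              # assumed LAST: last non-missing value in the group
    # Any pair not listed (including a missing lead) keeps the initial value 0.
    return [TYPE_OF.get((first[g], last.get(g)), 0) for g in groups]


overwritten = type_from_groups(LOCATIONS, recomputed_tour_id(LOCATIONS))
renamed = type_from_groups(LOCATIONS, TOUR_ID_DATA)
print("case 12, grouping on the recomputed tourID:", overwritten[11])
print("case 12, grouping on the tourID in the data:", renamed[11])
```

Under these assumptions the first variant merges case 12 with the first two cases of the next person and gives type 2, while the second variant leaves case 12 in a group of its own and gives type 1. In this model the final case has no lead value and stays at 0 in both variants, so it may be worth checking that row separately.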

________________________________
Date: Wed, 4 Aug 2010 16:27:00 -0700
From: tufayel_02@yahoo.com
Subject: Re: grouping and sequencing cases
To: SPSSX-L@LISTSERV.UGA.EDU

Hi Ruben,

Thanks a lot for the syntax. I'm sorry that I couldn't put the logic clearly, but you got it right. I understood your syntax and was able to write a (probably) easier one to create the variable 'type'. I couldn't have done it without going through yours.

*Make a little change in tourID.

compute tourID=1.
do if (location=1 or location=2) and (location ne lag(location)).
compute tourID=lag(tourID)+1.
else.
compute tourID=lag(tourID).
end if.
execute.

*Group the tours.

compute group=1.
if tourID = lag(tourID) group = lag(group).
if tourID <> lag(tourID) group = 1+ lag(group).
EXECUTE.

*Compute the first location of the tour.

aggregate /outfile * overwrite=yes mode addvariables /break group /firstlocation = first(location).
EXECUTE.

*Compute the last location of the tour.

create leadlocation = lead(location,1).
EXECUTE.

aggregate /outfile * overwrite=yes mode addvariables /break group /lastlocation = last(leadlocation).
EXECUTE.

*Compute type.

compute type_b = 0.
if firstlocation = 1 and lastlocation = 1 type_b = 1.
if firstlocation = 1 and lastlocation = 2 type_b = 2.
if firstlocation = 2 and lastlocation = 1 type_b = 2.
if firstlocation = 2 and lastlocation = 2 type_b = 3.
EXECUTE.

Thanks a lot.

-Tufayel

________________________________
From: Ruben van den Berg <ruben_van_den_berg@hotmail.com>
To: SPSSX-L@LISTSERV.UGA.EDU
Sent: Tue, August 3, 2010 5:40:19 AM
Subject: Re: grouping and sequencing cases

Dear Tufayel,

I'm sorry, but you did not answer my question and I still don't understand the logic. However, I tried to 'extract' the logic from the data and wrote some syntax that exactly reproduces 'Type' in your data (but only for these example respondents). The syntax is rather long and clumsy, but I didn't see any better options to get it done. Perhaps the List can suggest some improvements?

This comes without any warranty whatsoever, and I suggest you check the actual results meticulously; I'm not overly confident it will work properly.

HTH,

Ruben van den Berg

*Test data.

data list free/personID location tourID type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.

dataset name d1.

*Create visit.

compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.

*Compute tourID.

compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.

*********Scratch copy of data.

dataset copy d2.
dataset activate d2.

aggregate /outfile * overwrite=yes mode addvariables /break personid /maxvis=max(visit).
execute.

select if location ne 3.

compute type_B=0.
if visit=maxvis and location=1 and lag(location)=1 type_b=1.
if visit=maxvis and location=1 and lag(location)=2 type_b=2.
if visit=maxvis and location=2 and lag(location)=1 type_b=2.
if visit=maxvis and location=2 and lag(location)=2 type_b=3.

sort cases personid(a)visit(d).

compute newcount=1.
if personID=lag(personID) newcount=lag(newcount)+1.
compute t1=lag(type_b).
execute.

if t1 gt 0 type_b=t1.
execute.

aggregate /outfile * overwrite=yes mode addvariables /break personid /maxnewcount=max(newcount).

loop #i=3 to maxnewcount.
if newcount=#i and location=1 and lag(location)=1 type_b=1.
if newcount=#i and location=1 and lag(location)=2 type_b=2.
if newcount=#i and location=2 and lag(location)=1 type_b=2.
if newcount=#i and location=2 and lag(location)=2 type_b=3.
end loop.

sort cases personid visit.

match files file * /keep personid visit type_b.
execute.

match files file d1 /file d2 /by personid visit.
execute.

dataset close all.
dataset name d1.

if mis (type_b) type_b=lag(type_b).
execute.

compute check=type-type_b.

descriptives check.

delete variables check.
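
Read as plain rules, the syntax above appears to do the following; this is an interpretation, not something stated in the thread. Within each person, every anchor row (location 1 or 2) takes the type of the pair formed with the next anchor, the last anchor repeats the type of the tour that ends there, and rows with location 3 inherit from the row before them. A minimal Python sketch of that reading, assuming each person starts and ends at an anchor as in the test data (the function names are illustrative only):

```python
# The 22 test cases as (personID, location, type) triples, in file order.
ROWS = [
    (1, 1, 1), (1, 3, 1), (1, 3, 1), (1, 1, 2), (1, 3, 2), (1, 2, 3),
    (1, 3, 3), (1, 3, 3), (1, 3, 3), (1, 2, 2), (1, 3, 2), (1, 1, 2),
    (2, 1, 2), (2, 3, 2), (2, 2, 2), (2, 3, 2), (2, 3, 2), (2, 1, 2),
    (3, 2, 3), (3, 3, 3), (3, 3, 3), (3, 2, 3),
]


def pair_type(origin, destination):
    """1 = home-home, 3 = office-office, 2 = mixed."""
    if origin == 1 and destination == 1:
        return 1
    if origin == 2 and destination == 2:
        return 3
    return 2


def assign_types(person_ids, locations):
    types = [None] * len(locations)
    # Collect row indices per person, keeping file order.
    by_person = {}
    for i, person in enumerate(person_ids):
        by_person.setdefault(person, []).append(i)
    for rows in by_person.values():
        anchors = [i for i in rows if locations[i] in (1, 2)]
        for k, i in enumerate(anchors):
            if k + 1 < len(anchors):
                # Tour from this anchor to the next one.
                types[i] = pair_type(locations[i], locations[anchors[k + 1]])
            elif k > 0:
                # Last anchor: repeat the type of the tour that ends here.
                types[i] = pair_type(locations[anchors[k - 1]], locations[i])
        # Rows without a type (location 3) inherit from the preceding row.
        previous = None
        for i in rows:
            if types[i] is None:
                types[i] = previous
            previous = types[i]
    return types


persons = [r[0] for r in ROWS]
locations = [r[1] for r in ROWS]
computed = assign_types(persons, locations)
print(computed == [r[2] for r in ROWS])
```

The comparison on the last line plays the same role as the check variable in the syntax: it only shows agreement on these example respondents.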

________________________________
Date: Mon, 2 Aug 2010 12:28:40 -0700
From: tufayel_02@yahoo.com
Subject: Re: grouping and sequencing cases
To: ruben_van_den_berg@hotmail.com

Hi Ruben,

I should have been more explicit about the context. In my dataset each case represents one activity of a person over the course of a whole day (24 hours). For example: sleeping at home > taking breakfast at home > taking kid to daycare > going to office > from office to grocery > from grocery to home, etc. For my convenience, I have recoded the activity locations into three main categories: 1=home, 2=office, 3=other places.

I define a TOUR based on two anchors: home and office. Thus, if a person travels like this: home > car > daycare > car > home (1>3>3>3>1), this is a home-home tour. A tour is complete when a person starts from any of the two anchors (home or office) and goes/returns to any of those, via other places (i.e. 3).

The variable 'type' is defined by the tour origin and the tour destination. The only variable used to define 'type' should be location; personID shouldn't matter. type=1 if 1>3>3>3>1 (home-home tour), type=2 if 1>3>3>2 (home-office) or 2>3>3>3>1 (office-home), and type=3 if 2>3>3>2 (office-office).

location  tourID  type
1         1       2
3         1       2
3         1       2
2         2       3
3         2       3
3         2       3
2         3       3
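
The origin/destination rule can be written down compactly. The following is only a sketch in Python (not SPSS), and `tour_type` is a made-up name; it takes a tour written the way it is in this message, such as 1>3>3>2, and looks only at the first and last location.

```python
def tour_type(path):
    """Classify a tour such as '1>3>3>2' by its first and last location."""
    stops = [int(s) for s in path.split(">")]
    origin, destination = stops[0], stops[-1]
    if origin not in (1, 2) or destination not in (1, 2):
        raise ValueError("a tour must start and end at home (1) or office (2)")
    if origin == 1 and destination == 1:
        return 1  # home-home
    if origin == 2 and destination == 2:
        return 3  # office-office
    return 2      # home-office or office-home


examples = ["1>3>3>3>1", "1>3>3>2", "2>3>3>3>1", "2>3>3>2"]
types = [tour_type(p) for p in examples]
print(types)
```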

Hope I have cleared things up. And thanks so much for your help.

-Tufayel

________________________________
From: Ruben van den Berg <ruben_van_den_berg@hotmail.com>
To: tufayel_02@yahoo.com
Sent: Mon, August 2, 2010 3:56:33 AM
Subject: RE: grouping and sequencing cases

Dear Tufayel,

This logic is pretty hard to grasp. From the data, it seems that for

1 3 1 3 2

The first 'tour' is 1-3-1 and the second 'tour' is 1-3-2. The type for the SECOND '1' is determined by the second tour (so its type=2, not 1). Is type always 1 for the first location within a personID?

Best,

Ruben van den Berg

________________________________
Date: Fri, 30 Jul 2010 17:28:30 -0700
From: tufayel_02@yahoo.com
Subject: Re: grouping and sequencing cases
To: SPSSX-L@LISTSERV.UGA.EDU

Hi Ruben,

Thank you for the help with tourID. I'm sorry for being ambiguous about the variable 'type'. Thing is, 1 and 2 are fixed locations (home and office respectively) and 3 is any kind of vehicle/walk. You'd notice in the example that within a personID the location changes from 1-1, 1-2, 2-1 or 2-2 via 3. To rephrase, the tours are always like 1-3-1, 1-3-2, 2-3-1 or 2-3-2. For a tour 1-3-1, type=1 (tourID=1 in the example), for 1-3-2 or 2-3-1 (tourID=2 and 4), type=2, and for 2-3-2 (tourID=3), type=3. I hope this clears things up.

personID  location  tourID  type
1         1         1       1
1         3         1       1
1         3         1       1
1         1         2       2
1         3         2       2
1         2         3       3
1         3         3       3
1         3         3       3
1         3         3       3
1         2         4       2
1         3         4       2
1         1         5       2

Thanks Tufayel

________________________________
From: Ruben van den Berg <ruben_van_den_berg@hotmail.com>
To: SPSSX-L@LISTSERV.UGA.EDU
Sent: Fri, July 30, 2010 4:38:44 AM
Subject: Re: grouping and sequencing cases

Dear Tufayel,

The order of cases in your example data is vital information, isn't it? The first thing I'd do if these were my raw data is store this order in the data. Otherwise, if you sorted your records randomly, you'd destroy part of the information contained in the data. I created a new variable 'visit', which is the nth visit for each personID. The next block of syntax should create tourID (I called it tourID_B so you can compare it to your desired tourID).

Your third request, however, was somewhat unclear to me. I think in total you have 9 location sequences:

1-1 1-2 1-3 2-1 2-2 2-3 3-1 3-2 3-3

Four of these (within personID) cause the tourID to change:

1-2 2-1 3-1 3-2

So within tourID groups there are 5 possible sequences:

1-1 1-3 2-2 2-3 3-3

As I understood, you want to create type within tourID group, so for each of these 5 sequences the value of type should be specified (even if (system) missing).
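
The count of sequences can be checked by enumeration. This is a small Python sketch, not SPSS, and it assumes the rule that the tourID changes when the current location is 1 or 2 and differs from the previous location (the condition used in the tourID_B syntax in this message); `changes_tour` is an illustrative name.

```python
from itertools import product


def changes_tour(previous, current):
    """True when the current location is an anchor (1 or 2) that differs
    from the previous location."""
    return current in (1, 2) and current != previous


pairs = list(product((1, 2, 3), repeat=2))  # all (previous, current) pairs
changing = [p for p in pairs if changes_tour(*p)]
staying = [p for p in pairs if not changes_tour(*p)]

print(len(pairs), "sequences in total")
print("change tourID:", ["%d-%d" % p for p in changing])
print("stay within tourID:", ["%d-%d" % p for p in staying])
```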

Could you please help us out a bit more?

Best,

Ruben van den Berg

data list free/personID location tourID type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.

compute visit=1.
if personID=lag(personID) visit=lag(visit)+1.

compute tourID_B=1.
do if (personID=lag(personID)) and (location=1 or location=2) and (location ne lag(location)).
compute tourID_B=lag(tourID_B)+1.
else if (personID=lag(personID)).
compute tourID_B=lag(tourID_B).
end if.
execute.

________________________________

Date: Thu, 29 Jul 2010 16:58:27 -0700
From: tufayel_02@yahoo.com
Subject: grouping and sequencing cases
To: SPSSX-L@LISTSERV.UGA.EDU

Hi all,

I am trying to create two variables (tourID and type) from two existing variables (personID and location). TourID is a sequence where the numbers remain the same until 'location' 1 or 2 arrives. TourID always starts from 1 when personID changes.

The variable 'type' can take three values: 1, 2 and 3. If the 'location' changes from 1 to 1, type=1 (for that tourID group); if it changes from 1 to 2 or from 2 to 1, type=2; if it's 2 to 2, type=3 (last four cases).

personID  location  tourID  type
1         1         1       1
1         3         1       1
1         3         1       1
1         1         2       2
1         3         2       2
1         2         3       3
1         3         3       3
1         3         3       3
1         3         3       3
1         2         4       2
1         3         4       2
1         1         5       2
2         1         1       2
2         3         1       2
2         2         2       2
2         3         2       2
2         3         2       2
2         1         3       2
3         2         1       3
3         3         1       3
3         3         1       3
3         2         2       3
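
For reference, the tourID rule as described in this message can be sketched outside SPSS. The Python below is an illustration only: it assumes tourID restarts at 1 for each new personID and increases by 1 whenever location 1 or 2 arrives, and `tour_ids` is a made-up name.

```python
def tour_ids(person_ids, locations, anchors=(1, 2)):
    """Return one tourID per row: restart at 1 for each new person and
    add 1 whenever an anchor location (1 or 2) arrives."""
    ids = []
    previous_person = None
    current = 0
    for person, loc in zip(person_ids, locations):
        if person != previous_person:
            current = 1      # the first row of a person always opens tour 1
        elif loc in anchors:
            current += 1     # an anchor closes one tour and opens the next
        ids.append(current)
        previous_person = person
    return ids


# The 22 example rows as (personID, location, tourID) triples.
ROWS = [
    (1, 1, 1), (1, 3, 1), (1, 3, 1), (1, 1, 2), (1, 3, 2), (1, 2, 3),
    (1, 3, 3), (1, 3, 3), (1, 3, 3), (1, 2, 4), (1, 3, 4), (1, 1, 5),
    (2, 1, 1), (2, 3, 1), (2, 2, 2), (2, 3, 2), (2, 3, 2), (2, 1, 3),
    (3, 2, 1), (3, 3, 1), (3, 3, 1), (3, 2, 2),
]
computed = tour_ids([r[0] for r in ROWS], [r[1] for r in ROWS])
expected = [r[2] for r in ROWS]
print(computed == expected)
```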

Can anyone please help me out?

Thanks in advance!

Tufayel

