Date: Thu, 5 Aug 2010 23:19:26 +0300
Reply-To: hillel vardi <hilel@BGU.AC.IL>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: hillel vardi <hilel@BGU.AC.IL>
Subject: Re: grouping and sequencing cases
In-Reply-To: <186780.8590.qm@web59307.mail.re1.yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Shalom
Here is another way to do what you asked for.
It looks to me that creating tourID is a mistake, because in many cases
the act that ends one tour is also the act that starts the next one.
In your data, line 4 (location 1) is both the end of tour 1 and the
start of tour 2. All lines of each tourID should have the same type,
and that is not the case here
(your program creates 9 tourIDs but there are only 7 tours).
I also think that after you calculate type you only need one line
per trip (run the syntax up to the comment to see it).
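The count of 7 tours can be cross-checked outside SPSS. This is only a minimal Python sketch, assuming the rows are in visit order and that every tour runs from one anchor (location 1 or 2) to the next anchor of the same person. The names ROWS, ANCHORS and count_tours are made up for illustration.

```python
from itertools import groupby

# (personID, location) pairs copied from the example data in this thread.
ROWS = [
    (1, 1), (1, 3), (1, 3), (1, 1), (1, 3), (1, 2), (1, 3), (1, 3),
    (1, 3), (1, 2), (1, 3), (1, 1),
    (2, 1), (2, 3), (2, 2), (2, 3), (2, 3), (2, 1),
    (3, 2), (3, 3), (3, 3), (3, 2),
]

ANCHORS = {1, 2}  # 1 = home, 2 = office; 3 is any other place


def count_tours(rows):
    """Return {personID: number of tours}, assuming rows are in visit order."""
    tours = {}
    for person, cases in groupby(rows, key=lambda row: row[0]):
        anchors = [loc for _, loc in cases if loc in ANCHORS]
        # n anchors bound n - 1 tours, because interior anchors are shared.
        tours[person] = max(len(anchors) - 1, 0)
    return tours


tours_per_person = count_tours(ROWS)
print(tours_per_person, sum(tours_per_person.values()))
```

For the example data this gives 4, 2 and 1 tours for persons 1, 2 and 3, which is 7 in total.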
Hillel Vardi
BGU
dataset close all.
data list free/personID location tourID type.
begin data
1 1 1 1
1 3 1 1
1 3 1 1
1 1 2 2
1 3 2 2
1 2 3 3
1 3 3 3
1 3 3 3
1 3 3 3
1 2 4 2
1 3 4 2
1 1 5 2
2 1 1 2
2 3 1 2
2 2 2 2
2 3 2 2
2 3 2 2
2 1 3 2
3 2 1 3
3 3 1 3
3 3 1 3
3 2 2 3
end data.
* Keep the original case order in seq and build a running tourID counter.
compute seq=$casenum.
add files file=* / keep=personID location seq .
numeric tourID(f2) .
if any(location, 1, 2) and (personID eq lag(personID,1))
   tourID=tourID+1 .
leave tourID .
SORT CASES BY personID seq .
dataset name orig .
DATASET COPY work .
DATASET ACTIVATE work .
* Drop a case that repeats the previous location of the same person.
if ( location eq lag(location,1)) and (personID eq lag(personID,1))
   keep=0 .
recode keep(0=0)(else=1) .
select if keep=1 .
* Flag the first case of each person, then code each run of three locations.
add files file=* / by personID / first=first .
if ( first eq 0 ) and ( lag(first,1) eq 0) location_detail=
   lag(location,2)*100+ lag(location,1)*10+ location .
recode location_detail (232 =3)(131=1)(132 231=2) into type .
add files file=* / keep= personID seq location_detail type tourID .
select if type gt 0 .
execute .
*** run this part if you want to see all lines .
match files file= orig / file=* / by personID seq .
sort cases by personID(d) seq (d) .
if type gt 0 #type=type .
if sysmis(type) type= #type .
SORT CASES BY personID seq .
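The recode logic in the syntax above can also be checked outside SPSS. This is only a sketch in Python, assuming the data are in visit order; ROWS, RECODE and trip_types are hypothetical names, not part of the syntax. It drops repeated locations, builds the three-digit code from each run of three locations, and keeps only the codes that the RECODE command maps to a type.

```python
from itertools import groupby

# (personID, location) pairs copied from the example data in this thread.
ROWS = [
    (1, 1), (1, 3), (1, 3), (1, 1), (1, 3), (1, 2), (1, 3), (1, 3),
    (1, 3), (1, 2), (1, 3), (1, 1),
    (2, 1), (2, 3), (2, 2), (2, 3), (2, 3), (2, 1),
    (3, 2), (3, 3), (3, 3), (3, 2),
]

# Same mapping as the RECODE in the syntax: origin, via, destination -> type.
RECODE = {131: 1, 132: 2, 231: 2, 232: 3}


def trip_types(rows):
    """Return {personID: [type of each trip, in order]}."""
    result = {}
    for person, cases in groupby(rows, key=lambda row: row[0]):
        # Collapse consecutive repeats of a location (3, 3, 3 becomes 3).
        locs = [loc for loc, _ in groupby(row[1] for row in cases)]
        # Code every window of three locations, like location_detail.
        codes = [a * 100 + b * 10 + c
                 for a, b, c in zip(locs, locs[1:], locs[2:])]
        result[person] = [RECODE[code] for code in codes if code in RECODE]
    return result


print(trip_types(ROWS))
```

On the example data this yields trip types 1, 2, 3, 2 for person 1, then 2, 2 for person 2 and 3 for person 3, one value per trip.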
Tufayel Chowdhury wrote:
> Hi Ruben,
>
> Thanks a lot for the syntax. I'm sorry that I couldn't clearly put the
> logic, but you got it right. I understood your syntax and was able to
> write (probably) an easier one to create the variable 'type'. I
> couldn't do it unless I went through your one.
>
>
> *Make a little change in tourID.
>
> compute tourID=1.
> do if (location=1 or location=2) and (location ne lag(location)).
> compute tourID=lag(tourID)+1.
> else.
> compute tourID=lag(tourID).
> end if.
> execute.
>
> *Group the tours.
>
> compute group=1.
> if tourID = lag(tourID) group = lag(group).
> if tourID <> lag(tourID) group = 1+ lag(group).
> EXECUTE.
>
> *Compute the first location of the tour.
>
> aggregate
> /outfile * overwrite=yes mode addvariables
> /break group
> /firstlocation = first(location).
> EXECUTE.
>
> *Compute the last location of the tour.
>
> create leadlocation = lead(location,1).
> EXECUTE.
>
> aggregate
> /outfile * overwrite=yes mode addvariables
> /break group
> /lastlocation = last(leadlocation).
> EXECUTE.
>
> *Compute type.
>
> compute type_b = 0.
> if firstlocation = 1 and lastlocation = 1 type_b = 1.
> if firstlocation = 1 and lastlocation = 2 type_b = 2.
> if firstlocation = 2 and lastlocation = 1 type_b = 2.
> if firstlocation = 2 and lastlocation = 2 type_b = 3.
> EXECUTE.
>
>
> Thanks a lot.
>
> -Tufayel
>
> ------------------------------------------------------------------------
> *From:* Ruben van den Berg <ruben_van_den_berg@hotmail.com>
> *To:* SPSSX-L@LISTSERV.UGA.EDU
> *Sent:* Tue, August 3, 2010 5:40:19 AM
> *Subject:* Re: grouping and sequencing cases
>
> Dear Tufayel,
>
> I'm sorry but you did not answer my question and I still don't
> understand the logic. However, I tried to 'extract' the logic from the
> data and wrote some syntax that exactly reproduces 'Type' in your data
> (but only for these example respondents). The syntax is rather long
> and clumsy, but I didn't see any better options to get it done.
> Perhaps the List can suggest some improvements?
>
> This comes without any warranty whatsoever and I suggest you check the
> actual results meticulously; I'm not overly confident it will work
> properly.
>
> HTH,
>
> *Ruben van den Berg*
> *Consultant Models & Methods*
> TNS NIPO
> Email: ruben.van.den.berg@tns-nipo.com
> Mobiel: +31 6 24641435
> Telefoon: +31 20 522 5738
> Internet: www.tns-nipo.com <http://www.tns-nipo.com>
>
> *Test data.
>
> data list free/personID location tourID type.
> begin data
> 1 1 1 1
> 1 3 1 1
> 1 3 1 1
> 1 1 2 2
> 1 3 2 2
> 1 2 3 3
> 1 3 3 3
> 1 3 3 3
> 1 3 3 3
> 1 2 4 2
> 1 3 4 2
> 1 1 5 2
> 2 1 1 2
> 2 3 1 2
> 2 2 2 2
> 2 3 2 2
> 2 3 2 2
> 2 1 3 2
> 3 2 1 3
> 3 3 1 3
> 3 3 1 3
> 3 2 2 3
> end data.
>
> dataset name d1.
>
> *Create visit.
>
> compute visit=1.
> if personID=lag(personID) visit=lag(visit)+1.
>
> *Compute tourID.
>
> compute tourID_B=1.
> do if (personID=lag(personID)) and (location=1 or location=2) and
> (location ne lag(location)).
> compute tourID_B=lag(tourID_B)+1.
> else if (personID=lag(personID)).
> compute tourID_B=lag(tourID_B).
> end if.
> execute.
>
> *********Scratch copy of data.
>
> dataset copy d2.
> dataset activate d2.
>
> aggregate
> /outfile * overwrite=yes mode addvariables
> /break personid
> /maxvis=max(visit).
> execute.
>
> select if location ne 3.
>
> compute type_B=0.
> if visit=maxvis and location=1 and lag(location)=1 type_b=1.
> if visit=maxvis and location=1 and lag(location)=2 type_b=2.
> if visit=maxvis and location=2 and lag(location)=1 type_b=2.
> if visit=maxvis and location=2 and lag(location)=2 type_b=3.
>
> sort cases personid(a)visit(d).
>
> compute newcount=1.
> if personID=lag(personID) newcount=lag(newcount)+1.
> compute t1=lag(type_b).
> execute.
>
> if t1 gt 0 type_b=t1.
> execute.
>
> aggregate
> /outfile * overwrite=yes mode addvariables
> /break personid
> /maxnewcount=max(newcount).
>
> loop #i=3 to maxnewcount.
> if newcount=#i and location=1 and lag(location)=1 type_b=1.
> if newcount=#i and location=1 and lag(location)=2 type_b=2.
> if newcount=#i and location=2 and lag(location)=1 type_b=2.
> if newcount=#i and location=2 and lag(location)=2 type_b=3.
> end loop.
>
> sort cases personid visit.
>
> match files file *
> /keep personid visit type_b.
> execute.
>
> match files file d1
> /file d2
> /by personid visit.
> execute.
>
> dataset close all.
> dataset name d1.
>
> if mis (type_b) type_b=lag(type_b).
> execute.
>
> compute check=type-type_b.
>
> descriptives check.
>
> delete variables check.
>
>
>
> ------------------------------------------------------------------------
> Date: Mon, 2 Aug 2010 12:28:40 -0700
> From: tufayel_02@yahoo.com
> Subject: Re: grouping and sequencing cases
> To: ruben_van_den_berg@hotmail.com
>
> Hi Ruben,
>
> I should have been more explicit about the context. In my dataset each
> case represents a person's activity throughout the whole day (24
> hours). For example, sleeping at home > taking breakfast at home >
> taking kid to daycare > going to office > from office to grocery >
> from grocery to home... etc. For my convenience, I have recoded the
> activity-locations as three main categories: 1=home, 2=office, 3=other
> places.
>
> I define a TOUR based on two anchors: home and office. Thus, if a
> person travels like this: home > car > daycare > car > home
> (1>3>3>3>1), this is a home-home tour. A tour is complete when a
> person starts from any of the two anchors (home or office) and
> goes/returns to any of those, via other places (i.e. 3).
>
> The variable 'type' is defined based on the type of tour-origin and
> tour-destination. The only variable used to define 'type' should be
> location; personID shouldn't matter. type=1, if 1>3>3>3>1 (home-home
> tour), type=2, if 1>3>3>2 (home-office) or 2>3>3>3>1 (office-home),
> and type=3, if 2>3>3>2 (office-office).
>
> location tourID type
> 1 1 2
> 3 1 2
> 3 1 2
> 2 2 3
> 3 2 3
> 3 2 3
> 2 3 3
>
> Hope I have cleared things up. And thanks so much for your help.
>
> -Tufayel
>
>
>
> ------------------------------------------------------------------------
> *From:* Ruben van den Berg <ruben_van_den_berg@hotmail.com>
> *To:* tufayel_02@yahoo.com
> *Sent:* Mon, August 2, 2010 3:56:33 AM
> *Subject:* RE: grouping and sequencing cases
>
> Dear Tufayel,
>
> This logic is pretty hard to grasp. From the data, it seems that for
>
> 1
> 3
> 1
> 3
> 2
>
> The first 'tour' is 1-3-1 and the second 'tour' is 1-3-3. The type for
> the SECOND '1' is determined by the second tour (so its type=2, not
> 1). Is type always 1 for the first location within a personID?
>
> Best,
>
> *Ruben van den Berg*
>
>
>
>
> ------------------------------------------------------------------------
> Date: Fri, 30 Jul 2010 17:28:30 -0700
> From: tufayel_02@yahoo.com
> Subject: Re: grouping and sequencing cases
> To: SPSSX-L@LISTSERV.UGA.EDU
>
> Hi Ruben,
>
> Thank you for the help with tourID. I'm sorry for being ambiguous
> about the variable 'type'. Thing is, 1 and 2 are fixed locations (home
> and office respectively) and 3 is any kind of vehicle/walk. You'd
> notice in the example that within a personID the location changes from
> 1-1, 1-2, 2-1 or 2-2 via 3. To rephrase, the tours are always like
> 1-3-1, 1-3-2, 2-3-1 or 2-3-2. For a tour 1-3-1, type=1 (tourID=1 in
> the example), for 1-3-2 or 2-3-1 (tourID=2 and 4), type=2, and for
> 2-3-2 (tourID=3), type=3. I hope this clears things up.
>
> personID location tourID type
> 1        1        1      1
> 1        3        1      1
> 1        3        1      1
> 1        1        2      2
> 1        3        2      2
> 1        2        3      3
> 1        3        3      3
> 1        3        3      3
> 1        3        3      3
> 1        2        4      2
> 1        3        4      2
> 1        1        5      2
>
> Thanks
> Tufayel
>
> ------------------------------------------------------------------------
> *From:* Ruben van den Berg <ruben_van_den_berg@hotmail.com>
> *To:* SPSSX-L@LISTSERV.UGA.EDU
> *Sent:* Fri, July 30, 2010 4:38:44 AM
> *Subject:* Re: grouping and sequencing cases
>
> Dear Tufayel,
>
> The order of cases in your example data is vital information, isn't
> it? The first thing I'd do if these were my raw data, is create this
> order in the data. Otherwise, if you'd sort your records randomly,
> you'd destroy part of the information contained in the data. I created
> a new variable 'visit' which is the nth visit for each personID. The
> next block of syntax should create tourID (I called it tourID_B so you
> can compare it to your desired tourID).
>
> Your third request, however, was somewhat unclear to me. I think in
> total you have 9 location sequences:
>
> 1-1
> 1-2
> 1-3
> 2-1
> 2-2
> 2-3
> 3-1
> 3-2
> 3-3
>
> Four of these (within personID) cause the tourID to change:
>
> 1-2
> 2-1
> 3-1
> 3-2
>
> So within tourID groups there are 5 possible sequences:
>
> 1-1
> 1-3
> 2-2
> 2-3
> 3-3
>
> As I understood, you want to create type within tourID group, so for
> each of these 5 sequences the value of type should be specified (even
> if (system) missing).
>
> Could you please help us out a bit more?
>
> Best,
>
> *Ruben van den Berg*
>
>
> data list free/personID location tourID type.
> begin data
> 1 1 1 1
> 1 3 1 1
> 1 3 1 1
> 1 1 2 2
> 1 3 2 2
> 1 2 3 3
> 1 3 3 3
> 1 3 3 3
> 1 3 3 3
> 1 2 4 2
> 1 3 4 2
> 1 1 5 2
> 2 1 1 2
> 2 3 1 2
> 2 2 2 2
> 2 3 2 2
> 2 3 2 2
> 2 1 3 2
> 3 2 1 3
> 3 3 1 3
> 3 3 1 3
> 3 2 2 3
> end data.
>
> compute visit=1.
> if personID=lag(personID) visit=lag(visit)+1.
>
> compute tourID_B=1.
> do if (personID=lag(personID)) and (location=1 or location=2) and
> (location ne lag(location)).
> compute tourID_B=lag(tourID_B)+1.
> else if (personID=lag(personID)).
> compute tourID_B=lag(tourID_B).
> end if.
> execute.
>
>
>
>
> ------------------------------------------------------------------------
>
> Date: Thu, 29 Jul 2010 16:58:27 -0700
> From: tufayel_02@yahoo.com
> Subject: grouping and sequencing cases
> To: SPSSX-L@LISTSERV.UGA.EDU
>
>
> Hi all,
>
> I am trying to create two variables (tourID and type) from two
> existing variables (personID and location). TourID is a sequence where
> the numbers remain the same until 'location' 1 or 2 arrives. TourID
> always starts from 1 when personID changes.
>
> The variable 'type' can take three values - 1, 2 and 3. If the
> 'location' changes from 1 to 1, type=1 (for that tourID group); if it
> changes from 1 - 2, or 2 - 1, type=2; if it's 2 - 2, type=3 (last four
> cases).
>
> personID location tourID type
> 1        1        1      1
> 1        3        1      1
> 1        3        1      1
> 1        1        2      2
> 1        3        2      2
> 1        2        3      3
> 1        3        3      3
> 1        3        3      3
> 1        3        3      3
> 1        2        4      2
> 1        3        4      2
> 1        1        5      2
> 2        1        1      2
> 2        3        1      2
> 2        2        2      2
> 2        3        2      2
> 2        3        2      2
> 2        1        3      2
> 3        2        1      3
> 3        3        1      3
> 3        3        1      3
> 3        2        2      3
>
> Can anyone please help me out?
>
> Thanks in advance!
>
> Tufayel
>
>
>
>