LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (February 2008, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 12 Feb 2008 14:42:35 -0500
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: SQL logic help
In-Reply-To:  <5860691a-9dd8-4b99-b698-6083fc7df571@e6g2000prf.googlegroups.com>
Content-Type: text/plain; charset="us-ascii"

Reeza: Kenneth correctly points out that SQL does not have the logical equivalent of first.x. The first obs implies a first in a physical sequence in a dataset. Since appropriate physical sequencing of a dataset requires sorting, which may alter a physical sequence required for other purposes, logic programming does not rely on (or recognize) a physical sequence of obs in a dataset.

In SQL a database programmer relies on groups of distinct attribute values (sets) instead of physical sequence. In a group that has distinct key values (activity_date?), the MIN() summary function evaluates as True when its argument becomes the earliest date of the activity_date attribute. As a summary function, MIN() can only appear in a HAVING clause of a GROUP BY clause.

Now if a group contains more than one instance of an activity_date and they happen to have the earliest date value in that group, all satisfy the condition MIN(activity_date). A SQL program will then yield for that group more than one tuple and, possibly, more than one activity_ID. In contrast the first.group condition arbitrarily selects the first of the obs in the physical sequence of the dataset. Depending on specifications of the sorting procedure, that sequence could be the order of obs in the source dataset, or something else, and may not be reproducible unless the programmer preserves the physical ordering of the source dataset and sort specifications.

If one precludes duplicated activity_date values in any group, the SQL solution

... select distinct group,activity_id,MIN(activity_date) from have where activity=2 group by group,activity_ID ;

More than one tuple with the earliest activity_date in any group limited to activity=2 will result in initial selection of multiple tuples. If each has the same value of activity_id, the DISTINCT modifier will limit the result to a single tuple.

Depending on what relations exist among the attributes activity, group, activity_ID, and activity_date, other SQL programs could give you different results. The first.x method doesn't require much in the way of preconditions and key definitions, but it may only work for a particular configuration of data and may be difficult to reproduce. The SAS Data step solutions differ from SQL solutions in both theoretical and practical ways. Over a number of years the logic of sets and relations that SQL implements has become easier for me to grasp than the somewhat more procedural logic of SAS Data steps (or equivalent scripting languages such as PL/SQL and Transact/SQL). You'll have to decide what course to take for yourself. S

-----Original Message----- From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Reeza Sent: Tuesday, February 12, 2008 12:42 PM To: sas-l@uga.edu Subject: SQL logic help

Hi All,

I know this isn't quite the right group, but I also know there's a lot of SQL programmers here.

I have a simple set of logic that I can't quite get to work.

I have a list of activities, the activity date, the group and the activity ID.

What I need to do is select by group, the last activity, the date and corresponding activitity ID for a specific activity.

In SAS my code would be something like this

proc sort data=have; by group activity_date;

data need; set have; where activity=2; by group; if first.group; keep group activity_date activity_ID; run;

I just need some help translating this to SQL, please :)

TIA, Reeza


Back to: Top of message | Previous page | Main SAS-L page