```Date: Fri, 26 Apr 1996 16:14:37 GMT Reply-To: Michael Friendly Sender: "SAS(r) Discussion" From: Michael Friendly Organization: York University, Ontario, Canada Subject: Lags with BY variables: Merge seeks help, possibly from SQL Many thanks to all those who answered my query re constructing lags with BY variables. Here's a new problem for anyone willing to help, but first I'll explain the old problem and the context: The simplest solutions to make the lags restart when the BY variable(s) change were variations of the code below to produce the dataset lags data codes; input subj @; do i=1 to 16; input code \$ @; output; end; cards; 1 c a a b a c a c b b a b a a b c 2 c c b b a c a c c a c b c b c c ; data lags; set codes; by subj; drop cnt; lag1 = lag(code); lag2 = lag2(code); if first.subj then cnt=0; cnt+1; if cnt<=1 then lag1 = ' '; if cnt<=2 then lag2 = ' '; The goal here (for a macro I'm working on) is to produce a complete n-way frequency table giving joint frequencies of events for an arbitrary number of lags. By 'complete' I mean (for 2 subjects, 3 codes, lag1 x lag0), a dataset with 2 x 3 x 3 observations, including those which occur with zero frequencies. Unfortunately, it does not appear that I can convince PROC FREQ to include the zero counts in the output dataset. proc freq data=lags; tables lag1 * code / noprint out=freq; by subj; proc print; id subj; by subj; which gives only counts >=1: SUBJ LAG1 CODE COUNT PERCENT 1 c 1 . a a 2 13.3333 a b 3 20.0000 a c 2 13.3333 b a 3 20.0000 b b 1 6.6667 b c 1 6.6667 c a 2 13.3333 c b 1 6.6667 2 c 1 . a c 3 20.0000 b a 1 6.6667 b b 1 6.6667 b c 2 13.3333 c a 2 13.3333 c b 3 20.0000 c c 3 20.0000 My original idea to fill in the zero entries was to construct a dataset of all zeros and merge it with the freq datset for each subject. That is, I want all combinations of a,b,c in pairs filled in with observations of COUNT=0 for those combinations which do not occur. This worked when there were no BY variables for the lag dataset, but does not work now. Here's the simplified verion of the code: *-- Sort, putting subject last; proc sort data=freq; by lag1 code subj; *-- Create a table of 0 counts to fill in missing entries; data zeros; do lag1='a', 'b', 'c'; do code='a', 'b', 'c'; count = 0; end; end; data freq; merge freq(drop=percent) zeros; by lag1 code; if lag1 = ' ' then delete; *-- Resort, putting subject first; proc sort data=freq; by subj lag1 code; proc print; id subj; by subj; and the output: SUBJ LAG1 CODE COUNT 1 a a 2 a b 3 a c 2 b a 3 b b 1 b c 1 c a 2 c b 1 2 a c 3 b a 1 b b 1 b c 2 c a 2 c b 3 c c 0 Using SET instead of MERGE above gives a different, but incorrect result. The whole job of filling in the zero entries BY subject seems like something that could be much more easily accomplished with PROC SQL, but I'm not familiar enough with it to see how. Can anyone help? -- Michael Friendly Internet: friendly@hotspur.psych.yorku.ca (NeXTmail OK) Psychology Department York University Voice: 416 736-5118 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA ```

Back to: Top of message | Previous page | Main SAS-L page