Date: Mon, 29 Jul 2002 09:10:05 -0400
Reply-To: Quentin McMullen <QuentinMcMullen@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Quentin McMullen <QuentinMcMullen@WESTAT.COM>
Subject: Re: Memo: Proc Freq Question
Content-Type: text/plain; charset="iso-8859-1"
Jassim Rahma [mailto:jassimrahma@HSBC.COM] writes:
> Why I'm not getting the format NO INFORMATION in my proc freq and just
> getting it as Frequency Missing. My Format is :
>
> proc format;
> value $emp_score
>
> 'A' = 'Armed Forces/Ministry'
> 'G' = 'Government/Public Sector'
> 'M' = 'Manufacturing'
> 'O' = 'Others'
> 'P' = 'Private Sector'
> 'T' = 'Trading'
> 'W' = 'Wholesale'
> 'Q' = 'Not Asked'
> other = 'No Information';
> run;
>
>
> and proc freq is :
>
> proc freq data = MEFCO.MEFCOmerge;
> tables employment_status;
> **-- var PRIM_SCD_FNL_SCR;
> format employment_status $emp_score.;
> **-- class PRIM_SCD_FNL_SCR Decision;
> run;
Hi,
I see Dennis Diskin already referred you to the MISSING option.
Another issue here might be that your category 'No Information' may contain
out-of-range values ('X', 'Y', etc.) as well as missing values (' '). When
a formatted category includes a missing value, PROC FREQ treats every value
in that category as missing, so the out-of-range values are counted under
Frequency Missing too. Cody & Pass, _SAS Programming by Example_, has a good
demonstration and explanation of this on pages 206-207. In general, I think
it is good practice, whenever a format includes other=, to give missing
values a separate category of their own. Below is an example.
Kind Regards,
--Quentin
193 data a;
194 do sex=1,2,3,.;
195 output;
196 end;
197 run;
198
199 proc format;
200 value badsexF
201 1='Male'
202 2='Female'
203 other='No Information'
204 ;
NOTE: Format BADSEXF has been output.
205 run;
206
207 proc freq data=a;
208 tables sex;
209 format sex badsexF.;
210 run;
NOTE: There were 4 observations read from the data set WORK.A.
NOTE: PROCEDURE FREQ used:
real time 0.04 seconds
/***********
Freq shows two missing, but really there is one missing and one bad value.

                                 Cumulative   Cumulative
sex       Frequency    Percent    Frequency      Percent
--------------------------------------------------------
Male              1      50.00            1        50.00
Female            1      50.00            2       100.00

Frequency Missing = 2
**********/
211
212 proc freq data=a;
213 tables sex/missing;
214 format sex badsexF.;
215 run;
NOTE: There were 4 observations read from the data set WORK.A.
NOTE: PROCEDURE FREQ used:
real time 0.11 seconds
/*********
sex             Frequency    Percent    Frequency      Percent
--------------------------------------------------------------
No Information          2      50.00            2        50.00
Male                    1      25.00            3        75.00
Female                  1      25.00            4       100.00
**********/
216
217 proc format;
218 value oksexF
219 1='Male'
220 2='Female'
221 .='Missing'
222 other='Out of Range'
223 ;
NOTE: Format OKSEXF has been output.
224 run;
NOTE: PROCEDURE FORMAT used:
real time 0.00 seconds
225
226 proc freq data=a;
227 tables sex;
228 format sex oksexF.;
229 run;
NOTE: There were 4 observations read from the data set WORK.A.
NOTE: PROCEDURE FREQ used:
real time 0.00 seconds
/*************
                                     Cumulative   Cumulative
sex           Frequency    Percent    Frequency      Percent
------------------------------------------------------------
Male                  1      33.33            1        33.33
Female                1      33.33            2        66.67
Out of Range          1      33.33            3       100.00

Frequency Missing = 1
***********/
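
Applied to the character format from the original question, the same idea
might look like the sketch below. It assumes that blank (' ') is the only
missing value in employment_status. The ' ' = 'Missing' range and its label
are my assumption and not part of the original format; the other ranges and
the data set name are copied from the question.

```sas
/* Sketch only: give blanks their own category so that OTHER      */
/* catches only non-blank codes that are not listed.              */
/* The 'Missing' label is an assumed name, not from the question. */
proc format;
  value $emp_score
    'A'   = 'Armed Forces/Ministry'
    'G'   = 'Government/Public Sector'
    'M'   = 'Manufacturing'
    'O'   = 'Others'
    'P'   = 'Private Sector'
    'T'   = 'Trading'
    'W'   = 'Wholesale'
    'Q'   = 'Not Asked'
    ' '   = 'Missing'          /* blanks kept apart from OTHER */
    other = 'No Information';
run;

proc freq data=MEFCO.MEFCOmerge;
  /* Without MISSING: blanks are reported as Frequency Missing,   */
  /* while unlisted non-blank codes appear as 'No Information'.   */
  tables employment_status;
  /* With MISSING: blanks get their own 'Missing' row instead.    */
  tables employment_status / missing;
  format employment_status $emp_score.;
run;
```

With the blank split out, 'No Information' should count only the unlisted
non-blank codes, which is the same separation that oksexF gives for the
numeric variable above.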