|Date: ||Mon, 24 Jan 2005 14:22:47 -0500|
|Sender: ||"SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>|
|From: ||Patrice Bourdages <Patrice.Bourdages@IAAH.CA>|
|Subject: ||Re: Basic SAS questions|
|Content-Type: ||text/plain; charset="iso-8859-1"|
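The PROC CONTENTS output does make sense: the "Pos" column shows each
variable's relative position within an observation of the SAS data set,
not its column in your .csv file. In your listing the numeric variables
come first (vc at 0 through fwt at 72) and the character variables follow
(id at 80 through frace at 176). But don't bother with it. It's not really
important as long as the output meets your requirements.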
In order to have a listing of the variables in the "proper order", you should
add the POSITION option at the end of your PROC CONTENTS statement:
proc contents data=MyDataset position;
The POSITION option will produce a second listing, with your variables in
the desired order (by the # column). The default for SAS is to print the
PROC CONTENTS listing with the variables in alphabetical order. Useful to
many and not at all useful to others. Later, you'll learn how to produce the
desired output with the ODS statement and you won't hurt the environment as
much.
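
To make the two orderings concrete, here is a minimal sketch. It assumes the data set is called MyDataset as above; work.vars is a hypothetical name for a scratch data set, and the KEEP= list uses the variable names that PROC CONTENTS writes to its OUT= data set (NAME, TYPE, LENGTH, VARNUM, NPOS).

```sas
/* Print the default alphabetical list plus a second list by position. */
proc contents data=MyDataset position;
run;

/* Alternative: save the variable metadata and sort it yourself.        */
/* work.vars is a hypothetical scratch data set name.                   */
proc contents data=MyDataset noprint
              out=work.vars(keep=name type length varnum npos);
run;

/* VARNUM corresponds to the # column; NPOS corresponds to Pos.         */
proc sort data=work.vars;
  by varnum;
run;

proc print data=work.vars noobs;
run;
```

Sorting work.vars by npos instead of varnum should show the physical layout, which is where the "odd" Pos values come from.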
As for formatting, you would need to consult the SAS documentation under
"Features of SAS language for Windows", "Formats" (and "Informats", since
reading a date from a file is done with an informat such as MMDDYY10.). It
will help you a little bit more... The problem with SAS documentation is not
about the contents. Everything is there. It's just a matter of finding it ... :-)
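
For the date question in the quoted message below, one likely cause is that "testdate MMDDYY10." is formatted input, so SAS always reads 10 columns (the log reports columns 3-12) and can run past the delimiter when a date is shorter. A possible fix, sketched under assumptions, is the colon modifier, which applies the informat but stops at the next delimiter. The INFILE options DLM=',', DSD and TRUNCOVER are my assumptions, because the original INFILE statement is not shown in full.

```sas
data herman;
  /* Assumption: comma-delimited file. DLM=, DSD and TRUNCOVER are      */
  /* guesses; the original INFILE options are cut off in the post.      */
  infile 'C:\Documents and Settings\user\Desktop\ExpNoLabels.csv'
         dlm=',' dsd truncover;

  /* The colon modifier reads testdate with the MMDDYY10. informat but  */
  /* stops at the next delimiter instead of always taking 10 columns.   */
  input id $ testdate : mmddyy10. sexunmod $ dob $ vc fev1 lungcancer $
        cigsperdayunmod formersmok $ ht wt raceunmod $ ftestdate $ fsex $
        fvc ffev1 zipcode $ flungca $ fcigsperday fformersmok $ fht fwt
        frace $;

  /* Display the stored date as mm/dd/yyyy rather than a raw number.    */
  format testdate mmddyy10.;
run;
```

If dob and ftestdate also hold dates, the same ": mmddyy10." pattern could replace their "$", but that depends on what the file actually contains.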
Hope this helps you a little bit...
Patrice "the SnowBird" Bourdages
Quebec, Quebec, Canada
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On behalf of
Sent: January 24, 2005 13:39
To: SAS-L@LISTSERV.UGA.EDU
Subject: Basic SAS questions
Hello, I'm an MPH grad student at a university with no SAS support.
While we did have a lab on stat analysis, none of the preliminary
dataset cleaning/formatting/merging were covered, and I am unable to
get the programs to work despite referring to books and the online
help. So I hope that someone has the time to assist me with these
novice concerns! I'm using version 8.
First question: can someone tell me why my Proc contents output makes
no sense? You can see the # column is in a certain (correct) order, but
the "Pos" column is incorrect. This is from an Excel file that I
imported as a .csv file - when I look at the .csv file, the # is
correct, but Pos is not. Here's the Proc contents output : Note how
variable 1 "ID" begins at position 80, var 5 "VC" begins at position
-----Alphabetic List of Variables and
 #  Variable         Type  Len  Pos
 8  cigsperdayunmod  Num     8   16
 4  dob              Char    8  104
19  fcigsperday      Num     8   56
 6  fev1             Num     8    8
16  ffev1            Num     8   48
20  fformersmok      Char    8  168
21  fht              Num     8   64
18  flungca          Char    8  160
 9  formersmok       Char    8  120
23  frace            Char    8  176
14  fsex             Char    8  144
13  ftestdate        Char    8  136
15  fvc              Num     8   40
22  fwt              Num     8   72
10  ht               Num     8   24
 1  id               Char    8   80
 7  lungcancer       Char    8  112
12  raceunmod        Char    8  128
 3  sexunmod         Char    8   96
 2  testdate         Char    8   88
 5  vc               Num     8    0
11  wt               Num     8   32
17  zipcode          Char    8  152
Second question: I can't figure out how to get the date formatting to
work. Here's my programming, and it keeps giving me error messages:
(I actually have 3 date variables in this dataset, but here I only used
the formatting once - using it for all 3 just produces more error
infile 'C:\Documents and Settings\user\Desktop\ExpNoLabels.csv'
input id $ testdate MMDDYY10. sexunmod $ dob $ vc fev1 lungcancer $
cigsperdayunmod formersmok $ ht wt raceunmod $ ftestdate $ fsex $
fvc ffev1 zipcode $ flungca $ fcigsperday fformersmok $ fht fwt
LOG MESSAGE FOLLOWS:
9 data herman;
11 infile 'C:\Documents and Settings\user\Desktop\ExpNoLabels.csv'
12 input id $ testdate mmddyy10. sexunmod $ dob $ vc fev1
13 cigsperdayunmod formersmok $ ht wt raceunmod $ ftestdate $ fsex
$ fvc ffev1 zipcode $
13 ! flungca $ fcigsperday fformersmok $ fht fwt frace $;
NOTE: The infile 'C:\Documents and
File Name=C:\Documents and Settings\user\Desktop\ExpNoLabels.csv,
NOTE: Invalid data for testdate in line 2 3-12.
NOTE: Invalid data for fev1 in line 2 34-34.
NOTE: Invalid data for cigsperdayunmod in line 2 38-38.
NOTE: Invalid data for wt in line 2 47-47.
NOTE: Invalid data for fcigsperday in line 2 81-81.
NOTE: Invalid data for fwt in line 2 90-90.
88 4,C 90
id=2 testdate=. sexunmod=5/20/193 dob=3150 vc=3400 fev1=. lungcancer=0
formersmok=63 ht=122 wt=. raceunmod=1/30/200 ftestdate=F fsex=2780
fvc=1960 ffev1=92108 zipcode=N
flungca=0 fcigsperday=. fformersmok=63 fht=124 fwt=. frace= _ERROR_=1
NOTE: Invalid data for testdate in line 3 3-12.
MUCH MORE OF THE SAME OMITTED HERE...
flungca=N fcigsperday=0 fformersmok=Y fht=63 fwt=138 frace=C _ERROR_=1
NOTE: 696 records were read from the infile 'C:\Documents and
The minimum record length was 89.
The maximum record length was 102.
NOTE: The data set WORK.HERMAN has 696 observations and 23 variables.
NOTE: DATA statement used:
real time 0.53 seconds
cpu time 0.24 seconds
15 proc contents;
When I omit the mmddyy10. after variable testdate all these error
messages disappear. So what is the correct way to format dates in the
input list type of statement? (Note that I didn't want to switch to the
Input statement where each var is listed @ column position, since my
column positions are wrong in the Proc contents output - See question
When I use the following import procedure (using an Excel file in which
names had already been removed and the actual data begins on the first
row), even though I said Getnames=no the variable names in the
resulting file use the first row variables values as variable names.
Any idea what to do about this?
PROC IMPORT OUT= WORK.herman
DATAFILE= "C:\Documents and Settings\user\Desktop\ExpNoLabel
Thanks for the help! I feel like a complete bonehead since I can't even
get to the analysis part with these problems in the way.