Date: Wed, 18 Jun 2008 18:26:35 -0400
Reply-To: Bucher Scott <SBucher@SCHOOLS.NYC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Bucher Scott <SBucher@SCHOOLS.NYC.GOV>
Subject: Re: format issue: read CSV file into SAS
In-Reply-To: A<c5dcb420806181459o44e5dd0k2c2bf89f03c371ff@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
I ran into the same problem when saving results from Microsoft SQL Server Management Studio 2005. The problem is that it saves outputs in UNICODE by default. You can change this. When saving results click on the arrow on the save button > save with encoding > set 'Available encodings' to ANSI.
Also, it may be safer to save as tab delimited. If there is a comma in the field you will have problems. Another useful feature I have found is using the coalesce function to replace 'NULL's with blanks so they import properly into SAS. If anyone knows of a better way to do this then filling your queries with coalesce functions (i.e. in SQL rather than recoding in SAS), I would be interested in the solution.
Regards,
Scott Bucher
SAS Programmer
Office of Accountability
NYC Dept. of Education
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Tina MayIAsk
Sent: Wednesday, June 18, 2008 6:00 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: format issue: read CSV file into SAS
Hi,
I have a CSV file which I saved from a result generated by Transact-SQL. It looks all right. But when I imported it into SAS, all these columns turned out to have a wired looking. I tried both proc import and data step, and have the same issue.
Here is an example.
This is the CSV file.
24822326,54020 ,59016 ,1.0000
24842915,64106 ,65010 ,1.0000
24840873,54040 ,58042 ,5.0000
24843485,64007 ,65007 ,2.0000
24840722,54036 ,59002 ,4.0000
24843485,64106 ,65010 ,2.0000
24840878,54027 ,59039 ,2.0000
Then I ran the following code
*data* sub.substitution1;
length var1 var2 var3 var4 $*12*;
infile "C:\testfile1.CSV" dsd truncover;
input var1 $ var2 $ var3 $ var4 $;
*run*;
*proc* *print* data=sub.substitution1;*run*;
Here is the result
var1 var2 var3 var4
ÿþ2 4 8 2 2 5 4 0 2 0 5 9 0 1 6 1 . 0 0 0 0
2 4 8 4 2 9 6 4 1 0 6 6 5 0 1 0 1 . 0 0 0 0
2 4 8 4 0 8 5 4 0 4 0 5 8 0 4 2 5 . 0 0 0 0
2 4 8 4 3 4 6 4 0 0 7 6 5 0 0 7 2 . 0 0 0 0
2 4 8 4 0 7 5 4 0 3 6 5 9 0 0 2 4 . 0 0 0 0
2 4 8 4 3 4 6 4 1 0 6 6 5 0 1 0 2 . 0 0 0 0
2 4 8 4 0 8 5 4 0 2 7 5 9 0 3 9 2 . 0 0 0 0
I opened SAS dataset; there is a square between each digit for each variable. For example the first row of var2 looks like 5□4□0□2□0□. (Not sure if the square symbol will show up in this eamil, but anyway there is a square between these numbers). I checked the attribute of the column, it is $12.
Does anyone know how to get rid of these squares and make all these columns truly $12.?
Thank you very much.
Tina