Date: Wed, 29 Jan 1997 09:06:50 -0600
Reply-To: l-hoyle@ukans.edu
Sender: "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From: Larry Hoyle <l-hoyle@UKANS.EDU>
Organization: IPPBR, University of Kansas
Subject: Re: cgi2sas advice needed
Content-Type: text/plain; charset=us-ascii
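Your Perl program should probably check the length of each submitted
string. You can also specify a maximum length in the HTML form (the
MAXLENGTH attribute of an INPUT field), but a client can ignore that
limit, so the check in the program is still worth doing.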
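A minimal sketch of such a length check, written in Python only for
brevity (the same test is a one-liner in Perl). The limit MAX_LEN and
the function name check_length are hypothetical, not taken from any
SAS Institute example:

```python
# Hypothetical limit; choose one that matches the length of the
# SAS character variable the value will be stored in.
MAX_LEN = 200

def check_length(value, max_len=MAX_LEN):
    """Return True if the submitted string fits within max_len characters."""
    return len(value) <= max_len
```

A value that fails the check can be rejected or sent back for
correction rather than silently truncated.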
You might consider having the Perl program generate the form if it is
invoked with no parameters and check values if it is called with
parameters. It can then send back the form for corrections if a field
has a nasty value.
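The flow described above might be sketched as follows, again in Python
as a stand-in for the Perl program. The form markup, the field name
"email", the 60-character limit, and the rule for what counts as a
"nasty" value (too long, or containing control characters) are all
assumptions for illustration:

```python
import html
from urllib.parse import parse_qs

# Hypothetical one-field form; a real form would list every question.
FORM = ('<form method="GET">Email: <input name="email" maxlength="60">'
        '<input type="submit"></form>')

def find_problems(fields, max_len=60):
    """Return the names of fields that are too long or contain control characters."""
    bad = []
    for name, value in fields.items():
        if len(value) > max_len or any(ord(c) < 32 for c in value):
            bad.append(name)
    return bad

def handle(query_string):
    """No parameters: send the form. Bad values: send it back. Otherwise accept."""
    # parse_qs drops blank values by default, so an empty submission
    # is treated the same as no parameters at all.
    fields = {k: v[0] for k, v in parse_qs(query_string).items()}
    if not fields:
        return FORM
    bad = find_problems(fields)
    if bad:
        # Field names come from the client, so escape them before echoing.
        names = ", ".join(html.escape(n) for n in bad)
        return "<p>Please correct: " + names + "</p>" + FORM
    return "<p>Thank you.</p>"
```

Keeping the form and the checks in one program means the limits in
the form and the limits enforced on the server cannot drift apart.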
The example Perl program on SAS Institute's WWW site for CGI to SAS
checks for characters that are problematic in macro variable values.
These are a superset of the ones you will need to be concerned about.
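For the simpler case in the original question, where each value is
written as a single-quoted SAS character constant, a rough sketch of
the quoting step follows. This is not the SAS Institute example
program. It relies on two things: inside single quotes SAS does not
resolve the macro triggers & and %, and an embedded single quote is
written as two single quotes. Replacing control characters (such as
line breaks from a text area) with spaces is my own assumption, and
the function names are hypothetical. The variable name is assumed to
have been validated as a SAS name elsewhere:

```python
def sas_literal(value):
    """Return value as a single-quoted SAS character constant."""
    # Assumption: control characters are replaced with spaces so that
    # each generated statement stays on one line.
    cleaned = "".join(c if ord(c) >= 32 else " " for c in value)
    # Double any embedded single quote; double quotes, & and % can
    # stay as they are inside a single-quoted constant.
    return "'" + cleaned.replace("'", "''") + "'"

def sas_assignment(name, value):
    """Build one assignment statement, e.g. var1 = 'foo1';"""
    return name + " = " + sas_literal(value) + ";"
```

With this, a response such as O'Brien & "Co" would be written out as
var1 = 'O''Brien & "Co"';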
The CGI component of the SAS/IntrNet product will also handle values
with characters like & ' and ". It would allow you to avoid the Perl
programming.
Harish Chand wrote:
>
> I'm collecting data via the web and want to determine the best way to
> write the data out in a form which can be read into sas. I am
> particularly concerned about rogue data from open ended questions
> which could contain problematic characters for sas (as well as web
> site security).
>
> Here are my current plans, but I am open to better ideas.
> I plan on using perl to write the data to a file which I'll then read
> into sas. The file will have a bunch of records such as:
> var1 = 'foo1';
> var2 = 'foo2';
> ...
> output datasetn;
>
> As some of the values will be responses to open ended questions (email
> addresses, comments, etc.), I want to convert any characters which
> might screw up sas. Any tips?
>
> Here are my thoughts about problematic characters.
>
> 1. replace ' by '' (so embedded single quotes will be ok)
> 2. replace & by (amp) or something similar (although this shouldn't be
> a problem since I'm not using double quotes).
>
> I am not sure what to do (if anything) about embedded double quotes (").
>
> Are there any other characters I should worry about?
>
> Is there a better way to do things? The advantage of the above method
> is that I can write out to a single file rather than writing out to
> multiple raw data files (size of input file is not too big a concern),
> but I am open to better ideas.
>
> Any suggestions would be greatly appreciated.
>
> Thanks,
> Harish Chand
--
_____________________________________
Larry Hoyle | lat/long: 38.57.24 / 95.14.36 \
Institute for Public Policy | --> * <
and Business Research | Voice: (913) 864-3701 |
University of Kansas | FAX: (913) 864-3683 |
607 Blake Hall | mailto://l-hoyle@ukans.edu |
Lawrence, KS 66045-2960 | http://www.ukans.edu/cwis/units/IPPBR |
|_______________________________________|