LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (June 2008, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Sat, 14 Jun 2008 23:29:42 GMT
Reply-To:     Lou <lpogoda@VERIZON.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Lou <lpogoda@VERIZON.NET>
Subject:      Re: proc append option force and variables left behind in the sand
Comments: To: sas-l@uga.edu

"Aldi Kraja" <aldi@wustl.edu> wrote in message news:48531C59.9040404@wustl.edu... > Hi, > > I had the impression that if one had a base sas set A, with one variable > k1 and I append a new set B with 3 variables that overlap the k1 > variable of set A, because of FORCE option I was expecting the new > appended set A to have 3 variables. The example below shows that it is > not the case. SAS drops any extra variable(s) compared to the base. > Question: is there a solution to set data together without loosing > variables? If I use SET statement is not economical because I have > thousands of sets to be appended. I was hoping append was very quick, > but I it surprised me to see so many variables left behind in the sand.

The documentation clearly states that variables not on the base dataset will be dropped. There are a couple of potential problems with doing as you suggest. One you've already stumbled over - a new variable won't show up in the appended dataset. The other is that a given variable name could appear on multiple datasets but have different attributes - different labels, different lengths, and/or be a character type on one and numeric on another.

Unless you're certain that identically named variables will have the same attributes on all datasets, I think you need to do some data analysis before proceeding.

But let's ignore that aspect for the moment. In order to append everything together, you need to construct a base dataset that has all the variables (with the correct attributes) that you need in the result. I'd suggest using PROC SQL to query the dictionary.columns table to get a list of every variable in every dataset in your input library (or libraries). The list needs to winnowed of course, since there will be a lot of duplicates - for numeric variables, you can just pick any of the entries in the list to serve as your model, but for character variables, you probably want to use one with the longest length attribute. Use the list to write ATRRIB statements for each variable, and create an empty base dataset. From there, you can append at will.

The only alternative that comes to mind is to use a SET statement in a datastep. As you say, that's not efficient, but your example log shows only one observation in the appending dataset. If this is a one-shot deal and you're talking about only a few thousand observations altogether, maybe you can live with the inefficiency.

> Thanks, Aldi > > Example 1: > NOTE: Appending P3T3.GP3D102C3 to OUT3.ZZZSTART3P3C3. > WARNING: Variable ZZZ017972XXX was not found on BASE file. The variable > will not be added to the BASE file. > NOTE: FORCE is specified, so dropping/truncating will occur. > NOTE: There were 1 observations read from the data set P3T3.GP3D102C3. > NOTE: The data set OUT3.ZZZSTART3P3C3 has 2 observations and 69168 > variables. > > Simple Example 2 (see the warning in the log and the output when you run > this program): > ===================================== > data A; > input k1; > datalines; > 12 > ; run; > data B; > input k1 k2 k3; > datalines; > 12 13 14 > ; > run; > proc append base=A data=B force; > run; > proc print data=A; > title "A after append procedure missing variables k2 and k3"; > run; > > > --


Back to: Top of message | Previous page | Main SAS-L page