Date: Tue, 20 Apr 2004 14:49:01 -0600
Reply-To: Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
Subject: Re: i/o memory buffering (was Duplicates & merge)
Content-Type: text/plain; charset=us-ascii
"pudding man" <pudding_man@MAIL.COM> 04/20/2004 12:32 PM wrote:
>Well, I'm admittedly muy mucho "slow on the draw" last
>week and this ...
>
>----- Original Message -----
>From: "Jack Hamilton" <JackHamilton@firsthealth.com>
>Date: Thu, 15 Apr 2004 11:14:25 -0600
>To: SAS-L@LISTSERV.UGA.EDU, pudding_man@MAIL.COM
>Subject: Re: [SAS-L] i/o memory buffering (was Duplicates & merge)
>
>> Paul said that all files use buffering. I disagreed, saying that
>> in-memory files might not use buffering.
>>
>> In-memory files are not theoretical, and I'm not sure why you think
>> they are.
>
>Oh, it's "Old School" stuff, I s'pose. Did I mention that I was
>recently appointed Grand Poobah Curmudgeon of the Digital
>Antiquities Society? :-)
>
>Seriously, I didn't say in-memory files are theoretical.
>
>> My notion of "what constitutes a file" is one used by
>> Windows, Unix, and MVS - files are those things that are manipulated by
>> file I/O routines.
>
>As soon as you say "file I/O routines", I automatically
>think of:
>
>> > c.) The i/o sub/system (i.e. elements of bios, interrupt
>> > handler, disk-driver, disk-controller, disk device)
>> > positions the disk components (disk platter, read/
>> > write arm) at the data.
>
>All of which are designed for read/write from/to
>external storage.
That's certainly what it used to be. It's been generalized in some
operating systems.
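
A rough illustration of what I mean by "generalized", as a language-level analogy rather than a description of how any particular OS does it: in Python, the same record-counting routine works unchanged on a disk file and on an object that lives only in memory. The function name count_records and the sample data are made up for the example.

```python
import io
import os
import tempfile


def count_records(stream):
    """Count newline-terminated records using only generic file-style reads."""
    count = 0
    for _ in stream:  # iterating a binary stream yields one line at a time
        count += 1
    return count


data = b"alpha\nbeta\ngamma\n"

# Case 1: a conventional file on disk.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    with open(path, "rb") as disk_file:
        disk_count = count_records(disk_file)
finally:
    os.remove(path)

# Case 2: an object that exists only in memory; no disk is involved.
memory_count = count_records(io.BytesIO(data))

print(disk_count, memory_count)
```

The routine never asks where the bytes live, which is the sense in which the file interface has been generalized.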
>Perhaps you have better info than I. What major OS
>vendors support in-memory files?
Microsoft (on Windows), IBM (in MVS). Probably Unix - I can't say for
sure, but Unix tends to be pretty flexible about what it calls a file.
Unix has in-memory databases; would you consider a database to be a
file? I don't recall seeing anything which indicates that OpenVMS has
memory-only files.
>The only one I'm aware of is IBM mainframe (OS/VS+ and maybe
>VM). The facility is VIO (Virtual I/O). It's been around for
>years. VIO "simulates the activity of a DASD volume".
It was hiperspace in MVS. I think it's now called dataspace, and I'm
not sure what the differences are.
Windows used to have virtual drives set up in memory, but I think those
have gone away.
>"Any file which is truly worth writing is worth
> storing in a permanent fashion".
Well, I'd disagree with that. I don't want to make sort work files
permanent. There are lots of scratch files that I wouldn't think are
worth writing permanently. I wouldn't want PGP's temporary files to be
written to disk ever.
>You cannot write a permanent file to memory?
>Internal storage was not designed for permanence,
>and is inherently "volatile"?
I'm not sure what you're asking here, but yes, it would be convenient
to have an object that acts like a file but doesn't involve actual
writing to disk. It wouldn't be permanent, but there are lots of things
that don't need to be permanent. If the program abends, I'll just
reconstruct it when I rerun.
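
As a sketch of the kind of scratch object I have in mind (Python here, purely as an illustration, not anything SAS does): the standard library's tempfile.SpooledTemporaryFile acts like a file but is documented to keep its contents in memory until they exceed max_size. The function name and the size threshold below are assumptions made for the example.

```python
import tempfile


def sort_via_scratch(records, max_in_memory=1_000_000):
    """Stage records in a scratch file, then read them back and sort them.

    The scratch file stays in memory unless it grows past max_in_memory
    bytes, at which point the library rolls it over to a real temp file.
    """
    with tempfile.SpooledTemporaryFile(max_size=max_in_memory, mode="w+") as scratch:
        for record in records:
            scratch.write(record + "\n")
        scratch.seek(0)  # rewind, exactly as with a disk file
        staged = [line.rstrip("\n") for line in scratch]
    # The scratch file is gone here; nothing was meant to be permanent.
    return sorted(staged)


print(sort_via_scratch(["pear", "apple", "fig"]))
```

If the program dies partway through, the scratch data is simply lost and gets rebuilt on the next run.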
>I gather that there are a handful of experimental
>64-bit systems that have been configured to store
>huge RDBMS tables in-memory 24/7. It seems likely
>such systems have potential "Disaster Recovery"
>issues that, by comparison, might make the Rock Of
>Gibraltar look like a grain of sand on the beach ... <g>
You wouldn't want to write everything to such files, just as you don't
want to write everything to backup tapes.
>The notion of a temporary file work-space in memory has been
>around for a long time. I fear that, over the years, it has
>largely failed to be viable for the vast majority of systems
>and applications.
>
>> As Dan Appleman says in his Visual Basic 5.0
>> Programmer's Guide to the Win32 API, "there is really no difference
>> between a file and memory", at least in modern operating systems.
>
>'Fraid I have no idea who Appleman might be. Scared to
>ask the context in which he might've made such a statement. :-)
He's the author of a series of Visual Basic programming books. The
context is "File Operations".
He does follow that statement with "Ah, I know what you're thinking:
Surely such a statement is the product of a hallucination." He goes on
to explain why it's not.
>> This was not the case in past decades, but the vast majority of SAS
>> usage today is on machines which support in-memory files.
>
>I wasn't aware of that. Perhaps it would depend on how
>"support" is defined.
"Runs on". I don't think SAS Institute releases usage figures, but I'd
be surprised if there aren't more licenses for Windows machines than for
any other OS. They do sell them by the hundred-pack.
>Well, you continue to offer no examples/details ...
OK, memory-mapped files in Windows, created using the CreateFileMapping
API function with an hFile handle of INVALID_HANDLE_VALUE (-1) rather
than a handle to a disk file.
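
Here is a minimal sketch of the idea, using Python's mmap module rather than the Win32 API directly. Passing a file number of -1 asks for an anonymous mapping with no named disk file behind it; as I understand it, on Windows the module gets this from CreateFileMapping, but treat that as my assumption. The 4096-byte size and the record contents are arbitrary.

```python
import mmap

SIZE = 4096  # arbitrary size for the example

# fileno=-1 requests an anonymous mapping: no named disk file behind it.
with mmap.mmap(-1, SIZE) as mem_file:
    # File-style access: write, seek, readline.
    mem_file.write(b"record one\n")
    mem_file.write(b"record two\n")
    mem_file.seek(0)
    first = mem_file.readline()

    # Memory-style access: change the bytes in place, with no read or
    # write call and no intermediate copy of the data.
    view = memoryview(mem_file)
    view[0:6] = b"RECORD"
    patched = bytes(view[0:10])
    view.release()  # must be released before the mapping is closed

print(first, patched)
```

The same object answers to both file operations and direct memory access, which is the point of the example. The OS is still free to page the mapping out, so "in memory" here means "not a named disk file", not "guaranteed resident in RAM".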
>You rely on your desktop and >= 1 multi-user system(s)? Are
>you actively using in-memory files on these? If so, could
>you describe config/usage of such facilities? Details are
>welcome ...
As far as I know, I'm not using them, other than possibly with PGP.
But whether I personally use them is not a good indication of whether
they exist.
>I readily admit that in-memory files "look" like real temp
>files to the programmer but strongly suspect that they
>"look" more like a highly specialized and sophisticated
>"buffer" to the OS. I will assume that, relevant to "common
>usage", they are kinda trivial until I have evidence to the
>contrary.
It doesn't matter whether the use is trivial - it matters only whether
the use exists at all.
>> I don't think
>> that SAS currently handles in-memory files in any special way, but it
>> might someday.
>
>I hope soon. Memory keeps getting cheaper. 64-bit systems
>make for drastically larger addressing capacities. Imagine
>the potential performance improvement if one could read/write
>all WORK datasets in-memory.
>
>Seems I recall a SAS-L thread suggesting that SAS should
>support read/write of SAS datasets from/to a work-space or
>library in memory. I'll leave you to conjecture why there's
>no evidence that SI is contemplating such support.
I haven't asked. Have you?
> Prost,
> Puddin
>
>*******************************************************
>***** Puddin' Man **** Pudding_Man-at-mail.com ********
>*******************************************************;
>
>"Anarchy makes no sense whatsoever until one takes a good,
> long, hard look at past/present governments, what they've done,
> and what they've -not- done."
> - Madman
>
>> >>> "pudding man" <pudding_man@MAIL.COM> 04/15/2004 7:50 AM >>>
>> ----- Original Message -----
>> From: "Jack Hamilton" <JackHamilton@firsthealth.com>
>> Date: Wed, 14 Apr 2004 11:03:26 -0600
>> To: SAS-L@LISTSERV.UGA.EDU, pudding_man@mail.com
>> Subject: Re: i/o memory buffering (was Duplicates & merge)
>>
>> > I agree with all you said, but I think you overlooked the part of my
>> > message which said "if a file exists only in memory".
>>
>> Well, My po' eyes didn't overlook it. I simply tried/failed
>> to relate it to the thrust of the thread. Also failed to
>> digest your notion of what constitutes a "file". Thereafter
>> I just naturally rambled on and on ... :-)
>>
>> I viewed the thread as having arisen from "common usage",
>> first in SAS dataset buffering, then in i/o processing
>> generally. Aside from SASFILE and _INFILE_, that's
>> purty much what I addressed. Saw little/nothing of a
>> theoretical nature in the discussion.
>>
>> > No disks involved
>> > (I can stipulate that the file is in real memory, not paged). Not a
>> > common situation, but it takes only one counterexample to disprove
>> > Paul's claim that "No file, flat or SAS, is read or written without
>> > buffering".
>>
>> If one takes Paul's statement in the context of "common
>> usage" as applied to common definitions of a "file", I don't
>> know that his claim has been disproven in the least. The
>> statement seems to accurately describe a major component of
>> "data processing" as it has evolved thru the decades.
>>
>> I am willing to concede that, outside "common usage", and
>> with a definition of "file" that I do not support (i.e. "any
>> aggregation of data"), one can write a program to create a
>> "file" exclusively in memory. Something like this might be
>> applicable to certain simulations.
>>
>> Of course, if we stretch one more definition a bit ("any
>> intermediate storage entity" ??), such "file" might
>> kinda/sorta resemble a "buffer" to some folks ...
>>
>> "That the Gods of Semantics Should Spare Us These Tortures!" <dg>
>>
>> ...
>>
>> Zalut,
>> Puddin'
>>
>> PS: Me/myself/I have always subscribed to something like:
>>
>> File - An organized collection of records and
>> fields stored on an external device.
>>
>> *******************************************************
>> ***** Puddin' Man **** Pudding_Man-at-mail.com ********
>> *******************************************************;
>>
>> > --
>> > JackHamilton@FirstHealth.com
>> > Manager, Technical Development
>> > Metrics Department, First Health
>> > West Sacramento, California USA
>> >
>> > >>> "pudding man" <pudding_man@mail.com> 04/14/2004 9:55 AM >>>
>> > As Paul stated, the issue of disk-file i/o and memory-buffering
>> > (as I understand it) is not particular to SAS.
>> >
>> > When a running program initiates, say, a READ against
>> > a file stored on disk, (as best I recall) something like
>> > the following occurs:
>> >
>> > a.) The pgm requests i/o from the OS. This is done with
>> > an i/o "interrupt". Execution of the pgm is put on
>> > "hold".
>> > b.) The OS requests the i/o sub-system to queue the data.
>> > c.) The i/o sub/system (i.e. elements of bios, interrupt
>> > handler, disk-driver, disk-controller, disk device)
>> > positions the disk components (disk platter, read/
>> > write arm) at the data. This is the "Slow"
>> > (electro-mechanical) part of the process.
>> > d.) Needed data is copied into a memory buffer.
>> > e.) pgm execution resumes with the data available for
>> > read operations.
>> >
>> > The pgm executes machine instructions. Operators and
>> > operands. The operands address memory. The cpu cannot
>> > directly address data on a disk device? This is done
>> > by the i/o sub-system?
>> >
>> > Is this generic process described in the SAS doc? I
>> > dunno. Don't have time to look.
>> >
>> > Per Paul, the OLDoc is potentially misleading. The _INFILE_
>> > "Input Buffer" holds a single rec. It is a useful concept at
>> > some levels. Of course, the OS-allocated Input Buffer can
>> > reasonably be expected to store many blocks/recs. It follows
>> > that the _INFILE_ Input Buffer and the generic OS-allocated Input
>> > Buffer are 2 separate entities. Perhaps this was what Peter
>> > C was describing.
>> >
>> > I suspect SASFILE is just a means by which SAS requests
>> > of the OS an Input Buffer as large as or larger than
>> > the specified file ...
>> >
>> > I know of no disk/tape-file i/o on conventional systems that
>> > doesn't employ some form of memory buffering. SAS allocates
>> > all manner of mem-buffers.
>> >
>> > Skoal,
>> > Puddin'
>> >
>> > *******************************************************
>> > ***** Puddin' Man **** Pudding_Man-at-mail.com ********
>> > *******************************************************;
>> >
>> > "My momma told me when I was yong bwah. She said:
>> > "If you take a notion to walk right up and piss in the
>> > snake-pit, you gotta 'spect that somethin's liable
>> > to jump up and bitecha!"
>> > And she wasn't just talking 'bout reptiles ... "
>> > - Madman
>> >
>> > From: Jack Hamilton <JackHamilton@FIRSTHEALTH.COM>
>> > To: SAS-L@LISTSERV.UGA.EDU
>> > CC:
>> > Subject: Re: Duplicates & merge
>> > Date: Tue, 13 Apr 2004 15:07:27 -0600
>> >
>> > I would not be inclined to think that the PDV is the input buffer for
>> > SAS files. There's only one PDV, but there might be several input
>> > files, containing variables with conflicting names and sizes. In the
>> > case of a MERGE or SET with a BY, there has to be at least one record
>> > from each data set ready to be placed into the PDV, before it is
>> > actually placed in the PDV. LAST. variables also require data to be
>> > available to SAS without yet being placed into the PDV where your
>> > program can see it.
>> >
>> > I also disagree with Paul's statement "No file, flat or SAS, is read
>> > or written without buffering and hence without creating an input and/or
>> > output buffer in the first place." If a flat file exists only in
>> > memory (supported by some but not all operating systems), there's no
>> > logical reason for it to have a buffer - SAS, or whatever program is
>> > processing it, could manipulate it directly in memory without ever
>> > making a copy. SAS might not, for various good reasons, choose to do
>> > so, but that's not the same as its not being possible.
>> >
>> >
>> >
>> > --
>> > JackHamilton@FirstHealth.com
>
--
JackHamilton@FirstHealth.com
Manager, Technical Development
Metrics Department, First Health
West Sacramento, California USA