Date: Mon, 22 Jan 2007 08:55:57 -0700
Reply-To: Alan Churchill <SASL001@SAVIAN.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Alan Churchill <SASL001@SAVIAN.NET>
Subject: Re: Hashing and memory
In-Reply-To: <1169474880.816401.230640@11g2000cwr.googlegroups.com>
Content-Type: text/plain; charset="iso-8859-1"
Thanks Chris. This confirms what I observed in testing and what I assumed
was happening.
Alan
Alan Churchill
Savian "Bridging SAS and Microsoft Technologies"
www.savian.net
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
chris@OVIEW.CO.UK
Sent: Monday, January 22, 2007 7:08 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Hashing and memory
Hi Alan,
Hash functions are generally one-way, i.e. you can't retrieve the
original key value from the hash (see
http://en.wikipedia.org/wiki/Hash_function ). This is necessarily the
case if the key domain is larger than the hash domain, since several
keys must then share a hash value, although the Wikipedia article
mentions specialized one-to-one hashes that can be reversed to recover
the original data.
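
To illustrate why a larger key domain rules out reversal, here is a
minimal Python sketch. The tiny_hash function and the 8-bucket size are
made up for the example and are not how SAS or Perl hash their keys; the
point is only that 20 keys squeezed into 8 hash values must collide, so
a hash value alone cannot identify the key it came from.

```python
def tiny_hash(key: str, buckets: int = 8) -> int:
    """Toy hash (hypothetical): sum of the key's bytes modulo the bucket count."""
    return sum(key.encode("utf-8")) % buckets

# 20 distinct keys, but only 8 possible hash values.
keys = [f"key{i}" for i in range(20)]

# Group the keys by the hash value they map to.
by_hash = {}
for k in keys:
    by_hash.setdefault(tiny_hash(k), []).append(k)

# Any hash value shared by more than one key cannot be reversed unambiguously.
collisions = {h: ks for h, ks in by_hash.items() if len(ks) > 1}
for h, ks in sorted(collisions.items()):
    print(h, ks)
```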
So, in most cases, a general hash table (e.g. a Perl hash) must store the
original key value as well as the hashed key value. This will depend
on the application, though: you may not always want or need to
recover the key value, assuming you can prevent or ignore collisions.
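
As a rough sketch of what storing the original key looks like, here is a
toy chained hash table in Python. TinyHashTable and its byte-sum hash are
hypothetical names for illustration only and say nothing about the
internals of the SAS hash object. Each bucket keeps (key, value) pairs so
that a lookup can compare the stored key and tell colliding entries
apart, which is also why a large key still costs memory.

```python
class TinyHashTable:
    """Toy chained hash table (hypothetical); keeps the full key in each entry."""

    def __init__(self, buckets=4):
        self._buckets = [[] for _ in range(buckets)]

    def _index(self, key):
        # Toy hash: byte sum modulo the bucket count, so anagrams collide.
        return sum(key.encode("utf-8")) % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:          # same key: overwrite the value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))        # new key: store key AND value

    def get(self, key):
        # The stored key is what distinguishes entries sharing a bucket.
        for stored_key, value in self._buckets[self._index(key)]:
            if stored_key == key:
                return value
        raise KeyError(key)


table = TinyHashTable()
table.put("ab", 1)
table.put("ba", 2)  # same byte sum as "ab", so it lands in the same bucket
```

Without the stored keys, the lookups for "ab" and "ba" could not be told
apart, because both map to the same bucket.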
HTH,
Chris.
--------------------------------------------------------
Elvis SAS Log Analyser - http://www.oview.co.uk/elvis
--------------------------------------------------------
Alan Churchill wrote:
> All,
>
>
>
> I am trying to understand the memory footprint of hashes and I am hoping
> that folks on this list can help.
>
>
>
> If I have a key value(s) and some data, the key(s) gets hashed and is
> converted to a numeric but the data seems to be simply loaded. Hence, the
> memory footprint should be the data size plus the hash table's footprint.
> First of all, is that correct? If so, then memory reductions should exist if
> I have a really large key value, or does the hash still have to hold the
> original value? I thought that the hash will be able to auto-magically
> reverse itself when the hash is complete.
>
>
>
> Finally, is there a way to reduce the memory footprint of a hash below the
> size of the dataset? Can I process data by using a point and thereby
> minimize the footprint to a more reasonable amount?
>
>
>
> Thanks for helping. I looked on the web and didn't find anything in a
> cursory search.
>
>
>
> Alan
>
>
>
> Alan Churchill
> Savian "Bridging SAS and Microsoft Technologies"
> <http://www.savian.net/> www.savian.net