[Coco] [Color Computer] Re: the final stage of optimization

Steven Hirsch snhirsch at gmail.com
Tue Nov 24 07:45:14 EST 2009


On Mon, 23 Nov 2009, Wayne Campbell wrote:

> James (it's nice to know your name),
>
> The program is a decompiler, unpacker, decoder, however you want to say 
> it. Basic09 compiles into intermediate code, and I am using that to 
> reconstruct the source code.
>
> I am still working at understanding exactly what a hash is, but I think 
> I get it. I was thinking about an idea to create an array for the 
> literals that included a field that would contain the numeric value of a 
> string. That value would be the total of the numeric values of each 
> character in the string. I'm not sure if that's what you mean by "packed 
> ascii" or not, but it sounds like it might be the same thing. Would it 
> be feasible, though? What are the chances that 2 different strings, even 
> of differing lengths, could have the same value?

The chances of this happening are very high.  It's called a "collision" in 
the terminology of a hash table.  I'd suggest researching the theory and 
implementation of hash tables on Wikipedia or a CS-related site.

If the key-calculation algorithm is working well (and a simple summation 
probably is not a good choice), the hashed items will be spread evenly 
over a limited number of "buckets".  Each bucket has a short list of data 
items (or references to those items if, e.g. they actually live on disk) 
that hash to that key.  At that point, you typically do a linear search of 
the list to directly find the target item.  Note that this is only one 
possible implementation of a hash table!

At best, a hash table can approach constant-time lookup.  At worst, it 
degenerates to an O(n) linear search.  Typically, you'll be more towards 
the constant-time end of things.

Steve

-- 



More information about the Coco mailing list