summaryrefslogtreecommitdiffstats
path: root/doc/lhash.doc
diff options
context:
space:
mode:
Diffstat (limited to 'doc/lhash.doc')
-rw-r--r--doc/lhash.doc151
1 files changed, 0 insertions, 151 deletions
diff --git a/doc/lhash.doc b/doc/lhash.doc
deleted file mode 100644
index 5a2aeb4b38..0000000000
--- a/doc/lhash.doc
+++ /dev/null
@@ -1,151 +0,0 @@
-The LHASH library.
-
-I wrote this library in 1991 and have since forgotten why I called it lhash.
-It implements a hash table from an article I read at the
-time from 'Communications of the ACM'. What makes this hash
-table different is that as the table fills, the hash table is
-increased (or decreased) in size via realloc().
-When a 'resize' is done, instead of all hashes being redistributed over
-twice as many 'buckets', one bucket is split. So when an 'expand' is done,
-there is only a minimal cost to redistribute some values. Subsequent
-inserts will cause more single 'bucket' redistributions but there will
-never be a sudden large cost due to redistributing all the 'buckets'.
-
-The state for a particular hash table is kept in the LHASH structure.
-The LHASH structure also records statistics about most aspects of accessing
-the hash table. This is mostly a legacy of my writing this library for
-the reasons of implementing what looked like a nice algorithm rather than
-for a particular software product.
-
-Internal stuff you probably don't want to know about.
-The decision to increase or decrease the hash table size is made depending
-on the 'load' of the hash table. The load is the number of items in the
-hash table divided by the size of the hash table. The default values are
-as follows. If (hash->up_load < load) => expand.
-if (hash->down_load > load) => contract. The 'up_load' has a default value of
-1 and 'down_load' has a default value of 2. These numbers can be modified
-by the application by just playing with the 'up_load' and 'down_load'
-variables. The 'load' is kept in a form which is multiplied by 256. So
-hash->up_load=8*256; will cause a load of 8 to be set.
-
-If you are interested in performance the field to watch is
-num_comp_calls. The hash library keeps track of the 'hash' value for
-each item so when a lookup is done, the 'hashes' are compared, if
-there is a match, then a full compare is done, and
-hash->num_comp_calls is incremented. If num_comp_calls is not equal
-to num_delete plus num_retrieve it means that your hash function is
-generating hashes that are the same for different values. It is
-probably worth changing your hash function if this is the case because
-even if your hash table has 10 items in a 'bucked', it can be searched
-with 10 'unsigned long' compares and 10 linked list traverses. This
-will be much less expensive that 10 calls to you compare function.
-
-LHASH *lh_new(
-unsigned long (*hash)(),
-int (*cmp)());
- This function is used to create a new LHASH structure. It is passed
- function pointers that are used to store and retrieve values passed
- into the hash table. The 'hash'
- function is a hashing function that will return a hashed value of
- it's passed structure. 'cmp' is passed 2 parameters, it returns 0
- is they are equal, otherwise, non zero.
- If there are any problems (usually malloc failures), NULL is
- returned, otherwise a new LHASH structure is returned. The
- hash value is normally truncated to a power of 2, so make sure
- that your hash function returns well mixed low order bits.
-
-void lh_free(
-LHASH *lh);
- This function free()s a LHASH structure. If there is malloced
- data in the hash table, it will not be freed. Consider using the
- lh_doall function to deallocate any remaining entries in the hash
- table.
-
-char *lh_insert(
-LHASH *lh,
-char *data);
- This function inserts the data pointed to by data into the lh hash
- table. If there is already and entry in the hash table entry, the
- value being replaced is returned. A NULL is returned if the new
- entry does not clash with an entry already in the table (the normal
- case) or on a malloc() failure (perhaps I should change this....).
- The 'char *data' is exactly what is passed to the hash and
- comparison functions specified in lh_new().
-
-char *lh_delete(
-LHASH *lh,
-char *data);
- This routine deletes an entry from the hash table. The value being
- deleted is returned. NULL is returned if there is no such value in
- the hash table.
-
-char *lh_retrieve(
-LHASH *lh,
-char *data);
- If 'data' is in the hash table it is returned, else NULL is
- returned. The way these routines would normally be uses is that a
- dummy structure would have key fields populated and then
- ret=lh_retrieve(hash,&dummy);. Ret would now be a pointer to a fully
- populated structure.
-
-void lh_doall(
-LHASH *lh,
-void (*func)(char *a));
- This function will, for every entry in the hash table, call function
- 'func' with the data item as parameters.
- This function can be quite useful when used as follows.
- void cleanup(STUFF *a)
- { STUFF_free(a); }
- lh_doall(hash,cleanup);
- lh_free(hash);
- This can be used to free all the entries, lh_free() then
- cleans up the 'buckets' that point to nothing. Be careful
- when doing this. If you delete entries from the hash table,
- in the call back function, the table may decrease in size,
- moving item that you are
- currently on down lower in the hash table. This could cause
- some entries to be skipped. The best solution to this problem
- is to set lh->down_load=0 before you start. This will stop
- the hash table ever being decreased in size.
-
-void lh_doall_arg(
-LHASH *lh;
-void(*func)(char *a,char *arg));
-char *arg;
- This function is the same as lh_doall except that the function
- called will be passed 'arg' as the second argument.
-
-unsigned long lh_strhash(
-char *c);
- This function is a demo string hashing function. Since the LHASH
- routines would normally be passed structures, this routine would
- not normally be passed to lh_new(), rather it would be used in the
- function passed to lh_new().
-
-The next three routines print out various statistics about the state of the
-passed hash table. These numbers are all kept in the lhash structure.
-
-void lh_stats(
-LHASH *lh,
-FILE *out);
- This function prints out statistics on the size of the hash table,
- how many entries are in it, and the number and result of calls to
- the routines in this library.
-
-void lh_node_stats(
-LHASH *lh,
-FILE *out);
- For each 'bucket' in the hash table, the number of entries is
- printed.
-
-void lh_node_usage_stats(
-LHASH *lh,
-FILE *out);
- This function prints out a short summary of the state of the hash
- table. It prints what I call the 'load' and the 'actual load'.
- The load is the average number of data items per 'bucket' in the
- hash table. The 'actual load' is the average number of items per
- 'bucket', but only for buckets which contain entries. So the
- 'actual load' is the average number of searches that will need to
- find an item in the hash table, while the 'load' is the average number
- that will be done to record a miss.