=head1 NAME
-lh_new, lh_free, lh_insert, lh_delete, lh_retrieve, lh_doall,
-lh_doall_arg, lh_error - dynamic hash table
+lh_new, lh_free, lh_insert, lh_delete, lh_retrieve, lh_doall, lh_doall_arg, lh_error - dynamic hash table
=head1 SYNOPSIS
int lh_error(LHASH *table);
- typedef int (*LHASH_COMP_FN_TYPE)(void *, void *);
- typedef unsigned long (*LHASH_HASH_FN_TYPE)(void *);
- typedef void (*LHASH_DOALL_FN_TYPE)(void *);
- typedef void (*LHASH_DOALL_ARG_FN_TYPE)(void *, void *);
+ typedef int (*LHASH_COMP_FN_TYPE)(const void *, const void *);
+ typedef unsigned long (*LHASH_HASH_FN_TYPE)(const void *);
+ typedef void (*LHASH_DOALL_FN_TYPE)(const void *);
+ typedef void (*LHASH_DOALL_ARG_FN_TYPE)(const void *, const void *);
=head1 DESCRIPTION
can be arbitrary structures. Usually they consist of key and value
fields.
-lh_new() creates a new B<LHASH> structure. B<hash> takes a pointer to
-the structure and returns an unsigned long hash value of its key
-field. The hash value is normally truncated to a power of 2, so make
-sure that your hash function returns well mixed low order
-bits. B<compare> takes two arguments, and returns 0 if their keys are
-equal, non-zero otherwise. If your hash table will contain items of
-some uniform type, and similarly the B<hash> and B<compare> callbacks
-hash or compare the same type, then the B<DECLARE_LHASH_HASH_FN> and
+lh_new() creates a new B<LHASH> structure to store arbitrary data
+entries, and provides the 'hash' and 'compare' callbacks to be used in
+organising the table's entries. The B<hash> callback takes a pointer
+to a table entry as its argument and returns an unsigned long hash
+value for its key field. The hash value is normally truncated to a
+power of 2, so make sure that your hash function returns well mixed
+low order bits. The B<compare> callback takes two arguments (pointers
+to two hash table entries), and returns 0 if their keys are equal,
+non-zero otherwise. If your hash table will contain items of some
+particular type and the B<hash> and B<compare> callbacks hash/compare
+these types, then the B<DECLARE_LHASH_HASH_FN> and
B<IMPLEMENT_LHASH_COMP_FN> macros can be used to create callback
-wrappers of the prototypes required in lh_new(). These provide
+wrappers of the prototypes required by lh_new(). These provide
per-variable casts before calling the type-specific callbacks written
-by the application author. These macros are defined as;
+by the application author. These macros, as well as those used for
+the "doall" callbacks, are defined as;
#define DECLARE_LHASH_HASH_FN(f_name,o_type) \
- unsigned long f_name##_LHASH_HASH(void *);
+ unsigned long f_name##_LHASH_HASH(const void *);
#define IMPLEMENT_LHASH_HASH_FN(f_name,o_type) \
- unsigned long f_name##_LHASH_HASH(void *arg) { \
+ unsigned long f_name##_LHASH_HASH(const void *arg) { \
o_type a = (o_type)arg; \
return f_name(a); }
#define LHASH_HASH_FN(f_name) f_name##_LHASH_HASH
#define DECLARE_LHASH_COMP_FN(f_name,o_type) \
- int f_name##_LHASH_COMP(void *, void *);
+ int f_name##_LHASH_COMP(const void *, const void *);
#define IMPLEMENT_LHASH_COMP_FN(f_name,o_type) \
- int f_name##_LHASH_COMP(void *arg1, void *arg2) { \
+ int f_name##_LHASH_COMP(const void *arg1, const void *arg2) { \
o_type a = (o_type)arg1; \
o_type b = (o_type)arg2; \
return f_name(a,b); }
#define LHASH_COMP_FN(f_name) f_name##_LHASH_COMP
-An example of a hash table storing (pointers to) a structure type 'foo'
+ #define DECLARE_LHASH_DOALL_FN(f_name,o_type) \
+ void f_name##_LHASH_DOALL(const void *);
+ #define IMPLEMENT_LHASH_DOALL_FN(f_name,o_type) \
+ void f_name##_LHASH_DOALL(const void *arg) { \
+ o_type a = (o_type)arg; \
+ f_name(a); }
+ #define LHASH_DOALL_FN(f_name) f_name##_LHASH_DOALL
+
+ #define DECLARE_LHASH_DOALL_ARG_FN(f_name,o_type,a_type) \
+ void f_name##_LHASH_DOALL_ARG(const void *, const void *);
+ #define IMPLEMENT_LHASH_DOALL_ARG_FN(f_name,o_type,a_type) \
+ void f_name##_LHASH_DOALL_ARG(const void *arg1, const void *arg2) { \
+ o_type a = (o_type)arg1; \
+ a_type b = (a_type)arg2; \
+ f_name(a,b); }
+ #define LHASH_DOALL_ARG_FN(f_name) f_name##_LHASH_DOALL_ARG
+
+An example of a hash table storing (pointers to) structures of type 'STUFF'
could be defined as follows;
- unsigned long foo_hash(foo *tohash);
- int foo_compare(foo *arg1, foo *arg2);
- static IMPLEMENT_LHASH_HASH_FN(foo_hash, foo *)
- static IMPLEMENT_LHASH_COMP_FN(foo_compare, foo *);
+ /* Calculates the hash value of 'tohash' (implemented elsewhere) */
+ unsigned long STUFF_hash(const STUFF *tohash);
+ /* Orders 'arg1' and 'arg2' (implemented elsewhere) */
+ int STUFF_cmp(const STUFF *arg1, const STUFF *arg2);
+ /* Create the type-safe wrapper functions for use in the LHASH internals */
+ static IMPLEMENT_LHASH_HASH_FN(STUFF_hash, const STUFF *)
+ static IMPLEMENT_LHASH_COMP_FN(STUFF_cmp, const STUFF *);
/* ... */
int main(int argc, char *argv[]) {
- LHASH *hashtable = lh_new(LHASH_HASH_FN(foo_hash),
- LHASH_COMP_FN(foo_compare));
+ /* Create the new hash table using the hash/compare wrappers */
+ LHASH *hashtable = lh_new(LHASH_HASH_FN(STUFF_hash),
+ LHASH_COMP_FN(STUFF_cmp));
/* ... */
}
lh_free() frees the B<LHASH> structure B<table>. Allocated hash table
entries will not be freed; consider using lh_doall() to deallocate any
-remaining entries in the hash table.
+remaining entries in the hash table (see below).
lh_insert() inserts the structure pointed to by B<data> into B<table>.
If there already is an entry with the same key, the old value is
pointer to a fully populated structure.
lh_doall() will, for every entry in the hash table, call B<func> with
-the data item as parameters.
-This function can be quite useful when used as follows:
- void cleanup(STUFF *a)
- { STUFF_free(a); }
- lh_doall(hash,(LHASH_DOALL_FN_TYPE)cleanup);
- lh_free(hash);
-This can be used to free all the entries. lh_free() then cleans up the
-'buckets' that point to nothing. When doing this, be careful if you
-delete entries from the hash table in B<func>: the table may decrease
-in size, moving item that you are currently on down lower in the hash
-table. This could cause some entries to be skipped. The best
-solution to this problem is to set hash-E<gt>down_load=0 before you
-start. This will stop the hash table ever being decreased in size.
-
-lh_doall_arg() is the same as lh_doall() except that B<func> will
-be called with B<arg> as the second argument and B<func> should be
-of type B<LHASH_DOALL_ARG_FN_TYPE> (a callback prototype that is
-passed an extra argument).
-
+the data item as its parameter. For lh_doall() and lh_doall_arg(),
+function pointer casting should be avoided in the callbacks (see
+B<NOTE>) - instead, either declare the callbacks to match the
+prototype required in lh_new() or use the decare/implement macros to
+create type-safe wrappers that cast variables prior to calling your
+type-specific callbacks. An example of this is illustrated here where
+the callback is used to cleanup resources for items in the hash table
+prior to the hashtable itself being deallocated:
+
+ /* Cleans up resources belonging to 'a' (this is implemented elsewhere) */
+ void STUFF_cleanup(STUFF *a);
+ /* Implement a prototype-compatible wrapper for "STUFF_cleanup" */
+ IMPLEMENT_LHASH_DOALL_FN(STUFF_cleanup, STUFF *)
+ /* ... then later in the code ... */
+ /* So to run "STUFF_cleanup" against all items in a hash table ... */
+ lh_doall(hashtable, LHASH_DOALL_FN(STUFF_cleanup));
+ /* Then the hash table itself can be deallocated */
+ lh_free(hashtable);
+
+When doing this, be careful if you delete entries from the hash table
+in your callbacks: the table may decrease in size, moving the item
+that you are currently on down lower in the hash table - this could
+cause some entries to be skipped during the iteration. The second
+best solution to this problem is to set hash-E<gt>down_load=0 before
+you start (which will stop the hash table ever decreasing in size).
+The best solution is probably to avoid deleting items from the hash
+table inside a "doall" callback!
+
+lh_doall_arg() is the same as lh_doall() except that B<func> will be
+called with B<arg> as the second argument and B<func> should be of
+type B<LHASH_DOALL_ARG_FN_TYPE> (a callback prototype that is passed
+both the table entry and an extra argument). As with lh_doall(), you
+can instead choose to declare your callback with a prototype matching
+the types you are dealing with and use the declare/implement macros to
+create compatible wrappers that cast variables before calling your
+type-specific callbacks. An example of this is demonstrated here
+(printing all hash table entries to a BIO that is provided by the
+caller):
+
+ /* Prints item 'a' to 'output_bio' (this is implemented elsewhere) */
+ void STUFF_print(const STUFF *a, BIO *output_bio);
+ /* Implement a prototype-compatible wrapper for "STUFF_print" */
+ static IMPLEMENT_LHASH_DOALL_ARG_FN(STUFF_print, const STUFF *, BIO *)
+ /* ... then later in the code ... */
+ /* Print out the entire hashtable to a particular BIO */
+ lh_doall_arg(hashtable, LHASH_DOALL_ARG_FN(STUFF_print), logging_bio);
+
lh_error() can be used to determine if an error occurred in the last
operation. lh_error() is a macro.
lh_free(), lh_doall() and lh_doall_arg() return no values.
+=head1 NOTE
+
+The various LHASH macros and callback types exist to make it possible
+to write type-safe code without resorting to function-prototype
+casting - an evil that makes application code much harder to
+audit/verify and also opens the window of opportunity for stack
+corruption and other hard-to-find bugs. It also, apparently, violates
+ANSI-C.
+
+The LHASH code regards table entries as constant data. As such, it
+internally represents lh_insert()'d items with a "const void *"
+pointer type. This is why callbacks such as those used by lh_doall()
+and lh_doall_arg() declare their prototypes with "const", even for the
+parameters that pass back the table items' data pointers - for
+consistency, user-provided data is "const" at all times as far as the
+LHASH code is concerned. However, as callers are themselves providing
+these pointers, they can choose whether they too should be treating
+all such parameters as constant.
+
+As an example, a hash table may be maintained by code that, for
+reasons of encapsulation, has only "const" access to the data being
+indexed in the hash table (ie. it is returned as "const" from
+elsewhere in their code) - in this case the LHASH prototypes are
+appropriate as-is. Conversely, if the caller is responsible for the
+life-time of the data in question, then they may well wish to make
+modifications to table item passed back in the lh_doall() or
+lh_doall_arg() callbacks (see the "STUFF_cleanup" example above). If
+so, the caller can either cast the "const" away (if they're providing
+the raw callbacks themselves) or use the macros to declare/implement
+the wrapper functions without "const" types.
+
+Callers that only have "const" access to data they're indexing in a
+table, yet declare callbacks without constant types (or cast the
+"const" away themselves), are therefore creating their own risks/bugs
+without being encouraged to do so by the API. On a related note,
+those auditing code should pay special attention to any instances of
+DECLARE/IMPLEMENT_LHASH_DOALL_[ARG_]_FN macros that provide types
+without any "const" qualifiers.
+
=head1 BUGS
lh_insert() returns B<NULL> both for success and error.
probably worth changing your hash function if this is the case because
even if your hash table has 10 items in a 'bucket', it can be searched
with 10 B<unsigned long> compares and 10 linked list traverses. This
-will be much less expensive that 10 calls to you compare function.
+will be much less expensive that 10 calls to your compare function.
lh_strhash() is a demo string hashing function: