SYNOPSIS

    #include "libut/ut.h"
    
    HASH_ADD_INT( head, tmp, intmbr, add);
    HASH_FIND_INT(head, tmp, intmbr, findintptr);
    
    HASH_ADD_STR( head, tmp, strmbr, add);
    HASH_FIND_STR(head, tmp, strmbr, findstrptr);
    
    HASH_DEL( head, tmp, delptr);
    
    /*  Generalized form of ADD/FIND */
    HASH_ADD( head, tmp, keymbr, add, keylen);
    HASH_FIND(head, tmp, keymbr, keyptr, keylen);

DESCRIPTION

These macros manipulate hashes. The hash allows elements (C structures) to be added, deleted or looked up (by their key) in constant-time. It is also possible to iterate over the elements in the hash. Every element in a hash must have a unique key.

Since these are macros, they do not require their arguments to be of a particular type. However, the structures to be formed into a hash must have a member named next which points to the next structure (used to iterate over the elements of the hash as if they were a singly-linked list); and a member defined as UT_hash_handle hh. E.g.,

    typedef struct user_t {
        char name[10];
        int id;
        struct user_t *next;
        UT_hash_handle hh;
    } user_t;

To form these structures into a hash, a hash head is also required. The head is a pointer to the first structure. It must be initialized to NULL.

    user_t *user_list = NULL;

The sections below explain the hash macros using this user_t example structure. The id member, a unique integer for each user, is to be used as the key for the example hash.

HASH_ADD

This is how to add an element to a hash:

    user_t *user, *tmp;
    
    /* create the element. */
    user = malloc(sizeof(user_t));
    strcpy(user->name, "Joe");
    user->id = 1;
    user->next = NULL;
    
    /* add the element to the hash */
    HASH_ADD_INT( user_list, tmp, id, user );

Note that the tmp variable was created of the same structure-pointer type as the hash head and the new element. It is used a temporary variable and need not be initialized.

The id argument specifies the structure member which acts as the key. Since this is a macro, there is no problem specifying the structure member by itself; the macro expands to valid C syntax, e.g., tmp->id.

The macro HASH_ADD_INT is used because this hash uses an integer (the user->id member) as its key. If the key had been a string, HASH_ADD_STR would be used.

Generalized form of HASH_ADD

HASH_ADD_INT and HASH_ADD_STR are shorthand for the generalized macro HASH_ADD:

    HASH_ADD( user_list, tmp, id, user, sizeof(int) );

The generalized form works for any type of key (not just integer or string). The length of the key in bytes is given as the last argument.

HASH_DEL

This is how to delete an element (given it's pointer) from a hash:

    /* suppose user was previously HASH_ADD'ed to user_list */
    user_t *user, *tmp;
    
    /* delete the element from the list */
    HASH_DEL( user_list, tmp, user );

The tmp variable was created of the same structure-pointer type as the hash head and the element to be deleted. It is used a temporary variable and need not be initialized.

Deleting an element from a hash only unlinks it from the hash. It does not free it.

If a hash is no longer needed, deleting each of its elements will (upon deletion of the final element) free internally allocated memory used to manage the hash.

HASH_FIND

This macro looks for an element in the hash having the specified key.

    /* suppose user Joe was previously HASH_ADD'ed to user_list */
    user_t *user, *tmp;
    int sought_id = 1;
    
    /* find user with id 1 in the list */
    HASH_FIND_INT( user_list, tmp, id, &sought_id);

The tmp variable was created of the same structure-pointer type as the hash head and the element being sought. It is used a temporary variable and need not be initialized.

Upon completion of this macro, tmp points to the element being sought, or is NULL if an element with the specified key wasn't found in the list.

Note that the last argument &sought_id is a pointer to the sought key.

The macro HASH_FIND_INT is used because this hash uses an integer (the user->id member) as its key. If the key had been a string, HASH_FIND_STR would be used.

Generalized form of HASH_FIND

HASH_FIND_INT and HASH_FIND_STR are shorthand for the generalized macro HASH_FIND:

    HASH_FIND( user_list, tmp, id, &sought_id, sizeof(int) );

The generalized form works for any type of key (not just integer or string). The length of the key in bytes is given as the last argument.

NOTES

Internally the Bernstein hash is used to distribute the elements into buckets. The number of buckets is dynamically expanded to keep scanning within a bucket to 10 items or fewer. Ultimately the performance of any hash is affected by the quality of the hash algorithm on the particular values of the key.

RELATED CONTROL PORT COMMANDS

The mem -h control port command prints information about the hashes in use.

AUTHOR

Troy D. Hanson <thanson@users.sourceforge.net>