What is Hashing?
Hashing is a technique used for storing and retrieving information as quickly as possible. It is used to perform optimal searches and is useful in implementing symbol tables
Why Hashing?
In the Trees chapter we saw that balanced binary search trees support operations such as insert, delete and search in O( logn) time. In applications, if we need these operations in O( 1), then hashing provides a way.Understanding Hashing
We can treat Array has Hash table. But there is a problem.Example: Write an algo for printing the first repeated character if there are duplicated elements in it.
- Simple and Brute Force Way :
- Given a string, for each character check whether that character is repeated or not. The time complexity of this approach is O( n2) with O( 1) space complexity.
- Better Way : Use Array
- Suppose the number of characters are ASCII char which is 256.
- We create and array of length 256
- If any char is repeated increment its count
Problem is :Suppose given array can have combination of numbers. In this case the set of possible values is infinity.
The process of mapping the keys to locations is called hashing.(linked list can grow dynamically)
Components of Hashing:
Hashing has four key components:
1) Hash Table :
Hash table is a generalization of array.
The basic idea behind hashing is to take a field in a record, known as the key, and convert it through some fixed process to a numeric value, known as the hash key, which represents the position to either store or find an item in the table.
Why Hash Table?
That means, given a key k, we find the element whose key is k by just looking in the kth position of the array. This is called direct addressing. if we have less locations and more possible keys, then simple array implementation is not enough.
In these cases one option is to use hash tables.
Hash table or hash map is a data structure that stores the keys and their associated values, and hash table uses a hash function to map keys to their associated values. The general convention is that we use a hash table when the number of keys actually stored is small relative to the number of possible keys.
In these cases one option is to use hash tables.
Hash table or hash map is a data structure that stores the keys and their associated values, and hash table uses a hash function to map keys to their associated values. The general convention is that we use a hash table when the number of keys actually stored is small relative to the number of possible keys.
2) Hash Functions :
The fixed process to convert a key to a hash key is known as a hash function.
The hash function is used to transform the key into the index. elements, a hash function that maps each item into a unique slot is referred to as a perfect hash function. Given an arbitrary collection of elements, there is no systematic way to construct a perfect hash function.
Our goal is to create a hash function that minimizes the number of collisions, is easy to compute, and evenly distributes the elements in the hash table.
Characteristics of Good Hash Functions :
A good hash function should have the following characteristics:
- Minimize collision
- Be easy and quick to compute
- Distribute key values evenly in the hash table
- Use all the information provided in the key
- Have a high load factor for a given set of keys
3) Collisions :
Hash functions are used to map each key to a different address space, but practically it is not possible to create such a hash function and the problem is called collision. Collision is the condition where two records are stored in the same location.
No comments:
Post a Comment