Matlab – Hash tables in MATLAB

hashtableMATLAB

Does MATLAB have any support for hash tables?

Some background

I am working on a problem in Matlab that requires a scale-space representation of an image. To do this I create a 2-D Gaussian filter with variance sigma*s^k for k in some range., and then I use each one in turn to filter the image. Now, I want some sort of mapping from k to the filtered image.

If k were always an integer, I'd simply create a 3D array such that:

arr[k] = <image filtered with k-th guassian>

However, k is not necessarily an integer, so I can't do this. What I thought of doing was keeping an array of ks such that:

arr[find(array_of_ks_ = k)] = <image filtered with k-th guassian>

Which seems pretty good at first thought, except I will be doing this lookup potentially a few thousand times with about 20 or 30 values of k, and I fear that this will hurt performance.

I wonder if I wouldn't be better served doing this with a hash table of some sort so that I would have a lookup time that is O(1) instead of O(n).

Now, I know that I shouldn't optimize prematurely, and I may not have this problem at all, but remember, this is just the background, and there may be cases where this is really the best solution, regardless of whether it is the best solution for my problem.

Best Answer

Consider using MATLAB's map class: containers.Map. Here is a brief overview:

Creation:

>> keys = {'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', ...
  'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec', 'Annual'};

>> values = {327.2, 368.2, 197.6, 178.4, 100.0,  69.9, ...
  32.3,  37.3,  19.0,  37.0,  73.2, 110.9, 1551.0};

>> rainfallMap = containers.Map(keys, values)

rainfallMap = 
  containers.Map handle
  Package: containers

  Properties:
        Count: 13
      KeyType: 'char'
    ValueType: 'double'
  Methods, Events, Superclasses

Lookup:
```
x = rainfallMap('Jan');
```
Assign:
```
rainfallMap('Jan') = 0;
```
Add:
```
rainfallMap('Total') = 999;
```
Remove:
```
rainfallMap.remove('Total')
```

Inspect:

values = rainfallMap.values;
keys = rainfallMap.keys;
sz = rainfallMap.size;

Check key:

if rainfallMap.isKey('Today')
    ...
end

Related Solutions

How does a hash table work

Here's an explanation in layman's terms.

Let's assume you want to fill up a library with books and not just stuff them in there, but you want to be able to easily find them again when you need them.

So, you decide that if the person that wants to read a book knows the title of the book and the exact title to boot, then that's all it should take. With the title, the person, with the aid of the librarian, should be able to find the book easily and quickly.

So, how can you do that? Well, obviously you can keep some kind of list of where you put each book, but then you have the same problem as searching the library, you need to search the list. Granted, the list would be smaller and easier to search, but still you don't want to search sequentially from one end of the library (or list) to the other.

You want something that, with the title of the book, can give you the right spot at once, so all you have to do is just stroll over to the right shelf, and pick up the book.

But how can that be done? Well, with a bit of forethought when you fill up the library and a lot of work when you fill up the library.

Instead of just starting to fill up the library from one end to the other, you devise a clever little method. You take the title of the book, run it through a small computer program, which spits out a shelf number and a slot number on that shelf. This is where you place the book.

The beauty of this program is that later on, when a person comes back in to read the book, you feed the title through the program once more, and get back the same shelf number and slot number that you were originally given, and this is where the book is located.

The program, as others have already mentioned, is called a hash algorithm or hash computation and usually works by taking the data fed into it (the title of the book in this case) and calculates a number from it.

For simplicity, let's say that it just converts each letter and symbol into a number and sums them all up. In reality, it's a lot more complicated than that, but let's leave it at that for now.

The beauty of such an algorithm is that if you feed the same input into it again and again, it will keep spitting out the same number each time.

Ok, so that's basically how a hash table works.

Technical stuff follows.

First, there's the size of the number. Usually, the output of such a hash algorithm is inside a range of some large number, typically much larger than the space you have in your table. For instance, let's say that we have room for exactly one million books in the library. The output of the hash calculation could be in the range of 0 to one billion which is a lot higher.

So, what do we do? We use something called modulus calculation, which basically says that if you counted to the number you wanted (i.e. the one billion number) but wanted to stay inside a much smaller range, each time you hit the limit of that smaller range you started back at 0, but you have to keep track of how far in the big sequence you've come.

Say that the output of the hash algorithm is in the range of 0 to 20 and you get the value 17 from a particular title. If the size of the library is only 7 books, you count 1, 2, 3, 4, 5, 6, and when you get to 7, you start back at 0. Since we need to count 17 times, we have 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, and the final number is 3.

Of course modulus calculation isn't done like that, it's done with division and a remainder. The remainder of dividing 17 by 7 is 3 (7 goes 2 times into 17 at 14 and the difference between 17 and 14 is 3).

Thus, you put the book in slot number 3.

This leads to the next problem. Collisions. Since the algorithm has no way to space out the books so that they fill the library exactly (or the hash table if you will), it will invariably end up calculating a number that has been used before. In the library sense, when you get to the shelf and the slot number you wish to put a book in, there's already a book there.

Various collision handling methods exist, including running the data into yet another calculation to get another spot in the table (double hashing), or simply to find a space close to the one you were given (i.e. right next to the previous book assuming the slot was available also known as linear probing). This would mean that you have some digging to do when you try to find the book later, but it's still better than simply starting at one end of the library.

Finally, at some point, you might want to put more books into the library than the library allows. In other words, you need to build a bigger library. Since the exact spot in the library was calculated using the exact and current size of the library, it goes to follow that if you resize the library you might end up having to find new spots for all the books since the calculation done to find their spots has changed.

I hope this explanation was a bit more down to earth than buckets and functions :)

Bash – How to define hash tables in Bash

Bash 4

Bash 4 natively supports this feature. Make sure your script's hashbang is #!/usr/bin/env bash or #!/bin/bash so you don't end up using sh. Make sure you're either executing your script directly, or execute script with bash script. (Not actually executing a Bash script with Bash does happen, and will be really confusing!)

You declare an associative array by doing:

declare -A animals

You can fill it up with elements using the normal array assignment operator. For example, if you want to have a map of animal[sound(key)] = animal(value):

animals=( ["moo"]="cow" ["woof"]="dog")

Or declare and instantiate in one line:

declare -A animals=( ["moo"]="cow" ["woof"]="dog")

Then use them just like normal arrays. Use

animals['key']='value' to set value
"${animals[@]}" to expand the values
"${!animals[@]}" (notice the !) to expand the keys

Don't forget to quote them:

echo "${animals[moo]}"
for sound in "${!animals[@]}"; do echo "$sound - ${animals[$sound]}"; done

Bash 3

Before bash 4, you don't have associative arrays. Do not use eval to emulate them. Avoid eval like the plague, because it is the plague of shell scripting. The most important reason is that eval treats your data as executable code (there are many other reasons too).

First and foremost: Consider upgrading to bash 4. This will make the whole process much easier for you.

If there's a reason you can't upgrade, declare is a far safer option. It does not evaluate data as bash code like eval does, and as such does not allow arbitrary code injection quite so easily.

Let's prepare the answer by introducing the concepts:

First, indirection.

$ animals_moo=cow; sound=moo; i="animals_$sound"; echo "${!i}"
cow

Secondly, declare:

$ sound=moo; animal=cow; declare "animals_$sound=$animal"; echo "$animals_moo"
cow

Bring them together:

# Set a value:
declare "array_$index=$value"

# Get a value:
arrayGet() { 
    local array=$1 index=$2
    local i="${array}_$index"
    printf '%s' "${!i}"
}

Let's use it:

$ sound=moo
$ animal=cow
$ declare "animals_$sound=$animal"
$ arrayGet animals "$sound"
cow

Note: declare cannot be put in a function. Any use of declare inside a bash function turns the variable it creates local to the scope of that function, meaning we can't access or modify global arrays with it. (In bash 4 you can use declare -g to declare global variables - but in bash 4, you can use associative arrays in the first place, avoiding this workaround.)

Summary:

Upgrade to bash 4 and use declare -A for associative arrays.
Use the declare option if you can't upgrade.
Consider using awk instead and avoid the issue altogether.