Algorithms – Chunking an Array into Letter Groups of Equal Length

algorithmsarraylist

Given a list of items, I want to break it up into four groups with as equal length as possible. The items need to stay grouped by the first letter in each item as well.

26 letters / 4 groups would generally cover 6.5 letters in each group. If we had an equal amount of items starting with the same letter in each group it might look like this:

[A-F] (6 letters)
[G-M] (7 letters)
[N-S] (6 letters)
[T-Z] (7 letters)

However, in practice, we may find that our original list is heavy on the items in the [N-S] group.

[A-F] (50 items)
[G-M] (40 items)
[N-S] (70 items)
[T-Z] (40 items)

We may want to push all the items that start with N into group 2 and all the items that start with S into group 4 to achieve balance:

[A-F] (50 items)
[G-N] (50 items)
[O-R] (50 items)
[S-Z] (50 items)

Does anyone have any ideas of where to point me in terms of coming up with an algorithm that can solve this sort of problem.

I will most likely be using javascript on the client to implement whatever solution will work. I'd like to use as functional of an approach as possible.

Best Answer

Create 4 lists for the results and 26 lists to hold the interim items by initial letter then apply this pseudo-code:

foreach item in masterlist
    append item to initial letter list
    increment total counter

average bucket size = total/4

iterate letter buckets in order
    if current results bucket size  + length letter bucket < avergae bucket size
        append letter bucket to results bucket
    else
        append letter bucket to results bucket if it will be over average by less than it would be under if you do nothing
        move to next results bucket
Related Topic