If you're stuck with your list data structure
If you strive purely to minimize number of priority updates, there's not too much you can do beyond the binary-search approach.
Otherwise, one possible smallish improvement I can think of is, whenever you insert an item C between B and D, check the item A to the B and E to the right of D and redistribute their priorities.
So for something like:
C
↓
A B D E
0 1 7 8
You'd end up with:
A B C D E
0 2 4 6 8
Note that A
and E
don't change, their priorities are just required in the calculation.
Or you can leave B
where it is and only update D
, or vice versa, depending on whether A
and D
or B
and E
are further apart.
If you can pick another data structure
It would be better to pick one where the priority is structurally defined, not explicitly for each element.
A modified balanced binary search tree could work here, where you store the number of ancestors for each node.
The comparison for insertion would then be defined purely on where you want to insert it compared with the counts of each node.
Note that something like a red-black tree doesn't require any comparisons to keep itself balanced.
Update, insert and delete operations will all take O(log n)
.
Depending on your requirements, you can have a secondary data structure that allows you to look up some node for a given element.
If the number of hours each person is available is constant (e.g. it's always 1 hour), then your problem can be modeled as a Network Flow, easily solvable in polynomial time with Ford-Fulkerson's algorithm.
To build your flow network, create two layers of nodes; the first layer represents your balls, the second layer represents your buckets. Add an edge from the source to each ball with capacity 1. Add an edge from each ball to each compatible bucket with capacity 1. Add an edge from each bucket to the sink with capacity corresponding to the bucket capacity. Maximum flow represents optimal assignment.
Otherwise, if the number of hours each person can contribute varies, then the above algorithm does not work due to your constraint that each person can only visit one location. (Network flow would solve the problem, but it might split one person's hours across multiple locations.) In this case you have something that's related to a Knapsack or Bin Packing problem.
If the number of hours each person can contribute is an integer, you can solve it via pseudo-polynomial dynamic programming. Otherwise, you have to either:
- Implement a brute-force backtracking algorithm. This will give you an exact solution but will not work if the number of people or locations exceeds 50 or so.
- Implement a combinatorial heuristic method, as Doc Brown suggested. Hill Climbing or Simulated Annealing will perform very well for this problem and give you optimal or nearly optimal results very quickly.
Best Answer
The problem you have here in general is called the Scheduling Problem and in its general form is NP-complete. In other words, it is a well-known problem for which there exists no efficient algorithm (yet, or maybe never).
The linked page (already mentioned in a comment before) refers to one specific variant called job shop scheduling. There are many different scheduling problems, which all come down to be NP-complete in some way.
In your case, however, I would seriously rethink about the need to have priorities. Job scheduling problems usually have hard dependencies (like job A needs to be finished for job B to be able to start). Priorities are a whole different problem dimension and will complicate your task tremendously - something which may not be worth it. It is hard enough, if you simply say that task T1 can only be performed by workers W1 or W2 - without the need to prefer W1.
One note if you check the literature: what in your case are workers are typically machines in the literature - think automation factories where T1,...,Tn tiles are produced by M1,...,Mm machines.
Also one of the first things to check is if you need to find the perfect solution at all. Generally, NP-completeness stems from the requirement that the best solution has to be found. In your description you write things like T1 will be skipped, which is normally not possible in the theoretical scheduling problem formulations. Given that you have no need to schedule all tasks, you may also not have a requirement to solve the problem optimally. In that case, you can either try to write a Greedy-like algorithm yourself or look at the research being done on approximation algorithms for scheduling. These algorithms have the advantage of being fast, but the disadvantage of not necessarily finding an optimal solution.