I have a csv of employee ids, names, and a reference column with the id of their direct manager, say something like this
emp_id, emp_name, mgr_id
1,The Boss,,
2,Manager Joe,1
3,Manager Sally,1
4,Peon Frank,2
5,Peon Jill,2
6,Peon Rodger,3
7,Peon Ralph,3
I would like to be able to generate a (json) object representing this structure, something along the lines of
DATA = {
"parent": "The Boss",
"children: [
{
"parent": "Manager Joe",
"children": [ {"parent": "Peon Frank"}, {"parent": "Peon Jill"} ]
},
{
"parent": "Manager Sally",
"children": [ {"parent": "Peon Rodger"}, {"parent": "Peon Ralph" } ]
}]
}
So from the data, an entry with no mgr_id represents something like the CEO or Leader.
So just collecting some thoughts.. I know this could be represented by some tree data structure with the tree traversal generating the correct output. The parser would have to be responsible for inserting into the tree. Maybe the number of children would give you a weight? Just gathering thoughts. Not sure how I would pivot around the fact that multiple objects can be constructed with children.
Is there an algorithm defined that can parse this structure I am not thinking of? Descending down into the children seems relatively straightforward. I am not seeing this in my head, and could use some help. Thank you
Best Answer
There are several ways to create structure from flatland. Recursion is one of the best. This program uses it, with a preliminary step of figuring out which items are parents.
While the strategy is language-agnostic, every language has different details of data structures and building blocks. I've rendered the approach in Python.
NB, about half this code is for display and demonstration purposes, so you can follow along and see how the algorithm is working. For production, fee free to remove those parts.
Oh...and I changed the labels from 'parent' to 'name' and 'children' to 'reports' because I found that more agreeable. You can, of course, choose whatever you like.
Running this shows the following output:
Some preliminaries: It also uses
print
to help show nested data structures. We'd normally get data through a database connection, here we just parse it out of static text. Finally, while the data you presented is beautifully ordered, with the bosses at the top and with the lowest employeed id numbers (simplifying the) problem, I'd like to confirm that the code works in any order. So I've modified some of the id numbers to reflect a non-sequential allocation, and brought inrandom.shuffle
to randomize the order of data. You wouldn't do this in production, but as a part of testing, it increases confidence the logic is working by design, not accident.