Data Structures usage and motivational aspects


Throughout my long student life I kept wondering why there are so many data structures when many of them seem to see hardly any use. My opinion didn't really change when I got a job.

We have brilliant books on what they are and what their complexities are, but I have never encountered a resource that gives a good hint about practical usage. I perfectly understand that I should look at the problem, analyse the required operations, and look for a data structure that performs them efficiently. In practice, however, I never do that, not out of laziness, but because at work I give priority to time over self-development.

Over time I thought that as I became a better developer I would automatically use more of them; that didn't happen at all, or maybe I just didn't make it happen. Then I found that my colleagues are usually in the same boat as me: they know more or less some three data structures, are totally happy about it, and refuse to discuss the matter further with me, returning to conversations about 'cool new languages', 'libraries that do the job for you', the joy of working under scrumban, etc.

I am stuck with ArrayLists, arrays and SortedMap, which always suffice no matter what I do, or else I tweak them until they can fulfil my task. Yes, it might be inefficient, but do we really have to care if Intel increases performance over the years whether or not we improve our skills? Do new Xeon or IBM machines really care what we use? What if I like to build things but am not particularly excited about whether something is n log(n) or just n? Over twenty years processing power has increased enormously; doesn't that give us the freedom not to be critical about which one to use? On top of that, new, more optimized languages appear that support multiple cores more efficiently.

To be more specific: I would like to find motivational material on complex, real areas or cases where data structures can be used effectively. I would be really grateful if you could provide relevant resources. There is a similar question, but in the end the links mostly just describe the data structures or give dumb examples (vehicles, students or a holy grail quest; yes, very relevant), and people keep repeating that "the scenario decides the data structure to use". I want to know these complex scenarios so that I can identify similarities to my own scenario and then use them: the complex scenarios where it really matters, and not necessarily of a quantitative nature. It seems that the only concern of data structures is efficiency and nothing else? There seems to be no particular convenience for the developer in using one over another.

(Only when I found scientific resources on exactly why simple carbohydrates are evil did I stop eating sugar and candies completely, replacing them with less harmful fruits. I hope you can see the analogy.)

Best Answer

I think there are a few reasons why developers tend to only use a few data structures.

There are general-purpose data structures that work well enough most of the time, so they become the primary choice for most problems.

  • I know hash tables and lists are my "goto" data structures. I even know that in some cases I used hash tables when red-black trees or even AVL trees would have been better choices. However, the hash table is more familiar to both me and my team, and choosing it doesn't impact the performance of our software very much.
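As a rough illustration of the hash table versus balanced tree trade-off mentioned above, here is a minimal Java sketch. It assumes a hypothetical workload (events keyed by an integer timestamp, queried by range); the class and method names are invented for the example. `java.util.TreeMap` is documented as a red-black tree, so it stands in for the tree option here. A hash table answers "what is stored under exactly this key?" well, but for "everything between two keys" it has to scan every entry, while the sorted tree can return the slice directly and in key order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.HashMap;
import java.util.TreeMap;

// Hypothetical example: the names and the timestamp workload are assumptions.
public class RangeLookupSketch {

    // With a hash table there is no key ordering to exploit,
    // so a range query has to visit every entry.
    static List<String> inRangeWithHashMap(Map<Integer, String> byTimestamp, int from, int to) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<Integer, String> e : byTimestamp.entrySet()) {
            if (e.getKey() >= from && e.getKey() <= to) {
                hits.add(e.getValue());
            }
        }
        return hits; // iteration order of a HashMap is unspecified
    }

    // With a sorted tree the range is a view over the map,
    // and the values come back in ascending key order.
    static List<String> inRangeWithTreeMap(TreeMap<Integer, String> byTimestamp, int from, int to) {
        return new ArrayList<>(byTimestamp.subMap(from, true, to, true).values());
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> tree = new TreeMap<>();
        tree.put(10, "login");
        tree.put(20, "search");
        tree.put(30, "checkout");
        tree.put(40, "logout");
        Map<Integer, String> hash = new HashMap<>(tree);

        System.out.println(inRangeWithTreeMap(tree, 15, 35));
        System.out.println(inRangeWithHashMap(hash, 15, 35));
        // "Latest event at or before time 25" is a single call on the tree.
        System.out.println(tree.floorEntry(25).getValue());
    }
}
```

For a few hundred entries either version is fine, which matches the point that the familiar choice often costs little. The tree starts to pay off when ordered or nearest-key queries are frequent and the data is large.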

Languages influence the choice by providing certain data structures as built-ins, which leads to greater usage.

  • Whether it's arrays in C or hash tables in Perl and Python, I see others (and myself) using these data structures more often because they're more familiar and more readily available.

Programmers tend to head into one area during their careers and stay in that area.

  • I know database and filesystem implementers are much more familiar with B-Trees and tend to use that data structure in other things.

Depending on what you work on, the choice of data structure may not matter that much, within reason. Notice the "within reason". I'm not talking about using a free-form text string and full-text search to hold numerically indexed data that should be in an array. There are reasons why it may not matter:

  • You're dealing with a small amount of data.
  • Your bottleneck is external to the code you control.

To me, part of the criteria for journeyman and expert developers is knowing more than a couple of data structures, but more importantly being able to recognize when your "goto" data structures (or even the structures you know) aren't appropriate, and then doing the work and research to find the appropriate structure and use it.
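One small, hedged example of a "goto" structure quietly not fitting, written as a Java sketch with hypothetical names: using an ArrayList as a first-in, first-out queue. Both methods below produce the same result, but `ArrayList.remove(0)` shifts every remaining element on each call, whereas `ArrayDeque.pollFirst()` removes the head without shifting. The job list is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: class and method names are invented for illustration.
public class QueueChoiceSketch {

    // Familiar choice: an ArrayList used as a queue.
    // Each remove(0) shifts all remaining elements one slot to the left.
    static List<Integer> drainWithArrayList(List<Integer> jobs) {
        List<Integer> pending = new ArrayList<>(jobs);
        List<Integer> processed = new ArrayList<>();
        while (!pending.isEmpty()) {
            processed.add(pending.remove(0));
        }
        return processed;
    }

    // Structure built for the job: ArrayDeque removes from the head
    // without moving the other elements.
    static List<Integer> drainWithArrayDeque(List<Integer> jobs) {
        ArrayDeque<Integer> pending = new ArrayDeque<>(jobs);
        List<Integer> processed = new ArrayList<>();
        while (!pending.isEmpty()) {
            processed.add(pending.pollFirst());
        }
        return processed;
    }

    public static void main(String[] args) {
        List<Integer> jobs = List.of(3, 1, 2);
        System.out.println(drainWithArrayList(jobs));
        System.out.println(drainWithArrayDeque(jobs));
    }
}
```

With a handful of jobs the difference is invisible, which is the "small amount of data" case. It only becomes a problem when the queue grows large, and that is the moment worth recognizing.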
