Database – What are the differences between algorithms using data structures and algorithms using databases

algorithmsdata structuresdatabaseprogramming practices

The General Question

What are the differences between algorithms using data structures and algorithms using databases?

Some Context

This is a question that has been bugging me for some time, and I have not been able to come up with a convincing answer for it.

Currently, I am working on strengthening my understanding of algorithms that, of course, heavily involve data structures. These are basic structures such as Bag, Queue, Stack, Priority Queue, and Heap.

I also use databases on a daily basis to store the data that has been processed and submitted by the end-user or processed by the program. I retrieve and submit the data through a DAL, which has data structures of its own that are generated based on the tables in the database.

My questions come when I have the option to sort the data using the database to send it back to me ordered in either an ascending/descending fashion or retrieve and load the data into my logic, process this data in a priority queue, and heap sort all of it. Or another one would be to search for records using the database rather than loading a subset of the records and using something like binary search to find the record or records I am interested in.

In my mind, I would try to have as many operations take place on the database-end before sending it over because communication is expensive. This also makes me wonder when do you use algorithms and data structures strictly defined within your own logic rather to process data than that of the database's?

So here are the questions…

Questions

  1. What are the differences between data structures and databases?
  2. When do we use algorithms that use data structures defined solely within your own logic and not that of the database's?
  3. @Harvey post: When do the methods in the database become less efficient to use than methods in your own logic?
    • @mirculixx post: What makes a method efficient?
  4. @Harvey post: How is processing data with data structures faster than doing it in the database?

Clarifications

  1. @Grant post: The databases I normally work with are relational, and these questions are coming out of working with them. However, I do think these questions are applicable to any persistence framework (when I say framework, I mean it in the most general sense).

I know answers without a specific context are difficult. Food-for-thought, advice, or discussion points are mainly what I'm looking for and would be most appreciated!

Best Answer

Data Structures are, for the most part:

  1. Memory resident,
  2. Transient,
  3. Limited in size,
  4. Not re-entrant without adding concurrency mechanisms like locks or immutability,
  5. Not ACID compliant,
  6. Fast, if chosen carefully.

Databases are, for the most part:

  1. Disk-bound,
  2. Persistent,
  3. Large,
  4. Safely concurrent,
  5. ACID compliant, with transactional capabilities,
  6. Slower than data structures

Data structures are meant to be passed from one place to another, and used internally within a program. When was the last time you sent data from a web page to a web server using a database, or performed a calculation on a database that was entirely resident in memory?

Database systems use data structures as part of their internal implementation. It's a question of size and scope; you use data structures within your program, but a database system is a program in its own right.

Related Topic