Database Design – Why Not Expose a Primary Key

designSecuritytheory

In my education I have been told that it is a flawed idea to expose actual primary keys (not only DB keys, but all primary accessors) to the user.

I always thought it to be a security problem (because an attacker could attempt to read stuff not their own).

Now I have to check if the user is allowed to access anyway, so is there a different reason behind it?

Also, as my users have to access the data anyway I will need to have a public key for the outside world somewhere in between. Now that public key has the same problems as the primary key, doesn't it?


There has been the request of an example on why do that anyway, so here is one.
Keep in mind that the question is meant to be about the principle itself not only if it applies in this example. Answers addressing other situations are explicitly welcome.

Application (Web, Mobile) that handles activity, has multiple UIs and at least one automated API for intersystem communication (e.G. the accounting department wants to know how much to charge the customer based on what has been done).
The Application has multiple customers so separation of their data (logically, the data is stored in the same DB) is a must have of the system. Each request will be checked for validity no matter what.

Activity is very fine granular so it is together in some container object, lets call it "Task".

Three usecases:

  1. User A wants to send User B to some Task so he sends him a link (HTTP) to get some Activity done there.
  2. User B has to go outside the building so he opens the Task on his mobile device.
  3. Accounting wants to charge the customer for the Task, but uses a third party accounting system that automatically loads the Task / Activity by some code that refers to the REST – API of the Application

Each of the usecases requires (or gets easier if) the agent to have some addressable identifier for the Task and the Activity.

Best Answer

Also, as my users have to access the data anyway I will need to have a public key for the outside world somewhere in between.

Exactly. Take the stateless HTTP, who would otherwise not know what resource it should request: it exposes your question's ID 218306 in the URL. Perhaps you're actually wondering whether an exposed identifier may be predictable?

The only places where I've heard a negative answer to that, used the rationale: "But they can change the ID in the URL!". So they used GUIDs instead of implementing proper authorization.

I can imagine one situation where you don't want your identifiers to be predictable: resource harvesting. If you have an site that publicly hosts certain resources others may be interesting in, and you host them like /images/n.jpg or /videos/n.mp4 where n just is an incrementing number, anyone looking at traffic to and from your website can harvest all your resources.

So, to directly answer your question: no, it is not bad to directly "expose" identifiers that only have meaning to your program, usually it is even required for your program to succesfully operate.

Related Topic