Relational Database Design – How to Allow Users to Create Custom Fields

database-designrelational-database

Ive been looking around for a way to design a database for large amounts of leads (users) with custom fields.

Ive check this out (How would you design a user database with custom fields), but this solution would limit the amount of custom fields.

So far ive been designing it in three tables:

leads                    (ID, phone, email)
leads_fields             (ID, name, type, required)
leads_fields_content     (ID, fields_id, lead_id, content)

Users can create all the leads_fields they need, fx. 20 fields.

So when I have a lead, I would go trough and check in leads_fields_content for lead_id, get that collection and get the corresponding leads_fields that is referenced.

I see this working – However, one client wants to upload 300.000 leads from day one. So thats 300.000 rows in leads. Then lets say that there are 20 fields for each lead. That would then create 300.000*20 rows in leads_fields_content which is 6.000.000 rows. Thats only for one client.

My question: is this at all the correct way of designing this, taking into account the amount of time it would take to go trough 300.000 rows, and then 6.000.000 afterwards? And this would only get exponentially bigger.

Best Answer

If your customers want to be completely free to define their own custom fields, your approach in general is fine. However, for the given numbers, you need to take some precautions to keep the system performant and scalable.

I guess when there are 300.000 leads with 20 fields, there is a high chance of having lots of those content entries empty for the non-required fields. In the suggested design, you can simply leave out those records from the leads_fields_content table, no need to create a record with an empty content for those. So if there is only a small percentage of the fields filled with content, you get only that small percentage of records in the leads_fields_content, not necessarily 6 millions. Your design already allows to store sparse tables efficiently!
Proper indexing should be obligatory. However, this is way easier for fields which are known at design time. So if you know some standard fields which are typically required for any customer and don't really need to be customizable (like a lead's name), it would probably be a good idea to make it a fixed part of the lead table instead. Then you can create a specific index for those fields.
If each of your customers has his own "custom schema", it is clear they want their data 100% separated from other customers. For example, you will never get any requirements for searching over the content of different customers. So it will probably be best to separate the data of each customer physically. How you accomplish this depends on the DBMS you are using: separate tables per customer, tables in different schemas, different table spaces, same table in different "logical" databases on one server, or completely different database instances per customer, maybe on different machines - it all depends what kind of DB system you are using, how many customers you have and how scalable you need the system to be.

Related Solutions

Database – How would you design a user database with custom fields

Please consider this as an alternative. The previous two examples will both require that you make changes to the schema as the application's scope grows in addition the "custom_column" solution is difficult to extend and maintain. Eventually you'll end up with Custom_510 and then just imagine how awful this table will be to work with.

First let's use your Companies schema.

[Companies] ComnpanyId, COMPANY_NAME, CREATED_ON

Next we'll also use your Users schema for top level required attributes that will be used/shared by all companies.

[Users] UserId, COMPANY_ID, FIRST_NAME, LAST_NAME, EMAIL, CREATED_ON

Next we build a table where we will define our dynamic attributes that are specific to each companies custom user attributes. So here an example value of the Attribute column would be "LikeMusic":

[UserAttributeDefinition] UserAttributeDefinitionId, CompanyId, Attribute

Next we define a UserAttributes table that will hold user attribute values

[UserAttributes] UserAttributeDefinitionId, UserId, Value

This can be modified in many ways to be better for performance. You can use multiple tables for UserAttributes making each one specific to the data type being stored in Value or just leave it as a VarChar and work with it as a keyvalue store.

You also may want to move CompanyId off of the UserAttributeDefiniton table and into a cross reference table for future proofing.

Best Answer

Related Solutions

Database – How would you design a user database with custom fields

Related Topic