Database – Is it bad practice to keep database schema (DDL) scripts and data manipulation (DML) scripts in different modules?

Tags: database, modules, sql

We have the following project structure:

The "module-shared" module depends on the "module-database" module, and some other modules depend on "module-shared", but no other module depends on "module-database".

All the DML SQL scripts are stored in "module-shared", but the DDL scripts are stored in "module-database".

  • The advantage of this design is that we don't have to release the database when changes only affect the DML in "module-shared"; we use database integration tests to make sure the two are still compatible.

  • The disadvantage is that it creates some confusion about versions. For example, "module-shared" 2.3.1 may depend on "module-database" 1.4.2, and eventually we will forget which version is compatible with which.

Questions:

Would it be a better design to have a "module-database" that contains both DDL and DML (and all other database operations, if any), with "module-shared" containing everything it had except the database operations?

What is your experience with multi-tier architectures? Do you always use modules to segregate tiers? Have you had DAOs and services in the same module? If so, where did you keep the DDL schema scripts?

UPDATE: I forgot to mention that the DDL is versioned using a database script versioning system such as Flyway, so all DDL changes are incremental.

Note: a module could be, e.g., a Gradle module, a Maven module, etc.

Note: I am trying my best not to be biased while asking this question, so please assume that I don't have a preference for either solution. 🙂

Best Answer

In practice, it is difficult to release a new database version. While it is nice to have a module with the current DDL, in many cases you will likely be modifying an existing database. I would keep it as a separate module.

The practice I use for DDL is to generate an upgrade (and downgrade) script for each release. Changes that would break the current release (deletions and new constraints) get deferred until the next release. There are some other considerations that allow two releases to work with one database version. These scripts include the DML release they apply to in their name.
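As a rough illustration of that naming practice, here is a minimal Python sketch. The file name pattern (`upgrade_<release>.sql` / `downgrade_<release>.sql`) and the function names are assumptions made up for this example, not something the answer or any particular tool prescribes. It selects the upgrade scripts needed to move a database from one application release to another.

```python
import re

# Hypothetical naming convention: "<direction>_<release>.sql", where <release>
# is the application (DML) release the script belongs to.
SCRIPT_NAME = re.compile(r"^(upgrade|downgrade)_(\d+)\.(\d+)\.(\d+)\.sql$")


def parse_script(name):
    """Return (direction, (major, minor, patch)), or None if the name does not match."""
    match = SCRIPT_NAME.match(name)
    if match is None:
        return None
    direction = match.group(1)
    release = tuple(int(part) for part in match.group(2, 3, 4))
    return direction, release


def upgrade_path(script_names, current, target):
    """List the upgrade scripts needed to go from `current` to `target`, oldest first."""
    selected = []
    for name in script_names:
        parsed = parse_script(name)
        if parsed is None:
            continue  # ignore files that do not follow the convention
        direction, release = parsed
        if direction == "upgrade" and current < release <= target:
            selected.append((release, name))
    # Sort on the numeric tuple so that 2.10.0 comes after 2.3.0.
    return [name for _, name in sorted(selected)]
```

Comparing releases as integer tuples rather than strings avoids ordering 2.10.0 before 2.3.0.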

Most of your DML should work over a wide range of versions. It is possible that you will have many DML releases without DDL changes. In other cases, you may have DDL changes to resolve performance issues that have no corresponding DML changes.

If you tag your code in your version control system, it really doesn't matter much which of the two modules changed. Unless you check in a version identifier, it is OK to have multiple tags on the same code.

EDIT: I've taken a quick look at Flyway. I don't think it significantly alters the above recommendations. I would suggest making your application's DML dependencies allow the DDL version to increase. I wouldn't want to break my application by adding an index, column, or table. (DDL changes may include DML to update data, which should be in the same repository as the DDL.)
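One way to express "allow the DDL version to increase" is for the application to declare only the lowest schema version it needs. The sketch below assumes dotted numeric version strings; `MIN_SCHEMA_VERSION` and the function names are hypothetical, and reading the installed version (for example from the migration tool's history table) is deliberately left out.

```python
# Hypothetical: the lowest schema (DDL) version this application release
# was tested against. The value is taken from the question's example.
MIN_SCHEMA_VERSION = (1, 4, 2)


def parse_version(text):
    """Turn a dotted version string such as "1.4.2" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_schema_compatible(installed, minimum=MIN_SCHEMA_VERSION):
    """Accept any schema at or above the minimum, with no upper bound."""
    return parse_version(installed) >= minimum
```

Because there is no upper bound, additive changes such as a new index, column, or table do not force a new application release. This only holds while the schema changes stay backward compatible.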

Consider deprecating columns and tables that are going to be dropped. You should be able to use a text search on the DML to identify and remove uses of deprecated objects. This should allow you to test some regressions without having to regress the database.
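The text search can be as simple as a whole-word scan over the DML sources. The following Python sketch assumes the SQL is available as a mapping from file name to text; the function name and the input shape are illustrative only. It does not parse SQL, so it will also flag names that appear in comments or string literals.

```python
import re


def find_deprecated_uses(sql_by_file, deprecated_names):
    """Return sorted (file, line_number, name) for each deprecated identifier found."""
    # Word boundaries keep "legacy_status" from matching "legacy_status_code".
    patterns = {
        name: re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for name in deprecated_names
    }
    hits = []
    for filename, sql in sql_by_file.items():
        for line_number, line in enumerate(sql.splitlines(), start=1):
            for name, pattern in patterns.items():
                if pattern.search(line):
                    hits.append((filename, line_number, name))
    return sorted(hits)
```

Running such a scan in the build of the module that holds the DML would surface remaining uses before the deprecated objects are actually dropped.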
