In practice, it is difficult to release a new database version. While it is nice to have a module with the current DDL, in many cases you will likely be modifying an existing database. I would keep it as a separate module.
The practice I use for DDL is to generate an upgrade (and downgrade) script for each release. Changes that would break the current release (deletions and new constraints) are deferred until the next release. There are some other considerations that allow two releases to work with one database version. Each script's name includes the DML release it applies to.
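As a rough illustration of the upgrade/downgrade pairing, here is a minimal sketch using an in-memory SQLite database. The script names, the release number 2.4.0, and the customer and audit_log tables are all made up for the example; the only point is that the upgrade is purely additive, so DML from the previous release keeps working, and that the downgrade undoes it.

```python
import sqlite3

# Hypothetical script names: each one embeds the application (DML)
# release it ships with. The SQL is illustrative only.
MIGRATIONS = {
    "upgrade_app_2.4.0.sql": """
        CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT);
        CREATE INDEX idx_customer_name ON customer (name);
    """,
    "downgrade_app_2.4.0.sql": """
        DROP INDEX idx_customer_name;
        DROP TABLE audit_log;
    """,
}


def run_script(conn, name):
    """Run one named migration script against the connection."""
    conn.executescript(MIGRATIONS[name])


def table_names(conn):
    """Return the set of table names currently in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


# Schema and data as the previous release left them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")

# The upgrade only adds objects, so nothing the old release uses goes away.
run_script(conn, "upgrade_app_2.4.0.sql")
tables_after_upgrade = table_names(conn)
old_release_rows = conn.execute("SELECT name FROM customer").fetchall()

# The downgrade removes exactly what the upgrade added.
run_script(conn, "downgrade_app_2.4.0.sql")
tables_after_downgrade = table_names(conn)
```

A drop of an old column or table would not go in this pair of scripts; following the rule above, it would wait for the scripts of the next release.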
Most of your DML should work over a wide range of versions. It is possible that you will have many DML releases without DDL changes. In other cases, you may have DDL changes to resolve performance issues that have no corresponding DML changes.
If you tag your code in your version control system, it really doesn't matter much which of the two modules changed. Unless you check in a version identifier, it is OK to have multiple tags on the same code.
EDIT: I've taken a quick look at flyway. I don't think it significantly alters the above recommendations. I would suggest making your application DML dependencies allow the DDL version to increase. I wouldn't want to break my application by adding an index, column, or table. (DDL changes may include DML to update data which should be in the same repository as the DDL.)
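One way to let the DDL version increase without breaking the application is to check only a lower bound. The sketch below uses SQLite's user_version pragma as a stand-in for whatever version record your migration tool keeps; MIN_SCHEMA_VERSION and the function names are my own assumptions, not anything flyway provides.

```python
import sqlite3

# Hypothetical: the lowest DDL version this application release needs.
MIN_SCHEMA_VERSION = 3


def schema_version(conn):
    """Read the schema version stored in the database."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def is_compatible(conn):
    """Accept any schema at or above the minimum.

    Only a lower bound is checked, so a later DDL release that adds an
    index, column, or table does not lock this application release out.
    """
    return schema_version(conn) >= MIN_SCHEMA_VERSION


conn = sqlite3.connect(":memory:")

conn.execute("PRAGMA user_version = 3")
ok_at_minimum = is_compatible(conn)

conn.execute("PRAGMA user_version = 5")  # DDL moved ahead of the application
ok_after_upgrade = is_compatible(conn)

conn.execute("PRAGMA user_version = 2")  # database older than the application needs
ok_when_too_old = is_compatible(conn)
```

An exact-match check (==) would be the thing to avoid here, since it forces an application release for every DDL release.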
Consider deprecating columns and tables that are going to be dropped. You should be able to use a text search on the DML to identify and remove uses of deprecated objects. This should allow you to test some regressions without having to regress the database.
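The text search can be as simple as a whole-word match over your SQL statements. A minimal sketch, assuming the DML is available as labelled strings; the deprecated names and the statements are invented for the example, and a plain regex like this will also match names inside comments or string literals.

```python
import re

# Hypothetical objects that are scheduled to be dropped.
DEPRECATED = {"legacy_orders", "fax_number"}

# Hypothetical DML, keyed by a label that says where it lives.
STATEMENTS = {
    "load_customer": "SELECT id, name, fax_number FROM customer WHERE id = ?",
    "list_orders": "SELECT id, total FROM orders",
    "archive": "INSERT INTO legacy_orders SELECT * FROM orders",
}


def find_deprecated_uses(statements, deprecated):
    """Return (label, name) pairs for every statement using a deprecated name."""
    hits = []
    for label, sql in sorted(statements.items()):
        for name in sorted(deprecated):
            # \b keeps "orders" from matching inside "legacy_orders".
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, sql, re.IGNORECASE):
                hits.append((label, name))
    return hits


hits = find_deprecated_uses(STATEMENTS, DEPRECATED)
```

Once the list of hits is empty, the DML no longer depends on the deprecated objects and the drop can go into the next DDL release.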
There are at least two questions being asked here (arguably a lot more, so I'll have to ignore many of the little ones), but the solution to both comes down to writing a class that encapsulates retrieving (your) data from (your) files. You specifically asked whether it should be a class or something else. In this language, classes are the main data abstraction that can remember things, which is necessary if you want to avoid reading the same file over and over. That's why this should be a class.
Since you haven't told me what any of the functions in module 2 actually do after calling func1, I'm going to make up some fairly generic requirements. Let's pretend func1 reads a comma-separated value file of numbers. Say func2 gets the sum of all numbers. Say func3 gets the average and standard deviation of each row.
What you want to do is write a class that handles reading and parsing these files, and exposes methods to retrieve certain aspects of the data it finds. For instance, you might have the class' constructor take a single filename/filepath argument, so the file gets read exactly once during construction, and all method calls on the object merely manipulate the in-memory representation of the data you've already read. The important thing is that no other module should know (exactly) how this class gets its data. They should see getNums() and getStdDev() and getTotal() methods, but they probably shouldn't know whether these values come from a .csv file or an Excel spreadsheet or a website or wherever, and they certainly should not know whether that file/site is being cached or not.
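Here is a minimal sketch of such a class under the made-up requirements above. The class name NumberFile is my own invention, I've kept the getNums()/getTotal()/getStdDev() names from this answer rather than Python's usual snake_case, and using the population standard deviation per row is an assumption.

```python
import csv
import os
import statistics
import tempfile


class NumberFile:
    """Hypothetical wrapper around a CSV file of numbers.

    The file is read exactly once, in the constructor. Every method
    works on the in-memory copy, so callers never see where the data
    came from or whether it is cached.
    """

    def __init__(self, path):
        with open(path, newline="") as handle:
            self._rows = [
                [float(cell) for cell in row]
                for row in csv.reader(handle)
                if row  # skip blank lines
            ]

    def getNums(self):
        """Return a copy of all rows, so callers cannot alter the cache."""
        return [list(row) for row in self._rows]

    def getTotal(self):
        """Return the sum of every number in the file."""
        return sum(sum(row) for row in self._rows)

    def getStdDev(self):
        """Return the population standard deviation of each row."""
        return [statistics.pstdev(row) for row in self._rows]


# Usage: the file is gone by the time the methods run, which shows
# that nothing after construction touches the disk.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "numbers.csv")
    with open(path, "w", newline="") as handle:
        handle.write("1,2,3\n4,4,4\n")
    data = NumberFile(path)

total = data.getTotal()
deviations = data.getStdDev()
```

If the data later moves to a spreadsheet or a website, only the constructor has to change; the callers of getTotal() and friends stay as they are.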
If you do this right, it will be trivial to add getEigenvectors() and whatnot to module2 whenever you feel like it without even thinking about what module1 is doing. That is always the main goal of reducing coupling. This applies to pretty much every language that supports OOP.
If you need more specific help you'll have to explain more about what these functions actually do and why the generic advice I just gave is insufficient.
Best Answer
When there are two modules in the standard lib with the same name, what often has happened is that the original module was written in Python. That's because it is a lot easier to prototype and get it working quickly than in lower-level languages.
Later, once a reasonable design has been found and bugs fixed etc, performance may become a focus. It's a good time to write the slow parts in Cython or the C API and speed them up through compilation to machine code. Typically the additions are placed in a _module.so or DLL and imported from within the original module.py. This avoids the work of prototyping/writing the entire thing in the C API, which is quite tedious.
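The import from within the Python module usually looks something like the sketch below. The _fastmath extension and the total() function are hypothetical; the shape (define the pure-Python version first, then let the compiled one replace it if it can be imported) is the part that matters, and CPython's heapq does something along these lines with _heapq.

```python
def total(values):
    """Pure-Python reference implementation (the original prototype)."""
    result = 0
    for value in values:
        result += value
    return result


try:
    # _fastmath is a hypothetical compiled extension (a .so or DLL).
    # If it is present, its total() replaces the Python one above.
    from _fastmath import total
    USING_ACCELERATOR = True
except ImportError:
    # No compiled version available: keep the pure-Python definition.
    USING_ACCELERATOR = False
```

Callers import total from the Python module either way, so they never need to know which implementation they got.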