(Late to the party, but I couldn't resist)
Let's straighten out some misconceptions.
Joins are expensive
As compared to what? Of course, reading one flat table is "cheaper" than joining tables, but any mature RDBMS is highly optimized for executing joins because they are inevitably part of sound database designs. Joins over foreign key constraints (the most common ones) are especially optimized. And of course, proper indexing is indispensable.
we should keep our database normalized and with the least queries executed as possible
The way you pose this, it seems to be a consequence of preventing these "expensive" joins. The reverse is true. Normalization will always result in more tables and, hence, more joins to query the same data as from a denormalized data schema. (Well, to be fair to you, later on you say "We prevent EAV and therefore, Joins.").
We should avoid an Entity-Attribute-Value approach (EAV), unless denormalization becomes desirable
EAV is anything but denormalization. I'm under the impression that you don't fully understand what EAV is.
In an EAV design, attributes of a relation (aka fields, or columns, of a database table) are taken out of the relation and stored as records in an Attributes table. The values are stored in yet another table that has foreign keys to the Attributes table and an Entity table. A record in this values table expresses one fact: this is value X of attribute Y in entity Z.
So with EAV, when applied rigorously, if you want to know the start date, end date, and cost of a tournament, you'll have to query the PGTournament table and join to Attribute and Value (with a WHERE condition for the attributes). That's two joins instead of zero without EAV!
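To make that concrete, here is a rough LINQ sketch of such a query. The set and property names (PGTournaments, Attributes, AttributeValues, EntityId and so on) are assumptions for illustration, not your actual model:

    // 'context' is assumed to be your DbContext.
    // Hypothetical EAV lookup: two joins just to read three plain tournament fields.
    var tournamentFacts =
        from t in context.PGTournaments
        join av in context.AttributeValues on t.Id equals av.EntityId
        join a in context.Attributes on av.AttributeId equals a.Id
        where a.Name == "StartDate" || a.Name == "EndDate" || a.Name == "Cost"
        select new { TournamentId = t.Id, Attribute = a.Name, av.Value };

    // Without EAV, the same data is a single-table read:
    // from t in context.PGTournaments
    // select new { t.Id, t.StartDate, t.EndDate, t.Cost };

On top of the extra joins, the values typically come back in a one-size-fits-all type (often strings) that still has to be parsed into dates and numbers afterwards.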
Nearly always, EAV is bad design. It's to be used when there's no alternative (for instance in lab applications where new analyses for samples can be invented every day -- a fixed set of fields in a Sample table won't suffice).
In your case, I don't see any reason to introduce EAV -- I don't even understand why you bring it up. I think it is because you confuse EAV with 1:1 associations. Read on.
Which means that we should adhere to OOP's SOLID Principles
The EF class model is part of a data access layer. It's not a domain model! At least, it's not its first responsibility to be that. The class properties should facilitate data access. That means that there will be bidirectional relationships and Id properties, to mention two OOP anti-patterns. And the real OOP bummer: the classes tend to be highly anemic. Whenever the EF classes can be used as domain classes, this is a mere bonus.
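To illustrate with made-up classes (not your actual model), a typical EF entity looks something like this: Id and foreign key properties, navigation in both directions, and no behavior at all:

    using System.Collections.Generic;

    // Deliberately anemic EF entities (hypothetical example): fine as a data access
    // layer, but full of OOP anti-patterns if treated as a domain model.
    public class Player
    {
        public int Id { get; set; }
        public string Name { get; set; }

        public int TeamId { get; set; }            // exposed foreign key
        public virtual Team Team { get; set; }     // one direction...
    }

    public class Team
    {
        public int Id { get; set; }
        public string Name { get; set; }

        public virtual ICollection<Player> Players { get; set; }  // ...and back: bidirectional
    }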
virtual properties without the List<> type
Such properties are known as navigation properties because they "navigate" to other entities. Lists are collection navigation properties, and entity-type properties (without the List<> type) are reference navigation properties. They don't have to be virtual. When they're virtual, EF may be able to lazily load the properties.
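In code, the two kinds could look like this (the property names are illustrative):

    using System.Collections.Generic;

    public class PGTournament
    {
        public int Id { get; set; }

        // Reference navigation property (no List<>): points to a single related entity.
        public virtual GameGenre GameGenre { get; set; }

        // Collection navigation property: points to many related entities.
        public virtual List<Match> Matches { get; set; }
    }

With lazy loading enabled, reading tournament.GameGenre on a loaded entity triggers an extra query the first time it is accessed; with non-virtual navigation properties you load related data explicitly, for instance with Include.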
this means that it is a one-to-one relationship
Why? Reference navigation properties are often the "1" part of a 1:n association. I think most of your reference properties are like that. For example, GameGenre. I think there are many tournaments having the same GameGenre. It's a 1 (genre) to n (tournament) association, even if GameGenre doesn't have a Tournaments collection. Maybe only TournamentSettings and MainImage are actual 1:1 associations.
1:1 associations distribute data belonging to one entity over multiple tables. There can be very good reasons to do that. One of them is to facilitate querying light-weight data without the heavy payload of some blob, like MainImage. Another one is separating sensitive data from public data. Or common data (often queried) from specialized data (queried sometimes), maybe your TournamentSettings.
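A rough sketch of the difference, reusing your names (the exact shape of the classes is my assumption):

    public class GameGenre
    {
        public int Id { get; set; }
        public string Name { get; set; }
        // No Tournaments collection here, yet the association is still
        // 1 (genre) : n (tournaments): many tournament rows can point at one genre.
    }

    public class PGTournament
    {
        public int Id { get; set; }

        public int GameGenreId { get; set; }                                // the "n" end of a 1:n
        public virtual GameGenre GameGenre { get; set; }                    // reference to the "1" end

        public virtual TournamentSettings TournamentSettings { get; set; }  // plausibly a real 1:1
        public virtual MainImage MainImage { get; set; }                    // plausibly a real 1:1 (heavy blob)
    }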
Now, finally, your question:
should I favor a fully normalized design over a OOP Approach when modeling in Entity Framework?
You're comparing apples and oranges. Normalized design is database, OOP is class model. But if there is anything to favor, it's normalized design. A well-wrought database design is pivotal to any data-based application. Everything else follows. The EF class model will necessarily closely reflect the database structure. As I said above: it must be seen as a data access layer.
But whenever you model business logic, of course, try to do it as SOLID as possible. That means that sometimes you'll have to populate a specialized domain model out of the entities queried by EF, and sometimes the EF classes can be extended to encapsulate behavior and data (which is what OOP is all about).
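As a sketch of the first option, populating a small domain type from an EF entity might look like this; the domain type and its rule are invented for illustration:

    using System;

    // Invented domain type: encapsulates one business rule instead of exposing raw data.
    public class TournamentSchedule
    {
        private readonly DateTime _start;
        private readonly DateTime _end;

        public TournamentSchedule(DateTime start, DateTime end)
        {
            if (end < start) throw new ArgumentException("End date precedes start date.");
            _start = start;
            _end = end;
        }

        public bool IsRunningOn(DateTime day) => _start <= day && day <= _end;
    }

    // Populating it from the entity queried by the EF data access layer:
    // var entity = context.PGTournaments.First(t => t.Id == tournamentId);
    // var schedule = new TournamentSchedule(entity.StartDate, entity.EndDate);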
You nailed the two obvious choices. But neither is quite as hard as you say, and which makes more sense depends on how much you expect to integrate content from one blog into another.
Separate Database approach:
This is the simplest approach to implement. You already have a connection string for connecting to your database. Just wrap it behind a function that takes some sort of 'Request' object as an argument and returns the right database connection string for it. Easy, and very modular. Essentially nothing changes in your application.
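For example, the wrapper could be as small as this sketch; the names are made up, and in practice the mapping from request to connection string would come from configuration rather than code:

    using System.Collections.Generic;

    // Hypothetical "request" carrying just what we need to pick a database.
    public class BlogRequest
    {
        public string BlogKey { get; set; }   // e.g. "cooking-blog", derived from the host name or URL
    }

    public static class BlogConnectionResolver
    {
        // Illustrative only - this mapping belongs in configuration.
        private static readonly Dictionary<string, string> ConnectionStringsByBlog =
            new Dictionary<string, string>
            {
                { "cooking-blog", "Server=...;Database=CookingBlog;..." },
                { "travel-blog",  "Server=...;Database=TravelBlog;..." }
            };

        public static string GetConnectionString(BlogRequest request) =>
            ConnectionStringsByBlog[request.BlogKey];
    }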
The only downside is that cross-blog analysis (e.g. which blog is the most active, which blog uses the word 'fart' most often, etc.) won't work well with this approach.
Within one database approach:
This requires an extra 'blogid' field in all your top-level tables (maybe all tables) - that is, anywhere you grab data without first joining on another table. (If you are already joining on a table that has the blogid, you probably don't need to add the blogid to the new table - like perhaps the users table? That depends on the semantics you want, too.)
And yes, this does require a tiny change to your top-level queries - adding where blogID = blogIDImWorkingOn. But it makes things easier later, when you want to integrate or analyze content across blogs.
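A sketch of what that looks like in a query (table, column, and helper names are invented):

    // 'context' is assumed to be your data context; GetCurrentBlogId() is an
    // invented helper for however you determine which blog a request belongs to.
    int blogIdImWorkingOn = GetCurrentBlogId();

    var recentPosts = context.Posts
        .Where(p => p.BlogId == blogIdImWorkingOn)   // the only change vs. single-blog code
        .OrderByDescending(p => p.PublishedOn)
        .Take(10)
        .ToList();

    // Cross-blog analysis becomes a simple group-by instead of querying several databases:
    var postsPerBlog = context.Posts
        .GroupBy(p => p.BlogId)
        .Select(g => new { BlogId = g.Key, PostCount = g.Count() })
        .ToList();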
Hope this helps! Good luck!
What came first, the process, or the data used by that process? I know this is kind of a "chicken or the egg" question, but in the case of software, I believe it is the process.
For instance, you can build up your data model incrementally by implementing a single use-case at a time with just in-memory persistence (or anything as easy to implement). When you feel you've implemented enough use-cases to outline the basic entities, you may replace the in-memory persistence by a real database, and then continue to refine the schema as you go forward, one use-case at a time.
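One way to keep that swap cheap is to hide persistence behind an interface from the start. A minimal sketch, assuming a Tournament entity with an Id property (any entity from your use-cases would do; the repository shape is just an example):

    using System.Collections.Generic;

    // Minimal persistence abstraction so the storage can evolve with the use-cases.
    public interface ITournamentRepository
    {
        Tournament GetById(int id);
        void Add(Tournament tournament);
    }

    // First iteration: in-memory persistence, enough to drive out the business rules.
    public class InMemoryTournamentRepository : ITournamentRepository
    {
        private readonly Dictionary<int, Tournament> _store = new Dictionary<int, Tournament>();

        public Tournament GetById(int id) => _store[id];
        public void Add(Tournament tournament) => _store[tournament.Id] = tournament;
    }

    // Later, when the entities have stabilized, a database-backed implementation
    // can replace the in-memory one without touching the use-case code.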
This workflow takes the focus off the database and moves it to the core of the problem: the business rules. If you start by implementing the business rules, you'll eventually find (by a process very similar to natural selection, by the way) which data is truly needed by the business. If you start by modeling the database, without the feedback of whether that data is truly needed (or in that format, or at that level of normalization, etc.), you'll either end up making a lot of late adjustments to the schema (which may require heavy migration procedures if the business is already running with it), or you'll have to implement workarounds in the business rules to make up for the out-of-tune data model.
TL;DR: The database depends on the business rules - it is defined by them. You won't need the data unless you have a process that operates on it (a report is also a process). Implement the process first, and you'll find which data it needs. Model the data first, and you may just be able to count how many assumptions were wrong when you first modeled it.
A little off topic, but very important: the workflow I describe is often used along with practices such as "the simplest thing that could possibly work", test-driven development, and a focus on decoupling your architecture from the details that get in your way (hint: the database). About the last one, this talk sums up the idea pretty well.