Design – Using a relational database vs JSON objects for event/activity data

designjsonsql

I am working on a project where I am trying to decide between using a standard SQL relational database or JSON objects to store data about an event or activity.

The project will store data on multiple event types so I have decided to just describe one event type for this question.

The live music event (described in full using the JSON schema at the bottom of this question) is an object that stores data such as where the event will take place, the time/date of the event and the cost of the event. The live music event object has both one-to-one (event–> name, event –> description) and one-to-many (event–> venues, event–> dates, event–> ticket types) relationships. Furthermore, the event object can contain one or more performer IDs, which link to the performer object. The performer object stores data on musicians who are performing at the live music event.

The data will be queried by users using both simple ("Find me events with 'x' name") and complex ("Find me events with 'x' music genre and 'y' cost within a radius of 'z' from my current location") queries. The data will be submitted by users using a web form.

As you can probably tell from the defined JSON schema, I was originally going to use JSON objects to store this data but I've heard from some people who say that because my data is purely relational, I should stick to the older methods.

I would appreciate any thoughts on the pros and cons of each approach given my needs. If you need anything clarified, please feel free to ask.

{
    "event": {
        "eventID":{
            "type":"string"
        },  
        "eventType":{
            "type":"array",
            "eventTypeItem":{
                "type":"string"
            }
        },
        "eventName":{
            "type":"string"
        },      
        "eventDescription":{
            "type":"string"
        },
        "eventVenueList":{
            "type":"array",
            "eventVenueListID":{
                "type":"integer"
            }
        },
        "eventURL":{
            "type":"string"
        },
        "eventTwitter":{
            "type":"string"
        },
        "eventFB":{
            "type":"string"
        },
        "eventInstagram":{
            "type":"string"
        },
        "eventEmail":{
            "type":"string",
            "format":"email"
        },
        "eventContactPerson":{
            "type":"string"
        },
        "eventDoorTime": {
            "type":"string",
            "format":"date-time"
        },  
        "eventPerformerIDList":{
            "type":"array",
            "liveMusicPerformerID":{
                "type":"integer"
            }
        },  
        "eventSetList":{
            "type":"array",
            "eventPerformerID":{
                "type":"integer"
            },
            "eventPerformerStartTime":{
                "type":"string",
                "format":"date-time"
            },
            "eventPerformerEndTime":{
                "type":"string",
                "format":"date-time"
            }                                   
        },
        "eventDateList": {
            "type":"array",
            "eventDateItem": {
                "type":"string",
                "format":"date-time"
            }   
        },
        "eventDateStartTime": {
            "type":"string",
            "format":"date-time"
        },
        "eventDateEndTime": {
            "type":"string",
            "format":"date-time"
        },
        "eventTicket":{ 
            "type":"array",
            "eventTicketType":{
                "type":"string" 
            },
            "eventTicketLowPrice":{
                "type":"number"
            },
            "eventTicketHighPrice":{
                "type":"number" 
            },
            "eventDatesAdvancePrice": {
                "type":"number"
            }   
        }
    },  
    "performer": {
        "performerID": {
            "type":"integer"
        },
        "performerType": {
            "type":"string"
        },
        "performerName": {
            "type":"string"
        },
        "performerAlternateName": {
            "type":"array",
            "performerAlterateNameItem":{
                "type":"string"
            }
        },
        "performerGenreList": {
            "type":"array",
            "performerGenreItem":{
                "type":"string"
            }
        },
        "performerURL": {
            "type":"string"
        }                                       
    }
}   

Best Answer

I think your question really boils down to: When should I use a NoSQL approach vs. RDBMS? You settled on JSON early (a NoSQL-ish decision), perhaps because you've got Ajax consumers.

The answer of course to when to use NoSQL approaches vs. RDBMS's is basically about what type of data you're working with and what consumers you anticipate having. If your data is essentially relational (fairly flat hierarchies, no weird data types like images or audio, predictable relationships between the schemas that can be easily described in keys), and your consumers are anticipated to eventually include people who want to do Business Intelligence queries (ad hoc querying), then an RDBMS is the way to go. It's fairly easy to turn a query into a JSON representation, so it doesn't significantly burden your Ajax consumers -- it just adds a little transformation coding into your endpoints (REST/SOAP/whatever). Conversely, if your data is very hierarchical (deep schemas), contains weird data types like images, audio, video, etc., there are few relationships between entities, and you know that your end users will not be doing BI, then NoSQL/storing JSON may be appropriate.

Of course, even these general guidelines aren't rock solid. The reason Google developed Google File System, MapReduce (work which was used by Doug Cutting to build Hadoop at Yahoo) and later BigQuery (a NoSQL oriented [schemaless] way of managing large scale data) was precisely because they had a lot of ad hoc BI requests, and they couldn't get relational approaches to scale up to the tera/peta/exa/zetta/yotta scales they were trying to manage. The only practical approach was to scale out, sacrificing some of ad-hoc-query user friendliness that an RDBMS provides, and substituting a simple algorithm (MapReduce) that could be coded fairly easily for any given query.

Given your schema above, my question would basically be: Why wouldn't you use an RDBMS? I don't see much of a reason not to. Our profession is supposed to be engineering oriented, not fashion oriented, so our instinct should be to pick the easiest solution that works, right? I mean, your endpoints may have to do a little translation if your consumers are Ajaxy, but your data looks very flat and it seems likely that business users are going to want to do all kinds of ad hoc querying on things like music events (Which event was most attended within 50 miles of our capital city last year?)

'Go not to the elves for counsel, for they will say both no and yes.' -- Frodo

Related Topic