Data Structures – Is Storing Large JSON in a Database Row Wise?

data structuresfile-systemsjson

I have this project which stores product details from amazon into the database.

Just to give you an idea on how big it is:

[
  {
    "title": "Genetic Engineering (Opposing Viewpoints)",
    "short_title": "Genetic Engineering ...",
    "brand": "",
    "condition": "",
    "sales_rank": "7171426",
    "binding": "Book",
    "item_detail_url": "http://localhost/wordpress/product/?asin=0737705124",
    "node_list": "Books > Science & Math > Biological Sciences > Biotechnology",
    "node_category": "Books",
    "subcat": "",
    "model_number": "",
    "item_url": "http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=128",
    "details_url": "http://localhost/wordpress/product/?asin=0737705124",
    "large_image": "http://localhost/wordpress/wp-content/plugins/ecom/img/large-notfound.png",
    "medium_image": "http://localhost/wordpress/wp-content/plugins/ecom/img/medium-notfound.png",
    "small_image": "http://localhost/wordpress/wp-content/plugins/ecom/img/small-notfound.png",
    "thumbnail_image": "http://localhost/wordpress/wp-content/plugins/ecom/img/thumbnail-notfound.png",
    "tiny_img": "http://localhost/wordpress/wp-content/plugins/ecom/img/tiny-notfound.png",
    "swatch_img": "http://localhost/wordpress/wp-content/plugins/ecom/img/swatch-notfound.png",
    "total_images": "6",
    "amount": "33.70",
    "currency": "$",
    "long_currency": "USD",
    "price": "$33.70",
    "price_type": "List Price",
    "show_price_type": "0",
    "stars_url": "",
    "product_review": "",
    "rating": "",
    "yellow_star_class": "",
    "white_star_class": "",
    "rating_text": " of 5",
    "reviews_url": "",
    "review_label": "",
    "reviews_label": "Read all ",
    "review_count": "",
    "create_review_url": "http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=132",
    "create_review_label": "Write a review",
    "buy_url": "http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=19186",
    "add_to_cart_action": "http://localhost/wordpress/wp-content/ecom-plugin-redirects/add_to_cart.php",
    "asin": "0737705124",
    "status": "Only 7 left in stock.",
    "snippet_condition": "in_stock",
    "status_class": "ninstck",
    "customer_images": [
      "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg",
      "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/31FIM-YIUrL.jpg",
      "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg",
      "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg"
    ],
    "disclaimer": "",
    "item_attributes": [
      {
        "attr": "Author",
        "value": "Greenhaven Press"
      },
      {
        "attr": "Binding",
        "value": "Hardcover"
      },
      {
        "attr": "EAN",
        "value": "9780737705126"
      },
      {
        "attr": "Edition",
        "value": "1"
      },
      {
        "attr": "ISBN",
        "value": "0737705124"
      },
      {
        "attr": "Label",
        "value": "Greenhaven Press"
      },
      {
        "attr": "Manufacturer",
        "value": "Greenhaven Press"
      },
      {
        "attr": "NumberOfItems",
        "value": "1"
      },
      {
        "attr": "NumberOfPages",
        "value": "224"
      },
      {
        "attr": "ProductGroup",
        "value": "Book"
      },
      {
        "attr": "ProductTypeName",
        "value": "ABIS_BOOK"
      },
      {
        "attr": "PublicationDate",
        "value": "2000-06"
      },
      {
        "attr": "Publisher",
        "value": "Greenhaven Press"
      },
      {
        "attr": "SKU",
        "value": "G0737705124I2N00"
      },
      {
        "attr": "Studio",
        "value": "Greenhaven Press"
      },
      {
        "attr": "Title",
        "value": "Genetic Engineering (Opposing Viewpoints)"
      }
    ],
    "customer_review_url": "http://localhost/wordpress/wp-content/ecom-customer-reviews/0737705124.html",
    "flickr_results": [
      "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/5105560852_06c7d06f14_m.jpg"
    ],
    "freebase_text": "No around the web data available yet",
    "freebase_image": "http://localhost/wordpress/wp-content/plugins/ecom/img/freebase-notfound.jpg",
    "ebay_related_items": [
      {
        "title": "Genetic Engineering (Introducing Issues With Opposing Viewpoints), , Good Book",
        "image": "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/140.jpg",
        "url": "http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=12165",
        "currency_id": "$",
        "current_price": "26.2"
      },
      {
        "title": "Genetic Engineering Opposing Viewpoints by DAVID BENDER - 1964 Hardcover",
        "image": "http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/140.jpg",
        "url": "http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=130",
        "currency_id": "AUD",
        "current_price": "11.99"
      }
    ],
    "no_follow": "rel=\"nofollow\"",
    "new_tab": "target=\"_blank\"",
    "related_products": [],
    "super_saver_shipping": "",
    "shipping_availability": "",
    "total_offers": "7",
    "added_to_cart": ""
  }
]

So the structure for the table is:

  • asin
  • title
  • details (the product details in json)

Will the performance suffer if I have to store like 10,000 products? Is there any other way of doing this? I'm thinking of the following, but the current setup is really the most convenient one since I also have to use the data on the client side:

  • store the product details in a file. So something like ASIN123.json
  • store the product details in one big file. (I'm guessing it will be a drag to extract data from this file)
  • store each of the fields in the details in its own table field

Thanks in advance!

UPDATE

Thanks for the answers! I just want to add some more details to my question.
First, the records are updated for a specific interval. Only specific data such as the price or the title are updated.

Second, I'm also using the json encoded data in the client-side so I thought at first it would be easier to just have it json encoded so I can easily use it in the client side without having to convert. Does this change your opinion about simply storing the fields in a regular table field in an RDBMS setup?

Best Answer

Size is not so much of an issue, the ability to query and maintain the data however is.

If, for example, Greenhaven Press decides they want to change their name to Greenhaven Press International, you'll have to find the record, deserialize it, change it, serialize it, pump it back into the database.

Consider this: does storing these objects as serialized data offer you a clear added value over storing it in a relational form? If the answer is no, then it might not be worth the hassle.

UPDATE

As far as your update of your question goes: I'm inclined to say no, it makes little or no difference. Whether you update one field or all of them in this json string is irrelevant because the whole process is identical.

Don't forget that your requirements might change; even though you're using json on the client side now doesn't mean you'll need json in the future. Storing your data in a relational form guarantees technology-independence while preserving relationships, data constraints and queryable metadata: this is where the true value of a relational db lies. Discarding those advantages will neither give you a performance gain nor make your application more scalable or flexible.