How do social networking websites compute friend updates

scalabilitysocial-networking

Social networking website probably maintain tables for users, friends and events…

How do they use these tables to compute friends events in an efficient and scalable manner?

Best Answer

Many of the social networking sites like Twitter don't use an RDBMS at all but a Message Queue application. A lot of them start out with a already present application like RabbitMQ. Some of them get big enough they have to heavily customize or build their own. Twitter is in the process of doing this for the second time.

A message queue application works by holding messages from one service for one or more other services. For instance say service Frank is publishing messages to a queue foo. Joe and Jill are subscribed to Franks foo queue. the application will keep track of whether or not Joe or Jill have recieved the messages and once every subscriber to the queue has recieved the message it discards it. Frank fires messages and forgets about it. Joe and Jill ask for messages from foo and get whatever messages they haven't gotten yet. Joe and Jill do whatever they need to do with the message. Perhaps keeping it around perhaps not.

The message queue application guarantees that everyone who is supposed to get the message can and will get the message when they request them. The publisher can send the messages confident that subscriber can get them eventually. This has the benefit of being completely asynchronous and not requiring costly joins.

EDIT: I should mention also that usually the storage for these kind of things at high scale are heavily denormalized. So Joe and Jill may be storing a copy of the exact same message. This is considered ok because it helps the application scale to billions of users.

Other reading:

  1. http://www.rabbitmq.com/
  2. http://qpid.apache.org/
Related Topic