How to implement the activity stream in a social network

Architecturedesign-patternssocial-networkingstream

I'm developing my own social network, and I haven't found on the web examples of implementation the stream of users' actions… For example, how to filter actions for each users? How to store the action events? Which data model and object model can I use for the actions stream and for the actions itselves?

Best Answer

Summary: For about 1 million active users and 150 million stored activities, I keep it simple:

Use a relational database for storage of unique activities (1 record per activity / "thing that happened") Make the records as compact as you can. Structure so that you can quickly grab a batch of activities by activity ID or by using a set of friend IDs with time constraints.
Publish the activity IDs to Redis whenever an activity record is created, adding the ID to an "activity stream" list for every user who is a friend/subscriber that should see the activity.

Query Redis to get the activity stream for any user and then grab the related data from the db as needed. Fall back to querying the db by time if the user needs to browse far back in time (if you even offer this)

I use a plain old MySQL table for dealing with about 15 million activities.

It looks something like this:

id             
user_id       (int)
activity_type (tinyint)
source_id     (int)  
parent_id     (int)
parent_type   (tinyint)
time          (datetime but a smaller type like int would be better)

activity_type tells me the type of activity, source_id tells me the record that the activity is related to. So if the activity type means "added favorite" then I know that the source_id refers to the ID of a favorite record.

The parent_id/parent_type are useful for my app - they tell me what the activity is related to. If a book was favorited, then parent_id/parent_type would tell me that the activity relates to a book (type) with a given primary key (id)

I index on (user_id, time) and query for activities that are user_id IN (...friends...) AND time > some-cutoff-point. Ditching the id and choosing a different clustered index might be a good idea - I haven't experimented with that.

Pretty basic stuff, but it works, it's simple, and it is easy to work with as your needs change. Also, if you aren't using MySQL you might be able to do better index-wise.

For faster access to the most recent activities, I've been experimenting with Redis. Redis stores all of its data in-memory, so you can't put all of your activities in there, but you could store enough for most of the commonly-hit screens on your site. The most recent 100 for each user or something like that. With Redis in the mix, it might work like this:

Create your MySQL activity record
For each friend of the user who created the activity, push the ID onto their activity list in Redis.
Trim each list to the last X items

Redis is fast and offers a way to pipeline commands across one connection - so pushing an activity out to 1000 friends takes milliseconds.

For a more detailed explanation of what I am talking about, see Redis' Twitter example: http://redis.io/topics/twitter-clone

Update February 2011 I've got 50 million active activities at the moment and I haven't changed anything. One nice thing about doing something similar to this is that it uses compact, small rows. I am planning on making some changes that would involve many more activities and more queries of those activities and I will definitely be using Redis to keep things speedy. I'm using Redis in other areas and it really works well for certain kinds of problems.

Update July 2014 We're up to about 700K monthly active users. For the last couple years, I've been using Redis (as described in the bulleted list) for storing the last 1000 activity IDs for each user. There are usually about 100 million activity records in the system and they are still stored in MySQL and are still the same layout. These records let us get away with less Redis memory, they serve as the record of activity data, and we use them if users need to page further back in time to find something.

This wasn't a clever or especially interesting solution but it has served me well.

Related Solutions

C# – How to copy the contents of one stream to another

From .NET 4.5 on, there is the Stream.CopyToAsync method

input.CopyToAsync(output);

This will return a Task that can be continued on when completed, like so:

await input.CopyToAsync(output)

// Code from here on will be run in a continuation.

Note that depending on where the call to CopyToAsync is made, the code that follows may or may not continue on the same thread that called it.

The SynchronizationContext that was captured when calling await will determine what thread the continuation will be executed on.

Additionally, this call (and this is an implementation detail subject to change) still sequences reads and writes (it just doesn't waste a threads blocking on I/O completion).

From .NET 4.0 on, there's is the Stream.CopyTo method

input.CopyTo(output);

For .NET 3.5 and before

There isn't anything baked into the framework to assist with this; you have to copy the content manually, like so:

public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[32768];
    int read;
    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write (buffer, 0, read);
    }
}

Note 1: This method will allow you to report on progress (x bytes read so far ...)
Note 2: Why use a fixed buffer size and not input.Length? Because that Length may not be available! From the docs:

If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException.

C# – How to save a stream to a file in C#

As highlighted by Tilendor in Jon Skeet's answer, streams have a CopyTo method since .NET 4.

var fileStream = File.Create("C:\\Path\\To\\File");
myOtherObject.InputStream.Seek(0, SeekOrigin.Begin);
myOtherObject.InputStream.CopyTo(fileStream);
fileStream.Close();

Or with the using syntax:

using (var fileStream = File.Create("C:\\Path\\To\\File"))
{
    myOtherObject.InputStream.Seek(0, SeekOrigin.Begin);
    myOtherObject.InputStream.CopyTo(fileStream);
}

Best Answer

Related Solutions

C# – How to copy the contents of one stream to another

C# – How to save a stream to a file in C#

Related Topic