Coming up with a valid ranking algorithm for articles

algorithms, articles

I'm creating a website which will publish one or more articles a week.

I'd like to be able to rank the articles based on popularity. In order to do this, I need a formula to calculate a score for each article. This formula should reward popular articles (those that receive many views), while also bringing new articles closer to the top in order to keep the content fresh and interesting.

The website doesn't support voting on articles. Readers will be able to comment, and I can have the site track the number of unique views each article gets.


My attempt

I have come up with a way to calculate a score for each article but I'm not sure if it's the right way to do it, or if it will work at all.

This formula calculates the score:

score = unique_views / ( hours_since_release * 4 )

The formula keeps articles with many views afloat, while also making sure that they'll eventually subside to leave space for new articles.
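To make the formula concrete, here is a minimal Python sketch of it. The function name `article_score` is hypothetical, and clamping the age to at least one hour is my own assumption: the formula as written divides by zero for an article that has just been released.

```python
def article_score(unique_views: int, hours_since_release: float) -> float:
    """Score from the formula above: unique_views / (hours_since_release * 4)."""
    # Assumption: clamp the age to at least one hour so a brand-new article
    # does not divide by zero; the original formula does not specify this.
    age_hours = max(hours_since_release, 1.0)
    return unique_views / (age_hours * 4)


# Same view count, different ages: the older article scores lower.
fresh = article_score(unique_views=400, hours_since_release=10)
stale = article_score(unique_views=400, hours_since_release=100)
print(fresh, stale)
```

Because the age is in the denominator, an article has to keep gaining views roughly in proportion to its age to hold its position.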

I wrote a small program to test this formula. Here is the output of one run.

This script generates between 0 and 2 articles every day, and simulates their score as time progresses.

When an article is created, it chooses a random number (M) between 100 and 1000, representing its popularity. Every "day", the script adds a random number of views between 0 and M to each article.

The actual views shown in the output (V) are the result of this formula:

view_count = ( 10 * log10( view_count ) ) ^ 2

I used this to make sure the view count doesn't keep growing forever. Instead, its growth slows down as time goes on, as it might in real life.

The values A and S in the output represent, respectively, the article's age in days and its score, based on the formula shown above.
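For readers who want to reproduce the experiment, the simulation described above might look roughly like the sketch below. This is not the original prototype: the names `displayed_views` and `simulate`, the seeding, and the handling of a zero view count are all assumptions. Note that `^` in the formula means exponentiation, which is `**` in Python.

```python
import math
import random


def displayed_views(raw_views: int) -> float:
    """Damped view count: (10 * log10(raw_views)) ** 2."""
    # Assumption: counts below 1 map to 0, since log10 is undefined at 0.
    if raw_views < 1:
        return 0.0
    return (10 * math.log10(raw_views)) ** 2


def simulate(days: int, seed: int = 0) -> list:
    """Simulate `days` days and return the articles sorted by final score."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    articles = []
    for day in range(days):
        # Between 0 and 2 new articles per day, each with popularity M in [100, 1000].
        for _ in range(rng.randint(0, 2)):
            articles.append({"born": day, "m": rng.randint(100, 1000), "raw": 0})
        # Every existing article gains between 0 and M raw views per day.
        for article in articles:
            article["raw"] += rng.randint(0, article["m"])
    for article in articles:
        article["age"] = days - article["born"]  # A: age in days, at least 1
        article["views"] = displayed_views(article["raw"])  # V: damped views
        # S: the score formula, with the age converted from days to hours.
        article["score"] = article["views"] / (article["age"] * 24 * 4)
    return sorted(articles, key=lambda a: a["score"], reverse=True)


ranked = simulate(days=30, seed=1)
for article in ranked[:5]:
    print(article["age"], round(article["views"], 1), round(article["score"], 3))
```

Running it with several seeds and printing the ranking after each simulated day is one cheap way to see whether new articles surface and old ones sink as intended.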

This formula causes some interesting behavior: if you look at the game Tomb Raider: Anniversary, it spawns with a big M number, which means it's going to be very popular. It gets to the top on the first day, drops over the next 2 days, and then returns to the top for 3 days in a row because of the number of views it gets.

This seems to be the behavior I'm looking for, but I'm not sure if it will work in practice, or even if this is the right approach at all.

You can find the source of my rough prototype here if you'd like to run it yourself.


Here comes the question:

How can you come up with and test ranking algorithms so that they satisfy the conditions of the problem, using the available variables?

Secondary question: Is my ranking algorithm appropriate, and would it work in the situation I described?


EDIT: From the links in maythesource.com's answer, I just realized there's also the issue of score gaming. Would only considering unique visits somehow prevent gaming, or is it a weak countermeasure?

Best Answer

Your problem is similar to the one application markets face. Take Google Play, for example: it doesn't use only one list.

Featured: new apps hand-picked by Google Play team.
Staff Picks: rotating set of apps chosen by Google Play team.
Top Free: most popular free apps of all time.
Top New Free: most popular free apps less than 30 days old.
...
Trending Apps: apps showing growth in installs in the last 24 hours.

See: https://support.google.com/googleplay/android-developer/answer/1295940?hl=en

In the same way, I don't think it's enough to have only one list and one algorithm. As it is, your list may get into a feedback loop: articles near the top get more exposure, which earns them more views and pushes them even higher.

If, however, you created two or three lists, you could offset this.

Examples:

  • Top New Articles - Scan all articles published within the last week that have more than X rating and Y views, and order them with the following algorithm: group articles by day, then order the articles within each day by views + rating.
  • Overall Top Articles - Display articles based on their all time views/rating/popularity.
  • Featured/Editor's picks - Select the articles that really stand out and display these.
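As a rough sketch of the first two lists, assuming Python: the `Article` fields, the threshold values, and the use of comment count as a stand-in for "rating" (the site in the question has no voting) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    age_days: int  # whole days since publication
    views: int     # unique views
    comments: int  # assumed stand-in for "rating"


def top_new(articles, max_age_days=7, min_views=50, min_comments=1):
    """Top New Articles: recent articles above both thresholds,
    grouped by day (newest first), then by views + rating within a day."""
    recent = [
        a for a in articles
        if a.age_days < max_age_days
        and a.views > min_views
        and a.comments > min_comments
    ]
    return sorted(recent, key=lambda a: (a.age_days, -(a.views + a.comments)))


def overall_top(articles):
    """Overall Top Articles: all-time views + rating, highest first."""
    return sorted(articles, key=lambda a: a.views + a.comments, reverse=True)


articles = [
    Article("A", age_days=0, views=120, comments=4),
    Article("B", age_days=0, views=300, comments=9),
    Article("C", age_days=3, views=900, comments=20),
    Article("D", age_days=30, views=5000, comments=80),
    Article("E", age_days=1, views=10, comments=0),
]
print([a.title for a in top_new(articles)])      # ['B', 'A', 'C']
print([a.title for a in overall_top(articles)])  # ['D', 'C', 'B', 'A', 'E']
```

The month-old article D dominates the all-time list but never appears in the new list, so neither list can crowd out the other. The featured list is left out because it is curated by hand rather than computed.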

It helps to keep some control over what is featured, and to enforce a minimum level of quality in whatever you feature or sponsor for your viewers.

Also have a look at some articles such as:
