Mathias Meyer
Mathias Meyer

Tags

The idea of building and storing user timelines (think Twitter) in Riak confused me at first. It sounds like such a spot-on case for time series databases. Yet Yammer managed to make the idea pretty popular. The whole thing lacked implementation though, because they kept their to themselves, which I don’t blame them for at all.

Apple Mac Timeline

So let’s have a look at how simple it is to build one. You can see a timeline of all Apple Macintosh products above, but that’s not what we want to build.

Instead we want to build something like the Twitter timeline. A user follows many other users, and wants to look at a feed built from their activities, so something like the timeline shown below.

Twitter timeline

How do you model a timeline in Riak?

For every user we store one object in Riak. Every timeline contains a list of tweet ids, or whatever activity you’re referencing, or it can contain the whole tweets. Something like this should work:

{
"entries": [
"147358793708220417",
"147358442070351873",
"147239915816620033",
"147338086110597120"
]
}
view raw timeline.json hosted with ❤ by GitHub

If you want to store more data, turn the list into an array of hashes containing whatever information is necessary to rebuild the timeline later.

{
"entries": [
{
"id": "147358793708220417",
"user": "roidrage",
}, {
"id": "147358442070351873",
"user": "technoweenie"
}, {
"id": "147239915816620033",
"user": "seancribbs"
}, {
"id": "147338086110597120",
"user": "pharkmillups"
}
]
}

Adding entries to the timeline

To add new entries to a timeline, prepend them to the existing list of entries, here’s some Ruby code to show how it’s done.

def add(user, activity)
bucket = @riak.bucket("timelines")
feed = bucket.get_or_new(user)
feed.data.unshift(activity)
feed.store()
end
view raw add_entry.rb hosted with ❤ by GitHub

The code assumes you take care of followership somewhere else. You can store that data in Riak too, but the code is oblivious to its whereabouts.

Conflicts, siblings, oh my!

The fun starts when two clients update the same timeline, you get a conflict and siblings. The strength of a simple data structure like the example above is that they’re easy to merge together while still keeping ordering based on the ids. The ids are ordered only in this example, Twitter somewhat makes sure they are.

When you get a conflict, a smart Riak library like Ripple helps you find out about it. To add on the earlier example, here’s a version of add that detects conflicts.

def add(user, activity)
bucket = @riak.bucket("timelines")
feed = bucket.get_or_new(user)
if feed.conflict?
feed.data = merge(feed.map(&:data))
end
feed.data.unshift(activity)
feed.store()
end

Suddenly you have two or more objects instead of one, each containing a different timeline. To turn them into a single list, you merge all of them together, discard the duplicates, and restore order based on the id. Here’s some Ruby to do that.

def merge(*timelines)
activities = []
timelines.each do |timeline|
timeline.each do |activity|
if not activities.include?(activity)
activities << activity
end
end
end
activities
end

You iterate over all timeline objects and keep adding unique activities to a new list, returning that when done.

Sort, and done!

All that’s left to do is sort the result.

def sort(timeline)
timeline.sort do |activity1, activity2|
activity1['id'] <=> activity2['id']
end
end
def add(user, activity)
bucket = @riak.bucket("timelines")
feed = bucket.get_or_new(user)
if feed.conflict?
feed.data = sort(merge(feed.map(&:data)))
end
feed.data.unshift(activity)
feed.store()
end

There’s the whole code. To spare you the pain of having to write your own library, all this is bundled into a gem called riaktivity. If you’re doing Python, the Brett Hoerner has got you covered with timak. Be sure to watch the original talk by Yammer, it’s well worth it.

There’s ups and downs to this approach, and things that need to be taken care of. More on that and modeling data for Riak in general is covered in the Riak Handbook, the definitive guide on Riak.