User User name Password  
   
Saturday 5.7.2008 / 07:07 PM
Search:        In English   Suomeksi   På svenska
afterdawn.com / profiles / personal safety valve / blog archive / sharding a forum database /
Home Blog Pictures Shoutbox Links

Sharding a forum database

29 Aug 2007 10:29 (Edited: 29 Aug 2007 10:29)

Ok, this is just me thinking out loud, nothing to prove my theory or correctness of my thinking yet, but wanted to put something in "paper" anyway, in case something comes up and I forget the concept before having time to work on the process..

Anyway, everybody who works on web development with modestly big sites that are dynamic know the basic problem with database being the bottleneck of the whole system.

The most obvious thing for everybody to do at first is to utilize some level of database caching, often done either via tools like memcached or by application server -level caching tools (stuff that is implemented to the core of ColdFusion and its clones, but not in very "professional" way). After this is done, the next bottleneck comes from data updates (SELECTs and UPDATEs) and the occassional moments that you actually do need to get fresh data to your cache as well.

The SELECT issue and caching it can be done in various ways -- we utilize both, memory-level caching for our query results and also static file-based caching for information that we can trust to stay intact up until some level of "triggering" occurs. So, lets not babble more about that -- caching SELECT data is piece of cake and there are zillion options availble for it, no matter what app server you're utilizing, whether its PHP, JSP, ColdFusion or BlueDragon.

The problem that happens is with monstrous tables, updating them and seeking one-off data from them. Everybody who has a discussion forum with more than 1-5 million messages knows what I'm talking about. Data changes constantly and people make LIKE searches against it (provided you still have the search engine enabled to do that, most big forums tend to disable it :-). Having index tables and row-level locking can only help you so far, but eventually you'll end up having problems.

Sure, the easy solution is to add hardware -- and we're in position that we must do it anyway -- but as a developer for almost a decade now, I object the idea of solving problems purely by adding more horsepower to the db itself.

Then, I bumped into couple of white papers today, describing something that I've thought for many, many years about splitting up big tables into smaller chuncks. Call it a "federated table model", "partitioning database tables" or -- yay, new term -- "sharding". MySQL 5.1.6 and above offer some level of built-in assistance for this, but most solutions, whether built-in to the db engine itself or to the code, are based on ID ranges. I.e. you put, say forum messages, IDs ranging from 0 to 100,000 to one table and call it messages_1, then IDs from 100,000 to 200,000 to table called messages_2, etc..

It might work fine for most cases, but it poses a problem as well when you need to join that data. My thinking is centered here on discussion forum model, which is typically split into four levels:

groups
forum rooms
threads
messages

Typically there's only a handful of groups, we have something like 20 groups on forums. Their purpose is usually quite obscure, they simply provide a "big umbrella" under which group of forums sit. Forum rooms -- most sites have less than 30 forum rooms, which I think, is ideal for most sites. We aren't most sites, we have 250 of them :-) But still, from db point-of-view, a portion of a drop in an ocean.

Threads. Now figures are getting bigger. The ratio between messages (==posts) and threads varies wildly from forum to forum. On our site, the ratio is somewhere along 6-8 messages per thread. So, we have appx. 550,000 threads. For a slightly older server running MySQL, any table beyond 500,000 is about to pose a some level of a problem. Not a big problem, but still a problem.

Then, to the nightmare of all forum site dbas.. Messages. We aren't anywhere near the top of big-boards chart, having only 2.3M messages in our English forums, 750,000 messages in our Finnish forums and something like 300,000 messages in "other forums". All these forums use the same tables, so to simplify it, we have appx. 3.7 million messages altogether in our messages table.

Now, a table with just under four million rows is slightly problematic itself. Then, by definition, you do JOINs against threads table and boom, you already are joining up a 500,000 rows to 3.7 million rows. And as most forums show the nick names of writers for each post, you also need to make joins or some other arrangements to join your results to 'users' table as well. Our users table just happens to be quite large. 650,000 records to be pretty exact.

Now, looking at those figures, splitting up stuff makes sense.

My initial approach will be to split up the messages table, as it gets most "action" anyway, in terms of SELECTs, UPDATEs and INSERTs anyway (actually UPDATEs occur quite rarely, only couple of hundred each day).

Back to biz...

So, as I stated previously, my thinking has been in the past to split the messages table up by "ranges", i.e. store, say, 100k concurrent records to one table and move to next table after that. Problems arise when the "split position" happens to be in middle of a thread and you need to show the posts in one page anyway, so you need to build some weird UNION queries to come up with a solution.

Furthermore, splitting up by ranges -- whether you decide to use 'thread_id' or 'message_id' ranges -- causes one issue. Your "old" tables, the ones with small-number-IDs, are being used very little by UPDATEs and INSERTs. And as unfortunate it is and as much as it tells about the current culture, the most SELECTs will also occur to the most recent table. So, you manage to reduce the size of the most actively used table, but still face the problem of having to lock its rows for updates, etc and let pther tables sit idle most of the time.

Solution? Lets be creative with the ranges :-)

Easy solution that I came up with.. And as a disclaimer I know this has been done by other people for sure in the past and I'm quite certain that somebody comes along and disses my idea as a copycat from somebody else. But I havent actively thought about splitting up tables, even by ranges, for 2-3 years now until today when I accidentally bumped into one short blog article that talked about splitting up tables ("partitioning db tables"), but it also focused on range-method.

Ah, damn, forgot to continue.. So... I'm going to focus on splitting up the goliath and the most obvious solution that I came up with was to split it up by the thread_id that each post carries and that effectively ties each post to one discussion thread. Now, to avoid the range problem I described above, I decided to use something that I've used for pseudo-randomizing in the past, i.e. lets take the last digit..

So, you begin a new discussion thread and first you obviously (after you've initialized a transaction, of course, we're keeping it ACID..) create a new rec in 'threads' table and get its ID back. ID is, say, 1122334. Now, we take the last digit, which is '4' and use it as our 'messages' table indicator. So, after the thread ID has been generated, our first post goes to table called 'messages_4'. And so on, all messages posted to that thread obviously share the same thread_id, so they all get stored to the same table. But the next new thread that is created by a user gets thread_id 1122335 and all of its messages get stored to table called 'messages_5', etc.. You get ten tables, all of them are in active use and therefor all UPDATEs, INSERTs and SELECTs get evenly distributed to 10 different tables.

If you manage to remove JOINs from basic selects against messages_X tables (typically JOINs pulling the user nick names to show alongside the msg itself), you can freely distribute each messages_X table's queries to different database slaves as well, if you prefer to have that type of "manual" load balancing.

Only nag in this concept (other than being forced to modify code that handles messages) is that I need to build a separate index table to maintain the incremental numbering of messages, which would essentially just hold the message_ids. Sure, this could be avoided by swapping the current numbering scheme to use GUID instead or to use a combination of table_number-message_id, but as the additional table adds very little overhead to the whole project, I think I'm going to stick to the index table plan instead.

Silly little concept and I cannot guarantee its efficiency, but as it is damn easy to implement, I most likely will be testing it later this year and obviously posting my findings here in my blog as well :-)

Tags: bluedragon  coldfusion  database  forum  forums  mysql  partition  partitioning  php  sharding  splitting  up 

 

User comments

    (No comments made)


Post your comment

In order to post your comments here, you need be logged in to our system. Simply follow this link in order to login and to post your comments here.

Digital video: AfterDawn.com | AfterDawn Forums | DVD X Copy Forums
Music: MP3Lizard.com
Gaming: Blasteroids.com | Blasteroids Forums
Software: Software downloads
Blogs: User profile pages
RSS feeds: AfterDawn.com News | Software updates | AfterDawn Forums
International: AfterDawn in Finnish | AfterDawn in Swedish | download.fi | fin.MP3Lizard.com
Navigate: Search | Site map
About us: About AfterDawn Ltd | Advertise on our sites | Rules, Restrictions, Legal disclaimer & Privacy policy
Contact us: Send feedback | Contact our media sales team
 
  © 1999-2008 by AfterDawn Ltd.