Dealing with the Digg Effect - A First Hand Story and Advice on Keeping Your Site Up

Published in Web Development on Thursday, October 5th, 2006

Getting Dugg, Slashdotted or hit by heavy traffic from another social website like Reddit or Del.icio.us can be quite exciting for any website owner. Traffic can be addictive, and it is exciting to see the hits pile up. But many websites have trouble dealing with the Digg effect and stop responding due to the heavy traffic influx.

Note: So this article made it to the Digg homepage and the site survived without trouble. Check out the redux at the bottom.

So the question is, how can you be sure that your site won't be brought down after it's made it to the front page of one of the sites mentioned above?

Well, I don't claim to have the answer, but just yesterday I had the pleasure of playing support for a friend's website as it made it to the front page of Digg and onto del.icio.us Popular. And guess what?

It went down.

So, what did we try, and what would I suggest for the next round?

The battle plan

We had done our best to set the site up to battle valiantly against the traffic.

We had set up his WordPress 2.0 install to cache, which had reportedly helped other people survive this type of scenario, and we had a static HTML version of the dynamic/cached page ready to be redirected to if necessary. As a last effort, we also employed the Coral cache advice recently published over at Shoemoney.

So what went wrong?

Well, basically, the server stopped responding to the requests. At that point, none of our work was of use, as the requests weren't getting to the server. The redirects we used couldn't fire, and the page we prepared could not respond.

Lessons Learned - What Would I do Next Time?

1. Prepare the front line: Talk to your hosts

The front line for your team in this case is the server. If you, like many, are on a shared box, you don't have much control over what happens when you get a spike in traffic, and chances are you wouldn't know what to do anyway :-)

The good news is that the people at your hosting company do know what will happen and how to handle it. And since the next few tips become useless once your server is down, this is the most important step.

If you think that you may at some point get Dugg and hit the home page, call your hosts and ask them about it. What will they do if it happens? Should you advise them so they know that the surge isn't an attack of some sort?

Get this stuff sorted, because, as we found out yesterday, it's not any fun sitting on hold while your site is shooting blanks.

Note: This step may sound a little out there, but consider it a preemptive method for dealing with the situation. One thing I learned while researching this, after the fact, was that different hosts react differently, i.e. some did fine serving dynamic pages where others crashed. Your host will likely have experience with traffic spikes on shared hosting and that is the info you are after.

2. Serve an HTML version of your page

This one is quite simple. Basically, once you know that you've been Dugg, save a copy of your post as a flat html page (View Source, Save) and then add a line similar to the following to your .htaccess file:

RewriteRule ^my/10/04/dynamic-url$ html-version.html [L]

Caching like that done in WordPress (caching of query data) is decent, but it still requires PHP to build the page from the cached data, and that takes up server resources that aren't needed to deliver a flat, normal HTML page.
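For anyone who wants a bit more than the one-liner above, here is a rough sketch of how the full stanza might look. It assumes the .htaccess file sits in the document root, that mod_rewrite is available, and that the snapshot was saved as html-version.html next to it; the URL and file name are just the placeholders from the rule above. The extra condition is my own addition, so that the rule only fires while the snapshot file actually exists.

```apache
# Sketch only: serve the static snapshot instead of the dynamic post.
# Put this ABOVE the WordPress rewrite block so it is evaluated first.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Only rewrite while the saved snapshot is actually on disk
    # (delete the file later and the dynamic page comes back).
    RewriteCond %{DOCUMENT_ROOT}/html-version.html -f

    # Placeholder URL from the article; the optional trailing slash
    # catches both /my/10/04/dynamic-url and /my/10/04/dynamic-url/
    RewriteRule ^my/10/04/dynamic-url/?$ /html-version.html [L]
</IfModule>
```

Because this is an internal rewrite rather than a redirect, visitors keep seeing the original URL, and PHP is never started for that page.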

3. Host your assets on another server

If your site is image heavy, consider hosting the images on another server or loading them into your Flickr account. Or cache them on the aforementioned Coral (though I have my doubts about how reliably quick Coral is).

The same can be done with CSS files and JavaScript etc.
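If there is no time to edit the image URLs in the post itself, a redirect can act as a stopgap. The following is a sketch under a few assumptions: the images live under an images/ directory, they have already been copied to a second server, and static.example.com is a made-up name standing in for that server.

```apache
# Sketch only: bounce image requests to a second server.
# static.example.com and the images/ path are hypothetical placeholders.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # 302 (temporary) so browsers return to the original URLs
    # once the rule is removed after the spike.
    RewriteRule ^images/(.+)$ http://static.example.com/images/$1 [R=302,L]
</IfModule>
```

Keep in mind that every image request still touches your server once to receive the redirect, so changing the URLs in the HTML itself saves more. The redirect only spares the server from sending the image bytes.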

4. Redirect to a Mirror of your page

This is the advice that was posted recently over on Shoemoney and has been bandied about in Digg comments for some time now. It is good advice; however, keep in mind two things:

Duggmirror seems to be a very good answer to this problem as it appears to be quite quick. Duggmirror caches pages that are submitted to Digg, and so the only issue is that if you want to redirect to them, you have to set it up once you've been Dugg (or once you realize that you have been Dugg), which may be too late.
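As a rough illustration of what such a redirect could look like in .htaccess, here is a sketch. Both mirror.example.com and the MirrorFetcher user agent string are placeholders I have made up; the real host name and the way the mirror's crawler identifies itself depend on the mirror service, so check its documentation before relying on this.

```apache
# Sketch only: send visitors to a mirrored copy of one page.
# mirror.example.com and "MirrorFetcher" are hypothetical placeholders.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Let the mirror's own fetcher through, otherwise it would be
    # redirected back to itself and never get a copy of the page.
    RewriteCond %{HTTP_USER_AGENT} !MirrorFetcher [NC]

    # Temporary redirect so the original URL keeps working afterwards.
    RewriteRule ^my/10/04/dynamic-url/?$ http://mirror.example.com/my/10/04/dynamic-url [R=302,L]
</IfModule>
```

As with everything else here, this only helps while the server is still answering requests, so it needs to be in place before the traffic peaks.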

Wrap Up

There you have it, four things that can be done to deal with the Digg Effect.

If you are too lazy to do most of these things, you could save your site by redirecting to something like Duggmirror before the server decides to crumble under the traffic. But the reality is that a Digging can happen fast, and if your server stops responding, you can no longer redirect or do anything but watch the blank page come up over and over again while you are on hold to the help desk.

So talk to your hosts, know the plan and be prepared, and maybe you won't have to endure the pain of comments like this one.

Links and Resources

This article is by no means the first on the topic; it's just my experience. The following links are advice and ideas from around the web on how to avoid the Digg effect:

This article made it to the Digg homepage. As you can see in the graph below, there was a serious spike, peaking at about 2000 visits per hour at 8 p.m. Definitely not too much for the server to handle, and looking at the hard data (and the graph), Mint, which uses the DB server, had no trouble recording the flow of traffic.

At times I would poke around the rest of the site to see how it was doing (emptying my cache each time) and the dynamic portion responded as it usually does.

Full kudos to Bluegravity and the dual Xeon server this site sits on for keeping it up just fine.

Graph of site traffic

Comments and Feedback

Great post Mike, good to see you posting again!

Thanks Dean, I appreciate that. It felt good to see my site again. Time to get writing again, I think.

Nice post.

If it gets really heavy I tend to do a 302 redirect to the Coral Cache version of the page for everyone apart from Coral.

Coral isn't THAT quick, but it's much better than a DB error page or nothing at all!

I was dugg as well with a wallpaper site being hosted, and traffic went through the roof. However, having full root access to the server, it was an easy fix. I had set all my variables low for Apache so it would not spawn too many children, which would have killed the server because of load. Because of this, the site was unavailable to many visitors, but still serving. I upped the limit, and it handled the load better than ever before. It served out about 20 Mbit/second for almost 12 hours before it slowed down to about 5 Mbit/second. Of course there were spikes. Total transfer used during those 12 hours? 700 GB.

Great post. I've been interested lately in reaching out to popular hosts (dreamhost, mediatemple, 1and1) to get their official guides written up to handle the digg/slashdot (is it still appropriate to include slashdot, or have those days passed?) effect.

If anyone gets official responses from their host, perhaps those documents should be published (or, to protect the host's network, published as account center help pages to clients).

That is really good advice. Digg brings down a lot of websites.

Some other quick tips for the next level that might help someone:

1. Use lighttpd for your images and perhaps the whole site.

2. Use fast-cgi if possible.

3. If your budget allows, use separate servers for data (mysql/pgsql), application (perl/php/python/etc) and images.

4. If you use mysql use innoDB for write intensive tables.

5. Use Danga's excellent memcached.

At uvota we've experimented with all these things and more.

So now I know how to handle the Digg Effect

Could you give me an idea of the number of unique visitors a digging brings in? I'm trying to work out if I could withstand a digging or not.

Another tip if your wordpress driven site gets dugg: Go into your templates and hardcode everything you can, like urls that normally use PHP includes.

In your header.php and other templates, there are tags like "bloginfo('charset')" that can be changed to "UTF-8", or "bloginfo('rss2_url')" that can be changed to yourblog.com/feed/ and such.

All those little tags are coded this way to make the templates portable for installing on any wordpress blog. Once you have the blog and template set up, they aren't necessary. They should be hard coded to cut down on the ?php calls that build the site.

Andy, here's a graph of the traffic from Mint. Nothing too great, but a definite spike. Traffic's been down here as I've been quiet:

Or you could just use the exceptionally good CMS that is practically immune even to the kind of high traffic that digg.com generates: Drupal.

One of the larger hosts that can handle such spikes is Netfirms.com

They use a clustered load-balanced platform ideally suited for large traffic spikes.

good advice. made me just go get Mint for my own site.

man i am lame. i didn't know a spike in traffic could bring down my site. i am addicted to traffic + so excited to be blogrolled by big sites lately. i'd better prepare my little blogger site + hope for digging.

How does Drupal handle this traffic to be immune?

Hey derbz, no idea. I did notice a lot of Drupal fanboyz on the Digg thread and elsewhere ;-)

I've also seen a lot of other articles with people singing the praises of Drupal, but in the case I am describing, talking about the script that powers the site sort of misses the point. (I've also seen a lot of posts where WP sites have been Dugg and survived just fine.)

The site I was helping out with went down because the host - Pair - thought the Digg traffic was some form of DoS attack. They took the site down. It had nothing to do with the script running the site. We were worried, however, as his DB server had been throwing errors last Sunday with normal traffic numbers.

So the gist of it is: get to know the host "you" are on. Not just the specs, but what they will do if you have a spike in traffic. If you simply filled out a form and pay the $7.00 each month for hosting, you may want to find out more about what you are paying for (the 'you' there isn't directed at you, derbz!).

Good idea about the HTML cache of the page. It's really a shame when someone gets front paged only to have their site go down. What's your hosting details? (processor, memory, shared or VPS, etc.)

Hey spike, I'm on a dual xeon, 2gigs ram shared host*. My site didn't go down, however, the site in question was hosted at another provider.

* Intel Xeon 2.0 - 3.2 GHz Processor(s) with 512 - 1024 KB L2 Cache

