Facebook Down?

Posted: Thu Sep 23, 2010 8:46 pm
by Boardtalk.net
I'm in Ireland and was wondering if it's the same anywhere else... or is it just me?

Re: Facebook Down?

Posted: Thu Sep 23, 2010 8:48 pm
by Mai
It is very much down. Good thing for me too, I need fewer distractions. :lol:

Re: Facebook Down?

Posted: Thu Sep 23, 2010 8:51 pm
by Boardtalk.net
Phew... so it wasn't anything I did :lol:
I'd be interested to see what countries are affected… everyone post up where you are please and thanks :D

Re: Facebook Down?

Posted: Thu Sep 23, 2010 9:19 pm
by MichaelC
UK, me too. GitHub was also down today and Freenode had to take a server down (netsplit). Two of my servers had emergency maintenance today.

Maybe some of them were running CentOS and needed to deal with the issue my servers did:
There was a recent exploit that affected all CentOS 5 64-bit servers. To mitigate it, every server running that release needs to be rebooted into a newly released kernel.

Re: Facebook Down?

Posted: Sat Sep 25, 2010 4:44 pm
by Boardtalk.net
Thanks Unknown Bliss for the info :D Don’t know too much about the inner workings of servers but it must have been a heck of an upgrade.

Re: Facebook Down?

Posted: Sat Sep 25, 2010 5:25 pm
by Brandon05
Not down for me. Michigan, USA

I barely use it anyway, so fewer distractions for me 24/7.

Re: Facebook Down?

Posted: Sat Sep 25, 2010 8:46 pm
by MichaelC
The downtime was on the 23rd. ;)

Re: Facebook Down?

Posted: Sun Sep 26, 2010 10:28 pm
by Brandon05
It was not down for me at all this week. ;)

I have it on a tab when I load up my browser and use it to talk with friends all the time.

Re: Facebook Down?

Posted: Sun Sep 26, 2010 11:05 pm
by MichaelC
It was down. FB said it themselves.

http://www.facebook.com/note.php?note_i ... 199&ref=mf

Re: Facebook Down?

Posted: Mon Sep 27, 2010 12:49 am
by Sierron
Good thing I didn't care much about Facebook.

Re: Facebook Down?

Posted: Mon Sep 27, 2010 2:25 am
by Joe Abraham
On September 23, 2010, Facebook was showing a DNS failure when you tried to reach their site. This seemed to be the case for most visitors.

Either their service is down - which rarely happens - and in that case you should just try later, or you have Internet connection problems on your computer. If you're able to connect to other websites, then your computer is not the problem.

Another possibility is that you're trying to connect to Facebook via a computer/network that blocked Facebook (such as a work place, a school, etc.).

To check if Facebook is up/down "at the moment," you can go to a website called Downrightnow. See the Related Link for this website's URL.
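
The "is it them or is it me?" check above can be roughly sketched in Python. This is only an illustration: `can_resolve` and `diagnose` are made-up helper names, the reference host is whatever other site you trust to be up, and a successful DNS lookup does not prove the site is actually serving pages (a local DNS cache can also hide a failure).

```python
import socket


def can_resolve(host, resolve=socket.gethostbyname):
    """Return True if a DNS lookup for `host` succeeds."""
    try:
        resolve(host)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(target, reference, resolve=socket.gethostbyname):
    """Compare the target site against a reference site you expect to be up."""
    if can_resolve(target, resolve):
        return "target resolves"
    if can_resolve(reference, resolve):
        return "target DNS is failing, your connection looks fine"
    return "nothing resolves, check your own connection"
```

For example, `diagnose("www.facebook.com", "www.example.com")` does two lookups and tells you which side the problem is probably on. The `resolve` parameter is only there so the lookup can be swapped out for testing.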

Re: Facebook Down?

Posted: Mon Sep 27, 2010 2:48 pm
by .Victor
Down for everyone or just me is a great site to check if a site is down :)

Re: Facebook Down?

Posted: Mon Sep 27, 2010 2:49 pm
by MichaelC
Unknown Bliss wrote: It was down. FB said it themselves.

http://www.facebook.com/note.php?note_i ... 199&ref=mf

Joe, an explanation is there.

I have quoted it here as well:
Early today Facebook was down or unreachable for many of you for approximately 2.5 hours. This is the worst outage we’ve had in over four years, and we wanted to first of all apologize for it. We also wanted to provide much more technical detail on what happened and share one big lesson learned.

The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition. An automated system for verifying configuration values ended up causing much more damage than it fixed.

The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store. This works well for a transient problem with the cache, but it doesn’t work when the persistent store is invalid.

Today we made a change to the persistent copy of a configuration value that was interpreted as invalid. This meant that every single client saw the invalid value and attempted to fix it. Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second.

To make matters worse, every time a client got an error attempting to query one of the databases it interpreted it as an invalid value, and deleted the corresponding cache key. This meant that even after the original problem had been fixed, the stream of queries continued. As long as the databases failed to service some of the requests, they were causing even more requests to themselves. We had entered a feedback loop that didn’t allow the databases to recover.

The way to stop the feedback cycle was quite painful - we had to stop all traffic to this database cluster, which meant turning off the site. Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site.

This got the site back up and running today, and for now we’ve turned off the system that attempts to correct configuration values. We’re exploring new designs for this configuration system following design patterns of other systems at Facebook that deal more gracefully with feedback loops and transient spikes.

We apologize again for the site outage, and we want you to know that we take the performance and reliability of Facebook very seriously.
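
To make the feedback loop in the quoted note a bit more concrete, here is a toy simulation in Python. It is a sketch under heavy assumptions and not Facebook's code: the function name `simulate`, the single `"config"` key, the client count and the database capacity are all invented for illustration. Every client reads the cache at the same moment in each tick; a client that sees a bad or missing value queries the database, which can only answer `capacity` queries per tick.

```python
def simulate(clients, capacity, ticks, delete_on_error):
    """Return how many database queries were issued in each tick."""
    store_value = "good"        # the persistent store has already been fixed
    cache = {"config": "bad"}   # but the cache still holds the invalid value
    history = []

    for _ in range(ticks):
        snapshot = cache.get("config")  # all clients read at the same moment
        queries = 0
        for _client in range(clients):
            if snapshot == "good":
                continue                # valid value, no repair needed
            queries += 1
            if queries <= capacity:
                # The database answered: refresh the cache.
                cache["config"] = store_value
            elif delete_on_error:
                # Overloaded database: the error is mistaken for an invalid
                # value and the cache key is deleted, undoing the repair.
                cache.pop("config", None)
        history.append(queries)
    return history


naive = simulate(clients=100, capacity=10, ticks=3, delete_on_error=True)
careful = simulate(clients=100, capacity=10, ticks=3, delete_on_error=False)
```

With deleting on error, `naive` stays at 100 queries in every tick, because the failed clients keep wiping out the value the successful ones just stored. When errors leave the cache alone, `careful` has one burst of 100 queries and then none.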

Re: Facebook Down?

Posted: Mon Sep 27, 2010 6:05 pm
by noth
wow what an amazing explanation, so it wasn't aliens after all

"Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site."

hey!! how do they do that then? :D

Re: Facebook Down?

Posted: Mon Sep 27, 2010 8:52 pm
by MichaelC
noth wrote: wow what an amazing explanation, so it wasn't aliens after all

"Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site."

hey!! how do they do that then? :D

My guess is that they use rate limiters set to a low threshold: allow, say, 1 GB of traffic per minute, then gradually raise that limit. Otherwise, if you switch everything back on at once while every user is refreshing, the servers could be overwhelmed again and cause more problems.
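
A rough Python sketch of that guess, just to show the idea. Nothing here is known about Facebook's real setup: the class name `RampingLimiter`, the budget numbers and the doubling factor are all assumptions, and the caller is expected to call `next_window()` once per time window (for example once a minute).

```python
class RampingLimiter:
    """Admit traffic up to a per-window budget that grows each window."""

    def __init__(self, initial_budget, growth, max_budget):
        self.budget = initial_budget   # units allowed in the current window
        self.growth = growth           # multiplier applied each new window
        self.max_budget = max_budget   # normal full capacity
        self.used = 0

    def allow(self, cost=1):
        """Return True and count the request if it fits in the budget."""
        if self.used + cost > self.budget:
            return False
        self.used += cost
        return True

    def next_window(self):
        """Start a new window with a larger budget, capped at max_budget."""
        self.used = 0
        self.budget = min(self.max_budget, self.budget * self.growth)


limiter = RampingLimiter(initial_budget=2, growth=2, max_budget=5)
first_window = [limiter.allow() for _ in range(3)]    # third request is refused
limiter.next_window()
second_window = [limiter.allow() for _ in range(5)]   # budget is now 4
```

Requests that get `False` would be shown an error page or asked to retry later, so the databases only ever see as much load as the current budget allows.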