A huge part of the online community was left twiddling its thumbs as Facebook, WhatsApp, Instagram, and other online platforms that depend on Facebook went down globally. The social media giant has confirmed the issue was much bigger than DNS servers. It seems Facebook was unreachable due to a BGP issue.
Apparently, Facebook faced a BGP (Border Gateway Protocol) issue. The company has now rectified the erroneous configuration, allowing millions to log in and get back to their social media accounts.
Facebook and its owned properties were working fine, but the information superhighway had routing problems leading users to a dead end:
Collectively, Facebook, WhatsApp, and Instagram command a huge chunk of the online community. These services handle tremendous amounts of data every second, 24x7x365.
However, even the Internet’s giants sometimes suffer from small issues that render them unreachable. Facebook, WhatsApp, and Instagram collectively went down at about 11:50 AM ET.
To the huge community of people and businesses around the world who depend on us: we're sorry. We’ve been working hard to restore access to our apps and services and are happy to report they are coming back online now. Thank you for bearing with us.
— Facebook (@Facebook) October 4, 2021
While millions of users scrambled to confirm if their internet was working, the issue was on the social media giant’s end. Strangely, the core servers themselves were functioning, but the pathway to them had suddenly vanished.
Facebook has confirmed that yesterday’s worldwide outage was caused by faulty configuration changes. Apparently, the backbone routers couldn’t lead users to the company’s servers.
Facebook, Instagram and WhatsApp are starting to come back online but are working very slowly. pic.twitter.com/YlLoxHeo62
— Yid Info (@YidInfoNews) October 4, 2021
Speaking about the “outage”, Santosh Janardhan, VP for Engineering and Infrastructure at Facebook, said:
“Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt.”
Not just the users, even the people working for Facebook couldn’t get any work done:
The outage was quite serious as it brought even major parts of the Facebook universe to a standstill. The configuration issues also impacted the company’s internal systems and tools.
This is the primary reason that Facebook engineers had a tough time bringing the systems back online.
The “BGP issue”:
A bunch of Facebook networks has just disappeared from the internet: pic.twitter.com/j07LrmAAdW
— Giorgio Bonfiglio (@g_bonfiglio) October 4, 2021
Facebook has reportedly confirmed that all the user data is intact. In other words, no one managed to compromise Facebook’s security during the downtime.
It would have been nearly impossible to actually reach Facebook’s servers during the outage, as the various Facebook routing prefixes had suddenly disappeared from the Internet’s BGP routing tables.
Still trending positive: "HTTP/1.1 503 No server is available for the request"
(the absolutely fun thing here is I don't use Facebook – I'm just so passionate about outages and reliability in general) pic.twitter.com/VLNNh725zQ
— Giorgio Bonfiglio (@g_bonfiglio) October 4, 2021
Technical jargon aside, the BGP routing protocol is like the Internet’s “postal system”. Facebook and every other website need to advertise their digital addresses.
As it turns out, the advertised address disappeared. With the address removed, no one else on the Internet knew how to connect to Facebook’s servers.
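To make the "postal system" idea a little more concrete, here is a minimal Python sketch of a routing table that stops finding a destination once its prefix is withdrawn. It is only a toy model, not how real BGP routers or Facebook's network work: the class name ToyRoutingTable, the prefix 192.0.2.0/24, and the label AS64500 are all placeholders taken from ranges reserved for documentation, not Facebook's actual addresses or network number.

```python
import ipaddress


class ToyRoutingTable:
    """A toy stand-in for a router's view of advertised prefixes (hypothetical)."""

    def __init__(self):
        # Maps an advertised network prefix to a label for whoever announced it.
        self.routes = {}

    def announce(self, prefix, origin):
        """Record that `origin` advertises reachability for `prefix`."""
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        """Remove a previously advertised prefix, if it is present."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin for the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination is a "dead end"
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# Placeholder values from documentation-only ranges (assumptions, not real data).
table = ToyRoutingTable()
table.announce("192.0.2.0/24", "AS64500")
before = table.lookup("192.0.2.10")   # a route exists, so the server is reachable

table.withdraw("192.0.2.0/24")
after = table.lookup("192.0.2.10")    # the route is gone, even though the server still runs

print("before withdrawal:", before)
print("after withdrawal:", after)
```

The server at the placeholder address never changes in this sketch. Only the advertised route disappears, which is roughly the situation described above: the servers were up, but nobody knew the way to them.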