Facebook engineer: Going large requires thinking small

Facebook's head engineer offers advice on managing continual exponential systems growth at the Usenix conference

When managing a constantly expanding system with many moving parts, it is crucial to break the system into large numbers of small pieces and manage them with lots of small, dedicated teams, advised Bobby Johnson, who is director of engineering for Facebook, at the Usenix Annual Technical Conference in Boston.

The topic of his presentation was "Lessons of Scale at Facebook," and Johnson certainly is a good person to hold forth on that topic. When Johnson started at the service four years ago, it had only 7 million users -- now it is up to about 400 million. He has seen constant exponential growth. Now Johnson oversees about 400 engineers, which is about one for each million users that the service has.

One major lesson he has learned over the years: Keep the projects to add new features small, and manage them with small teams that have direct control over the features.

At a high level, Facebook has a simple three-layer architecture, consisting of Web servers that assemble pages for users, the cache layer, which keeps much of the data that is frequently used, and a database layer, which serves mainly as "persistent storage," Johnson said.

Each layer has been scaled horizontally, meaning the layers are run across thousands of servers.
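As a rough illustration of what horizontal scaling involves, the sketch below spreads keys across a pool of servers by hashing each key. The server names and the `pick_server` helper are hypothetical; Johnson did not describe how Facebook assigns data to machines.

```python
import hashlib

# Hypothetical server pool; a real deployment would list thousands of machines.
CACHE_SERVERS = ["cache-01", "cache-02", "cache-03", "cache-04"]


def pick_server(key: str, servers: list[str]) -> str:
    """Map a key to one server using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the pool size.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for key in ("user:42:profile", "user:42:friends", "photo:9001"):
        print(key, "->", pick_server(key, CACHE_SERVERS))
```

Simple modulo hashing like this remaps most keys whenever the pool size changes, which is why large systems often turn to consistent hashing instead.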

It is largely an open-source stack, though one that has been heavily modified: The Web server layer runs modified copies of Linux, the cache layer relies on Memcached, and MySQL powers the databases. The page components (each page is assembled from dozens of smaller components) are written in PHP, though the code has been pre-compiled.

These days the site, which people expect to be up constantly, can get up to 100 million messages per second to the cache layer. The system handles about 1 billion chat messages and 100 million search messages per day. Because the service now has a global audience, it remains busy 24 hours a day, and it keeps growing.

"Every week, we have our biggest day ever," he said.

To build new features and enhance old ones, Facebook's approach has been to deploy "very small teams who move quickly," Johnson said. "We make small changes frequently," he said, noting that when something goes wrong, they can isolate the problem quickly.

The data set is unusual in that it is highly connected. For instance, Facebook runs the world's largest photo-sharing site, even though it doesn't offer many of the features of other sites, such as Flickr. But the one advantage it does have is that people are tagged in photos.

And this is true overall for Facebook, he added. The true value it adds is connectivity, the ability for users to connect with other users.

"What is interesting about me is what I am connected to. We call this the social graph," he said. This produces a lot of interconnected data.

Each page is composed of thousands of elements, such as photos, lists of friends, people who have poked you and so on. And these elements are updated frequently. A conversation taking place must be rendered in the order in which contributions were made, otherwise it doesn't make sense.

The company has developed a number of techniques to make this as seamless to the user as possible. Since each page is composed of multiple elements, the work can be divided among multiple servers in parallel.
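The idea of dividing a page's elements among workers can be sketched with a thread pool. This is only an illustration: `fetch_element` is a hypothetical stand-in for a call to a backend server, and the talk did not describe Facebook's actual mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_element(name: str) -> str:
    """Hypothetical stand-in for asking a backend server to render one element."""
    return f"<div id='{name}'>...</div>"


def assemble_page(element_names: list[str]) -> str:
    """Fetch every element concurrently, then join them in the requested order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        # map() yields results in input order, whatever order the workers finish in.
        fragments = list(pool.map(fetch_element, element_names))
    return "\n".join(fragments)


if __name__ == "__main__":
    print(assemble_page(["photos", "friends", "pokes"]))
```

Because the results come back in input order, elements that must appear in sequence, such as the contributions to a conversation, stay in sequence even though they were fetched in parallel.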

Also, as soon as a page is requested, the server sends all the essential elements to the browser so it can start assembling the page. This approach uses a special JavaScript library called Primer, which supplies all the essential items required for a quick start.

The database layer is used largely as persistent storage, maintaining permanent copies of all the data entered. Given the volume of data being held, the databases are too slow to query each time a user requests a piece of data, so the vast majority of data is held in the cache layer. Johnson estimated that tens of terabytes of data are held there.
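One common way to combine a fast cache with a slower persistent store is a cache-aside read path, sketched below. The class and its in-memory dictionaries are hypothetical stand-ins for the cache and database layers; the article does not say how Facebook populates its cache.

```python
from typing import Optional


class PageDataStore:
    """Cache-aside reads: try the cache first, fall back to the database."""

    def __init__(self, database: dict[str, str]):
        self.database = database          # stand-in for the persistent layer
        self.cache: dict[str, str] = {}   # stand-in for the cache layer
        self.db_reads = 0                 # counts how often the slow path runs

    def get(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]        # fast path: served from the cache
        self.db_reads += 1
        value = self.database.get(key)    # slow path: go to persistent storage
        if value is not None:
            self.cache[key] = value       # keep it for the next request
        return value


if __name__ == "__main__":
    store = PageDataStore({"user:1:name": "Alice"})
    store.get("user:1:name")
    store.get("user:1:name")
    print("database reads:", store.db_reads)
```

In this sketch, repeated requests for the same key reach the database only once; every later read is answered from the cache.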

The data is stored at this layer not in relational tables, but rather in what Johnson called graphs, or serial lists of data elements identified by a single key.
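A minimal sketch of that shape of data, assuming each list is simply an ordered sequence stored under one string key. The `ListStore` class and key names such as `friends:alice` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class ListStore:
    """Ordered lists of elements, each list addressed by a single key."""

    def __init__(self) -> None:
        self._lists: defaultdict[str, list[str]] = defaultdict(list)

    def append(self, key: str, element: str) -> None:
        """Add an element to the end of the list stored under key."""
        self._lists[key].append(element)

    def get(self, key: str) -> list[str]:
        """Return a copy of the list under key, or an empty list if absent."""
        return list(self._lists.get(key, []))


if __name__ == "__main__":
    store = ListStore()
    store.append("friends:alice", "bob")
    store.append("friends:alice", "carol")
    print(store.get("friends:alice"))
```

Unlike a relational table, nothing here is joined at read time: one key lookup returns the whole list.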

About half the data in this layer, at any given time, has been generated within the past year. The rest is stored on vast arrays of serial attached storage. Facebook does not delete old data, because the gains in storage space would be negligible, he said.

The company is working on ways to run more complex queries against these graphs. "What we're building now are systems that think of the world as a graph," he said. One day you will be able to ask Facebook to identify all your friends who like a particular game and the answer will be served up within a few seconds.
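The "friends who like a particular game" question can be sketched as an intersection of two key-addressed lists. The key scheme and sample data below are invented for illustration and do not reflect Facebook's actual query system.

```python
def friends_who_like(graph: dict[str, list[str]], user: str, item: str) -> list[str]:
    """Return the user's friends who also appear in the item's list of fans."""
    friends = graph.get(f"friends:{user}", [])
    fans = set(graph.get(f"likes:{item}", []))  # a set makes membership checks fast
    # Preserve the order of the friend list in the result.
    return [friend for friend in friends if friend in fans]


if __name__ == "__main__":
    # Hypothetical sample data: one friend list and one list of fans.
    graph = {
        "friends:alice": ["bob", "carol", "dave"],
        "likes:chess": ["dave", "erin", "bob"],
    }
    print(friends_who_like(graph, "alice", "chess"))
```

At Facebook's scale the two lists would live on different servers, which is part of what makes such queries harder than this single-machine sketch suggests.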

How an organization sets up its internal chain of responsibility is crucial for its success, Johnson noted. "If somebody is responsible for something, they must have control over it," he said.

Many conflicts within an organization may stem from the fact that someone has responsibility for something, but does not have any control over it.

For instance, putting someone in charge of making a Web site faster will not produce beneficial results unless that person also is in charge of the code that runs the site. "It puts the person in a bad position," since he has limited influence over the people who actually control the parts of the site that need improving, he said. It also reduces the organization's flexibility, since the only power the person in charge has is to say no to proposed changes.

The Facebook engineers are broken into small teams, each working on a different app or standalone function. Each team is responsible for launching its own application and cannot simply hand it over to the operational teams and walk away. When something goes wrong, it is the development team that must figure it out, he said in a brief conversation with IDG News Service after the presentation.

Johnson said that Facebook applies this approach to privacy as well. It has a small team that watches over privacy. "This is the one thing we can't screw up," he said. But each programming team also has a designated person to manage the privacy aspects of the project it is working on.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com.


Joab Jackson

IDG News Service