Facebook, Microsoft target faster services with new AI server designs

Facebook's Big Basin and Microsoft's reworked Project Olympus have more space for GPUs to accelerate machine learning

Facebook on Wednesday rolled out some staggering statistics related to its social networks. Each day, users watch 100 million hours of video, 400 million people use Messenger, and more than 95 million photos and videos are posted on Instagram.

That puts a heavy load on Facebook's servers in data centers, which help orchestrate all these services to ensure timely responses. In addition, Facebook's servers use machine learning technologies to improve services, with one visible example being image recognition.

The story is similar for Microsoft, which is continually looking to balance the load on its servers. For example, Microsoft's data centers apply machine learning for natural language services like Cortana.

Both companies introduced new open-source hardware designs to ensure faster responses to such artificial intelligence services, and the designs will allow the companies to offer more services via their networks and software. The server designs were introduced at the Open Compute Project U.S. Summit on Wednesday.

These server designs can be used by other companies as a reference to design their own servers in-house and then send for mass manufacturing in Asia, something Facebook and Google have been doing for years. Financial organizations have also been experimenting with OCP designs to make servers for their organizations.

Facebook's Big Basin is an unorthodox server box that the company has termed "JBOG" -- for Just a Bunch Of GPUs -- that can deliver unprecedented power for machine learning. The system does not have a CPU and operates as an independent box that needs to be connected to discrete server and storage boxes.

Big Basin delivers on the promise of decoupling processing, storage, and networking units in data centers. In independent pools, storage and processing can be scaled up much faster but are limited when stuffed in one server box like today. The computation is also much faster when processing and storage are networked closer together. Decoupled units also share power and cooling resources, which reduces the electric bill in data centers.

The Big Basin system can be connected to Tioga Pass, a new Facebook open-source dual-CPU server design.

A decoupled data center design is important for companies like Facebook and Google, which are buying thousands of servers to meet with their growing processing needs.  The companies can scale up web services and machine learning tasks much faster by decoupling storage, processing, and other resources.

Intel is also chasing a similar design with its Rack Scale architecture, and companies like Dell and Hewlett Packard Enterprise offer blueprints for such server implementations.

Facebook's Big Basin system has eight Nvidia Tesla P100 GPU accelerators, connected in a mesh architecture via the super-fast NVLink interconnect. The mesh interconnect is similar to one in Nvidia's DGX-1 server, which is used in an AI supercomputer from Fujitsu in Japan.

The other new AI server design came from Microsoft, which announced Project Olympus, which has more space for AI co-processors. Microsoft also announced a GPU accelerator with Nvidia and Ingrasys called HGX-1, which is similar to Facebook's Big Sur but can be scaled to link 32 GPUs together. 

Project Olympus is a more conventional server design that doesn't require massive changes in server installations. It's a 1U rack server with the CPUs, GPUs, memory, storage, and networking all in one box.

Microsoft's new server design has a universal motherboard slot that will support the latest server chips, including Intel's Skylake and AMD's Naples. Project Olympus will do something rarely seen in servers: cross over from x86 to ARM with support for Qualcomm's Centriq 2400 or Cavium's Thunder X2 chips.

Qualcomm will be showing a motherboard and server based on the Project Olympus design at the OCP summit. The Qualcomm server will run Windows Server, the first time the OS is being shown running on an ARM chip.

The universal x86 and ARM motherboard support will allow customers to switch between chip architectures without purchasing new hardware. Bringing ARM support to Project Olympus is one of the big achievements of the new server design, Kushagra Vaid, general manager for Azure hardware infrastructure at Microsoft, said in a blog entry.

There's also space for Intel's FPGAs (field programmable gate arrays), which will speed up search and deep-learning applications in servers. Microsoft uses FPGAs to deliver faster Bing results. The server also has slots for up to three PCI-Express cards like GPUs, up to eight NVMe SSDs, ethernet, and DDR4 memory. It also has multiple fans, heatsinks, and multiple batteries to keep the server running in case of power loss.

The Project Olympus HGX-1 supports eight Nvidia Pascal GPUs via the NVLink interconnect technology. Four HGX-1 AI accelerators can be linked to create a massive machine learning cluster of 32 GPUs.

Today’s data centers are undergoing a massive shift to support the rapid adoption of AI computing, said Ian Buck, vice president and general manager of accelerated computing at Nvidia.

"The new OCP designs from Microsoft and Facebook show that hyperscale data centers need high-performance GPUs to support the enormous demands of AI computing," Buck said.

Join the newsletter!

Error: Please check your email address.
Rocket to Success - Your 10 Tips for Smarter ERP System Selection
Keep up with the latest tech news, reviews and previews by subscribing to the Good Gear Guide newsletter.

Agam Shah

IDG News Service
Show Comments


James Cook University - Master of Data Science Online Course

Learn more >


Sansai 6-Outlet Power Board + 4-Port USB Charging Station

Learn more >



Back To Business Guide

Click for more ›

Most Popular Reviews

Latest Articles


PCW Evaluation Team

Louise Coady

Brother MFC-L9570CDW Multifunction Printer

The printer was convenient, produced clear and vibrant images and was very easy to use

Edwina Hargreaves

WD My Cloud Home

I would recommend this device for families and small businesses who want one safe place to store all their important digital content and a way to easily share it with friends, family, business partners, or customers.

Walid Mikhael

Brother QL-820NWB Professional Label Printer

It’s easy to set up, it’s compact and quiet when printing and to top if off, the print quality is excellent. This is hands down the best printer I’ve used for printing labels.

Ben Ramsden

Sharp PN-40TC1 Huddle Board

Brainstorming, innovation, problem solving, and negotiation have all become much more productive and valuable if people can easily collaborate in real time with minimal friction.

Sarah Ieroianni

Brother QL-820NWB Professional Label Printer

The print quality also does not disappoint, it’s clear, bold, doesn’t smudge and the text is perfectly sized.

Ratchada Dunn

Sharp PN-40TC1 Huddle Board

The Huddle Board’s built in program; Sharp Touch Viewing software allows us to easily manipulate and edit our documents (jpegs and PDFs) all at the same time on the dashboard.

Featured Content

Product Launch Showcase

Latest Jobs

Don’t have an account? Sign up here

Don't have an account? Sign up now

Forgot password?