Facebook, Microsoft target faster services with new AI server designs

Facebook's Big Basin and Microsoft's reworked Project Olympus have more space for GPUs to accelerate machine learning

Facebook on Wednesday rolled out some staggering statistics related to its social networks. Each day, users watch 100 million hours of video, 400 million people use Messenger, and more than 95 million photos and videos are posted on Instagram.

That puts a heavy load on Facebook's servers in data centers, which help orchestrate all these services to ensure timely responses. In addition, Facebook's servers use machine learning technologies to improve services, with one visible example being image recognition.

The story is similar for Microsoft, which is continually looking to balance the load on its servers. For example, Microsoft's data centers apply machine learning for natural language services like Cortana.

Both companies introduced new open-source hardware designs to ensure faster responses from such artificial intelligence services, and the designs will allow the companies to offer more services via their networks and software. The server designs were introduced at the Open Compute Project U.S. Summit on Wednesday.

Other companies can use these server designs as a reference to design their own servers in-house and then send them out for mass manufacturing in Asia, something Facebook and Google have been doing for years. Financial organizations have also been experimenting with OCP designs to make servers for their own use.

Facebook's Big Basin is an unorthodox server box that the company has termed "JBOG" -- for Just a Bunch Of GPUs -- that can deliver unprecedented power for machine learning. The system does not have a CPU and operates as an independent box that needs to be connected to discrete server and storage boxes.

Big Basin delivers on the promise of decoupling processing, storage, and networking units in data centers. When storage and processing sit in independent pools, each can be scaled up much faster; when they are stuffed into a single server box, as they are today, each is limited by what that box can hold. The computation is also much faster when processing and storage are networked closer together. Decoupled units also share power and cooling resources, which reduces the electric bill in data centers.
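The scaling argument can be illustrated with a small calculation. This is a minimal sketch only: the function names, the capacity units, and the workload figures are all hypothetical and do not come from Facebook or from any OCP specification. It compares how much hardware a compute-heavy workload needs when compute and storage ship together in one box versus when they are bought as separate pools.

```python
import math


def coupled_servers(compute_needed, storage_needed,
                    compute_per_server, storage_per_server):
    """Servers needed when every box carries both compute and storage.

    The count is driven by whichever resource runs out first, so the
    other resource ends up overprovisioned.
    """
    return max(math.ceil(compute_needed / compute_per_server),
               math.ceil(storage_needed / storage_per_server))


def decoupled_boxes(compute_needed, storage_needed,
                    compute_per_box, storage_per_box):
    """(compute boxes, storage boxes) when each pool scales on its own."""
    return (math.ceil(compute_needed / compute_per_box),
            math.ceil(storage_needed / storage_per_box))


# Hypothetical workload: 80 units of GPU compute, 10 units of storage.
# Hypothetical box sizes: 8 compute units and 4 storage units each.
servers = coupled_servers(80, 10, compute_per_server=8, storage_per_server=4)
idle_storage = servers * 4 - 10  # storage bought but not needed

compute_boxes, storage_boxes = decoupled_boxes(80, 10,
                                               compute_per_box=8,
                                               storage_per_box=4)

print(servers, idle_storage)         # coupled design
print(compute_boxes, storage_boxes)  # decoupled design
```

With these made-up numbers the coupled design buys storage in every one of its servers just to reach the required compute, while the decoupled design adds only as many storage boxes as the workload needs.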

The Big Basin system can be connected to Tioga Pass, a new Facebook open-source dual-CPU server design.

A decoupled data center design is important for companies like Facebook and Google, which are buying thousands of servers to meet their growing processing needs. The companies can scale up web services and machine learning tasks much faster by decoupling storage, processing, and other resources.

Intel is also chasing a similar design with its Rack Scale architecture, and companies like Dell and Hewlett Packard Enterprise offer blueprints for such server implementations.

Facebook's Big Basin system has eight Nvidia Tesla P100 GPU accelerators, connected in a mesh architecture via the super-fast NVLink interconnect. The mesh interconnect is similar to one in Nvidia's DGX-1 server, which is used in an AI supercomputer from Fujitsu in Japan.

The other new AI server design came from Microsoft, which announced Project Olympus, a server with more space for AI co-processors. Microsoft also announced a GPU accelerator with Nvidia and Ingrasys called HGX-1, which is similar to Facebook's Big Sur but can be scaled to link 32 GPUs together.

Project Olympus is a more conventional server design that doesn't require massive changes in server installations. It's a 1U rack server with the CPUs, GPUs, memory, storage, and networking all in one box.

Microsoft's new server design has a universal motherboard slot that will support the latest server chips, including Intel's Skylake and AMD's Naples. Project Olympus will do something rarely seen in servers: cross over from x86 to ARM with support for Qualcomm's Centriq 2400 or Cavium's Thunder X2 chips.

Qualcomm will be showing a motherboard and server based on the Project Olympus design at the OCP summit. The Qualcomm server will run Windows Server, the first time the OS is being shown running on an ARM chip.

The universal x86 and ARM motherboard support will allow customers to switch between chip architectures without purchasing new hardware. Bringing ARM support to Project Olympus is one of the big achievements of the new server design, Kushagra Vaid, general manager for Azure hardware infrastructure at Microsoft, said in a blog entry.

There's also space for Intel's FPGAs (field-programmable gate arrays), which will speed up search and deep-learning applications in servers. Microsoft uses FPGAs to deliver faster Bing results. The server also has slots for up to three PCI-Express cards like GPUs, up to eight NVMe SSDs, Ethernet, and DDR4 memory. It also has multiple fans, heatsinks, and multiple batteries to keep the server running in case of power loss.

The Project Olympus HGX-1 supports eight Nvidia Pascal GPUs via the NVLink interconnect technology. Four HGX-1 AI accelerators can be linked to create a massive machine learning cluster of 32 GPUs.
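The 32-GPU figure follows directly from the two numbers above. A minimal sketch of that arithmetic, in which the function and constant names are illustrative and not part of any Nvidia or Microsoft API:

```python
GPUS_PER_HGX1 = 8        # from the article: eight Pascal GPUs per HGX-1
MAX_LINKED_CHASSIS = 4   # from the article: four HGX-1 units can be linked


def cluster_gpu_count(chassis, gpus_per_chassis=GPUS_PER_HGX1):
    """Total GPUs in a cluster of linked HGX-1 chassis (hypothetical helper)."""
    if not 1 <= chassis <= MAX_LINKED_CHASSIS:
        raise ValueError(
            f"chassis must be between 1 and {MAX_LINKED_CHASSIS}")
    return chassis * gpus_per_chassis


print(cluster_gpu_count(4))  # 32
```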

Today’s data centers are undergoing a massive shift to support the rapid adoption of AI computing, said Ian Buck, vice president and general manager of accelerated computing at Nvidia.

"The new OCP designs from Microsoft and Facebook show that hyperscale data centers need high-performance GPUs to support the enormous demands of AI computing," Buck said.


Agam Shah

IDG News Service