Facebook, Microsoft target faster services with new AI server designs

Facebook's Big Basin and Microsoft's reworked Project Olympus have more space for GPUs to accelerate machine learning

Facebook on Wednesday rolled out some staggering statistics related to its social networks. Each day, users watch 100 million hours of video, 400 million people use Messenger, and more than 95 million photos and videos are posted on Instagram.

That puts a heavy load on Facebook's servers in data centers, which help orchestrate all these services to ensure timely responses. In addition, Facebook's servers use machine learning technologies to improve services, with one visible example being image recognition.

The story is similar for Microsoft, which is continually looking to balance the load on its servers. For example, Microsoft's data centers apply machine learning for natural language services like Cortana.

Both companies introduced new open-source hardware designs to ensure faster responses to such artificial intelligence services, and the designs will allow the companies to offer more services via their networks and software. The server designs were introduced at the Open Compute Project U.S. Summit on Wednesday.

These server designs can be used by other companies as a reference to design their own servers in-house and then send for mass manufacturing in Asia, something Facebook and Google have been doing for years. Financial organizations have also been experimenting with OCP designs to make servers for their organizations.

Facebook's Big Basin is an unorthodox server box that the company has termed "JBOG" -- for Just a Bunch Of GPUs -- that can deliver unprecedented power for machine learning. The system does not have a CPU and operates as an independent box that needs to be connected to discrete server and storage boxes.

Big Basin delivers on the promise of decoupling processing, storage, and networking units in data centers. In independent pools, storage and processing can be scaled up much faster but are limited when stuffed in one server box like today. The computation is also much faster when processing and storage are networked closer together. Decoupled units also share power and cooling resources, which reduces the electric bill in data centers.

The Big Basin system can be connected to Tioga Pass, a new Facebook open-source dual-CPU server design.

A decoupled data center design is important for companies like Facebook and Google, which are buying thousands of servers to meet with their growing processing needs.  The companies can scale up web services and machine learning tasks much faster by decoupling storage, processing, and other resources.

Intel is also chasing a similar design with its Rack Scale architecture, and companies like Dell and Hewlett Packard Enterprise offer blueprints for such server implementations.

Facebook's Big Basin system has eight Nvidia Tesla P100 GPU accelerators, connected in a mesh architecture via the super-fast NVLink interconnect. The mesh interconnect is similar to one in Nvidia's DGX-1 server, which is used in an AI supercomputer from Fujitsu in Japan.

The other new AI server design came from Microsoft, which announced Project Olympus, which has more space for AI co-processors. Microsoft also announced a GPU accelerator with Nvidia and Ingrasys called HGX-1, which is similar to Facebook's Big Sur but can be scaled to link 32 GPUs together. 

Project Olympus is a more conventional server design that doesn't require massive changes in server installations. It's a 1U rack server with the CPUs, GPUs, memory, storage, and networking all in one box.

Microsoft's new server design has a universal motherboard slot that will support the latest server chips, including Intel's Skylake and AMD's Naples. Project Olympus will do something rarely seen in servers: cross over from x86 to ARM with support for Qualcomm's Centriq 2400 or Cavium's Thunder X2 chips.

Qualcomm will be showing a motherboard and server based on the Project Olympus design at the OCP summit. The Qualcomm server will run Windows Server, the first time the OS is being shown running on an ARM chip.

The universal x86 and ARM motherboard support will allow customers to switch between chip architectures without purchasing new hardware. Bringing ARM support to Project Olympus is one of the big achievements of the new server design, Kushagra Vaid, general manager for Azure hardware infrastructure at Microsoft, said in a blog entry.

There's also space for Intel's FPGAs (field programmable gate arrays), which will speed up search and deep-learning applications in servers. Microsoft uses FPGAs to deliver faster Bing results. The server also has slots for up to three PCI-Express cards like GPUs, up to eight NVMe SSDs, ethernet, and DDR4 memory. It also has multiple fans, heatsinks, and multiple batteries to keep the server running in case of power loss.

The Project Olympus HGX-1 supports eight Nvidia Pascal GPUs via the NVLink interconnect technology. Four HGX-1 AI accelerators can be linked to create a massive machine learning cluster of 32 GPUs.

Today’s data centers are undergoing a massive shift to support the rapid adoption of AI computing, said Ian Buck, vice president and general manager of accelerated computing at Nvidia.

"The new OCP designs from Microsoft and Facebook show that hyperscale data centers need high-performance GPUs to support the enormous demands of AI computing," Buck said.

Join the newsletter!

Error: Please check your email address.
Rocket to Success - Your 10 Tips for Smarter ERP System Selection
Keep up with the latest tech news, reviews and previews by subscribing to the Good Gear Guide newsletter.

Agam Shah

IDG News Service
Show Comments

Most Popular Reviews

Latest Articles

Resources

PCW Evaluation Team

Sarah Ieroianni

Brother QL-820NWB Professional Label Printer

The print quality also does not disappoint, it’s clear, bold, doesn’t smudge and the text is perfectly sized.

Ratchada Dunn

Sharp PN-40TC1 Huddle Board

The Huddle Board’s built in program; Sharp Touch Viewing software allows us to easily manipulate and edit our documents (jpegs and PDFs) all at the same time on the dashboard.

George Khoury

Sharp PN-40TC1 Huddle Board

The biggest perks for me would be that it comes with easy to use and comprehensive programs that make the collaboration process a whole lot more intuitive and organic

David Coyle

Brother PocketJet PJ-773 A4 Portable Thermal Printer

I rate the printer as a 5 out of 5 stars as it has been able to fit seamlessly into my busy and mobile lifestyle.

Kurt Hegetschweiler

Brother PocketJet PJ-773 A4 Portable Thermal Printer

It’s perfect for mobile workers. Just take it out — it’s small enough to sit anywhere — turn it on, load a sheet of paper, and start printing.

Matthew Stivala

HP OfficeJet 250 Mobile Printer

The HP OfficeJet 250 Mobile Printer is a great device that fits perfectly into my fast paced and mobile lifestyle. My first impression of the printer itself was how incredibly compact and sleek the device was.

Featured Content

Latest Jobs

Don’t have an account? Sign up here

Don't have an account? Sign up now

Forgot password?