Microsoft bolsters artificial intelligence with additions to Project Oxford

Developers will get access to advanced tools for face recognition, speech recognition and more

Microsoft's Project Oxford, a suite of developer tools based on the company's machine learning and artificial intelligence research, is getting a new quintet of services, the company announced at its Future Decoded conference in London. 

Developers can now take advantage of an emotion detection service that looks at a photo and lists an array of emotions that it detects on the subjects' faces. For each person in an image (up to a certain number), the service will pass back the probabilities that someone is expressing anger, happiness, fear, surprise, disgust, sadness, contempt or nothing at all.
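To make the shape of such a result concrete, the sketch below assumes each detected face comes back as a mapping from emotion name to probability. The field names, the `neutral` label standing in for "nothing at all," and the sample numbers are illustrative assumptions, not Project Oxford's documented response format.

```python
# Hypothetical per-face scores; names and values are illustrative only.
EMOTIONS = ("anger", "happiness", "fear", "surprise",
            "disgust", "sadness", "contempt", "neutral")


def dominant_emotion(scores):
    """Return the (emotion, probability) pair with the highest probability."""
    # Emotions missing from the mapping are treated as probability 0.
    name = max(EMOTIONS, key=lambda e: scores.get(e, 0.0))
    return name, scores.get(name, 0.0)


faces = [
    {"happiness": 0.91, "neutral": 0.06, "surprise": 0.03},
    {"sadness": 0.48, "neutral": 0.40, "contempt": 0.12},
]

for index, scores in enumerate(faces):
    name, probability = dominant_emotion(scores)
    print(f"face {index}: {name} ({probability:.2f})")
```

An application that edits photos by mood, for example, could branch on the dominant emotion for each face rather than on the full list of probabilities.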

According to Ryan Galgon, a senior program manager at Microsoft, the company built the service after it saw developers using Project Oxford's existing face detection technology in applications that run sentiment analysis on photographs. The new service makes different applications possible, like editing photos based on the feelings of the people in them.  

That's not to say Microsoft has handed developers an emotion-detection expert in a box. The service can only handle static images at this point, and Galgon said Microsoft is more confident in some of its emotion detection models (like finding happiness) than in others (contempt and disgust). 

Speaking of video, by the end of this year Project Oxford will have beta support for video tools including motion detection and image stabilization. The suite will also have face-tracking tools that will log where people are in each frame of a video so users can analyze what's going on.  

Depending on the size of the video, it could take a while to process a file with one of those services. Microsoft has placed a cap on how big the video files fed into the service can be, and Galgon suggests that developers scale down the resolution of large files. 
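As a rough illustration of Galgon's suggestion, the sketch below computes a reduced frame size that preserves the aspect ratio. The article does not state the actual cap, so `max_width` is a placeholder chosen by the developer, and the function name is hypothetical.

```python
def scaled_dimensions(width, height, max_width):
    """Shrink (width, height) to fit max_width, preserving aspect ratio."""
    if width <= max_width:
        return width, height  # already small enough; never upscale
    new_height = round(height * max_width / width)
    # Many video encoders expect even dimensions, so round down to even.
    new_height -= new_height % 2
    new_width = max_width - max_width % 2
    return new_width, new_height


print(scaled_dimensions(1920, 1080, 1280))
```

The resulting dimensions could then be handed to whatever transcoding tool the developer already uses before the file is uploaded.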

In voice, Microsoft announced Custom Recognition Intelligent Services (CRIS), which lets developers create voice-recognition models for specific circumstances. It's useful for taking dictation that a traditional model wouldn't be well suited to, like the speech of young kids or interactions with a kiosk at a baseball park.

To get more personalized results, developers have to feed the service a set of audio files, along with transcriptions of the speech, to build up the speech model.
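Preparing that training set mostly means keeping each audio file matched to its transcription. The sketch below shows one way to check the pairing before uploading. The file names, the lookup by file stem and the sample phrases are assumptions for illustration; the article does not describe the format CRIS expects.

```python
from pathlib import PurePath


def pair_training_data(audio_files, transcripts):
    """Match each audio file to a transcript keyed by the same file stem."""
    pairs, missing = [], []
    for audio in audio_files:
        stem = PurePath(audio).stem
        if stem in transcripts:
            pairs.append((audio, transcripts[stem]))
        else:
            missing.append(audio)  # audio with no transcription yet
    return pairs, missing


# Hypothetical recordings from a ballpark kiosk.
audio_files = ["kiosk_001.wav", "kiosk_002.wav", "kiosk_003.wav"]
transcripts = {
    "kiosk_001": "two hot dogs and a soda",
    "kiosk_002": "where is section one fourteen",
}

pairs, missing = pair_training_data(audio_files, transcripts)
print(len(pairs), "paired;", "missing transcripts for", missing)
```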

There's also a forthcoming speaker recognition feature in Project Oxford's speech toolset. Right now, it's not really built to do something like analyze a recorded conversation and pick out who's speaking when. But it does let developers take a short clip of someone talking and determine whether the person speaking matches the person it's been trained to recognize. 

It's the sort of thing Galgon sees working as a lightweight form of authentication: not as secure as a password or fingerprint, but useful as one signal of whether someone is who they say they are.

Put together, speaker recognition and face detection could be used as part of the foundation of a security system similar to Google's Project Abacus, which authenticates a user based on a variety of signals including voice and facial recognition. Abacus is still in development.
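The idea of treating each check as one signal among several can be sketched as a weighted average of confidence scores. This is only an illustration of the general approach. The weights, threshold, signal names and scores below are made up, and nothing here reflects how Project Oxford or Project Abacus actually combine signals.

```python
def combined_confidence(signals, weights):
    """Weighted average of per-signal confidence scores in [0, 1]."""
    total_weight = sum(weights[name] for name in signals)
    if total_weight == 0:
        raise ValueError("at least one signal needs a non-zero weight")
    weighted_sum = sum(signals[name] * weights[name] for name in signals)
    return weighted_sum / total_weight


# Hypothetical values: equal trust in both signals, arbitrary threshold.
weights = {"voice": 1.0, "face": 1.0}
signals = {"voice": 0.9, "face": 0.7}
THRESHOLD = 0.75

score = combined_confidence(signals, weights)
print("accept" if score >= THRESHOLD else "ask for another factor")
```

A real system would likely fall back to a stronger factor, such as a password, whenever the combined score is too low.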

All the processing for these services is handled in Microsoft's cloud, which means applications using Project Oxford APIs have to be connected to the Internet. Galgon said the company has heard from plenty of developers asking to use Project Oxford's capabilities offline, but Microsoft wants to keep them online-only for now.

Releasing the tools to the public could help attract users to the company's Azure cloud platform, which features three Project Oxford services as part of the Cortana Analytics Suite. Developers can try out all features for free and talk to Microsoft if they need to use more than what's available through Project Oxford's free usage tier. 

Blair Hanley Frank

IDG News Service