Microsoft bolsters artificial intelligence with additions to Project Oxford

Developers will get access to advanced tools for face recognition, speech recognition and more

Microsoft's Project Oxford, a suite of developer tools based on the company's machine learning and artificial intelligence research, is getting five new services, the company announced at its Future Decoded conference in London.

Developers can now take advantage of an emotion detection service that looks at a photo and lists an array of emotions that it detects on the subjects' faces. For each person in an image (up to a certain number), the service will pass back the probabilities that someone is expressing anger, happiness, fear, surprise, disgust, sadness, contempt or nothing at all.
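As a rough illustration of working with per-face probabilities like these, the sketch below picks the most likely emotion for each face. The data layout, the key names (including "neutral" for "nothing at all") and the numbers are assumptions made up for this example; they are not taken from the service's actual response.

```python
from typing import Dict, List

# Hypothetical per-face results: one mapping of emotion name to probability
# for each detected face. Names and numbers are illustrative assumptions,
# not output captured from the actual service.
faces: List[Dict[str, float]] = [
    {"anger": 0.01, "contempt": 0.02, "disgust": 0.01, "fear": 0.01,
     "happiness": 0.90, "neutral": 0.03, "sadness": 0.01, "surprise": 0.01},
    {"anger": 0.05, "contempt": 0.05, "disgust": 0.05, "fear": 0.10,
     "happiness": 0.05, "neutral": 0.10, "sadness": 0.10, "surprise": 0.50},
]


def dominant_emotion(scores: Dict[str, float]) -> str:
    """Return the emotion with the highest probability for one face."""
    return max(scores, key=lambda name: scores[name])


def summarize(results: List[Dict[str, float]]) -> List[str]:
    """Return the dominant emotion for every face, in input order."""
    return [dominant_emotion(scores) for scores in results]
```

A photo-editing app of the kind Galgon describes could use a summary like this to decide which filter or caption to suggest.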

According to Ryan Galgon, a senior program manager at Microsoft, the company built the service after it saw developers using Project Oxford's existing face detection technology in applications that run sentiment analysis on photographs. The new service makes different applications possible, like editing photos based on the feelings of the people in them.  

That's not to say Microsoft has handed developers an emotion-detection expert in a box. The service can only handle static images, not video, at this point, and Galgon said Microsoft is more confident in some of its emotion detection models (like finding happiness) than in others (contempt and disgust).

On the video front, by the end of this year Project Oxford will have beta support for video tools including motion detection and image stabilization. The suite will also have face-tracking tools that will log where people are in each frame of a video so users can analyze what's going on.

Depending on the size of the video, it could take a while to process a file with one of those services. Microsoft has placed a cap on how big the video files fed into the service can be, and Galgon suggests that developers scale down the resolution of large files. 
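One way to act on that suggestion is to work out a smaller frame size before re-encoding. The sketch below only does the arithmetic; the `max_side` limit is a hypothetical value chosen by the developer, since the actual cap is not stated here, and the rounding down to even numbers is a common encoder convenience rather than a Project Oxford requirement.

```python
from typing import Tuple


def scaled_size(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Shrink a frame so its longest side fits max_side, keeping aspect ratio.

    max_side is a developer-chosen limit (an assumption for this sketch).
    Frames that already fit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Integer arithmetic avoids floating-point rounding surprises.
    new_width = width * max_side // longest
    new_height = height * max_side // longest
    # Round down to even values, which many video encoders prefer.
    new_width = max(2, new_width // 2 * 2)
    new_height = max(2, new_height // 2 * 2)
    return new_width, new_height
```

For example, a 3840x2160 clip with `max_side` set to 1280 comes out at 1280x720, which can then be passed to whatever tool does the actual re-encoding.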

In voice, Microsoft announced Custom Recognition Intelligent Service (CRIS), which lets developers create voice-recognition models for specific circumstances. It's useful for recognizing speech that a general-purpose model wouldn't be well suited to, like the speech of young kids or interactions with a kiosk at a baseball park.

To get more personalized results, developers have to feed the service a set of audio files, along with transcriptions of the speech, to build up the speech model.
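Before uploading anything, it can help to confirm that every audio file actually has a transcription. The sketch below pairs the two and reports any gaps. The file names and the dictionary layout are assumptions for illustration; the format CRIS itself expects is not described here.

```python
from typing import Dict, List, Tuple


def pair_training_data(
    audio_files: List[str], transcripts: Dict[str, str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Match each audio file name with its transcription text.

    Returns (pairs, missing): pairs holds (file name, transcript) tuples,
    missing lists audio files with no usable transcript.
    """
    pairs: List[Tuple[str, str]] = []
    missing: List[str] = []
    for name in sorted(audio_files):
        text = transcripts.get(name, "").strip()
        if text:
            pairs.append((name, text))
        else:
            missing.append(name)
    return pairs, missing


# Hypothetical file names and transcripts, for illustration only.
audio = ["kiosk_002.wav", "kiosk_001.wav", "kiosk_003.wav"]
texts = {
    "kiosk_001.wav": "two hot dogs please",
    "kiosk_002.wav": "where is section twelve",
}
ready, gaps = pair_training_data(audio, texts)
```

Here `gaps` flags the one recording that still needs a transcript before the set is complete.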

There's also a forthcoming speaker recognition feature in Project Oxford's speech toolset. Right now, it's not really built to do something like analyze a recorded conversation and pick out who's speaking when. But it does let developers take a short clip of someone talking and determine whether the person speaking matches the person it's been trained to recognize. 

It's the sort of thing Galgon sees working as a lightweight form of authentication: Not as secure as a password or fingerprint but useful as one signal to see if someone is who they say they are. 

Put together, speaker recognition and face detection could be used as part of the foundation of a security system similar to Google's Project Abacus, which authenticates a user based on a variety of signals including voice and facial recognition. Abacus is still in development.
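The idea of treating each recognizer as one signal among several can be sketched as a weighted average of confidence scores. Everything here is hypothetical: the signal names, the weights and the threshold are made up for illustration, and this is not a description of how Project Oxford or Project Abacus combines signals.

```python
from typing import Dict


def combined_score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-signal confidence scores in the range 0 to 1.

    Signal names and weights are assumptions chosen by the developer.
    """
    total_weight = sum(weights[name] for name in signals)
    if total_weight <= 0:
        raise ValueError("weights for the supplied signals must sum to more than 0")
    weighted = sum(signals[name] * weights[name] for name in signals)
    return weighted / total_weight


def is_probably_user(
    signals: Dict[str, float], weights: Dict[str, float], threshold: float = 0.75
) -> bool:
    """Accept only when the combined confidence reaches the threshold."""
    return combined_score(signals, weights) >= threshold
```

Because no single signal decides the outcome, a weak voice match can be offset by a strong face match, which fits the "one signal" role Galgon describes for speaker recognition.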

All the processing for these services is handled in Microsoft's cloud, which means applications using Project Oxford APIs have to be connected to the Internet. Galgon said the company has heard from plenty of developers asking to use Project Oxford's capabilities offline, but Microsoft wants to keep them online-only for now.

Releasing the tools to the public could help attract users to the company's Azure cloud platform, which features three Project Oxford services as part of the Cortana Analytics Suite. Developers can try out all features for free and talk to Microsoft if they need to use more than what's available through Project Oxford's free usage tier. 


Blair Hanley Frank

IDG News Service