Microsoft takes steps to prevent another WGA meltdown

WGA project manager says his team will be better prepared for future outages

Three months after a major failure of Microsoft's anticounterfeit system fingered legitimate Windows XP and Vista users as pirates, a senior project manager has spelled out the steps his team has taken to prevent a repeat.

Alex Kochis, the senior project manager for Windows Genuine Advantage (WGA), used a company blog to outline new processes that have been put in place, including drills that test the WGA group's response to an outage like the one in late August.

"We've revamped the monitoring that is used to track what's happening within our server infrastructure so that we can identify potential problems faster, ideally before any customer gets impacted," Kochis said. "[And] since August, we have conducted more than a dozen 'fire-drills' designed to improve our ability to respond to issues affecting customers or that could impact the quality of the service."

Those drills, Kochis said, have ranged from pre-announced simulations to surprise alerts that test a specific scenario. "The team is now better prepared overall to take the right action and take it quickly," he promised.

In late August, servers operating the WGA validation system went dark for about 19 hours. Customers who tried to validate their copy of Windows -- a Microsoft requirement for both XP and Vista -- during the blackout were pegged as pirates; Vista owners found parts of the operating system had been disabled, including its Aero graphical interface.

Several days after the weekend meltdown, Microsoft blamed preproduction code for the snafu and said that a rollback to earlier versions of the server software had not fixed the problem immediately, as the company had expected it would.

Microsoft downplayed the incident, claiming that fewer than 12,000 PCs had been affected. The company's support forums, however, hinted that the problem was much more widespread: one message thread had collected over 450 messages within two days and had been viewed by 45,000 people.

One analyst gave Kochis' status report a mixed grade.

"I was looking for two things from Microsoft, and the first was that they would acknowledge that there was a failure," said Michael Cherry, an analyst at Kirkland, Wash.-based Directions on Microsoft. "If they couldn't do that, it would show a real lack of insight into the severity of the problem. But they called it an 'outage' [here], which I don't think they had actually admitted before."

Cherry was more than on the mark. Although Kochis called the incident a "temporary service outage" in his newest post, three months ago he denied that the word applied. "It's important to clarify that this event was not an outage," he said on August 29, five days after the servers went down.

"Second," said Cherry, "I wondered if Microsoft would acknowledge that failures are going to happen, that something's going to go wrong no matter how many drills they have. And when that happens, what would they do? But I don't see anything like that here."

Kochis said the WGA team has also changed the way it updates the validation service's servers, beefed up free WGA phone support to round-the-clock coverage and improved the speed of delivery of "get-legal" kits to users who discover they're running counterfeit software. He made no mention, however, of any modifications to the antipiracy program itself, how it's implemented or how users are handled when it determines they're using fake copies of Windows.

"They should make it so that any impact [of an outage] is on Microsoft and not on the customer," Cherry said.

Back in August, Kochis claimed that Microsoft's policy was to do just that -- err on the side of the customer -- but he contended that the outage had been an anomaly. "Our system is designed to default to genuine if the service is disrupted or unavailable," Kochis said then. "If our servers are down, your system will pass validation every time. [But] this event was not the same as an outage, because in this case the trusted source of validations itself responded incorrectly."

That's not good enough, according to Cherry. "If users can't validate, for whatever reason, Microsoft should leave them in their current state, but not invalidate them, or validate them, at least until the next check," he said.

"You have to take the utmost care before you deny something to someone that they have purchased in good faith," he concluded.
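The policy Cherry describes, leaving a PC in its current state whenever validation cannot be completed, can be illustrated with a minimal sketch. This is not Microsoft's code or the WGA protocol; the names `Status`, `ValidationUnavailable`, `next_status` and the `check` callable are all hypothetical and exist only for illustration.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical validation states for a PC."""
    GENUINE = "genuine"
    NON_GENUINE = "non-genuine"


class ValidationUnavailable(Exception):
    """Hypothetical error: the check could not be completed for any reason."""


def next_status(current, check):
    """Return the status a PC should hold after one validation attempt.

    `check` is a hypothetical callable that returns a Status, or raises
    ValidationUnavailable when no trustworthy answer can be obtained.
    """
    try:
        return check()
    except ValidationUnavailable:
        # Neither validate nor invalidate: keep the current state
        # until the next scheduled check.
        return current
```

A sketch like this only covers failures the client can detect. It would not have helped in the August incident as Kochis described it, in which the validation source itself responded incorrectly rather than failing to respond.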

Gregg Keizer
