December is a relatively slow time of year at MLB Advanced Media, the company that brings you the official Major League Baseball Web sites. From pitch-by-pitch accounts of games to streaming audio and video -- plus news, schedules, statistics and more -- it has baseball covered. Doing so requires serious horsepower, so much so that the company's Manhattan data center is pretty much tapped out in terms of space and power, according to Ryan Nelson, director of operations for the firm. Strategic use of virtualization technology nevertheless enabled him to forge ahead with new products during the 2007 season, and promises to smooth a shift to a new data center in Chicago in time for the 2008 season.
How long have you been using virtualization technology?
It's all pretty new. We are a homogeneous Sun shop, so we're not really touching a lot of the VMwares of the world. One of the big features of Solaris 10 is Solaris Containers, also known as Zones. We started using Solaris Zones in the past year to split off server, development and [quality assurance] environments.
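For readers unfamiliar with the feature Nelson mentions: on Solaris 10, a non-global zone is defined with zonecfg, then installed and booted with zoneadm. The sketch below shows the generic command sequence; the zone name, path, interface and address are illustrative placeholders, not details of MLBAM's actual setup.

```shell
# Define a non-global zone for an isolated QA environment
# (zone name, zonepath, NIC and address are hypothetical)
zonecfg -z qa-zone <<'EOF'
create
set zonepath=/zones/qa-zone
set autoboot=true
add net
set physical=bge0
set address=192.168.1.50
end
commit
EOF

# Install the zone's files, boot it, then log in to it
zoneadm -z qa-zone install
zoneadm -z qa-zone boot
zlogin qa-zone
```

Each zone gets its own process table, network identity and root filesystem while sharing the one Solaris kernel, which is what lets a single box carry separate server, development and QA environments.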
During the 2007 season we got hit with a big new challenge we didn't find out about until the All-Star break, which was to add a chat product. There was pressure to get it lit up before September so fans could chat about the playoff races and use it during the playoffs. But it was a big, ambitious project and I didn't have any rack space or spare power and [there was] no time to order new machines. So, we worked with a company called Joyent in California that provides hosting using virtual zones and virtual storage.
We said to Joyent, 'We need 30 machines; 10 in a development cluster and two more gangs of 10 as big chat clusters.' And so the MLB chat client was basically turned up in a couple of days vs. the month or two it would have taken us to get somebody to ship and install all those machines. And then we developed like crazy for about a month, tested for another three weeks, then launched it.
At launch time we asked for another 16G bytes of RAM in each server. It scaled very well. When the playoffs and World Series came around, we ordered up 15 more machines and got twice as much memory and processors installed on them, as well as on the ones we already had. Joyent dials all this up and down. As soon as the World Series is over, we call and say, 'Thanks, that was great. Let's scale down to a skeleton crew of these machines.' So, when I have a need for it, we pay for the utilization. When we don't, we don't. We can turn it up and down as we need to.
We can respond to new projects really quickly, and it also lets us try out new products. If our chat product had been a huge failure, we could've turned the whole thing off and it wouldn't have been a big deal. It makes it easy to try new things. We don't have to sign a contract, get approvals and all that.
We can also respond to the seasonal load changes. And we can also respond to differences in the season that we know are coming. In April, we're focusing on registering new users and selling new products. On draft day, I might need to really beef up my stat resources because people are querying our minor-league stats engine to see who this guy is they just drafted. In the middle of July I may need an additional 10 machines to be generating the CAPTCHA images and processing All-Star balloting. All-Star balloting is about four days of crazy database load, and then it goes back to nothing.
Give us a sense of the MLB.com infrastructure.
In terms of Web servers, we have roughly 100 at our New York data center, and we have a second data center in Chicago that is just about to go online that has 130 servers. So, by the time we get cooking on the 2008 season, we'll have in production about 180 of those.
So you're just wrapping up the new center?
We've had it for about a year, but it's been in build-out phase. Part of the reason we're interested in virtualization is the power, space and data-center capacity pain -- we've certainly felt that. We actually outgrew our first Chicago facility before we got into production, so we moved to another facility from the same company, knowing we would need more floor space and more power. We're finishing the build-out this off-season. Once Chicago comes online, we're going to take much of the New York data center offline and rebuild it.