Network-upgrade horror story

IT executive learns key lessons during four-year effort to get revamp off the ground

It sounded like a no-brainer back in 2003. Replace the aging, 155Mbps ATM-over-SONET network running at the University of California, San Francisco with a state-of-the-art network based on 10G Ethernet over dense wavelength division multiplexing.

During the nine years that the ATM-over-SONET system has been in place, the metropolitan network has grown to 55,000 nodes encompassing two San Francisco campuses, four hospitals and more than 200 remote sites, including regional clinics spread throughout California. The campus network also has evolved into an essential, mission-critical utility, right up there with water and electric power.

Reliability had become a worry, however. Of great concern was the ticking clock: network devices that were at -- or rapidly heading toward -- end-of-life. That means no vendor support for such essentials as software patches, technical support and replacement of failed hardware components. Cisco's support for the Catalyst 5500s and LS-1010s was waning.

In addition, the demands of video distribution, telemedicine and medical-imaging technologies were quickly making the network outdated. It lacked QoS or multicast capabilities. That meant e-mail, Web surfing, video and medical images all got the same "best effort" treatment. Video packets were broadcast indiscriminately, causing bottlenecks and congestion. Applications that needed greater bandwidth or QoS, such as those used for remote clinician consultation, patient diagnosis and medical research, could not be carried efficiently -- or at all -- on the network.

Clear sailing in the design phase

In the summer of 2003, a design team of network technologists from campus IT, several campus departments and the medical center began to think about a new network. We considered what technologies offered the best mix of price and performance and which offered the greatest capability for expansion and the lowest risk of downtime.

DWDM quickly emerged as the front-runner among the candidate technologies. It can scale over time from eight lambdas (light-wave channels) all the way to 32 protected lambdas or 64 unprotected lambdas.

DWDM would provide a graceful evolution for the network's ever-increasing demands for capacity and capability. Each individual lambda running as fast as 10Gbps can carry a different service. For example, we could run the production Ethernet network over one lambda and a high-definition video feed over another. Or we could choose to provide a secure second Ethernet network for the medical center to connect the university's hospital facilities. This would let secure, electronic, protected health information move across the medical center's clinical network without coming in contact with student and faculty traffic on the campus network.
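As a rough illustration of how that capacity scales, the sketch below multiplies the lambda count by a per-lambda rate. It assumes 10Gbps per lambda, in line with the 10G Ethernet design; the function name and the stage labels are illustrative only, and real usable throughput would be lower once framing and protocol overhead are counted.

```python
def aggregate_capacity_gbps(lambdas: int, gbps_per_lambda: float) -> float:
    """Raw DWDM capacity: number of lambdas times the rate of each lambda."""
    if lambdas < 0 or gbps_per_lambda < 0:
        raise ValueError("lambda count and rate must be non-negative")
    return lambdas * gbps_per_lambda


# Assumption: every lambda runs at 10Gbps (a per-lambda rate can differ in practice).
RATE_GBPS = 10.0

# Lambda counts taken from the scaling path described in the article.
STAGES = [
    ("starting point", 8),
    ("fully protected", 32),
    ("fully unprotected", 64),
]

for label, count in STAGES:
    total = aggregate_capacity_gbps(count, RATE_GBPS)
    print(f"{label}: {count} lambdas x {RATE_GBPS:g}Gbps = {total:g}Gbps")
```

The protected configuration tops out at half the unprotected lambda count because each working lambda has a protection counterpart reserved for it.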

Then there is the matter of protected and unprotected lambdas. The bane of any optical-fiber-based network is the feared fiber cut. DWDM offers the option of protected lambdas, which run in one direction in the DWDM ring, while working lambdas run in the other direction.

Most DWDM gear has protection-switching that senses the loss of signal from the failed working lambdas and switches to the protected lambdas in less than 50 milliseconds. There are few if any network applications that would notice that short an outage.

To add even more resiliency, we engineered in topology reliability. The new network was designed with diversely routed, dual-concentric rings at the main sites. Thus, a fiber cut or optical failure would have to take out both rings to cause a network failure. Even then, protected lambdas would take over.
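The value of diverse routing can be sketched with the standard parallel-availability formula: the service is down only when every redundant path is down at once. The example below assumes ring failures are independent, which diverse routing approximates but does not guarantee, and the 0.999 per-ring figure is a hypothetical input rather than a measured NGMAN value.

```python
def parallel_availability(path_availabilities):
    """Availability of redundant paths, assuming independent failures.

    The combined system is unavailable only if all paths fail together,
    so the individual unavailabilities are multiplied.
    """
    all_down = 1.0
    for availability in path_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        all_down *= 1.0 - availability
    return 1.0 - all_down


# Hypothetical figure: each ring on its own is available 99.9% of the time.
single_ring = 0.999
dual_rings = parallel_availability([single_ring, single_ring])

print(f"one ring:  {single_ring:.6f}")
print(f"two rings: {dual_rings:.6f}")
```

Under these assumptions, a second ring turns three nines into roughly six, which is why a shared conduit or any other common point of failure matters so much: it breaks the independence the formula relies on.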

Now we had the basis for the new network, which we christened UCSF's Next Generation Metropolitan Area Network (NGMAN).

NGMAN is made up of core and secondary sites. The core consists of the two main campuses and a central administrative building. San Francisco General Hospital, Mount Zion Medical Complex, Laurel Heights Conference Center and the Veterans Administration Medical Center are secondary sites.

Core sites are the locations with the heaviest traffic demands. They also are the sites with the most users. Therefore, they have the highest bandwidth (10Gbps) and the most resiliency. Most secondary sites connect to the core in a point-to-point fashion using unprotected lambdas running at 1Gbps or 10Gbps, depending on their traffic requirements.

The product of building reliability on top of reliability was a resilient, redundant and self-healing network that could survive such events as earthquakes and bioterrorism -- not an unimportant consideration for a patient care network in a seismically active area. In fact, NGMAN's design let it achieve five-nines of reliability -- no more than 5.26 minutes of downtime a year.
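The five-nines figure follows from simple arithmetic: 99.999% availability leaves 0.001% of the year for outages. A minimal sketch of that calculation, assuming a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


# Compare three, four and five nines.
for target in (0.999, 0.9999, 0.99999):
    budget = max_downtime_minutes(target)
    print(f"{target * 100:.3f}% available -> {budget:.2f} minutes of downtime a year")
```

For five nines the budget works out to 5.256 minutes, which rounds to the 5.26 minutes quoted above.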

UCSF has a "build it and they will come" philosophy. We don't build things frivolously, but we do build them on faith. The university built an entirely new campus at Mission Bay hoping to attract top medical researchers from around the world. A number of educators and researchers in fact made their way to UCSF and wound up doing their research in the new state-of-the-art Mission Bay buildings, which were outfitted with high-performance networks.

There was an element of "build it and they will come" in the NGMAN project as well. The network was built to support future medical applications. It needed to be high-performance and support QoS and multicast. It had to support high-definition video distribution, IP telephony and real-time medical imaging. And it had to be scalable.

We chose a modular approach to minimize forklift upgrades. Modularity extended to more than just the equipment. We intended the modular concept to allow for adding and deleting secondary sites easily. If a site didn't need the full capabilities of DWDM, we could bring it online via alternative technologies, such as optical metropolitan Ethernet service or leased services.


Jeffrey Fritz

Network World