Microsoft set to open source its powerful network emulator

ONE goes open source.

Microsoft is set to open source its powerful Open Network Emulator (ONE) system, which it has tested on its own Azure network infrastructure for the past year, saying the tool is too important a resource for the Washington-based global giant to keep to itself.

That's according to Dr Victor Bahl, distinguished scientist and director of Mobility and Networking at Microsoft Research, the Microsoft division that focuses on basic and applied research in all areas related to networked systems and mobile computing.

ONE was hailed as "the big announcement" of the recent Microsoft Research Faculty Summit 2018, at which the latest information and results from Microsoft's product and research groups were presented to leaders and researchers from the broad systems research area in computer science.

Reliability, availability

In a podcast interview recorded at the summit, Bahl said that ONE is about ensuring the reliability and availability of networks, particularly complex, cloud-scale networks, which, because of their size and complexity, are exceptionally vulnerable to inadvertent human error.

He pointed out that emulating cloud-scale networks in order to test modifications before they go live was almost impossible, yet a tiny error made during a change in a large cloud network could lead to a massive outage.

"Let's say everything (on the network) is working perfectly. Barring hardware failure, everything should be fine. But then somebody, who is part of your team, goes and changes something somewhere... and this can bring down an entire (cloud) region because if you break the network, your packets are going nowhere," Bahl said.

"I have many horror stories about that. It's what keeps me up at night, worrying that if something happens somewhere, millions of people will be impacted. I don't want to be the source of that."

ONE was built to prevent change-induced network crashes by blocking changes to the network from going live until they have been tested and checked.

ONE works by effectively replicating the entire network. When network engineers and operators make changes, they are actually only making the change to the emulator, not to the underlying network.

"Because it mimics the network underneath so amazingly, you can't tell the difference. Once the changes have been made, the emulator will then try them out and ensure they are good. Once this has been done, the emulator will make the change to the network below and, voila, it should all work," Bahl explained.

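The workflow Bahl describes (change the copy, test it, then push to the real network) can be illustrated with a minimal Python sketch. This is not ONE's code or API: the `Network` class, `remove_link`, `apply_if_safe` and the reachability check are all hypothetical names invented for illustration, and a real emulator would run actual device software rather than a toy graph.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy network model (hypothetical): device name -> set of neighbours."""
    links: dict = field(default_factory=dict)

    def reachable(self, src, dst):
        # Depth-first search over the link graph.
        seen, stack = {src}, [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in self.links.get(node, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False


def remove_link(a, b):
    """Build a change (hypothetical helper) that removes the a-b link."""
    def change(net):
        net.links[a].discard(b)
        net.links[b].discard(a)
    return change


def apply_if_safe(production, change, must_reach):
    """Try a change on an emulated copy; touch production only if it passes."""
    emulated = copy.deepcopy(production)   # stand-in for the emulator
    change(emulated)                       # operators change the copy only
    if all(emulated.reachable(s, d) for s, d in must_reach):
        change(production)                 # validated, so push it live
        return True
    return False                           # rejected; production untouched


production = Network({"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}})
checks = [("a", "b"), ("a", "c")]
first = apply_if_safe(production, remove_link("a", "b"), checks)
second = apply_if_safe(production, remove_link("a", "c"), checks)
print(first, second)
```

In this sketch the first change is accepted because device `a` can still reach `b` through `c`, while the second is rejected because it would cut `a` off entirely, so the production model is left as it was.
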
Catching bugs

In the year that Azure network engineers have used ONE, it has caught hundreds of bugs in proposed changes, potentially preventing major outages.

"There have been some major technical problems (in getting ONE launch-ready). Now we have decided that this is such an important resource for everyone that just hoarding it ourselves is not the right thing to do. We are making it available to the entire community," Bahl said.

This would help large enterprises to improve their network uptime. It would also provide students and researchers with a tool they could use to simulate hyperscale networks such as those built by Microsoft, Google and Amazon without having access to the actual networks themselves. Networking product vendors would also have a way to test new software at scale.
