A Bit of History
The current Gandi network has grown organically over the course of the past ten years, with everything starting from a relatively simple "flat" design. Over time, as more and more services and features were added to the Gandi product lineup, the network has had to be extended to keep up with the product innovations. Most notable in this gradual expansion has been the extension of the network across multiple datacenters, and the desire to maintain resilience across multiple sites.
Because of the relatively "flat" nature of the original network, and the desire for layer-2 adjacency across multiple datacenters, one of the simplest ways to provide this cross-site adjacency was to simply "span" or "trunk" the relevant LAN segments across the different sites. While this is relatively efficient for a small network with only a couple of sites to worry about, the approach rapidly becomes less efficient and drastically more cumbersome as the network grows. This is perhaps where we went astray several years ago.
In 2008, we realised that although the various pieces of routing and switching equipment in the network had "names" as if they were part of a structured hierarchical network model, this was in fact not the case. We found ourselves with a network infrastructure of six core switch/routers and numerous distribution/aggregation routers, across four different sites -- all with the same network segments spanned/trunked through the core to every layer of the network. The result was a nightmare to manage, with troubles related to spanning tree and odd routing behaviour difficult to diagnose.
The Origins of Operation Dragonfly
In addition to the network architecture itself, much of the equipment was starting to show its age and lacked features that were now required in order to further expand and improve Gandi's services. Therefore, starting in 2009, we set out to modernise and restructure the network to meet not only the current needs of the Gandi services, but also our future needs for the next three to five years, both here in France and internationally.
The first step was to consolidate inter-site connectivity, which until that point was using multiple trunked VLANs across dark-fibre connections. We retained the dark fibre, of course, but migrated to wavelength multiplexing to provide different point to point connectivity between equipment across the dark fibre links.
The challenge that we face with any major network restructuring is making the changes without drastically impacting the Gandi services at the same time. Certain services require near-100% uptime to ensure correct operation and synchronisation, and simply ripping out cables en masse and reinstalling new equipment is not feasible in an environment where high availability is required. This is why this work has taken so long to accomplish, and why we have been doing it piece by piece over the course of nearly two years.
Nevertheless, we must keep the existing services running and well maintained, whilst we move forward at full speed towards the exciting new features that will be implemented over the course of the next year. For the sailors among you: we just had to overhaul the outboards!
In September 2009, we upgraded the core switch processors from the Cisco Sup32 platform to the Sup720 3BXL platform.
The next step in November 2009 was the rejuvenation of the old Cisco 6500 platforms we had in the distribution layers of the network at our datacenters in St. Denis, migrating these from the old Supervisor-2A platforms to the Cisco Virtual Switching System 1440 (VSS). We also began deploying MultiProtocol Label Switching (MPLS) technology across the core of our network in order to ultimately provide traffic engineering and eventually innovative ways of interconnecting remote network segments together across our network core.
With the expansion of Gandi's network, not only here in France but also internationally, we had to work out how to adapt the legacy network infrastructure into something scalable and capable of handling our international requirements, both in terms of internet connectivity and, eventually, datacenters on the other side of the Atlantic.
Any network architect will tell you that a scalable network is modular and hierarchical. This has been one of the guiding principles of network infrastructure design for many years, and still holds true today. Cisco calls this the "three-layer network model". Over the past few years, however, this has become a little more complex with virtualisation and service-oriented architectures where the service becomes part of the network itself. Nevertheless, practically all large scale network infrastructures in operation today follow a hierarchical and modular architecture model.
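To illustrate the principle, here is a minimal sketch of what "hierarchical" means in the three-layer model: links are only permitted between adjacent layers (or within the core), so an access segment never spans straight into the core. The device names and topology below are hypothetical examples, not a description of Gandi's actual equipment.

```python
# Sketch of the three-layer (core / distribution / access) model:
# each device belongs to one layer, and links are only allowed
# between adjacent layers, or between core devices themselves.

ADJACENT = {
    ("core", "core"),
    ("core", "distribution"),
    ("distribution", "access"),
}

def valid_link(layer_a, layer_b):
    """A link is hierarchical if the two layers are adjacent."""
    return (layer_a, layer_b) in ADJACENT or (layer_b, layer_a) in ADJACENT

# Hypothetical devices, tagged with their layer.
layers = {
    "core-1": "core", "core-2": "core",
    "dist-1": "distribution", "dist-2": "distribution",
    "acc-1": "access", "acc-2": "access",
}

# A well-formed hierarchical topology: every link respects the layering.
links = [("core-1", "core-2"), ("core-1", "dist-1"), ("core-2", "dist-2"),
         ("dist-1", "acc-1"), ("dist-2", "acc-2")]
assert all(valid_link(layers[a], layers[b]) for a, b in links)

# A "flat" span from the access layer straight to the core -- the kind
# of shortcut described earlier -- breaks the hierarchy:
assert not valid_link(layers["acc-1"], layers["core-1"])
```

The point of the model is exactly what the assertions capture: any path between access segments is forced through the distribution and core layers, which keeps failure domains small and the core simple.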
With the expansion of Gandi to the north-american continent, we needed to ensure resilient connectivity between the datacenters. A simple rule of thumb applies here -- each datacenter is connected to at least two other datacenters via diverse connectivity paths.
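That rule of thumb is simple enough to check mechanically. The sketch below verifies that every site in a topology has at least two distinct neighbours; the site names and the ring layout are illustrative assumptions, not Gandi's actual inter-datacenter map.

```python
# Check the rule of thumb: every datacenter should be connected to at
# least two other datacenters. Returns the set of under-connected sites.

from collections import defaultdict

def under_connected(links, minimum=2):
    """Sites with fewer than `minimum` distinct neighbours."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {site for site, peers in neighbours.items() if len(peers) < minimum}

# Hypothetical ring of four sites: each has exactly two neighbours,
# so the redundancy rule holds.
ring = [("paris", "baltimore"), ("baltimore", "london"),
        ("london", "amsterdam"), ("amsterdam", "paris")]
print(under_connected(ring))       # → set(): no under-connected sites

# Cut one link and two sites drop below the threshold.
print(under_connected(ring[:-1]))  # paris and amsterdam are left exposed
```

A ring is the cheapest topology that satisfies the rule; adding cross-links beyond the ring buys tolerance to a second simultaneous failure.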
This past year, we built the datacenter in Baltimore, and this datacenter is now fully operational for the Gandi Hosting platform. Over the course of the coming months, a number of the other Gandi systems will be available in Baltimore as well -- most notably DNS and the Gandi Website.
As of the end of 2010, our international connectivity also included additional peering capacity in London, Amsterdam, and Ashburn, Virginia (Washington DC) -- with more to come over the course of the next couple of years!
We have accelerated the network engineering activities since the beginning of 2011 and over the next few months there will be a number of highly-intrusive maintenance activities undertaken with the objective of finalising Operation Dragonfly before the end of the first half of 2011.
The first part of this final phase took place last weekend, at St. Denis, where we finally and definitively removed the "spanned/trunked" LANs between the distribution/aggregation and core layers of the network in St. Denis. As a result of this move, any "issue" in the access or distribution layers of the St. Denis network segments will have no impact on the routing core of the Gandi network in that location. Next we have to do the same thing at Telehouse2, which is a little more complex!
Over the course of the next few weeks, you will see a number of scheduled maintenance periods during which we will perform pretty much the rest of the necessary activities in support of this three-year project:
- Closure of our point of presence at Paris Telehouse-1 (Jeuneurs)
- Updating of a number of critical Gandi services to make use of new hardware and software technologies.
- Implementation of Anycast and Content Delivery technologies for a number of services, including DNS and the Gandi Website, across all of our datacenters both in Europe and in the US.
- Complete upgrade and re-installation of the Gandi services at Paris Telehouse-1 (Voltaire) with inter-site resilience and interoperability.
What Will it Look Like?
The new Gandi network infrastructure will resemble the diagram below (though this is somewhat simplified). The key point here is that the architecture is designed and built for resilience, modularity, and future expansion… it's pretty elementary after all… As Mickey Mouse once said: "Arithmetic is being able to count to twenty without taking off your shoes". ;)