Foreword

Dear reader,

My name is Frederic Detienne. I had the chance to participate in cryptographic product development at Cisco from the early days and eventually to lead it, acting as Architect of DMVPN and FlexVPN. I started as a TAC Engineer and have evolved into a network designer and consultant, both inside Cisco and for our customers. I work across functions with our Engineering, Advanced Services, Support, and Marketing departments.

It seems like yesterday or many eons ago ... In the beginning were crypto maps.

A few dinosaurs (like myself) started their journey in cryptographic protocols and algorithms when Cisco released CET (Cisco Encryption Technology) on IOS 11.2 in August 1996.

My only exposure to cryptography had been strictly theoretical, as a student at the University of Liège 7 years before. I suppose I was lucky to have had such a background, because nobody around me seemed to have received any crypto education or felt inclined toward that very obscure technology. Before that, cryptography was managed through very complex systems, mostly reserved for governments and militaries.

CET was commercial grade in the sense that it was a major simplification over the former systems. It allowed a mere mortal to configure a very regular and relatively cheap router (Cisco 2500) to encrypt data across a public Layer 3 network. The cryptographic algorithms were very good: DES, then 3-DES, Diffie-Hellman key exchange. At 160 Kbps, the throughput was acceptable in those days.

In the aggregation services, technology goes through 4 steps: make it work, make it work reliably, make it work at speed, make it work at scale. There are other timelines of interest, but this one mattered particularly for cryptographic VPNs.

CET evolved into IKE/ISAKMP + IPsec as the drafts matured into standards under the leadership of Dan Harkins.

Nobody really knew where we were heading. The initial code was inherited from CET, which we still had to support for our early adopters. It also had to be modular and ready to accommodate future enhancements, optimizations, and hardware architectures. In a word: it was messy.

The data-plane vs control-plane separation outlined in RFC 2408 was both a blessing and a curse. On one hand, it brought complexity; on the other, it brought good OSI and code separation, without which we might not have survived.

At the control plane level, IKE itself, its rekey complexity, and the differences in behavior between IKE SA rekeys and IPsec SA rekeys triggered many race conditions. Overall, we managed to stabilize the system and we “made it work reliably.” Step 2 was complete.

In the data plane things were less rosy. Crypto maps quickly showed their limits:

– the combinatorial explosion of source/destination pairs on ever larger and more complex networks

– code complexity due to packets being stolen in OSI layer 2 and re-encapsulated into a new IP header (OSI layer 3)

The security policy size explosion made mesh networks totally unmanageable. Besides, the security policy was mostly a static transcription of information we already had in the dynamic routing table, which led to customer frustration.
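The growth behind that explosion can be sketched with a little arithmetic. The snippet below is only an illustration under a simplified model: it assumes a full mesh in which every site has the same number of protected subnets and every subnet pair needs its own static source/destination entry. The function names are hypothetical and do not correspond to any Cisco tool.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Distinct site pairs in a full mesh: n * (n - 1) / 2."""
    return sites * (sites - 1) // 2


def selector_pairs_per_router(sites: int, subnets_per_site: int = 1) -> int:
    """Static source/destination pairs one router must list.

    Simplified model (an assumption): every local subnet is paired with
    every subnet of every other site.
    """
    peers = sites - 1
    return peers * subnets_per_site * subnets_per_site


if __name__ == "__main__":
    for n in (10, 100, 350):
        print(n, full_mesh_tunnels(n), selector_pairs_per_router(n, subnets_per_site=3))
```

Under these assumptions, a 100-site mesh already has 4,950 site pairs, and a router with three local subnets would carry 891 static entries that merely restate what the routing table already knows.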

A few site-to-site and hub-and-spoke configurations were possible, but manageability and scalability suffered badly. The TAC was recommending the use of GRE protected by IPsec in order to run routing protocols on top of the tunnels. This quickly became the preferred way to deploy complex meshes. Scalability was relatively limited due to hardware performance, but it really made everyone’s life easier, and supporting those networks became very pleasant, with more and more satisfied customers.

Meanwhile, EasyVPN had appeared and was offering a remote access solution. The clients were either PC software or small branch routers. The big advantage was that the hub configuration was very compact—a few lines would allow hundreds of remote branches to connect. Unfortunately, the underlying implementation relied on crypto maps and suffered from quality and supportability issues. While EasyVPN was very good, it was not stable enough compared to the GRE/IPsec solution we used in mesh.

Customers had to choose between ease of configuration for large but simple hub-and-spoke networks and a more complex configuration for mesh networks.

One day, someone showed me NHRP: a protocol to establish circuits on demand. The code was very crude and incomplete but the developer (who had left Cisco by then) had provisioned for GRE tunnels, very likely in order to test his code without expensive equipment. I had this light bulb moment and hacked together a prototype to encrypt those GRE tunnels as they were created.

DMVPN was born in a TAC lab in Brussels, demonstrated to our colleagues in San Jose, California, and developed into a product.

We now had something that worked well, was satisfactorily stable despite being a fresh feature, and offered an easy configuration for complex networks. It started as DMVPN phase 1 with hub-and-spoke only and followed quickly with DMVPN phase 2 allowing dynamic branch-to-branch tunnel creation.

Scale was not there yet, though. The IGPs (OSPF and EIGRP mostly) caused a significant burden, and deploying networks of more than 350 nodes was still a challenge. It may seem small today, but network sizes grew as technology permitted. A mesh network of 350 nodes was fantastic back then. The market simply got used to it quickly, and demand for more appeared.

The market demanded that we scale both vertically, in tunnel density (the number of tunnels on a single given platform), and horizontally (the ability of a cluster of DMVPN hubs to collaborate). DMVPN phase 2 daisy chaining was a dreaded system to design and troubleshoot. Besides scalability problems, it also suffered from slow reconvergence and convoluted configuration.

The workload reduced dramatically while market share and revenue took off; we started work on scaling DMVPN before the need became too pressing.

The semantics of NHRP redirect and NHRP resolution forwarding appeared and helped us scale almost limitlessly across hubs. You could literally have dozens of hubs working in cluster mode. Just as importantly, we could finally get away from the traditional IGPs and investigate lighter protocols such as RIP, ODR, and even BGP (which is feature-rich and complex at large, but out of which we only needed the simplest elements). We could now scale to about 1500 peers per hub and a virtually limitless number of hubs. Each additional hub would linearly add its capability to the cluster. This was an important step forward in network design.
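As a rough sizing illustration of that linear scaling, the sketch below takes the figure of about 1500 peers per hub from the text and assumes, for simplicity, that capacity adds up perfectly with no headroom reserved for redundancy or failover. The constant and function names are hypothetical.

```python
import math

PEERS_PER_HUB = 1500  # approximate per-hub figure quoted in the text


def cluster_capacity(hubs: int, peers_per_hub: int = PEERS_PER_HUB) -> int:
    """Total peers a cluster can serve if each hub adds its capacity linearly."""
    return hubs * peers_per_hub


def hubs_needed(spokes: int, peers_per_hub: int = PEERS_PER_HUB) -> int:
    """Smallest number of hubs whose combined capacity covers all spokes."""
    return math.ceil(spokes / peers_per_hub)


if __name__ == "__main__":
    print(cluster_capacity(12))
    print(hubs_needed(10_000))
```

A real design would add spare hubs for resilience, so these numbers are a lower bound rather than a recommendation.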

The biggest challenge was now to educate our customers and sales team about the various options and designs. When you work on Crypto VPNs all day, every day, it is easy to forget that very few people actually understand the ins and outs of every feature and design. Also, many customers were satisfied with what they had and had no reason to look for more, or even to suspect that something better could exist.

A measure of that complexity could be seen in our 8-hour Cisco Live session going over our multiple Crypto VPN solutions and their use cases:

– Crypto maps

– Easy VPN (client mode and network extension mode)

– Enhanced Easy VPN

– GRE/IPsec

– GET VPN

– DMVPN phase 1, 2 and 3

The pros and cons panned out as below:

– Crypto maps were still as limited and terrible as before but were necessary for third-party integration, as they offered minimal compatibility with devices that had minimal functionality.

– EasyVPN supported remote access (especially the software client) and was compact, but the feature had grown organically and the UI was terrible; it was also crypto map based, and its quality was poor.

– Enhanced Easy VPN solved the crypto map problem and was a major improvement over Easy VPN, but it did not enjoy proper marketing and remained poorly adopted. The UI was the same and hence difficult.

– GRE/IPsec was slowly disappearing in favor of DMVPN and tunnel protection in the site-to-site scenarios.

– GET VPN had lower security and limited scalability, but it was lighter on resources when used properly and when the use case was adequate. Notably, it allowed native multicast.

– DMVPN was growing in both the hub-and-spoke and partial mesh cases, but the routing protocol was a deterrent for Security Operations, who preferred using EasyVPN.

This really meant 8 hours during which we barely had the time to describe how a solution worked and what use case it was best for.

Customers who were successively shopping for a remote access solution, then a site-to-site, then a dynamic mesh ... had to study and learn new ways of designing and troubleshooting for each feature, over and over again.

The complexity we were witnessing in TAC with our fresh recruits was also impacting our customers, partners, Advanced Services, and sales teams.

At the same time, as with all things so far, market demand slowly started to outgrow DMVPN after a few years. Tunnel density still had to increase, and the routing protocols were not scaling anymore.

We decided to merge the EasyVPN and DMVPN features into a single feature set that would offer the advantages of both: learn once, apply always. The characteristics had to be the following:

– clear, consistent, compact, and powerful CLI: simple things ought to be simple to configure, complex things ought to be possible

– using routing protocols should be a customer choice, not mandatory

– NHRP usage could decrease, except for spoke-to-spoke tunnel creation

– increased scale, to at least 10,000 tunnels per hub

– all the remote access management features had to be applicable to site-to-site and hub-and-spoke (AAA authorization in particular, to apply per-user QoS, ACLs, and so on)

– reduced reliance on PKI and more manageable pre-shared keys; both had to be possible, at least for hub-and-spoke

– backup and load balancing scenarios

– third-party interoperability

– high serviceability/troubleshootability

– reduced learning time by using consistent protocol and data flows

– state-of-the-art security at the cryptographic and network level

Because we could neither risk breaking IKEv1 stability nor invest in a protocol that was slated to disappear, we used IKEv2 as an inflection point to do things right. Clean implementation, clean user interface.

Today, we are capable of offering combined training, including hands-on experience, covering remote access, hub-and-spoke, dynamic mesh, AAA management, and some troubleshooting in 4 hours. The total training time has decreased by an order of magnitude.

FlexVPN is not perfect and is not the end of the road, but in terms of applicability and total cost of ownership, taking into account training time and supportability, this is the best we have ever had.

I hope you will have as much pleasure discovering FlexVPN in this book as we had developing those features, thinking about you, our users, our customers, our sponsors.

None of this would have happened without great individuals who went beyond the basic market analysis that a typical Product Management team performs and took it on themselves to listen to our customers’ real demands.

Namely, it took the courage of one Senior Manager, Pratima Sethi, to sponsor and execute the development of FlexVPN. She also made DMVPN and EasyVPN successful; she deeply understood the need for post-deployment capabilities such as monitoring and troubleshooting and made it all possible.

The authors of this book, Amjad Inamdar and Graham Bartlett, are long-time collaborators who also deeply impacted all our VPN solutions, and I am very proud to work with them.

The team's prominent members included Alexandre Honore, Olivier Pelerin, Wen Zhang, Raffaele Brancaleoni, Sairam Yeleshwarapu, Saikrishna Adoni, Tapesh Maheshwari, Raghunandan P., and many others to whom I apologize for not citing.

Frederic Detienne

Distinguished Services Engineer

Cisco
