SD-WAN for Dummies
JRES 2017 - Nantes
Jérôme Durand, Consulting Systems Engineer, Cisco
@JeromeDurand - http://reseauxblog.cisco.fr
Agenda
- Challenges at remote sites
- Introduction to SD-WAN
- Virtualization at remote sites
- Demos
- Conclusion
Challenges at remote sites
Digitalization puts remote sites under pressure
- More users: 80% of employees and customers are served in branch offices*
- More devices: 73% growth in mobile devices from 2014 to 2018**
- More applications: 20-50% increase in enterprise bandwidth per year through 2018**
- More threats: 30% of advanced threats will target branch offices by 2016 (up from 5%)**
- Traffic at the remote site: digital displays, omni-channel apps, SaaS, enterprise apps, guest WiFi, HD video, online training, social media, OS updates, mobile apps
1 = Networking Index: Global Mobile Data Traffic Forecast Update, 2013–2018: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white_paper_c11-520862.html
2 = Cisco 2014 Mobility Landscape Survey: https://www.cisco.com/c/dam/en/us/solutions/collateral/enterprise-networks/mobile-workspace-solution/enterprisemobilitylandscapestudy-spring2014.pdf

Mobility and cloud are driving digital innovation, and that innovation is overwhelming in-branch networks. The branch serves up to 80% of employees as well as customers, and with most branches providing 4 Mbps or less of connectivity, it is a chokepoint for the new breed of bandwidth-intensive, latency-sensitive applications. According to Gartner, average enterprise bandwidth will increase by 20% to 50% per year through 2018, depending on the region and the line of business. This rise will be driven largely by the adoption of video and cloud applications, rich media and data center centralization.

Look at retailers, for example. Enabling web experiences in store represents a huge opportunity for both employees and customers: according to recent Google research, 89% of consumers use smartphones while shopping in store. But it is not just consumer-facing in-branch web experiences that are exploding. Most enterprises rely on web-based apps, delivered either from their data center or from the cloud in the case of SaaS (e.g. salesforce.com).

This explosion of web-based apps in the store has overwhelmed in-branch networks. Most in-branch networks are not optimized for performance or for this flood of data; high latency and lack of bandwidth are the norm, and that can lead to application performance issues. Imagine a scenario (as was the case for one of our early customers) where a store associate is trying to complete an online order for an out-of-stock item, but the transaction is slow to load or times out. The customer did not stick around, and from that point on the iPad was left in a drawer and the store associate never used that app again. Similar scenarios, with slow-loading or unusable online training, SaaS apps and so on, have been seen across our customers.

Getting more bandwidth (i.e. more links) might seem like a viable option, but often more bandwidth is not available (e.g. mall locations) or is cost prohibitive. According to analysts in 2013, three out of four organizations will not have any additional wide area network (WAN) budget (Nemertes Research). That means 75% of IT teams will not be able to buy more bandwidth to address this exponential traffic growth. Apps that are slow or not working properly are not an option from a business perspective. So what can be done?
A new trend at remote sites
- Applications are migrating to the cloud
- Internet access is moving closer to remote sites
- The WAN is becoming ever more critical: it connects the user to the applications
- Need for a more agile network
- Need for application-level SLAs
- Diagram labels: Cloud, Data Centers, Remote site
When a user runs into an application performance problem ...
- User: "My application is slow, I can't work today."
- Server / application team: "My servers look OK, it must be the network."
- Network administrator: "I don't see a problem, but I can't certify anything."
- Where does the problem come from? Increased WAN latency? Application? Server? PC? User?
What is happening in my network?
- Port monitoring versus application monitoring
- Applications (chart labels): bittorrent, Netflix, share-point, gtalk-voip, google-docs, rtp, citrix, ssl, sip, skype, webex-meeting, https, flash-video, dns, facebook
- Applications (chart labels): unknown, http, https, ica, sip, dns, cifs, hsrp, icmp, ldap, msnp, sap
The need for DPI (Deep Packet Inspection)
- Static classification by port is no longer enough
- Encryption, and applications that try to blend into the crowd
- Applications that use several kinds of session (video, voice, data)
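To illustrate why a static port table falls short, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the port table, the signature strings and the `Flow` record are hypothetical, and a real DPI engine is far more elaborate than a substring match.

```python
from dataclasses import dataclass

# Hypothetical static port table, as a port-based classifier would use.
PORT_TABLE = {53: "dns", 80: "http", 443: "https", 5060: "sip"}

# Hypothetical signatures standing in for a DPI engine's application library.
SIGNATURES = {
    b"BitTorrent protocol": "bittorrent",
    b"webex.com": "webex-meeting",
    b"facebook.com": "facebook",
}


@dataclass
class Flow:
    dst_port: int
    # First bytes seen on the flow: cleartext payload, or cleartext handshake
    # data (for example a server name) when the rest is encrypted.
    first_bytes: bytes


def classify_by_port(flow: Flow) -> str:
    """Static classification: only the destination port is considered."""
    return PORT_TABLE.get(flow.dst_port, "unknown")


def classify_by_inspection(flow: Flow) -> str:
    """Look inside the flow first, then fall back to the port table."""
    for pattern, app in SIGNATURES.items():
        if pattern in flow.first_bytes:
            return app
    return classify_by_port(flow)


flows = [
    Flow(443, b"...webex.com..."),          # business application over 443
    Flow(443, b"...facebook.com..."),       # recreational traffic over 443
    Flow(80, b"\x13BitTorrent protocol"),   # peer-to-peer hiding on port 80
    Flow(6881, b"opaque"),                  # nothing recognizable
]

for f in flows:
    print(f.dst_port, classify_by_port(f), classify_by_inspection(f))
```

With the port table alone, the first two flows are both "https" and cannot be prioritized differently; inspection is what tells them apart.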
Evolution of the Internet
Introduction to SD-WAN
ONUG: Open Networking User Group
- A community of users
- Defines the needs of large enterprises
- Significant work on SD-WAN
- https://www.onug.net
SD-WAN according to ONUG
10 SD-WAN requirements according to ONUG (1/2)
- Management of several active links (public and private)
- WAN built on physical and virtual devices
- Secure hybrid WAN that allows per-application traffic engineering, taking link performance into account
- Visibility and prioritization of critical and real-time applications according to defined policies
- Highly redundant architecture
10 SD-WAN requirements according to ONUG (2/2)
- Layer 2 and layer 3 interoperability with the rest of the infrastructure
- Centralized management interface with dashboards per application, site and VPN
- Programmability of the infrastructure through APIs on a controller that provides an abstraction of the whole; logs sent to third-party collectors (SIEM...)
- A device must be deployable without configuration and with minimal effort on the existing infrastructure
- FIPS-140-2 certification for encryption
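The requirement for per-application traffic engineering that takes link performance into account can be sketched as follows. This is only an illustration under stated assumptions: the application names, thresholds, link names and the fallback rule are all hypothetical and are not taken from ONUG or from any product.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    """Latest measurements for one WAN link (hypothetical probe results)."""
    name: str
    latency_ms: float
    loss_pct: float
    jitter_ms: float


@dataclass
class AppPolicy:
    """Per-application SLA thresholds and link preference order (hypothetical)."""
    max_latency_ms: float
    max_loss_pct: float
    max_jitter_ms: float
    preferred: tuple


POLICIES = {
    "voice": AppPolicy(150, 1.0, 30, ("mpls", "internet")),
    "saas": AppPolicy(300, 2.0, 100, ("internet", "mpls")),
}


def meets(policy: AppPolicy, link: LinkStats) -> bool:
    """True when the link currently satisfies every threshold of the policy."""
    return (
        link.latency_ms <= policy.max_latency_ms
        and link.loss_pct <= policy.max_loss_pct
        and link.jitter_ms <= policy.max_jitter_ms
    )


def select_path(app: str, links: list) -> str:
    """Pick the first preferred link within SLA, else the lowest-latency one."""
    policy = POLICIES[app]
    by_name = {link.name: link for link in links}
    candidates = [by_name[n] for n in policy.preferred if n in by_name]
    if not candidates:
        raise ValueError(f"no usable link for {app}")
    for link in candidates:
        if meets(policy, link):
            return link.name
    # No link is within SLA: degrade gracefully to the lowest latency.
    return min(candidates, key=lambda link: link.latency_ms).name


healthy = [LinkStats("mpls", 40, 0.1, 5), LinkStats("internet", 25, 0.5, 10)]
degraded = [LinkStats("mpls", 400, 3.0, 50), LinkStats("internet", 25, 0.5, 10)]

print(select_path("voice", healthy))   # voice prefers MPLS while it is in SLA
print(select_path("voice", degraded))  # voice moves to Internet when MPLS degrades
print(select_path("saas", healthy))    # SaaS prefers the Internet link
```

Both links stay active at all times; the policy only decides which one each application uses at a given moment.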
A transition in the SLA
- WHO? From the operator to the organization
- WHAT? From the network to the application
Hybrid WAN and overlay
- Traditional hybrid: two WAN routing domains, active/standby WAN paths
- Full SD-WAN overlay: one IPsec overlay, one WAN routing domain, active/active WAN paths
- Diagram labels: Data Center, Branch, ISP A, ISP B, MPLS, Internet, IPsec

Traditional hybrid WAN designs use MPLS as the primary path with DMVPN over the Internet as a backup. A separate routing domain exists for each WAN transport, and a third domain covers the data center. To integrate these domains, routes are redistributed and metrics aligned so that the MPLS path is preferred. For data privacy, two different encryption technologies are typically used: GETVPN over MPLS and DMVPN over public Internet transports. Traditional hybrid designs have been around for a while and are well understood.

IWAN improves and simplifies the hybrid WAN design. In a hybrid IWAN model, DMVPN is used over every transport (MPLS, Internet, 3G/4G, etc.) instead of GETVPN for MPLS and DMVPN for the public Internet, which gives a single WAN design independent of the transport and a consistent security profile. With DMVPN as the tunnel overlay everywhere, a single routing protocol can be used that is easy to implement and tune for active/active WAN paths. The consistent IPsec tunnel overlay and single routing design provide simpler configuration, simpler topology management and simpler change control over any WAN design: single or dual MPLS, hybrid, dual Internet, mobile branches, etc. In addition, when PfR is added to this architecture, load balancing becomes simpler and more efficient (more on this later).

In the traditional model there is typically a primary connection over MPLS and a backup over the Internet. DMVPN is usually configured over the Internet connection, while the MPLS path runs standard IP between the PE and the CE. With the IWAN model, DMVPN is added over the MPLS path as well.

Compare the routing models of the two approaches. The traditional hybrid approach requires either BGP or static routes on the MPLS path, and either an IGP (such as EIGRP) or iBGP over the DMVPN path, which means two different routing domains for two different paths. In many cases the LAN is yet another routing domain, so routes must also be redistributed between WAN and LAN. This can become quite complex. The IWAN hybrid design requires two tunnels, so the only requirement on the two interfaces is what is needed to set up the tunnels. A routing protocol, such as EIGRP or BGP, is then configured on top of the tunnels, creating a single routing domain. Furthermore, the enterprise routing table is not exposed to the provider, which simply routes the tunnels.

IWAN hybrid design model characteristics:
- Has a single premium transport: MPLS VPN or Ethernet VPN WAN
- Uses a single Internet link
- Uses Front Door VRF (FVRF) to separate internal and external routing
- Single DMVPN overlay
- Single routing domain
Examples ...
- 1 router, 1 connection, MPLS: 99.95%* availability, downtime per year 4-9 hours
- 1 router, 1 connection, Internet: 99.90%* availability, downtime per year 8 hours 46 minutes
- 1 router, 2 connections (MPLS and/or Internet): 99.995% availability, downtime per year 26 minutes
- 2 routers, 2 connections (MPLS and/or Internet): 99.999% availability, downtime per year 5 minutes
* Typical MPLS and business-grade broadband availability SLAs and downtime per year, calculated with the Cisco AS DAAP tool.

Network availability and resiliency questions usually come up when the IWAN solution is introduced. To address these concerns, we evaluated various WAN designs using industry best practices for calculating system availability. Most MPLS providers offer a 3.5-nines availability SLA, which means approximately 4.25 hours of downtime or less per year to stay in compliance with the SLA contract. Most business-grade broadband providers offer 3-nines availability SLAs, which is approximately 8.75 hours of downtime a year. For the availability calculation, we added the MTBF downtime estimates per year of a typical ISR.

The single-router, single-path designs stay at 3.5 and 3 nines of availability because the single link is the lowest common denominator. Adding a second transport raises network availability to 4.5 nines, which is less than 30 minutes of downtime a year. The assumption here is that the dual-provider design has path diversity between providers; for MPLS, broadband cable and broadband DSL this assumption is reasonable. With dual routers and dual paths, availability rises to 5 nines, or 5 minutes of downtime a year. The point is that redundancy and path diversity matter more than the SLA of any single link when building highly available WAN networks.
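The downtime figures on this slide follow from simple arithmetic on the availability percentages, which the short Python sketch below reproduces. It also shows the textbook rule for combining independent elements in series and in parallel. The router availability used at the end is a purely hypothetical value chosen for illustration: the slide's own figures come from a Cisco tool and include router MTBF estimates that are not given here, so the combined result is not expected to match the slide exactly.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def downtime_hours(availability: float) -> float:
    """Expected downtime per year, in hours, for a given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def parallel(*availabilities: float) -> float:
    """Redundant, independent elements: down only when all of them are down."""
    unavailable = 1.0
    for a in availabilities:
        unavailable *= 1.0 - a
    return 1.0 - unavailable


def series(*availabilities: float) -> float:
    """Elements that must all work: availabilities multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Single links, using the SLA values shown on the slide.
print(f"99.90%  -> {downtime_hours(0.9990):.2f} hours per year")
print(f"99.95%  -> {downtime_hours(0.9995):.2f} hours per year")
print(f"99.995% -> {downtime_hours(0.99995) * 60:.1f} minutes per year")
print(f"99.999% -> {downtime_hours(0.99999) * 60:.1f} minutes per year")

# Two diverse links in parallel behind one router.
links = parallel(0.9995, 0.9990)
ROUTER = 0.99995  # hypothetical router availability, for illustration only
one_router = series(ROUTER, links)
print(f"two links, one router -> {one_router:.6f}")
```

With two diverse links the combined link availability is so high that the single router becomes the limiting element, which is why the slide's last row adds a second router.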
A few points to watch
- Cost
- Security
- Cloud
- Migration
Virtualization at remote sites
Other challenges at remote sites
- Multiple devices: routers, appliances, servers
- Complex to manage: device integration
- High OPEX: upgrades, refreshes, site visits
- One solution: virtualize functions at remote sites

The problem seen today at remote sites is that many applications have been deployed to improve the business, which has resulted in multiple devices being installed: routers, server appliances, and standalone servers for Windows and Linux. This has created a branch architecture that is difficult to manage and costly to operate, as upgrades, refresh cycles or repairs often require site visits and new equipment. One customer we discussed this scenario with had around 1,000 locations, each with 4 network-related devices in addition to some internally developed merchandising applications. They stated that in any year, on average, they have 1 to 1 ½ FTE working for 6 months on refresh or upgrade activities.
What if remote sites looked like this ...
- Diagram labels: Orchestration & Automation, Platform

What if remote sites looked more like this: a platform, abstracted from the services, that hosts a group of service functions? Say it starts out on day 1 with a routing service that provides the L2/L3 interface and transport functions.
What if a company-wide webcast needed to be run ... and a solution could be deployed in under a day
- Diagram labels: Orchestration & Automation, Video, Platform

Then one day the CIO says the company needs to enable webcasting for corporate communications. Instead of having to purchase, stage, ship and deploy video caches or stream-splitting devices, a software and metadata package is pushed out to the hosts, and the service is put in place and tested in a day.
When the webcast is over, resources are released
- Diagram labels: Orchestration & Automation, Video, Platform

Since the application is only required for corporate communications every 3 or 6 months, the same system can power the service down until it is needed again, freeing up a pool of shared resources for other applications.
What if a new ERP package couldn't be leveraged ... but you could deploy a solution in under a day
- Diagram labels: Orchestration & Automation, Platform

Let's consider another example to see how this would work. What if a new ERP package was deployed but employees could not make use of it because of poor performance? You could deploy a solution that improves performance globally in under a day.
What if some sites needed new wireless control
- Diagram labels: Orchestration & Automation, Platform

At the same time, what if only some sites needed new wireless control? Same approach: central automation pushes out instructions and the remote sites turn up a new wireless controller.
Consider a new threat to the business ... everywhere at once. But a new defense network can be up in minutes
- Diagram labels: Orchestration & Automation, Platform

How about a more critical case? What if the business was threatened by a new attacker, but a new defense network could be turned up in minutes, everywhere at once? That means not just turning up another virtual service, but also changing existing device configuration and operation, creating a new network segment and implementing a new service path. By leveraging a centralized automation system to distribute policy, together with a virtualized network services platform, this can be done.
Virtualization at remote sites: specifics
- All VNFs run on the same server
- Low bandwidth, latency
- No dedicated management link
- Diagram: a numbered path (steps 1 to 8) with labels Management, Router, FW, Virtual SW-1, Kernel Port, Hypervisor, Server, L2 VLAN, LAN SW, MPLS
Virtualization at remote sites: specifics
- Server form factor (size, noise, hardening…)
- Connectivity (LTE, DSL…?)
- Ease of deployment (ZTP)
- Openness to many VNFs
Virtualization at remote sites: specifics
- Performance
- Management
- Integration into the network ecosystem
- ... and of course the need to optimize the chaining between all the services ...
NSH: Network Service Header (1/2)
- IETF sfc WG (Service Function Chaining)
- Problem statement and architecture defined in RFC 7498 and RFC 7665
- Goal: make network functions work together better (metadata exchange)
- Leaves the choice of network transport between VNFs open (native, GRE, VXLAN…)
NSH: Network Service Header (2/2)
- SC: Service Classifier
- SFF: Service Function Forwarder (two shown)
- SF: Service Function (Service Function 1, 2 and 3)
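The roles on this slide can be sketched in a few lines of Python: a classifier picks a chain, a forwarder steers the packet through the service functions in order, and metadata travels with the packet so that functions can share context. This is a conceptual model only. The function names, chains and metadata keys are hypothetical, and the sketch does not reproduce the NSH wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: str
    path_id: int = 0   # which chain the classifier selected
    index: int = 0     # position of the next service function in that chain
    metadata: dict = field(default_factory=dict)  # context shared between functions
    dropped: bool = False


# Hypothetical service functions (SF); each may read or add metadata.
def firewall(pkt: Packet) -> Packet:
    if "malware" in pkt.payload:
        pkt.dropped = True
    pkt.metadata["fw"] = "inspected"
    return pkt


def video_cache(pkt: Packet) -> Packet:
    pkt.metadata["cache"] = "looked up"
    return pkt


def wan_optimizer(pkt: Packet) -> Packet:
    pkt.metadata["wanopt"] = "compressed"
    return pkt


# Hypothetical service chains, keyed by path identifier.
CHAINS = {
    1: [firewall, wan_optimizer],
    2: [firewall, video_cache, wan_optimizer],
}


def classify(pkt: Packet) -> Packet:
    """Service classifier (SC): choose a chain for the packet."""
    pkt.path_id = 2 if pkt.payload.startswith("video") else 1
    pkt.index = 0
    return pkt


def forward(pkt: Packet):
    """Service function forwarder (SFF): visit each function of the chain in order."""
    chain = CHAINS[pkt.path_id]
    visited = []
    while pkt.index < len(chain) and not pkt.dropped:
        service_function = chain[pkt.index]
        pkt = service_function(pkt)
        visited.append(service_function.__name__)
        pkt.index += 1
    return pkt, visited


for payload in ("video stream", "web page", "video malware"):
    pkt, visited = forward(classify(Packet(payload)))
    print(payload, "->", visited, pkt.metadata, "dropped" if pkt.dropped else "delivered")
```

Because the chain is carried as data with the packet rather than wired into the topology, adding or removing a function only changes the chain definition, not the network between the VNFs.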
Demo
Conclusion
Conclusion
- SD-WAN addresses the new challenges
- Solutions are maturing
- In parallel, virtualization at remote sites is growing, with new challenges of its own
- NSH is under development for service chaining
- What use for the RENATER community?
Thank you!