ArcOS: Building Scalable, Automated, Multi-Tenant Data Center Fabrics

March 26, 2019 | Keyur Patel

Introduction

As application-centric digital infrastructures have become mainstream, data center network architectures (across both enterprises and providers) are firmly rooted in routing-centric designs, leaving behind the siloed switching-centric designs of the past. Starting with the hyperscale cloud providers and extending to most major enterprises, multi-tier Clos topologies (“Leaf-Spine”) are the norm today and will remain so for the foreseeable future. These designs enable wide fan-out bandwidth for east-west host-to-host communication, Layer 3 routing to the top-of-rack (or even into the server host), and consistent latency metrics.

From a control plane perspective, both the hyperscalers and big enterprises use eBGP as the data center transport layer (aka “underlay”) because of its proven ability to scale in network backbones. This class of customers typically builds its own overlays on top for application traffic pertaining to intra-PoD and inter-PoD communication.

The rest of the enterprise landscape, with no such extensive resources, has tended to limit BGP usage to the overlay control plane with BGP EVPN while using traditional Interior Gateway Protocols (IS-IS, OSPF) as the underlay. This has two undesirable side effects: limited PoD scale in the transport layer, and a multi-layer protocol stack (e.g. IS-IS/OSPF for the underlay, BGP EVPN for the overlay) with its associated complexity. In addition, it gives incumbent vendors a reason to offer vertically integrated, locked-in fabric solutions.

Bringing Scale and Automation to the Enterprise

At Arrcus, we continuously strive to address our customers’ need to deploy and manage high-quality open networking infrastructure with a focus on agility, flexibility, and scalability at low ongoing operational costs. As such, it is my privilege to introduce our support for a new BGP EVPN/VXLAN solution to the networking world.

With this EVPN launch, Arrcus will support:

  • EVPN fabrics with a traditional IGP as the underlay and BGP EVPN/VXLAN (IRB and L3) for the multi-tenant overlay
  • EVPN fabrics with a new hybrid, BGP-based transport underlay option called Link State Vector Routing (LSVR)
  • BGP EVPN with a link neighbor discovery and liveness protocol called Link State over Ethernet (LSoE), which is being standardized in the IETF

Link State Vector Routing as the Transport Underlay

Link State Vector Routing (LSVR) (https://datatracker.ietf.org/wg/lsvr/about/) is a collaborative effort between vendors and customers in the IETF to simplify large-scale data center fabric designs. It adds another element towards building operationally simple, low-cost fabrics (both IP Clos and multi-tenant BGP EVPN).

LSVR augments BGP by replacing its path vector algorithm with Dijkstra's Shortest Path First (SPF) algorithm. As a result, it replaces all the phases of the existing BGP best path decision process. It also introduces a new BGP-LS-SPF SAFI (Subsequent Address Family Identifier) within BGP, with its own BGP NLRI (Network Layer Reachability Information) constructs for carrying IPv4 and IPv6 link information. Any routes computed as part of the BGP-LS-SPF SAFI are installed in the appropriate IPv4/IPv6 tables in the RIB (Routing Information Base) and FIB (Forwarding Information Base). The preference and priority given to these routes are significantly higher than those given to traditional BGP and IGP routes.
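To make the shift from path-vector selection to SPF concrete, here is a minimal Python sketch of Dijkstra's algorithm run over a small leaf-spine link-state database. It is an illustration only, not the ArcOS or IETF implementation: the `spf` function, the node names, and the unit link costs are all hypothetical, and the links are plain in-memory tuples rather than BGP-LS-SPF NLRI. The sketch also records every equal-cost first hop, since Clos fabrics rely on ECMP.

```python
import heapq
from collections import defaultdict


def spf(links, source):
    """Run Dijkstra from `source` over undirected (node_a, node_b, cost) links.

    Returns {node: (cost, first_hops)}, where first_hops is the set of
    equal-cost next hops out of `source`. Assumes all costs are positive.
    """
    graph = defaultdict(list)
    for a, b, cost in links:
        graph[a].append((b, cost))
        graph[b].append((a, cost))

    dist = {source: 0}
    first_hops = {source: set()}
    heap = [(0, source)]
    done = set()

    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for neigh, cost in graph[node]:
            nd = d + cost
            # A direct neighbor of the source is its own first hop;
            # otherwise inherit the first hops used to reach `node`.
            hops = {neigh} if node == source else first_hops[node]
            if neigh not in dist or nd < dist[neigh]:
                dist[neigh] = nd
                first_hops[neigh] = set(hops)
                heapq.heappush(heap, (nd, neigh))
            elif nd == dist[neigh]:
                first_hops[neigh] |= hops  # equal-cost path: add ECMP hops

    return {n: (dist[n], first_hops[n]) for n in dist}


# Hypothetical two-leaf, two-spine fabric with unit link costs.
links = [
    ("leaf1", "spine1", 1),
    ("leaf1", "spine2", 1),
    ("leaf2", "spine1", 1),
    ("leaf2", "spine2", 1),
]
routes = spf(links, "leaf1")
for node, (cost, hops) in sorted(routes.items()):
    print(node, cost, sorted(hops))
```

In this toy fabric, leaf1 reaches leaf2 at cost 2 through both spines, which is the kind of result an SPF-based underlay would hand to the RIB in place of a single path-vector best path.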

These modifications give BGP the option, among other peering models, to connect with route reflectors or route controllers that are not in the forwarding path. The changes also allow BGP to be deployed as the only routing protocol in any kind of Clos deployment and thereby help retain the operator infrastructure already built to manage the network.

Using centralized route controllers helps with inserting alternate paths for fast convergence, traffic engineering, efficient monitoring, and many other applications where centralized command and control is needed within a network.

Summary

The explosive demand for data-intensive applications requires today’s network infrastructure to be faster, smarter, and better. As this network transformation gains steam, operators are demanding simplicity and high programmability to quickly deliver applications and services. Yet, the existing fabric solutions are complex with separate protocols for overlay and underlay. With our BGP EVPN/VXLAN solution combined with LSVR, enterprises and service providers can now build scalable, automated, multi-tenant data center fabrics while minimizing their operational expenses.

Network Different with Arrcus!