NVO3 working group                                           A. Ghanwani
Internet Draft                                                      Dell
Intended status: Standards Track                               L. Dunbar
Expires: December 2014                                            Huawei
                                                               V. Bannai
                                                                  Paypal
                                                             R. Krishnan
                                                                 Brocade
                                                            June 6, 2014

    Framework of Supporting Applications Specific Multicast in NVO3
               draft-ghanwani-nvo3-app-mcast-framework-00

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, except to publish it as an RFC and to translate it into languages other than English.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

This Internet-Draft will expire on December 6, 2014.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.

Ghanwani, et al.       Expires December 6, 2014                [Page 1]
Internet-Draft   Framework of App multicast in NVO3          June 2014

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Abstract

This draft discusses a framework for supporting application-specific multicast traffic, i.e., multicast/broadcast traffic that is not related to ARP/ND, in a network that uses Network Virtualization using Overlays over Layer 3 (NVO3). It describes the various mechanisms that can be used for delivering application-specific multicast traffic in such networks, together with the considerations that apply to each.

Table of Contents

1. Introduction
2. Conventions used in this document
3. Multicast mechanisms in networks that use NVO3
   3.1. No multicast support
   3.2. Replication at the source NVE
   3.3. Replication at a multicast service node
   3.4. IP multicast in the underlay
   3.5. Other schemes
4. Simultaneous use of more than one mechanism
5. Summary
6. Security Considerations
7. IANA Considerations
8. References
   8.1. Normative References
   8.2. Informative References
9. Acknowledgments

1. Introduction

Network Virtualization using Overlays over Layer 3 (NVO3) is a technology that is used to address issues that arise in building
large, multitenant data centers that make extensive use of server virtualization [PS].

This draft discusses a framework for supporting application-specific multicast traffic, i.e., multicast/broadcast traffic that is not related to ARP/ND, in a network that uses Network Virtualization using Overlays over Layer 3 (NVO3). It describes the various mechanisms that can be used for delivering application-specific multicast traffic in such networks, together with the considerations that apply to each.

Application-specific multicast traffic, whether Source-Specific Multicast (SSM) or Any-Source Multicast (ASM), has the following characteristics:

1. All participants in the group want to be aware of all other participants, and

2. The list of participants is not known in advance.

Therefore, the NVA cannot obtain the list of participants for each multicast group ahead of time.

The reader is assumed to be familiar with the terminology defined in the NVO3 Framework document [FW] and the NVO3 Architecture document [NVO3-ARCH].

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC2119].

In this document, these words will appear with that interpretation only when in ALL CAPS. Lower case uses of these words are not to be interpreted as carrying RFC-2119 significance.

ASM: Any-Source Multicast. Any host may transmit to a group, and receivers accept traffic from any source, without restriction on the location of end-user computers or VMs; any receiving host can also become a transmission source.

Application Specific Multicast: includes Source-Specific Multicast (SSM), Any-Source Multicast (ASM), and any other multicast traffic that is not derived from the ARP/ND protocols.
SSM: Source-Specific Multicast is a method of delivering multicast packets in which the only packets that are delivered to a receiver are those originating from a specific source address requested by the receiver. By so limiting the source, SSM reduces demands on the network and improves security. SSM requires that the receiver specify the source address and explicitly excludes the use of the (*,G) join for all multicast groups. Specifying the source is possible only with IGMPv3 [RFC 3376] for IPv4 and MLDv2 for IPv6.

3. Multicast mechanisms in networks that use NVO3

In NVO3 environments, traffic between NVEs is transported using a tunnel encapsulation such as VXLAN [VXLAN], NVGRE [NVGRE], or STT [STT].

Besides the need to support the Address Resolution Protocol (ARP) and Neighbor Discovery (ND), there are several applications that require the support of multicast and/or broadcast in data centers [DC-MC]. With NVO3, there are many possible ways that multicast may be handled in such networks. We discuss some of the attributes of the following four methods, but other methods are also possible.

1. No multicast support.

2. Replication at the source NVE.

3. Replication at a multicast service node.

4. IP multicast in the underlay.

These mechanisms are briefly mentioned in the NVO3 Framework [FW] and NVO3 Architecture [NVO3-ARCH] documents. This document fills in more detail about how each mechanism works and discusses the issues and tradeoffs of each.

3.1. No multicast support

In this scenario, there is no support whatsoever for multicast traffic when using the overlay. This can only work if the following conditions are met:

1. All of the traffic is unicast.
In other words, there is no application-specific multicast traffic in the network, and the only multicast/broadcast traffic comes from the ARP/ND protocols and from flooding of frames with an unknown MAC destination address.

2. A network virtualization authority (NVA) is used by the NVEs to determine the mapping from a target's MAC/IP address to its egress NVE. In other words, there is no data plane learning, and address resolution requests via ARP/ND that are issued by the VMs must be resolved by the NVE that they are attached to.

With this approach, certain multicast/broadcast applications such as DHCP can be supported by use of a helper function in the NVE.

The main issues that need to be addressed with this mechanism are the handling of hosts for which a mapping does not already exist in the NVA, and the handling of hosts that participate in application-specific multicast. These issues can be particularly challenging if such end systems are reachable through more than one NVE.

3.2. Replication at the source NVE

With this method, the overlay attempts to provide a multicast service without requiring any specific support from the underlay, other than that of a unicast service. A multicast or broadcast transmission is achieved by replicating the packet at the source NVE, making one copy for each destination NVE that the multicast packet must be sent to.

For this mechanism to work, the source NVE must know, a priori, the IP addresses of all destination NVEs that need to receive the packet. For example, for a specific multicast group, the source NVE must know the IP addresses of all the remote NVEs where there are members of the tenant subnet and multicast group in question.

In addition, the NVE may need to support the IGMP/MLD snooping function, i.e., to listen in on the IGMP/MLD conversation between hosts and routers. By listening to these conversations, the NVEs can maintain a map of which hosts need which IP multicast streams.
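The replication step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a specified NVO3 mechanism: the class name SourceNveReplicator, its methods, the use of plain strings for NVE addresses, and the reduction of tunnel encapsulation to a tuple are all hypothetical choices made for this example.

```python
from collections import defaultdict


class SourceNveReplicator:
    """Hypothetical model of head-end replication at a source NVE."""

    def __init__(self, local_nve_ip):
        self.local_nve_ip = local_nve_ip
        # (tenant_id, group) -> set of remote NVE IPs that have receivers
        self.remote_nves = defaultdict(set)

    def add_receiver(self, tenant_id, group, nve_ip):
        """Record that an NVE has a member of (tenant, group).

        Assumption: this is fed by an NVA or by snooped IGMP/MLD reports.
        """
        if nve_ip != self.local_nve_ip:
            self.remote_nves[(tenant_id, group)].add(nve_ip)

    def remove_receiver(self, tenant_id, group, nve_ip):
        """Forget an NVE once its last member of (tenant, group) leaves."""
        self.remote_nves[(tenant_id, group)].discard(nve_ip)

    def replicate(self, tenant_id, group, payload):
        """Return one unicast-encapsulated copy per destination NVE."""
        destinations = sorted(self.remote_nves.get((tenant_id, group), ()))
        return [(self.local_nve_ip, dst, tenant_id, payload) for dst in destinations]


nve = SourceNveReplicator("192.0.2.1")
for remote in ("192.0.2.2", "192.0.2.3", "192.0.2.1"):
    nve.add_receiver(tenant_id=10, group="239.1.1.1", nve_ip=remote)

copies = nve.replicate(10, "239.1.1.1", b"data")
print(len(copies))  # 2: the local NVE is not a tunnel destination
```

The number of copies grows linearly with the number of remote NVEs that have group members, which is the cost this section goes on to discuss.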
In some environments, it might be necessary to prevent hosts that are not in a multicast group from receiving that group's traffic.

The obvious drawback of this method is that multiple copies of the same packet traverse any links that are common to the paths to the destination NVEs. If, for example, a tenant subnet is spread across 50 NVEs, the packet would have to be replicated 50 times at the source NVE. This also affects the forwarding performance of the NVE, especially if it is implemented in software.

When it is necessary to prevent hosts under an NVE that are not in an application-specific multicast group from receiving the multicast traffic, the NVE needs to maintain the multicast group membership.

Note that this method is similar to what was used in VPLS [VPLS] prior to extensive support of MPLS multicast [MPLS-MC]. If an MPLS PE is equated with an NVO3 NVE, there are some similarities between an MPLS VPN and the NVO3 overlay. However, there are some key differences:

- The client attachment to VPN PEs is relatively static, whereas in a DC that allows VMs to migrate anywhere, the attachment of VMs to NVEs can change.

- The number of PEs to which one VPN client is attached in an MPLS VPN environment is normally smaller than the number of NVEs to which a DC client's VMs are attached.

When a VPN client has multiple multicast groups, "Multicast VPN" [RFC6513] combines all of those multicast groups within each VPN client into one single multicast group in the MPLS (or VPN) core. As a result, all messages from any multicast group belonging to one VPN client reach all of the PE nodes of that client; that is, any message belonging to any multicast group under Client XX reaches all PEs of Client XX. When Client XX has only a handful of PEs, not much bandwidth is wasted in the core.
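The cost of the aggregation described above can be estimated with simple arithmetic. The sketch below is illustrative only; the function name and the sample numbers are assumptions made for this example and do not come from [RFC6513].

```python
def unwanted_deliveries(pes_in_vpn, pes_with_receivers):
    """PEs that receive a group's traffic without having any receiver for it.

    Assumes every group of the client is mapped onto one core group that
    reaches all of the client's PEs.
    """
    if pes_with_receivers > pes_in_vpn:
        raise ValueError("PEs with receivers cannot exceed PEs in the VPN")
    return pes_in_vpn - pes_with_receivers


print(unwanted_deliveries(5, 3))     # 2
print(unwanted_deliveries(200, 10))  # 190
```

With a handful of attachment points the waste is small; it grows with the number of attachment points that have no receivers for the group.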
In a DC environment, a typical vSwitch may support only 10 to 20 VMs. In the worst case, a subnet with 200 VMs may be spread across 200 vSwitches. Using the "MPLS VPN multicast" approach would require creating a multicast group in the core for this client network that reaches all 200 NVEs. If only a small percentage of this client's VMs participate in application-specific multicast, a great number of NVEs will receive multicast traffic that should not be forwarded to their attached VMs.

Therefore, the Multicast VPN solution may not scale in a DC environment, with its dynamic attachment of virtual networks to NVEs and its greater number of NVEs for each virtual network.

3.3. Replication at a multicast service node

With this method, all multicast packets would be sent using a unicast tunnel encapsulation to a multicast service node. The multicast service node, in turn, would create multiple copies of the packet and would deliver a copy, using a unicast tunnel encapsulation, to each of the NVEs that are part of the multicast group for which the packet is intended.

This mechanism is similar to that used by the ATM Forum's LAN Emulation specification [LANE].

The multicast service nodes can obtain the membership information for each multicast group in the following ways:

- By exchanging information with the TSs' multicast routers.

- By snooping the IGMP/MLD messages between TSs and their routers to build the membership table for each multicast group.

Unlike the method described in Section 3.2, there is no performance impact at the ingress NVE, nor are there any issues with multiple copies of the same packet from the source NVE to the multicast service node. However, there remain issues with multiple copies of the same packet on links that are common to the paths from the multicast service node to each of the egress NVEs.
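The shift in replication load can be made concrete by counting packet copies. The sketch below is illustrative only: the function names and the simplified model (each node has one uplink, and every copy a node sends crosses that uplink) are assumptions made for this example.

```python
def copies_source_replication(num_egress_nves):
    """Copies sent per packet when the ingress NVE replicates (Section 3.2)."""
    return {"ingress_nve_uplink": num_egress_nves, "service_node_uplink": 0}


def copies_service_node(num_egress_nves):
    """Copies sent per packet when a multicast service node replicates.

    The ingress NVE sends a single unicast copy to the service node,
    which then sends one unicast copy to each egress NVE.
    """
    return {"ingress_nve_uplink": 1, "service_node_uplink": num_egress_nves}


for n in (5, 50):
    print(n, copies_source_replication(n), copies_service_node(n))
```

The replication load does not disappear; it moves from the ingress NVE to the service node.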
Additional issues that are introduced with this method include the availability of the multicast service node, methods to scale the services offered by the multicast service node, and the sub-optimality of the delivery paths.

Finally, the IP address of the source NVE must be preserved in packet copies created at the multicast service node if data plane learning is in use. This could create problems if IP source address reverse path forwarding (RPF) checks are in use.

3.4. IP multicast in the underlay

In this method, the underlay supports IP multicast and the ingress NVE encapsulates the packet with the appropriate IP multicast address in the tunnel encapsulation header for delivery to the desired set of NVEs. The protocol in the underlay could be any variant of Protocol Independent Multicast (PIM), or protocol dependent multicast, such as [ISIS-Multicast].

If an NVE connects to its attached TSs via an IP network, then the NVE needs to support the interworking between the tenant networks' multicast protocols and the underlay multicast protocols.

If an NVE connects to its attached TSs via a Layer 2 network, there are multiple ways for NVEs to support application-specific multicast:

- The NVE supports only the basic IGMP/MLD snooping function and lets the TSs' routers handle the application-specific multicast. This scheme does not utilize the underlay IP multicast protocols.

- The NVE acts as a pseudo multicast router for the directly attached VMs and maps IGMP/MLD messages to the messages needed by the underlay IP multicast protocols.

This method has none of the issues of the method described in Section 3.2.

With PIM Sparse Mode (PIM-SM), the number of flows required would be (n*g), where n is the number of source NVEs that source packets for the group, and g is the number of groups.
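The flow count can be worked through with a short calculation. This is a sketch for illustration: the function names and the sample values of n and g are assumptions, and the BIDIR-PIM figure of g flows, which the text turns to next, is included only for comparison.

```python
def pim_sm_flows(source_nves_per_group, groups):
    """Flows for PIM-SM: one per (source NVE, group) pair, i.e. n * g."""
    return source_nves_per_group * groups


def bidir_pim_flows(groups):
    """Flows for BIDIR-PIM: one shared tree per group, i.e. g."""
    return groups


n, g = 20, 1000  # illustrative values only
print(pim_sm_flows(n, g))  # 20000
print(bidir_pim_flows(g))  # 1000
```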
Bidirectional PIM (BIDIR-PIM) would offer better scalability, with the number of flows required being g.

In the absence of any additional mechanism, e.g., using an NVA for address resolution, optimal delivery would require a separate group for each tenant, plus a separate group for each multicast address (used for multicast applications) within a tenant.

An additional consideration is that only the low-order bits of the outer IP multicast address are mapped to the outer MAC address (23 bits for IPv4, 32 bits for IPv6), so if there is equipment that prunes multicasts at Layer 2, there will be some aliasing. Finally, a mechanism to efficiently provision such addresses for each group would be required.

There are additional optimizations which are possible, but they come with their own restrictions. For example, a set of tenants may be restricted to some subset of NVEs and they could all share the same outer IP multicast group address. This, however, introduces a problem of sub-optimal delivery: even if a particular tenant within the group of tenants does not have a presence on one of the NVEs on which another tenant does, the former's multicast packets would still be delivered to that NVE. It also introduces an additional network management burden to optimize which tenants should be part of the same tenant group (based on the NVEs they share). This somewhat dilutes the value proposition of NVO3, which is to completely decouple the overlay and the physical network design, allowing complete freedom of placement of VMs anywhere within the data center.

3.5. Other schemes

There are still other mechanisms that may be used that attempt to combine some of the advantages of the above methods by offering multiple replication points, each with a limited degree of replication [EDGE-REP].
Such schemes offer a trade-off between performing some of the replication at an intermediate node (router) and performing all of the replication at the source NVE or all of the replication at a multicast service node.

4. Simultaneous use of more than one mechanism

While the mechanisms in the previous section have been discussed individually, it is possible for implementations to rely on more than one of them. For example, the method of Section 3.1 could be used for minimizing ARP/ND, while at the same time, multicast applications may be supported by one, or a combination of, the other methods. For small multicast groups, the methods of source NVE replication or the use of a multicast service node may be attractive, while for larger multicast groups, the use of multicast in the underlay may be preferable.

5. Summary

This document has identified various mechanisms for supporting application-specific multicast in networks that use NVO3. It highlights the basics of each mechanism and some of the issues with them. As solutions are developed, the protocols will need to consider the use of these mechanisms, including their co-existence. The document also highlights some of the requirements for supporting multicast applications in an NVO3 network.

6. Security Considerations

This draft does not introduce any new security considerations beyond what may be present in proposed solutions.

7. IANA Considerations

This document requires no IANA actions. RFC Editor: Please remove this section before publication.

8. References

8.1. Normative References

[PS] Narten, T. et al., "Problem statement: Overlays for network virtualization", work in progress, July 2013.

[FW] Lasserre, M. et al., "Framework for DC network virtualization", work in progress, January 2014.

[NVO3-ARCH] Narten, T. et al., "An Architecture for Overlay Networks (NVO3)", work in progress, February 2014.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC 3376] Cain, B. et al., "Internet Group Management Protocol, Version 3", RFC 3376, October 2002.

[RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs", RFC 6513, February 2012.

8.2. Informative References

[VXLAN] Mahalingam, M. et al., "VXLAN: A framework for overlaying virtualized Layer 2 networks over Layer 3 networks", work in progress.

[NVGRE] Sridharan, M. et al., "NVGRE: Network virtualization using Generic Routing Encapsulation", work in progress.

[STT] Davie, B. and Gross, J., "A stateless transport tunneling protocol for network virtualization", work in progress.

[DC-MC] McBride, M. and Lui, H., "Multicast in the data center overview", work in progress.

[ISIS-Multicast] Yong, L. et al., "ISIS Protocol Extension For Building Distribution Trees", work in progress, October 2013.

[VPLS] Lasserre, M. and Kompella, V. (Eds), "Virtual Private LAN Service (VPLS) using Label Distribution Protocol (LDP) signaling", RFC 4762, January 2007.

[MPLS-MC] Aggarwal, R. et al., "Multicast in VPLS", work in progress.

[LANE] "LAN emulation over ATM", The ATM Forum, af-lane-0021.000, January 1995.

[EDGE-REP] Marques, P. et al., "Edge multicast replication for BGP IP VPNs", work in progress, June 2012.

9. Acknowledgments

This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

Anoop Ghanwani
Dell
Email: anoop@alumni.duke.edu

Linda Dunbar
Huawei Technologies
5340 Legacy Drive, Suite 1750
Plano, TX 75024, USA
Phone: (469) 277 5840
Email: ldunbar@huawei.com

Vinay Bannai
Paypal
Email: vbannai@paypal.com

Ram Krishnan
Brocade
Email: ramk@brocade.com