Benchmarking Virtual Switches in the Open Platform for NFV (OPNFV)

Maryam Tahhan, Intel <maryam.tahhan@intel.com>
Billy O'Mahony, Intel <billy.o.mahony@intel.com>
Al Morton, AT&T Labs, 200 Laurel Avenue South, Middletown, NJ 07748,
United States of America, +1 732 420 1571, +1 732 368 1192,
<acmorton@att.com>

This memo describes the contributions of the Open Platform for NFV
(OPNFV) project on Virtual Switch Performance (VSPERF), particularly in
the areas of test setups and configuration parameters for the system
under test. This project has extended the current and completed work of
the Benchmarking Methodology Working Group in the IETF and references
existing literature. The Benchmarking Methodology Working Group has
traditionally conducted laboratory characterization of dedicated
physical implementations of internetworking functions. Therefore, this
memo describes the additional considerations when virtual switches are
implemented on general-purpose hardware. The expanded tests and
benchmarks are also influenced by the OPNFV mission to support
virtualization of the "telco" infrastructure.

The Benchmarking Methodology Working Group (BMWG) has traditionally
conducted laboratory characterization of dedicated physical
implementations of internetworking functions. The black-box benchmarks
of throughput, latency, forwarding rates, and others have served our
industry for many years. Now, Network Function Virtualization (NFV) has
the goal of transforming how internetwork functions are implemented and
therefore has garnered much attention.

A virtual switch (vSwitch) is an important aspect of the NFV
infrastructure; it provides connectivity between and among physical
network functions and virtual network functions. As a result, there are
many vSwitch benchmarking efforts but few specifications to guide the
many new test design choices. This is a complex problem and an
industry-wide work in progress. In the future, several of BMWG's
fundamental specifications will likely be updated as more testing
experience helps to form consensus around new methodologies, and BMWG
should continue to collaborate with all organizations that share the
same goal.

This memo describes the contributions of the Open Platform for NFV
(OPNFV) project on Virtual Switch Performance (VSPERF) characterization
through the Danube 3.0 (fourth) release [Danube] to
the chartered work of the BMWG (with stable references to their test
descriptions). This project has extended the current and completed work
of the BMWG in the IETF and references existing literature. For
example, the most often referenced RFC is [RFC2544] (which depends on
[RFC1242]), so the foundation of the benchmarking work in
OPNFV is common and strong. The recommended extensions are specifically
in the areas of test setups and configuration parameters for the system
under test.

See [RFC8172] for more background and the OPNFV website for general
information [OPNFV].

The authors note that OPNFV distinguishes itself from other open
source compute and networking projects through its emphasis on existing
"telco" services as opposed to cloud computing. There are many ways in
which telco requirements have different emphasis on performance
dimensions when compared to cloud computing: support for and transfer of
isochronous media streams is one example.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.

For the purposes of this document, the following abbreviations
apply:

   ACPI     Advanced Configuration and Power Interface
   BIOS     Basic Input/Output System
   BMWG     Benchmarking Methodology Working Group
   CPDP     Control-Path Data-Path
   CPU      Central Processing Unit
   DIMM     Dual Inline Memory Module
   DPDK     Data Plane Development Kit
   DUT      Device Under Test
   IMIX     Internet Mix (of frame sizes)
   IPPM     IP Performance Metrics
   LTD      Level Test Design
   NFV      Network Function Virtualization
   NIC      Network Interface Card
   NUMA     Non-Uniform Memory Access
   OPNFV    Open Platform for NFV
   OS       Operating System
   PCI      Peripheral Component Interconnect
   PDV      Packet Delay Variation
   SR-IOV   Single Root I/O Virtualization
   SUT      System Under Test
   VM       Virtual Machine
   VNF      Virtual Network Function
   vNIC     Virtual NIC
   vSwitch  Virtual Switch
   VSPERF   Virtual Switch Performance (OPNFV project)

The primary purpose and scope of the memo is to describe key aspects
of vSwitch benchmarking, particularly in the areas of test setups and
configuration parameters for the system under test, and extend the body
of extensive BMWG literature and experience. Initial feedback indicates
that many of these extensions may be applicable beyond this memo's
current scope (to hardware switches in the NFV infrastructure and to
virtual routers, for example). Additionally, this memo serves as a
vehicle to include more detail and relevant commentary from BMWG and
other open source communities under BMWG's chartered work to
characterize the NFV infrastructure.

The benchmarking covered in this memo should be applicable to many
types of vSwitches and remain vSwitch agnostic to a great degree. There
has been no attempt to track and test all features of any specific
vSwitch implementation.

This section highlights some specific considerations (from
[RFC8172]) related to benchmarks for virtual switches. The
OPNFV project is sharing its present view on these areas as they develop
their specifications in the Level Test Design (LTD) document as defined
by [IEEE829].

To compare the performance of virtual designs and implementations
with their physical counterparts, identical benchmarks are needed.
BMWG has developed specifications for many physical network functions.
The BMWG has recommended reusing existing benchmarks and methods in
[RFC8172], and the OPNFV LTD expands on them as
described here. A key configuration aspect for vSwitches is the number
of parallel CPU cores required to achieve comparable performance with
a given physical device or whether some limit of scale will be reached
before the vSwitch can achieve the comparable performance level.

It is unlikely that the virtual switch will be the only application
running on the SUT, so CPU utilization, cache utilization, and
memory footprint should also be recorded for the virtual
implementations of internetworking functions. However, internally
measured metrics such as these are not benchmarks; they may be
useful for the audience (e.g., operations) to know and may also be
useful if there is a problem encountered during testing.
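For illustration only (this is not part of any LTD test), the
following minimal sketch shows one way to sample such auxiliary
metrics for a vSwitch process on a Linux host.  It assumes the
Python "psutil" package; the PID, duration, and interval are
placeholders, and cache utilization is omitted because it requires
hardware performance counters (e.g., via "perf stat").

    # Sketch: periodically sample CPU utilization and memory
    # footprint of the vSwitch process during a test run.  These are
    # auxiliary observations, not benchmarks.
    import time
    import psutil  # assumed dependency

    def sample_vswitch_metrics(pid, duration_s=60, interval_s=1.0):
        """Return (timestamp, cpu_percent, rss_bytes) samples."""
        proc = psutil.Process(pid)
        proc.cpu_percent(None)        # prime the per-process counter
        samples = []
        t_end = time.time() + duration_s
        while time.time() < t_end:
            time.sleep(interval_s)
            samples.append((time.time(),
                            proc.cpu_percent(None),  # % since last call
                            proc.memory_info().rss))  # resident bytes
        return samples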
Benchmark comparability between virtual and physical/hardware
implementations of equivalent functions will likely place more
detailed and exact requirements on the "testing systems" (in terms
of stream generation, algorithms to search for maximum values, and
their configurations). This is another area for standards
development to appreciate; however, this is a topic for a future
document.

External observations remain essential as the basis for benchmarks.
Internal observations with a fixed specification and interpretation
will be provided in parallel to assist the development of operations
procedures when the technology is deployed.

A key consideration when conducting any sort of benchmark is trying
to ensure the consistency and repeatability of test results. When
benchmarking the performance of a vSwitch, there are many factors that
can affect the consistency of results; one key factor is matching the
various hardware and software details of the SUT. This section lists
some of the many new parameters that this project believes are
critical to report in order to achieve repeatability.

It has been the goal of the project to produce repeatable results,
and a large set of the parameters believed to be critical is provided
so that the benchmarking community can better appreciate the increase
in configuration complexity inherent in this work. The parameter set
below is assumed sufficient for the infrastructure in use by the
VSPERF project to obtain repeatable results from test to test.

Hardware details (platform, processor, memory, and network)
including:

- BIOS version, release date, and any configurations that were
  modified
- Power management at all levels (ACPI sleep states, processor
  package, OS, etc.)
- CPU microcode level
- Number of enabled cores
- Number of cores used for the test
- Memory information (type and size)
- Memory DIMM configurations (quad-rank performance may not be the
  same as dual-rank) in size, frequency, and slot locations
- Number of physical NICs and their details (manufacturer, versions,
  type, and the PCI slot they are plugged into)
- NIC interrupt configuration (and any special features in use)
- PCI configuration parameters (payload size, early ACK option,
  etc.)

Software details including:

- OS RunLevel
- OS version (for host and VNF)
- Kernel version (for host and VNF)
- GRUB boot parameters (for host and VNF)
- Hypervisor details (type and version)
- Selected vSwitch, version number, or commit ID used
- vSwitch launch command line if it has been parameterized
- Memory allocation to the vSwitch
- Which NUMA node it is using and how many memory channels
- DPDK or any other software dependency version number or commit ID
  used
- Memory allocation to a VM - if it's from Hugepages/elsewhere
- VM storage type - snapshot, independent persistent, independent
  non-persistent
- Number of VMs
- Number of virtual NICs (vNICs) - versions, type, and driver
- Number of virtual CPUs and their core affinity on the host
- Number of vNICs and their interrupt configurations
- Thread affinitization for the applications (including the vSwitch
  itself) on the host
- Details of resource isolation, such as CPUs designated for
  Host/Kernel (isolcpu) and CPUs designated for specific processes
  (taskset)

Test traffic information:

- Test duration
- Number of flows
- Traffic type - UDP, TCP, and others
- Frame sizes - fixed or IMIX (note that with [IEEE802.1AC], frames
  may be longer than 1500 bytes and up to 2000 bytes)
- Deployment scenario - defines the communications path in the SUT
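As an illustration of automated parameter capture (not a normative
requirement), the sketch below reads a few of the items above from a
Linux host so that they can be archived with the results.  The
sysfs/procfs paths are assumptions about a typical Linux SUT, and a
complete report would cover every listed item.

    # Sketch: capture a subset of the SUT parameters listed above
    # into a JSON report that accompanies the benchmark results.
    import json
    import platform
    from pathlib import Path

    def read_sysfs(path):
        try:
            return Path(path).read_text().strip()
        except OSError:
            return "unknown"

    sut_report = {
        "kernel_version": platform.release(),
        "bios_version":
            read_sysfs("/sys/devices/virtual/dmi/id/bios_version"),
        "bios_date":
            read_sysfs("/sys/devices/virtual/dmi/id/bios_date"),
        "grub_cmdline": read_sysfs("/proc/cmdline"),
        "online_cpus": read_sysfs("/sys/devices/system/cpu/online"),
    }
    print(json.dumps(sut_report, indent=2))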
Virtual switches group packets into flows by processing and
matching particular packet or frame header information, or by matching
packets based on the input ports. Thus, a flow can be thought of as a
sequence of packets that have the same set of header field values or
have arrived on the same physical or logical port. Performance results
can vary based on the parameters the vSwitch uses to match for a flow.
The recommended flow classification parameters for any vSwitch
performance tests are: the input port (physical or logical), the
source MAC address, the destination MAC address, the source IP
address, the destination IP address, and the Ethernet protocol type
field (although classification may take place on other fields, such as
source and destination transport port numbers). It is essential to
increase the flow timeout time on a vSwitch before conducting any
performance tests that do not intend to measure the flow setup time
(see Section 3 of [RFC2889]). Normally, the first
packet of a particular stream will install the flow in the virtual
switch, which introduces additional latency; subsequent packets of the
same flow are not subject to this latency if the flow is already
installed on the vSwitch.
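To make the classification model concrete, the following is a
minimal, illustrative sketch (not drawn from any particular vSwitch
implementation) of the recommended flow key and of the first-packet
miss that incurs the flow setup latency described above; the
"install_flow" helper and the parsed-header fields are hypothetical.

    # Sketch: flow classification on the recommended parameters.
    # Packets with equal keys belong to the same flow.
    from collections import namedtuple

    FlowKey = namedtuple("FlowKey", ["in_port", "src_mac", "dst_mac",
                                     "src_ip", "dst_ip", "eth_type"])

    flow_table = {}   # FlowKey -> forwarding action

    def classify(in_port, hdr):
        key = FlowKey(in_port, hdr["src_mac"], hdr["dst_mac"],
                      hdr["src_ip"], hdr["dst_ip"], hdr["eth_type"])
        if key not in flow_table:
            # First packet of a flow: slow path, adds setup latency.
            flow_table[key] = install_flow(key)  # hypothetical helper
        return flow_table[key]                   # fast path afterward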
This outline describes the measurement of baselines with isolated
resources at a high level, which is the intended approach at this
time.

Baselines:

- Optional: Benchmark platform forwarding capability without a
  vSwitch or VNF for at least 72 hours (serves as a means of
  platform validation and a means to obtain the base performance for
  the platform in terms of its maximum forwarding rate and latency).
- Benchmark VNF forwarding capability with direct connectivity
  (vSwitch bypass, e.g., SR-IOV) for at least 72 hours (serves as a
  means of VNF validation and a means to obtain the base performance
  for the VNF in terms of its maximum forwarding rate and latency).
  The metrics gathered from this test will serve as a key comparison
  point for vSwitch bypass technologies performance and vSwitch
  performance.
- Benchmarking with isolated resources alone and with other
  resources (both hardware and software) disabled; for example,
  vSwitch and VM are SUT.
- Benchmarking with isolated resources alone, thus leaving some
  resources unused.
- Benchmarking with isolated resources and all resources occupied.

Next Steps:

- Limited sharing
- Production scenarios
- Stressful scenarios
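As a worked illustration of how the baselines above would be used
(all rates below are invented for the example), the platform and
vSwitch-bypass measurements provide the reference points against
which the vSwitch path is expressed:

    # Sketch: relate a vSwitch measurement to the two baselines.
    # Illustrative maximum forwarding rates in frames per second.
    platform_fps = 14_880_000  # PHY-PHY, no vSwitch or VNF
    bypass_fps   = 12_000_000  # VNF with vSwitch bypass (SR-IOV)
    vswitch_fps  =  7_500_000  # PHY-VM-PHY through the vSwitch

    print(f"vSwitch path: "
          f"{vswitch_fps / platform_fps:.0%} of platform baseline, "
          f"{vswitch_fps / bypass_fps:.0%} of bypass baseline")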
The overall specification in preparation is referred to as a Level
Test Design (LTD) document, which will contain a suite of
performance tests. The base performance tests in the LTD are based
on the pre-existing specifications developed by the BMWG to test the
performance of physical switches. These specifications include:

- Benchmarking Methodology for Network Interconnect Devices
  [RFC2544]
- Benchmarking Methodology for LAN Switching [RFC2889]
- Device Reset Characterization [RFC6201]
- Packet Delay Variation Applicability Statement [RFC5481]

The two most recent RFCs above ([RFC6201] and [RFC5481]) are being
applied in benchmarking for the first time
and represent a development challenge for test equipment developers.
Fortunately, many members of the testing system community have engaged
on the VSPERF project, including an open source test system.

In addition to this, the LTD also reuses the terminology defined by:

- Benchmarking Terminology for LAN Switching Devices [RFC2285]

It is recommended that these references be included in future
benchmarking specifications:

- Methodology for IP Multicast Benchmarking [RFC3918]
- Packet Reordering Metrics [RFC4737]

As one might expect, the most fundamental internetworking
characteristics of throughput and latency remain important when the
switch is virtualized, and these benchmarks figure prominently in the
specification.

When considering characteristics important to "telco" network
functions, additional performance metrics are needed. In this case, the
project specifications have referenced metrics from the IETF IP
Performance Metrics (IPPM) literature. This means that the latency test
described in [RFC2544] is replaced by measurement of a metric
derived from IPPM's [RFC2679], where a set of
statistical summaries will be provided (mean, max, min, and
percentiles). Further metrics planned to be benchmarked include packet
delay variation as defined by [RFC5481], reordering,
burst behaviour, DUT availability, DUT capacity, and packet loss in
long-term testing at the throughput level, where some low level of
background loss may be present and characterized.
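As a small worked example (the sample values are invented), the
statistical summaries and the PDV population can be derived from a
set of one-way delay measurements as follows; per [RFC5481], PDV
takes the minimum delay of the stream as its reference.

    # Sketch: summary statistics and PDV from one-way delay samples
    # (microseconds).
    import statistics

    delays_us = [21.0, 23.5, 22.1, 30.2, 21.4, 25.9, 21.2, 44.7]

    d_min = min(delays_us)
    summary = {
        "min": d_min,
        "mean": statistics.mean(delays_us),
        "max": max(delays_us),
        "p99": statistics.quantiles(delays_us, n=100)[98],
    }
    pdv_us = [d - d_min for d in delays_us]  # RFC 5481 PDV population
    print(summary, max(pdv_us))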
Tests have been designed to collect the metrics below:

- Throughput tests are designed to measure the maximum forwarding
rate (in frames per second, fps) and bit rate (in Mbps) for a
constant load (as defined by [RFC1242]) without traffic loss; a
sketch of the usual search procedure appears after this list.
- Packet and frame-delay distribution tests are designed to measure
the average minimum and maximum packet (and/or frame) delay for
constant loads.
- Packet delay tests are designed to understand latency
distribution for different packet sizes and to uncover outliers over
an extended test run.
- Scalability tests are designed to understand how the virtual
switch performs with an increasing number of flows, number of active
ports, configuration complexity of the forwarding logic, etc.
- Stream performance tests (with TCP or UDP) are designed to
measure bulk data transfer performance, i.e., how fast systems can
send and receive data through the switch.
- Control-path and data-path coupling tests are designed to
understand how closely the data path and the control path are
coupled, as well as the effect of this coupling on the performance
of the DUT (for example, delay of the initial packet of a flow).
- CPU and memory consumption tests are designed to understand the
virtual switch's footprint on the system and are conducted as
auxiliary measurements with the benchmarks above. They include CPU
utilization, cache utilization, and memory footprint.
- The so-called "soak" tests, where the selected test is conducted
over a long period of time (with an ideal duration of 24 hours but
only long enough to determine that stability issues exist when
found; there is no requirement to continue a test when a DUT
exhibits instability over time). The key performance characteristics
and benchmarks for a DUT are determined (using short duration tests)
prior to conducting soak tests. The purpose of soak tests is to
capture transient changes in performance, which may occur due to
infrequent processes, memory leaks, or the low-probability
coincidence of two or more processes. The stability of the DUT is
the paramount consideration, so performance must be evaluated
periodically during continuous testing, and this results in use of
frame rate metrics [RFC2889] instead of throughput [RFC2544] (which
requires stopping traffic to allow time for all traffic to exit
internal queues), for example.
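The sketch below, referenced from the throughput item above, shows
the usual zero-loss search as a simple binary search over the
offered load; "run_trial" stands in for the traffic generator and is
hypothetical, returning the number of frames lost during a trial at
the offered rate.

    # Sketch: RFC 2544-style zero-loss throughput search (binary
    # search over offered load, in frames per second).
    def throughput_search(run_trial, line_rate_fps,
                          resolution_fps=1000):
        lo, hi, best = 0.0, float(line_rate_fps), 0.0
        while hi - lo > resolution_fps:
            rate = (lo + hi) / 2
            if run_trial(rate) == 0:  # no loss: try a higher rate
                best, lo = rate, rate
            else:                     # loss observed: back off
                hi = rate
        return best                   # highest rate with zero loss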
Additional test specification development should include:

- Request/response performance tests (with TCP or UDP), which
  measure the transaction rate through the switch.
- Noisy neighbor tests, in order to understand the effects of
  resource sharing on the performance of a virtual switch.
- Tests derived from examination of ETSI NFV Draft GS IFA003
  requirements [IFA003] on characterization of acceleration
  technologies applied to vSwitches.
The flexibility of deployment of a virtual switch within a network
means that it is necessary to characterize the performance of a
vSwitch in various deployment scenarios. The deployment scenarios
under consideration are illustrated in a set of figures available on
the VSPERF "Test Methodology" wiki page [TestMethodology].

This section organizes the many existing test specifications into the
"3x3" matrix (introduced in ). Because the LTD
specification ID names are quite long, this section is organized into
lists for each occupied cell of the matrix (not all are occupied; also,
the matrix has grown to 3x4 to accommodate scale metrics when displaying
the coverage of many metrics/benchmarks). The current version of the LTD
specification is available; see [LTD].

The tests listed below assess the activation of paths in the data
plane rather than the control plane.

A complete list of tests with short summaries is available on the
VSPERF "LTD Test Spec Overview" wiki page [LTDOverview].

Speed of Activation:

- Activation.RFC2889.AddressLearningRate
- PacketLatency.InitialPacketProcessingLatency

Accuracy of Activation:

- CPDP.Coupling.Flow.Addition

Reliability of Activation:

- Throughput.RFC2544.SystemRecoveryTime
- Throughput.RFC2544.ResetTime

Scale of Activation:

- Activation.RFC2889.AddressCachingCapacity

Speed of Operation:

- Throughput.RFC2544.PacketLossRate
- Stress.RFC2544.0PacketLoss
- Throughput.RFC2544.PacketLossRateFrameModification
- Throughput.RFC2544.BackToBackFrames
- Throughput.RFC2889.MaxForwardingRate
- Throughput.RFC2889.ForwardPressure
- Throughput.RFC2889.BroadcastFrameForwarding

Accuracy of Operation:

- Throughput.RFC2544.WorstN-BestN
- Throughput.Overlay.Network.<tech>.RFC2544.PacketLossRatio

Reliability of Operation:

- Throughput.RFC2889.ErrorFramesFiltering
- Throughput.RFC2544.Profile

Scale of Operation:

- Throughput.RFC2889.Soak
- Throughput.RFC2889.SoakFrameModification
- PacketDelayVariation.RFC3393.Soak
- Scalability.RFC2544.0PacketLoss
- MemoryBandwidth.RFC2544.0PacketLoss.Scalability
- Scalability.VNF.RFC2544.PacketLossProfile
- Scalability.VNF.RFC2544.PacketLossRatio
Security Considerations

Benchmarking activities as described in this memo are limited to
technology characterization of a Device Under Test/System Under Test
(DUT/SUT) using controlled stimuli in a laboratory environment with
dedicated address space and the constraints specified in the
sections above.

The benchmarking network topology will be an independent test setup
and MUST NOT be connected to devices that may forward the test traffic
into a production network or misroute traffic to the test management
network.

Further, benchmarking is performed on a "black-box" basis and relies
solely on measurements observable external to the DUT/SUT.

Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
benchmarking purposes. Any implications for network security arising
from the DUT/SUT SHOULD be identical in the lab and in production
networks.

Informative References

[VirtNetPerf]    "Benchmarking Methodology for Virtualization
                 Network Performance", Work in Progress.  As the
                 virtual network has been widely established in IDC,
                 the performance of the virtual network has become a
                 valuable consideration to the IDC managers.  This
                 draft introduces a benchmarking methodology for
                 virtualization network performance based on a
                 virtual switch.

[TestMethodology]
                 OPNFV, "Test Methodology", VSPERF wiki page.

[LTDOverview]    OPNFV, "LTD Test Spec Overview", VSPERF wiki page.

[LTD]            OPNFV, "VSPERF Level Test Design (LTD)".

[Danube]         OPNFV, "Danube" release.

[VSPERFhome]     OPNFV, "VSPERF Home".

[OPNFV]          OPNFV, main website.

[IFA003]         ETSI, "Network Functions Virtualisation (NFV);
                 Acceleration Technologies; vSwitch Benchmarking and
                 Acceleration Specification".

[IEEE802.1AC]    IEEE, "IEEE Standard for Local and metropolitan
                 area networks -- Media Access Control (MAC) Service
                 Definition".

[IEEE829]        IEEE, "IEEE Standard for Software and System Test
                 Documentation".

Acknowledgements

The authors appreciate and acknowledge comments from Scott Bradner,
Marius Georgescu, Ramki Krishnan, Doug Montgomery, Martin Klozik,
Christian Trautman, Benoit Claise, and others for their reviews.

We also acknowledge the early work in [VirtNetPerf] and useful
discussion with the authors.