- add netmap (FreeBSD header files need to be updated with this) - move prototype perl scripts to prototype/ folder - create basic structure for sipcap app (no code yet)
417 lines
15 KiB
Plaintext
417 lines
15 KiB
Plaintext
EXPERIMENTING WITH NETMAP, VALE AND FAST QEMU
|
|
---------------------------------------------
|
|
|
|
To ease experiments with Netmap, the VALE switch and our Qemu enhancements
|
|
we have prepared a couple of bootable images (linux and FreeBSD).
|
|
You can find them on the netmap page
|
|
|
|
http://info.iet.unipi.it/~luigi/netmap/
|
|
|
|
where you can also look at more recent versions of this file.
|
|
|
|
Below are step-by-step instructions on experiments you can run
|
|
with these images. The two main versions are
|
|
|
|
picobsd.hdd -> FreeBSD HEAD (netmap + VALE)
|
|
tinycore.hdd -> Linux (qemu + netmap + VALE)
|
|
|
|
Booting the image
|
|
-----------------
|
|
For all experiments you need to copy the image on a USB stick
|
|
and boot a PC with it. Alternatively, you can use the image
|
|
with VirtualBox, Qemu or other emulators, as an example
|
|
|
|
qemu-system-x86_64 -hda IMAGE_FILE -m 1G -machine accel=kvm ...
|
|
|
|
(remove 'accel=kvm' if your host does not support kvm).
|
|
The images do not install anything on the hard disk.
|
|
|
|
Both systems have preloaded drivers for a number of network cards
|
|
(including the intel 10 Gbit ones) with netmap extensions.
|
|
The VALE switch is also available (it is part of the netmap module).
|
|
ssh, scp and a few other utilities are also included.
|
|
|
|
FreeBSD image:
|
|
|
|
+ the OS boots directly in console mode, you can switch
|
|
between terminals with ALT-Fn.
|
|
The password for the 'root' account is 'setup'
|
|
|
|
+ if you are connected to a network, you can use
|
|
dhclient em0 # or other interface name
|
|
to obtain an IP address and external connectivity.
|
|
|
|
Linux image:
|
|
|
|
+ in addition to the netmap/VALE modules, the KVM kernel module
|
|
is also preloaded.
|
|
|
|
+ the boot-loader gives you two main options (each with
|
|
a variant to delay boot in case you have slow devices):
|
|
|
|
+ "Boot TinyCore"
|
|
boots in an X11 environment as user 'tc'.
|
|
You can create a few terminals using the icon at the
|
|
bottom. You can use "sudo -s" to get root access.
|
|
In case no suitable video card is available/detected,
|
|
it falls back to command line mode.
|
|
|
|
+ "Boot Core (command line only)"
|
|
boots in console mode with virtual terminals.
|
|
You're automatically logged in as user 'tc'.
|
|
To log in the other terminals use the same username
|
|
(no password required).
|
|
|
|
+ The system should automatically recognize the existing ethernet
|
|
devices, and load the appropriate netmap-capable device drivers
|
|
when available. Interfaces are configured through DHCP when possible.
|
|
|
|
|
|
General test recommendations
|
|
----------------------------
|
|
NOTE: The tests outlined in the following sections can generate very high
|
|
packet rates, and some hardware misconfiguration problems may prevent
|
|
you from achieving maximum speed.
|
|
Common problems are:
|
|
|
|
+ slow link autonegotiation.
|
|
Our programs typically wait 2-4 seconds for
|
|
link negotiation to complete, but some NIC/switch combinations
|
|
are much slower. In this case you should increase the delay
|
|
(pkt-gen has the -w XX option for that) or possibly force
|
|
the link speed and duplex mode on both sides.
|
|
|
|
Check the link speed to make sure there are no nogotiation
|
|
problems, and that you see the expected speed.
|
|
|
|
ethtool IFNAME # on linux
|
|
ifconfig IFNAME # on FreeBSD
|
|
|
|
+ ethernet flow control.
|
|
If the receiving port is slow (often the case in presence
|
|
of multicast/broadcast traffic, or also unicast if you are
|
|
sending to non-netmap receivers), it will generate ethernet
|
|
flow control frames that throttle down the sender.
|
|
|
|
We recommend to disable BOTH RX and TX ethernet flow control
|
|
on BOTH sender and receiver.
|
|
On Linux this can be done with ethtool:
|
|
|
|
ethtool -A IFNAME tx off rx off
|
|
|
|
whereas on FreeBSD there are device-specific sysctl
|
|
|
|
sysctl dev.ix.0.queue0.flow_control = 0
|
|
|
|
+ CPU power saving.
|
|
The CPU governor on linux, or equivalent in FreeBSD, tend to
|
|
throttle down the clock rate reducing performance.
|
|
Unlike other similar systems, netmap does not have busy-wait
|
|
loops, so the CPU load is generally low and this can trigger
|
|
the clock slowdown.
|
|
|
|
Make sure that ALL CPUs run at maximum speed, possibly
|
|
disabling the dynamic frequency-scaling mechanisms.
|
|
|
|
cpufreq-set -gperformance # on linux
|
|
|
|
sysctl dev.cpu.0.freq=3401 # on FreeBSD.
|
|
|
|
+ wrong MAC address
|
|
netmap does not put the NIC in promiscuous mode, so unless the
|
|
application does it, the NIC will only receive broadcast traffic or
|
|
unicast directed to its own MAC address.
|
|
|
|
|
|
STANDARD SOCKET TESTS
|
|
---------------------
|
|
For most socket-based experiments you can use the "netperf" tool installed
|
|
on the system (version 2.6.0). Be careful to use a matching version for
|
|
the other netperf endpoint (e.g. netserver) when running tests between
|
|
different machines.
|
|
|
|
Interesting experiments are:
|
|
|
|
netperf -H x.y.z.w -tTCP_STREAM # test TCP throughput
|
|
netperf -H x.y.z.w -tTCP_RR # test latency
|
|
netperf -H x.y.z.w -tUDP_STREAM -- -m8 # test UDP throughput with short packets
|
|
|
|
where x.y.z.w is the host running "netserver".
|
|
|
|
|
|
RAW SOCKET AND TAP TESTS
|
|
------------------------
|
|
For experiments with raw sockets and tap devices you can use the l2
|
|
utilities (l2open, l2send, l2recv) installed on the system.
|
|
With these utilities you can send/receive custom network packets
|
|
to/from raw sockets or tap file descriptors.
|
|
|
|
The receiver can be run with one of the following commands
|
|
|
|
l2open -r IFNAME l2recv # receive from a raw socket attached to IFNAME
|
|
l2open -t IFNAME l2recv # receive from a file descriptor opened on the tap IFNAME
|
|
|
|
The receiver process will wait indefinitely for the first packet
|
|
and then keep receiving as long as packets keep coming. When the
|
|
flow stops (after a 2 seconds timeout) the process terminates and
|
|
prints the received packet rate and packet count.
|
|
|
|
To run the sender in an easy way, you can use the script l2-send.sh
|
|
in the home directory. This script defines several shell variables
|
|
that can be manually changed to customize the test (see
|
|
the comments in the script itself).
|
|
|
|
As an example, you can test configurations with Virtual
|
|
Machines attached to host tap devices bridged together.
|
|
|
|
|
|
Tests using the Linux in-kernel pktgen
|
|
--------------------------------------
|
|
To use the Linux in-kernel packet generator, you can use the
|
|
script "linux-pktgen.sh" in the home directory.
|
|
The pktgen creates a kernel thread for each hardware TX queue
|
|
of a given NIC.
|
|
|
|
By manually changing the script shell variable definitions you
|
|
can change the test configuration (e.g. addresses in the generated
|
|
packet). Please change the "NCPU" variable to match the number
|
|
of CPUs on your machine. The script has an argument which
|
|
specifies the number of NIC queues (i.e. kernel threads)
|
|
to use minus one.
|
|
|
|
For example:
|
|
|
|
./linux-pktgen.sh 2 # Uses 3 NIC queues
|
|
|
|
When the script terminates, it prints the per-queue rates and
|
|
the total rate achieved.
|
|
|
|
|
|
NETMAP AND VALE EXPERIMENTS
|
|
---------------------------
|
|
|
|
For most experiments with netmap you can use the "pkt-gen" command
|
|
(do not confuse it with the Linux in-kernel pktgen), which has a large
|
|
number of options to send and receive traffic (also on TAP devices).
|
|
|
|
pkt-gen normally generates UDP traffic for a specific IP address
|
|
and using the brodadcast MAC address
|
|
|
|
Netmap testing with network interfaces
|
|
--------------------------------------
|
|
|
|
Remember that you need a netmap-capable driver in order to use
|
|
netmap on a specific NIC. Currently supported drivers are e1000,
|
|
e1000e, ixgbe, igb. For updated information please visit
|
|
http://info.iet.unipi.it/~luigi/netmap/
|
|
|
|
Before running pkt-gen, make sure that the link is up.
|
|
|
|
Run pkt-gen on an interface called "IFNAME":
|
|
|
|
pkt-gen -i IFNAME -f tx # run a pkt-gen sender
|
|
pkt-gen -i IFNAME -f rx # run a pkt-gen receiver
|
|
|
|
pkt-gen without arguments will show other options, e.g.
|
|
+ -w sec modifies the wait time for link negotioation
|
|
+ -l len modifies the packet size
|
|
+ -d, -s set the IP destination/source addresses and ports
|
|
+ -D, -S set the MAC destination/source addresses
|
|
|
|
and more.
|
|
|
|
Testing the VALE switch
|
|
------------------------
|
|
|
|
To use the VALE switch instead of physical ports you only need
|
|
to change the interface name in the pkt-gen command.
|
|
As an example, on a single machine, you can run senders and receivers
|
|
on multiple ports of a VALE switch as follows (run the commands into
|
|
separate terminals to see the output)
|
|
|
|
pkt-gen -ivale0:01 -ftx # run a sender on the port 01 of the switch vale0
|
|
pkt-gen -ivale0:02 -frx # receiver on the port 02 of same switch
|
|
pkt-gen -ivale0:03 -ftx # another sender on the port 03
|
|
|
|
The VALE switches and ports are created (and destroyed) on the fly.
|
|
|
|
|
|
Transparent connection of physical ports to the VALE switch
|
|
-----------------------------------------------------------
|
|
|
|
It is also possible to use a network device as a port of a VALE
|
|
switch. You can do this with the following command:
|
|
|
|
vale-ctl -h vale0:eth0 # attach interface "eth0" to the "vale0" switch
|
|
|
|
To detach an interface from a bridge:
|
|
|
|
vale-ctl -d vale0:eth0 # detach interface "eth0" from the "vale0" switch
|
|
|
|
These operations can be issued at any moment.
|
|
|
|
|
|
Tests with our modified QEMU
|
|
----------------------------
|
|
|
|
The Linux image also contains our modified QEMU, with the VALE backend and
|
|
the "e1000-paravirt" frontend (a paravirtualized e1000 emulation).
|
|
|
|
After you have booted the image on a physical machine (so you can exploit
|
|
KVM), you can boot the same image a second time (recursively) with QEMU.
|
|
Therefore, you can run all the tests above also from within the virtual
|
|
machine environment.
|
|
|
|
To make VM testing easier, the home directory contains some
|
|
some useful scripts to set up and launch VMs on the physical machine.
|
|
|
|
+ "prep-taps.sh"
|
|
creates and sets up two permanent tap interfaces ("tap01" and "tap02")
|
|
and a Linux in-kernel bridge. The tap interfaces are then bridged
|
|
together on the same bridge. The bridge interface ("br0"), is given
|
|
the address 10.0.0.200/24.
|
|
|
|
This setup can be used to make two VMs communicate through the
|
|
host bridge, or to test the speed of a linux switch using
|
|
l2open
|
|
|
|
+ "unprep-taps.sh"
|
|
undoes the above setup.
|
|
|
|
+ "launch-qemu.sh"
|
|
can be used to run QEMU virtual machines. It takes four arguments:
|
|
|
|
+ The first argument can be "qemu" or "kvm", depending on
|
|
whether we want to use the standard QEMU binary translation
|
|
or the hardware virtualization acceleration.
|
|
|
|
+ The third argument can be "--tap", "--netuser" or "--vale",
|
|
and tells QEMU what network backend to use: a tap device,
|
|
the QEMU user networking (slirp), or a VALE switch port.
|
|
|
|
+ When the third argument is "--tap" or "--vale", the fourth
|
|
argument specifies an index (e.g. "01", "02", etc..) which
|
|
tells QEMU what tap device or VALE port to use as backend.
|
|
|
|
You can manually modify the script to set the shell variables that
|
|
select the type of emulated device (e.g. e1000, virtio-net-pci, ...)
|
|
and related options (ioeventfd, virtio vhost, e1000 mitigation, ....).
|
|
|
|
The default setup has an "e1000" device with interrupt mitigation
|
|
disabled.
|
|
|
|
You can try the paravirtualized e1000 device ("e1000-paravirt")
|
|
or the "virtio-net" device to get better performance. However, bear
|
|
in mind that these paravirtualized devices don't have netmap support
|
|
(whereas the standard e1000 does have netmap support).
|
|
|
|
Examples:
|
|
|
|
# Run a kvm VM attached to the port 01 of a VALE switch
|
|
./launch-qemu.sh kvm --vale 01
|
|
|
|
# Run a kvm VM attached to the port 02 of the same VALE switch
|
|
./launch-qemu.sh kvm --vale 02
|
|
|
|
# Run a kvm VM attached to the tap called "tap01"
|
|
./launch-qemu.sh kvm --tap 01
|
|
|
|
# Run a kvm VM attached to the tap called "tap02"
|
|
./launch-qemu.sh kvm --tap 02
|
|
|
|
|
|
Guest-to-guest tests
|
|
--------------------
|
|
|
|
If you run two VMs attached to the same switch (which can be a Linux
|
|
bridge or a VALE switch), you can run guest-to-guest experiments.
|
|
|
|
All the tests reported in the previous sections are possible (normal
|
|
sockets, raw sockets, pkt-gen, ...), indipendently of the backend used.
|
|
|
|
In the following examples we assume that:
|
|
|
|
+ Each VM has an ethernet interface called "eth0".
|
|
|
|
+ The interface of the first VM is given the IP 10.0.0.1/24.
|
|
|
|
+ The interface of the second VM is given the IP 10.0.0.2/24.
|
|
|
|
+ The Linux bridge interface "br0" on the host is given the
|
|
IP 10.0.0.200/24.
|
|
|
|
Examples:
|
|
|
|
[1] ### Test UDP short packets over traditional sockets ###
|
|
# On the guest 10.0.0.2 run
|
|
netserver
|
|
# on the guest 10.0.0.1 run
|
|
netperf -H10.0.0.2 -tUDP_STREAM -- -m8
|
|
|
|
[2] ### Test UDP short packets with pkt-gen ###
|
|
# On the guest 10.0.0.2 run
|
|
pkt-gen -ieth0 -frx
|
|
# On the guest 10.0.0.1 run
|
|
pkt-gen -ieth0 -ftx
|
|
|
|
[3] ### Test guest-to-guest latency ###
|
|
# On the guest 10.0.0.2 run
|
|
netserver
|
|
# On the guest 10.0.0.1 run
|
|
netperf -H10.0.0.2 -tTCP_RR
|
|
|
|
Note that you can use pkt-gen into a VM only if the emulated ethernet
|
|
device is supported by netmap. The default emulated device is
|
|
"e1000", which has netmap support. If you try to run pkt-gen on
|
|
an unsupported device, pkt-gen will not work, reporting that it is
|
|
unable to register the interface.
|
|
|
|
|
|
Guest-to-host tests (follows from the previous section)
|
|
-------------------------------------------------------
|
|
|
|
If you run only a VM on your host machine, you can measure the
|
|
network performance between the VM and the host machine. In this
|
|
case the experiment setup depends on the backend you are using.
|
|
|
|
With the tap backend, you can use the bridge interface "br0" as a
|
|
communication endpoint. You can run normal/raw sockets experiments,
|
|
but you cannot use pkt-gen on the "br0" interface, since the Linux
|
|
bridge interface is not supported by netmap.
|
|
|
|
Examples with the tap backend:
|
|
|
|
[1] ### Test TCP throughput over traditional sockets ###
|
|
# On the host run
|
|
netserver
|
|
# on the guest 10.0.0.1 run
|
|
netperf -H10.0.0.200 -tTCP_STREAM
|
|
|
|
[2] ### Test UDP short packets with pkt-gen and l2 ###
|
|
# On the host run
|
|
l2open -r br0 l2recv
|
|
# On the guest 10.0.0.1 run (xx:yy:zz:ww:uu:vv is the
|
|
# "br0" hardware address)
|
|
pkt-gen -ieth0 -ftx -d10.0.0.200:7777 -Dxx:yy:zz:ww:uu:vv
|
|
|
|
|
|
With the VALE backend you can perform only UDP tests, since we don't have
|
|
a netmap application which implements a TCP endpoint: pkt-gen generates
|
|
UDP packets.
|
|
As a communication endpoint on the host, you can use a virtual VALE port
|
|
opened on the fly by a pkt-gen instance.
|
|
|
|
Examples with the VALE backend:
|
|
|
|
[1] ### Test UDP short packets ###
|
|
# On the host run
|
|
pkt-gen -ivale0:99 -frx
|
|
# On the guest 10.0.0.1 run
|
|
pkt-gen -ieth0 -ftx
|
|
|
|
[2] ### Test UDP big packets (receiver on the guest) ###
|
|
# On the guest 10.0.0.1 run
|
|
pkt-gen -ieth0 -frx
|
|
# On the host run pkt-gen -ivale0:99 -ftx -l1460
|
|
|