Hpcbench - High Performance Networks Benchmarking

Usage and options of Hpcbench

Hpcbench includes four packages, each of which can work independently. The UDP and TCP benchmarks run as client/server pairs, and you should start the server process before the client process. The MPI benchmark requires an MPI implementation. These three tools measure end-to-end performance in a network. Sysmon is a Linux-based system resource monitoring tool that works like vmstat/iostat but reports more information about network statistics.

[ UDP Communication Test ]

Compilation creates two executables, udpserver and udptest, in the directory. Start the server process first, unless udptest (the client) runs as a UDP traffic generator, which needs no server. The command line options:

UDP server usage: udpserver [options]

$ udpserver [-v] [-p port]

[-p port] Port number for TCP listening (0 picked by system), 5678 by default.
[-v] Verbose mode. Disabled by default.

UDP client usage: udptest -h host [options]

$ udptest -h host [-vacegiP] [-p port] [-A rtt-size] [-b buffer] [-B buffer] [-m message] [-q qos] [-l datagram] [-d data] [-t time] [-r repeat] [-o output] [-T throughput]

[-a] UDP Round Trip Time (RTT or latency) test.
[-A rtt-size] UDP RTT (latency) test with the specified message size.
[-b buffer] Client UDP buffer size in bytes. Uses the system default if not defined.
[-B buffer] Server UDP buffer size in bytes. The same as the client's by default.
[-c] CPU log option. Traces system information during the test. Only available when an output file is defined.
[-d data-size] Data size of each read/write in bytes. The same as the packet size by default.
[-e] Exponential test (the data size of each send increases from 1 byte to the packet size).
[-g] UDP traffic generator (keeps sending data to a host). Works without server support.
[-h host] Hostname or IP address of the UDP server. Must be specified.
[-i] Bidirectional UDP throughput test. The default is a unidirectional stream test.
[-l datagram] UDP datagram (packet) size in bytes ( < udp-buffer-size ). 1460 by default.
[-m message] Total message size in bytes. 1 MByte by default.
[-o output] Output file name.
[-p port] Port number of the UDP server. 5678 by default.
[-P] Write a plot file for gnuplot. Only enabled when an output file is specified.
[-q qos] Define the TOS field of IP packets. Six predefined values can be used for this setting:
- 1: (IPTOS)-Minimize delay
- 2: (IPTOS)-Maximize throughput
- 3: (DiffServ)-Class1 with low drop probability
- 4: (DiffServ)-Class1 with high drop probability
- 5: (DiffServ)-Class4 with low drop probability
- 6: (DiffServ)-Class4 with high drop probability
[-r repeat] Repetition of tests. 10 by default.
[-t time] Test time constraint in seconds. 5 by default.
[-T throughput] Throughput constraint for the UDP generator or throughput test. Unlimited by default.
[-v] Verbose mode. Disabled by default.

NOTE: Input values (except -T) accept the suffixes "kKmM": 1K=1024, 1M=1024x1024. For the throughput constraint option (-T): 1K=1000, 1M=1000000.
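The suffix rule in the note above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic, not Hpcbench's own parser; the function name `parse_size` and its `throughput` flag are hypothetical.

```python
def parse_size(text: str, throughput: bool = False) -> int:
    """Convert a value such as '20k' or '10M' to an integer.

    Assumption: sizes use binary multipliers (1K=1024) and the -T throughput
    constraint uses decimal multipliers (1K=1000), as the note describes.
    """
    base = 1000 if throughput else 1024
    multipliers = {"k": base, "m": base * base}
    text = text.strip()
    suffix = text[-1:].lower()          # case-insensitive: k, K, m, M
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)                    # plain number, no suffix


print(parse_size("20k"))                    # 20480
print(parse_size("10m"))                    # 10485760
print(parse_size("10M", throughput=True))   # 10000000
```

So "-l 20k" would mean 20480 bytes, while "-T 10M" would mean 10000000.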

With the plot option (-P), when an "output" file is specified, an "output.plot" file is also created for plotting. Use "gnuplot output.plot" to plot the data. With the CPU option (-c), when an "output" file is specified, the "output.c_log" and "output.s_log" files store the system information of the client and server, respectively.

The UDP Round Trip Time (latency) test is simply a UDP version of "ping". A single RTT is too short to measure reliably in HPC environments, so the RTT test is repeated many times and the average RTT is reported.
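The averaging idea can be illustrated with a small UDP echo over the loopback interface. This is a rough sketch in Python, not the Hpcbench implementation; the names `echo_server` and `average_rtt`, the loopback address, and the iteration count are all assumptions made for the example.

```python
import socket
import threading
import time


def echo_server(sock: socket.socket, count: int) -> None:
    """Echo `count` datagrams back to whoever sent them."""
    for _ in range(count):
        data, addr = sock.recvfrom(65535)
        sock.sendto(data, addr)


def average_rtt(iterations: int = 1000, size: int = 64) -> float:
    """Return the mean round-trip time in seconds over `iterations` echoes."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))       # port 0: let the system pick a port
    server.settimeout(5.0)
    address = server.getsockname()
    worker = threading.Thread(
        target=echo_server, args=(server, iterations), daemon=True
    )
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(5.0)
    payload = b"x" * size
    try:
        # Time the whole batch once instead of timing each tiny round trip.
        start = time.perf_counter()
        for _ in range(iterations):
            client.sendto(payload, address)
            client.recvfrom(65535)
        elapsed = time.perf_counter() - start
    finally:
        client.close()
        worker.join(timeout=5.0)
        server.close()
    return elapsed / iterations


if __name__ == "__main__":
    print(f"average RTT: {average_rtt() * 1e6:.1f} microseconds")
```

Timing the batch as a whole keeps the clock resolution from dominating the result, which is the reason for repeating the test.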

A UDP throughput test finishes only when both conditions are satisfied: the message size has been sent AND the test time has elapsed. The amount of data actually sent can therefore exceed the message size you specify if the test time is long.
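A small simulation makes the two-condition rule concrete. It is a sketch under stated assumptions: the function name, the constant sending rate, and the idea of deriving elapsed time from bytes sent are all hypothetical and only model the behaviour described above.

```python
def simulate_throughput_test(message_size, test_time, datagram_size, rate):
    """Model a test that stops only when BOTH limits have been reached.

    message_size  -- total bytes requested (like -m)
    test_time     -- time constraint in seconds (like -t)
    datagram_size -- bytes per send (like -l)
    rate          -- assumed constant sending rate in bytes per second
    Returns (bytes actually sent, elapsed seconds).
    """
    sent = 0
    elapsed = 0.0
    # Keep sending while either condition is still unmet.
    while sent < message_size or elapsed < test_time:
        sent += datagram_size
        elapsed = sent / rate
    return sent, elapsed


# Defaults from the option list (1 MByte, 5 s, 1460 bytes) at an assumed
# 10,000,000 bytes per second: the time limit dominates.
sent, elapsed = simulate_throughput_test(1024 * 1024, 5, 1460, 10_000_000)
print(sent)   # 50000620 bytes, far more than the 1048576 requested
```

With these numbers the 1 MByte requirement is met almost immediately, and the test keeps sending until the 5 seconds are used up.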

In UDP throughput tests, the message size (-m option) specifies the total amount of data to be sent. Messages are actually sent in small pieces (defined by the -d option) that must not exceed the datagram (packet) size. In exponential tests, the send size increases exponentially from 1 byte to the datagram (packet) size, while in fixed-size tests each send is always the same size as the datagram (packet). Most systems limit a UDP datagram (packet) to a maximum of 64KB.
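The sequence of send sizes in an exponential test might look like the following. This is a guess at the progression, assuming the size doubles each step and the final step is clamped to the datagram size when that size is not a power of two; the real tool may handle the last step differently.

```python
def exponential_sizes(limit: int) -> list:
    """Return send sizes 1, 2, 4, ... ending at `limit` bytes.

    Assumption: doubling steps, with the last step clamped to `limit`.
    """
    sizes = []
    size = 1
    while size < limit:
        sizes.append(size)
        size *= 2
    sizes.append(limit)     # final step is the datagram size itself
    return sizes


print(exponential_sizes(1460))
# [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 1460]
```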

The UDP traffic generator keeps sending UDP packets to a remote host, which does not need to be running the server. It is best to pick an unused port for this test. You can limit the sending throughput with the -T option. Be aware that this test may affect the target host's performance.
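To get a feel for what a throughput constraint implies, the pacing arithmetic can be worked out as below. This is a back-of-the-envelope sketch, assuming the -T value is in bits per second (as in the "10Mbps" example further down) and counting payload bits only; the function names are hypothetical and the real generator may pace differently.

```python
def send_interval(datagram_bytes: int, rate_bps: float) -> float:
    """Seconds between sends needed to hold `rate_bps` bits per second."""
    return datagram_bytes * 8 / rate_bps


def datagrams_in(duration_s: float, datagram_bytes: int, rate_bps: float) -> int:
    """Whole datagrams that fit in `duration_s` seconds at `rate_bps`."""
    return int(duration_s * rate_bps // (datagram_bytes * 8))


# 1460-byte datagrams at 10,000,000 bits per second for 30 seconds.
print(send_interval(1460, 10_000_000))      # about 0.001168 s between sends
print(datagrams_in(30, 1460, 10_000_000))   # 25684
```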

If the CPU and system monitoring option (-c) is defined, the CPU and memory usage of both client and server (a maximum of 8 CPUs is supported on SMP systems), the network interface statistics, and the interface's interrupts to each CPU are recorded. Currently this option is only available on Linux systems.

Examples

1. Start server process
[server] $ udpserver

2. Start client process

Example 1: [client] $ udptest -ah server
UDP Round Trip Time (latency) test.

Example 2: [client] $ udptest -h server
UDP throughput test with default set of parameters (Port: 5678, test-time: 5, test-repeat: 10, Message-size: 1Mbytes, packet-size: 1460, send-size: 1460)

Example 3: [server] $ udpserver -p 3000
[client] $ udptest -vh server -p 3000 -b 1M -m 10m -l 20k -t 2 -r 20 -o output.txt
Repeat the throughput test 20 times using communication port 3000; store results in "output.txt"; buffer-size: 1MB, message-size: 10MB, test-time: 2 seconds, packet-size: 20KB

Example 4: [client] $ udptest -eP -h server -b 100k -o output.txt
Exponential throughput test for buffer size of 100KB, writing output and plot file.

Example 5: [client] $ udptest -gh abc.com -T 10M -t 30
Keep sending UDP data to the target host at 10Mbps throughput for 30 seconds.

[ TCP Communication Test ]

Compilation creates two executables, tcpserver and tcptest, in the directory. Start the server process first. The command line options:

TCP server usage: tcpserver [options]

$ tcpserver [-v] [-p port]

[-p port] Port number for TCP listening (0 picked by system), 5677 by default.
[-v] Verbose mode. Disabled by default.

TCP client usage: tcptest -h host [options]

$ tcptest -h host [-vanicCNP] [-p port] [-A rtt-size] [-e exponent] [-b buffer] [-B buffer] [-q qos] [-M MSS] [-d data] [-m message] [-r repeat] [-t time] [-f sendfile] [-I iteration] [-o output]

[-a] Test the TCP Round Trip Time (RTT). All other options are ignored if this is defined.
[-A test-size] TCP RTT test with the specified message size.
[-b buffer-size] TCP buffer (window) size in bytes. System default if not defined.
[-B buffer] Server TCP buffer size in bytes. The same as the client's by default.
[-c] CPU log option. Traces system information during the test. Only available when an output file is defined.
[-C] Turn on the socket's TCP_CORK option (avoid sending partial frames). Disabled by default.
[-d data-size] Data size of each read/write in bytes. The same as the packet size by default.
[-e n] Exponential tests with message size increasing exponentially from 1 to 2^n.
[-f sendfile] Sendfile test. Memory mapping is used to reduce the workload. Disabled by default.
[-h host-name] Hostname or IP address of the server. Must be specified.
[-i] Bidirectional TCP throughput test. The default is a unidirectional stream test.
[-I iteration] Number of send/receive iterations for each test. Auto-determined by default.
[-m message-size] Message size in bytes. 65536 by default.
[-M MSS-size] Maximum Segment Size in bytes (MTU-40 for TCP). System default if not defined.
[-n] Non-blocking communication. Blocking communication by default.
[-N] Turn on the socket's TCP_NODELAY option (disable the Nagle algorithm). Disabled by default.
[-o output] Output file name.
[-p port-number] Server's port number. 5677 by default.
[-P] Write a plot file for gnuplot. Only enabled when an output file is specified.
[-q qos] Define the TOS field of IP packets. Six predefined values can be used for this setting:
- 1: (IPTOS)-Minimize delay
- 2: (IPTOS)-Maximize throughput
- 3: (DiffServ)-Class1 with low drop probability
- 4: (DiffServ)-Class1 with high drop probability
- 5: (DiffServ)-Class4 with low drop probability
- 6: (DiffServ)-Class4 with high drop probability
[-r repeat] Repetition of tests. 10 by default.
[-t test-time] Test time in seconds. Disabled if iteration is specified. 5 by default.
[-v] Verbose mode. Disabled by default.

NOTE: Input values accept the suffixes "kKmM": 1k=1024, 1M=1024x1024.
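Several of the options above correspond to ordinary socket options. The Python sketch below shows how a client socket could be prepared before connecting; it is an illustration only, not Hpcbench's code. The helper names are hypothetical, and the mapping of the -q settings 1 to 6 onto the specific TOS byte values shown (standard IPTOS and DiffServ AF code points) is an assumption.

```python
import socket

# Standard TOS byte values. That Hpcbench maps -q settings 1..6 to exactly
# these code points is an assumption made for this sketch.
QOS_TOS = {
    1: 0x10,  # IPTOS: minimize delay
    2: 0x08,  # IPTOS: maximize throughput
    3: 0x28,  # DiffServ AF11: class 1, low drop probability
    4: 0x38,  # DiffServ AF13: class 1, high drop probability
    5: 0x88,  # DiffServ AF41: class 4, low drop probability
    6: 0x98,  # DiffServ AF43: class 4, high drop probability
}


def mss_for_mtu(mtu: int) -> int:
    """MSS = MTU - 40 (20-byte IPv4 header + 20-byte TCP header, no options)."""
    return mtu - 40


def make_tcp_socket(buffer_size=None, nodelay=False, cork=False,
                    qos=None, mss=None):
    """Create an unconnected TCP socket with the requested options applied."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if buffer_size is not None:                     # like -b
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_size)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_size)
    if nodelay:                                     # like -N
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    if cork and hasattr(socket, "TCP_CORK"):        # like -C (Linux only)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    if qos is not None:                             # like -q
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, QOS_TOS[qos])
    if mss is not None:                             # like -M, before connect()
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)
    return sock


print(mss_for_mtu(1500))    # 1460, the usual Ethernet case
```

Setting the options before connect() matters mainly for the buffer size and MSS, since both influence what is negotiated when the connection is established.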

The TCP RTT (latency) test is simply a TCP version of "ping". A single RTT is too short to measure reliably in HPC environments, so the RTT test is repeated many times and the average RTT is reported. In the TCP tests, the message size (-m option) specifies the amount of data to be sent each time.

The number of send/receive iterations for a given test time (-t option) is determined by an evaluation test, so the actual test time can vary slightly. In an exponential test, the message size increases exponentially from 1 byte to a large number (-e option). Be aware that there is a minimum number of iterations, so the test time may be much greater than what you specify if the message size is very large.
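The effect of the minimum iteration count can be seen with a small calculation. This is a sketch of one plausible way to derive the iteration count from an evaluation run; the function name, the probe parameters, and the minimum value of 10 used in the example are hypothetical, since the source does not state them.

```python
import math


def estimate_iterations(probe_iterations, probe_seconds, test_time, minimum):
    """Scale a short evaluation run up to the requested test time.

    `minimum` is a hypothetical lower bound on the iteration count.
    """
    wanted = math.ceil(test_time * probe_iterations / probe_seconds)
    return max(minimum, wanted)


# Small messages: 8 probe iterations took 0.25 s, so 5 s needs 160 iterations.
print(estimate_iterations(8, 0.25, 5, minimum=10))      # 160

# Very large messages: 10 probe iterations took 20 s (2 s each). Only 3 would
# fit in 5 s, but the minimum of 10 forces a run of about 20 s.
iterations = estimate_iterations(10, 20.0, 5, minimum=10)
print(iterations, iterations * 20.0 / 10)               # 10 20.0
```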

Examples

1. Start server process
[server] $ tcpserver

2. Start client process

Example 1: [client] $ tcptest -ah server
TCP Round Trip Time (RTT) test. A TCP version of ping.

Example 2: [client] $ tcptest -h server
TCP blocking stream test with default set of parameters, verbose off, no result writing.

Example 3: [server] $ tcpserver -p 3000
[client] $ tcptest -vn -h server -p 3000 -b 100k -m 10m -t 2 -r 20 -o output.txt
Repeat the non-blocking stream test 20 times using communication port 3000. Buffer size: 100K, message size: 10M, test time: 2 seconds; store results in "output.txt".

Example 4: [client] $ tcptest -e 20 -vh server -b 100k -o output.txt
Exponential stream test in verbose mode with a buffer size of 100 KB and the message size increasing exponentially from 1 Byte to 1 MByte (2^20).

[ MPI Communication Test ]

To compile the MPI benchmark (mpi directory), you need an MPI implementation installed, such as MPICH or LAM/MPI, and you should define the MPI compiler in the makefile. Most MPI implementations provide a script named mpicc for this job. Compilation creates an executable file named mpitest. To run mpitest, launch it with another script named mpirun, or submit the job to a high-level queuing system such as RMS or LSF. For MPICH, the command line options of mpitest are:

MPI test: mpirun -np 2 mpitest [options]

$ mpirun -np 2 mpitest [-acinP] [-A size] [-e exponent] [-m message] [-o output] [-r repeat] [-t time]

[-a] Round Trip Time (latency) test. Disabled by default.
[-A RTT-size] Specify the message size in bytes for the RTT (latency) test.
[-c] CPU log option. Traces system information during the test. Only available on Linux systems.
[-e n] Exponential tests with message size increasing exponentially from 1 to 2^n. Disabled by default.
[-i] Ping-pong (bidirectional) test. Stream (unidirectional) test by default.
[-m message-size] Message size in bytes (1M by default). Disabled in exponential tests.
[-n] Non-blocking communication. Blocking communication by default.
[-o output] Write test results to a file. Disabled by default.
[-P] Plot file for gnuplot. Only enabled when the output is specified. Disabled by default.
[-r repeat] Repeat tests many times. Disabled in exponential tests. 10 times by default.
[-t test-time] Specify the test time in seconds. 5 seconds by default.

NOTE: Input values accept the suffixes "kKmM": 1k=1024, 1M=1024x1024.

Examples:

Example1: $ mpirun -np 2 mpitest
Throughput stream test with default parameters.

Example2: $ mpirun -np 2 mpitest -e 20
Exponential stream (unidirectional) test, message size from 1 byte to 2^20 (1M) bytes.

Example3: $ mpirun -np 2 mpitest -c -m 10m -Po output.txt
Throughput stream test with 10MBytes message size, write result/plot files, log system info.

Example4: $ mpirun -np 2 mpitest -ni -m 100k -t 3 -r 10
Non-blocking ping-pong test. Message-size: 100KBytes; test-time: 3 seconds; repeat 10 times.

Example5: $ mpirun -np 2 mpitest -a -o rtt.txt
MPI Round Trip Time (latency) test. Write the result to file "rtt.txt".

To use your own machine file (MPICH):

$ mpirun -np 2 -machinefile <machine file> mpitest [options]

or use the p4 procgroup file (MPICH):

$ mpirun -p4pg <p4 procgroup file> mpitest [options]

There are sample machinefile and p4pg files in the mpi directory. To submit your job to a queuing system such as LSF, refer to the test-mpi.lsf and test-mpi.sh scripts in the testscript directory.

[ SYSMON - System Resource Monitor ]

Sysmon is a lightweight Linux-based system resource tracing tool. Although it is not a benchmark, it is very helpful for tracing what happens at the kernel level during benchmarking. The output includes CPU and memory usage; swapping, paging, and context switch information; the interrupts the kernel received; and statistics for each network interface, which include its interrupts to the kernel and the packets and bytes received and sent in a specified interval.

Sysmon usage: sysmon [options]

$ sysmon [-bhkwW] [-i interface-name] [-o output] [-r repeat] [-t test-interval] [-T test-time]

[-b] Background (daemon) mode. Only valid when the write option is defined.
[-h] Print this help message.
[-k] Kill the sysmon background process (daemon). Disabled by default.
[-w] Write all results to a file. Disabled by default.
[-W] Write statistics of each network device to separate files. Disabled by default.
[-i interface-name] Define the network device name (e.g. eth0). Monitors all devices if no interface is defined.
[-r repeat] Repetition of monitoring. 10 times by default.
[-t test-interval] The interval (sample time) between traces in seconds. 2 seconds by default.
[-T test-time] The duration of system monitoring in minutes. Valid only when the write option is defined.
[-o output] Specify the output (log) filename. Implies the write option.

NOTE: The default log file name has the format hostname-start-time.log if the write option (-w) is defined and the output option (-o) is not. If the separate write option (-W) is defined, then besides the overall log file "output", each network interface has its own log file with a name like "output.eth0". These smaller log files are more readable than the lengthy overall log file.

You can use the command "/sbin/ifconfig" to check the network devices and their names on your computer. Possible names: eth0, wlan0, elan0, etc. If your system has unusual NIC names, you can define them with the constant NETNAME in util.h. The name "loop" is used for the loopback address.
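For readers who want to see where per-interface counters can come from, the sketch below parses text in the format of Linux's /proc/net/dev and turns two snapshots into per-second rates. The source does not say how sysmon collects its numbers, so reading /proc/net/dev is an assumption here, and the sample text, function names, and field selection are illustrative only.

```python
# Sample text in the /proc/net/dev layout: two header lines, then one line
# per interface with 8 receive counters followed by 8 transmit counters.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    0    0    0     0          0         0  2000000    3000    0    0    0     0       0          0
"""


def parse_net_dev(text: str) -> dict:
    """Return {interface: counters} for the byte and packet counters."""
    stats = {}
    for line in text.splitlines()[2:]:          # skip the two header lines
        name, _, counters = line.partition(":")
        fields = counters.split()
        if len(fields) < 16:
            continue                            # not a complete counter line
        stats[name.strip()] = {
            "rx_bytes": int(fields[0]),
            "rx_packets": int(fields[1]),
            "tx_bytes": int(fields[8]),
            "tx_packets": int(fields[9]),
        }
    return stats


def rates(before: dict, after: dict, interval: float) -> dict:
    """Per-second rates between two snapshots taken `interval` seconds apart."""
    return {
        name: {key: (after[name][key] - before[name][key]) / interval
               for key in after[name]}
        for name in after if name in before
    }


snapshot = parse_net_dev(SAMPLE)
print(sorted(snapshot))                 # ['eth0', 'lo']
print(snapshot["eth0"]["rx_bytes"])     # 5000000
```

On a Linux host, the same parser could be fed the contents of /proc/net/dev read twice, one sampling interval apart.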

Examples:

Example1: $ sysmon
Monitor all network devices. Output has a very long format if the computer has several network cards.

Example2: $ sysmon -r 100 -t 1 -i eth0 -o net.log
Monitor only the first Ethernet card, repeating the test 100 times with a time interval of 1 second, and write the results to net.log

Example3: $ sysmon -bw -i eth0 -t 600 -T 10080
Log every 10 minutes for one week, run process in background (daemon).

Last updated: Sept. 2004 by Ben Huang