High Performance Networks Benchmarking



UDP Communication Experiments

[ Unidirectional throughput test ]   [ Bidirectional throughput test ]   [ Exponential throughput test ]  

[ Latency test ]   [ UDP traffic generator ]   [ Test with system log ]   [ Test UDP socket options ]   [ Plot data ]

Our testbed is a cluster named "mako" (mako.sharcnet.ca) in SHARCNET. Mako consists of 8 nodes (mk1-mk8), each with four 3 GHz Intel Xeon Hyper-Threading processors and 2 GB of RAM. All nodes are connected by two high-speed, low-latency interconnects: Myrinet and Gigabit Ethernet. Our UDP tests run over Gigabit Ethernet.

In the following examples, we test UDP communication between mk1 (client) and mk4 (server). The first step in any UDP test is to start the server process on the remote (or local) machine:

[mk4 ~/hpcbench/udp]$ udpserver &
[1] 12103
TCP socket listening on port [5678]
[mk4 ~/hpcbench/udp]$

You can also run the server in the foreground and enable verbose mode (-v option). If the default port number is not available, you can specify another port with the -p option, or pass -p 0 to let the system pick the port number:

[mk4 ~/hpcbench/udp]$ udpserver -p 0
TCP socket listening on port [21097]
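
Letting the system pick the port is standard socket behavior rather than anything specific to hpcbench: binding to port 0 asks the kernel for a free port. The following is a minimal Python sketch of that idea, assuming a loopback address and a hypothetical helper name `listen_on_any_port`; it is not taken from the udpserver source.

```python
import socket

def listen_on_any_port(host="127.0.0.1"):
    """Open a TCP listening socket on a port chosen by the OS.

    Hypothetical helper for illustration; returns (socket, port).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))           # port 0 = let the kernel choose a free port
    sock.listen(1)
    port = sock.getsockname()[1]   # read back the port that was actually assigned
    return sock, port

if __name__ == "__main__":
    sock, port = listen_on_any_port()
    print(f"TCP socket listening on port [{port}]")
    sock.close()
```

The assigned port changes from run to run, which is why the server prints it and the client has to be told which port to use.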
 

 



[ Unidirectional throughput test ]   [ TOP ]

The unidirectional throughput test is the default test. Here we repeat the test 6 times (-r 6) and write the results to an output file (-o):

[mk1 ~/hpcbench/udp]$ ./udptest -h mk4 -r 6 -o output.txt
 (1) Throughput: 956.128885  Loss-rate: 0.000000
 (2) Throughput: 955.408556  Loss-rate: 0.000000
 (3) Throughput: 955.610297  Loss-rate: 0.000650
 (4) Throughput: 954.218541  Loss-rate: 0.000000
 (5) Throughput: 956.239700  Loss-rate: 0.000000
 (6) Throughput: 953.984917  Loss-rate: 0.000000
Test done! The results are stored in file "output.txt"
[mk1 ~/hpcbench/udp]$ cat output.txt 
# UDP communication test -- Tue Jul 13 23:02:45 2004
# Fixed packet size unidirectional stream test
# Hosts: mk1 (client) <--> mk4 (server)

# Client UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Server UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Client IP TOS type: Default
# Server IP TOS type: Default
# UDP datagram (packet) size (Bytes) -- Client: 1460 Server: 1460
# Data size of each read/write (Bytes) -- Client: 1460 Server: 1460
# Message size (Bytes): 1048576
# Test time (Second): 5.000000
# Test repeat: 6

#   Network(Mbps) Local(Mbps) SentPkg(C) SentByte(C) RecvPkg(S) RecvByte(S)  LostPkg  LossRate
1       956.1289    956.2260     409357   597659792     409357   597659792         0     0.000
2       955.4086    955.5255     409045   597204272     409045   597204272         0     0.000
3       955.6103    956.3425     409395   597715272     409129   597326912       266     0.001
4       954.2185    954.3277     408532   596455292     408532   596455292         0     0.000
5       956.2397    956.3361     409392   597710892     409392   597710892         0     0.000
6       953.9849    954.0862     408435   596313672     408435   596313672         0     0.000

# Local Average: 955.474011  Minimum: 954.086228  Maximum: 956.342523
# Network Average: 955.265149  Minimum: 953.984917  Maximum: 956.239700

# Process information for each test: 

#         Client     C-process   C-process      Server     S-process   S-process
#      Elapsed-time  User-mode  System-mode  Elapsed-time  User-mode  System-mode
#       (Seconds)    (Seconds)   (Seconds)    (Seconds)    (Seconds)   (Seconds)
#1          5.00         0.17        2.35         5.00         0.07        1.68
#2          5.00         0.16        1.95         5.00         0.15        1.57
#3          5.00         0.10        2.25         5.00         0.12        1.78
#4          5.00         0.19        2.20         5.00         0.12        1.64
#5          5.00         0.11        2.30         5.00         0.14        1.71
#6          5.00         0.13        1.97         5.00         0.12        1.75

[mk1 ~/hpcbench/udp]$ 
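
The LostPkg and LossRate columns can be checked by hand. The sketch below assumes the loss rate is (sent packets - received packets) / sent packets, which reproduces the figure reported for test 3. The throughput helper uses the nominal 5-second test time, so it only approximates the reported value; the tool presumably uses the measured elapsed time. Function names are illustrative.

```python
def loss_rate(sent_packets, received_packets):
    """Fraction of sent packets that never reached the receiver."""
    if sent_packets == 0:
        return 0.0
    return (sent_packets - received_packets) / sent_packets

def throughput_mbps(byte_count, seconds):
    """Throughput in Mbps, where 1 Mbps = 10^6 bits per second."""
    return byte_count * 8 / seconds / 1e6

if __name__ == "__main__":
    # Test 3 in the output above: SentPkg(C) = 409395, RecvPkg(S) = 409129
    print(f"Loss-rate: {loss_rate(409395, 409129):.6f}")   # prints 0.000650
    # RecvByte(S) = 597326912, divided by the nominal (not measured) 5 seconds
    print(f"Approximate network throughput: {throughput_mbps(597326912, 5.0):.1f} Mbps")
```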

 


[ Bidirectional throughput test ]   [ TOP ]

In the bidirectional test (-i option), the client and the server send UDP data to each other simultaneously once the test starts:

[mk1 ~/hpcbench/udp]$ ./udptest -ih mk4 -r 6 -o output.txt
 (1) Throughput: 1601.171587  Loss-rate: 0.000296
 (2) Throughput: 1617.064904  Loss-rate: 0.000286
 (3) Throughput: 1615.875853  Loss-rate: 0.001041
 (4) Throughput: 1617.110834  Loss-rate: 0.000292
 (5) Throughput: 1626.004534  Loss-rate: 0.000293
 (6) Throughput: 1612.288565  Loss-rate: 0.000299
Test done! The results are stored in file "output.txt"
[mk1 ~/hpcbench/udp]$ cat output.txt 
# UDP communication test -- Tue Jul 13 23:12:45 2004
# Fixed packet size bidirectional stream test
# Hosts: mk1 (client) <--> mk4 (server)

# Client UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Server UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Client IP TOS type: Default
# Server IP TOS type: Default
# UDP datagram (packet) size (Bytes) -- Client: 1460 Server: 1460
# Data size of each read/write (Bytes) -- Client: 1460 Server: 1460
# Message size (Bytes): 1048576
# Test time (Second): 5.000000
# Test repeat: 6

#     Network     Local    Client     Client    Server     Server    Client     Client    Server     Server   Server
#   throughput throughput   sent       sent      recv       recv      recv       recv      sent       sent     recv
#     (Mbps)     (Mbps)    packet      byte     packet      byte     packet      byte     packet      byte   loss-rate
1    1601.172   1752.833   341068   497813656   340967   497810392   344451   502897032   409606   598024760    0.000
2    1617.065   1757.123   342892   500480980   342794   500477812   349429   510164912   409608   598027680    0.000
3    1615.876   1756.690   342958   500578768   342601   500196032   349113   509703552   409356   597659760    0.001
4    1617.111   1755.880   342365   499708704   342265   499705472   349976   510963532   409607   598026220    0.000
5    1626.005   1761.509   344782   503236096   344681   503232832   351367   512994392   409609   598029140    0.000
6    1612.289   1752.638   340982   497686668   340880   497683372   349297   509972192   409607   598026220    0.000

# Local Average: 1756.112012  Minimum: 1752.637614  Maximum: 1761.508608
# Network Average: 1614.919380  Minimum: 1601.171587  Maximum: 1626.004534

# Process information for each test: 

#         Client     C-process   C-process      Server     S-process   S-process
#      Elapsed-time  User-mode  System-mode  Elapsed-time  User-mode  System-mode
#       (Seconds)    (Seconds)   (Seconds)    (Seconds)    (Seconds)   (Seconds)
#1          5.00         0.34        4.58         5.00         0.26        3.78
#2          5.00         0.29        4.66         5.00         0.24        3.92
#3          5.00         0.32        4.65         5.00         0.33        4.02
#4          5.00         0.31        4.64         5.00         0.39        3.85
#5          5.00         0.26        4.64         5.00         0.34        3.89
#6          5.00         0.33        4.59         5.00         0.38        3.91

[mk1 ~/hpcbench/udp]$ 
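
The bidirectional columns can be cross-checked in the same way. The sketch below rests on two assumptions inferred from the numbers, not stated by the tool: the network throughput is the sum of the bytes received in both directions over the nominal 5-second test, and the printed loss rate covers only the client-to-server direction. Under that reading, the server-to-client direction loses far more packets than the last column suggests.

```python
def mbps(byte_count, seconds):
    """Throughput in Mbps, where 1 Mbps = 10^6 bits per second."""
    return byte_count * 8 / seconds / 1e6

def direction_loss(sent_packets, received_packets):
    """Loss rate for one direction of the stream."""
    return (sent_packets - received_packets) / sent_packets

# Test 1 of the bidirectional output above
client_sent_pkt, server_recv_pkt = 341068, 340967
server_sent_pkt, client_recv_pkt = 409606, 344451
server_recv_bytes, client_recv_bytes = 497810392, 502897032
test_seconds = 5.0  # nominal test time; the measured time differs slightly

network = mbps(server_recv_bytes + client_recv_bytes, test_seconds)
c2s_loss = direction_loss(client_sent_pkt, server_recv_pkt)
s2c_loss = direction_loss(server_sent_pkt, client_recv_pkt)

if __name__ == "__main__":
    print(f"Network throughput (approx.): {network:.1f} Mbps")
    print(f"Client->server loss: {c2s_loss:.6f}")
    print(f"Server->client loss: {s2c_loss:.3f}")
```

If this reading of the columns is right, it also accounts for the gap between the "Local" and "Network" throughput figures: the server sends at close to line rate, but the client, which is also busy sending, receives only part of that stream.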

 


[ Exponential throughput test ]   [ TOP ]

In exponential tests, the message size doubles from 1 byte up to the largest 2^n bytes that is less than the UDP packet (datagram) size. As the output below shows, a final test is then run at the datagram size itself (1460 bytes here).

[mk1 ~/hpcbench/udp]$ ./udptest -eh mk4 -o output.txt
Test done! The results are stored in file "output.txt"
[mk1 ~/hpcbench/udp]$ cat output.txt 
# UDP communication test -- Tue Jul 13 23:19:10 2004
# Exponential size unidirectional stream test
# Hosts: mk1 (client) <--> mk4 (server)

# Client UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Server UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Client IP TOS type: Default
# Server IP TOS type: Default
# UDP datagram (packet) size (Bytes) -- Client: 1460 Server: 1460
# Test time (second): 5.000000

#   Size    Network     Local    Client      Client    Server      Server    Server  ServerRecv
#  (Byte)    (Mbps)    (Mbps)   SentPkg    SentByte   RecvPkg    RecvByte   LostPkg    LossRate
1       1     1.441     1.441    900352      900383    900352      900383         0     0.000
2       2     2.884     2.884    901250     1802530    901250     1802530         0     0.000
3       4     5.773     5.773    901954     3607844    901954     3607844         0     0.000
4       8    11.581    11.581    904746     7237992    904746     7237992         0     0.000
5      16    24.186    24.197    945192    15123088    944723    15115584       469     0.000
6      32    51.467    51.465   1005171    32165472   1005171    32165472         0     0.000
7      64   103.409   103.406   1009821    64628512   1009821    64628512         0     0.000
8     128   204.645   204.637    999208   127898528    999208   127898528         0     0.000
9     256   396.979   397.256    969865   248285216    969157   248103968       708     0.001
10    512   771.211   771.186    941391   481991712    941391   481991712         0     0.000
11   1024   938.194   938.270    572676   586419232    572676   586419232         0     0.000
12   1460   951.105   951.526    407333   594704752    407243   594573352        90     0.000

# Process information for each test: 

#         Client     C-process   C-process      Server     S-process   S-process
#      Elapsed-time  User-mode  System-mode  Elapsed-time  User-mode  System-mode
#       (Seconds)    (Seconds)   (Seconds)    (Seconds)    (Seconds)   (Seconds)
#1          5.00         0.39        4.55         5.00         0.29        2.47
#2          5.00         0.33        4.63         5.00         0.22        2.33
#3          5.00         0.43        4.55         5.00         0.23        2.23
#4          5.00         0.46        4.54         5.00         0.26        2.27
#5          5.00         0.33        4.41         5.00         0.28        2.27
#6          5.00         0.52        4.47         5.00         0.26        2.47
#7          5.00         0.41        4.59         5.00         0.27        2.62
#8          5.00         0.50        4.50         5.00         0.21        2.64
#9          5.00         0.37        4.59         5.00         0.28        2.68
#10         5.00         0.33        4.68         5.00         0.29        2.66
#11         5.00         0.20        3.09         5.00         0.13        2.39
#12         5.00         0.17        2.26         5.00         0.11        1.58

[mk1 ~/hpcbench/udp]$
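
The sequence of message sizes in the table above can be reproduced with a few lines of Python. This is a sketch inferred from the output (powers of two below the datagram size, then the datagram size itself), not the tool's own code, and `exponential_sizes` is a hypothetical name.

```python
def exponential_sizes(datagram_size):
    """Return 1, 2, 4, ... (all powers of two below datagram_size),
    followed by datagram_size itself."""
    sizes = []
    size = 1
    while size < datagram_size:
        sizes.append(size)
        size *= 2
    sizes.append(datagram_size)
    return sizes

if __name__ == "__main__":
    # 12 entries, matching the 12 rows of the output above
    print(exponential_sizes(1460))
    # prints [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 1460]
```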

 


[ Latency (roundtrip time) test ]    [ TOP ]

[mk1 ~/hpcbench/udp]$ ./udptest -ah mk4 
UDP Round Trip Time (1) : 61.265 usec
UDP Round Trip Time (2) : 59.861 usec
UDP Round Trip Time (3) : 58.141 usec
UDP Round Trip Time (4) : 58.381 usec
UDP Round Trip Time (5) : 57.679 usec
UDP Round Trip Time (6) : 60.486 usec
UDP Round Trip Time (7) : 61.316 usec
UDP Round Trip Time (8) : 61.697 usec
UDP Round Trip Time (9) : 60.712 usec
UDP Round Trip Time (10) : 58.354 usec
10 trials with message size 64 Bytes.
UDP RTT min/avg/max = 57.679/59.789/61.697 usec
[mk1 ~/hpcbench/udp]$ ./udptest -h mk4 -A 1k -r 5 -o output.txt 
UDP Round Trip Time (1) : 111.552 usec
UDP Round Trip Time (2) : 110.450 usec
UDP Round Trip Time (3) : 109.443 usec
UDP Round Trip Time (4) : 107.210 usec
UDP Round Trip Time (5) : 107.303 usec
5 trials with message size 1024 Bytes.
UDP RTT min/avg/max = 107.210/109.192/111.552 usec
[mk1 ~/hpcbench/udp]$ cat output.txt 
# UDP roundtrip time test Tue Jul 13 23:34:19 2004
# mk1 <--> mk4
# UDP-send-buffer: 65535 UDP-recv-buffer: 65535
# Message-size: 1024 Iteration: 4096
UDP Round Trip Time (1) : 111.552 usec
UDP Round Trip Time (2) : 110.450 usec
UDP Round Trip Time (3) : 109.443 usec
UDP Round Trip Time (4) : 107.210 usec
UDP Round Trip Time (5) : 107.303 usec
5 trials with message size 1024 Bytes.
UDP RTT min/avg/max = 107.210/109.192/111.552 usec
[mk1 ~/hpcbench/udp]$ 
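
The min/avg/max summary line is a plain minimum, arithmetic mean and maximum over the per-trial round trip times. A short Python check against the two runs above (the function name is illustrative):

```python
def rtt_summary(samples_usec):
    """Return (min, mean, max) of a list of round trip times in microseconds."""
    return min(samples_usec), sum(samples_usec) / len(samples_usec), max(samples_usec)

if __name__ == "__main__":
    run_64_bytes = [61.265, 59.861, 58.141, 58.381, 57.679,
                    60.486, 61.316, 61.697, 60.712, 58.354]
    run_1024_bytes = [111.552, 110.450, 109.443, 107.210, 107.303]
    for samples in (run_64_bytes, run_1024_bytes):
        lo, avg, hi = rtt_summary(samples)
        print(f"UDP RTT min/avg/max = {lo:.3f}/{avg:.3f}/{hi:.3f} usec")
    # prints 57.679/59.789/61.697 and then 107.210/109.192/111.552
```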

 


[ UDP traffic generator ]   [ TOP ]

There are two ways to generate UDP traffic. The first uses the client/server model with the throughput constraint option (-T) in the client test. The constrained rate should be lower than the available bandwidth between client and server; otherwise it is ignored and traffic is generated at the maximum throughput. To send UDP packets to the server at 150 Mbps for 10 seconds:

[mk1 ~/hpcbench/udp]$ udptest -v -h mk4 -t 10 -T 150M -r 1
UDP throughput fixed packet size test
mk1 (client) <--> mk4 (server)
UDP-port: 36025 UDP-send-buffer: 65535 UDP-recv-buffer: 65535

[Client] Sent-bytes: 187468412 Sent-packets: 128404 Recv-bytes: 0 Recv-Packets: 0
[Server] Recv-bytes: 187468412 Recv-Packets: 128404 Sent-bytes: 0 Sent-Packets: 0
Client-time(Sec.): 10.000211  Server-time(Sec.): 9.999856
Network-throughput(Mbps): 149.971889 Local-throughput: 149.973565
Lost-packets (c->s): 0  Loss-rate(c->s): 0.000000

Test done!
[mk1 ~/hpcbench/udp]$ 

The other way is to use the UDP generator option (-g). No server process is needed on the remote machine in this mode, so you can send UDP traffic to any routable host on a specified UDP port. Be aware that this test may affect the target host's performance.

[mk1 ~/hpcbench/udp]$ udptest -g -h mk2 -t 10 -T 200M
Try to send UDP packets to mk2 on port 5678
Send-time(Seconds): 10.00  Target-througput(Mbps): 200.00
Done! Elapsed-time(Seconds): 10.00
Sent-packets: 171248  Sent-bytes: 250022080  Throughput(Mbps): 200.001214
[mk1 ~/hpcbench/udp]$ udptest -gh mk6 -p 3000 -t 10 -T 200M
Try to send UDP packets to mk6 on port 3000
Send-time(Seconds): 10.00  Target-througput(Mbps): 200.00
Done! Elapsed-time(Seconds): 10.00
Sent-packets: 171216  Sent-bytes: 249975360  Throughput(Mbps): 199.979788
[mk1 ~/hpcbench/udp]$  
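
The packet counts above are consistent with simple pacing arithmetic. In both runs, Sent-bytes divided by Sent-packets is exactly 1460, so the sketch below assumes 1460-byte payloads and assumes the target rate counts payload bits only (headers excluded). How the tool actually spaces its packets is not shown here; the helper names are illustrative.

```python
def packets_per_second(rate_bps, packet_bytes):
    """Packets per second needed to reach rate_bps with fixed-size packets."""
    return rate_bps / (packet_bytes * 8)

def send_interval_usec(rate_bps, packet_bytes):
    """Average gap between packet sends, in microseconds."""
    return packet_bytes * 8 / rate_bps * 1e6

if __name__ == "__main__":
    rate = 200_000_000   # -T 200M, with M = 1000 x 1000
    payload = 1460       # bytes per packet: 250022080 / 171248 in the first run
    duration = 10        # -t 10

    pps = packets_per_second(rate, payload)
    print(f"Packets per second: {pps:.1f}")
    print(f"Expected packets in {duration} s: {pps * duration:.0f} (the tool reports 171248)")
    print(f"Send interval: {send_interval_usec(rate, payload):.1f} usec")
```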

Note that the -T option accepts a "k", "K", "m" or "M" postfix: k/K multiplies the value by 1000 and m/M by 1000x1000. For all other arguments, the same postfixes use a base of 1024, because data rates are measured in bits per second while message sizes are measured in bytes.
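
The two postfix conventions can be illustrated with a small parser. This is a sketch of the rule described above, not the tool's own argument handling; `parse_with_postfix` is a hypothetical name, and only integer values are handled.

```python
def parse_with_postfix(text, base):
    """Parse values such as '150M' or '8k'.

    base is 1000 for data rates (-T) and 1024 for sizes, following the
    convention described above.
    """
    multipliers = {"k": base, "m": base * base}
    suffix = text[-1].lower()
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)

if __name__ == "__main__":
    print(parse_with_postfix("150M", 1000))  # rate in bits per second: 150000000
    print(parse_with_postfix("500k", 1024))  # size in bytes: 512000
    print(parse_with_postfix("8k", 1024))    # size in bytes: 8192
```

The 1024-based reading agrees with the outputs on this page: `-A 1k` yields "message size 1024 Bytes" and `-l 8k` yields a datagram size of 8192 bytes.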


[ Test with system log ]    [ TOP ]

Currently, system resource tracing is only available on Linux. To enable system logging, specify both the write option (-o) and the CPU logging option (-c). In the following example, the file "output" records the test results, "output.c_log" logs the client's system information, and "output.s_log" logs the server's system information. Each system log has two more entries than the number of test repetitions: the first shows the system state before the tests, and the last shows the system state after them.

[mk1 ~/hpcbench/udp]$ udptest -ch mk4 -r 5 -o output
 (1) Throughput: 956.277957  Loss-rate: 0.000010
 (2) Throughput: 953.878337  Loss-rate: 0.000002
 (3) Throughput: 953.696928  Loss-rate: 0.000000
 (4) Throughput: 955.945228  Loss-rate: 0.000000
 (5) Throughput: 954.882271  Loss-rate: 0.000000
Test done! The results are stored in file "output"
Local-syslog: "output.c_log"  server-syslog: "output.s_log"
[mk1 ~/hpcbench/udp]$ cat output
# UDP communication test -- Tue Jul 13 23:50:45 2004
# Fixed packet size unidirectional stream test
# Hosts: mk1 (client) <--> mk4 (server)

# Client UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Server UDP socket buffer size (Bytes) -- SNDBUF: 65535 RCVBUF: 65535
# Client IP TOS type: Default
# Server IP TOS type: Default
# UDP datagram (packet) size (Bytes) -- Client: 1460 Server: 1460
# Data size of each read/write (Bytes) -- Client: 1460 Server: 1460
# Message size (Bytes): 1048576
# Test time (Second): 5.000000
# Test repeat: 5

#   Network(Mbps) Local(Mbps) SentPkg(C) SentByte(C) RecvPkg(S) RecvByte(S)  LostPkg  LossRate
1       956.2780    956.5202     409471   597826232     409467   597820392         4     0.000
2       953.8783    954.0961     408433   596310752     408432   596309292         1     0.000
3       953.6969    953.9025     408362   596207092     408362   596207092         0     0.000
4       955.9452    956.1571     409317   597601392     409317   597601392         0     0.000
5       954.8823    954.9808     408816   596869932     408816   596869932         0     0.000

# Local Average: 955.131345  Minimum: 953.902539  Maximum: 956.520249
# Network Average: 954.936144  Minimum: 953.696928  Maximum: 956.277957

# Process information for each test: 

#         Client     C-process   C-process      Server     S-process   S-process
#      Elapsed-time  User-mode  System-mode  Elapsed-time  User-mode  System-mode
#       (Seconds)    (Seconds)   (Seconds)    (Seconds)    (Seconds)   (Seconds)
#1          5.00         0.23        2.40         5.00         0.09        1.53
#2          5.00         0.23        2.18         5.00         0.09        1.22
#3          5.00         0.16        2.05         5.00         0.05        1.27
#4          5.00         0.18        2.41         5.00         0.11        1.28
#5          5.00         0.15        2.34         5.00         0.05        1.43

[mk1 ~/hpcbench/udp]$ cat output.c_log 
# mk1 syslog -- Tue Jul 13 23:50:45 2004
# Watch times: 7
# Network devices (interface): 3 ( loop eth0 eth1 )
# CPU number: 4

##### System info, statistics of network interface <loop> and its interrupts to each CPU #####
#       CPU(%)     Mem(%)  Interrupt  Page   Swap   Context           <loop> information
#   Load User  Sys  Usage   Overall  In/out In/out   Swtich   RecvPkg    RecvByte   SentPkg    SentByte  Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3 
0      0    0    0     99       398      40      0      520         0           0         0           0         0        0        0        0
1     12    1   11     99     83973      48      0    29119        76        5190        76        5190         0        0        0        0
2     12    1   11     99     83784      56      0    29312        52        3323        52        3323         0        0        0        0
3     11    0   10     99     83787      28      0    29679        38        2595        38        2595         0        0        0        0
4     12    0   11     99     83948      28      0    29766        72        4982        72        4982         0        0        0        0
5     12    1   11     99     83842      56      0    29457        50        3219        50        3219         0        0        0        0
6      0    0    0     99       341       0      0      475         0           0         0           0         0        0        0        0

##### System info, statistics of network interface <eth0> and its interrupts to each CPU #####
#       CPU(%)     Mem(%)  Interrupt  Page   Swap   Context           <eth0> information
#   Load User  Sys  Usage   Overall  In/out In/out   Swtich   RecvPkg    RecvByte   SentPkg    SentByte  Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3 
0      0    0    0     99       398      40      0      520        53        5672        57        8647       154        0        0        0
1     12    1   11     99     83973      48      0    29119       375       36087    410864   616797166     82506        0        0        0
2     12    1   11     99     83784      56      0    29312       396       38373    409847   615237459     82321        0        0        0
3     11    0   10     99     83787      28      0    29679       372       35765    409748   615126339     82306        0        0        0
4     12    0   11     99     83948      28      0    29766       385       37809    410719   616567558     82509        0        0        0
5     12    1   11     99     83842      56      0    29457       331       32025    410161   615804010     82384        0        0        0
6      0    0    0     99       341       0      0      475        53        5524        54        7834       131        0        0        0

##### System info, statistics of network interface <eth1> and its interrupts to each CPU #####
#       CPU(%)     Mem(%)  Interrupt  Page   Swap   Context           <eth1> information
#   Load User  Sys  Usage   Overall  In/out In/out   Swtich   RecvPkg    RecvByte   SentPkg    SentByte  Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3 
0      0    0    0     99       398      40      0      520        46        6898        43        3550       124        0        0        0
1     12    1   11     99     83973      48      0    29119       293       45136       284       23701       697        0        0        0
2     12    1   11     99     83784      56      0    29312       335       51692       323       26870       707        0        0        0
3     11    0   10     99     83787      28      0    29679       325       50954       323       26772       712        0        0        0
4     12    0   11     99     83948      28      0    29766       309       48264       302       25154       686        0        0        0
5     12    1   11     99     83842      56      0    29457       337       51914       324       26772       686        0        0        0
6      0    0    0     99       341       0      0      475        51        7911        50        4268       110        0        0        0

## CPU workload distribution: 
##
##         CPU0 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.5    0.0    0.5    99.5
1     19.6    0.6   19.0    80.4      12.6    1.0   11.5    87.4
2     12.5    0.0   12.5    87.5      12.9    1.6   11.3    87.1
3     11.8    0.0   11.8    88.2      11.6    0.9   10.7    88.4
4     10.2    0.0   10.2    89.8      12.1    0.9   11.2    87.9
5     27.9    3.7   24.2    72.1      12.9    1.4   11.5    87.1
6      0.0    0.0    0.0   100.0       0.2    0.2    0.0    99.8

##         CPU1 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.5    0.0    0.5    99.5
1      0.0    0.0    0.0   100.0      12.6    1.0   11.5    87.4
2      0.0    0.0    0.0   100.0      12.9    1.6   11.3    87.1
3     13.7    1.3   12.4    86.3      11.6    0.9   10.7    88.4
4     13.9    1.7   12.2    86.1      12.1    0.9   11.2    87.9
5      0.1    0.0    0.1    99.9      12.9    1.4   11.5    87.1
6      0.0    0.0    0.0   100.0       0.2    0.2    0.0    99.8

##         CPU2 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      2.0    0.0    2.0    98.0       0.5    0.0    0.5    99.5
1      6.0    1.1    4.8    94.0      12.6    1.0   11.5    87.4
2     16.9    5.0   11.9    83.1      12.9    1.6   11.3    87.1
3     13.5    1.8   11.7    86.5      11.6    0.9   10.7    88.4
4     13.5    1.0   12.5    86.5      12.1    0.9   11.2    87.9
5      0.7    0.1    0.6    99.3      12.9    1.4   11.5    87.1
6      0.0    0.0    0.0   100.0       0.2    0.2    0.0    99.8

##         CPU3 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.5    0.0    0.5    99.5
1     24.7    2.4   22.3    75.3      12.6    1.0   11.5    87.4
2     22.2    1.6   20.6    77.8      12.9    1.6   11.3    87.1
3      7.4    0.6    6.8    92.6      11.6    0.9   10.7    88.4
4     10.8    0.9   10.0    89.2      12.1    0.9   11.2    87.9
5     22.8    1.6   21.2    77.2      12.9    1.4   11.5    87.1
6      1.0    1.0    0.0    99.0       0.2    0.2    0.0    99.8
[mk1 ~/hpcbench/udp]$ cat output.s_log 
# mk4 syslog -- Tue Jul 13 23:50:45 2004
# Watch times: 7
# Network devices (interface): 2 ( loop eth0 )
# CPU number: 4

##### System info, statistics of network interface <loop> and its interrupts to each CPU #####
#       CPU(%)     Mem(%)  Interrupt  Page   Swap   Context           <loop> information
#   Load User  Sys  Usage   Overall  In/out In/out   Swtich   RecvPkg    RecvByte   SentPkg    SentByte  Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3 
0      0    0    0     10       178       0      0      102         0           0         0           0         0        0        0        0
1     15    0   15     10    410550      32      0   812538         0           0         0           0         0        0        0        0
2     14    0   14     10    409677      32      0   810950         0           0         0           0         0        0        0        0
3     14    0   13     10    409630      16      0   812428         0           0         0           0         0        0        0        0
4     14    0   13     10    410559      16      0   814570         0           0         0           0         0        0        0        0
5     14    0   14     10    409770      32      0   813458         0           0         0           0         0        0        0        0
6      0    0    0     10       169       0      0       96         0           0         0           0         0        0        0        0

##### System info, statistics of network interface <eth0> and its interrupts to each CPU #####
#       CPU(%)     Mem(%)  Interrupt  Page   Swap   Context           <eth0> information
#   Load User  Sys  Usage   Overall  In/out In/out   Swtich   RecvPkg    RecvByte   SentPkg    SentByte  Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3 
0      0    0    0     10       178       0      0      102        44        5762        44        5040        47        0        0        0
1     15    0   15     10    410550      32      0   812538    410603   616757333       143       16556    409573        0        0        0
2     14    0   14     10    409677      32      0   810950    409563   615194626       143       17075    408715        0        0        0
3     14    0   13     10    409630      16      0   812428    409503   615088728       151       17735    408658        0        0        0
4     14    0   13     10    410559      16      0   814570    410456   616526830       151       17735    409611        0        0        0
5     14    0   14     10    409770      32      0   813458    409954   615772259       151       17738    408828        0        0        0
6      0    0    0     10       169       0      0       96        20        2719        24        2982        50        0        0        0

## CPU workload distribution: 
##
##         CPU0 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0
1     45.4    0.3   45.2    54.6      15.6    0.3   15.3    84.4
2     57.5    1.3   56.3    42.5      14.4    0.3   14.1    85.6
3     56.6    0.7   55.8    43.4      14.1    0.2   14.0    85.9
4     57.0    1.6   55.4    43.0      14.3    0.4   13.9    85.7
5     53.8    0.6   53.3    46.2      15.0    0.2   14.8    85.0
6      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0

##         CPU1 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0
1     17.1    1.0   16.1    82.9      15.6    0.3   15.3    84.4
2      0.0    0.0    0.0   100.0      14.4    0.3   14.1    85.6
3      0.0    0.0    0.0   100.0      14.1    0.2   14.0    85.9
4      0.1    0.1    0.0    99.9      14.3    0.4   13.9    85.7
5      6.0    0.1    5.8    94.0      15.0    0.2   14.8    85.0
6      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0

##         CPU2 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0
1      0.0    0.0    0.0   100.0      15.6    0.3   15.3    84.4
2      0.0    0.0    0.0   100.0      14.4    0.3   14.1    85.6
3      0.0    0.0    0.0   100.0      14.1    0.2   14.0    85.9
4      0.0    0.0    0.0   100.0      14.3    0.4   13.9    85.7
5      0.0    0.0    0.0   100.0      15.0    0.2   14.8    85.0
6      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0

##         CPU3 workload (%)           Overall CPU workload (%)
#   < load   user  system   idle >  < load   user  system   idle >
0      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0
1      0.0    0.0    0.0   100.0      15.6    0.3   15.3    84.4
2      0.0    0.0    0.0   100.0      14.4    0.3   14.1    85.6
3      0.0    0.0    0.0   100.0      14.1    0.2   14.0    85.9
4      0.0    0.0    0.0   100.0      14.3    0.4   13.9    85.7
5      0.0    0.0    0.0   100.0      15.0    0.2   14.8    85.0
6      0.0    0.0    0.0   100.0       0.0    0.0    0.0   100.0
[mk1 ~/hpcbench/udp]$

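In this test, the client machine uses about 12% of the overall CPU time and the server about 15%. On the server, nearly all of the load falls on CPU0, which also receives all eth0 interrupts; on the client, the load is spread over several CPUs. Compared with the TCP tests, UDP communication is less expensive.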
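
The "Overall CPU workload" figures in the logs appear to be the mean of the four per-CPU loads. This is an inference from the numbers above, not something the logs state. A quick Python check against the first test entry of each log:

```python
def overall_load(per_cpu_loads):
    """Mean load across CPUs, in percent."""
    return sum(per_cpu_loads) / len(per_cpu_loads)

if __name__ == "__main__":
    # Entry 1, "load" column for CPU0..CPU3
    server_cpus = [45.4, 17.1, 0.0, 0.0]   # from output.s_log; overall reported as 15.6
    client_cpus = [19.6, 0.0, 6.0, 24.7]   # from output.c_log; overall reported as 12.6
    print(f"Server mean load: {overall_load(server_cpus):.3f} %")
    print(f"Client mean load: {overall_load(client_cpus):.3f} %")
```

An overall load of 15% on a 4-CPU node can therefore hide a single CPU running at more than 50%, as CPU0 does on the server.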


[ Test UDP socket options ]    [ TOP ]

Several UDP socket options can be set. In the following example, we set the UDP socket buffer size to 500 KBytes (-b 500k), the packet (datagram) size to 8 KBytes (-l 8k), and the packet's TOS bits to Maximize-Throughput mode (-q 2):

[mk1 ~/hpcbench/udp]$ udptest -h mk4 -b 500k -l 8k -q 2 -r 5 -o output.txt
 (1) Throughput: 951.931709  Loss-rate: 0.000000
 (2) Throughput: 956.052796  Loss-rate: 0.000000
 (3) Throughput: 958.187119  Loss-rate: 0.000000
 (4) Throughput: 958.073051  Loss-rate: 0.000000
 (5) Throughput: 958.187390  Loss-rate: 0.000000
Test done! The results are stored in file "output.txt"
[mk1 ~/hpcbench/udp]$ cat output.txt
# UDP communication test -- Tue Jul 13 23:59:04 2004
# Fixed packet size unidirectional stream test
# Hosts: mk1 (client) <--> mk4 (server)

# Client UDP socket buffer size (Bytes) -- SNDBUF: 262142 RCVBUF: 262142
# Server UDP socket buffer size (Bytes) -- SNDBUF: 262142 RCVBUF: 262142
# Client IP TOS type: IPTOS_Maximize_Throughput
# Server IP TOS type: IPTOS_Maximize_Throughput
# UDP datagram (packet) size (Bytes) -- Client: 8192 Server: 8192
# Data size of each read/write (Bytes) -- Client: 8192 Server: 8192
# Message size (Bytes): 1048576
# Test time (Second): 5.000000
# Test repeat: 5

#   Network(Mbps) Local(Mbps) SentPkg(C) SentByte(C) RecvPkg(S) RecvByte(S)  LostPkg  LossRate
1       951.9317    952.4461      72667   595279904      72667   595279904         0     0.000
2       956.0528    956.4425      72972   597778464      72972   597778464         0     0.000
3       958.1871    958.7220      73146   599203872      73146   599203872         0     0.000
4       958.0731    958.5749      73137   599130144      73137   599130144         0     0.000
5       958.1874    958.6665      73142   599171104      73142   599171104         0     0.000

# Local Average: 956.970389  Minimum: 952.446132  Maximum: 958.721977
# Network Average: 956.486413  Minimum: 951.931709  Maximum: 958.187390

# Process information for each test: 

#         Client     C-process   C-process      Server     S-process   S-process
#      Elapsed-time  User-mode  System-mode  Elapsed-time  User-mode  System-mode
#       (Seconds)    (Seconds)   (Seconds)    (Seconds)    (Seconds)   (Seconds)
#1          5.00         0.04        1.80         5.00         0.01        0.91
#2          5.00         0.03        2.22         5.00         0.03        1.42
#3          5.00         0.03        2.61         5.00         0.01        1.47
#4          5.00         0.05        2.42         5.00         0.02        1.47
#5          5.00         0.03        2.58         5.00         0.02        1.51

[mk1 ~/hpcbench/udp]$ 

The cluster was completely idle during the test, and the throughput differs little from that of the default settings. Note that the actual socket buffer size is not the 500 KBytes we requested: about 256 KBytes (262142 bytes) is the maximum this system grants.
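
A requested socket buffer size is only a hint to the operating system, so it is worth reading the value back after setting it. The sketch below shows the general set-then-read pattern with the standard SO_SNDBUF and SO_RCVBUF options; it is not the tool's code, and `request_buffer_size` is a hypothetical helper. The granted value depends on the system: Linux, for example, caps it at a system-wide limit and reports a value that includes kernel bookkeeping overhead.

```python
import socket

def request_buffer_size(sock, requested_bytes):
    """Ask for send/receive buffers of requested_bytes and return
    the (send, receive) sizes the OS actually reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_snd, granted_rcv

if __name__ == "__main__":
    requested = 500 * 1024  # 500 KBytes, as in the example above
    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    snd, rcv = request_buffer_size(udp_sock, requested)
    print(f"Requested: {requested}  SNDBUF: {snd}  RCVBUF: {rcv}")
    udp_sock.close()
```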

Now let's try a small socket buffer (-b 10k) together with a small data size for each send (-d 500):

[mk1 ~/hpcbench/udp]$ udptest -h mk4 -b 10k -d 500 -o output.txt
 (1) Throughput: 673.031846  Loss-rate: 0.000258
 (2) Throughput: 678.080272  Loss-rate: 0.000809
 (3) Throughput: 673.798008  Loss-rate: 0.000004
 (4) Throughput: 682.131479  Loss-rate: 0.000005
 (5) Throughput: 682.554382  Loss-rate: 0.000000
 (6) Throughput: 676.001683  Loss-rate: 0.000848
 (7) Throughput: 678.981900  Loss-rate: 0.000479
 (8) Throughput: 676.500887  Loss-rate: 0.000025
 (9) Throughput: 678.020663  Loss-rate: 0.000812
 (10) Throughput: 675.721539  Loss-rate: 0.000689
Test done! The results are stored in file "output.txt"
[mk1 ~/hpcbench/udp]$ cat output.txt 
# UDP communication test -- Tue Jul 13 23:59:54 2004
# Fixed packet size unidirectional stream test
# Hosts: mk1 (client) <--> mk4 (server)

# Client UDP socket buffer size (Bytes) -- SNDBUF: 10240 RCVBUF: 10240
# Server UDP socket buffer size (Bytes) -- SNDBUF: 10240 RCVBUF: 10240
# Client IP TOS type: Default
# Server IP TOS type: Default
# UDP datagram (packet) size (Bytes) -- Client: 1460 Server: 1460
# Data size of each read/write (Bytes) -- Client: 500 Server: 500
# Message size (Bytes): 1048576
# Test time (Second): 5.000000
# Test repeat: 10

#   Network(Mbps) Local(Mbps) SentPkg(C) SentByte(C) RecvPkg(S) RecvByte(S)  LostPkg  LossRate
1       673.0318    673.1834     841482   420740532     841265   420632032       217     0.000
2       678.0803    678.6068     848261   424130032     847575   423787032       686     0.001
3       673.7980    673.7778     842225   421112032     842222   421110532         3     0.000
4       682.1315    682.1120     852642   426320532     852638   426318532         4     0.000
5       682.5544    682.5334     853169   426584032     853169   426584032         0     0.000
6       676.0017    676.5530     845693   422846032     844976   422487532       717     0.001
7       678.9819    679.2858     849109   424554032     848702   424350532       407     0.000
8       676.5009    676.4958     845622   422810532     845601   422800032        21     0.000
9       678.0207    678.5496     848190   424094532     847501   423750032       689     0.001
10      675.7215    676.1646     845208   422603532     844626   422312532       582     0.001

# Local Average: 677.726207  Minimum: 673.183370  Maximum: 682.533359
# Network Average: 677.482266  Minimum: 673.031846  Maximum: 682.554382

# Process information for each test: 

#         Client     C-process   C-process      Server     S-process   S-process
#      Elapsed-time  User-mode  System-mode  Elapsed-time  User-mode  System-mode
#       (Seconds)    (Seconds)   (Seconds)    (Seconds)    (Seconds)   (Seconds)
#1          5.00         0.34        4.63         5.00         0.31        2.35
#2          5.00         0.38        4.60         5.00         0.33        2.61
#3          5.00         0.33        4.38         5.00         0.20        2.63
#4          5.00         0.36        4.65         5.00         0.23        2.69
#5          5.00         0.32        4.67         5.00         0.26        2.48
#6          5.00         0.34        4.66         5.00         0.21        2.62
#7          5.00         0.35        4.63         5.00         0.23        2.69
#8          5.00         0.36        4.57         5.00         0.21        2.63
#9          5.00         0.31        4.53         5.00         0.14        2.83
#10         5.00         0.46        4.53         5.00         0.28        2.55

[mk1 ~/hpcbench/udp]$ 

 


[ Plot data ]    [ TOP ]

If the write option (-o) and the plot option (-P) are both specified, a plot configuration file named in the form "output.plot" is created. Use gnuplot with this file to plot the data or to create PostScript files of the plots.

 


Last updated: Sept. 2004 by Ben Huang