[ Blocking unidirectional throughput test ]
The blocking stream (unidirectional) throughput test is the default setting. We repeat the test 5 times and write the results to an output file.
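The sketch below shows how such a blocking stream test can be structured; it is illustrative C under our own assumptions (the function name and 8 KB chunk size are ours, not hpcbench's source): keep writing fixed-size chunks over a connected socket and divide the bits sent by the elapsed wall-clock time, the same ratio the result tables report as Mbps.

    /* Minimal sketch, not hpcbench's code: blocking stream sender that
     * derives throughput as bits sent / elapsed wall-clock seconds.
     * 'sock' is assumed to be an already-connected TCP socket. */
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    double stream_throughput_mbps(int sock, size_t msg_size, int iterations)
    {
        char buf[8192];                  /* 8 KB per write(), as in the test header below */
        memset(buf, 'x', sizeof(buf));
        struct timeval t0, t1;
        long long total = 0;

        gettimeofday(&t0, NULL);
        for (int i = 0; i < iterations; i++) {
            size_t left = msg_size;
            while (left > 0) {           /* blocking write(): sleeps until buffer space frees */
                size_t chunk = left < sizeof(buf) ? left : sizeof(buf);
                ssize_t n = write(sock, buf, chunk);
                if (n <= 0)
                    return -1.0;
                left  -= (size_t)n;
                total += n;
            }
        }
        gettimeofday(&t1, NULL);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        return total * 8.0 / secs / 1e6; /* Mbps */
    }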
[mk1 ~/hpcbench/tcp]$ tcptest -h mk4 -r 5 -o output
(1) : 938.306894 Mbps
(2) : 936.791097 Mbps
(3) : 936.252991 Mbps
(4) : 939.005970 Mbps
(5) : 938.736784 Mbps
Test done!
Test-result: "output"
[mk1 ~/hpcbench/tcp]$ cat output
# TCP communication test -- Tue Jul 13 17:52:09 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: stream(unidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: ON server: ON
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 587792384
# Message size (Bytes): 65536
# Iteration: 8969
# Test Repetition: 5
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 938.307 5.01 0.00 1.21 5.01 0.12 2.90
2 936.791 5.02 0.00 1.18 5.02 0.07 2.87
3 936.253 5.02 0.03 1.26 5.02 0.06 3.00
4 939.006 5.01 0.00 1.36 5.01 0.08 2.97
5 938.737 5.01 0.00 1.84 5.01 0.13 2.64
# Throughput statistics : Average 937.9449 Minimum 936.2530 Maximum 939.0060
[mk1 ~/hpcbench/tcp]$
[ Blocking bidirectional throughput test ]
The blocking bidirectional throughput test is called the "ping-pong" test: the server receives a message and sends it back to the client:
[mk1 ~/hpcbench/tcp]$ tcptest -i -h mk4 -r 5 -o out.txt
(1) : 835.843170 Mbps
(2) : 845.507710 Mbps
(3) : 846.742031 Mbps
(4) : 843.675498 Mbps
(5) : 842.522398 Mbps
Test done!
Test-result: "out.txt"
[mk1 ~/hpcbench/tcp]$ cat out.txt
# TCP communication test -- Tue Jul 13 18:15:53 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: ping-pong(bidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: ON server: ON
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 262602752
# Message size (Bytes): 65536
# Iteration: 4007
# Test Repetition: 5
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 835.843 5.03 0.03 2.91 5.03 0.09 2.19
2 845.508 4.97 0.08 3.01 4.97 0.04 2.28
3 846.742 4.96 0.05 2.96 4.96 0.04 2.16
4 843.675 4.98 0.07 2.86 4.98 0.03 2.07
5 842.522 4.99 0.08 2.72 4.99 0.05 2.30
# Throughput statistics : Average 843.9019 Minimum 835.8432 Maximum 846.7420
[mk1 ~/hpcbench/tcp]$
[ Blocking exponential throughput test ]
In exponential tests, the message size increases exponentially from 1 Byte to 2^n Bytes, where n is defined by the (-e) option. We run both stream and ping-pong exponential tests with a maximum message size of 32 MBytes (2^25):
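A sketch of how the sweep is assumed to iterate (illustrative C, not hpcbench's source): the message size doubles each step, so -e 25 yields the 26 samples shown below.

    /* Sketch (assumed behaviour): double the message size from 1 byte to 2^n bytes. */
    #include <stddef.h>

    void exponential_sweep(int sock, int n)
    {
        for (int k = 0; k <= n; k++) {
            size_t msg_size = (size_t)1 << k;   /* 1, 2, 4, ..., 2^n bytes */
            /* run one throughput test at this size, e.g.
               stream_throughput_mbps(sock, msg_size, iterations); */
            (void)sock; (void)msg_size;
        }
    }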
[mk1 ~/hpcbench/tcp]$ tcptest -h mk4 -e 25
(1) : 0.966954 Mbps
(2) : 2.028758 Mbps
(3) : 4.018384 Mbps
(4) : 8.962581 Mbps
(5) : 114.021023 Mbps
(6) : 193.792581 Mbps
(7) : 348.583878 Mbps
(8) : 516.233112 Mbps
(9) : 754.716981 Mbps
(10) : 903.635722 Mbps
(11) : 939.406449 Mbps
(12) : 936.153679 Mbps
(13) : 936.881712 Mbps
(14) : 938.658356 Mbps
(15) : 940.367532 Mbps
(16) : 938.735666 Mbps
(17) : 937.824883 Mbps
(18) : 938.344359 Mbps
(19) : 941.224639 Mbps
(20) : 941.180170 Mbps
(21) : 940.614833 Mbps
(22) : 939.158388 Mbps
(23) : 933.832581 Mbps
(24) : 932.930817 Mbps
(25) : 940.043719 Mbps
(26) : 941.001496 Mbps
Test done!
[mk1 ~/hpcbench/tcp]$ tcptest -h mk4 -e 25 -i -o output.txt
(1) : 0.279742 Mbps
(2) : 0.526432 Mbps
(3) : 1.053214 Mbps
(4) : 2.096010 Mbps
(5) : 4.115054 Mbps
(6) : 8.171917 Mbps
(7) : 15.999450 Mbps
(8) : 32.126498 Mbps
(9) : 56.310765 Mbps
(10) : 95.175468 Mbps
(11) : 144.502655 Mbps
(12) : 224.478024 Mbps
(13) : 368.048684 Mbps
(14) : 516.538273 Mbps
(15) : 634.885268 Mbps
(16) : 776.249393 Mbps
(17) : 842.645579 Mbps
(18) : 891.314645 Mbps
(19) : 912.319905 Mbps
(20) : 924.485150 Mbps
(21) : 929.543704 Mbps
(22) : 931.308711 Mbps
(23) : 938.358981 Mbps
(24) : 939.385785 Mbps
(25) : 940.040040 Mbps
(26) : 930.230512 Mbps
Test done!
Test-result: "output.txt"
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP communication test -- Tue Jul 13 18:30:36 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: ping-pong(bidirectional) exponential throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: ON server: ON
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data-size Network-hroughput Elapsed-time Iteration
# (Bytes) (Mbps) (Seconds)
1 0.2797 0.28598 5000
2 0.5264 0.30393 5000
4 1.0532 0.30383 5000
8 2.0960 0.30534 5000
16 4.1151 0.31105 5000
32 8.1719 0.31327 5000
64 15.9995 0.32001 5000
128 32.1265 0.31874 5000
256 56.3108 0.36370 5000
512 95.1755 0.43036 5000
1024 144.5027 0.56691 5000
2048 224.4780 0.72987 5000
4096 368.0487 0.89032 5000
8192 516.5383 1.26875 5000
16384 634.8853 2.06450 5000
32768 776.2494 3.37706 5000
65536 842.6456 4.60547 3701
131072 891.3146 4.72693 2009
262144 912.3199 4.88244 1062
524288 924.4852 4.92708 543
1048576 929.5437 4.96344 275
2097152 931.3087 4.97205 138
4194304 938.3590 4.93469 69
8388608 939.3858 4.85786 34
16777216 940.0400 4.85448 17
33554432 930.2305 4.61710 8
[mk1 ~/hpcbench/tcp]$
[ Non-blocking unidirectional throughput test ]
We can put the TCP socket in non-blocking mode, in which case the read/write system calls return immediately in the application layer.
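Switching a socket to non-blocking mode uses standard POSIX calls; a minimal sketch (not specific to hpcbench):

    /* Sketch: after this, read()/write() return -1 with errno == EAGAIN
     * instead of sleeping when the socket is not ready. */
    #include <fcntl.h>

    int set_nonblocking(int sock)
    {
        int flags = fcntl(sock, F_GETFL, 0);
        if (flags < 0)
            return -1;
        return fcntl(sock, F_SETFL, flags | O_NONBLOCK);
    }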
[mk1 ~/hpcbench/tcp]$ tcptest -n -h mk4 -r 5 -o output.txt
(1) : 930.135129 Mbps
(2) : 936.424894 Mbps
(3) : 933.355274 Mbps
(4) : 939.637175 Mbps
(5) : 941.422417 Mbps
Test done!
Test-result: "output.txt"
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP communication test -- Tue Jul 13 18:35:41 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: stream(unidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: OFF server: OFF
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 587792384
# Message size (Bytes): 65536
# Iteration: 8969
# Test Repetition: 5
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 930.135 5.06 0.37 4.67 5.06 0.17 3.76
2 936.425 5.02 0.16 4.82 5.02 0.16 3.81
3 933.355 5.04 0.26 4.76 5.04 0.21 3.69
4 939.637 5.00 0.20 4.73 5.00 0.20 3.81
5 941.422 4.99 0.28 4.71 4.99 0.18 3.55
# Throughput statistics : Average 936.4724 Minimum 930.1351 Maximum 941.4224
[mk1 ~/hpcbench/tcp]$
In contrast to the blocking tests above, the results show no significant difference in throughput between unidirectional blocking and non-blocking communication, while non-blocking communication consumes more system resources (system-mode process time). This is because the application layer makes heavy use of the select() system call, and the read/write system calls may be repeated at the kernel level.
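A sketch of the kind of select()-driven send loop described above (assumed structure, illustrative C): each message can take several select() and write() calls, including EAGAIN retries and partial writes, which is where the extra system-mode time goes.

    /* Sketch (not hpcbench's source): non-blocking send of one buffer. */
    #include <errno.h>
    #include <sys/select.h>
    #include <unistd.h>

    ssize_t nonblocking_send(int sock, const char *buf, size_t len)
    {
        size_t sent = 0;
        while (sent < len) {
            fd_set wset;
            FD_ZERO(&wset);
            FD_SET(sock, &wset);
            if (select(sock + 1, NULL, &wset, NULL, NULL) < 0)   /* wait until writable */
                return -1;
            ssize_t n = write(sock, buf + sent, len - sent);     /* may write only part */
            if (n < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                    continue;                                    /* not ready after all: retry */
                return -1;
            }
            sent += (size_t)n;
        }
        return (ssize_t)sent;
    }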
[ Non-blocking bidirectional throughput test ]
[mk1 ~/hpcbench/tcp]$ tcptest -in -h mk4 -r 5 -o output.txt
(1) : 1456.741918 Mbps
(2) : 1353.053078 Mbps
(3) : 1349.546522 Mbps
(4) : 1346.721659 Mbps
(5) : 1348.522626 Mbps
Test done!
Test-result: "output.txt"
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP communication test -- Tue Jul 13 18:44:21 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: ping-pong(bidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: OFF server: OFF
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 514785280
# Message size (Bytes): 65536
# Iteration: 7855
# Test Repetition: 5
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 1456.742 5.65 0.04 5.59 5.65 0.13 4.93
2 1353.053 6.09 0.06 6.02 6.09 0.13 4.32
3 1349.547 6.10 0.03 6.07 6.10 0.11 4.37
4 1346.722 6.12 0.09 6.02 6.12 0.08 4.63
5 1348.523 6.11 0.07 6.04 6.11 0.22 4.71
# Throughput statistics : Average 1350.3741 Minimum 1346.7217 Maximum 1456.7419
[mk1 ~/hpcbench/tcp]$
As we can see, in bidirectional tests, non-blocking throughputs are much
higher than those of blocking communication.
[ Non-blocking exponential throughput test ]
In exponential tests, the message size increases exponentially from 1 Byte to 2^n Bytes, where n is defined by the (-e) option. We run both unidirectional and bidirectional exponential tests with a maximum message size of 32 MBytes (2^25):
[mk1 ~/hpcbench/tcp]$ tcptest -ne 25 -h mk4
(1) : 179.372197 Mbps
(2) : 361.990950 Mbps
(3) : 553.633218 Mbps
(4) : 751.173709 Mbps
(5) : 783.353733 Mbps
(6) : 880.935994 Mbps
(7) : 911.356355 Mbps
(8) : 925.356949 Mbps
(9) : 933.199672 Mbps
(10) : 929.135287 Mbps
(11) : 939.255658 Mbps
(12) : 939.223352 Mbps
(13) : 937.777244 Mbps
(14) : 934.495749 Mbps
(15) : 934.355855 Mbps
(16) : 940.092351 Mbps
(17) : 938.246807 Mbps
(18) : 938.310372 Mbps
(19) : 937.102111 Mbps
(20) : 937.951656 Mbps
(21) : 934.376846 Mbps
(22) : 938.401579 Mbps
(23) : 936.109626 Mbps
(24) : 938.588935 Mbps
(25) : 933.255092 Mbps
(26) : 935.683583 Mbps
Test done!
[mk1 ~/hpcbench/tcp]$ tcptest -ine 25 -h mk4 -o output.txt
(1) : 615.384615 Mbps
(2) : 1012.658228 Mbps
(3) : 1314.168378 Mbps
(4) : 1479.768786 Mbps
(5) : 1680.892974 Mbps
(6) : 1576.354680 Mbps
(7) : 1630.832935 Mbps
(8) : 1539.733855 Mbps
(9) : 1643.527807 Mbps
(10) : 1632.392795 Mbps
(11) : 1635.995087 Mbps
(12) : 1623.078142 Mbps
(13) : 1635.954248 Mbps
(14) : 1326.670560 Mbps
(15) : 1359.569326 Mbps
(16) : 1350.383811 Mbps
(17) : 1344.656175 Mbps
(18) : 1345.188891 Mbps
(19) : 1343.651458 Mbps
(20) : 1344.313125 Mbps
(21) : 1351.314055 Mbps
(22) : 1354.159095 Mbps
(23) : 1356.183126 Mbps
(24) : 1354.537707 Mbps
(25) : 1351.288206 Mbps
(26) : 1352.852697 Mbps
Test done!
Test-result: "output.txt"
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP communication test -- Tue Jul 13 19:38:47 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: ping-pong(bidirectional) exponential throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: OFF server: OFF
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data-size Network-hroughput Elapsed-time Iteration
# (Bytes) (Mbps) (Seconds)
1 615.3846 0.00026 10000
2 1012.6582 0.00032 10000
4 1314.1684 0.00049 10000
8 1479.7688 0.00086 10000
16 1680.8930 0.00152 10000
32 1576.3547 0.00325 10000
64 1630.8329 0.00628 10000
128 1539.7339 0.01330 10000
256 1643.5278 0.02492 10000
512 1632.3928 0.05018 10000
1024 1635.9951 0.10015 10000
2048 1623.0781 0.20189 10000
4096 1635.9542 0.40060 10000
8192 1326.6706 0.98798 10000
16384 1359.5693 1.92814 10000
32768 1350.3838 3.88251 10000
65536 1344.6562 5.02119 6439
131072 1345.1889 4.99660 3205
262144 1343.6515 5.00388 1603
524288 1344.3131 4.99206 800
1048576 1351.3141 4.96619 400
2097152 1354.1591 4.98054 201
4194304 1356.1831 4.94836 100
8388608 1354.5377 4.95437 50
16777216 1351.2882 4.96629 25
33554432 1352.8527 4.76212 12
[mk1 ~/hpcbench/tcp]$
[ Test with system log ]
Currently the system resource tracing functionality is only available for Linux boxes. To enable system logging, specify both the write option (-o) and the CPU logging option (-c). In the following example, the file "output" records the test results, "output.c_log" records the client's system information, and "output.s_log" records the server's system information. The system logs have two more entries than the test repetition count: the first entry shows the pre-test system state and the last shows the post-test system state.
[mk1 ~/hpcbench/tcp]$ tcptest -ch mk4 -r 5 -o output
(1) : 932.349667 Mbps
(2) : 939.459044 Mbps
(3) : 940.130663 Mbps
(4) : 939.778465 Mbps
(5) : 939.497761 Mbps
Test done!
Test-result: "output" Local-syslog: "output.c_log" server-syslog: "output.s_log"
[mk1 ~/hpcbench/tcp]$ cat output
# TCP communication test -- Tue Jul 13 22:05:30 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: stream(unidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 87380 server: 87380
# Socket Send-buffer (Bytes) -- client: 16384 server: 16384
# Socket blocking option -- client: ON server: ON
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: Default server: Default
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 587005952
# Message size (Bytes): 65536
# Iteration: 8957
# Test Repetition: 5
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 932.350 5.04 0.01 1.41 5.04 0.08 2.68
2 939.459 5.00 0.02 1.35 5.00 0.10 2.77
3 940.131 5.00 0.01 1.26 4.99 0.09 2.72
4 939.778 5.00 0.01 1.29 5.00 0.07 2.46
5 939.498 5.00 0.02 1.31 5.00 0.08 2.56
# Throughput statistics : Average 939.5784 Minimum 932.3497 Maximum 940.1307
[mk1 ~/hpcbench/tcp]$ cat output.c_log
# mk1 syslog -- Tue Jul 13 22:05:30 2004
# Watch times: 7
# Network devices (interface): 3 ( loop eth0 eth1 )
# CPU number: 4
##### System info, statistics of network interface <loop> and its interrupts to each CPU #####
# CPU(%) Mem(%) Interrupt Page Swap Context <loop> information
# Load User Sys Usage Overall In/out In/out Swtich RecvPkg RecvByte SentPkg SentByte Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3
0 2 1 0 99 358 0 0 431 0 0 0 0 0 0 0 0
1 21 0 21 99 203056 32 0 34858 40 2699 40 2699 0 0 0 0
2 20 1 18 99 203742 28 0 38783 56 3531 56 3531 0 0 0 0
3 23 1 21 99 203038 32 0 40582 56 3531 56 3531 0 0 0 0
4 24 1 22 99 201045 28 0 42597 38 2595 38 2595 0 0 0 0
5 22 1 21 99 201381 28 0 42220 56 3531 56 3531 0 0 0 0
6 1 0 1 99 321 0 0 414 0 0 0 0 0 0 0 0
##### System info, statistics of network interface <eth0> and its interrupts to each CPU #####
# CPU(%) Mem(%) Interrupt Page Swap Context <eth0> information
# Load User Sys Usage Overall In/out In/out Swtich RecvPkg RecvByte SentPkg SentByte Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3
0 2 1 0 99 358 0 0 431 57 6120 60 8885 137 0 0 0
1 21 0 21 99 203056 32 0 34858 189870 13298831 379800 575704642 201957 0 0 0
2 20 1 18 99 203742 28 0 38783 196012 13727462 391821 593860215 202722 0 0 0
3 23 1 21 99 203038 32 0 40582 196000 13729962 391868 593827773 202036 0 0 0
4 24 1 22 99 201045 28 0 42597 235372 16481861 470419 713509848 200021 0 0 0
5 22 1 21 99 201381 28 0 42220 196032 13730331 391923 594124906 200319 0 0 0
6 1 0 1 99 321 0 0 414 7081 496603 14068 21326241 111 0 0 0
##### System info, statistics of network interface <eth1> and its interrupts to each CPU #####
# CPU(%) Mem(%) Interrupt Page Swap Context <eth1> information
# Load User Sys Usage Overall In/out In/out Swtich RecvPkg RecvByte SentPkg SentByte Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3
0 2 1 0 99 358 0 0 431 51 7965 50 4220 105 0 0 0
1 21 0 21 99 203056 32 0 34858 246 38150 242 20275 542 0 0 0
2 20 1 18 99 203742 28 0 38783 245 38287 232 19304 482 0 0 0
3 23 1 21 99 203038 32 0 40582 248 38758 238 19722 457 0 0 0
4 24 1 22 99 201045 28 0 42597 223 34312 213 17644 470 0 0 0
5 22 1 21 99 201381 28 0 42220 242 37351 232 19304 523 0 0 0
6 1 0 1 99 321 0 0 414 46 6908 43 3566 94 0 0 0
## CPU workload distribution:
##
## CPU0 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 1.0 0.0 1.0 99.0 3.0 2.0 1.0 97.0
1 70.2 0.2 70.0 29.8 21.5 0.2 21.3 78.5
2 70.6 0.2 70.4 29.4 20.1 1.7 18.4 79.9
3 72.0 0.2 71.8 28.0 23.2 2.0 21.2 76.8
4 67.8 4.6 63.2 32.2 24.2 1.8 22.4 75.8
5 73.2 3.8 69.4 26.8 22.9 1.1 21.8 77.1
6 0.0 0.0 0.0 100.0 1.5 0.2 1.2 98.5
## CPU1 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 0.0 0.0 0.0 100.0 3.0 2.0 1.0 97.0
1 5.1 0.0 5.1 94.9 21.5 0.2 21.3 78.5
2 2.2 0.2 2.0 97.8 20.1 1.7 18.4 79.9
3 15.3 7.8 7.6 84.7 23.2 2.0 21.2 76.8
4 14.9 2.6 12.3 85.1 24.2 1.8 22.4 75.8
5 2.6 0.4 2.2 97.4 22.9 1.1 21.8 77.1
6 1.0 1.0 0.0 99.0 1.5 0.2 1.2 98.5
## CPU2 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 2.0 0.0 2.0 98.0 3.0 2.0 1.0 97.0
1 6.9 0.2 6.7 93.1 21.5 0.2 21.3 78.5
2 7.6 6.4 1.2 92.4 20.1 1.7 18.4 79.9
3 5.4 0.0 5.4 94.6 23.2 2.0 21.2 76.8
4 1.2 0.0 1.2 98.8 24.2 1.8 22.4 75.8
5 0.8 0.0 0.8 99.2 22.9 1.1 21.8 77.1
6 2.0 0.0 2.0 98.0 1.5 0.2 1.2 98.5
## CPU3 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 8.9 7.9 1.0 91.1 3.0 2.0 1.0 97.0
1 3.7 0.4 3.4 96.3 21.5 0.2 21.3 78.5
2 0.0 0.0 0.0 100.0 20.1 1.7 18.4 79.9
3 0.0 0.0 0.0 100.0 23.2 2.0 21.2 76.8
4 12.7 0.0 12.7 87.3 24.2 1.8 22.4 75.8
5 14.9 0.2 14.7 85.1 22.9 1.1 21.8 77.1
6 3.0 0.0 3.0 97.0 1.5 0.2 1.2 98.5
[mk1 ~/hpcbench/tcp]$ cat output.s_log
# mk4 syslog -- Tue Jul 13 22:05:30 2004
# Watch times: 7
# Network devices (interface): 2 ( loop eth0 )
# CPU number: 4
##### System info, statistics of network interface <loop> and its interrupts to each CPU #####
# CPU(%) Mem(%) Interrupt Page Swap Context <loop> information
# Load User Sys Usage Overall In/out In/out Swtich RecvPkg RecvByte SentPkg SentByte Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3
0 0 0 0 10 187 0 0 118 0 0 0 0 0 0 0 0
1 25 0 25 10 405641 32 0 672780 0 0 0 0 0 0 0 0
2 28 0 28 10 405419 16 0 805291 0 0 0 0 0 0 0 0
3 28 0 28 10 405411 16 0 805986 0 0 0 0 0 0 0 0
4 26 0 26 10 405385 16 0 805753 0 0 0 0 0 0 0 0
5 27 0 27 10 405437 16 0 805660 0 0 0 0 0 0 0 0
6 0 0 0 10 192 0 0 130 0 0 0 0 0 0 0 0
##### System info, statistics of network interface <eth0> and its interrupts to each CPU #####
# CPU(%) Mem(%) Interrupt Page Swap Context <eth0> information
# Load User Sys Usage Overall In/out In/out Swtich RecvPkg RecvByte SentPkg SentByte Int-CPU0 Int-CPU1 Int-CPU2 Int-CPU3
0 0 0 0 10 187 0 0 118 21 2884 23 2586 48 0 0 0
1 25 0 25 10 405641 32 0 672780 407773 618402173 203812 14274437 404948 0 0 0
2 28 0 28 10 405419 16 0 805291 391589 593779142 195823 13712492 404752 0 0 0
3 28 0 28 10 405411 16 0 805986 391754 593997738 195905 13717329 404766 0 0 0
4 26 0 26 10 405385 16 0 805753 391537 593968445 195785 13710252 404713 0 0 0
5 27 0 27 10 405437 16 0 805660 391296 593460875 195675 13703206 404748 0 0 0
6 0 0 0 10 192 0 0 130 65930 99976153 32981 2309297 50 0 0 0
## CPU workload distribution:
##
## CPU0 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
1 68.8 0.8 68.0 31.2 25.4 0.4 25.0 74.6
2 58.8 0.0 58.8 41.2 29.0 0.5 28.5 71.0
3 60.0 0.0 60.0 40.0 29.0 0.4 28.5 71.0
4 57.0 0.0 57.0 43.0 26.9 0.3 26.5 73.1
5 57.8 0.0 57.8 42.2 27.6 0.4 27.2 72.4
6 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
## CPU1 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
1 32.8 0.8 32.0 67.2 25.4 0.4 25.0 74.6
2 57.2 2.0 55.2 42.8 29.0 0.5 28.5 71.0
3 56.0 1.8 54.2 44.0 29.0 0.4 28.5 71.0
4 50.6 1.4 49.2 49.4 26.9 0.3 26.5 73.1
5 52.6 1.6 51.0 47.4 27.6 0.4 27.2 72.4
6 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
## CPU2 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
1 0.0 0.0 0.0 100.0 25.4 0.4 25.0 74.6
2 0.0 0.0 0.0 100.0 29.0 0.5 28.5 71.0
3 0.0 0.0 0.0 100.0 29.0 0.4 28.5 71.0
4 0.0 0.0 0.0 100.0 26.9 0.3 26.5 73.1
5 0.0 0.0 0.0 100.0 27.6 0.4 27.2 72.4
6 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
## CPU3 workload (%) Overall CPU workload (%)
# < load user system idle > < load user system idle >
0 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
1 0.0 0.0 0.0 100.0 25.4 0.4 25.0 74.6
2 0.0 0.0 0.0 100.0 29.0 0.5 28.5 71.0
3 0.0 0.0 0.0 100.0 29.0 0.4 28.5 71.0
4 0.0 0.0 0.0 100.0 26.9 0.3 26.5 73.1
5 0.0 0.0 0.0 100.0 27.6 0.4 27.2 72.4
6 0.0 0.0 0.0 100.0 0.0 0.0 0.0 100.0
[mk1 ~/hpcbench/tcp]$
We find that the client machine runs at about 20% CPU usage, mostly CPU0 system time, while the server runs at about 26% CPU usage and distributes the workload across CPU0 and CPU1. We also observe that the client has one more network interface than the server, which carries only a little traffic.
[ Test TCP socket options ]
Several TCP socket options can be set. In the following example, we set the TCP socket buffer size to 500 KBytes, turn on the TCP_NODELAY option, set the MSS to 8 KBytes, and set the packet's TOS bits to Maximize-Throughput mode:
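These options correspond to standard setsockopt() calls; the sketch below is illustrative C under our own assumptions (the helper name is ours, and mapping -q 2 to IPTOS_THROUGHPUT is an inference from the "IPTOS_Maximize_Throughput" line in the output), not hpcbench's source:

    /* Sketch: socket options roughly matching -b 500k -N -M 8k -q 2. */
    #include <netinet/in.h>
    #include <netinet/ip.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    int apply_test_options(int sock)
    {
        int buf = 500 * 1024;           /* requested socket buffer (-b 500k) */
        int on  = 1;                    /* TCP_NODELAY (-N) */
        int mss = 8 * 1024;             /* requested MSS (-M 8k); the kernel may clamp or ignore it */
        int tos = IPTOS_THROUGHPUT;     /* assumed mapping of -q 2: maximize throughput */

        if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &buf, sizeof(buf)) < 0) return -1;
        if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &buf, sizeof(buf)) < 0) return -1;
        if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) < 0) return -1;
        if (setsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss)) < 0) return -1;
        if (setsockopt(sock, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) < 0) return -1;
        return 0;
    }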
[mk1 ~/hpcbench/tcp]$ tcptest -h mk4 -b 500k -N -q 2 -M 8k -r 6 -o output.txt
(1) : 935.895594 Mbps
(2) : 937.464018 Mbps
(3) : 937.376642 Mbps
(4) : 933.568767 Mbps
(5) : 934.880739 Mbps
(6) : 938.093950 Mbps
Test done!
Test-result: "output.txt"
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP communication test -- Tue Jul 13 19:50:21 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: stream(unidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 262142 server: 262142
# Socket Send-buffer (Bytes) -- client: 262142 server: 262142
# Socket blocking option -- client: ON server: ON
# TCP_NODELAY option -- client: ON server: ON
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 1448 server: 1448
# IP TOS type -- client: IPTOS_Maximize_Throughput server: IPTOS_Maximize_Throughput
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 585826304
# Message size (Bytes): 65536
# Iteration: 8939
# Test Repetition: 6
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 935.896 5.01 0.02 2.85 5.01 0.11 2.18
2 937.464 5.00 0.01 3.11 5.00 0.09 2.16
3 937.377 5.00 0.02 2.91 5.00 0.05 2.16
4 933.569 5.02 0.02 2.44 5.02 0.07 2.17
5 934.881 5.01 0.02 2.88 5.01 0.05 2.04
6 938.094 5.00 0.02 2.92 5.00 0.09 2.18
# Throughput statistics : Average 936.4042 Minimum 933.5688 Maximum 938.0940
[mk1 ~/hpcbench/tcp]$
We are testing an otherwise idle cluster, and there is no significant throughput difference between this setting and the default setting. Notice that the socket buffer size is not the value we requested: 256 KBytes is the maximum size we can get. The MSS setting is also ignored since it is larger than the system MTU allows.
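The sizes the kernel actually granted can be read back with getsockopt(); on Linux the request is capped by net.core.wmem_max / net.core.rmem_max, which is why the 500 KByte request came back as roughly 256 KBytes here. A minimal sketch (illustrative C, function name ours):

    /* Sketch: query the effective socket buffer sizes after setsockopt(). */
    #include <stdio.h>
    #include <sys/socket.h>

    void report_effective_buffers(int sock)
    {
        int snd = 0, rcv = 0;
        socklen_t len = sizeof(int);
        getsockopt(sock, SOL_SOCKET, SO_SNDBUF, &snd, &len);
        len = sizeof(int);
        getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcv, &len);
        printf("effective send-buffer: %d bytes, recv-buffer: %d bytes\n", snd, rcv);
    }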
Let's try a smaller MSS and socket buffer:
[mk1 ~/hpcbench/tcp]$ tcptest -h mk4 -b 10k -M 500 -r 6 -o output.txt
(1) : 244.712325 Mbps
(2) : 243.906450 Mbps
(3) : 244.264968 Mbps
(4) : 244.139996 Mbps
(5) : 243.599390 Mbps
(6) : 242.246976 Mbps
Test done!
Test-result: "output.txt"
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP communication test -- Tue Jul 13 20:06:57 2004
# Hosts: mk1 (client) <----> mk4 (server)
# TCP test mode: stream(unidirectional) throughput test
# Socket Recv-buffer (Bytes) -- client: 10240 server: 10240
# Socket Send-buffer (Bytes) -- client: 10240 server: 10240
# Socket blocking option -- client: ON server: ON
# TCP_NODELAY option -- client: OFF server: OFF
# TCP_CORK option -- client: OFF server: OFF
# TCP Maximum-segment-size(MSS) (Bytes) -- client: 500 server: 500
# IP TOS type -- client: Default server: Default
# Data size of each read/write (Bytes): 8192
# Total data size sent of each test (Bytes): 151388160
# Message size (Bytes): 65536
# Iteration: 2310
# Test Repetition: 6
# Network Client C-process C-process Server S-process S-process
# Throughput Elapsed-time User-mode System-mode Elapsed-time User-mode System-mode
# (Mbps) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds) (Seconds)
1 244.712 4.95 0.00 0.35 4.95 0.03 1.74
2 243.906 4.97 0.01 0.38 4.97 0.05 1.66
3 244.265 4.96 0.02 0.31 4.96 0.03 1.75
4 244.140 4.96 0.00 0.31 4.96 0.02 1.55
5 243.599 4.97 0.00 0.29 4.97 0.04 1.41
6 242.247 5.00 0.01 0.29 5.00 0.02 1.71
# Throughput statistics : Average 243.9777 Minimum 242.2470 Maximum 244.7123
[mk1 ~/hpcbench/tcp]$
[ Latency (Roundtrip time) test ]
This test is like a TCP version of "ping". We test the roundtrip time with the default message size (64 Bytes) and with 1 KByte messages:
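One round-trip measurement amounts to time-stamping a send and the receipt of the full echo; a sketch of the assumed structure (illustrative C, not hpcbench's source):

    /* Sketch: measure one TCP round trip in microseconds over a connected socket. */
    #include <sys/time.h>
    #include <unistd.h>

    double measure_rtt_usec(int sock, char *buf, size_t msg_size)
    {
        struct timeval t0, t1;
        gettimeofday(&t0, NULL);

        size_t sent = 0;
        while (sent < msg_size) {         /* send the probe message */
            ssize_t n = write(sock, buf + sent, msg_size - sent);
            if (n <= 0) return -1.0;
            sent += (size_t)n;
        }
        size_t got = 0;
        while (got < msg_size) {          /* wait for the server's echo of the same message */
            ssize_t n = read(sock, buf + got, msg_size - got);
            if (n <= 0) return -1.0;
            got += (size_t)n;
        }

        gettimeofday(&t1, NULL);
        return (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
    }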
[mk1 ~/hpcbench/tcp]$ tcptest -ah mk4
TCP Round Trip Time (1) : 61.551 usec
TCP Round Trip Time (2) : 60.696 usec
TCP Round Trip Time (3) : 60.233 usec
TCP Round Trip Time (4) : 60.439 usec
TCP Round Trip Time (5) : 60.254 usec
TCP Round Trip Time (6) : 60.371 usec
TCP Round Trip Time (7) : 60.774 usec
TCP Round Trip Time (8) : 60.414 usec
TCP Round Trip Time (9) : 60.583 usec
TCP Round Trip Time (10) : 60.359 usec
10 trials with message size 64 Bytes.
TCP RTT min/avg/max = 60.233/60.568/61.551 usec
[mk1 ~/hpcbench/tcp]$ tcptest -h mk4 -A 1k -r 5 -o output.txt
TCP Round Trip Time (1) : 112.682 usec
TCP Round Trip Time (2) : 112.682 usec
TCP Round Trip Time (3) : 112.507 usec
TCP Round Trip Time (4) : 110.707 usec
TCP Round Trip Time (5) : 112.020 usec
5 trials with message size 1024 Bytes.
TCP RTT min/avg/max = 110.707/112.119/112.682 usec
[mk1 ~/hpcbench/tcp]$ cat output.txt
# TCP roundtrip time test Tue Jul 13 20:11:43 2004
# mk1 <--> mk4
# TCP-send-buffer: 16384 TCP-recv-buffer: 87380
# Message-size: 1024 Iteration: 1024
TCP Round Trip Time (1) : 112.682 usec
TCP Round Trip Time (2) : 112.682 usec
TCP Round Trip Time (3) : 112.507 usec
TCP Round Trip Time (4) : 110.707 usec
TCP Round Trip Time (5) : 112.020 usec
5 trials with message size 1024 Bytes.
TCP RTT min/avg/max = 110.707/112.119/112.682 usec
[mk1 ~/hpcbench/tcp]$
[ Plot data ]
If the write option (-o) and the plot option (-P) are both defined, a plot configuration file named "output.plot" will be created. Use gnuplot to plot the data or to create PostScript files of the plots: