dev@grizzly.java.net

Request for grizzly-sendfile benchmarking

From: Igor Minar <iiminar_at_gmail.com>
Date: Wed, 13 Jan 2010 21:06:00 -0800

Hi there,

As we discussed during the meeting I'd love to see results from some
3rd party benchmarks, either from the grizzly team members or
glassfish performance folks.

Setting up grizzly-sendfile and comparing it to grizzly is very simple:

1) fetch the latest snapshot of grizzly-sendfile-server from
   https://kenai.com/svn/grizzly-sendfile~maven/com/igorminar/grizzly-sendfile/grizzly-sendfile-server/0.4-SNAPSHOT/grizzly-sendfile-server-0.4-SNAPSHOT-full.jar

2) starting the server is as simple as

java -server -jar grizzly-sendfile-server-0.4-SNAPSHOT-full.jar /path/to/some/files

this will cause grizzly-sendfile to auto-configure some of the
options; it might be a better idea to set the most important options
explicitly, as follows.

3) start the server in the default mode with explicit config and test it

SimpleBlockingAlgorithm config (behaves similarly to grizzly: it
doesn't scale well when the number of concurrent downloads exceeds the
thread count):

java -server -jar grizzly-sendfile-server-0.4-SNAPSHOT-full.jar \
        --port 8081 \
        --mode auto-sendfile \
        --algorithm com.igorminar.grizzlysendfile.algorithm.SimpleBlockingAlgorithm \
        --thread-count 50 \
        --buffer-size 81000 \
        /path/to/some/files

You might notice that I use a large buffer-size; this is because I
found that I get the best performance out of grizzly-sendfile when my
buffer is close to the socket send buffer size (SO_SNDBUF). JFA says
that 8k buffers are the best, but IMO he's wrong. I think that these
days servers have lots of RAM available, so we should use it to get
the best perf.
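
If you want to see what SO_SNDBUF defaults to on your box before
picking a --buffer-size, a quick way is to ask an unconnected socket
(a minimal sketch; the class name is mine):

```java
import java.net.Socket;

public class SndBufCheck {
    public static void main(String[] args) throws Exception {
        // An unconnected socket reports the platform's default SO_SNDBUF;
        // a --buffer-size close to (or above) this value is a reasonable
        // starting point for tuning.
        try (Socket s = new Socket()) {
            System.out.println("default SO_SNDBUF: " + s.getSendBufferSize() + " bytes");
        }
    }
}
```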


EqualBlockingAlgorithm config (uses blocking I/O in combination with
connection multiplexing; great when there are many long or slow
downloads being processed concurrently, though there is a perf penalty):

java -server -jar grizzly-sendfile-server-0.4-SNAPSHOT-full.jar \
        --port 8082 \
        --mode auto-sendfile \
        --algorithm com.igorminar.grizzlysendfile.algorithm.EqualBlockingAlgorithm \
        --thread-count 8 \
        --concur-download-max 1000 \
        --buffer-size 81000 \
        /path/to/some/files

* the ideal value for --thread-count is usually 2x to 4x the number of
available CPUs
* --concur-download-max is a concurrency limit independent of
--thread-count; if there are more concurrent connections than this
number, the remaining connections will be queued
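
To illustrate the queueing idea behind a cap like --concur-download-max
(this is just a sketch of the pattern, not grizzly-sendfile's actual
implementation): a counting semaphore lets at most N tasks proceed at
once regardless of how many worker threads exist, and the rest wait in
line:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrencyCap {
    public static void main(String[] args) throws Exception {
        int concurDownloadMax = 3;                  // analogous to --concur-download-max
        Semaphore slots = new Semaphore(concurDownloadMax);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(8);  // analogous to --thread-count
        for (int i = 0; i < 20; i++) {
            pool.submit(() -> {
                try {
                    slots.acquire();                // excess "connections" queue up here
                    int now = active.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(10);               // simulate serving part of a download
                    active.decrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    slots.release();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // peak can never exceed concurDownloadMax, even with 8 worker threads
        System.out.println("peak concurrency: " + peak.get());
    }
}
```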



4) start the server in the grizzly mode with comparable settings and
test it

java -server -jar grizzly-sendfile-server-0.4-SNAPSHOT-full.jar \
        --port 8083 \
        --mode grizzly \
        --thread-count 50 \
        --buffer-size 81000 \
        /path/to/some/files

it's also possible to run grizzly in the async mode:

java -server -jar grizzly-sendfile-server-0.4-SNAPSHOT-full.jar \
        --port 8084 \
        --mode grizzly-async \
        --thread-count 50 \
        --buffer-size 81000 \
        /path/to/some/files


------------------------------------------------------------------------------------------------------

grizzly-sendfile should perform and scale much better, especially when
tested with larger files (1MB+). I ran a brief test just now with
faban and ab and here are the results I got for different
configurations (faban first, ab second):

grizzly-sendfile SimpleBlockingAlgorithm:

ops/sec: 518.013
% errors: 0.015119589519230203
avg. time: 0.193
max time: 10.086
90th %: 0.270

Time taken for tests: 565.584 seconds
Requests per second: 176.81 [#/sec] (mean)
Time per request: 565.584 [ms] (mean)
Transfer rate: 181090.03 [Kbytes/sec] received



grizzly-sendfile EqualBlockingAlgorithm:

ops/sec: 509.727
% errors: 0.0
avg. time: 0.196
max time: 5.304
90th %: 0.240

Time taken for tests: 654.472 seconds
Requests per second: 152.79 [#/sec] (mean)
Time per request: 654.472 [ms] (mean)
Transfer rate: 156494.90 [Kbytes/sec] received



grizzly:

ops/sec: 96.738
% errors: 0.0
avg. time: 1.011
max time: 6.471
90th %: 1.600

Time taken for tests: 3242.309 seconds
Requests per second: 30.84 [#/sec] (mean)
Time per request: 3242.309 [ms] (mean)
Transfer rate: 31586.20 [Kbytes/sec] received



grizzly-async:

ops/sec: 94.537
% errors: 0.0
avg. time: 1.035
max time: 6.708
90th %: 1.650

Time taken for tests: 2950.196 seconds
Requests per second: 33.90 [#/sec] (mean)
Time per request: 2950.196 [ms] (mean)
Transfer rate: 34713.70 [Kbytes/sec] received


faban command used:

./bin/fhb -r 60/600/10 -c 100 -s http://localhost:8081/1m.file
./bin/fhb -r 60/600/10 -c 100 -s http://localhost:8082/1m.file
./bin/fhb -r 60/600/10 -c 100 -s http://localhost:8083/1m.file
./bin/fhb -r 60/600/10 -c 100 -s http://localhost:8084/1m.file

ab command used:

ab -c 100 -n 100000 localhost:8081/1m.file
ab -c 100 -n 100000 localhost:8082/1m.file
ab -c 100 -n 100000 localhost:8083/1m.file
ab -c 100 -n 100000 localhost:8084/1m.file


I know, I know, a +435% improvement (or +421% when measured via ab)
looks too good to be true, but as far as I can tell, I'm not doing
anything incorrectly. The only questionable part of my benchmarks is
that I ran them on a single box over the loopback network interface,
which means that there was plenty of network bandwidth available. But
as far as I know that only proves that grizzly-sendfile has lower
overhead than grizzly and can utilize the bandwidth better.
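
For the record, those percentages follow from the throughput numbers
above; the ab figure lines up when the grizzly-async run is taken as
the baseline. A sketch to check the arithmetic (class name is mine):

```java
public class ImprovementCheck {
    public static void main(String[] args) {
        // Throughput numbers copied from the results above.
        double fabanSendfile  = 518.013; // SimpleBlockingAlgorithm, ops/sec (faban)
        double fabanGrizzly   =  96.738; // grizzly, ops/sec (faban)
        double abSendfile     = 176.81;  // SimpleBlockingAlgorithm, req/sec (ab)
        double abGrizzlyAsync =  33.90;  // grizzly-async, req/sec (ab)

        double fabanGain = (fabanSendfile - fabanGrizzly) / fabanGrizzly * 100;
        double abGain    = (abSendfile - abGrizzlyAsync) / abGrizzlyAsync * 100;

        // prints "faban: +435.5%  ab: +421.6%"
        System.out.printf("faban: +%.1f%%  ab: +%.1f%%%n", fabanGain, abGain);
    }
}
```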

There is a possibility that I'm misconfiguring grizzly in my tests
somehow, but I don't think that's the case. Anyway, the sources are
open [1] so you can check out how the "grizzly" mode is being
configured, or even how grizzly-sendfile works.

Even if the real-world difference were not 400% or more, I think it
would still be significant. Having said that, I know about many
inefficiencies in grizzly-sendfile that, once fixed, could improve the
performance even further (support for keepalive being at the top of
the list), so there is still a lot of room for improvement.

If anyone is willing to spend some time doing benchmarks with
grizzly-sendfile, I'd really appreciate that. If there are any
questions about configuration or other stuff, just let me know. (Oh
yeah, btw: the configuration can be modified on the fly via JMX; that,
combined with the several performance stats exposed via JMX, makes it
really easy to tune grizzly-sendfile under load and see the effects
immediately.)

cheers,
Igor


[1] http://kenai.com/projects/grizzly-sendfile/sources/main/show