RE: Grizzly in Hadoop

From: Devaraj Das <ddas_at_yahoo-inc.com>
Date: Tue, 12 Jun 2007 19:41:21 +0530

Thanks for the confirmation! After moving to JDK-1.6u1, the problem seems to
have disappeared.

-----Original Message-----
From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
Sent: Tuesday, June 12, 2007 6:27 PM
To: users_at_grizzly.dev.java.net
Subject: Re: Grizzly in Hadoop

Devaraj Das wrote:
> I am using jdk1.6.0 and I think I am hitting the bug
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6481709 ("epoll
> based Selector throws java.io.IOException: Operation not permitted
> during load")

Yes you are hitting that bug. This has been fixed in ur1 (confirmed by the
NIO lead just now :-))

> and that's why I am seeing the NIO exception as documented in my
> earlier mail .. So maybe it can be treated as not being a Grizzly
> problem but could it have effect on the network performance?

For sure it has :-)

Does this exception mean that we
> will fail to fetch from the server that just threw the exception (and
> thereby affect the network performance since our client implementation
> will backoff)?

Some connection will be closed. Can you try with 6.0 ur1

Thanks!

-- Jeanfrancois

>
> -----Original Message-----
> From: Devaraj Das [mailto:ddas_at_yahoo-inc.com]
> Sent: Tuesday, June 12, 2007 12:02 PM
> To: 'users_at_grizzly.dev.java.net'
> Subject: RE: Grizzly in Hadoop
>
> Forgot to mention one detail .. the file serving happens through the
> service method in an Adapter that we implemented. It happens this way
> since we parse a few arguments in the http request url to figure out
> which file to serve, etc.
>
> -----Original Message-----
> From: Devaraj Das [mailto:ddas_at_yahoo-inc.com]
> Sent: Tuesday, June 12, 2007 11:57 AM
> To: 'users_at_grizzly.dev.java.net'
> Subject: Grizzly in Hadoop
>
> Hi,
>
> We are considering using Grizzly (1.5) in Hadoop (an open source
> framework that has the MapReduce and Distributed File System
> implementations). The main reason for using it is to optimize a framework
phase called "shuffle".
> In this phase we move lots of data across the network.
>
> We are currently using HTTP for moving the data (actually files) and
> we use Jetty5. Now we are thinking of moving to Grizzly (to have NIO
> and all its niceness). But initial experiments with our benchmark
> showed that with Grizzly the performance of the shuffle phase is
> nearly the same as we have with Jetty5. This is not what we initially
> expected and hence would like to get feedback on where we might be going
wrong.
>
> Hadoop is designed to run on large clusters of 100s of nodes
> (currently it can run stable/reliably in a 1K node cluster). From the
> Grizzly point of view, what needs to be known is that each node has a
> HTTP server. Both
> Jetty5 and Grizzly provides the ability to have multiple handlers to
> service the incoming requests.
>
> There are 2 clients on each node, and each client has a configurable
> number of fetcher threads. The fetcher code is written using the
> java.net.URLConnection API.
> Every node has both the server and the clients. The threads basically
> hit the HTTP server asking for specific files. They all do this at
> once (with some randomness in the order of hosts, maybe).
>
> The benchmark that I tested with is a sort for ~5TB of data with a
> cluster of 500 nodes. On the hardware side, I used a cluster that has
> 4 dualcore processors in each machine. The machines are partitioned
> into racks with a gigabit ethernet within the rack and 100Mbps across
> the racks. There are roughly 78000 independent files spread across
> these 500 nodes each of size ~60KB that the client pulls (and again we
have two such clients per node).
> So you can imagine you have a massive all-all communication happening.
> The configuration for the server and client is as follows:
> Grizzly configuration for port 9999
> maxThreads: 100
> minThreads: 1
> ByteBuffer size: 8192
> useDirectByteBuffer: false
> useByteBufferView: false
> maxHttpHeaderSize: 8192
> maxKeepAliveRequests: 256
> keepAliveTimeoutInSeconds: 10
> Static File Cache enabled: true
> Stream Algorithm :
> com.sun.grizzly.http.algorithms.NoParsingAlgorithm
> Pipeline : com.sun.grizzly.http.LinkedListPipeline
> Round Robin Selector Algorithm enabled: false
> Round Robin Selector pool size: 0
> recycleTasks: true
> Asynchronous Request Processing enabled: false I also tried
> some configs with multiple selectorReadThreads but didn't make much
difference.
>
> The client has 30 fetcher threads and the way it is designed is that
> only one fetch from any given host would happen at any point of time.
> So if a server host, h1, has 'n' files that we should pull, we do that
> one at a time (as opposed to multiple threads hitting that server to
> fetch multiple files in parallel).
>
> Also, we don't use the features of HTTP1.1 persistent connections or
> pipelining. We fetch exactly one file and close the connection to the
> server.
>
> With the above setup, the performance I see is not different from what
> I see with Jetty5. I see a lot of Read timeouts on the client side
> (and we have a backoff (for the server host) on the client
> implementation whenever we fail to fetch a file). I also saw some
exceptions of the form on the server:
>
> Jun 11, 2007 5:04:51 PM com.sun.grizzly.Controller doSelect
> SEVERE: doSelect exception
> java.io.IOException: Operation not permitted
> at sun.nio.ch.EPollArrayWrapper.epollCtl(Native Method)
> at
>
sun.nio.ch.EPollArrayWrapper.updateRegistrations(EPollArrayWrapper.java:202)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:183)
> at
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> at
> com.sun.grizzly.TCPSelectorHandler.select(TCPSelectorHandler.java:277)
> at com.sun.grizzly.Controller.doSelect(Controller.java:218)
> at com.sun.grizzly.Controller.start(Controller.java:451)
> at
>
com.sun.grizzly.http.SelectorThread.startListener(SelectorThread.java:1158)
> at
>
com.sun.grizzly.http.SelectorThread.startEndpoint(SelectorThread.java:1121)
> at
> com.sun.grizzly.http.SelectorThread.run(SelectorThread.java:1099)
>
> Are we missing something in the configuration or something else?
>
> Thanks for the help.
>
> Regards,
> Devaraj.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
> For additional commands, e-mail: users-help_at_grizzly.dev.java.net
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
For additional commands, e-mail: users-help_at_grizzly.dev.java.net