Hello Devaraj,
> By the way, one thing that is unclear to me is how the pipeline
> threads interact with the read selector threads. So, for example, if I do
> SelectorThread.setMaxThreads(100) &
> SelectorThread.setSelectorReadThreadsCount(10),
> 1) would it mean that all the 10 selector threads dump data into that
> one pipeline (with 100 threads)? Is that configurable (I guess so)?
>
Selector threads will share the pipeline threads.
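For example, a minimal sketch of that configuration (setMaxThreads and setSelectorReadThreadsCount are the methods you quoted; the rest, including the MyAdapter class, are my assumptions about the 1.5 API):

    import com.sun.grizzly.http.SelectorThread;

    public class SharedPipelineDemo {
        public static void main(String[] args) throws Exception {
            SelectorThread st = new SelectorThread();
            st.setPort(9999);
            // One pipeline (worker thread pool) of 100 threads...
            st.setMaxThreads(100);
            // ...shared by all 10 read selector threads: each selector
            // dispatches the events it reads into that same pipeline.
            st.setSelectorReadThreadsCount(10);
            st.setAdapter(new MyAdapter()); // MyAdapter: your Adapter implementation
            st.initEndpoint();
            st.startEndpoint();
        }
    }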
> 2) For a multiple CPU machine (like a 4 CPU one), do I really require
> multiple selector read threads if all I want to read is HTTP requests
> (simple GET requests with some headers).
> Assume the system is heavily loaded (esp. from the data IO point of view;
> the server writes a lot, few kilobytes to many megabytes, of http response
> bytes to the clients) and has to support ~1000 clients concurrently.
>
IMHO selector read threads will help you: basically they are what
handles the accept/read events, so more of them means more concurrent
users can be served.
> Also, a related question is since the data that the server writes to each
> client could vary in kilobytes to megabytes range, do you think that we
> should look at a different strategy for handling clients (depending on how
> long it would take to serve a client assuming that we have a rough idea of
> the serving time). I am coming from the angle that if let's say we have 100
> threads (in the pool) serving 100 clients, and if for some reason we are
> taking a long time to process those client requests (or writing responses),
> then although we will accept requests from other clients, we can't service
> them and those unfortunate clients could potentially be starved for a long
> time.
>
Here we possibly come (again :) ) to "async writes", which are not
fully implemented in Grizzly, and IMHO this is a place for improvement.
But here is what I would think about first: if it really takes a long
time to produce a response, maybe that's a sign to think about better
application scalability ;)
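For reference, the usual plain-NIO pattern behind such async writes looks
roughly like this (a generic sketch of the technique, not Grizzly's API):
write what you can, and if the socket buffer fills up, park the remainder
and register OP_WRITE so the selector finishes the write later instead of
a worker thread blocking on it.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.SocketChannel;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    // One instance per connection. A worker thread calls write();
    // the selector thread calls onWritable() when the channel is ready.
    public class AsyncWriter {
        private final Queue<ByteBuffer> pending = new ConcurrentLinkedQueue<ByteBuffer>();

        public void write(SelectionKey key, ByteBuffer buf) throws IOException {
            SocketChannel ch = (SocketChannel) key.channel();
            ch.write(buf);
            if (buf.hasRemaining()) {
                // Socket buffer full: park the leftover bytes and ask the
                // selector to call us back when the channel is writable.
                pending.add(buf);
                key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
                key.selector().wakeup(); // interestOps changed off the selector thread
            }
        }

        public void onWritable(SelectionKey key) throws IOException {
            SocketChannel ch = (SocketChannel) key.channel();
            ByteBuffer buf;
            while ((buf = pending.peek()) != null) {
                ch.write(buf);
                if (buf.hasRemaining()) {
                    return; // still full; keep OP_WRITE registered
                }
                pending.remove();
            }
            // Everything flushed: stop asking for writability events.
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        }
    }

With something like this, a slow client costs a queued buffer rather than
a blocked pipeline thread.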
Thanks.
WBR,
Alexey.
> -----Original Message-----
> From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
> Sent: Tuesday, June 12, 2007 11:49 PM
> To: users_at_grizzly.dev.java.net
> Cc: 'Owen O'Malley'; 'Sameer Paranjpye'; 'Tahir Hashmi'
> Subject: Re: Grizzly in Hadoop
>
> Hi Devaraj,
>
> Devaraj Das wrote:
>
>> Hi Jeanfrancois,
>> Thanks for filing the bug! I would request you to fix this ASAP since
>> it's critical for us.
>>
>
> Sure. We should have something this week if all goes well. Now if you can't
> wait, you can always fall back to Grizzly 1.0. Not ideal, but a possible
> workaround :-)
>
> Thanks!
>
> -- Jeanfrancois
>
>
>> Regards,
>> Devaraj.
>>
>> -----Original Message-----
>> From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
>> Sent: Tuesday, June 12, 2007 10:14 PM
>> To: users_at_grizzly.dev.java.net
>> Cc: 'Owen O'Malley'; 'Sameer Paranjpye'
>> Subject: Re: Grizzly in Hadoop
>>
>> Hi,
>>
>> Jeanfrancois Arcand wrote:
>>
>>> Hi Devaraj,
>>>
>>> Devaraj Das wrote:
>>>
>>>> Hi Jeanfrancois,
>> I will get back to you in detail in some time from now. But for now,
>> I just want to update you that setting the keep-alive timeout to 0
>> seems to have improved performance. Thanks for that tip. Now, do you
>> think that if we start using the persistent-connections feature
>> (transferring a batch of files at a time), it would help us
>> significantly?
>>>>
>>> Yes, it would, as you will save the network overhead of opening a
>>> connection.
>>>
>>>> Also, will using multiple selectorReadThreads help the performance
>>>> (since we have multiple CPUs on our machines)?
>>>>
>>> Right now it is not supported with the http module (I just realized
>>> I didn't port it yet). That's why you aren't seeing any performance
>>> difference.
>>>
>> I've filed:
>>
>> https://grizzly.dev.java.net/issues/show_bug.cgi?id=4
>>
>> to track the issue.
>>
>> -- Jeanfrancois
>>
>>
>>> Talking about your benchmark, it would be interesting to understand
>>> how you configured Jetty. Your connections seem to be data-intensive
>>> as opposed to connection-intensive. NIO is good at handling a large
>>> number of connections without spending idle threads, but if the
>>> threads are not idle anyway (as in this case), the only place to make
>>> this faster is to reduce the number of bcopy/memcpy calls. I'm just
>>> thinking out loud here :-)
>>>
>>>
>>>
>>>> I will send the source code of the adapter shortly...
>>>> Thanks,
>>>>
>>> Thanks!
>>>
>>> -- Jeanfrancois
>>>
>>>
>>>> Devaraj.
>>>>
>>>> -----Original Message-----
>>>> From: Jeanfrancois.Arcand_at_Sun.COM
>>>> [mailto:Jeanfrancois.Arcand_at_Sun.COM]
>>>> Sent: Tuesday, June 12, 2007 6:25 PM
>>>> To: users_at_grizzly.dev.java.net; Scott Oaks
>>>> Subject: Re: Grizzly in Hadoop
>>>>
>>>> Hi,
>>>>
>>>> Devaraj Das wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are considering using Grizzly (1.5) in Hadoop (an open source
>>>>> framework that has the MapReduce and Distributed File System
>>>>> implementations). The main reason for using it is to optimize a
>>>>> framework phase called "shuffle". In this phase we move lots of
>>>>> data across the network.
>>>>
>>>> Cool :-) I know Hadoop, as your Web 2.0 stack is also considering
>>>> using it :-)
>>>>
>>>>
>>>>> We are currently using HTTP for moving the data (actually files)
>>>>> and we use Jetty5. Now we are thinking of moving to Grizzly (to
>>>>> have NIO and all its niceness). But initial experiments with our
>>>>> benchmark showed that with Grizzly the performance of the shuffle
>>>>> phase is nearly the same as we have with Jetty5. This is not what
>>>>> we initially expected, and hence we would like to get feedback on
>>>>> where we might be going wrong.
>>>>>
>>>>> Hadoop is designed to run on large clusters of 100s of nodes
>>>>> (currently it can run stably/reliably on a 1K-node cluster). From
>>>>> the Grizzly point of view, what needs to be known is that each node
>>>>> has an HTTP server. Both Jetty5 and Grizzly provide the ability to
>>>>> have multiple handlers to service the incoming requests.
>>>>>
>>>>> There are 2 clients on each node, and each client has a
>>>>> configurable number of fetcher threads. The fetcher code is written
>>>>> using the java.net.URLConnection API.
>>>>> Every node has both the server and the clients. The threads
>>>>> basically hit the HTTP server asking for specific files. They all
>>>>> do this at once (with some randomness in the order of hosts, maybe).
>>>>>
>>>>> The benchmark that I tested with is a sort over ~5TB of data on a
>>>>> cluster of 500 nodes. On the hardware side, I used a cluster that
>>>>> has 4 dual-core processors in each machine. The machines are
>>>>> partitioned into racks with gigabit Ethernet within a rack and
>>>>> 100Mbps across racks. There are roughly 78000 independent files,
>>>>> each of size ~60KB, spread across these 500 nodes that the client
>>>>> pulls (and again we have two such clients per node).
>>>>>
>>>>> So you can imagine you have a massive all-to-all communication
>>>>> happening. The configuration for the server and client is as follows:
>>>>>
>>>>> Grizzly configuration for port 9999
>>>>> maxThreads: 100
>>>>> minThreads: 1
>>>>> ByteBuffer size: 8192
>>>>> useDirectByteBuffer: false
>>>>> useByteBufferView: false
>>>>> maxHttpHeaderSize: 8192
>>>>> maxKeepAliveRequests: 256
>>>>> keepAliveTimeoutInSeconds: 10
>>>>> Static File Cache enabled: true
>>>>> Stream Algorithm: com.sun.grizzly.http.algorithms.NoParsingAlgorithm
>>>>> Pipeline: com.sun.grizzly.http.LinkedListPipeline
>>>>> Round Robin Selector Algorithm enabled: false
>>>>> Round Robin Selector pool size: 0
>>>>> recycleTasks: true
>>>>> Asynchronous Request Processing enabled: false
>>>>>
>>>>> I also tried some configs with multiple selectorReadThreads, but it
>>>>> didn't make much difference.
>>>>>
>>>> keepAliveTimeoutInSeconds: 10 seems a little low to me... what is
>>>> the reason for such a low number? To close idle connections faster?
>>>>
>>>>
>>>>
>>>>> The client has 30 fetcher threads, and the way it is designed is
>>>>> that only one fetch from any given host happens at any point in
>>>>> time. So if a server host, h1, has 'n' files that we should pull,
>>>>> we do that one at a time (as opposed to multiple threads hitting
>>>>> that server to fetch multiple files in parallel).
>>>>>
>>>>> Also, we don't use the HTTP/1.1 features of persistent connections
>>>>> or pipelining. We fetch exactly one file and then close the
>>>>> connection to the server.
>>>>>
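(Side note: a minimal sketch of the single-fetch client described here,
using java.net.URLConnection; the host, port, and buffer size are made up
for illustration:)

    import java.io.ByteArrayOutputStream;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class SingleFetch {
        // Fetch exactly one file over a fresh connection, then close it,
        // mirroring the client behaviour described above.
        public static byte[] fetch(String host, String file) throws Exception {
            URL url = new URL("http://" + host + ":9999/" + file);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // Opt out of HTTP/1.1 keep-alive: one file per connection.
            conn.setRequestProperty("Connection", "close");
            InputStream in = conn.getInputStream();
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } finally {
                in.close();
                conn.disconnect(); // make sure the socket really goes away
            }
        }
    }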
>>>> That answers my previous question :-) I would recommend setting
>>>> keepAliveTimeoutInSeconds=0 then, as I'm sure the performance will
>>>> improve (no call to the keep-alive subsystem).
>>>>
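(In code, that recommendation would be a one-liner; the setter name here
is my assumption, matching the "keepAliveTimeoutInSeconds" property in
the configuration dump above:)

    // Assumed setter name; 0 disables the keep-alive subsystem entirely.
    selectorThread.setKeepAliveTimeoutInSeconds(0);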
>>>>
>>>>> With the above setup, the performance I see is not different from
>>>>> what I see with Jetty5.
>>>>>
>>>> Could it be the benchmark itself that is not able to generate more load?
>>>>
>>>>
>>>>> I see a lot of read timeouts on the client side (and we have a
>>>>> backoff (for the server host) in the client implementation whenever
>>>>> we fail to fetch a file). I also saw some exceptions of the
>>>>> following form on the server:
>>>>>
>>>> You seem to be hitting the epoll problem on Linux. I know there is
>>>> a way to avoid using epoll (a property). I will ping the NIO team
>>>> and let you know.
>>>>
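(I believe the property in question is the JDK's selector-provider
switch; a sketch, assuming sun.nio.ch.PollSelectorProvider is available
on your JDK:)

    // Assumption: force the poll-based Selector instead of epoll. Must be
    // set before the first Selector is created, e.g. via
    // -Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider
    System.setProperty("java.nio.channels.spi.SelectorProvider",
                       "sun.nio.ch.PollSelectorProvider");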
>>>> Also, what exactly is your Adapter implementation doing? Can you
>>>> share the code? If I can have access to your setup, I would like to
>>>> see if using Grizzly 1.0.15 makes a difference (just to make sure we
>>>> don't have a bug in 1.5... as far as I can tell, 1.5 is as fast as
>>>> 1.0 on my benchmark).
>>>>
>>>> Thanks,
>>>>
>>>> --Jeanfrancois
>>>>
>>>>
>>>>
>>>>> Jun 11, 2007 5:04:51 PM com.sun.grizzly.Controller doSelect
>>>>> SEVERE: doSelect exception
>>>>> java.io.IOException: Operation not permitted
>>>>>         at sun.nio.ch.EPollArrayWrapper.epollCtl(Native Method)
>>>>>         at sun.nio.ch.EPollArrayWrapper.updateRegistrations(EPollArrayWrapper.java:202)
>>>>>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:183)
>>>>>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>>>>>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>>>>>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>>>>>         at com.sun.grizzly.TCPSelectorHandler.select(TCPSelectorHandler.java:277)
>>>>>         at com.sun.grizzly.Controller.doSelect(Controller.java:218)
>>>>>         at com.sun.grizzly.Controller.start(Controller.java:451)
>>>>>         at com.sun.grizzly.http.SelectorThread.startListener(SelectorThread.java:1158)
>>>>>         at com.sun.grizzly.http.SelectorThread.startEndpoint(SelectorThread.java:1121)
>>>>>         at com.sun.grizzly.http.SelectorThread.run(SelectorThread.java:1099)
>>>>>
>>>>> Are we missing something in the configuration, or is it something else?
>>>>>
>>>>> Thanks for the help.
>>>>>
>>>>> Regards,
>>>>> Devaraj.
>>>>>