Hi Michael
Hope you are doing well.
Thanks for jumping in to help. Much appreciated.
Hi Mathias
I very much appreciate your trying out these experiments. Needless to
say, we have never put Shoal through messages of this size.
On one level, I am happy to see data like this so feel free to
contribute data and extracted tests if possible to either FishFarm or to
Shoal or both. It would help us a lot to have such tests in the
community repository so it can become part of a smoke test.
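
For anyone extracting such a test, here is a minimal sketch of what a size sweep could look like. It is not the real RasterAdapter (which has not been posted here): RasterAdapterSketch and SerializationSweep are hypothetical names, and rebuilding the raster as a banded int raster is an assumption made for illustration. The sweep makes no Shoal or FishFarm calls. It only checks local serialization round trips, so it can serve as a baseline before the same payloads are sent over the wire.

```java
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical stand-in for the real RasterAdapter, which is not shown in this thread.
class RasterAdapterSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient Raster raster; // Raster itself is not Serializable

    RasterAdapterSketch(Raster raster) { this.raster = raster; }

    Raster getRaster() { return raster; }

    // Writes dimensions, band count and the raw pixel samples.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        int w = raster.getWidth();
        int h = raster.getHeight();
        out.writeInt(w);
        out.writeInt(h);
        out.writeInt(raster.getNumBands());
        out.writeObject(raster.getPixels(raster.getMinX(), raster.getMinY(), w, h, (int[]) null));
    }

    // Rebuilds the raster from the values written above.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int w = in.readInt();
        int h = in.readInt();
        int bands = in.readInt();
        int[] pixels = (int[]) in.readObject();
        // Assumption: a banded int raster is good enough for a test payload.
        WritableRaster copy = Raster.createBandedRaster(DataBuffer.TYPE_INT, w, h, bands, null);
        copy.setPixels(0, 0, w, h, pixels);
        raster = copy;
    }
}

class SerializationSweep {
    /** Serializes and deserializes one adapter; returns the serialized size in bytes. */
    static int roundTrip(int width, int height) {
        WritableRaster source = Raster.createBandedRaster(DataBuffer.TYPE_INT, width, height, 1, null);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                source.setSample(x, y, 0, x * 31 + y);
            }
        }
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(new RasterAdapterSketch(source));
            }
            byte[] bytes = buffer.toByteArray();
            RasterAdapterSketch copy;
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                copy = (RasterAdapterSketch) in.readObject();
            }
            int[] before = source.getPixels(0, 0, width, height, (int[]) null);
            int[] after = copy.getRaster().getPixels(0, 0, width, height, (int[]) null);
            if (!Arrays.equals(before, after)) {
                throw new IllegalStateException("pixel data changed at " + bytes.length + " bytes");
            }
            return bytes.length;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // 4 bytes per pixel: sides 50..700 give payloads from about 10 KB to about 1.9 MB.
        for (int side = 50; side <= 700; side += 50) {
            System.out.println(side + "x" + side + " -> " + roundTrip(side, side) + " bytes");
        }
    }
}
```

If every size passes locally, the same loop could hand each adapter to the messaging layer instead of a byte array, which would help separate a serialization problem from a transport problem.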
As Michael Bien says, using a tool like VisualVM might provide a lot of
insight. We will await Michael's assessment of the thread dumps. If you
can share them here we can also take a look.
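
As an alternative to VisualVM, a dump can also be taken from inside the JVM with the standard java.lang.management API (or from outside with the JDK's jstack tool). A minimal sketch, assuming it is called from within the master or worker process; the class name ThreadDumpSketch is made up for illustration:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {
    /** Returns a textual dump of all live threads, preceded by a deadlock count. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // Both find* methods return null when no deadlock is detected.
        long[] deadlocked = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();

        StringBuilder sb = new StringBuilder();
        sb.append("Deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length)
          .append('\n');

        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

A non-zero count on the first line would point at a deadlock; otherwise the stack traces show where each thread is waiting at the moment the messages stop arriving.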
Cheers
Shreedhar
Michael Bien wrote:
> Hello Mathias,
>
> great to hear that you have still not given up ;-)
>
> Well, it is not entirely impossible that it's FishFarm's fault :) (that's
> why I filed a bug against ff). The problem is just that I can't
> reproduce it here; all my prior load tests with small messages were
> pretty reliable (5 out of 40k tasks were lost in action but recovered
> by FishFarm). The other problem is that I can't run the load tests
> again because I can't access the test cluster (holidays at my
> university...).
>
> Could you, the next time it happens, create a thread dump of the master
> node and, just to be sure, of the worker too (VisualVM is a great tool
> for this purpose) and send them to me? Just to rule out deadlocks
> somewhere in FishFarm.
>
> Also, if it is possible in some way to extract your /RasterAdapter/ into
> a test case, this would be great.
>
> regards,
>
> Michael
>
> btw I am the FishFarm author
>
> Mathias Chouet wrote:
>> Hi.
>>
>> Thank you for your answer!
>>
>> Following your advice, I've made a little iterative testing program,
>> and the results are strange. First I tried to send repeatedly large
>> /int/ arrays, up to 100MB (yes, MB!) and it worked perfectly. Out of
>> 50, not even one was lost! However, when I tried to perform the same
>> iterative test using the objects I'm working with in real conditions,
>> it failed.
>>
>> The application I'm working on exchanges custom objects (called
>> /RasterAdapter/) designed to encapsulate /java.awt.image.Raster/ and
>> make them /Serializable/. Thus, a /RasterAdapter/ uses custom
>> /writeObject()/ and /readObject()/ methods (which work, I'm
>> absolutely sure of this). The fact is that when I repeatedly send such
>> /RasterAdapter/s, it works up to about 1100 KB (it's hard for
>> me to control the size of these objects precisely), and when the
>> objects exceed this weight, they get lost.
>>
>> By "they get lost", I mean they don't reach their destination. Admittedly
>> they pass through FishFarm, not directly through Shoal, and the issue may
>> be totally independent of Shoal. But as FishFarm is supposed to
>> just serialize the objects, send them as Shoal messages and
>> deserialize them on the other side, I'm surprised... it works with
>> /int/ arrays but with my objects the deserialization is never
>> performed, as if the message had been lost or corrupted... I've
>> already got in touch with FishFarm's author, who told me he was using
>> Shoal with the default configuration.
>>
>> The output shows no particular message, and no log file is produced.
>> I'm embarrassed that I couldn't figure out how to enable logging in Shoal
>> (or JXTA); it's probably very simple but I didn't find anything...
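
On the logging question, a minimal sketch, assuming Shoal and JXTA log through java.util.logging and do not pin explicit levels on their own loggers (if they do, those levels take precedence over the root level set here). VerboseLoggingSketch is a made-up name, not part of either library:

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

class VerboseLoggingSketch {
    /** Lowers the threshold of the root logger and of every handler attached to it. */
    static void enable(Level level) {
        Logger root = Logger.getLogger(""); // "" is the root logger
        root.setLevel(level);

        boolean hasConsole = false;
        for (Handler handler : root.getHandlers()) {
            // Handlers filter too: a ConsoleHandler defaults to INFO.
            handler.setLevel(level);
            if (handler instanceof ConsoleHandler) {
                hasConsole = true;
            }
        }
        if (!hasConsole) {
            ConsoleHandler console = new ConsoleHandler();
            console.setLevel(level);
            root.addHandler(console);
        }
    }

    public static void main(String[] args) {
        enable(Level.FINE);
        Logger.getLogger("example").fine("fine-level messages are now visible");
    }
}
```

Calling enable(Level.FINE) before the group is started should make any library logger that inherits its level from the root print its debug output to the console.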
>>
>> Thank you for pointing out JXTA's programmatic configuration API; I'll
>> look at it. Nevertheless I think you're right: as Shoal uses TCP channels
>> I won't be able to tune anything, and the multicast configuration
>> probably won't change anything either...
>>
>> Anyway, thank you very much for your answer, and I apologize if the
>> problem turns out not to come from Shoal, which is quite likely.
>>
>> Regards,
>> Mathias CHOUET
>>
>> 2008/8/6 Shreedhar Ganapathy <Shreedhar.Ganapathy_at_sun.com>
>>
>> Hi Mathias
>> Thanks for posting here. It's certainly good to know about your
>> usage of FishFarm and, indirectly, Shoal.
>>
>> A few questions:
>> A few iterative trials might establish where the issue lies.
>> Could you try with packet sizes lower than 64 KB, say 10 KB
>> through to 60 KB, and then go beyond? That would indicate where
>> possible bottlenecks could be. It would be even better if this could
>> be profiled.
>>
>> Are you seeing any exceptions or messages in the log output to
>> suggest failure to send messages? Could you share the logs here?
>>
>> We can surely look into exposing properties that can be set in JXTA.
>> JXTA provides a programmatic API for configuration called
>> NetworkConfigurator.
>> https://jxta-jxse.dev.java.net/source/browse/jxta-jxse/trunk/api/src/net/jxta/platform/NetworkConfigurator.java?rev=555&view=markup
>>
>> However it does not expose anything related to tuning TCP
>> presumably because the underlying OS level TCP protocol
>> implementation is expected to provide flow control, fragmentation
>> support, etc. There could be opportunity for some RFEs there.
>>
>> I can see an API for setting the multicast packet size, but in
>> the case you report it may not be relevant, as Shoal uses the TCP
>> channel to send messages.
>>
>> hth
>> Shreedhar
>>
>>
>