Hello Mathias,
great to hear that you have still not given up ;-)
Well, it is not entirely impossible that it's FishFarm's fault :) (that's why
I filed a bug against ff). The problem is just that I can't reproduce it
here; all my prior load tests with small messages were pretty reliable
(5 out of 40k tasks were lost in action but recovered by FishFarm). The
other problem is that I can't run the load tests again because I can't
access the test cluster (holidays at my university...).
Could you, the next time it happens, create a thread dump of the master node
and, just to be sure, of the worker too (VisualVM is a great tool for this
purpose) and send it to me? Just to exclude deadlocks somewhere in FishFarm.
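
If attaching VisualVM is awkward, a dump can also be taken from inside the JVM. The following is only a minimal sketch using the standard java.lang.management API; the class and method names (ThreadDumpSketch, dumpAllThreads, findDeadlockedThreadIds) are made up for illustration and are not part of FishFarm or Shoal.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of FishFarm or Shoal.
class ThreadDumpSketch {

    /** Builds a plain-text dump of every live thread in this JVM. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for lock details the JVM says it can provide.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue;
            }
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                sb.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            sb.append('\n');
            // Print the full stack of each thread.
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Returns the ids of deadlocked threads, or an empty array if there are none. */
    static long[] findDeadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
        System.out.println("deadlocked threads: " + findDeadlockedThreadIds().length);
    }
}
```

This only sees the JVM it runs in, so it would have to be called from inside both the master and the worker process. Running the JDK's jstack tool against each process id is another way to get the same information.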
Also, if it is possible in some way to extract your /RasterAdapter/ into a
test case, this would be great.
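
Such a test case could start as a plain serialization round trip without FishFarm or Shoal in the path, at sizes below and above the ~1100 KB mark. The sketch below is only a rough outline: FakeRasterAdapter is a hypothetical stand-in (the real /RasterAdapter/ is not shown in this thread) that uses custom writeObject()/readObject() over plain pixel data, and the sizes are picked arbitrarily.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

class RoundTripSketch {

    /** Hypothetical stand-in for RasterAdapter; the real class is not shown here. */
    static class FakeRasterAdapter implements Serializable {
        private static final long serialVersionUID = 1L;
        final int width;
        final int height;
        transient int[] pixels; // written by hand in writeObject()

        FakeRasterAdapter(int width, int height) {
            this.width = width;
            this.height = height;
            this.pixels = new int[width * height];
            for (int i = 0; i < pixels.length; i++) {
                pixels[i] = i;
            }
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // width and height
            out.writeInt(pixels.length);
            for (int p : pixels) {
                out.writeInt(p);
            }
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            pixels = new int[in.readInt()];
            for (int i = 0; i < pixels.length; i++) {
                pixels[i] = in.readInt();
            }
        }
    }

    /** Serializes an object to a byte array with standard Java serialization. */
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(o);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back an object written by serialize(). */
    static Object deserialize(byte[] data) {
        try {
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Side lengths chosen so the payloads straddle roughly 1100 KB.
        for (int side : new int[] {256, 512, 600, 1024}) {
            FakeRasterAdapter before = new FakeRasterAdapter(side, side);
            byte[] data = serialize(before);
            FakeRasterAdapter after = (FakeRasterAdapter) deserialize(data);
            boolean same = Arrays.equals(before.pixels, after.pixels);
            System.out.println(side + "x" + side + ": " + data.length
                    + " bytes, round trip ok = " + same);
        }
    }
}
```

If the real /RasterAdapter/ survives this kind of local round trip at every size, the serialization code itself is probably not the culprit, and the same objects can then be sent through FishFarm to narrow down where they get lost.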
regards,
Michael
btw I am the FishFarm author
Mathias Chouet wrote:
> Hi.
>
> Thank you for your answer!
>
> Following your advice, I've made a little iterative testing program,
> and the results are strange. First I tried to send repeatedly large
> /int/ arrays, up to 100MB (yes, MB!) and it worked perfectly. Out of
> 50, not even one was lost! However, when I tried to perform the same
> iterative test using the objects I'm working with in real conditions,
> it failed.
>
> The application I'm working on exchanges custom objects (called
> /RasterAdapter/) designed to encapsulate /java.awt.image.Raster/ and
> make them /Serializable/. Thus, a /RasterAdapter/ uses custom
> /writeObject()/ and /readObject()/ methods (which work, I'm absolutely
> sure of this). The fact is that when I repeatedly send such
> /RasterAdapter/s, it works up to about 1100 KB (it's hard for
> me to control the size of these objects precisely), and when the
> objects exceed this weight, they get lost.
>
> By "they get lost", I mean they don't reach their destination. Indeed,
> they transit through FishFarm, not directly through Shoal, and the issue
> may be totally independent of Shoal. But as FishFarm is supposed to just
> serialize the objects, send them as Shoal messages and deserialize
> them on the other side, I'm surprised... it works with /int/ arrays,
> but with my objects the deserialization is never performed, as if the
> message had been lost or corrupted... I've already got in touch with
> FishFarm's author, who told me he was using Shoal with the default
> configuration.
>
> The output shows no particular message, and no log file is produced.
> I'm embarrassed that I haven't figured out how to enable logging in Shoal
> (or JXTA); it's probably very simple, but I didn't find anything...
>
> Thank you for pointing me to JXTA's programmatic configuration API; I'll
> look at it. Nevertheless, I think you're right: as Shoal uses TCP channels,
> I won't be able to tune anything, and multicast configuration probably
> won't change anything either...
>
> Anyway, I thank you very much for your answer, and I apologize in
> case the problem doesn't come from Shoal, which is very likely.
>
> Regards,
> Mathias CHOUET
>
> 2008/8/6 Shreedhar Ganapathy <Shreedhar.Ganapathy_at_sun.com>
>
> Hi Mathias
> Thanks for posting here. It's certainly good to know about your
> usage of FishFarm and, indirectly, Shoal.
>
> A few questions:
> A few iterative trials might establish where the issue lies. Could
> you try with packet sizes lower than 64 KB, i.e., say, 10 KB through to
> 60 KB, and then go beyond? That would indicate where possible
> bottlenecks could be. It would be even better if this could be profiled.
>
> Are you seeing any exceptions or messages in the log output to
> suggest failure to send messages? Could you share the logs here?
>
> We can surely look into exposing properties that can be set in JXTA.
> JXTA provides a programmatic API for configuration called
> NetworkConfigurator.
> https://jxta-jxse.dev.java.net/source/browse/jxta-jxse/trunk/api/src/net/jxta/platform/NetworkConfigurator.java?rev=555&view=markup
>
> However, it does not expose anything related to tuning TCP,
> presumably because the underlying OS-level TCP protocol
> implementation is expected to provide flow control, fragmentation
> support, etc. There could be an opportunity for some RFEs there.
>
> I can see an API for setting the multicast packet size, but in the
> case you report it may not be relevant, as Shoal uses the TCP
> channel to send messages.
>
> hth
> Shreedhar
>
>