users@glassfish.java.net

Re: Major Application client frustrations

From: Florian Bruckner (3Kraft) <florian.bruckner_at_3kraft.com>
Date: Sat, 05 May 2007 12:24:41 +0200

Hi Ken,

thanks a lot for your support.

We are currently using V1UR1. We are getting close to acceptance and
production, due once we have sorted out some showstoppers, so V2 is not
an option for us.
> Just a few more comments:
>
>
>> Hello everybody,
>>
>>
>> - Time for login: On slower machines (like a 1.4Ghz
>> iBook), login takes >20 seconds, with most of the
>> time 100% CPU. Profiling shows that the time is spent
>> in ORB.init()... WTF is it doing?
>>
>
> This one is interesting. I know what ORB.init does both in the ORB
> and in the app server, but 20 seconds is a long time. Could you share
> your profiling info with us?
>
I will take a NetBeans profiling snapshot and follow up with that.
>
>
>> - Request performance: As pointed out by others in
>> this forum, the performance of the ORB greatly sucks.
>> We're communicating with stateless EJBs, transmitting
>> value beans back and forth. Compared to other
>> application servers, it takes up to 10 times as long.
>>
>>
>
> It should suck less now, as some significant performance problems
> have been fixed. Which build are you using? There was a big
> problem with the huge classpath of the app server creating a huge
> codebase interacting with a bug in the marshalling code that did not
> properly handle repeated codebase strings. This fix is certainly in the recent
> builds for the past few months.
>
> We also greatly improved how the ORB reads messages from a socket.
> It reads all available data as fast as possible, instead of reading 12
> bytes then the message size. Internal benchmarks show a 50% or so
> increase in the throughput.
>
We currently use ORBUseNIOSelectToWait=false, and this alone reduced it
by 50%. We haven't tested with newer versions of GlassFish, as that is
currently not an option for us.
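For reference, this is roughly how one can pass the flag when launching the app client. Only the ORBUseNIOSelectToWait name comes from this thread; the full property prefix is an assumption on my part, so adjust it to whatever your ORB build actually reads:

```shell
# Hedged sketch: set the ORB transport flag as a JVM system property
# when starting the application client. The com.sun.corba.ee.transport
# prefix is an assumption, not confirmed by this thread.
java -Dcom.sun.corba.ee.transport.ORBUseNIOSelectToWait=false \
     -jar app-client.jar
```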
>
>
>> - Latency: This time, Jonas is 5 times as quick. Our
>> average request latency for requests without payload
>> (e.g. a simple ping()) is well above 10ms, sometimes
>> as bad as 30ms. Without any network interference.
>>
>>
>
> We'll need to look at this again. We are mainly focusing on
> throughput oriented tests (like SpecJ) at this point.
>
Unfortunately latency is critical for our application, and I assume for
most rich clients that use fine-grained operations. We have a lot of
small operations going against our service layer (mostly data requests),
and batching them is not really an option (we do a lot of lazy fetching
of data).

>> - configurability of ORB behaviour: The ORB has a ton
>> of properties that could be set. Supposedly, but
>> whenever I thought I had found a setting, it turned
>> out to be non-settable. Let's talk about a retry
>> count for communication (see above, could save a ton
>> of poor electrons wasted for logging). Let's talk
>> about a socket timeout - all not settable. This last
>> one (network timeouts) is one of our major headaches
>> so far.
>>
>
> We don't directly use a count, but the TcpTimeouts I described
> give you more or less the same capability.
>

>> - network timeout: There is no way to specify, how
>> long I want to wait for a connect and how long I want
>> a socket to be allowed idle.
>>
>
> Wait for connect is covered. The ORB also has settable
> high water marks on the inbound and outbound connection caches,
> which will cause connections to be closed. But so far we have
> not seen a need to directly age out connections.
>
Having settings in the ORB should be sufficient, at least if their
behaviour is documented. I already found out about some other parameters
(mainly WAIT_FOR_RESPONSE_TIMEOUT) and tried setting them low, but this
had side effects. For example, if the operation we call has a huge
payload (say an image of 3000K), the request times out before it fully
arrives at the server. At least we do not see the method entry on the
server side when we get the exception on the client side in this
special case.

Nevertheless, WAIT_FOR_RESPONSE_TIMEOUT may be the wrong setting; I will
try the parameters you described in your earlier message.
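As a stopgap we are considering handling the timeout at the application level instead, scaling it with the payload size so a value good for a ping() does not kill a BLOB transfer. A minimal sketch with plain java.util.concurrent; the scaling factor and all names here are hypothetical, not ORB API:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PerRequestTimeout {
    private static final ExecutorService pool = Executors.newCachedThreadPool();

    // Wrap a remote call and apply a timeout scaled to the payload size,
    // since an ORB-wide WAIT_FOR_RESPONSE_TIMEOUT is too coarse: a value
    // that suits a small request times out a 3000K BLOB upload.
    static <T> T callWithTimeout(Callable<T> remoteCall, long payloadBytes)
            throws Exception {
        long timeoutMs = 2_000 + payloadBytes / 100; // hypothetical scaling
        Future<T> f = pool.submit(remoteCall);
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // give up on the stuck request
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated small request (a ping) that returns quickly.
        String result = callWithTimeout(() -> "pong", 0);
        System.out.println(result);
        pool.shutdown();
    }
}
```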
>
>> I searched and tried a lot, even creating my own
>> socket factory. But guess
>> what - the current version does not allow it to be
>> overridden. There is a property, which is even being
>> described in some documents (e.g. Sun App Server 8
>> docs), but GlassFish hard-codes it to its
>> IIOPSSLSocketFactory.
>>
>
> You're right; I don't think there is a way to override this.
> We do need to control this in order to support CSIv2.
> Again, with the new timeouts, do you need anything else?
>
I don't think so. If timeouts can be specified properly, there should
be no need for this.
>
>> We're having troubles with
>> this, because from time to time, a client may
>> experience network problems. When this happens, the
>> client is frozen to a point where it is not usable.
>>
>
> You could use the request timeout to handle this,
> if the granularity is OK (see previous post).
>
>
>> If I cannot specify a timeout on socket level, I
>> thought I maybe could just have watchdog thread and
>> interrupt() some other threads (i.e. the locked
>> thread). Try this - its a major show in the log
>> console and a broken ORB. Shouldn't any blocking
>> thread be prepared to receive an interrupt()?
>>
>
> In general, yes, but many threads could choose to ignore
> interrupt; for example, a while(!condition) wait loop would
> have no choice but to spin and continue waiting if interrupted.
> We could probably make the CorbaResponseWaitingRoom
> code respond to an interrupt() call, but I'm not sure if that's a
> good idea or not.
>
This is more or less a question of request monitoring. You mentioned
this earlier:

> Please let me know how this works for you (it will probably be another 2 weeks
> before you can get a build with this ORB in it). Also, I'd like to know
> if there are any needs for fine grained control over timeouts. Right now the
> granularity is the entire ORB, so you cannot set a timeout that applies
> to only a single request, or a single EJB reference.

Definitely, yes. Our operations vary in granularity. For example, we
might have a very small request just pushing a single value to the
service layer; on the other hand, we have very large requests when we
transmit a BLOB to the server. Specifying the timeout on a per-request
basis would therefore be a good thing (at least for our use case).

Another thing is to monitor the state of the operation and to make it
cancellable. I could, for example, put the remote operation call on one
thread and have another one monitoring the request, giving the user some
feedback about the current state of the application. That is also where
interrupting becomes relevant, as the user may choose to cancel the
current operation (again, for example, when uploading a BLOB), or we may
want to show progress information while transmitting a large amount
of data.
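A minimal sketch of what I mean, using plain java.util.concurrent rather than anything ORB-specific (the chunked upload is simulated; a real monitor thread would drive a progress bar and could call cancel(true)):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class CancellableUpload {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicLong bytesSent = new AtomicLong();
        long total = 1_000;

        // The "remote" BLOB upload runs on its own thread and checks for
        // interruption between chunks, so it stays cancellable.
        Future<?> upload = pool.submit(() -> {
            while (bytesSent.get() < total) {
                if (Thread.currentThread().isInterrupted()) return; // cancelled
                bytesSent.addAndGet(100); // pretend to send a 100-byte chunk
            }
        });

        // A monitor thread could poll bytesSent here to show progress,
        // and call upload.cancel(true) if the user hits "cancel".
        upload.get(5, TimeUnit.SECONDS);
        System.out.println("sent " + bytesSent.get() + " of " + total);
        pool.shutdown();
    }
}
```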

If you like, we can discuss this in detail (though I think that would be
a bit out of scope for this thread).

Florian.
> Ken.
> [Message sent by forum member 'kcavanaugh' (kcavanaugh)]
>
> http://forums.java.net/jive/thread.jspa?messageID=215664
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe_at_glassfish.dev.java.net
> For additional commands, e-mail: users-help_at_glassfish.dev.java.net
>
>


-- 
3Kraft Software | Applications | Development
Wasagasse 26/2
1090 Vienna
Austria
Phone: +43 (0)1 920 45 49
Fax: +43 (0)1 920 45 49
Mobile: +43 (699) 102 53 901
E-Mail: florian.bruckner_at_3kraft.com
Web: http://www.3kraft.com