Re: Vague CORBA issue causing unexplainable problems

From: Amy Kang <amy.kang_at_oracle.com>
Date: Thu, 30 Dec 2010 12:15:51 -0800

Paul,

So far my comments on this have focused on the JMS side, on the
assumption that everything else in the GlassFish server you are using
works as expected, and based on the piece-by-piece information about
your application. Some other factors to consider:

. Concurrent use of a Stateless session bean instance
- You can move the make/closeConnection to the announceNewQuestion()
method, since the JMS connections are pooled, to see if you still see
the problem.
. Any potential bugs in GlassFish 3 that have been fixed in 3.0.1 and 3.1
and that could be related to this (?)
- If you cannot try the latest 3.1 promoted build, you should at least
try 3.0.1 to see if you can reproduce the same problem.
. If necessary, a non-public (engineer's) property that you may be able
to try in order to rule out one area (when the colleague who works in
this area returns from vacation next week)
. Any other exceptions seen in the server log (?)
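
A rough sketch of the first suggestion above (doing the
make/closeConnection work inside announceNewQuestion() rather than in
@PostConstruct/@PreDestroy). PooledConnection, PooledConnectionFactory
and QuestionAnnouncer are hypothetical stand-ins, not the real javax.jms
types or your bean, so that the fragment compiles on its own; only the
acquire/try/finally shape is the point.

```java
/** Hypothetical stand-in for javax.jms.Connection (not the real API). */
interface PooledConnection {
    void send(String text);
    void close(); // for a pooled connection this hands it back to the pool
}

/** Hypothetical stand-in for the injected JMS connection factory. */
interface PooledConnectionFactory {
    PooledConnection createConnection();
}

final class QuestionAnnouncer {
    private final PooledConnectionFactory factory;

    QuestionAnnouncer(PooledConnectionFactory factory) {
        this.factory = factory;
    }

    /** Acquires and releases the connection inside the business method. */
    void announceNewQuestion(String questionText) {
        PooledConnection connection = factory.createConnection(); // was in @PostConstruct
        try {
            connection.send(questionText);
        } finally {
            connection.close(); // was in @PreDestroy; runs even if send() throws
        }
    }
}
```

With this shape no connection is held by a bean instance between calls,
which is what would help tell a connection "recycle" problem apart from
a problem in the send itself.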

You can also enable FINE debug logging for the relevant components of
the GlassFish server (ejb, jts/jta, jca, jms, corba, ...), which you can
set in the GlassFish Administration Console, and for some additional
JMSRA logger names (as seen in the source code):
com.sun.messaging.jmq.jmsclient.XAResourceForMC
javax.resourceadapter.mqjmsra.outbound.connection
javax.resourceadapter.mqjmsra.xa
com.sun.messaging.jms.ra.DirectXAResource
javax.resourceadapter.mqjmsra
com.sun.messaging.jms.ra.ResourceAdapter
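
The Administration Console is the normal place to set these. Purely as
an illustration of what the setting amounts to, here is a minimal
sketch that raises the same logger names to FINE through the standard
java.util.logging API. It assumes the code runs in the same JVM as the
loggers; the class name JmsraDebugLogging is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch: raise the JMSRA-related java.util.logging loggers to FINE. */
final class JmsraDebugLogging {

    // Logger names copied from the list above.
    static final String[] LOGGER_NAMES = {
        "com.sun.messaging.jmq.jmsclient.XAResourceForMC",
        "javax.resourceadapter.mqjmsra.outbound.connection",
        "javax.resourceadapter.mqjmsra.xa",
        "com.sun.messaging.jms.ra.DirectXAResource",
        "javax.resourceadapter.mqjmsra",
        "com.sun.messaging.jms.ra.ResourceAdapter",
    };

    // Strong references: the LogManager holds loggers weakly, so a level
    // set on an otherwise unreferenced logger could be lost after a GC.
    private static final List<Logger> PINNED = new ArrayList<>();

    static synchronized void enableFine() {
        for (String name : LOGGER_NAMES) {
            Logger logger = Logger.getLogger(name);
            logger.setLevel(Level.FINE);
            PINNED.add(logger);
        }
    }
}
```

Calling JmsraDebugLogging.enableFine() once is enough. The records only
show up if the handlers attached to these loggers also accept FINE.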

Enabling MQ broker-side transaction protocol debugging can also be
helpful, either by setting the following broker properties
imq.debug.com.sun.messaging.jmq.jmsserver.data.handlers.TransactionHandler=true
imq.debug.com.sun.messaging.jmq.jmsserver.data.protocol.ProtocolImpl=true

or by running
imqcmd debug class -n com.sun.messaging.jmq.jmsserver.data.handlers.TransactionHandler -debug
imqcmd debug class -n com.sun.messaging.jmq.jmsserver.data.protocol.ProtocolImpl -debug

When you file a JIRA issue (if you are not sure of the component, file
it under 'other'), please attach the complete server/broker logs.

amy

On 12/29/2010 07:06 AM, Paul Giblock wrote:
> Amy,
>
>> @PostConstruct is to create the JMS connection; what do @PreConstruct and QuestionManagerBean.announceQuestion() do (not shown in your code snippet below)? Or did you actually mean @PreDestroy, which is to close the JMS connection
> Right, I meant @PreDestroy. There are, in fact, no other annotated
> lifecycle methods on this class.
>
>> and QuestionManagerBean.announceNewQuestion which is to send a JMS message ?
>>
> Yes, the announceNewQuestion method is the only one to send a JMS
> message among both the QuestionManagerBean and WidgetHelperBean
> classes. This method, as well as the methods in JmsUtils and
> VHMStringUtils are complete and unadulterated.
>
>> If the latter, the problem looks like a JMS-related issue, provided the problem does not occur without calling these methods. It's possible the problem is triggered by create/closeConnection, which could indicate a JMS-related bug (GlassFish+JMSRA) in the area of "recycling" JMS connections. You can try setting the bean pool configuration of QuestionManagerBean to avoid bean destruction, e.g. no idle timeout and a max pool size large enough for your highest possible load, and do the same for the JMS connector pool. However, it's necessary to find the root cause of the problem in order to give you the right advice on avoiding it (if possible), and most importantly to ensure the problem is fixed in a later release of GlassFish.
>>
> Your explanation seems to match what I am observing. I agree; we don't
> want this to be a long-standing bug in GF, for everyone's sake.
> The workaround you mention is not optimal, and I am not confident I
> would know what our upper bound for load/traffic is.
>
>> How is QuestionManagerBean.announceNewQuestion invoked ?
>>
> From a servlet, which does:
>
> Context ctx = new InitialContext();
> QuestionManager questionMgr =
>     (QuestionManager) ctx.lookup("java:comp/env/ejb/QuestionManager");
>
> // Load values from servlet request
>
> try {
>     AskTuple at = questionMgr.ask(e,i,t,m,a);
>     // Prepare response ...
> }
> catch (/* All our application exceptions */) ...
>
> The QuestionManager.ask method is simple:
>
> @Override
> public AskTuple ask (long eventId, final String ip, String userAlias,
>         String html, MediaTuple media)
>         throws EventAskingClosedException, EventExpiredException,
>         EventNotStartedException, OffensiveException,
>         EventMediaNotAllowedException, EventMediaRequiredException,
>         StringMaxLengthException {
>     // Basic input validation
>     // Several JPA loads
>     // Prepare new JPA Entity
>     // A JPA persist
>
>     if (needToAnnounce) {
>         // we are calling this method with a JPA entity as a param
>         announceNewQuestion(tuple.getQuestion());
>     }
>     return tuple;
> }
>
>
>> Could you please file a JIRA issue for this with as much information as possible in order to reproduce it (preferably with a reproducible test case, and be sure to include the GlassFish version/build #)?
>>
> I will try. Any hints on how best to file it (category, important
> keywords, etc.)? The big issue for me is that I cannot recreate this
> error on my own. I've only experienced it on our two production
> systems, apparently due to the higher traffic loads. I'll have to
> figure out some way to generate enough traffic to cause the problem in
> a vacuum. Any ideas on how to make the problem appear sooner or with
> less traffic? Possibly lowering the max-pool setting of the
> ConnectionFactory to some very low (how low?) value?
>
> Thank you for your continued support,
> Paul G
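
On generating traffic outside production: a minimal sketch of a
concurrent driver, assuming the work to repeat can be wrapped in a
Runnable (for example one HTTP request to the servlet that calls
QuestionManager.ask). LoadDriver and its parameters are hypothetical
names for this example, and nothing here is known to reproduce the
problem; it only shows one way to put many simultaneous callers on the
same code path.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of a small load driver: many threads repeating one task. */
final class LoadDriver {

    /** Runs task threads * callsPerThread times; returns how many calls threw. */
    static int run(int threads, int callsPerThread, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger failures = new AtomicInteger();

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // hold every worker until all are queued
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < callsPerThread; i++) {
                    try {
                        task.run();
                    } catch (RuntimeException e) {
                        failures.incrementAndGet(); // count it, keep going
                    }
                }
            });
        }

        start.countDown(); // release all workers at once
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("load run did not finish in time");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load run interrupted", e);
        }
        return failures.get();
    }
}
```

Something like LoadDriver.run(50, 200, oneRequest) would issue 10,000
calls from 50 threads, where oneRequest is whatever Runnable performs a
single request; the thread and call counts are arbitrary starting
points.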