Re: Vague CORBA issue causing unexplainable problems

From: Amy Kang <amy.kang_at_oracle.com>
Date: Thu, 30 Dec 2010 12:28:58 -0800

and

On 12/30/2010 12:15 PM, Amy Kang wrote:
> Paul,
>
> So far my comments on this have focused on the JMS side, with the
> assumption that everything else in the GlassFish server that you are
> using works as expected, and with only piece-by-piece info on your
> application. Some other factors to consider, for example:
>
> . Concurrent use of a Stateless session bean instance
> - You can move the make/closeConnection to the announceNewQuestion()
> method, since the JMS connections are pooled, to see if you still see
> the problem.

When you try the above, change the instance variable 'connection' to a
method-local variable. -amy
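
A minimal sketch of that change, assuming the bean currently keeps
'connection' as an instance field that is opened in @PostConstruct and
closed in @PreDestroy. The Connection and ConnectionFactory interfaces
below are hypothetical stand-ins with the same shape as their javax.jms
counterparts, declared locally only so the example compiles without the
JMS jar; QuestionAnnouncer and RecordingFactory are made-up names, not
the classes from the application. In the real bean you would use the
injected javax.jms.ConnectionFactory instead.

```java
import java.util.ArrayList;
import java.util.List;

public class MethodLocalConnectionSketch {

    // Hypothetical stand-ins mirroring the shape of javax.jms.Connection
    // and javax.jms.ConnectionFactory (assumption: not the real JMS types).
    interface Connection {
        void send(String text);
        void close();
    }

    interface ConnectionFactory {
        Connection createConnection();
    }

    // Stand-in for the stateless bean. Note: no 'connection' field.
    static class QuestionAnnouncer {
        private final ConnectionFactory factory;

        QuestionAnnouncer(ConnectionFactory factory) {
            this.factory = factory;
        }

        void announceNewQuestion(String question) {
            // Method-local connection: taken from the factory on every call
            // and always closed again, even if send() throws.
            Connection connection = factory.createConnection();
            try {
                connection.send(question);
            } finally {
                connection.close();
            }
        }
    }

    // Recording factory, used only to show that opens and closes pair up.
    static class RecordingFactory implements ConnectionFactory {
        final List<String> sent = new ArrayList<String>();
        int opened;
        int closed;

        @Override
        public Connection createConnection() {
            opened++;
            return new Connection() {
                @Override public void send(String text) { sent.add(text); }
                @Override public void close() { closed++; }
            };
        }
    }

    public static void main(String[] args) {
        RecordingFactory factory = new RecordingFactory();
        QuestionAnnouncer announcer = new QuestionAnnouncer(factory);
        announcer.announceNewQuestion("first");
        announcer.announceNewQuestion("second");
        System.out.println(factory.opened + " opened, " + factory.closed
                + " closed, sent=" + factory.sent);
    }
}
```

With this shape no connection state is shared between calls on the same
bean instance, which is the point of the experiment.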

> . Any potential bugs in GlassFish 3 that have been fixed in 3.0.1 and
> 3.1 that could be related to this (?)
> - If you cannot try the latest 3.1 promoted build, you should at least
> try 3.0.1 to see if you can reproduce the same problem
> . If necessary, a non-public (engineer's) property that you may be able
> to try in order to rule out one area (when the colleague who works in
> this area returns from vacation next week)
> . Any other exceptions seen in the server log (?)
>
> You can also enable FINE debug logging for the relevant components of
> the GlassFish server (ejb, jts/jta, jca, jms, corba, ...), which you can
> set in the GlassFish Administration Console, and for some additional
> JMSRA loggers (names as seen in the source code):
> com.sun.messaging.jmq.jmsclient.XAResourceForMC
> javax.resourceadapter.mqjmsra.outbound.connection
> javax.resourceadapter.mqjmsra.xa
> com.sun.messaging.jms.ra.DirectXAResource
> javax.resourceadapter.mqjmsra
> com.sun.messaging.jms.ra.ResourceAdapter
>
> and enabling MQ broker-side transaction protocol debugging can also be
> helpful, by setting the following broker properties
> imq.debug.com.sun.messaging.jmq.jmsserver.data.handlers.TransactionHandler=true
> imq.debug.com.sun.messaging.jmq.jmsserver.data.protocol.ProtocolImpl=true
>
> or by running
> imqcmd debug class -n com.sun.messaging.jmq.jmsserver.data.handlers.TransactionHandler -debug
> imqcmd debug class -n com.sun.messaging.jmq.jmsserver.data.protocol.ProtocolImpl -debug
>
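
For reference, a rough sketch of what FINE means for those loggers in
java.util.logging terms, assuming (as is usual for GlassFish and the MQ
resource adapter) that the names listed above are java.util.logging
logger names. The class name FineLoggingSketch and the PINNED list are
illustrative only; on a real server the Administration Console remains
the place to set these levels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FineLoggingSketch {

    // Logger names copied from the list in the quoted message.
    static final String[] JMSRA_LOGGERS = {
        "com.sun.messaging.jmq.jmsclient.XAResourceForMC",
        "javax.resourceadapter.mqjmsra.outbound.connection",
        "javax.resourceadapter.mqjmsra.xa",
        "com.sun.messaging.jms.ra.DirectXAResource",
        "javax.resourceadapter.mqjmsra",
        "com.sun.messaging.jms.ra.ResourceAdapter"
    };

    // Keep strong references: java.util.logging holds loggers weakly, so a
    // level set on an otherwise unreferenced logger can be lost to GC.
    static final List<Logger> PINNED = new ArrayList<Logger>();

    static void enableFine() {
        for (String name : JMSRA_LOGGERS) {
            Logger logger = Logger.getLogger(name);
            logger.setLevel(Level.FINE);
            PINNED.add(logger);
        }
    }

    public static void main(String[] args) {
        enableFine();
        for (Logger logger : PINNED) {
            System.out.println(logger.getName() + " -> " + logger.getLevel());
        }
    }
}
```

A handler attached to the logger (or to one of its parents) also has to
accept FINE records before anything shows up in the log.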
> When you file a JIRA issue (if you are not sure of the component, file
> it under 'other'), please attach the complete server/broker logs.
>
> amy
>
> On 12/29/2010 07:06 AM, Paul Giblock wrote:
>> Amy,
>>
>>> @PostConstruct creates the JMS connection; what do
>>> @PreConstruct and QuestionManagerBean.announceQuestion() do (not
>>> shown in your code snippet below)? Or did you actually mean
>>> @PreDestroy, which closes the JMS connection,
>> Right, I meant @PreDestroy. There are, in fact, no other annotated
>> lifecycle methods on this class.
>>
>>> and QuestionManagerBean.announceNewQuestion, which sends a JMS
>>> message?
>>>
>> Yes, the announceNewQuestion method is the only one to send a JMS
>> message among both the QuestionManagerBean and WidgetHelperBean
>> classes. This method, as well as the methods in JmsUtils and
>> VHMStringUtils, are complete and unadulterated.
>>
>>> If the latter, the problem looks like a JMS-related issue, provided it
>>> does not occur when these methods are not called. It's possible
>>> the problem is triggered by create/closeConnection, which could
>>> indicate a JMS-related bug (GlassFish+JMSRA) in the area of
>>> "recycling" JMS connections. You can try to set the bean pool
>>> configuration of QuestionManagerBean to avoid bean destruction, e.g.
>>> no idle timeout and a max pool size large enough for your highest
>>> possible load, and do the same for the JMS connector pool. However,
>>> it's necessary to find the root cause of the problem in order to give
>>> you the right advice to avoid (if possible) the problem, and most
>>> importantly to ensure the problem is fixed in a later release of
>>> GlassFish.
>>>
>> Your explanation seems to match what I am observing. I agree, we don't
>> want this to be a long-standing bug in GF, for everyone's sake.
>> The workaround you mention is not optimal, and I am not confident I
>> would know what our upper bound for load/traffic is.
>>
>>> How is QuestionManagerBean.announceNewQuestion invoked ?
>>>
>> From a servlet, which does:
>>
>> Context ctx = new InitialContext();
>> QuestionManager questionMgr =
>>     (QuestionManager) ctx.lookup("java:comp/env/ejb/QuestionManager");
>>
>> // Load values from servlet request
>>
>> try {
>>     AskTuple at = questionMgr.ask(e,i,t,m,a);
>>     // Prepare response ...
>> }
>> catch (/* All our application exceptions */) ...
>>
>> The QuestionManager.ask method is simple:
>>
>> @Override
>> public AskTuple ask(long eventId, final String ip, String userAlias,
>>                     String html, MediaTuple media)
>>         throws EventAskingClosedException, EventExpiredException,
>>                EventNotStartedException, OffensiveException,
>>                EventMediaNotAllowedException,
>>                EventMediaRequiredException,
>>                StringMaxLengthException {
>>     // Basic input validation
>>     // Several JPA loads
>>     // Prepare new JPA Entity
>>     // A JPA persist
>>
>>     if (needToAnnounce) {
>>         // we are calling this method with a JPA entity as a param
>>         announceNewQuestion(tuple.getQuestion());
>>     }
>>     return tuple;
>> }
>>
>>
>>> Could you please file a JIRA issue for this with as much information
>>> as possible in order to reproduce it (preferably with a
>>> reproducible test case, and be sure to include GlassFish
>>> version/build #) ?
>>>
>> I will try. Any hints on how best to file it (category, important
>> keywords, etc.)? The big issue for me is that I cannot recreate this
>> error on my own. I've only experienced it on our two production
>> systems, apparently due to the higher traffic loads. I'll have to
>> figure out some way to generate enough traffic to cause the problem in
>> a vacuum. Any ideas on how to make the problem appear sooner or with
>> less traffic? Possibly lowering the max pool size of the
>> ConnectionFactory to some very low (how low?) value?
>>
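
One possible way to generate concurrent traffic outside production is a
small driver that releases many threads at once. This is only a sketch:
LoadDriverSketch and hammer() are hypothetical names, and the Runnable
passed in is a placeholder for whatever actually exercises the servlet
(for example an HTTP request that ends up in ask()). Whether this is
enough to trigger the problem is not known.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadDriverSketch {

    // Runs 'task' callsPerThread times on each of 'threads' threads, with
    // all threads released together, and returns how many calls threw.
    static int hammer(int threads, final int callsPerThread, final Runnable task) {
        final AtomicInteger failures = new AtomicInteger();
        final CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    try {
                        start.await(); // line all workers up before the burst
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    for (int i = 0; i < callsPerThread; i++) {
                        try {
                            task.run();
                        } catch (RuntimeException e) {
                            failures.incrementAndGet();
                        }
                    }
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("load run did not finish in time");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return failures.get();
    }

    public static void main(String[] args) {
        final AtomicInteger calls = new AtomicInteger();
        // Placeholder task: replace with a request to the servlet under test.
        int failures = hammer(8, 100, new Runnable() {
            public void run() {
                calls.incrementAndGet();
            }
        });
        System.out.println(calls.get() + " calls, " + failures + " failures");
    }
}
```

Raising the thread count while keeping the pools small would be the
natural knob to turn when looking for the failure with less traffic.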
>> Thank you for your continued support,
>> Paul G
>