users@glassfish.java.net

Re: Vague CORBA issue causing unexplainable problems

From: Paul Giblock <pgiblox_at_gmail.com>
Date: Mon, 3 Jan 2011 16:49:27 -0500

Amy -

I am now opening JMS connections on demand using a method-scoped
Connection. This seems to 'fix' the problem. However, I know it is
'fixed' only because fewer JMS connections are open at once. I am
afraid that some future usage pattern may cause many JMS connections
to be opened simultaneously, triggering the error again when we don't
expect it.
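
The method-scoped pattern can be sketched without any JMS classes. This is only an illustration under stated assumptions, not the application's code: TinyPool, Lease and MethodScopedDemo are hypothetical stand-ins (for the container's connection pool, a pooled connection, and the bean method), and the pool here fails immediately when empty, whereas a real pool would typically wait for some timeout first. The point is that a connection borrowed and closed inside one method returns to the pool right away, so the number in use follows the number of concurrent calls rather than the number of bean instances.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical stand-in for a container-managed connection pool. */
final class TinyPool {
    private final int maxSize;
    private final Semaphore permits;

    TinyPool(int maxSize) {
        this.maxSize = maxSize;
        this.permits = new Semaphore(maxSize);
    }

    /** Borrows a connection, or fails at once if every connection is in use. */
    Lease borrow() {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("pool exhausted (max " + maxSize + ")");
        }
        return new Lease(this);
    }

    void giveBack() {
        permits.release();
    }

    /** Number of connections currently borrowed and not yet returned. */
    int inUse() {
        return maxSize - permits.availablePermits();
    }
}

/** Hypothetical stand-in for a pooled connection; close() returns it to the pool. */
final class Lease implements AutoCloseable {
    private final TinyPool pool;
    private boolean closed;

    Lease(TinyPool pool) {
        this.pool = pool;
    }

    void send(String text) {
        // A real connection would create a session and publish the message here.
    }

    @Override
    public void close() {
        // Guard against double close so a permit is never released twice.
        if (!closed) {
            closed = true;
            pool.giveBack();
        }
    }
}

final class MethodScopedDemo {
    /** Method-scoped use: borrow, send, and always return the connection. */
    static void announce(TinyPool pool, String text) {
        try (Lease lease = pool.borrow()) {
            lease.send(text);
        }
    }
}
```

With a pool of size 2, any number of sequential announce() calls leaves inUse() at 0, while holding three leases at once is what exhausts the pool.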

Regardless, this solution is obviously more efficient, as it takes
advantage of GF's connection pooling. It surprises me, though, that
Sun/Oracle approves of the "Cache your connection at @PostConstruct"
paradigm if it causes errors. Is there a way I can keep track of "#
opened JMS connections / max pool size" so I can be alerted as we
start to approach our maximum?
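
As a rough sketch of the alerting side of that question: once the in-use count is available from some monitoring source (which source is left open here), the check itself is a simple ratio against the configured maximum. PoolUsageAlert and its 80% threshold in the usage note are hypothetical names and values chosen for illustration; nothing below is a GlassFish API.

```java
/** Hypothetical helper: flags when pool usage nears the configured maximum. */
final class PoolUsageAlert {
    private final int maxPoolSize;
    private final double warnRatio;

    /**
     * @param maxPoolSize configured maximum pool size (assumed known)
     * @param warnRatio   fraction of the maximum at which to warn, in (0, 1]
     */
    PoolUsageAlert(int maxPoolSize, double warnRatio) {
        if (maxPoolSize <= 0) {
            throw new IllegalArgumentException("maxPoolSize must be positive");
        }
        if (warnRatio <= 0.0 || warnRatio > 1.0) {
            throw new IllegalArgumentException("warnRatio must be in (0, 1]");
        }
        this.maxPoolSize = maxPoolSize;
        this.warnRatio = warnRatio;
    }

    /** Fraction of the pool currently in use: opened connections / max pool size. */
    double utilization(int connectionsInUse) {
        return (double) connectionsInUse / maxPoolSize;
    }

    /** True once utilization reaches or passes the warning ratio. */
    boolean shouldWarn(int connectionsInUse) {
        return utilization(connectionsInUse) >= warnRatio;
    }
}
```

For example, with new PoolUsageAlert(32, 0.8), 25 connections in use (about 78%) would not warn, while 26 (about 81%) would.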

I haven't tested on the 3.1 snapshot yet, but I have tested on 3.0.1
and the same errors occur as on our production servers (CORBA-ish
exception in EMBEDDED or LOCAL mode).

Anyway, thank you for the help so far. I will continue investigating this.

-Paul G

On Thu, Dec 30, 2010 at 3:28 PM, Amy Kang <amy.kang_at_oracle.com> wrote:
> and
>
> On 12/30/2010 12:15 PM, Amy Kang wrote:
>>
>> Paul,
>>
>> So far my comments on this have focused on the JMS side, with the
>> assumption that everything else in the GlassFish server that you were using
>> works as expected, and with only piece-by-piece info on your application.
>> Some other factors to consider:
>>
>> . Concurrent use of a Stateless session bean instance
>>  - You can move the make/closeConnection to the announceNewQuestion()
>> method, since the JMS connections are pooled, to see if you still see the
>> problem.
>
> when you try the above,  change the instance variable 'connection' to method
> local.     -amy
>
>> . Any potential bugs in GlassFish 3 that have been fixed in 3.0.1 and 3.1
>> that could be related to this (?)
>>  - If you cannot try the latest 3.1 promoted build, you should at least try
>> 3.0.1 to see if you can reproduce the same problem
>> . If necessary, a non-public (engineer's) property that you may be able to
>> try in order to rule out one area (when the colleague who works in this area
>> returns from vacation next week)
>> . Any other exceptions seen in the server log (?)
>>
>> You can also enable FINE debug logging for the relevant components of the
>> GlassFish server (ejb, jts/jta, jca, jms, corba, ...), which you can set in
>> the GlassFish Administration Console, and for some additional JMSRA logger
>> names (as seen in the source code):
>> com.sun.messaging.jmq.jmsclient.XAResourceForMC
>> javax.resourceadapter.mqjmsra.outbound.connection
>> javax.resourceadapter.mqjmsra.xa
>> com.sun.messaging.jms.ra.DirectXAResource
>> javax.resourceadapter.mqjmsra
>> com.sun.messaging.jms.ra.ResourceAdapter
>>
>> Enabling MQ broker side transaction protocol debugging can also be
>> helpful, by setting the following broker properties
>>
>> imq.debug.com.sun.messaging.jmq.jmsserver.data.handlers.TransactionHandler=true
>> imq.debug.com.sun.messaging.jmq.jmsserver.data.protocol.ProtocolImpl=true
>>
>> or by running
>> imqcmd debug class -n com.sun.messaging.jmq.jmsserver.data.handlers.TransactionHandler -debug
>> imqcmd debug class -n com.sun.messaging.jmq.jmsserver.data.protocol.ProtocolImpl -debug
>>
>> When you file a JIRA issue (if you are not sure of the component, file it
>> under 'other'), please attach the complete server/broker logs.
>>
>> amy
>>
>> On 12/29/2010 07:06 AM, Paul Giblock wrote:
>>>
>>> Amy,
>>>
>>>> @PostConstruct is to create the JMS connection, what does @PreConstruct
>>>> and QuestionManagerBean.announceQuestion() do (not shown in your code
>>>> snippet below) ? or did you actually mean @PreDestroy which is to close the
>>>> JMS connection
>>>
>>> Right, I meant @PreDestroy.  There are, in fact, no other annotated
>>> lifecycle methods on this class.
>>>
>>>> and QuestionManagerBean.announceNewQuestion which is to send a JMS
>>>> message ?
>>>>
>>> Yes, the announceNewQuestion method is the only one that sends a JMS
>>> message among both the QuestionManagerBean and WidgetHelperBean
>>> classes.  This method, as well as the methods in JmsUtils and
>>> VHMStringUtils, are complete and unadulterated.
>>>
>>>> If the latter, the problem looks like a JMS related issue if the problem
>>>> does not occur without calling these methods.  It's possible the
>>>> problem is triggered by create/closeConnection, which could indicate a JMS
>>>> related bug (GlassFish+JMSRA) in the area of "recycle" JMS connection.  You
>>>> can try to set the bean pool configuration of QuestionManagerBean to avoid
>>>> bean destruction, e.g. no idle timeout and a max pool size large enough for
>>>> your highest possible load, and do the same for the JMS connector pool.
>>>> However, it's necessary to find the root cause of the problem in order to
>>>> give you the right advice to avoid (if possible) the problem, and most
>>>> importantly to ensure the problem is fixed in a later release of GlassFish.
>>>>
>>> Your explanation seems to match what I am observing. I agree, we don't
>>> want this to be a long-term standing bug in GF for everyone's sake.
>>> The workaround you mention is not optimal, and I am not confident I
>>> would know what our upper bound for load/traffic is.
>>>
>>>> How is QuestionManagerBean.announceNewQuestion invoked ?
>>>>
>>>  From a servlet, which does:
>>>
>>>   Context ctx = new InitialContext();
>>>   QuestionManager questionMgr =
>>>       (QuestionManager) ctx.lookup("java:comp/env/ejb/QuestionManager");
>>>
>>>   // Load values from servlet request
>>>
>>>   try {
>>>      AskTuple at = questionMgr.ask(e,i,t,m,a);
>>>      // Prepare response ...
>>>   }
>>>   catch (/* All our application exceptions */) ...
>>>
>>> The QuestionManager.ask method is simple:
>>>
>>>   @Override
>>>   public AskTuple ask (long eventId, final String ip, String userAlias,
>>>                            String html, MediaTuple media)
>>>       throws EventAskingClosedException, EventExpiredException,
>>>              EventNotStartedException, OffensiveException,
>>>              EventMediaNotAllowedException, EventMediaRequiredException,
>>>              StringMaxLengthException {
>>>     // Basic input validation
>>>     // Several JPA loads
>>>     // Prepare new JPA Entity
>>>     // A JPA persist
>>>
>>>     if (needToAnnounce) {
>>>       // we are calling this method with a JPA entity as a param
>>>       announceNewQuestion(tuple.getQuestion());
>>>     }
>>>     return tuple;
>>>   }
>>>
>>>
>>>> Could you please file a JIRA issue for this with as much information as
>>>> possible in order to reproduce it (preferably with a reproducible test
>>>> case, and be sure to include GlassFish version/build #) ?
>>>>
>>> I will try. Any hints on how best to file it (category, important
>>> keywords, etc)?  The big issue for me is, I cannot recreate this error
>>> on my own.  I've only experienced it on both of our production
>>> systems, apparently due to the higher traffic loads.  I'll have to
>>> figure out some way to generate enough traffic to cause the problem in
>>> a vacuum. Any ideas on how to make the problem appear sooner or with
>>> less traffic? Possibly lowering the max pool size of the
>>> ConnectionFactory to some very low (how low?) value?
>>>
>>> Thank you for your continued support,
>>> Paul G
>>
>
>