dev@glassfish.java.net

Re: Faulty persistence loading

From: Sanjeeb Kumar Sahoo <Sanjeeb.Sahoo_at_Sun.COM>
Date: Thu, 02 Feb 2006 20:46:26 +0530

Hi Jerome,

Mitesh has already clarified the number of class loaders. Now getting to
the real issue:

Thanks for raising this issue. As you rightly said, there is a
*potential* issue in our loading sequence which can lead to provider
instrumented byte codes not being used. We have so far not seen this
because it is a *corner* case. If you look at the attached email, you can
see that in the past (as far back as August '05, when byte code
instrumentation by the provider first came into the EJB3 spec) I tried to
find a solution to this issue. I had several rounds of email
discussion with Ken Saks. Since we (users) never hit the problem, and
since it is not a typical use case, we never got to the point of
resolving it. But I *do* feel we should address this before FCS.

Yes, the root cause is that we are *late* in loading PUs, but it is *not*
related to annotation processing. The real problem is that during the
application load process, we *validate* descriptors (using
ApplicationValidator) before loading PUs. During descriptor validation
we sometimes load classes, e.g. to find out whether a session bean
implements the TimedObject interface, or whether an injection target is
a field or a method. On PE, this can only happen during server restart.
On EE, I believe, it can happen every time. If we load an entity class
during validation, the provider's ClassTransformer has not been
registered yet, so the class is defined without instrumentation and the
bug can be reproduced.

The current sequence of events during application loading (during server
restart only; the sequence is slightly different when loading follows a
deployment) is shown below:

    - open archive (1)
    - read XML DD (2)
    - create class loader (say) *CL*, associate this with descriptor (3)
    - *validate descriptor using ApplicationValidator* (this can cause
      class loading) (4)
    - load application (5)
        - load RARs (6)
        - load all persistence units in application, including those of
          embedded modules (7)
             - provider uses puInfo.*getNewTempClassLoader()* for
               introspection. (8)
             - provider registers ClassTransformer with *CL*. (9)
        - load rest of the embedded modules (10)

To fix this, we just have to /load persistence units/ between steps #3 and #4.
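
To illustrate why the order matters, here is a toy model of the two
sequences. It is only a sketch: ToyLoader, PROVIDER_TRANSFORMER and the
"Customer" class name are made up for illustration and are not GlassFish
or TopLink code. Class bytes are modelled as plain strings, and the only
property being modelled is that a class loader defines a class once and
then returns the cached definition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy model of load ordering; not GlassFish code. */
public class LoadOrderSketch {

    /** Hypothetical stand-in for the application class loader (CL). */
    static final class ToyLoader {
        private final Map<String, String> defined = new HashMap<>();
        private final List<UnaryOperator<String>> transformers = new ArrayList<>();

        void addTransformer(UnaryOperator<String> t) {
            transformers.add(t);
        }

        /** Defines a class once; later calls return the cached definition. */
        String loadClass(String name) {
            return defined.computeIfAbsent(name, n -> {
                String bytes = "raw:" + n;
                for (UnaryOperator<String> t : transformers) {
                    bytes = t.apply(bytes);
                }
                return bytes;
            });
        }
    }

    /** Hypothetical stand-in for the provider's ClassTransformer. */
    static final UnaryOperator<String> PROVIDER_TRANSFORMER =
            b -> b.replace("raw:", "enhanced:");

    /** Current order: validation (step 4) runs before PU loading (steps 7-9). */
    static String currentOrder() {
        ToyLoader cl = new ToyLoader();
        cl.loadClass("Customer");                // validation happens to load an entity
        cl.addTransformer(PROVIDER_TRANSFORMER); // provider registers too late
        return cl.loadClass("Customer");         // cached, untransformed definition
    }

    /** Proposed order: load PUs between steps 3 and 4, then validate. */
    static String proposedOrder() {
        ToyLoader cl = new ToyLoader();
        cl.addTransformer(PROVIDER_TRANSFORMER); // transformer is in place first
        cl.loadClass("Customer");                // validation loads the entity
        return cl.loadClass("Customer");
    }

    public static void main(String[] args) {
        System.out.println(currentOrder());  // prints raw:Customer
        System.out.println(proposedOrder()); // prints enhanced:Customer
    }
}
```

In the first sequence the entity was already defined when the transformer
arrived, so registering it has no effect on that class.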

Nowhere does annotation processing come into the picture. I don't think
the container should do any annotation processing of ORM annotations,
because the scanning rules are non-trivial. It is best handled by the
persistence provider.

This issue is not applicable to Java SE because in a Java SE environment
TopLink Essentials uses a javaagent to instrument byte codes.
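
For readers unfamiliar with the javaagent mechanism, the following is a
minimal sketch of an agent using the standard java.lang.instrument API.
It is not TopLink's actual agent: the class names and the
"com/example/model/" prefix are assumptions made up for illustration, and
the transformer returns the bytes unchanged instead of enhancing them.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

/** Sketch of a Java SE agent; not TopLink's actual agent. */
public class EntityAgentSketch {

    /** Hypothetical package prefix of entity classes, in internal (slash) form. */
    static final String ENTITY_PREFIX = "com/example/model/";

    static final class PassThroughTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className == null || !className.startsWith(ENTITY_PREFIX)) {
                return null; // null tells the JVM to leave the class unchanged
            }
            // A real provider would rewrite the bytes here; this sketch
            // only returns a copy of the original class file.
            return classfileBuffer.clone();
        }
    }

    /** Invoked by the JVM before main() when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new PassThroughTransformer());
    }
}
```

Because premain runs before the application's main method, the
transformer is registered before any application class is defined, which
is why the ordering problem does not arise on this path. The agent JAR
has to name the agent class in its manifest (Premain-Class).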

There are other reasons why I had postponed addressing this issue. The
rules about instrumentation are not clear in the spec. First, the spec
does not clearly specify which classes can be transformed by a
provider. Is it limited to entity/embeddable classes only, or can a
provider also transform classes related to @Entity classes, e.g. their
super classes? Secondly, what happens if a class that needs to be
transformed is loaded by a class loader that is a parent of the
application class loader? (This can happen if that class is part of an
installed optional package, which in our server is loaded by the
extension class loader.)
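
The parent class loader question can be seen with plain JDK classes. The
sketch below assumes nothing about our server: it creates an empty
URLClassLoader as a stand-in for the application class loader and asks it
for a class that only a parent can supply. The class name
ParentDelegationSketch and the method childDefines are made up for
illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

/** Shows that a class found by a parent loader is not defined by the child. */
public class ParentDelegationSketch {

    /**
     * Loads className through a fresh child loader and reports whether
     * the child itself ended up as the defining loader.
     */
    static boolean childDefines(String className) {
        ClassLoader parent = ParentDelegationSketch.class.getClassLoader();
        try (URLClassLoader appLoader = new URLClassLoader(new URL[0], parent)) {
            Class<?> c = appLoader.loadClass(className);
            return c.getClassLoader() == appLoader;
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // java.util.ArrayList is defined by the bootstrap loader, so the
        // child loader only delegates and never defines it.
        System.out.println(childDefines("java.util.ArrayList")); // prints false
    }
}
```

A transformer that is hooked into the child loader's class definition
never sees such a class, which is exactly the situation the spec leaves
open.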

Hope I made sense.

Thanks,
Sahoo

P.S.:
Sequence of events when app is deployed:
------------------------------------------------------
    - open archive
    - create a class loader (say) *CL1*
    - process annotation using *CL1*
    - read XML DD
    - validate descriptor using ApplicationValidator
At this point deployment is over and the class loader is released;
loading begins.
    - create a new class loader (say) *CL2*, associate it with the
      descriptor.
    - load application
        - load RARs
        - load all persistence units in application, including those of
          embedded modules
             - provider uses puInfo.*getNewTempClassLoader()* for
               introspection.
             - provider registers ClassTransformer with *CL2*.
        - load rest of the embedded modules
*CL2* is subsequently used as long as the app is alive.

There are no issues here even if @Entity annotated classes are also
annotated with Java EE annotations.


Jerome Dochez wrote:

> Hi All
>
> This will be a long email so bear with me...
>
> Talking with Mitesh, I discovered that we process persistence units
> too late in the application deployment/loading. The issue is that
> today, we do the following :
> - open the archive
> - process Java EE annotations (non persistence ones)
> - load XML
> - optional step : deployment
> - load application in ejb container, which triggers
> + load persistence units with a new classloader
> (getNewEjbClassLoader API)
> + process @Entity entities
> + create a new classloader with the provided ClassTransformer
> + reload the enhanced application classes.
> - run the application.
>
> This is incorrect because :
> 1. we create 3 class loaders when loading
> 2. the classloader that was used to populate the DOL is not the
> same as the one used to load the classes in the container (3rd class
> loader) which is an issue since on the DOL, we store information like
> the Method object related to security artifacts and such...
> 3. the classloader on the DOL is not reset hence we keep reference
> to at least 2 classloaders during the lifetime of the application.
>
> We got lucky up to now because most @Entity classes do not use other
> Java EE annotations and this did not trigger failures but it is an
> unacceptable assumption.
>
> It seems to me we need to change the way persistence units are loaded
> in the application server to the following :
>
> - open the archive
> - create a temp class loader
> - find all persistence units and initialize the PersistenceProvider
> with each of them and get the ClassTransformer from the provider
> - initialize a new ClassLoader with the ClassTransformer to use
> from now on
> - process Java EE annotations
> - load XML
> - (deployment)
> - load applications.
>
> When there are no persistence units in the deployment artifact, we
> skip 2, 3 and 4
>
> Sahoo/Mitesh : how would that impact the Java SE path ?
>
> This way we create only 2 class loaders (one to load the unchanged
> classes, one to load the bytecode enhanced ones). More importantly,
> we ensure that the classes used by the Java EE annotation processor and
> deployment are the *final* classes as they will be run in the EJB
> container.
>
> Also a number of APIs on the PersistenceProvider interface like
> getManagedClassNames() could allow us to plug my annotation scanner
> enhancement to avoid loading extra classes when we know they do not
> contain interesting annotations. I can work this out with Sahoo and
> Mitesh.
>
> Let me know if I missed anything.
>
> Thanks, Jerome




attached mail follows:



Sanjeeb Kumar Sahoo wrote:

>
> There is no annotation processing during server restart. So, why
> should some classes be loaded as part of server restart to build
> descriptors? Can you please give me some examples of what kinds of
> classes must be loaded?

The injection information in the descriptor, even the full descriptor,
is not sufficient. The .class must be consulted to fully process it,
e.g. to determine field vs. method injection. In addition, the java
type within ejb-ref, ejb-local-ref, resource-ref, etc. is now optional
in the descriptor. The rationale was the deployment process could
determine the correct typing information from the .class.
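
As a rough illustration of why the .class has to be consulted, the
sketch below resolves an injection target name against a class using
reflection. It is not the actual validator logic: the class
InjectionTargetSketch, the Kind enum, SampleBean and the "field first,
then setter" lookup order are all assumptions made for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

/** Sketch of field vs. method injection target resolution via reflection. */
public class InjectionTargetSketch {

    enum Kind { FIELD, METHOD, NONE }

    /**
     * Looks for a declared field named targetName, then for a
     * one-argument setter ("set" + capitalized targetName).
     */
    static Kind resolve(Class<?> targetClass, String targetName) {
        if (targetName.isEmpty()) {
            return Kind.NONE;
        }
        for (Field f : targetClass.getDeclaredFields()) {
            if (f.getName().equals(targetName)) {
                return Kind.FIELD;
            }
        }
        String setter = "set" + Character.toUpperCase(targetName.charAt(0))
                + targetName.substring(1);
        for (Method m : targetClass.getDeclaredMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                return Kind.METHOD;
            }
        }
        return Kind.NONE;
    }

    /** Hypothetical bean with one field target and one setter target. */
    static class SampleBean {
        Object dataSource;

        void setMailSession(Object session) {
            // intentionally empty; only the signature matters here
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(SampleBean.class, "dataSource"));  // prints FIELD
        System.out.println(resolve(SampleBean.class, "mailSession")); // prints METHOD
    }
}
```

The point is that resolve needs a Class object, and obtaining one means
the class has been loaded by some class loader before the application
starts running.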

>
>
>> That's not the piece of code that should be moving. You'll need to
>> either :
>>
>> a) ensure that a different classloader is used for processing
>> the .ear than is used as the runtime application classloader
>>
> Probably not an option as it is too difficult to manage two class loaders.

Not sure I understand where the complexity comes from here. There may be
some additional overhead associated with loading .classes once for
deployment processing and once when the application starts running but
that's a separate issue.

>
>
>> or
>>
>> b) ensure that the same classloader is used for processing the .ear
>> as is used for the runtime application classloader, but install your
>> transformer early enough that it's in place to handle any classloading
>> that happens to load the entity .classes you're concerned about.
>
>
> Yes, this is probably an option. I have not investigated completely.
> But I think, if we go along this route, we may have to bootstrap
> TopLink using a private API as opposed to using the
> PersistenceProvider.createEntityManagerFactory() method. So I don't
> want to do this unless this is the only option. Hence I would like to
> understand why we are loading classes during descriptor building process.
>
> I think it will be good if we can talk about this.
> Qingqing & Mahesh,
> what do you think? I think this stuff is very important for the
> TopLink integration with container.
>
> Thanks,
> Sahoo
>
>>> This can follow loading of par files in an ear file. Do you see any
>>> issues here?
>>>
>>> Thanks,
>>> Sahoo
>>>
>>> Kenneth Saks wrote:
>>>
>>>> Hi Sanjeeb,
>>>>
>>>> This issue is independent of the runtime injection manager. What
>>>> you're referring to is just the normal .ear processing that happens
>>>> when the deployment code reads the descriptors/annotations. When
>>>> deployment opens the .ear and processes the descriptors/annotations
>>>> lots of classloading takes place. The resulting set of descriptors
>>>> are passed to the runtime ejb/web containers so this stage must
>>>> take place before runtime application initialization.
>>>> There are lots of different code paths depending on standalone .jar
>>>> vs. .ear and initial deployment vs. server restart, so I'm not sure
>>>> whether the same classloader that is used to process the .ear is
>>>> always the one used as the runtime application classloader. I'm
>>>> pretty sure in the case of server restart they're one and the same,
>>>> but I don't know whether that's the case during an initial
>>>> deployment within PE/DAS.
>>>>
>>>>
>>>> Sanjeeb Kumar Sahoo wrote:
>>>>
>>>>> Hi Ken,
>>>>> This is regarding class transformation requirement from TopLink.
>>>>> TopLink expects us to call transform for each POJO entity class
>>>>> that is getting defined by application class loader. I have not
>>>>> understood injection manager implementation in our container well
>>>>> enough to say when it will load any classes during application
>>>>> load time. I see some injection related code in EjbBundleValidator
>>>>> and that in turn tries to load classes to detect injection
>>>>> property type. AFAIK, EjbBundleValidator gets called before
>>>>> ApplicationLoader.load() gets called for an application. Is this
>>>>> true?
>>>>> Does it mean we load some application classes before
>>>>> ApplicationLoader.load() is called? In that case, there is a
>>>>> remote chance that some POJO entity class may also have been
>>>>> loaded by injection manager (let's say the class that got loaded
>>>>> by EjbBundleValidator had a direct dependency on a POJO entity
>>>>> class and hence VM decided to load the POJO class). Since
>>>>> TopLink's ClassFileTransformer gets registered during
>>>>> ApplicationLoader.load() (this is my proposal), it is possible
>>>>> that certain POJO entity classes won't be transformed by TopLink.
>>>>> This might lead to bugs.
>>>>> Is it possible to delay loading of classes by injection manager?
>>>>>
>>>>> Thanks,
>>>>> Sahoo