dev@glassfish.java.net

RE: Strange phenomenon when failed in loading the ejb application

From: Jeremy Lv <lvsongping_at_cn.fujitsu.com>
Date: Mon, 27 Jan 2014 11:10:20 +0800

Hi, Hong:



    Sorry for replying the mail later as I’m a little busy during the
weekends, here’s the further investigation in line:



From: Hong Zhang [mailto:hong.hz.zhang_at_oracle.com]
Sent: Saturday, January 25, 2014 11:12 AM
To: dev_at_glassfish.java.net
Cc: marina.vatkina_at_oracle.com; Jeff Tancill
Subject: Re: Strange phenomenon when failed in loading the ejb application



Hi, Jeremy

Please see my comments in line:

On 1/24/2014 3:08 AM, Jeremy Lv wrote:

Hi, Hong:

Cc: Marina, Jeff:



Here’s my Comment:



It is by design that we don't generate or delete any files for load and
unload. The file generation and deletion only happens during prepare and
clean phases as part of the deployment and undeployment.
Jeremy Lv: If it is the right scenario that we shouldn’t generate or delete
any files for load and unload, the phenomenon I have describe is also
different from the right scenario. The Ejb related policy file will be
created when execute the appInfo.load(context, tracker) in the method of
ApplicationLifecycle.deploy() when start the cluster after the ejb
application has been deployed.



In the case you described, as the cluster was offline when the initial
deployment happens, we could not detect this type of error that time to roll
back. And when the error happens during the cluster starts up (which is a
load for the application), we don't do roll back. So this is kind of
expected behavior with the current design. At that time you would have to
perform an undeploy operation to clean up the application bits explicitly.

Jeremy Lv: I just feel it is strange that the ejb related policy file is
created during the cluster starts up (which is a load for the application),
why not rollback the load process when the cluster started(Delete the ejb
related policy file created during the load process)?

In general, the generated artifacts are synced over from DAS to cluster when
the (offline) cluster starts up instead of being generated as part of the
cluster start up. But maybe policy system does things differently here. If
the policy is generated as part of the cluster start up in load phase,
probably we should clean it up as part of the roll back in unload phase.



This may cause the some critical or blocker issues:



【Sample】

Here’s one of the similar blocker issue still opened in the JIRA

https://java.net/jira/browse/GLASSFISH-11348

Here’s the reproduced steps:

1) asadmin start-domain

2) asadmin create-cluster clu1

3) asadmin create-instance -cluster clu1 ins1

4) asadmin deploy --target clu1 compositeweb-ejb-1.0-SNAPSHOT.jar

5) asadmin deploy --target clu1 compositeweb-web-1.0-SNAPSHOT.war

6) asadmin start-cluster clu1

7) asadmin stop-cluster clu1

8) asadmin start-cluster clu1



After step(8), the following exception will be thrown out:

[2014-01-24T15:52:04.855+0800] [glassfish 4.0] [SEVERE] [NCLS-CORE-00026]
[javax.enterprise.system.core] [tid: _ThreadID=17
_ThreadName=RunLevelControllerThread-1390549911763] [timeMillis:
1390549924855] [levelValue: 1000] [[

  Exception during lifecycle processing

org.glassfish.deployment.common.DeploymentException: Error in linking
security policy for compositeweb-web-1.0-SNAPSHOT -- Inconsistent Module
State

              at
com.sun.enterprise.security.ee.SecurityUtil.linkPolicyFile(SecurityUtil.java
:336)

              at
com.sun.enterprise.security.ee.SecurityDeployer.linkPolicies(SecurityDeploye
r.java:318)

              at
com.sun.enterprise.security.ee.SecurityDeployer.access$100(SecurityDeployer.
java:87)

              at
com.sun.enterprise.security.ee.SecurityDeployer$AppDeployEventListener.event
(SecurityDeployer.java:145)

              at org.glassfish.kernel.event.EventsImpl.send(EventsImpl.java:
131)

              at
org.glassfish.internal.data.ApplicationInfo.load(ApplicationInfo.java:328)

              at
com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycl
e.java:496)

              at
com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(App
licationLoaderService.java:406)

              at
com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(Applicat
ionLoaderService.java:243)

              at
org.jvnet.hk2.internal.ClazzCreator.postConstructMe(ClazzCreator.java:298)

              at
org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:346)

              at
org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:456)

              at
org.glassfish.hk2.runlevel.internal.AsyncRunLevelContext.findOrCreate(AsyncR
unLevelContext.java:225)

              at
org.glassfish.hk2.runlevel.RunLevelContext.findOrCreate(RunLevelContext.java
:82)

              at
org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2358)

              at
org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:9
8)

              at
org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:8
7)

             at
org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$QueueRunner.oneJob(Cur
rentTaskFuture.java:1162)

              at
org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$QueueRunner.run(Curren
tTaskFuture.java:1147)

              at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:11
10)

              at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:6
03)

              at java.lang.Thread.run(Thread.java:722)

]]



【Reason】

1). The policy file hasn’t been removed when stop the cluster or the policy
file shouldn’t be created when start the domain.



【Solution】

This is the reason why I regarded this scenario as a bug but not the
expected behavior. After all, I have the following idea to fix this bug

1) The policy file shouldn’t be created when load the application.

Can you check which phase this is done? In SecurityDeployer.load or prepare?

Jeremy Lv: Here’s the stack trace how to generate the policy file:

       [stack trace]:

SecurityDeployer.commitEjbs(Application) line: 284

SecurityDeployer.access$200(SecurityDeployer, Application) line: 87

SecurityDeployer$AppDeployEventListener.event(EventListener$Event) line: 146


EventsImpl.send(EventListener$Event, boolean) line: 131

ApplicationInfo.load(ExtendedDeploymentContext, ProgressTracker) line: 328


ApplicationLifecycle.deploy(Collection<Sniffer>, ExtendedDeploymentContext)
line: 496



[Conclution]:

The policy file is created after SecurityDeployer.load, it will be created
when execute “events.send(new
Event<ApplicationInfo>(Deployment.APPLICATION_LOADED, this), false)” in
ApplicationInfo.load, which means it will execute the internal method
AppDeployEventListener.event in SecurityDeployer. The policy file will be
created during the process to execute AppDeployEventListener.event. which
means the policy file has been created during the period of loading the
application info but not created during the period of SecurityDeployer.load.



2) The rollback action when failed in loading or starting the application
should remove the policy file which is created during the load process.

If the policy generation happens in SecurityDeployer.load, then the policy
removing should happen in SecurityDeployer.unload which will be executed as
part of the roll back.
Jeremy Lv: No, the the policy generation happens in SecurityDeployer.load,
so I think the policy removing should happen during the period of
EjbDeployer.clean as part of the roll back. I have attached the patch again
which is same to the patch I have attached at first time. The changes will
also resolve the issue of https://java.net/jira/browse/GLASSFISH-11348, but
I don’t know whether the changes will cause any regressions. So I wonder if
you can help me to review and confirm the patch.



Thanks



Jeremy



Thanks,

- Hong





Wish if you can give me some suggestions about this.



Thanks



Regards

Jeremy Lv





From: Hong Zhang [mailto:hong.hz.zhang_at_oracle.com]
Sent: Friday, January 24, 2014 12:20 AM
To: Jeremy Lv; marina.vatkina_at_oracle.com
Cc: dev_at_glassfish.java.net
Subject: Re: Strange phenomenon when failed in loading the ejb application



Hi, Jeremy

It is by design that we don't generate or delete any files for load and
unload. The file generation and deletion only happens during prepare and
clean phases as part of the deployment and undeployment.

In the case you described, as the cluster was offline when the initial
deployment happens, we could not detect this type of error that time to roll
back. And when the error happens during the cluster starts up (which is a
load for the application), we don't do roll back. So this is kind of
expected behavior with the current design. At that time you would have to
perform an undeploy operation to clean up the application bits explicitly.

Thanks,

- Hong

On 1/23/2014 6:29 AM, Jeremy Lv wrote:

Hi, Hong, Marina:

Cc: dev:





    I have found a strange situation recently in failing deploy the ejb
application:

Here’s my reproduced steps to reproduce this phenomenon:

1) asadmin start-domain

2) asadmin create-cluster clu1

3) asadmin create-instance -cluster clu1 ins1

4) asadmin deploy --target clu1 compositeweb-ejb.jar

5) Confirm whether there’s no file called compositeweb-ejb exist under
the directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy

6) asadmin start-cluster clu1

7) The file called compositeweb-ejb has been generated under the
directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy but
hasn’t been removed even it is failed in loading the ejb application



On the other hand, if you failed in deploying the ejb application after the
cluster has been started, the file called compositeweb-ejb under the
directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy will be
removed as expected:

1) asadmin start-domain

2) asadmin create-cluster clu1

3) asadmin create-instance -cluster clu1 ins1

4) asadmin start-cluster clu1

5) Confirm whether there’s no file called compositeweb-ejb exist under
the directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy

6) asadmin deploy --target clu1 compositeweb-ejb.jar

7) The file called compositeweb-ejb has been generated under the
directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy when
loading the application and it will be removed because of the failure of
deploying the ejb applications





【Conclusion】

After Looking into the method of EjbDeployer.clean(DeploymentContext), you
will found the following related code segment to illustrate the reason why
the scenario is different:

        //Security related cleanup is to be done for the undeploy event

◆ if( params.origin.isUndeploy()|| params.origin.isDeploy()) {



            //Removing EjbSecurityManager for undeploy case

            String appName = params.name();

            String[] contextIds =

                    ejbSecManagerFactory.getContextsForApp(appName, false);

            if (contextIds != null) {

                for (String contextId : contextIds) {

                    try {

                        //TODO:appName is not correct, we need the module
name

                        //from the descriptor.


probeProvider.policyDestructionStartedEvent(contextId);

◆ SecurityUtil.removePolicy(contextId);
// It will remove the policy file it this code has been executed.


probeProvider.policyDestructionEndedEvent(contextId);

                        probeProvider.policyDestructionEvent(contextId);

                    } catch (IASSecurityException ex) {

                        _logger.log(Level.WARNING, "Error removing the
policy file " +

                                "for application " + appName + " " + ex);

                    }



                    ArrayList<EJBSecurityManager> managers =

                            ejbSecManagerFactory.getManagers(contextId,
false);

                    if (managers != null) {

                        for (EJBSecurityManager m : managers) {

                            m.destroy();

                        }

                   }

                }

            }

            //Removing the RoleMapper

            SecurityUtil.removeRoleMapper(dc);

        }



(1). If you start the cluster after the ejb and web application had been
deployed, the value of params.origin will be “load” and it will return
false when execute “if( params.origin.isUndeploy()||
params.origin.isDeploy())”. The SecurityUtil.removePolicy(contextId) won’t
be executed, this is the reason why the policy can’t be removed when
execute the rollback action.

(2). I have also confirmed that if you deploy the web or ejb application
after the cluster is started, the value of params.origin will be
“deploy_instance” and it will return true when execute “if(
params.origin.isUndeploy()|| params.origin.isDeploy())”. Then it will
execute the SecurityUtil.removePolicy(contextId) to remove the related
policy file.





【The Key point question】

Why the scenario in rollback action is different between load and deploy the
ejb application, why remove the policy file when failed in deploying the ejb
application after cluster has been started but still keep the policy while
starting the cluster after the ejb application has been deployed.



【Conclusion】

I don’t know whether it is the right scenario or it is a bug, In my human
option, I think it is a bug,The policy file should be removed during the
rollback action both in failing loading the application and deploying the
application to consist these two different scenario.



Please also let me know if it is a right scenario.



Thanks a lot!



Best regards

Jeremy Lv

--------------------------------------------------

Lv Songping

Software Division II

    Development Department I

Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)

ADDR.: No.6 Wenzhu Road, Software Avenue,

        Nanjing, 210012, China

TEL : +86+25-86630566-9327

COINS: 7998-9327

FAX : +86+25-83317685

MAIL : lvsongping_at_cn.fujitsu.com

BLOG : https://www.java.net//author/jeremy-lv
<https://www.java.net/author/jeremy-lv>