dev@glassfish.java.net

Re: Strange phenomenon when failed in loading the ejb application

From: Hong Zhang <hong.hz.zhang_at_oracle.com>
Date: Mon, 27 Jan 2014 08:42:12 -0500

Hi, Jeremy

I see, so the policy generation is done as part of the listening to the
load event.

Your patch looks fine to me. But it will be good for someone from ejb
team and/or security team to take a look to make sure it will not cause
any side effect.

You should run ejb dev tests to make sure all tests pass in the meantime.

Thanks,

- Hong

On 1/26/2014 10:10 PM, Jeremy Lv wrote:
>
> Hi,Hong:
>
> Sorry for replying the mail later as I’m a little busy during the
> weekends, here’s the further investigation in line:
>
> *From:*Hong Zhang [mailto:hong.hz.zhang_at_oracle.com]
> *Sent:* Saturday, January 25, 2014 11:12 AM
> *To:* dev_at_glassfish.java.net
> *Cc:* marina.vatkina_at_oracle.com; Jeff Tancill
> *Subject:* Re: Strange phenomenon when failed in loading the ejb
> application
>
> Hi, Jeremy
>
> Please see my comments in line:
>
> On 1/24/2014 3:08 AM, Jeremy Lv wrote:
>
> Hi, Hong:
>
> Cc: Marina, Jeff:
>
> Here’s my Comment:
>
> It is by design that we don't generate or delete any files for
> load and unload. The file generation and deletion only happens
> during prepare and clean phases as part of the deployment and
> undeployment.
> Jeremy Lv: If it is the right scenario that we shouldn’t generate
> or delete any files for load and unload, the phenomenon I have
> describe is also different from the right scenario. The Ejb
> related policy file will be created when execute the
> appInfo.load(context, tracker) in the method of
> ApplicationLifecycle.deploy() when start the cluster after the ejb
> application has been deployed.
>
> In the case you described, as the cluster was offline when the
> initial deployment happens, we could not detect this type of error
> that time to roll back. And when the error happens during the
> cluster starts up (which is a load for the application), we don't
> do roll back. So this is kind of expected behavior with the
> current design. At that time you would have to perform an undeploy
> operation to clean up the application bits explicitly.
>
> Jeremy Lv:I just feel it is strange that the ejb related policy
> file is created during the cluster starts up (which is a load for
> the application), why not rollback the load process when the
> cluster started(Delete the ejb related policy file created during
> the load process)?
>
> In general, the generated artifacts are synced over from DAS to
> cluster when the (offline) cluster starts up instead of being
> generated as part of the cluster start up. But maybe policy system
> does things differently here. If the policy is generated as part of
> the cluster start up in load phase, probably we should clean it up as
> part of the roll back in unload phase.
>
> This may cause the some critical or blocker issues:
>
> 【Sample】
>
> Here’s one of the similar blocker issue still opened in the JIRA
>
> https://java.net/jira/browse/GLASSFISH-11348
>
> Here’s the reproduced steps:
>
> 1)asadmin start-domain
>
> 2)asadmin create-cluster clu1
>
> 3)asadmin create-instance --cluster clu1 ins1
>
> 4)asadmin deploy --target clu1 compositeweb-ejb-1.0-SNAPSHOT.jar
>
> 5)asadmin deploy --target clu1 compositeweb-web-1.0-SNAPSHOT.war
>
> 6)asadmin start-cluster clu1
>
> 7)asadmin stop-cluster clu1
>
> 8)asadmin start-cluster clu1
>
> After step(8), the following exception will be thrown out:
>
> [2014-01-24T15:52:04.855+0800] [glassfish 4.0] [SEVERE]
> [NCLS-CORE-00026] [javax.enterprise.system.core] [tid: _ThreadID=17
> _ThreadName=RunLevelControllerThread-1390549911763] [timeMillis:
> 1390549924855] [levelValue: 1000] [[
>
> Exception during lifecycle processing
>
> org.glassfish.deployment.common.DeploymentException: Error in linking
> security policy for compositeweb-web-1.0-SNAPSHOT -- Inconsistent
> Module State
>
> at
> com.sun.enterprise.security.ee.SecurityUtil.linkPolicyFile(SecurityUtil.java:336)
>
> at
> com.sun.enterprise.security.ee.SecurityDeployer.linkPolicies(SecurityDeployer.java:318)
>
> at
> com.sun.enterprise.security.ee.SecurityDeployer.access$100(SecurityDeployer.java:87)
>
> at
> com.sun.enterprise.security.ee.SecurityDeployer$AppDeployEventListener.event(SecurityDeployer.java:145)
>
> at org.glassfish.kernel.event.EventsImpl.send(EventsImpl.java:131)
>
> at
> org.glassfish.internal.data.ApplicationInfo.load(ApplicationInfo.java:328)
>
> at
> com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:496)
>
> at
> com.sun.enterprise.v3.server.ApplicationLoaderService.processApplication(ApplicationLoaderService.java:406)
>
> at
> com.sun.enterprise.v3.server.ApplicationLoaderService.postConstruct(ApplicationLoaderService.java:243)
>
> at
> org.jvnet.hk2.internal.ClazzCreator.postConstructMe(ClazzCreator.java:298)
>
> at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:346)
>
> at
> org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:456)
>
> at
> org.glassfish.hk2.runlevel.internal.AsyncRunLevelContext.findOrCreate(AsyncRunLevelContext.java:225)
>
> at
> org.glassfish.hk2.runlevel.RunLevelContext.findOrCreate(RunLevelContext.java:82)
>
> at org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2358)
>
> at
> org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:98)
>
> at
> org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:87)
>
> at
> org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$QueueRunner.oneJob(CurrentTaskFuture.java:1162)
>
> at
> org.glassfish.hk2.runlevel.internal.CurrentTaskFuture$QueueRunner.run(CurrentTaskFuture.java:1147)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>
> at java.lang.Thread.run(Thread.java:722)
>
> ]]
>
> 【Reason】
>
> 1). The policy file hasn’t been removed when stop the cluster or the
> policy file shouldn’t be created when start the domain.
>
> 【Solution】
>
> This is the reason why I regarded this scenario as a bug but not the
> expected behavior. After all, I have the following idea to fix this bug
>
> 1)The policy file shouldn’t be created when load the application.
>
> Can you check which phase this is done? In SecurityDeployer.load or
> prepare?
> Jeremy Lv: Here’s the stack trace how to generate the policy file:
>
> [stack trace]:
>
> SecurityDeployer.commitEjbs(Application) line: 284
>
> SecurityDeployer.access$200(SecurityDeployer, Application) line: 87
>
> SecurityDeployer$AppDeployEventListener.event(EventListener$Event)
> line: 146
>
> EventsImpl.send(EventListener$Event, boolean) line: 131
>
> ApplicationInfo.load(ExtendedDeploymentContext, ProgressTracker) line:
> 328
>
> ApplicationLifecycle.deploy(Collection<Sniffer>,
> ExtendedDeploymentContext) line: 496
>
> [Conclution]:
>
> The policy file is created after SecurityDeployer.load, it will be
> created when execute “events.send(new
> Event<ApplicationInfo>(Deployment.APPLICATION_LOADED, this), false)”
> in ApplicationInfo.load, which means it will execute the internal
> method AppDeployEventListener.event in SecurityDeployer. The policy
> file will be created during the process to execute
> AppDeployEventListener.event. which means the policy file has been
> created during the period of loading the application info but not
> created during the period of SecurityDeployer.load.
>
> 2)The rollback action when failed in loading or starting the
> application should remove the policy file which is created during the
> load process.
>
> If the policy generation happens in SecurityDeployer.load, then the
> policy removing should happen in SecurityDeployer.unload which will be
> executed as part of the roll back.
> Jeremy Lv: No, the the policy generation happens in
> SecurityDeployer.load, so I think the policy removing should happen
> during the period of EjbDeployer.clean as part of the roll back. I
> have attached the patch again which is same to the patch I have
> attached at first time. The changes will also resolve the issue of
> https://java.net/jira/browse/GLASSFISH-11348,but I don’t know whether
> the changes will cause any regressions. So I wonder if you can help me
> to review and confirm the patch.
>
> Thanks
>
> Jeremy
>
> Thanks,
>
> - Hong
>
> Wish if you can give me some suggestions about this.
>
> Thanks
>
> Regards
>
> Jeremy Lv
>
> *From:*Hong Zhang [mailto:hong.hz.zhang_at_oracle.com]
> *Sent:* Friday, January 24, 2014 12:20 AM
> *To:* Jeremy Lv; marina.vatkina_at_oracle.com
> <mailto:marina.vatkina_at_oracle.com>
> *Cc:* dev_at_glassfish.java.net <mailto:dev_at_glassfish.java.net>
> *Subject:* Re: Strange phenomenon when failed in loading the ejb
> application
>
> Hi, Jeremy
>
> It is by design that we don't generate or delete any files for load
> and unload. The file generation and deletion only happens during
> prepare and clean phases as part of the deployment and undeployment.
>
> In the case you described, as the cluster was offline when the initial
> deployment happens, we could not detect this type of error that time
> to roll back. And when the error happens during the cluster starts up
> (which is a load for the application), we don't do roll back. So this
> is kind of expected behavior with the current design. At that time you
> would have to perform an undeploy operation to clean up the
> application bits explicitly.
>
> Thanks,
>
> - Hong
>
> On 1/23/2014 6:29 AM, Jeremy Lv wrote:
>
> Hi, Hong, Marina:
>
> Cc: dev:
>
> I have found a strange situationrecentlyin failing deploy the ejb
> application:
>
> Here’s my reproduced steps to reproduce this phenomenon:
>
> 1)asadmin start-domain
>
> 2)asadmin create-cluster clu1
>
> 3)asadmin create-instance --cluster clu1 ins1
>
> 4)asadmin deploy --target clu1 compositeweb-ejb.jar
>
> 5)Confirm whether there’s no file called compositeweb-ejb exist
> under the directory of
> GF_HOME\nodes\localhost-domain1\ins1\generated\policy
>
> 6)asadmin start-cluster clu1
>
> 7)The file called compositeweb-ejb has been generated under the
> directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy
> but hasn’t been removed even it is failed in loading the ejb
> application
>
> On the other hand, if you failed in deploying the ejb application
> after the cluster has been started, the file called
> compositeweb-ejb under the directory of
> GF_HOME\nodes\localhost-domain1\ins1\generated\policy will be
> removedas expected:
>
> 1)asadmin start-domain
>
> 2)asadmin create-cluster clu1
>
> 3)asadmin create-instance --cluster clu1 ins1
>
> 4)asadmin start-cluster clu1
>
> 5)Confirm whether there’s no file called compositeweb-ejb exist
> under the directory of
> GF_HOME\nodes\localhost-domain1\ins1\generated\policy
>
> 6)asadmin deploy --target clu1 compositeweb-ejb.jar
>
> 7)The file called compositeweb-ejb has been generated under the
> directory of GF_HOME\nodes\localhost-domain1\ins1\generated\policy
> when loading the application and it will be removed because of the
> failure of deploying the ejb applications
>
> 【Conclusion】
>
> After Looking into the method of
> EjbDeployer.clean(DeploymentContext), you will found the following
> related code segmentto illustrate the reason why the scenario is
> different:
>
> //Security related cleanup is to be done for the undeploy event
>
> ◆if( params.origin.isUndeploy()|| params.origin.isDeploy()) {
>
> //Removing EjbSecurityManager for undeploy case
>
> String appName = params.name();
>
> String[] contextIds =
>
> ejbSecManagerFactory.getContextsForApp(appName, false);
>
> if (contextIds != null) {
>
> for (String contextId : contextIds) {
>
> try {
>
> //TODO:appName is not correct, we need the module name
>
> //from the descriptor.
>
> probeProvider.policyDestructionStartedEvent(contextId);
>
> ◆SecurityUtil.removePolicy(contextId); // It will remove the
> policy file it this code has been executed.
>
> probeProvider.policyDestructionEndedEvent(contextId);
>
> probeProvider.policyDestructionEvent(contextId);
>
> } catch (IASSecurityException ex) {
>
> _logger.log(Level.WARNING, "Error removing the policy file " +
>
> "for application " + appName + " " + ex);
>
> }
>
> ArrayList<EJBSecurityManager> managers =
>
> ejbSecManagerFactory.getManagers(contextId, false);
>
> if (managers != null) {
>
> for (EJBSecurityManager m : managers) {
>
> m.destroy();
>
> }
>
> }
>
> }
>
> }
>
> //Removing the RoleMapper
>
> SecurityUtil.removeRoleMapper(dc);
>
> }
>
> (1). If you start the cluster after the ejb and web application
> had been deployed, the value of params.origin will be “load”and it
> will return false when execute “if( params.origin.isUndeploy()||
> params.origin.isDeploy())”. The
> SecurityUtil.removePolicy(contextId) won’t be executed, this is
> the reason why the policy can’t be removed when execute the
> rollback action.
>
> (2). I have also confirmed that if you deploy the web or ejb
> application after the cluster is started, the value of
> params.origin will be “deploy_instance”and it will return true
> when execute “if( params.origin.isUndeploy()||
> params.origin.isDeploy())”. Then it will execute the
> SecurityUtil.removePolicy(contextId) to remove the related policy
> file.
>
> 【The Key point question】
>
> Why the scenario in rollback action is different between load and
> deploy the ejb application, why remove the policy file when failed
> in deploying the ejb application after cluster has been started
> but still keep the policy while starting the cluster after the ejb
> application has been deployed.
>
> 【Conclusion】
>
> I don’t know whether it is the right scenario or it is a bug, In
> my human option, I think it is a bug,The policy file should be
> removed during the rollback action both in failing loading the
> application and deploying the application to consist these two
> different scenario.
>
> Please also let me know if it is a right scenario.
>
> Thanks a lot!
>
> Best regards
>
> Jeremy Lv
>
> --------------------------------------------------
>
> Lv Songping
>
> Software Division II
>
> Development Department I
>
> Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
>
> ADDR.: No.6 Wenzhu Road, Software Avenue,
>
> Nanjing, 210012, China
>
> TEL : +86+25-86630566-9327
>
> COINS: 7998-9327
>
> FAX : +86+25-83317685
>
> MAIL : lvsongping_at_cn.fujitsu.com <mailto:lvsongping_at_cn.fujitsu.com>
>
> BLOG : https://www.java.net//author/jeremy-lv
> <https://www.java.net/author/jeremy-lv>
>