dev@glassfish.java.net

Re: start-domain fails to finish for 600 seconds

From: Richard S. Hall <heavy_at_ungoverned.org>
Date: Fri, 18 Feb 2011 10:39:47 -0500

On 2/18/11 10:32, Ludovic Champenois wrote:
> On 2/18/11 7:17 AM, Richard S. Hall wrote:
>> On 2/18/11 9:52, Tom Mueller wrote:
>>> Vince,
>>> It appears that Felix is not properly detecting that the
>>> directory moved.
>>>
>>> It looks like it is seeing the "new" modules at the new directory
>>> name, but every new module looks like a duplicate of the old module
>>> at the old directory name. Felix is not realizing that the old
>>> modules are now gone.
>>
>> Felix is not involved in such issues; it is the framework launcher
>> that tries to re-deploy bundles, so it would be the one that would
>> need to detect it.
>>
>> -> richard
> Yep...Bad experience anyway...When I detect this case in Eclipse
> (upgrade of the GF runtime plugin for example), I clean the osgi-cache
> area...
>
> I think the fix should be extensive:
>
> 1/ make sure a domain cannot be accessed at the same time by 2
> server installations (not done currently, and I am pretty sure random
> things would happen to the saved domain.xml used by 2 servers):
> - put a lock file (like most products that access possibly
> non-shareable resources, e.g. Eclipse, OpenOffice...)

The version of the Felix framework in GFv3.1 does now use a lock file.

> - record somewhere in the domain the location of the server that last
> used the domain.
> - at startup, if the recorded server location is not the same as the
> current server location, flush the osgi-cache area.
>

That might work.
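
To make the suggestion concrete, here is a rough sketch of the
"remember the last server location, flush the cache if it changed"
check. It is only an illustration: the class name OsgiCacheGuard, the
marker file name and the directory arguments are all made up and do not
reflect the actual GlassFish layout or launcher code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of GlassFish or Felix.
final class OsgiCacheGuard {
    // Assumed marker file name, stored inside the domain directory.
    static final String MARKER = "last-install-root.txt";

    /**
     * Compares the recorded install root with the current one, deletes the
     * cache directory if they differ, then records the current root.
     * Returns true if a different install root had been recorded.
     */
    static boolean flushIfInstallMoved(Path domainDir, Path installRoot, Path cacheDir)
            throws IOException {
        Path marker = domainDir.resolve(MARKER);
        String current = installRoot.toAbsolutePath().normalize().toString();
        boolean moved = false;
        if (Files.exists(marker)) {
            String recorded =
                    new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
            moved = !current.equals(recorded);
        }
        if (moved && Files.exists(cacheDir)) {
            deleteRecursively(cacheDir);
        }
        Files.write(marker, current.getBytes(StandardCharsets.UTF_8));
        return moved;
    }

    private static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

On the very first start there is no marker yet, so nothing is flushed
and the current location is simply recorded.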

What we could also do is modify our launcher so that it doesn't use the
file system path as the location of the bundle, but perhaps just the JAR
file name (we could also peek inside the JAR and get the bundle symbolic
name, but there is a performance issue here). Then it wouldn't matter if
the directory was moved or renamed. We'd still have an issue with File
Install, though, which also uses the file system location.
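
Roughly, the two ways of computing a path-independent location could
look like the sketch below. The class name BundleLocations and the
"gf-module:" prefix are invented for illustration; the resulting string
would be what the launcher passes as the location argument of
BundleContext.installBundle(String, InputStream).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not the actual launcher code.
final class BundleLocations {

    /** Cheap variant: the location depends only on the JAR file name. */
    static String fromFileName(Path jar) {
        return "gf-module:" + jar.getFileName(); // "gf-module:" is an assumed prefix
    }

    /**
     * Costlier variant: open the JAR and read Bundle-SymbolicName from its
     * manifest. Returns null if the JAR has no manifest or no such header.
     */
    static String symbolicName(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Manifest manifest = jarFile.getManifest();
            if (manifest == null) {
                return null;
            }
            String value = manifest.getMainAttributes().getValue("Bundle-SymbolicName");
            if (value == null) {
                return null;
            }
            // Drop directives such as ";singleton:=true".
            int semicolon = value.indexOf(';');
            return (semicolon < 0 ? value : value.substring(0, semicolon)).trim();
        }
    }
}
```

With fromFileName, the same JAR under the old and the new install
directory maps to the same location string. The symbolicName variant
has to open every JAR, which is where the performance cost comes from.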

-> richard

> Only trouble would be a stale lock file (i.e. after a crash or a
> Ctrl-C that did not shut down the server correctly). We would need to
> ask the user a question: Continue and unlock? Stop?
>
> Ludo
>
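
On the stale lock concern: one option that might avoid having to ask
the user is an OS-level lock taken through java.nio's FileLock. Such a
lock is held on behalf of the JVM process and goes away when that
process exits, so a crash or Ctrl-C should not leave the domain locked
even though the file itself stays on disk. A minimal sketch, assuming a
made-up class name DomainLock and file name domain.lock; behaviour on
network file systems may differ.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not existing GlassFish code.
final class DomainLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private DomainLock(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    /** Returns a held lock, or null if the domain is already in use. */
    static DomainLock tryAcquire(Path domainDir) throws IOException {
        FileChannel ch = FileChannel.open(domainDir.resolve("domain.lock"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock held = null;
        try {
            // Returns null when another process holds the lock.
            held = ch.tryLock();
        } catch (OverlappingFileLockException e) {
            // The lock is already held elsewhere in this same JVM.
        } finally {
            if (held == null) {
                ch.close();
            }
        }
        return held == null ? null : new DomainLock(ch, held);
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

A server would call tryAcquire at startup, refuse to start on null, and
keep the returned object open until shutdown.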