admin@glassfish.java.net

GFv3 bugs/illnesses/oddities

From: Jens Elkner <jel_at_cs.uni-magdeburg.de>
Date: Sat, 13 Mar 2010 05:38:28 +0100

Hi,

I am just trying to get started with GFv3/SGFv3-ES on Solaris 10+, but it
gives me the impression that at least the pkg stuff is pretty bogus and
that nobody really reviewed the setup procedures from a real
administrator's point of view, i.e. with security/SMF in mind.

So to make it a little more understandable and to be able to point to
the annoying stuff, here is my idea of an "ideal" GF setup, which lets an
admin feel he is dealing with regular, "no headaches" software (for less
clutter/laziness I use '+' below as an alias for pfexec, which in turn
assumes profiles=Primary Administrator/roles=root):

Example variables used below for better understanding:

INSTALL_HOME = /opt/glassfish/v3
DATA_DIR = /var/data/gf3
AS_USER = webservd
DOMAIN = jforum

(1) Install GF software (SW) as root or privileged user (RBAC SW admin)
    using the command line (i.e. no GUI!!!)
    Command:
        + sh /tmp/sges-v3-unix.sh -s -a /tmp/answer
    Result:
        Everything should be placed below $INSTALL_HOME with owner 'bin'
        and group 'bin' or 'root:sys' ownership, so that no unprivileged
        user is able to modify or manipulate the software/installation.
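
The expectation in (1) can be spot-checked mechanically. The following is
only a minimal sketch: the helper name list_loose_files is hypothetical
(not part of GlassFish), and it only looks for entries that group or
others may write to, which is a weaker test than verifying bin:bin
ownership.

```shell
#!/bin/sh
# Hypothetical helper: list files and directories below a tree that are
# writable by group or others. Symlinks are skipped because their mode
# bits are not meaningful.
list_loose_files() {
    find "$1" ! -type l \( -perm -020 -o -perm -002 \) -print
}

# Example call (path taken from INSTALL_HOME above):
#   list_loose_files /opt/glassfish/v3
```

An empty result means no unprivileged user can modify the tree via group
or world write bits; it says nothing about who the owner is.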

(2) Install GF updates as root or privileged user (RBAC SW admin)
    using the command line (i.e. no GUI!!!)
    Command:
        + $INSTALL_HOME/bin/pkg image-update
    Result:
        a) When it runs, it should state
           - what stuff (module/package/etc.) gets downloaded
           - where this stuff gets stored (since it obviously never gets
             removed [what a pain], an admin needs to know what a cron job
             has to do to keep the system clean and how to avoid blindly
             wasting backup resources)
           - what gets replaced
        b) When it encounters a problem, it should clearly state, what the
           problem is - generic messages are usually not helpful!
        c) Installed/replaced files should have 'bin:bin' or 'root:sys'
           ownership (see (1))

(3) Install additional GF modules/packages as root or privileged user
    (RBAC SW admin) using the command line (i.e. no GUI!!!)
    Command:
        + $INSTALL_HOME/bin/pkg install ...
    Result:
        see (2)

(4.0) Install a domain as root or privileged user (RBAC SW admin)
      using the command line (i.e. no GUI!!!)
    Command:
        + $INSTALL_HOME/bin/asadmin create-domain \
            --domaindir $DATA_DIR --user admin \
            --instanceport 80 --domainproperties http.ssl.port=443 \
            --asuser $AS_USER \
            $DOMAIN
    Result:
        $DATA_DIR/$DOMAIN, where only those files and directories which
        the domain instance process really needs to write to are owned
        by $AS_USER (deduced from /etc/passwd); the process should be
        assumed to run as user $AS_USER. Everything else should have
        ownership of 'root:root' or the RBAC role in action.

(4.1) As an alternative (to 4.0) Install a domain as unprivileged user
      (e.g. $AS_USER) using the command line (i.e. no GUI!!!)
    Command:
        $INSTALL_HOME/bin/asadmin create-domain \
            --domaindir $DATA_DIR --user admin \
            --instanceport 80 --domainproperties http.ssl.port=443 \
            [--netprivileged] \
            $DOMAIN
    Result:
        - Fail with an appropriate error message if $DATA_DIR is not
          writable by $AS_USER
        - if privileged ports are used:
            - if '--netprivileged' is given, assume that the user is aware
              of the fact that the 'net_privaddr' privilege is required to
              run the instance properly (so this option is basically
              required to be able to batch setups without annoying the
              user)
            - otherwise emit a warning message and ask whether to continue
              (e.g. "WARNING: 'net_privaddr' privilege is required to be
              able to bind to privileged ports (0-1023). At the moment
              $AS_USER does not have this privilege and thus the instance
              will not work. Continue anyway [y/N]:")
        - $DATA_DIR/$DOMAIN where all files and directories are owned by
          $AS_USER
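
The port handling proposed in (4.1) could look roughly like the sketch
below. Both function names are hypothetical, '--netprivileged' is the
switch proposed above (it does not exist in asadmin), and the sketch does
not check whether $AS_USER actually holds net_privaddr.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if any given port is below 1024.
uses_privileged_port() {
    for port in "$@"; do
        [ "$port" -lt 1024 ] && return 0
    done
    return 1
}

# Hypothetical helper mirroring the behaviour proposed in (4.1):
#   $1 = "yes" if --netprivileged was given, remaining args = ports.
# Returns 0 to continue with the domain setup, 1 to abort.
check_ports_or_ask() {
    netprivileged=$1; shift
    uses_privileged_port "$@" || return 0
    [ "$netprivileged" = yes ] && return 0
    printf "WARNING: 'net_privaddr' privilege is required to bind to ports below 1024. Continue anyway [y/N]: " >&2
    read -r answer || answer=n
    case $answer in
        [yY]*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

With the switch given, a batch setup never blocks on the prompt; without
it, the default answer (plain Enter or end of input) aborts.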

(5.0) Create the SMF service as root or privileged user (RBAC SW admin)
      using the command line (i.e. no GUI!!!)
    Command:
        + $INSTALL_HOME/bin/asadmin create-service \
            --domaindir $DATA_DIR $DOMAIN
    Result:
         a) if /var/svc/manifest/application/GlassFish does not yet
            exist, create it with ownership root:sys and 0755
            permissions.
         b) MANIFEST=/var/svc/manifest/application/GlassFish/$DOMAIN.xml
            with ownership root:sys and 0644 permissions. If that file
            already exists, bail out with an appropriate error message
            (e.g. $MANIFEST already exists. Please remove that service
            first or choose another name ...)
         c) $MANIFEST's start and stop method context should have
            appropriate values, which reflect this instance. I.e.
            - working_directory = $DATA_DIR/$DOMAIN
            - method_credential user="$AS_USER"
              NOTE: since the instance must be able to write to its log
              directory, $AS_USER can be deduced from the ownership of
              $DATA_DIR/$DOMAIN/logs, i.e. there is not really a need
              for an --asuser switch wrt. create-service cmd (but might
              be beneficial for other use cases).
            - set privileges unconditionally to:
                  "basic,!proc_session,!proc_info,!file_link_any"
            - if $AS_USER != "root" && $privatePortsInUse add
              'net_privaddr' to the privileges unconditionally
              NOTE: even if $AS_USER == "root" GF should never run with
                    all available OS/role/user privileges. For the rare
                    cases, where add. privileges are really required,
                    they should be explicitly added by editing $MANIFEST
                    or passing them via a TBD switch (e.g.
                    --addprivs=proc_info[:...])
         d) add an appropriate property group to the $MANIFEST's instance.
            E.g.
            <property_group name='general' type='framework'>
                <propval name='action_authorization' type='astring'
                    value='solaris.smf.manage.glassfish' />
                <propval name='value_authorization' type='astring'
                    value='solaris.smf.manage.glassfish' />
            </property_group>
            For more fine grained permissions,
            solaris.smf.manage.glassfish.$DOMAIN might be also an option
         e) Optional: document the new auths if not already done. E.g.:
            echo 'solaris.smf.manage.glassfish.:::GF Management::' \
                >>/etc/security/auth_attr
            and emit a message stating which new auths have been created
         f) import the $MANIFEST
         g) emit svcadm|svccfg hints
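
The method context described in (5.0) c) could be generated along these
lines. This is a minimal sketch under assumptions: the function names are
hypothetical, only the user and privileges attributes are filled in, and
the element layout should be checked against the SMF service bundle DTD
before use.

```shell
#!/bin/sh
# Hypothetical helper: print the privilege list proposed in (5.0) c).
#   $1 = service user, remaining args = ports the domain listens on.
smf_privileges() {
    user=$1; shift
    privs='basic,!proc_session,!proc_info,!file_link_any'
    if [ "$user" != root ]; then
        for port in "$@"; do
            if [ "$port" -lt 1024 ]; then
                privs="$privs,net_privaddr"
                break
            fi
        done
    fi
    printf '%s\n' "$privs"
}

# Hypothetical helper: print a method_context element for the start and
# stop methods.  $1 = working directory, $2 = user, rest = ports.
method_context() {
    dir=$1; user=$2; shift 2
    printf "<method_context working_directory='%s'>\n" "$dir"
    printf "    <method_credential user='%s' privileges='%s' />\n" \
        "$user" "$(smf_privileges "$user" "$@")"
    printf "</method_context>\n"
}

method_context /var/data/gf3/jforum webservd 80 443
```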

(5.1) As an alternative to (5.0) - Create an appropriate Manifest ready
      for SMF import as unprivileged user (e.g. $AS_USER) using the
      command line (i.e. no GUI!!!)
    Command:
        $INSTALL_HOME/bin/asadmin create-service \
            [--out $file] \
            --domaindir $DATA_DIR $DOMAIN
    Result:
        a) if --out is given, set MANIFEST=$file
           otherwise use MANIFEST=$DATA_DIR/$DOMAIN/manifest.xml
        b) same as (5.0) c)
        c) same as (5.0) d)
        d) emit a hint, where one should put the manifest and how to
           import it. E.g.:
           'Please ask your system administrator to import the SMF
            manifest "$MANIFEST", so that this instance gets
            automatically stopped/started on system shutdown/boot
            respectively using the following commands:

            cp $MANIFEST /var/svc/manifest/application/GlassFish/$DOMAIN.xml
            svccfg import /var/svc/manifest/application/GlassFish/$DOMAIN.xml
            svcadm enable glassfish/$DOMAIN

            Also you may ask your Admin to allow you to manage glassfish
            instances by adding solaris.smf.manage.glassfish.* to your
            auths. E.g.:
            usermod -A 'solaris.smf.manage.glassfish.*,'`auths $USER` $USER
           '
        e) same as (5.0) g)
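
Steps a) and d) of (5.1) amount to very little logic, roughly as sketched
here. The function names are hypothetical and '--out' is the switch
proposed above, not an existing asadmin option.

```shell
#!/bin/sh
# Hypothetical helper: choose the manifest location as in (5.1) a).
#   $1 = value of --out (may be empty), $2 = DATA_DIR, $3 = DOMAIN
manifest_path() {
    printf '%s\n' "${1:-$2/$3/manifest.xml}"
}

# Hypothetical helper: print the commands an administrator would run.
#   $1 = manifest file, $2 = DOMAIN
import_hint() {
    manifest=$1; domain=$2
    target=/var/svc/manifest/application/GlassFish/$domain.xml
    cat <<EOF
Please ask your system administrator to import "$manifest":
    cp $manifest $target
    svccfg import $target
    svcadm enable glassfish/$domain
EOF
}

import_hint "$(manifest_path '' /var/data/gf3 jforum)" jforum
```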


Now to the problems/bugs/oddities I encountered during setup, which
actually led to the suggestions above:

A)
  The first illness wrt. installation (1) is the requirement to define
  the parameters for a domain, which is not required at all and usually
  never gets used. Is anyone actually able to explain why domain1 is
  required? I know a lot of admins who hate software that scribbles into
  software installation directories. Creating domain1 within it is really
  an invitation to break the system, because many users install it as
  root (as normally behaving software expects), then they wonder why it
  does not start/work, and that's why they start it as root. Crazy!
  Inexperienced users then say, well - it's running - I don't have a
  good feeling, but at least 'I got that beast running' ...
  BTW: Avoiding the domain creation would also make the SW installation
  much simpler and more natural, and would make the more or less complex
  part of creating answer files superfluous ...

B)
  The installation wrt. dir/file ownership seems to be severely broken.
  Even when installing as root, all files/dirs belong to root:root and
  thus 'pkg verify' spits out a lot of warnings like:
    file: glassfish/modules/websecurity.jar
        Owner: 'root (0)' should be 'nobody (60001)'
        Group: 'root (0)' should be 'sys (3)'
        Size: 35571 bytes should be 35229
        Hash: 8878e9c597c99fedb96ef119370c7851e3b17b23 should be 6f76a152b3b7a37045cf45913d966d878ad2c400

  or
    file: pkg/bin/depot.py
        Group: 'root (0)' should be 'sys (3)'
        Mode: 0644 should be 0664
        Timestamp: 20100312T044808Z should be 20091008T194854Z
  
  Even so, it does not make any sense at all that files have a different
  size and hash than expected right after installation. Even simple
  things like timestamps or file modes don't match. At this point
  Security Officers, at least, become pretty nervous ...
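
To see at a glance how much of the noise is "only" metadata and how much
is changed content, the 'pkg verify' output can be summarized. A minimal
sketch, assuming the line format quoted above; the function name is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read 'pkg verify' output on stdin and count how
# many mismatches concern metadata (owner, group, mode, timestamp) and
# how many concern content (size, hash).
summarize_verify() {
    awk '
        $1 == "Owner:" || $1 == "Group:" || $1 == "Mode:" || $1 == "Timestamp:" { meta++ }
        $1 == "Size:"  || $1 == "Hash:"                                          { content++ }
        END { printf "metadata=%d content=%d\n", meta, content }
    '
}

# Sample lines taken from the output quoted above.
summarize_verify <<'EOF'
    file: glassfish/modules/websecurity.jar
        Owner: 'root (0)' should be 'nobody (60001)'
        Group: 'root (0)' should be 'sys (3)'
        Size: 35571 bytes should be 35229
        Hash: 8878e9c597c99fedb96ef119370c7851e3b17b23 should be 6f76a152b3b7a37045cf45913d966d878ad2c400
    file: pkg/bin/depot.py
        Group: 'root (0)' should be 'sys (3)'
        Mode: 0644 should be 0664
        Timestamp: 20100312T044808Z should be 20091008T194854Z
EOF
```

Any non-zero content count right after a fresh installation is the part
that should worry people.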

C)
  The next thing is the bad choice of ownership: whereas group sys might
  be ok, user 'nobody' is NOT AT ALL! This completely ignores decades of
  common practice to run processes like webservers as user nobody to
  reduce the risk that even malfunctioning software is able to harm the
  system; files owned by 'nobody' can be modified by exactly those
  processes. IMHO the correct ownership is bin:bin, as for all normal
  Solaris packages, which also gives RBAC Softadmin users the ability to
  update the software without the need for root privileges.

D)
  'pkg fix' seems to be pretty dumb. I can't see the need to download
  ~90MB (about 2500 files) just to fix permissions. Normal Solaris
  packages have a pkgmap file, which contains all required info.
  I guess, the GF packages have a similar file - so if the 'fix' doesn't
  trust it, wouldn't it be sufficient to download the "pkgmap", only?

E)
  Not sure whether 'pkg fix' is broken or the packages themselves
  (probably both), but after a 'pkg fix' has been done, 'pkg verify'
  still ERRORs with messages like:

  pkg:/pkg ERROR
      file: pkg/vendor-packages/pkg/nrlock.pyc
          Group: 'root (0)' should be 'sys (3)'
          Mode: 0644 should be 0664
          Size: 1715 bytes should be 1712
          Hash: 691f994b72ae781dce36b9227516d847d7296837 should be f9bf2cdc7394275fe3e457394c52cdb2fd077ec2
  ...
  pkg:/glassfish-appclient ERROR
      dir: bin
          Owner: 'root (0)' should be 'nobody (60001)'
  ...
  pkg:/mq-config-gf ERROR
      file: mq/etc/imqenv.conf
          Owner: 'root (0)' should be 'nobody (60001)'
          Group: 'root (0)' should be 'sys (3)'

F)
  Either 'pkg list' or 'pkg image-update' is broken:
  + bin/pkg list -v | fgrep 'u--'
  pkg:/felix_at_2.0.2,0-0:20091203T054540Z installed u---
  pkg:/glassfish-appclient_at_3.0,0-74.2:20091203T061841Z installed u---
  pkg:/glassfish-cmp_at_3.0,0-74.2:20091203T060956Z installed u---
  pkg:/glassfish-common_at_3.0,0-74.2:20091203T062350Z installed u---
  pkg:/glassfish-common-full_at_3.0,0-74.2:20091203T060704Z installed u---
  ...

  but: + bin/pkg image-update -v
Creating Plan / Before evaluation:
UNEVALUATED:

After evaluation:
Actuators:

No updates available for this image.

  Strange. Actually I have no clue whether it is possible to update
  packages, nor how to do it. Even when starting domain1 and
  registering, the UI shows 48 package updates available, but if one
  changes to the 'Available Updates' tab, after thinking "an hour" it
  displays an empty table.

  BTW: I wonder what takes so much time just to display something:
  Is the dojo stuff so awfully slow (probably) or is it the package
  stuff running behind it? My logic tells me that if I have already
  determined that 48 packages need an update, I must also have all the
  relevant package info which led to that result. So why does it take
  centuries to get displayed - must be slow UI/dojo stuff ...

G)
  'asadmin create-domain' tries to be intelligent, but what one gets is
  the opposite. E.g. an admin switches to the unprivileged user under
  which the instance should be running, to avoid screwing up file
  permissions, which would lead to a non-functioning instance (e.g. due
  to chmod 600 config/* ; etc.):

    /opt/glassfish/v3/bin/asadmin --user admin create-domain \
        --domaindir /data/sites/glassfish/v3 \
        --instanceport 80 --domainproperties http.ssl.port=443 \
        jforum

  Gives:

Enter the admin password [Enter to accept default of no password]>
Enter the admin password again>
Enter the master password [Enter to accept default password "changeit"]>
Enter the master password again>
You do not have permission to use port 80 for jforum. Try a different
port number or login to a more privileged account.
CLI130 Could not create domain, jforum
Command create-domain failed.

  a) First annoying thing: I have to enter something 4 times before the
     SW recognizes an "error".

  b) IMHO the SW designer makes wrong assumptions here: Not the user but
     the SERVICE needs net_privaddr privileges. And if a normal user,
     e.g. a student, should be able to manage this service, he needs the
     required auths to manage that service (e.g.
     solaris.smf.manage.glassfish.* as shown in (5.1) c),d)) - not less
     but also not more!

     To second this: Think! Why should an admin allow an arbitrary user to
     run any processes similar to smbd/nmbd/ftp/imap*/pop*/dhcp*/ldap/
     finger/nntpd/rlogind etc. on privileged ports?
     Just because he should be able to start/stop the glassfish instance?

     So the designer's logic is severely broken wrt. security!
     In this case asadmin should really WARN the user and ask whether it
     should continue to set up the domain in question as suggested in
     (4.1), and if he says 'yes, I know', it should set it up as desired -
     no net_privaddr privileges are required for this step.

     BTW: Yes, the user might be smart enough to clear this hurdle using
     the option '--checkports false', but this is a really bad compromise
     because the port overlap check will not be made anymore! So a
     different switch (e.g. --netprivileged) is required to accomplish
     what the normal user really desires.

  c) Assuming the unprivileged user got its domain created (e.g. by
     using the checkports option, asking the admin, etc.)
     the next illness gets uncovered: E.g:

                /opt/glassfish/v3/bin/asadmin create-service \
                        --domaindir /data/sites/glassfish/v3 jforum

The user [webservd] does not have permission to create the service
manifest related files and directories at
[/var/svc/manifest/application/GlassFish/]. This structure is required
per SMF guidelines. Either become super-user to do this operation or
contact the System Administrator to explicitly get the relevant
permissions and try again.
Usage: asadmin [asadmin-utility-options] create-service [--name <name>]
    [--serviceproperties <serviceproperties>]
    [--dry-run[=<dry-run(default:false)>]] [--domaindir <domaindir>]
    [-?|--help[=<help(default:false)>]] [domain_name]
Command create-service failed.

     First hmm: Why is this message spoiled with the usage info?
     2nd hmmm: A more or less relaxed admin might say - oh no,
     don't bother me all the time with your crap:
     mkdir -p /var/svc/manifest/application/GlassFish/
     chown student:sgid /var/svc/manifest/application/GlassFish
     and tell him 'send me an email, when you changed something'.

     Student tries it again:

The user [student] does not seem to have adequate authorizations
[solaris.smf.*] on this System to create and configure an SMF service.
The authorizations available are
[solaris.device.cdrw,solaris.profmgr.read,solaris.jobs.users,solaris.mail.mailq,solaris.admin.usermgr.read,solaris.admin.logsvc.read,solaris.admin.fsmgr.read,solaris.admin.serialmgr.read,solaris.admin.diskmgr.read,solaris.admin.procmgr.user,solaris.compsys.read,solaris.admin.printer.read,solaris.admin.prodreg.read,solaris.admin.dcmgr.read,solaris.snmp.read,solaris.project.read,solaris.admin.patchmgr.read,solaris.network.hosts.read,solaris.admin.volmgr.read
].
See smf_security(5), rbac(5).

Usage: asadmin [asadmin-utility-options] create-service [--name <name>]
    [--serviceproperties <serviceproperties>]
    [--dry-run[=<dry-run(default:false)>]] [--domaindir <domaindir>]
    [-?|--help[=<help(default:false)>]] [domain_name]
Command create-service failed.
         
     First hmmm: Why is the message spoiled with the usage info?
     2nd hmmm: Oh my goodness! I need to bother the admin again.
     3rd hmm: the admin starts asking himself what kind of strange
     software this is, which requires solaris.smf.* just to be able to
     create a manifest ...

     So the SW designer made another mistake: He tries to solve
     two different problems at once (create the manifest, import it).
    
     IMHO the correct and less annoying way is:
     1) create the manifest as described in (5.0) c) and d)
     2) try to copy it to /var/svc/manifest/application/GlassFish/$DOMAIN.xml
        if that fails
            a) store it as $DATA_DIR/$DOMAIN.xml
            b) emit a warning 'Unable to save manifest as ....'
               and a message as suggested in (5.1) d) and exit
     3) try to import the manifest
        if that fails, emit a warning like
        'Unable to import manifest $MANIFEST (insufficient auths).' and
        add a message similar to (5.1) d) but without the copy instruction
        and exit
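
     Steps 2) and 3) can be sketched as a small shell function. This is
     only an illustration under assumptions: the function name and the
     IMPORT_CMD variable are hypothetical, and the sketch does not refuse
     to overwrite an existing manifest as required in (5.0) b).

```shell
#!/bin/sh
# IMPORT_CMD is an assumption made for this sketch so that the import
# step can be replaced on systems without SMF; on Solaris it would be
# "svccfg import".
: "${IMPORT_CMD:=svccfg import}"

# Hypothetical helper following steps 2) and 3) above.
#   $1 = generated manifest, $2 = system manifest directory,
#   $3 = DATA_DIR, $4 = DOMAIN
# Prints the final manifest location on stdout, warnings on stderr.
# Returns 0 if copied and imported, 1 if a fallback or warning occurred,
# 2 if the manifest could not be stored anywhere.
install_manifest() {
    src=$1; sysdir=$2; datadir=$3; domain=$4
    target=$sysdir/$domain.xml
    if ! cp "$src" "$target" 2>/dev/null; then
        target=$datadir/$domain.xml
        cp "$src" "$target" || return 2
        echo "WARNING: Unable to save manifest in $sysdir; stored as $target" >&2
        printf '%s\n' "$target"
        return 1
    fi
    printf '%s\n' "$target"
    if ! $IMPORT_CMD "$target" >/dev/null 2>&1; then
        echo "WARNING: Unable to import manifest $target (insufficient auths)." >&2
        return 1
    fi
}
```

     The point of the design is that creating the manifest never fails
     just because the caller lacks the rights to install or import it.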

     The hint wrt. /var/svc/manifest/application/GlassFish/ permission
     is IMHO completely misleading and dangerous as well - so vaporize
     it completely!

     Furthermore the error message wrt. the required solaris.smf.* auths
     is also misleading. The inexperienced user/admin @home actually
     interprets this as an instruction to assign those auths to the given
     user, which is plain wrong (analogous to the net_privaddr problem).

  d) Glassfish or any other webservice usually does not need the
     privileges to create hard links to files owned by a UID different
     from the process's effective UID. Also it usually does not need
     to examine the status of processes other than its own sub
     processes or to send signals or trace processes outside its own
     session. So file_link_any, proc_info and proc_session privileges
     should be removed from the service's 'basic' privilege set (see
     (5.0) c)).


  So the whole point here: Solaris really provides a lot of features
  which obsolete the "'all' or 'nothing'" weaknesses. SW should help
  users and admins to make use of them in a correct manner and make
  others aware that there actually is an OS which allows hosting
  services in a more secure/fine-grained way than they know from others.

That's all for now.

Thanx for your attention,
jel.

PS: X-posting to admin_at_glassfish.dev.java.net and
quality-feedback_at_glassfish.dev.java.net since I'm not sure, which one
is the correct one. Please point it in the right direction, if necessary.
-- 
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768