Subject: GFv3 bugs/illnesses/oddities
From: Jens Elkner
Date: Sat, 13 Mar 2010 05:38:28 +0100
To: dev@updatecenter.dev.java.net
CC: quality-feedback@glassfish.dev.java.net, admin@glassfish.dev.java.net

Hi,

I'm just trying to get started with GFv3/SGFv3-ES on Solaris 10+, but it
gives me the impression that at least the pkg stuff is pretty bogus and
that nobody really reviewed the setup procedures from a real
administrator's point of view, i.e. with security/SMF in mind.

So to make it a little more understandable and to be able to point to
the annoying stuff, here is my idea of an "ideal" GF setup, which lets an
admin feel that he is dealing with regular software that creates no
headaches (for less clutter/laziness I use '+' below as an alias for
pfexec, which in turn assumes profiles=Primary Administrator/roles=root):

E.g. vars used for better understanding:

    INSTALL_HOME = /opt/glassfish/v3
    DATA_DIR     = /var/data/gf3
    AS_USER      = webservd
    DOMAIN       = jforum

(1) Install GF software (SW) as root or privileged user (RBAC SW admin)
    using the command line (i.e. no GUI!!!)

    Command:
        + sh /tmp/sges-v3-unix.sh -s -a /tmp/answer

    Result:
        Everything should be placed below $INSTALL_HOME with owner 'bin'
        and group 'bin', or with 'root:sys' ownership, so that no
        unprivileged user is able to modify or manipulate the
        software/installation.

(2) Install GF updates as root or privileged user (RBAC SW admin)
    using the command line (i.e. no GUI!!!)

    Command:
        + $INSTALL_HOME/bin/pkg image-update

    Result:
        a) When it runs, it should state
           - what stuff (module/package/etc.) gets downloaded
           - where this stuff gets stored (since it obviously never gets
             removed [what a pain], an admin needs to know what a cron
             job has to do to keep the system clean and how to avoid
             blindly wasting backup resources)
           - what gets replaced
        b) When it encounters a problem, it should clearly state what the
           problem is - generic messages are usually not helpful!
        c) Installed/replaced files should have 'bin:bin' or 'root:sys'
           ownership (see (1))

(3) Install additional GF modules/packages as root or privileged user
    (RBAC SW admin) using the command line (i.e. no GUI!!!)

    Command:
        + $INSTALL_HOME/bin/pkg install ...

    Result:
        see (2)

(4.0) Install a domain as root or privileged user (RBAC SW admin)
    using the command line (i.e. no GUI!!!)

    Command:
        + $INSTALL_HOME/bin/asadmin create-domain \
            --domaindir $DATA_DIR --user admin \
            --instanceport 80 --domainproperties http.ssl.port=443 \
            --asuser $AS_USER \
            $DOMAIN

    Result:
        $DATA_DIR/$DOMAIN, where only those files and directories which
        the domain instance process (assumed to run as user $AS_USER)
        really needs to write to are owned by $AS_USER (deduced from
        /etc/passwd). Everything else should be owned by 'root:root' or
        by the RBAC role in action.

(4.1) As an alternative to (4.0): install a domain as unprivileged user
    (e.g. $AS_USER) using the command line (i.e. no GUI!!!)

    Command:
        $INSTALL_HOME/bin/asadmin create-domain \
            --domaindir $DATA_DIR --user admin \
            --instanceport 80 --domainproperties http.ssl.port=443 \
            [--netprivileged] \
            $DOMAIN

    Result:
        - Fail with an appropriate error message if $DATA_DIR is not
          writable by $AS_USER
        - If private ports are used:
          - if '--netprivileged' is given, assume that the user is aware
            of the fact that the 'net_privaddr' privilege is required to
            run the instance properly (so this option is basically
            required to be able to batch setups without annoying the
            user)
          - otherwise emit a warning message and ask whether to continue
            (e.g. "WARNING: 'net_privaddr' privilege is required to be
            able to bind to private ports (0-1023). At the moment
            $AS_USER does not have this privilege and thus the instance
            will not work. Continue anyway [y/N]:")
        - $DATA_DIR/$DOMAIN where all files and directories are owned by
          $AS_USER

(5.0) Create the SMF service as root or privileged user (RBAC SW admin)
    using the command line (i.e. no GUI!!!)

    Command:
        + $INSTALL_HOME/bin/asadmin create-service \
            --domaindir $DATA_DIR $DOMAIN

    Result:
        a) If /var/svc/manifest/application/GlassFish does not yet exist,
           create it with ownership root:sys and 0755 permissions.
        b) MANIFEST=/var/svc/manifest/application/GlassFish/$DOMAIN.xml
           with ownership root:sys and 0644 permissions. If that file
           already exists, bail out with an appropriate error message
           (e.g. $MANIFEST already exists. Please remove that service
           first or choose another name ...)
        c) $MANIFEST's start and stop method context should have
           appropriate values which reflect this instance. I.e.
           - working_directory = $DATA_DIR/$DOMAIN
           - method_credential user="$AS_USER"
             NOTE: since the instance must be able to write to its log
             directory, $AS_USER can be deduced from the ownership of
             $DATA_DIR/$DOMAIN/logs, i.e. there is not really a need for
             an --asuser switch for the create-service cmd (but it might
             be beneficial for other use cases).
           - set privileges unconditionally to:
             "basic,!proc_session,!proc_info,!file_link_any"
           - if $AS_USER != "root" && $privatePortsInUse, add
             'net_privaddr' to the privileges unconditionally
             NOTE: even if $AS_USER == "root", GF should never run with
             all available OS/role/user privileges. For the rare cases
             where additional privileges are really required, they should
             be added explicitly by editing $MANIFEST or by passing them
             via a TBD switch (e.g. --addprivs=proc_info[:...])
        d) Add an appropriate property group to the $MANIFEST's instance.
           E.g.
           For more fine-grained permissions,
           solaris.smf.manage.glassfish.$DOMAIN might also be an option.
        e) Optional: document the new auths if not already done. E.g.:
               echo 'solaris.smf.manage.glassfish.:::GF Management' \
                   >>/etc/security/auth_attr
           and emit a message stating which new auths have been created
        f) import the $MANIFEST
        g) emit svcadm|svccfg hints

(5.1) As an alternative to (5.0): create an appropriate manifest ready
    for SMF import as unprivileged user (e.g. $AS_USER) using the
    command line (i.e. no GUI!!!)

    Command:
        $INSTALL_HOME/bin/asadmin create-service \
            [--out $file] \
            --domaindir $DATA_DIR $DOMAIN

    Result:
        a) if --out is given, set MANIFEST=$file, otherwise use
           MANIFEST=$DATA_DIR/$DOMAIN/manifest.xml
        b) same as (5.0) c)
        c) same as (5.0) d)
        d) emit a hint where one should put the manifest and how to
           import it. E.g.:
           'Please ask your system administrator to import the SMF
           manifest "$MANIFEST", so that this instance gets automatically
           stopped/started on system shutdown/boot respectively, using
           the following commands:

               cp $MANIFEST /var/svc/manifest/application/GlassFish/$DOMAIN.xml
               svccfg import /var/svc/manifest/application/GlassFish/$DOMAIN.xml
               svcadm enable glassfish/$DOMAIN

           Also you may ask your admin to allow you to manage glassfish
           instances by adding solaris.smf.manage.glassfish.* to your
           auths. E.g.:

               usermod -A 'solaris.smf.manage.glassfish.*,'`auths $USER` $USER
           '
        e) same as (5.0) g)

Now to the problems/bugs/oddities I encountered during setup, which
actually led to the suggestions above:

A) The first illness wrt. installation (1) is the requirement to define
   the parameters for a domain which is not required at all and usually
   never gets used. Is anyone actually able to explain why domain1 is
   required? I know a lot of admins who hate software which daubs around
   in software installation directories. Creating domain1 within it is
   really an invitation to break the system, because many users install
   it as root (as normally behaving software expects it), then wonder why
   it does not start/work, and that's why they start it as root. Crazy!
   Inexperienced users then say: well, it's running - I don't have a good
   feeling, but at least 'I got that beast running' ...

   BTW: Avoiding the domain creation would also make the SW installation
   much simpler and more natural, and would make the more or less
   complex part of creating answer files superfluous ...

B) The installation wrt. dir/file ownership seems to be severely broken.
   Even when installing as root, all files/dirs belong to root:root and
   thus 'pkg verify' spits out a lot of warnings like:

       file: glassfish/modules/websecurity.jar
           Owner: 'root (0)' should be 'nobody (60001)'
           Group: 'root (0)' should be 'sys (3)'
           Size: 35571 bytes should be 35229
           Hash: 8878e9c597c99fedb96ef119370c7851e3b17b23 should be 6f76a152b3b7a37045cf45913d966d878ad2c400

   or

       file: pkg/bin/depot.py
           Group: 'root (0)' should be 'sys (3)'
           Mode: 0644 should be 0664
           Timestamp: 20100312T044808Z should be 20091008T194854Z

   Furthermore it does not make any sense at all that, right after
   installation, files have a different size and hash than expected.
   Even simple things like timestamps or file modes don't match. At
   least Security Officers become pretty nervous at this point ...

C) The next thing is the bad choice of ownership: whereas group sys
   might be ok, user 'nobody' is NOT AT ALL! This completely ignores
   decades of common practice to run processes like webservers as user
   nobody, to reduce the risk that even malfunctioning software is able
   to harm the system. IMHO the correct ownership is bin:bin, as for all
   normal Solaris packages; it also gives RBAC Softadmin users the
   ability to update the software without the need to get root
   privileges.

D) 'pkg fix' seems to be pretty dumb. I can't see the need to download
   ~90MB (about 2500 files) just to fix permissions. Normal Solaris
   packages have a pkgmap file, which contains all required info. I
   guess the GF packages have a similar file - so if the 'fix' doesn't
   trust it, wouldn't it be sufficient to download the "pkgmap" only?
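
As a side note to B) through D): whether an installation tree deviates
from the bin:bin ownership argued for in C) can be checked locally,
without downloading anything. The following is only a minimal sketch,
not part of pkg; the function name list_misowned is made up for
illustration, and it only looks at owner and group, not at size, hash,
mode or timestamp.

```shell
#!/bin/sh
# Hypothetical helper (not part of pkg or GlassFish):
# list every file or directory below DIR that is not owned by USER:GROUP.
# USER and GROUP may be given as names or as numeric IDs.
list_misowned() {
    dir=$1
    user=$2
    group=$3
    # Print an entry if either the owner or the group deviates.
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}
```

Usage would be something like `list_misowned "$INSTALL_HOME" bin bin`;
empty output means that everything below $INSTALL_HOME has the expected
ownership.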

E) Not sure whether 'pkg fix' is broken or the packages themselves
   (probably both), but after a 'pkg fix' has been done, 'pkg verify'
   still ERRORs with messages like:

       pkg:/pkg ERROR
           file: pkg/vendor-packages/pkg/nrlock.pyc
               Group: 'root (0)' should be 'sys (3)'
               Mode: 0644 should be 0664
               Size: 1715 bytes should be 1712
               Hash: 691f994b72ae781dce36b9227516d847d7296837 should be f9bf2cdc7394275fe3e457394c52cdb2fd077ec2
       ...
       pkg:/glassfish-appclient ERROR
           dir: bin
               Owner: 'root (0)' should be 'nobody (60001)'
       ...
       pkg:/mq-config-gf ERROR
           file: mq/etc/imqenv.conf
               Owner: 'root (0)' should be 'nobody (60001)'
               Group: 'root (0)' should be 'sys (3)'

F) Either 'pkg list' or 'pkg image-update' is broken:

       + bin/pkg list -v | fgrep 'u--'
       pkg:/felix@2.0.2,0-0:20091203T054540Z installed u---
       pkg:/glassfish-appclient@3.0,0-74.2:20091203T061841Z installed u---
       pkg:/glassfish-cmp@3.0,0-74.2:20091203T060956Z installed u---
       pkg:/glassfish-common@3.0,0-74.2:20091203T062350Z installed u---
       pkg:/glassfish-common-full@3.0,0-74.2:20091203T060704Z installed u---
       ...

   but:

       + bin/pkg image-update -v
       Creating Plan /
       Before evaluation:
       UNEVALUATED:
       After evaluation:
       Actuators:
       No updates available for this image.

   Strange. Actually I have no clue whether it is possible to update
   packages at all, nor how to do it. Even after starting domain1 and
   registering, the UI shows 48 package updates available, but if one
   switches to the 'Available Updates' tab, it displays an empty table
   after thinking for "an hour".

   BTW: I wonder what takes so much time just to display something: is
   the dojo stuff so awfully slow (probably), or is it the package stuff
   running behind it? My logic tells me: if I have already determined
   that 48 packages need an update, I must also have all the relevant
   package info which led to that result. So why does it take centuries
   to get displayed - must be slow UI/dojo stuff ...

G) 'asadmin create-domain' tries to be intelligent; what one gets is the
   opposite. E.g.
   an admin switches to the unprivileged user under which the instance
   should run, to avoid screwing up file permissions, which would lead
   to a non-functioning instance (e.g. due to chmod 600 config/* ; etc.):

       /opt/glassfish/v3/bin/asadmin --user admin create-domain \
           --domaindir /data/sites/glassfish/v3 \
           --instanceport 80 --domainproperties http.ssl.port=443 \
           jforum

   Gives:

       Enter the admin password [Enter to accept default of no password]>
       Enter the admin password again>
       Enter the master password [Enter to accept default password "changeit"]>
       Enter the master password again>
       You do not have permission to use port 80 for jforum. Try a different port number or login to a more privileged account.
       CLI130 Could not create domain, jforum
       Command create-domain failed.

   a) First annoying thing: I have to enter something 4 times before
      the SW recognizes an "error".

   b) IMHO the SW designer makes wrong assumptions here: not the user
      but the SERVICE needs net_privaddr privileges. And if a normal
      user, e.g. student, should be able to manage this service, he
      needs the required auths to manage that service (e.g.
      solaris.smf.manage.glassfish.* as shown in (5.1) c),d)) - not
      less, but also not more!

      To second this: Think! Why should an admin allow an arbitrary
      user to run any processes similar to smbd/nmbd/ftp/imap*/pop*/
      dhcp*/ldap/finger/nntpd/rlogind etc. on private ports? Just
      because he should be able to start/stop the glassfish instance?
      So the designer's logic is severely broken wrt. security!

      In this case asadmin should really WARN the user and ask whether
      it should continue to set up the domain in question, as suggested
      in (4.1), and if he says 'yes, I know', it should set it up as
      desired - no net_privaddr privileges are required for this step.

      BTW: Yes, the user might be smart enough to take this hurdle using
      the option '--checkports false', but this is a really bad
      compromise, because the port overlapping check will not be made
      anymore! So a different switch (e.g.
      --netprivileged) is required to accomplish what the normal user
      really wants.

   c) Assuming the unprivileged user got his domain created (e.g. by
      using the checkports option, asking the admin, etc.), the next
      illness gets uncovered. E.g.:

          /opt/glassfish/v3/bin/asadmin create-service \
              --domaindir /data/sites/glassfish/v3 jforum

          The user [webservd] does not have permission to create the service manifest related files and directories at [/var/svc/manifest/application/GlassFish/]. This structure is required per SMF guidelines. Either become super-user to do this operation or contact the System Administrator to explicitly get the relevant permissions and try again.
          Usage: asadmin [asadmin-utility-options] create-service [--name ] [--serviceproperties ] [--dry-run[=]] [--domaindir ] [-?|--help[=]] [domain_name]
          Command create-service failed.

      First hmm: Why is this message spoiled with the usage info?

      2nd hmmm: A more or less relaxed admin might say 'oh no, don't
      bother me all the time with your crap', run

          mkdir -p /var/svc/manifest/application/GlassFish/
          chown student:sgid /var/svc/manifest/application/GlassFish

      and tell him 'send me an email when you have changed something'.

      Student tries it again:

          The user [student] does not seem to have adequate authorizations [solaris.smf.*] on this System to create and configure an SMF service. The authorizations available are [solaris.device.cdrw,solaris.profmgr.read,solaris.jobs.users,solaris.mail.mailq,solaris.admin.usermgr.read,solaris.admin.logsvc.read,solaris.admin.fsmgr.read,solaris.admin.serialmgr.read,solaris.admin.diskmgr.read,solaris.admin.procmgr.user,solaris.compsys.read,solaris.admin.printer.read,solaris.admin.prodreg.read,solaris.admin.dcmgr.read,solaris.snmp.read,solaris.project.read,solaris.admin.patchmgr.read,solaris.network.hosts.read,solaris.admin.volmgr.read ]. See smf_security(5), rbac(5).
          Usage: asadmin [asadmin-utility-options] create-service [--name ] [--serviceproperties ] [--dry-run[=]] [--domaindir ] [-?|--help[=]] [domain_name]
          Command create-service failed.

      First hmmm: Why is the message spoiled with the usage info?

      2nd hmmm: Oh my goodness! I need to bother the admin again.

      3rd hmm: The admin starts asking himself what kind of strange
      software this is, which requires solaris.smf.* just to be able to
      create a manifest ...

      So the SW designer made another mistake: he tries to solve two
      different problems at once (create the manifest, import it). IMHO
      the correct and less annoying way is:

      1) create the manifest as described in (5.0) c) and d)
      2) try to copy it to
         /var/svc/manifest/application/GlassFish/$DOMAIN.xml
         if that fails:
         a) store it as $DATA_DIR/$DOMAIN.xml
         b) emit a warning 'Unable to save manifest as ....' and a
            message as suggested in (5.1) d), and exit
      3) try to import the manifest
         if that fails, emit a warning like 'Unable to import manifest
         $MANIFEST (insufficient auths).', add a message similar to
         (5.1) d) but without the copy instruction, and exit

      The hint wrt. /var/svc/manifest/application/GlassFish/ permissions
      is IMHO completely misleading and dangerous as well - so vaporize
      it completely!

      Furthermore the error message about the required solaris.smf.*
      auths is also misleading. The inexperienced user/admin @home
      actually interprets this as an instruction to assign those auths
      to the given user, which is plain wrong (analogous to the
      net_privaddr problem).

   d) Glassfish, or any other web service, usually does not need the
      privilege to create hard links to files owned by a UID different
      from the process's effective UID. Also it usually does not need to
      examine the status of processes other than its own sub processes,
      or to send signals to or trace processes outside its own session.
      So the file_link_any, proc_info and proc_session privileges should
      be removed from the service's 'basic' privilege set (see (5.0)
      c)).
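
The privilege rule from (5.0) c) and G) d) is simple enough to be
written down as a few lines of shell. This is only a sketch of the rule
as stated above, not what asadmin actually does; the function name
gf_privileges is made up for illustration, and "private ports" are taken
to be 0-1023 as in (4.1).

```shell
#!/bin/sh
# Hypothetical helper (not part of asadmin): print the privilege set
# proposed in (5.0) c) for a service running as USER and listening on
# the given PORTS.
# Usage: gf_privileges USER PORT [PORT...]
gf_privileges() {
    user=$1
    shift
    # Always start from 'basic' minus the three privileges a web
    # service usually does not need.
    privs='basic,!proc_session,!proc_info,!file_link_any'
    # Add net_privaddr only for a non-root user with a private port.
    if [ "$user" != root ]; then
        for port in "$@"; do
            if [ "$port" -lt 1024 ]; then
                privs="$privs,net_privaddr"
                break
            fi
        done
    fi
    printf '%s\n' "$privs"
}
```

For the jforum example, `gf_privileges webservd 80 443` prints the basic
set with net_privaddr appended, whereas a domain on unprivileged ports
gets the reduced basic set only.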

So the whole point here: Solaris really provides a lot of features which
make the "'all' or 'nothing'" weaknesses obsolete. SW should help users
and admins to make use of them in a correct manner, and make others
aware that there actually is an OS which allows hosting services in a
more secure/fine-grained way than they know from others.

That's all for now. Thanks for your attention,
jel.

PS: X-posting to admin@glassfish.dev.java.net and
    quality-feedback@glassfish.dev.java.net since I'm not sure which one
    is the correct one. Please point it in the right direction, if
    necessary.
--
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768