September 27, 2016
This article was contributed by Neil Brown
Some time ago, we published a pair of articles about systemd programming that extolled the value of providing high-quality unit files in upstream packages. The hope was that all distributions would use them and that problems could be fixed centrally rather than each distribution fixing its own problems independently. Now, 30 months later, it seems like a good time to see how well that worked out for nfs-utils, the focus of much of that discussion. Did distributors benefit from upstream unit files, and what sort of problems were encountered?
Systemd unit files for nfs-utils first appeared in nfs-utils-1.3.0, released in March 2014. Since then, there have been 26 commits that touched files in the systemd subdirectory; some of those commits are less interesting than others. Two, for example, make changes to the set of unit files that are installed when you run "make install". If distributors maintained their unit files separately (as they used to maintain init scripts separately), this wouldn't have been an issue at all, so these cannot be seen as a particular win for upstreaming.
Most of the changes of interest are refinements to the ordering and dependencies between various services, which is hardly surprising given that dependencies and ordering are a big part of what systemd provides. With init scripts we didn't need to think about ordering very much, as those scripts ran the commands in the proper order. Systemd starts different services in parallel as much as possible, so it should be no surprise that more thought needs to be given to ordering and more bugs in that area are to be expected.
As hoped, the fixes came from a range of sources, including one commit from an Ubuntu developer that removed the default dependency on basic.target. That enabled the NFS service to start earlier, which is particularly useful when /var is mounted via NFS. Another, from a Red Hat developer, removed an ordering cycle caused by the nfs-client.target inexplicably being told to start before the GSS services it relies on, rather than after. A third, from the developer of OSTree, made sure that /var/lib/nfs/rpc-pipefs wasn't mounted until after the systemd-tmpfiles.service had a chance to create that directory. This is important in configurations where /var is not permanent.
Each of these changes involved subtle ordering dependencies that were not easy to foresee when the unit files were initially assembled. Some of them have the potential to benefit many users by improving robustness or startup time. Others have much narrower applicability, but still benefit developers by documenting the needs that others have. This makes it less likely that future changes will break working use cases and can allow delayed collaboration, as the final example will show.
There were two changes deserving of special note, partly because they required multiple attempts to get right and partly because they both involve dependencies that are affected by the configuration of the NFS services; they take quite different approaches to handling those dependencies. The first of these changes revised the dependency on rpcbind, which is a lookup service that maps an ONC-RPC service number into an Internet port number. When RPC services start, they choose a port number and register with rpcbind, so it can tell clients which port each service can be reached on.
When version 2 or version 3 of NFS is in use, rpcbind is required. It is necessary for three auxiliary protocols (MOUNT, LOCK, and STATUS), and is the preferred way to find the NFS service, though in practice that service always uses port 2049. When only version 4 of NFS is in use, rpcbind is not necessary, since NFSv4 incorporates all the functionality that was previously included in the three extra protocols and it mandates the use of port 2049. Some system administrators prefer not to run unnecessary daemons and so don't want rpcbind started when only NFSv4 is configured. There are two requirements to bear in mind when meeting this need: one is to make sure the service isn't started, the other is to ensure the main service starts even though rpcbind is missing.
As discussed in the earlier articles, systemd doesn't have much visibility into non-systemd configuration files, so it cannot easily detect if NFSv3 is enabled and start rpcbind only if it is. Instead, it needs to be told explicitly to disable rpcbind with:
systemctl mask rpcbind
There is subtlety hiding behind this command. rpcbind uses three unit files: rpcbind.target, rpcbind.service, and rpcbind.socket. Previously, I recommended using the target file to activate rpcbind, but that was a mistake. Target files can be used for higher-level abstractions as described then, but there is no guarantee that they will be. rpcbind.target is defined by systemd only to provide ordering with rpcbind (or equally "portmap"). This provides compatibility with SysV init, which has a similar concept. rpcbind.target cannot be used to activate those services, and so should be ignored by nfs-utils. rpcbind.socket describes how to use socket activation to enable rpcbind.service, the main service. nfs-utils only cares about the sockets being ready to listen, so it should only have (and now does only have) dependencies on rpcbind.socket.
Masking rpcbind ensures that rpcbind.service doesn't run. The socket activation is not directly affected, but systemd sorts this out soon enough. Systemd will still listen on the various sockets at first but, as soon as some process tries to connect to one of those sockets, systemd will notice the inconsistency and will shut down the sockets as well. So this simple and reasonably obvious command does what you might expect.
Ensuring that other services cope with rpcbind being absent is as easy as using a Wants dependency rather than a Requires dependency. A Wants dependency asks for the other service to be started, but the unit declaring it won't fail if that service doesn't start. Some parts of NFS only "want" rpcbind to be running, but one, rpc.statd, cannot function without it, so it still Requires rpcbind. This has the effect of implicitly disabling rpc.statd when rpcbind is masked.
It's worth spending a while reflecting on why the command is "systemctl mask" rather than "systemctl disable", as I've often come across the expectation that enable and disable are the commands to enable or disable a unit file. As a concrete example, Martin Pitt stated in Ubuntu bug 1428486 that they are "the canonical way to enable/disable a unit", but this was not the first place that I found this expectation.
The reality is that enable is the canonical way to request activation of a unit file. It doesn't actually start it ("systemctl start" will do that), and it isn't the only way to activate a unit file, as some other unit file can do so with a Requires directive. This may seem to be splitting hairs, but the distinction is clearer with the disable command, which does not disable a unit file. Instead, it only reverts any explicit request made by enable that a unit be activated. It is quite possible that a unit file will still be fully functional even after running "systemctl disable" on it.
If you want to be sure that a unit file will be activated, then "systemctl enable" is probably the right thing to do. If you want to be sure that it is not activated, then "systemctl disable" won't provide that guarantee; you need "systemctl mask" instead. This command ensures that the unit file won't run even if some other unit file Requires it. So that is the command that we use to ensure rpcbind isn't running, and it could also be used to ensure rpc.statd isn't running, though that isn't really needed, as masking rpcbind effectively masks rpc.statd, as mentioned.
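The interplay of enable, disable, mask, Wants, and Requires described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not a description of how systemd is implemented: the Unit class, the plan_start() function, and the short unit names are all invented for illustration, ordering is ignored, and dependency cycles are assumed not to occur.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A toy stand-in for a unit file; not systemd's real data model."""
    name: str
    requires: list = field(default_factory=list)  # hard dependencies
    wants: list = field(default_factory=list)     # soft dependencies


def plan_start(units, enabled, masked):
    """Return the set of unit names that end up started.

    Simplified rules assumed by this sketch:
      * every enabled unit is requested;
      * a masked unit never starts, whoever asks for it;
      * a unit fails if anything it Requires cannot start;
      * a unit ignores the failure of anything it merely Wants.
    Dependency cycles are not handled.
    """
    by_name = {u.name: u for u in units}
    started, failed = set(), set()

    def start(name):
        if name in started:
            return True
        if name in failed or name in masked or name not in by_name:
            failed.add(name)
            return False
        unit = by_name[name]
        # Every hard dependency must come up, or this unit fails too.
        for dep in unit.requires:
            if not start(dep):
                failed.add(name)
                return False
        # Soft dependencies are attempted; their failure is tolerated.
        for dep in unit.wants:
            start(dep)
        started.add(name)
        return True

    for name in enabled:
        start(name)
    return started


units = [
    Unit("rpcbind"),
    Unit("rpc.statd", requires=["rpcbind"]),
    Unit("nfs-server", wants=["rpcbind"]),
]
requested = {"nfs-server", "rpc.statd"}

# rpcbind is not enabled ("disabled"), yet other units still pull it in.
print(sorted(plan_start(units, requested, masked=set())))

# Masking rpcbind keeps it down, takes rpc.statd with it, and leaves
# nfs-server running.
print(sorted(plan_start(units, requested, masked={"rpcbind"})))
```

In the first call rpcbind starts even though nothing enabled it, which is the sense in which "disable" gives no guarantee. In the second call only nfs-server starts, matching the behavior the article describes for a masked rpcbind.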
Ordering nfsd with respect to filesystem mounting using a generator
One dependency for the NFS server, which is particularly obvious in hindsight, is that it should only be started after the filesystems that it is exporting have been mounted. Without this ordering, an NFS client might manage to mount the filesystem that is about to have something mounted on top of it, which can cause confusion, or worse. The default dependencies imposed by systemd will start services after local-fs.target, which ensures all local filesystems are mounted. When the commit mentioned above removed the default dependencies to allow NFS to start earlier, it explicitly added local-fs.target. So this seems well in hand.
For remote filesystems mounted over NFS, we need the reverse ordering. In particular, if a filesystem is NFS mounted from the local host (a "loopback" mount), the NFS server should be started before the filesystem is mounted. This is particularly important during system shutdown when ordering is reversed. If the NFS server is stopped before the loopback NFS filesystem is unmounted, that unmount can hang indefinitely.
To avoid this hang, Pitt added a dependency so that nfs-server.service would start before (and so be stopped after) remote-fs-pre.target. This ensures that the NFS server will be running whenever a loopback NFS filesystem might be mounted. This seems like it makes perfect sense, but there is a wrinkle: sometimes, filesystems that are considered by systemd to be "remote" can be exported by NFS. A particular example is filesystems mounted from a network-attached block device, such as one accessed over iSCSI.
Had I confronted the need to export iSCSI filesystems before Pitt had added the dependency on remote-fs-pre.target, I probably would have simply told systemd to start nfs-server.service "After remote-fs.target". This would have solved the iSCSI situation, but broken the loopback NFS situation. Had the unit files not been upstream, this is undoubtedly what would have happened.
Instead, a more general solution was needed. The NFS server needs to start after the mounting of any filesystems that are exported, but before any NFS filesystem is mounted. Systemd is not able to make this determination itself, but fortunately it has a flexible extension mechanism so it can have the details explained to it. Using this extension mechanism isn't quite as easy as adding a script to /etc/init.d, but perhaps that is a good thing. It should probably only be used as a last resort, but it is good to have it when that resort is needed.
Before systemd reads all its unit files, either at startup or in response to "systemctl daemon-reload", it will run any programs found in various "generator" directories such as /usr/lib/systemd/system-generators. These programs are run in parallel, are expected to complete quickly, and will normally read a foreign (i.e. non-systemd) configuration file and create new unit files or drop-ins (which extend existing unit files) in a directory given to the program, typically /run/systemd/generator. These will then be read when other unit files and drop-ins are read, so they can exercise a large degree of control over systemd.
For the nfs-server dependency, with respect to various mount points, we want to read /etc/exports and add a RequiresMountsFor= directive for each exported directory. Then we want to read /etc/fstab and add a Before=MOUNT_POINT.mount directive for each MOUNT_POINT of an nfs or nfs4 filesystem. As library code already exists for reading both of these files, this all comes to less than 200 lines of code. Once the problem is understood, the answer is easy.
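The two steps just described can be sketched in a few lines of Python. This is a rough illustration only, not the generator shipped with nfs-utils: the function names and the drop-in file name are invented here, the parsing of /etc/exports and /etc/fstab is deliberately naive (no quoting, continuation lines, or octal escapes), and escape_path() is a simplified approximation of the path escaping systemd uses for mount unit names.

```python
import os


def escape_path(path):
    """Approximate systemd's path escaping for unit names (simplified)."""
    trimmed = "/".join(part for part in path.split("/") if part)
    if not trimmed:
        return "-"  # the root directory maps to "-"
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            # Everything else, including a literal "-", becomes \xXX.
            out.extend("\\x%02x" % byte for byte in ch.encode("utf-8"))
    return "".join(out)


def exported_dirs(exports_text):
    """First field of each non-comment line in an exports-style file."""
    dirs = []
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            dirs.append(line.split()[0])
    return dirs


def nfs_mount_points(fstab_text):
    """Mount points of fstab entries whose type is nfs or nfs4."""
    points = []
    for line in fstab_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points


def build_dropin(exports_text, fstab_text):
    """Text of a drop-in that orders nfs-server relative to mounts."""
    lines = ["[Unit]"]
    for directory in exported_dirs(exports_text):
        lines.append("RequiresMountsFor=" + directory)
    for point in nfs_mount_points(fstab_text):
        lines.append("Before=" + escape_path(point) + ".mount")
    return "\n".join(lines) + "\n"


def write_dropin(generator_dir, dropin_text):
    """Write the drop-in under generator_dir; the file name is illustrative."""
    dropin_dir = os.path.join(generator_dir, "nfs-server.service.d")
    os.makedirs(dropin_dir, exist_ok=True)
    path = os.path.join(dropin_dir, "order-with-mounts.conf")
    with open(path, "w") as handle:
        handle.write(dropin_text)
    return path
```

A real generator would be installed in one of the generator directories and would call something like write_dropin() with the first directory that systemd passes on its command line, after reading the two configuration files from /etc. Keeping the parsing and the file writing separate, as here, makes the logic easy to exercise without touching the running system.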
Having experienced the power of systemd generators, I immediately started to wonder how else I might use them. It is tempting to use a generator to automatically disable rpcbind when only NFSv4 is in use, but I think that is a temptation best avoided. rpcbind isn't only used by NFS. NIS, the Network Information Service (previously called "yellow pages"), makes use of it, and sites could easily have their own local RPC services. It is best if disabling rpcbind remains a separate administrative decision, for which the "mask" function seems well suited.
In the earlier articles I described a modest amount of complexity required to pass local configuration through systemd to affect the parameters passed to various programs. Using a generator to process the configuration file could make all of that more transparent, or it might just replace one sort of complexity with another. While I don't agree with all the advice the systemd developers provide, this advice from the systemd.generator manual page is certainly worth considering:
Instead of heading off now and writing all kind of generators for legacy configuration file formats, please think twice! It is often a better idea to just deprecate old stuff instead of keeping it artificially alive.
The evidence presented here supports the claim that keeping systemd unit files upstream can benefit all developers and users. The different experiences generated in different contexts were brought together into a single conversation so all could benefit from, and respond to, all the changes. This should not be surprising when one thinks of unit files as just another sort of code used to write the whole system. The only part that seems to be missing from upstream is a place to document the advice that "systemctl mask rpcbind" is the appropriate way to disable rpcbind and rpc-statd when only NFSv4 is in use. Maybe we need an nfs.systemd man page.