Guide to pmtr

Pmtr is one of my many C projects.

About

Back when I wrote pmtr in 2011, systemd had not yet been widely adopted. For several years, pmtr provided a uniform way to run a set of services across various distributions. Once systemd became the uniform system manager, pmtr remained useful in two secondary roles:

On a host: pmtr, running under systemd, is a quick way to run a group of processes from a single configuration file

In a container: pmtr, as process 1, launches a set of processes when the container starts

I wrote pmtr as a workalike to a proprietary process supervisor with these goals:

a single configuration file lists the processes to run
to create the managed processes in a standard execution context where the only open file descriptors are 0, 1, and 2 (standard input, output, and error)
for the supervisory process to consume few resources, and collect any exited child processes
to limit the restart rate of any failed or exiting child processes
to signal child processes when the supervisory process itself is terminated
to have no dependencies

It is written in C, supports Linux only, and is MIT licensed.

Over the years, pmtr has been used on many Linux variants including Ubuntu, RHEL, Arch, Amazon Linux, Pi OS and others.

Example /etc/pmtr.conf

job {
  name tunnel
  cmd /usr/bin/ssh -i key -TNL 5901:127.0.0.1:5901 192.168.0.1
}

job {
  name demo-daemon
  dir /home/demo
  cmd /usr/bin/demod
  user demo
  env HOME=/home/demo
  cpu 0-8
}

job {
  name capture
  dir /data
  cmd /usr/sbin/tcpdump -i eth0 -s 0 -G 10 -w %Y%m%d%H%M%S.pcap
}

When pmtr is started, it starts all the jobs in pmtr.conf; likewise, when pmtr is stopped, it terminates them.

Any syntax errors in pmtr.conf are reported through syslog. These messages appear in the container logs (when pmtr runs as a container entrypoint) or, on a systemd host, in the output of journalctl -u pmtr.

Processes that run under pmtr should stay in the foreground, exit on SIGTERM or SIGKILL, and clean up after their own sub-processes when exiting.

If a job exits, pmtr restarts it. If it exits too quickly (less than 10 seconds after it started), pmtr delays its restart by 10 seconds to avoid rapid cycling.

If the operator edits pmtr.conf and saves the file, the changes take effect immediately. There is no need to tell pmtr to reload its configuration file.

Build & Install

To build pmtr from source follow these steps.

Clone pmtr

git clone https://github.com/troydhanson/pmtr.git

Build and install

cd pmtr
mkdir build && cd build
cmake ..
make
sudo cmake --install . --prefix=/usr

This will install the executable into /usr/bin/pmtr.

Usage as a container entrypoint

Dockerfile syntax to configure pmtr as the entrypoint is:

ENTRYPOINT ["/usr/bin/pmtr", "-Fc", "/etc/pmtr.conf"]

The pmtr options used above are

-F        (run pmtr in the foreground, keeping container from exiting)
-c <file> (path to pmtr configuration, defaults to /etc/pmtr.conf)

The pmtr logs will be sent to the container logs (e.g. docker logs).

Usage as a systemd-managed service on a host

To run pmtr under systemd on a host, build it as shown above, then install the systemd service file and enable and start pmtr (enabling it also makes pmtr start on future boots) with these commands:

sudo cp pmtr.service /etc/systemd/system
sudo systemctl daemon-reload
sudo systemctl enable pmtr.service
sudo systemctl start pmtr.service

Config file

You can create an empty file called /etc/pmtr.conf and add temporary content to get the hang of the pmtr configuration syntax, such as:

job {
  name demo
  cmd /bin/sleep 20
}

This is a minimal job having a name and a command. Save the file. At this point, if you execute pmtr in the foreground, you can see it run the job:

pmtr -F -c /etc/pmtr.conf

You will see output like this:

pmtr[21992]: pmtr: starting
pmtr[21992]: started job demo [21994]
pmtr[21992]: job demo [21994] exited after 20 sec: exit status 0
pmtr[21992]: started job demo [21997]

You can see that pmtr is restarting it when the sleep exits every 20 seconds. Press Ctrl-C to terminate pmtr and it will terminate the sleep subprocess too.

To use a different config file, pass its path with the -c command line option (pmtr -c <file>).

As shown previously, pmtr.conf consists of zero or more job definitions. An empty pmtr.conf is valid, in which case pmtr will idle harmlessly. Add jobs using the curly-brace-delimited syntax. Another example of a job:

job {
  name demo
  cmd /usr/bin/date
  out /var/log/demo.out
  err /var/log/demo.err
}

A job must have a name (used to improve log readability only) and a cmd at a minimum. The contrived job above will log the date, and since it exits immediately, pmtr will wait 10 seconds and then restart it. This limits the impact of misconfigured, failing, or quick exiting jobs. If a job has been running more than ten seconds, then exits, pmtr will restart it immediately.

In the pmtr.conf syntax,

Indentation is optional.
The order of options does not matter.
Blank lines are ok.
Comments start with #.

You can run pmtr -t -c /etc/pmtr.conf to test the configuration file syntax.

Options

The full list of options that may appear inside a job is shown here.

Table 1. Job options
option	argument
`name`	descriptive job name used for logging - must be unique
`cmd`	executable (fully-qualified) and any arguments
`dir`	working directory (fully qualified) to run the process in
`out`	send stdout to this file
`err`	send stderr to this file
`in`	take stdin from this file
`env`	environment variable to set (repeatable)
`user`	unix username under whose id to run the process
`nice`	unix priority between -19 (highest) and 20 (lowest)
`cpu`	CPU affinity as hex mask (0xABCD) or number/ranges (0,2-4)
`ulimit`	process ulimits
`bounce every`	a time interval to restart the process
`depends`	files to watch, any changes induce the job to restart
`disable`	disable the job
`wait`	(special) wait for the job to finish before going on
`once`	(special) do not restart the job

More details on each option follow.

cmd

Specifies the absolute path to the executable (there is no $PATH searching!).
It may have arguments after the executable (e.g. cmd /usr/bin/python foo.py).
Use double quotes to make a quoted string a single argument.

There is no shell expansion: no wildcards, backticks, variables, etc. If you need shell features in your command, invoke it through a shell script.

env

Sets an environment variable for a job, e.g. env DEBUG=1.
Use repeatedly on separate lines to set multiple environment variables.

disable

Use disable on a line by itself to disable the job.
This is sometimes quicker than commenting out the whole job.

out, err, in

Use out and err to send stdout or stderr to a file.
stdout and stderr go to syslog by default.
stdin defaults to /dev/null; use the in option to override it.

nice

This changes the process priority
Takes a number in the range -19 (highest priority) to 20 (lowest)

cpu

This sets the CPU affinity: the list of CPUs the task can run on.
Takes a CPU number (e.g. 0), a range (e.g. 2-4), or a mix (e.g. 0,2-4).
Alternatively, it can take a 0x-prefixed hex mask (e.g. 0x8f).
Any CPUs in the set that are physically absent are ignored.
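
The two accepted forms of the cpu argument can be illustrated with a small Python sketch. This is not pmtr's parser (pmtr is written in C). The function name parse_cpu_spec is hypothetical, and the mapping of mask bit n to CPU n is an assumption based on the usual Linux affinity-mask convention.

```python
def parse_cpu_spec(spec: str) -> set:
    """Return the set of CPU numbers named by a cpu option argument."""
    spec = spec.strip()
    if spec.lower().startswith("0x"):
        mask = int(spec, 16)
        # Assumption: bit n of the hex mask selects CPU n.
        return {n for n in range(mask.bit_length()) if (mask >> n) & 1}
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range such as 2-4 is inclusive at both ends.
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError(f"bad CPU range: {part!r}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return cpus
```

Under these assumptions, "0,2-4" names CPUs 0, 2, 3 and 4, and "0x8f" names CPUs 0, 1, 2, 3 and 7.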

user

Specifies the unix username to run the process as.
That user’s uid/gid becomes those of the process.
Defaults to root (when pmtr is running as root)

depends

Specifies a block of one or more files that the job depends on.
pmtr watches the dependencies for changes to their content.

Pmtr restarts the job if a change is detected.

job {
  name demo-daemon
  dir /home/demo
  cmd /usr/bin/demod
  user demo
  env HOME=/home/demo
  depends {
    /home/demo/.demo/demo.conf
  }
}

bounce every

Use bounce every to restart a job at a regular interval.
It takes a number and a unit, e.g. bounce every 1d.
The units [smhd] are seconds, minutes, hours, or days.
The timing of the restart is approximate.
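
To make the number-plus-unit form concrete, here is a minimal Python sketch that converts such an interval to seconds. The function name parse_interval is hypothetical, and the sketch assumes the number is a whole number written directly before the unit letter, as in the 1d example. It is not how pmtr itself parses the option.

```python
import re

# Seconds per unit letter: seconds, minutes, hours, days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_interval(text: str) -> int:
    """Convert an interval such as '1d' or '30m' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"expected a number followed by s, m, h or d: {text!r}")
    count, unit = match.groups()
    return int(count) * UNIT_SECONDS[unit]
```

With this sketch, an interval of 1d corresponds to 86400 seconds.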

ulimit

Use to modify the system resource limits for the job.
Takes a flag and value, e.g., ulimit -n 30.
Values are numeric or the keyword infinity.

To see the current limits on a process by its PID:

cat /proc/<pid>/limits

Pmtr sets both the "hard" and "soft" limit to the same value. Any error in setting the limit is logged to syslog.
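
As an illustration of setting the hard and soft limit to the same value, the following Python sketch applies an open-file limit to a child process before it executes. It uses Python's standard resource module rather than pmtr's own C code. The helper names limit_pair and run_with_nofile_limit are hypothetical.

```python
import resource
import subprocess

def limit_pair(value):
    """Map a ulimit value (a number or the keyword 'infinity') to (soft, hard)."""
    n = resource.RLIM_INFINITY if value == "infinity" else int(value)
    return (n, n)  # soft and hard limits are set to the same value

def run_with_nofile_limit(argv, value):
    """Run argv with the open-file limit (the -n flag) applied in the child."""
    def apply_limit():
        # Runs in the child after fork and before exec.
        resource.setrlimit(resource.RLIMIT_NOFILE, limit_pair(value))
    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)
```

For example, run_with_nofile_limit(["/bin/sh", "-c", "ulimit -n"], 30) runs a shell whose own ulimit -n reports the lowered limit.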

See man prlimit for technical descriptions of each limit. In the bash shell, ulimit -a and help ulimit display the limits and a list of flags and descriptions respectively.

Table 2. ulimit flags
`-c`	the maximum size of core files created
`-d`	the maximum size of a process’s data segment
`-e`	the maximum scheduling priority (taken as 20-limit)
`-f`	the maximum size of files the process may create
`-i`	the maximum number of pending signals
`-l`	the maximum bytes a process may lock into memory
`-m`	the maximum resident set size
`-n`	the maximum number of open file descriptors
`-q`	the maximum number of bytes in POSIX message queues
`-r`	the maximum real-time scheduling priority
`-s`	the maximum stack size
`-t`	the maximum amount of cpu time in seconds
`-u`	the maximum number of processes or threads
`-v`	the maximum size of process virtual memory

The units are discussed in the prlimit(2) man page.

wait/once

Used for setup jobs that need not be restarted.
wait pauses the startup of subsequent jobs.

once tells pmtr not to restart this job.

job {
  name initial-setup
  cmd /bin/mkdir /dev/shm/go
  wait
  once
}

Operator notes

Pmtr detects writes to pmtr.conf and applies the changes immediately. A user can edit the config file by hand and save it; pmtr then rereads the file right away and takes the appropriate action:

A newly-added job gets started.
A deleted job is terminated.
A changed job is restarted with its new configuration.
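
The three cases above amount to comparing the old and new sets of jobs. A minimal Python sketch of that comparison follows. It assumes each configuration is represented as a dictionary mapping job name to its settings, and the function name plan_changes is hypothetical. It illustrates the rule only and does not reflect pmtr's internal data structures.

```python
def plan_changes(old: dict, new: dict):
    """Compare two {job name: settings} mappings.

    Returns three sorted lists of job names: (to_start, to_stop, to_restart).
    """
    to_start = sorted(name for name in new if name not in old)
    to_stop = sorted(name for name in old if name not in new)
    to_restart = sorted(name for name in new
                        if name in old and new[name] != old[name])
    return to_start, to_stop, to_restart
```

A job whose settings are identical in both versions appears in none of the three lists, so it is left running untouched.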

You can run pmtr -t to check the config file syntax and report any errors.

Viewing the logs

When pmtr is a container entrypoint, the pmtr logs can be viewed using docker logs for that container.

On a host, use journalctl -u pmtr to view pmtr logs.

Checking status, and starting and stopping pmtr

Use the systemd management commands to start or stop pmtr or check the pmtr status on a systemd-managed host.

systemctl status pmtr
systemctl start pmtr
systemctl stop pmtr

Behavior when a job exits

Jobs that exit on their own

If a job terminates by itself (that is, pmtr did not signal it to exit) and the job does not have the once option, pmtr restarts it. However, if the job exited within 10 seconds of starting, pmtr waits 10 seconds before restarting it. The 10-second wait prevents rapid process cycling. It also means that a job waiting for something (like a network resource to come up) can be designed to exit instead of retrying, relying on pmtr to restart it periodically to try again.
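
The restart rule can be summarized in a few lines of Python. This is a sketch of the policy as described here, not pmtr's implementation. The function name restart_delay is hypothetical, and treating a runtime of exactly 10 seconds as "not too quick" is an assumption, since the text does not say which side of the boundary it falls on.

```python
MIN_RUNTIME = 10    # seconds a job must run to be restarted immediately
RESTART_DELAY = 10  # seconds to wait before restarting a quick-exiting job

def restart_delay(runtime_seconds, once=False):
    """Return the seconds to wait before a restart, or None for no restart."""
    if once:
        return None           # jobs marked once are never restarted
    if runtime_seconds < MIN_RUNTIME:
        return RESTART_DELAY  # exited too quickly: delay to avoid cycling
    return 0                  # ran long enough: restart immediately
```

Under this sketch, the 20-second sleep job shown earlier restarts at once, while a job that exits immediately waits 10 seconds.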

How pmtr terminates a job

Pmtr terminates a job when it is deleted, disabled, or altered in pmtr.conf, or is being bounced due to the bounce every option; or because pmtr itself is being shut down. To terminate a job, pmtr sends SIGTERM to it, then SIGKILL shortly afterward, if it’s still running.
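
The SIGTERM-then-SIGKILL sequence looks roughly like the Python sketch below, which uses the standard subprocess module. The length of the grace period is an assumption: the text says only "shortly afterward", so the grace_seconds parameter and its default are hypothetical, as is the function name terminate_job.

```python
import signal
import subprocess

def terminate_job(proc: subprocess.Popen, grace_seconds: float = 2.0) -> int:
    """Send SIGTERM, then SIGKILL if the process is still running.

    Returns the exit code; a negative value -N means killed by signal N.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        return proc.wait()
```

A job that handles SIGTERM promptly never sees the SIGKILL, which is why jobs should exit cleanly on SIGTERM.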

Command line options

The host init system normally runs pmtr at system boot. You can instead run pmtr manually. Or, it can be the init process inside a container. In these use cases, the following command line options may be useful.

Table 3. pmtr command-line options
`-h`	show help
`-F`	stay in foreground (enabled by default when PID 1)
`-c <file>`	specify configuration file
`-t`	test syntax (parse config file and exit)
`-v`	verbose logging (repeatable), -vv shows parsing
`-p <file>`	make pidfile

"onconnect" utility

For processes that run in response to an accepted network connection, pmtr includes a helper that will only run the process when a client actually connects to that socket. Until then, only the socket listener will run; when the client connects, the listener will fork and execute the program to handle it. This is a way to defer running a network service until a client connects to it. A new instance of the service will be started for each client connection. To use it, configure a job using the "onconnect" utility as the command, followed by the service itself. For example, a hypothetical Gstreamer pipeline could be started only upon a client connection on port 5000 like this:

job {
  name gstreamer-job
  cmd /usr/bin/onconnect -p 5000 /path/to/gstreamer-job <args>
}

The "onconnect" utility is built with pmtr and installed alongside it so it will be in /usr/bin if you ran cmake --install . --prefix=/usr.

The full syntax for onconnect includes an option to choose the IP address it listens on (useful on a multihomed host); by default it listens on all addresses.

onconnect [-a 0.0.0.0] -p <port> /process/to/fork [args]

When a client connection is made to this port, onconnect will fork and execute the named subprocess with the accepted client connection on file descriptor 3. The subprocess must be coded to expect the client connection on that descriptor.
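
A handler written for onconnect therefore starts by wrapping descriptor 3 in a socket object. The Python sketch below shows one way to do that. The greeting and the one-message echo are purely illustrative, and the function name serve_client is hypothetical.

```python
import socket

CLIENT_FD = 3  # per this guide, the accepted connection arrives on descriptor 3

def serve_client(fd: int = CLIENT_FD) -> None:
    """Wrap the inherited descriptor, greet the client, and echo one message."""
    # socket.socket(fileno=...) adopts an already-open descriptor;
    # the address family and type are detected from the descriptor itself.
    with socket.socket(fileno=fd) as conn:
        conn.sendall(b"hello from handler\n")
        data = conn.recv(1024)
        if data:
            conn.sendall(data)
    # Leaving the with block closes the connection.
```

A real handler script would call serve_client() when it starts, and would be named as the process to fork on the onconnect command line.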

UDP control

Note: This feature is disabled by default.

These options may appear in pmtr.conf at the global scope.

report to udp://127.0.0.1:9999
listen on udp://0.0.0.0:10000

The report to option designates a remote address and port to which pmtr should send a UDP packet every ten seconds. The packet payload lists the job names, enabled or disabled status, and elapsed runtimes in simple text. If the report to address falls in the multicast UDP range (e.g. 239.0.0.1), the specification may include a trailing interface, e.g., report to udp://239.0.0.1:9999@eth2, to designate the interface from which the multicast UDP datagrams should egress.
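
A receiver for these reports can be as simple as a bound UDP socket. The Python sketch below waits for one datagram and returns its text. It makes no assumption about the layout of the payload beyond it being text, and the function names open_report_socket and read_report are hypothetical. The default address matches the report to example above.

```python
import socket

def open_report_socket(host: str = "127.0.0.1", port: int = 9999) -> socket.socket:
    """Bind a UDP socket on the address named in the report to option."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock

def read_report(sock: socket.socket, timeout: float = 15.0) -> str:
    """Wait for one report datagram and return its payload as text."""
    # Reports are described as arriving every ten seconds, so the default
    # timeout leaves some margin; socket.timeout is raised if none arrives.
    sock.settimeout(timeout)
    payload, _sender = sock.recvfrom(65535)
    return payload.decode("utf-8", errors="replace")
```

Calling read_report in a loop and printing each result gives a simple status monitor.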

The listen on option allows jobs to be remotely enabled or disabled. It specifies a UDP address and port that pmtr should listen on for datagrams of form enable abc or disable abc, where abc is a job name. The address 0.0.0.0 can be used as a shortcut to denote "any address" on this system. The effect is temporary; the settings in pmtr.conf resume precedence when it’s edited or pmtr is restarted.
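
Sending such a datagram takes only a few lines. The Python sketch below builds the enable or disable message and sends it over UDP. The function name send_control is hypothetical, and sending the bare text with no trailing newline is an assumption about the expected format. The default port matches the listen on example above.

```python
import socket

def send_control(action: str, job: str,
                 host: str = "127.0.0.1", port: int = 10000) -> bytes:
    """Send an 'enable <job>' or 'disable <job>' datagram and return the bytes sent."""
    if action not in ("enable", "disable"):
        raise ValueError(f"action must be enable or disable, not {action!r}")
    # Assumption: the datagram is the plain text with no terminator.
    message = f"{action} {job}".encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))
    return message
```

Because UDP is connectionless, the sender gets no confirmation; checking the pmtr logs is one way to see whether the job changed state.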

These options are considered experimental and may be replaced or removed.