Pmtr is one of my many C projects.
Back when I wrote pmtr in 2011, systemd had not yet been widely adopted. For several years, pmtr provided a uniform way to run a set of services across various distributions. Once systemd became the uniform system manager, pmtr remained useful in two secondary roles:
- On a host: pmtr, running under systemd, is a quick way to run a group of processes from a single configuration file.
- In a container: pmtr, as process 1, launches a set of processes when the container starts.
I wrote pmtr as a workalike to a proprietary process supervisor with these goals:
- for a single configuration file to list the processes to run
- to create the managed processes in a standard execution context, where the only open file descriptors are 0, 1, and 2 (standard input, output, and error)
- for the supervisory process to consume few resources, and collect any exited child processes
- to limit the restart rate of any failed or exiting child processes
- to signal child processes when the supervisory process itself is terminated
- to have no dependencies
It is written in C, supports Linux only, and is MIT licensed.
Over the years, pmtr has been used on many Linux variants including Ubuntu, RHEL, Arch, Amazon Linux, Pi OS and others.
job {
name tunnel
cmd /usr/bin/ssh -i key -TNL 5901:127.0.0.1:5901 192.168.0.1
}
job {
name demo-daemon
dir /home/demo
cmd /usr/bin/demod
user demo
env HOME=/home/demo
cpu 0-8
}
job {
name capture
dir /data
cmd /usr/sbin/tcpdump -i eth0 -s 0 -G 10 -w %Y%m%d%H%M%S.pcap
}
When pmtr is started, it starts all the jobs in pmtr.conf; likewise, pmtr
terminates them when it is stopped.
Any syntax errors in pmtr.conf are reported through syslog, which is echoed
to the container logs (when pmtr runs as a container entrypoint) or, on a
systemd host, is visible through journalctl -u pmtr.
Processes that run under pmtr should stay in the foreground, exit on SIGTERM or SIGKILL, and clean up after their own sub-processes when exiting.
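To illustrate these expectations, here is a minimal Python sketch of a job that stays in the foreground and exits promptly on SIGTERM. The names run_job, stop_requested, and max_iterations are hypothetical and are not part of pmtr; any long-running program that behaves this way would do.

```python
import signal
import threading

# Set when SIGTERM arrives; the main loop checks it so it can exit promptly.
stop_requested = threading.Event()


def _on_sigterm(signum, frame):
    """Signal handler: ask the main loop to finish."""
    stop_requested.set()


def run_job(poll_seconds=1.0, max_iterations=None):
    """Stay in the foreground doing work until SIGTERM is received.

    max_iterations is a hypothetical knob that only exists so the loop can
    be bounded in a demonstration; a real job would leave it as None.
    Returns the number of work iterations completed.
    """
    signal.signal(signal.SIGTERM, _on_sigterm)
    done = 0
    while not stop_requested.is_set():
        if max_iterations is not None and done >= max_iterations:
            break
        # ... one unit of real work would go here ...
        done += 1
        # Sleep, but wake up at once if the handler sets the event.
        stop_requested.wait(poll_seconds)
    # Clean up any sub-processes or temporary files here before returning.
    return done
```

A script built this way would call run_job() as its last statement and could be named directly in a job's cmd line, since it never daemonizes or forks into the background.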
If a job exits, pmtr restarts it. If it exits too quickly (less than 10 seconds after it started), pmtr delays its restart by 10 seconds to avoid rapid cycling.
If the operator edits pmtr.conf and saves the file, the changes take effect
immediately. There is no need to tell pmtr to reload its configuration file.
Build & Install
To build pmtr from source follow these steps.
git clone https://github.com/troydhanson/pmtr.git
cd pmtr
mkdir build && cd build
cmake ..
make
sudo cmake --install . --prefix=/usr
This will install the executable into /usr/bin/pmtr.
Dockerfile syntax to configure pmtr as the entrypoint is:
ENTRYPOINT ["/usr/bin/pmtr", "-Fc", "/etc/pmtr.conf"]
The pmtr options used above are:
-F (run pmtr in the foreground, keeping the container from exiting)
-c <file> (path to the pmtr configuration file, defaults to /etc/pmtr.conf)
The pmtr logs will be sent to the container logs (e.g. docker logs).
To run pmtr under systemd on a host, build it as shown above. Then install the systemd service, enable it (so pmtr starts on future reboots), and start it by issuing these commands:
sudo cp pmtr.service /etc/systemd/system
sudo systemctl daemon-reload
sudo systemctl enable pmtr.service
sudo systemctl start pmtr.service
Config file
You can create an empty file called /etc/pmtr.conf and add temporary content
to get the hang of the pmtr configuration syntax, such as:
job {
name demo
cmd /bin/sleep 20
}
This is a minimal job having a name and a command. Save the file. At this point, if you execute pmtr in the foreground, you can see it run the job:
pmtr -F -c /etc/pmtr.conf
You will see output like this:
pmtr[21992]: pmtr: starting
pmtr[21992]: started job demo [21994]
pmtr[21992]: job demo [21994] exited after 20 sec: exit status 0
pmtr[21992]: started job demo [21997]
You can see that pmtr restarts the job each time the sleep exits, every 20 seconds.
Press Ctrl-C to terminate pmtr and it will terminate the sleep subprocess too.
The config file can be changed to a different file using a command line option
(pmtr -c <file>).
As shown previously, pmtr.conf consists of zero or more job definitions. An
empty pmtr.conf is valid, in which case pmtr will idle harmlessly. Add jobs
using the curly brace delimited syntax. Another example of a job:
job {
name demo
cmd /usr/bin/date
out /var/log/demo.out
err /var/log/demo.err
}
A job must have a name (used to improve log readability only) and a cmd at a
minimum. The contrived job above will log the date, and since it exits
immediately, pmtr will wait 10 seconds and then restart it. This limits the
impact of misconfigured, failing, or quick-exiting jobs. If a job has been
running more than ten seconds and then exits, pmtr will restart it immediately.
In the pmtr.conf syntax,
- Indentation is optional.
- The order of options does not matter.
- Blank lines are ok.
- Comments start with #.
You can run pmtr -t -c /etc/pmtr.conf to test the configuration file syntax.
Options
The full list of options that may appear inside a job is given here.
option | description |
---|---|
name | descriptive job name used for logging; must be unique |
cmd | executable (fully qualified) and any arguments |
dir | working directory (fully qualified) to run the process in |
out | send stdout to this file |
err | send stderr to this file |
in | take stdin from this file |
env | environment variable to set (repeatable) |
user | unix username under whose id to run the process |
nice | unix priority between -19 (highest) and 20 (lowest) |
cpu | CPU affinity as hex mask (0xABCD) or number/ranges (0,2-4) |
ulimit | process ulimits |
bounce every | a time interval to restart the process |
depends | files to watch, any changes induce the job to restart |
disable | disable the job |
wait | (special) wait for the job to finish before going on |
once | (special) do not restart the job |
More details on each option follow.
cmd
- Specifies the absolute path to the executable (there is no $PATH searching!)
- It may have arguments after the executable (e.g. cmd /usr/bin/python foo.py)
- Use double quotes to group a string containing spaces into a single argument.
There is no shell expansion: no wildcards, backticks, variables, etc. If you need shell features in your command, invoke it through a shell script.
env
- Sets an environment variable for a job, e.g. env DEBUG=1.
- Use repeatedly on separate lines to set multiple environment variables.
disable
- Use disable on a line by itself to make the job disabled.
- This is sometimes quicker than commenting out the whole job.
out, err, in
- Use out and err to send stdout or stderr to a file.
- stdout and stderr go to syslog by default.
- stdin defaults to /dev/null; use in to override.
nice
- This changes the process priority.
- Takes a number in the range -19 (highest priority) to 20 (lowest).
cpu
- This sets the CPU affinity: the list of CPUs the task can run on.
- Takes a CPU number (e.g. 0) or range (e.g. 2-4) or a mix (e.g. 0,2-4).
- Alternatively, can take a 0x-prefixed hex mask (e.g. 0x8f).
- Any CPUs in the set that are physically absent are ignored.
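The two accepted forms of the cpu argument can be illustrated with a small Python sketch that turns either form into a set of CPU numbers. This is not pmtr's parser; the function names parse_cpu_spec and drop_absent are hypothetical and only show how a list such as 0,2-4 and a mask such as 0x8f describe the same kind of set.

```python
def parse_cpu_spec(spec):
    """Return the set of CPU numbers named by a cpu option argument.

    Accepts a 0x-prefixed hex mask ("0x8f") or a comma-separated mix of
    CPU numbers and inclusive ranges ("0,2-4").
    """
    spec = spec.strip()
    if spec.lower().startswith("0x"):
        mask = int(spec, 16)
        # Bit n of the mask selects CPU n.
        return {n for n in range(mask.bit_length()) if (mask >> n) & 1}
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            if lo > hi:
                raise ValueError(f"bad CPU range: {part!r}")
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return cpus


def drop_absent(cpus, present):
    """Ignore requested CPUs that are not among the CPUs actually present."""
    return set(cpus) & set(present)
```

On Linux, a set produced this way could be passed to os.sched_setaffinity(0, cpus) to pin the calling process.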
user
- Specifies the unix username to run the process as.
- That user’s uid/gid becomes those of the process.
- Defaults to root (when pmtr is running as root).
depends
- Specifies a block of one or more files that the job depends on.
- Pmtr watches the dependencies for changes to their content.
- Pmtr restarts the job if a change is detected.
job {
name demo-daemon
dir /home/demo
cmd /usr/bin/demod
user demo
env HOME=/home/demo
depends {
/home/demo/.demo/demo.conf
}
}
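The idea of restarting a job when the content of a dependency changes can be sketched in Python by comparing content digests taken at two points in time. How pmtr itself detects changes is not described here, so this polling approach and the names fingerprint and changed are assumptions made for illustration only.

```python
import hashlib


def fingerprint(paths):
    """Map each path to a SHA-256 digest of its content (None if unreadable)."""
    digests = {}
    for path in paths:
        try:
            with open(path, "rb") as f:
                digests[path] = hashlib.sha256(f.read()).hexdigest()
        except OSError:
            digests[path] = None
    return digests


def changed(before, after):
    """Return the sorted paths whose content differs between two fingerprints."""
    return sorted(p for p in after if before.get(p) != after[p])
```

A supervisor loop built on this would take a fingerprint when the job starts, take another one periodically, and restart the job whenever changed() returns a non-empty list.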
bounce every
- Use bounce every to restart a job on a periodic interval.
- It takes a number and unit [smhd], e.g. bounce every 1d.
- Units [smhd] are seconds, minutes, hours or days.
- The exact timing of the restart is approximate.
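As a worked example of the interval syntax, the following Python sketch converts a number-plus-unit argument into seconds and checks whether a job is due to be bounced. It is not pmtr's implementation; parse_interval and due_for_bounce are hypothetical names.

```python
import re

# Seconds per unit letter, matching the [smhd] units described above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(text):
    """Convert a number-plus-unit interval such as "1d" into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"expected <number><s|m|h|d>, got {text!r}")
    count, unit = match.groups()
    return int(count) * _UNIT_SECONDS[unit]


def due_for_bounce(started_at, now, interval_seconds):
    """True once the job has been running for at least one full interval."""
    return now - started_at >= interval_seconds
```

With this sketch, bounce every 1d corresponds to parse_interval("1d"), which is 86400 seconds.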
ulimit
- Use to modify the system resource limits for the job.
- Takes a flag and value, e.g. ulimit -n 30.
- Values are numeric or the keyword infinity.
To see the current limits on a process by its PID:
cat /proc/<pid>/limits
Pmtr sets both the "hard" and "soft" limit to the same value. Any error in setting the limit is logged to syslog.
See man prlimit for technical descriptions of each limit.
In the bash shell, ulimit -a and help ulimit display the limits and a list
of flags and descriptions respectively.
flag | limit |
---|---|
-c | the maximum size of core files created |
-d | the maximum size of a process’s data segment |
-e | the maximum scheduling priority (taken as 20-limit) |
-f | the maximum size of files the process may create |
-i | the maximum number of pending signals |
-l | the maximum bytes a process may lock into memory |
-m | the maximum resident set size |
-n | the maximum number of open file descriptors |
-q | the maximum number of bytes in POSIX message queues |
-r | the maximum real-time scheduling priority |
-s | the maximum stack size |
-t | the maximum amount of cpu time in seconds |
-u | the maximum number of processes or threads |
-v | the maximum size of process virtual memory |
The units are discussed in the prlimit(2) man page.
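Setting the soft and hard limits to the same value can be illustrated with Python's standard resource module, which wraps the same kernel interface. This is a sketch of the idea rather than pmtr's code; limit_pair and apply_nofile_limit are hypothetical names, and only the open-file-descriptor limit (the -n flag) is shown.

```python
import resource


def limit_pair(value):
    """Build the (soft, hard) pair for a ulimit value.

    value is an integer (or numeric string) or the keyword "infinity".
    Soft and hard limits are set to the same value, as described above.
    """
    n = resource.RLIM_INFINITY if value == "infinity" else int(value)
    return (n, n)


def apply_nofile_limit(value):
    """Limit the number of open file descriptors for the calling process.

    Meant to run in a child between fork and exec (for example as the
    preexec_fn of subprocess.Popen) so the parent's limits are untouched.
    Lowering a hard limit cannot be undone by an unprivileged process.
    """
    resource.setrlimit(resource.RLIMIT_NOFILE, limit_pair(value))
```

After a child applies such a limit, its /proc/<pid>/limits entry for open files would be expected to show the same number in both the soft and hard columns.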
wait/once
- Used for setup jobs that need not be restarted.
- wait pauses the startup of subsequent jobs.
- once tells pmtr not to restart this job.
job {
name initial-setup
cmd /bin/mkdir /dev/shm/go
wait
once
}
Operator notes
Pmtr detects writes to pmtr.conf and applies the changes immediately.
An operator can edit the config file by hand and save it; pmtr then rereads
it and takes the appropriate action:
- A newly-added job gets started.
- A deleted job is terminated.
- A changed job is restarted with its new configuration.
You can run pmtr -t to check the config file syntax and report any errors.
Viewing the logs
When pmtr is a container entrypoint, the pmtr logs can be viewed using docker
logs for that container.
On a host, use journalctl -u pmtr to view pmtr logs.
Checking status, and starting and stopping pmtr
Use the systemd management commands to start or stop pmtr or check the pmtr status on a systemd-managed host.
systemctl status pmtr
systemctl start pmtr
systemctl stop pmtr
Behavior when a job exits
If a job terminates by itself, when pmtr did not signal it to exit (and the
job does not have the once option), pmtr restarts it. However, if it exited
within 10 seconds of when it started, pmtr waits 10 seconds to restart it. The
10-second wait prevents rapid process cycling. Also, a job that’s waiting for
something (like a network resource to come up) can be designed to exit instead
of retrying, relying on pmtr to restart it periodically to try again.
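The restart rule can be written down as a small decision function. The Python sketch below is an illustration of the behavior described here, not pmtr's code; the names restart_delay, MIN_RUNTIME, and RESTART_DELAY are hypothetical, and treating a runtime of exactly 10 seconds as long enough is an assumption.

```python
MIN_RUNTIME = 10    # seconds; a job that exits sooner than this exited "too quickly"
RESTART_DELAY = 10  # seconds to wait before restarting a job that exited too quickly


def restart_delay(started_at, exited_at, once=False, signaled_by_supervisor=False):
    """Return how many seconds to wait before restarting a job that exited.

    Returns None when this rule does not apply: the job has the once
    option, or the supervisor itself signaled the job to exit.
    """
    if once or signaled_by_supervisor:
        return None
    ran_for = exited_at - started_at
    return RESTART_DELAY if ran_for < MIN_RUNTIME else 0
```

For example, a job that ran for 20 seconds yields a delay of 0 (restart immediately), while a job that ran for half a second yields a delay of 10 seconds.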
Pmtr terminates a job when it is deleted, disabled, or altered in pmtr.conf,
when it is being bounced due to the bounce every option, or when pmtr itself
is being shut down. To terminate a job, pmtr sends SIGTERM to it, then SIGKILL
shortly afterward if it’s still running.
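The SIGTERM-then-SIGKILL sequence can be sketched with Python's subprocess module. This is an illustration of the pattern, not pmtr's code; the function name terminate and the two-second grace period are assumptions, since the text only says that SIGKILL follows shortly afterward.

```python
import signal
import subprocess


def terminate(proc, grace_seconds=2.0):
    """Stop a child: SIGTERM first, then SIGKILL if it is still running.

    proc is a subprocess.Popen object. Returns the child's return code,
    which is the negative signal number if a signal ended it.
    """
    if proc.poll() is not None:
        return proc.returncode      # already exited; nothing to signal
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                 # SIGKILL cannot be caught or ignored
        return proc.wait()          # reap the child so no zombie remains
```

A job that handles SIGTERM and exits promptly never sees the SIGKILL; one that ignores SIGTERM is killed once the grace period runs out.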
Command line options
The host init system normally runs pmtr at system boot. You can instead run pmtr manually. Or, it can be the init process inside a container. In these use cases, the following command line options may be useful.
| option | meaning |
|---|---|
| | show help |
| -F | stay in foreground (enabled by default when PID 1) |
| -c <file> | specify configuration file |
| -t | test syntax (parse config file and exit) |
| -v | verbose logging (repeatable), -vv shows parsing |
| | make pidfile |
"onconnect" utility
For processes that run in response to an accepted network connection, pmtr includes a helper that will only run the process when a client actually connects to that socket. Until then, only the socket listener will run; when the client connects, the listener will fork and execute the program to handle it. This is a way to defer running a network service until a client connects to it. A new instance of the service will be started for each client connection. To use it, configure a job using the "onconnect" utility as the command, followed by the service itself. For example, a hypothetical Gstreamer pipeline could be started only upon a client connection on port 5000 like this:
job {
name gstreamer-job
cmd /usr/bin/onconnect -p 5000 /path/to/gstreamer-job <args>
}
The "onconnect" utility is built with pmtr and installed alongside it,
so it will be in /usr/bin if you ran cmake --install . --prefix=/usr.
The full syntax for onconnect includes an option to choose the IP address it listens on (for a multihomed host); by default it listens on all addresses.
onconnect [-a 0.0.0.0] -p <port> /process/to/fork [args]
When a client connection is made to this port, onconnect will fork and execute the named subprocess with the accepted client connection on file descriptor 3. The subprocess must be coded to expect the client connection on that descriptor.
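A service written for onconnect therefore starts from an already-connected descriptor instead of calling listen and accept itself. The Python sketch below shows one way such a subprocess might look; the greeting-and-echo behavior and the name handle_client are hypothetical, and only the use of descriptor 3 comes from the description above.

```python
import socket

CLIENT_FD = 3  # the descriptor on which onconnect passes the accepted connection


def handle_client(fd=CLIENT_FD):
    """Serve one client on an already-connected socket descriptor.

    Hypothetical service: sends a one-line greeting, then echoes whatever
    the client sends until the client closes its side.
    """
    # socket.socket(fileno=...) wraps the inherited descriptor; the address
    # family and socket type are detected from the descriptor itself.
    with socket.socket(fileno=fd) as conn:
        conn.sendall(b"hello from the service\n")
        while True:
            data = conn.recv(4096)
            if not data:        # empty read: the client closed the connection
                break
            conn.sendall(data)
```

A script ending with a call to handle_client() could then be named as the process to fork in the onconnect command line.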
UDP control
Note: This feature is disabled by default.
These options may appear in pmtr.conf at the global scope.
report to udp://127.0.0.1:9999
listen on udp://0.0.0.0:10000
The report to option designates a remote address and port to which pmtr
should send a UDP packet every ten seconds. The packet payload lists the job
names, enabled or disabled status, and elapsed runtimes in simple text. If the
report to address falls in the multicast UDP range (e.g. 239.0.0.1), the
specification may include a trailing interface, e.g. report to
udp://239.0.0.1:9999@eth2, to designate the interface from which the multicast
UDP datagrams should egress.
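On the receiving side, a report can be read with an ordinary unicast UDP socket. The Python sketch below assumes the report to udp://127.0.0.1:9999 setting from the example above; the function names are hypothetical, and because the exact payload layout is not specified here, the text is returned as-is rather than parsed.

```python
import socket


def open_report_listener(host="127.0.0.1", port=9999):
    """Bind a UDP socket on the address named in the report to option."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def next_report(sock, timeout=15.0):
    """Wait for one report datagram; return (sender_address, payload_text).

    Raises socket.timeout (an alias of TimeoutError on recent Pythons)
    if nothing arrives within the timeout.
    """
    sock.settimeout(timeout)
    payload, sender = sock.recvfrom(65535)
    return sender, payload.decode("utf-8", errors="replace")
```

Since reports are sent every ten seconds, a timeout a little longer than that is a reasonable sign that pmtr is not reporting to this address.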
The listen on option allows jobs to be remotely enabled or disabled. It
specifies a UDP address and port on which pmtr should listen for datagrams of
the form enable abc or disable abc, where abc is a job name. The address
0.0.0.0 can be used as a shortcut to denote "any address" on this system. The
effect is temporary; the settings in pmtr.conf resume precedence when it’s
edited or pmtr is restarted.
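A control datagram of this form can be sent from any UDP client. The Python sketch below assumes the listen on udp://0.0.0.0:10000 setting from the example above and sends the text without a trailing newline, which is also an assumption; send_control is a hypothetical helper name.

```python
import socket


def send_control(action, job_name, host="127.0.0.1", port=10000):
    """Send an "enable <job>" or "disable <job>" datagram; return the bytes sent."""
    if action not in ("enable", "disable"):
        raise ValueError(f"action must be 'enable' or 'disable', got {action!r}")
    message = f"{action} {job_name}".encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))
    return message
```

For example, send_control("disable", "abc") sends the datagram disable abc to port 10000 on the local host. UDP gives no acknowledgment, so the job's state would need to be confirmed another way, such as through the pmtr logs.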
These options are considered experimental and may be replaced or removed.