Comments
-
Greg Hurrell
Relevant docs here:
You can check the exit status of a program or a script. This test may only be used within a program service entry in the Monit control file.
An example:
check program myscript with path "/usr/local/bin/myscript.sh"
    if status != 0 then alert
Monit will execute the program periodically and if the exit status of the program does not match the expected result, Monit can perform an action. In the example above, Monit will raise an alert if the exit value of myscript is different from 0. By convention, 0 means the program exited normally.
Program checks are asynchronous, meaning that Monit will not wait for the program to exit. Instead, Monit will start the program in the background and immediately continue checking the next service entry in monitrc. At the next cycle, Monit will check if the program has finished and, if so, collect the program's exit status. If the status indicates a failure, Monit will raise an alert message containing the program's error (stderr) output, if any. If the program has not exited after the first cycle, Monit will wait another cycle, and so on. If the program is still running after 5 minutes, Monit will kill it and generate a program timeout event. It is possible to override the default timeout (see the syntax below).
The asynchronous nature of the program check allows for non-blocking behavior in the current Monit design, but it comes with a side effect: when the program has finished executing and is waiting for Monit to collect the result, it becomes a so-called "zombie" process. A zombie process does not consume any system resources (only the PID remains in use) and it is under Monit's control; the zombie process is removed from the system as soon as Monit collects the exit status. This means that every "check program" will be associated with either a running process or a temporary zombie. This unwanted zombie side effect will be removed in a later release of Monit.
The syntax of the program status statement is:
IF STATUS operator value [TIMEOUT <N> SECONDS] [[<X>] <Y> CYCLES] THEN action [ELSE IF SUCCEEDED [[<X>] <Y> CYCLES] THEN action]
operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt", "eq", "ne" in shell (sh) notation, and "greater", "less", "equal", "notequal" in human-readable form (if not specified, the default is EQUAL).
action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or "UNMONITOR".
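To make the exit-status convention from the quoted docs concrete, here is a minimal sketch of what a script like myscript.sh might contain. The check_marker helper and the idea of testing for a marker file are assumptions made up for illustration; they are not part of Monit or of the setup described in this issue.

```shell
#!/bin/sh
# Hypothetical health check for use with "check program".
# check_marker is an illustrative helper, not a Monit feature.
check_marker() {
    marker="$1"
    if [ -e "$marker" ]; then
        return 0    # healthy: "if status != 0" does not match
    fi
    # Per the quoted docs, stderr output is included in the alert message.
    echo "marker $marker is missing" >&2
    return 1        # unhealthy: a non-zero status triggers the action
}
```

A real script would call the function as its last command, so that the function's return value becomes the exit status that Monit collects on a later cycle.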
-
Greg Hurrell
Just happened again (had to kill off some stale processes manually; xinetd was refusing to launch any more as the limit had been reached).
Could also look at using nagios, although it may be overkill for this use case.
-
anonymous
Not working. I tried many times.
-
Greg Hurrell
Just happened again (issue #1981).
-
Greg Hurrell
I've turned off the old xinetd-based management and replaced it with monit running directly from /etc/inittab and set to manage Git using this init.d script. If a connection to port 9418 is refused, the daemon will be restarted. Checks happen every 30 seconds.
-
Greg Hurrell
Status changed:
- From: new
- To: closed
-
Greg Hurrell
Status changed:
- From: closed
- To: open
-
Greg Hurrell
I'm reopening this one. It still happens from time to time. Monitoring on port 9418 is not enough; I think I'll need to use a script that attempts to perform a successful clone of a repo, and if that fails, be prepared to kill -9 the dead git-daemon process which is holding onto the port but not serving traffic.
-
Greg Hurrell
And it just happened again...
This time I noticed not only a bunch of stale git-daemon processes lingering around, but also git-upload-pack ones. For both of these, a sudo pkill git-daemon / sudo pkill git-upload-pack only killed off some of them; the others needed the -9 treatment as well. Overall, the sequence looks like:
$ sudo monit stop git
$ sudo pkill git-daemon
$ sudo pkill -9 git-daemon
$ sudo pkill git-upload-pack
$ sudo pkill -9 git-upload-pack
$ sudo monit stop git
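The "ask nicely, then use -9" pattern in that sequence could be wrapped in a small helper, roughly like the sketch below. The function name escalate_kill and the grace period argument are made up for illustration, and the sketch works on a list of PIDs (for example from pgrep) rather than calling pkill directly.

```shell
#!/bin/sh
# escalate_kill GRACE PID...
# Send SIGTERM to every PID, wait GRACE seconds, then SIGKILL any survivors.
# Hypothetical helper; the name and arguments are assumptions.
escalate_kill() {
    grace="$1"
    shift
    [ "$#" -gt 0 ] || return 0              # no PIDs given, nothing to do
    kill -TERM "$@" 2>/dev/null || true     # polite request first
    sleep "$grace"
    for pid in "$@"; do
        # kill -0 only tests whether the PID can still be signalled
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
    return 0
}

# Possible use, run as root (assumes pgrep is available):
#   escalate_kill 5 $(pgrep git-daemon) $(pgrep git-upload-pack)
```

Note that kill -0 also succeeds for a zombie that has not been reaped yet, so a process may receive a redundant SIGKILL; that is harmless.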
-
Greg Hurrell
Looking at using something hacky like:
check program git-clone with path "/bin/sh -c 'cd $(/bin/mktemp -d) && /usr/bin/git clone --bare --depth 1 --quiet --single-branch git://git.wincent.dev/wincent.git'"
    if status != 0 then exec "/usr/bin/pkill git-daemon && /usr/bin/pkill git-upload-pack && sleep 5 && /usr/bin/pkill -9 git-daemon && /usr/bin/pkill -9 git-upload-pack"
Doesn't work though:
/etc/monit.d/git:7: Error: syntax error 'git-clone'
So either I am doing it wrong or my version of monit is too ancient to support this.
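Independently of the syntax error, the clone test itself could live in a small script instead of a quoted one-liner, which would keep the Monit entry down to a plain path. The following is only a sketch: clone_check is a hypothetical name, and the git flags are the ones from the one-liner above.

```shell
#!/bin/sh
# clone_check URL
# Returns 0 if a shallow bare clone of URL succeeds, non-zero otherwise.
# Hypothetical helper; the name is an assumption for illustration.
clone_check() {
    url="$1"
    workdir=$(mktemp -d) || return 2
    status=0
    git clone --bare --depth 1 --quiet --single-branch \
        "$url" "$workdir/probe.git" || status=$?
    rm -rf "$workdir"    # clean up so repeated checks do not fill the temp dir
    return "$status"
}

# Possible use as the last line of a check script:
#   clone_check git://git.wincent.dev/wincent.git
```

Unlike the one-liner, this removes the temporary directory after every run. A clone that hangs against a stuck daemon would still depend on Monit's program timeout (5 minutes by default, per the docs quoted at the top) to be cut off.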
-
Greg Hurrell
And there it is:
the "check program" requires monit 5.3 or newer
But I am on an older version:
$ yum list installed|grep monit
monit.x86_64 5.2.5-3.11.amzn1 @amzn-main
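A check like this could be scripted before relying on "check program". The sketch below compares two dotted version strings; version_at_least is a hypothetical helper, and it assumes a sort that supports -V (GNU coreutils does).

```shell
#!/bin/sh
# version_at_least HAVE NEED
# Succeeds if version HAVE is greater than or equal to version NEED.
# Hypothetical helper; relies on "sort -V", which is not in POSIX.
version_at_least() {
    have="$1"
    need="$2"
    # After a version sort, the first line is the lower of the two versions.
    lowest=$(printf '%s\n%s\n' "$have" "$need" | sort -V | head -n 1)
    [ "$lowest" = "$need" ]
}

# With the installed version from above (release suffix dropped):
#   version_at_least 5.2.5 5.3 || echo "monit too old for check program"
```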