I have a server monitored by Watchdog that occasionally reboots because of faulty network hardware I cannot replace at the moment. From what I have read, Watchdog sends SIGTERM to all processes to request a clean shutdown and, after a short delay, sends SIGKILL, which stops them immediately. In my case this leads to data corruption, because the one process that matters has not finished shutting down and still has unwritten data when the SIGKILL arrives.
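For illustration, this is roughly what that sequence looks like from a process's side. It is a minimal Python sketch, not the actual service on the server: the names `GracefulWriter` and `demo` are made up for the example, and it assumes the flush can finish before SIGKILL arrives, which is exactly the window in question.

```python
import os
import signal
import tempfile
import time


class GracefulWriter:
    """Hypothetical writer that buffers records and flushes them on SIGTERM."""

    def __init__(self, path):
        self.path = path
        self.pending = []
        self.stop_requested = False
        # SIGTERM can be caught; SIGKILL cannot, so this is the only chance to save.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Keep the handler tiny: only set a flag, do the I/O in the main loop.
        self.stop_requested = True

    def write(self, record):
        self.pending.append(record)

    def flush(self):
        with open(self.path, "a") as f:
            f.writelines(record + "\n" for record in self.pending)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before exiting
        self.pending.clear()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "data.log")
    writer = GracefulWriter(path)
    writer.write("record 1")
    writer.write("record 2")

    # Simulate the SIGTERM that would be sent to every process.
    os.kill(os.getpid(), signal.SIGTERM)
    while not writer.stop_requested:
        time.sleep(0.01)

    # Everything from here on must complete before SIGKILL is sent.
    writer.flush()
    with open(path) as f:
        return f.read().splitlines()
```

If the flush takes longer than the pause between the two signals, the process is killed partway through the write, which matches the corruption described above.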
How long does Watchdog wait between asking all processes to stop and forcing them to stop? Is the delay hardwired in Watchdog, set in watchdog.conf (if so, it is not documented in the man page), or taken from another system setting? How can I change it?
Edit: I've found the timeout, but I am still looking for instructions on how to rebuild Watchdog and integrate it with the system properly.