Time for some ghetto monitoring

If you came here between April 28 and about an hour ago, you got a “couldn’t connect to database” error. Oops! It seems my memory-limited EC2 instance got a little overwhelmed by PHP processes and decided that the db server, drizzled, should die to make more room for PHP. Oops! Time to lower pm.max_children.
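For picking a lower pm.max_children, a common rule of thumb (my assumption here, not something measured on this box) is to divide the memory you can spare for PHP by the average size of one PHP-FPM worker. A minimal sketch, where the function name and the numbers are hypothetical:

```shell
#!/bin/sh
# Rough sizing heuristic (an assumption, not a measured result):
#   max_children = memory budget for PHP (MB) / average memory per worker (MB)
# estimate_max_children is a hypothetical helper name.
estimate_max_children() {
    budget_mb="$1"       # MB left for PHP after the db server and OS take theirs
    per_worker_mb="$2"   # average resident size of one PHP-FPM worker, in MB

    if [ "$per_worker_mb" -le 0 ]; then
        echo "per-worker memory must be positive" >&2
        return 1
    fi

    # Integer division rounds down, which errs on the safe side.
    n=$((budget_mb / per_worker_mb))

    # Never suggest fewer than one worker.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example with made-up numbers: 300 MB spare, 40 MB per worker.
estimate_max_children 300 40
```

With those made-up inputs the sketch prints 7. The real inputs have to come from watching the actual workers under load, so treat the result as a starting point rather than a guarantee.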

I don’t have any monitoring set up for the site, so I only just figured it out. Until I get proper monitoring, I’ve installed this fancy bit of duct-tape upstart magic:

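# Run once (as a task) whenever any job emits the "stopping" event.
start on stopping
task
script
    # Mail the event's environment; $JOB names the job that is stopping.
    env | mail -s "$JOB is stopping!" me@myemail.com
end script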

What does this do? Well, it emails me whenever upstart gives up respawning something, or when I manually stop a service.
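The email body is just the output of env, so telling "gave up respawning" apart from "I stopped it myself" means reading the variables by hand. As far as I know, upstart's stopping event sets RESULT (ok or failed) and, on failure, PROCESS plus EXIT_STATUS or EXIT_SIGNAL alongside JOB. Assuming that holds, a small helper could put the distinction in the subject line. The function name describe_stop is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: summarize an upstart "stopping" event from its
# environment. Assumes upstart sets JOB and RESULT, and on failure also
# PROCESS and either EXIT_STATUS or EXIT_SIGNAL.
describe_stop() {
    job="${JOB:-unknown}"
    result="${RESULT:-unknown}"

    if [ "$result" = "failed" ]; then
        # Prefer the exit status; fall back to the signal, then to "?".
        detail="process=${PROCESS:-?} status=${EXIT_STATUS:-${EXIT_SIGNAL:-?}}"
        echo "$job stopped after a failure ($detail)"
    else
        echo "$job stopped (result: $result)"
    fi
}
```

Inside the script stanza, the mail line could then become `env | mail -s "$(describe_stop)" me@myemail.com`, with the function defined just above it in the same stanza.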

It’s not monitoring. I need monitoring. But this is a nice little hack to prevent a regression while I figure that out.

5 thoughts on “Time for some ghetto monitoring”

  1. Psst… you can use ‘monit’ for this sort of thing almost as easily as your little shell script.

  2. True, stephane, I could use a number of different process managers. What I used isn’t just a shell script; it’s tied directly into upstart, which is built into Ubuntu, so I didn’t have to set up anything new to get my little hack in place.

  3. How is this a regression if you didn’t have monitoring before?
