[okfn-help] backup, monitoring, etc

Rufus Pollock rufus.pollock at okfn.org
Thu Nov 5 19:33:12 GMT 2009


2009/11/5 James Casbon <casbon at gmail.com>:
> 2009/11/4 Rufus Pollock <rufus.pollock at okfn.org>:
>> 2009/11/4 James Casbon <casbon at gmail.com>:
>>> For monitoring, I want to get munin back running -
>>> http://knowledgeforge.net/okfn/tasks/ticket/133
>>> Munin is running but munin.okfn.org is not resolving.  Who has the DNS details?
>>
>> DNS for all our stuff is run off http://www.everydns.net/ -- see
>> <http://knowledgeforge.net/okfn/tasks/wiki/SystemOrganization#DNS>
>>
>>> Also, I think it is worth discussing what is happening for backup.
>>>
>>> We now have a few scripts running on eu0 that can back things up
>>> easily enough. Look at /etc/cron.daily/daily_backup_snapshot_eu1.  You
>>> can adjust the hostname, target directory and begin backing up another
>>> host.
>
> Now munin is back, we can see the effect of the backups on eu0 (about 6am):
> http://munin.okfn.org/okfn.org/eu0.okfn.org-cpu.html
>
> Now you can see there is a shed load of iowait going on there when the
> backup happens.  But interestingly there is a lot of iowait all the
> time - or more than I have seen with other hosts.

My guess is that part of this is because it is quite a busy machine,
so the backup interacts badly with the fact that Apache is still
trying to serve requests -- but this is just guessing.

> Is this host running off tape or something?  Anyway I really don't
> think it is an appropriate backup host!
> Is it possible there is some disk misconfiguration going on here?

One possibility is that openshakespeare.org does a lot of reading off
disk (we could fix this by doing more caching). However, without a
better fix on the cause we'll be stumbling in the dark.
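
One way to get harder numbers than the munin graph would be to sample
iowait directly on eu0, once during the backup window and once
outside it. The following is only a rough sketch, not something that
is installed anywhere: it assumes a Linux kernel whose /proc/stat
aggregate "cpu" line has iowait as its fifth counter, and the function
names and the 5 second interval are made up for illustration.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Sum at most the first eight counters (user .. steal); the guest
    # counters that follow are already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is iowait (assumption: kernel exposes at least 5 counters).
    return 100.0 * deltas[4] / total


def read_cpu_sample(path="/proc/stat"):
    """Read the first line of /proc/stat and parse it."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    first = read_cpu_sample()
    time.sleep(5)  # hypothetical sampling interval
    second = read_cpu_sample()
    print("iowait: %.1f%%" % iowait_percent(first, second))
```

If the figure stays high well away from 6am, that would point at
something other than the backup, such as the disk reads mentioned
above.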

Rufus


