Out of disk space after only a week?

clauretano
Posts: 2
Joined: Tue Jan 19, 2010 1:12 pm

Post by clauretano »

I figured the machine was 8GB because that was sufficient, but boy was I wrong.

***** Nagios XI Alert *****
Notification Type: PROBLEM
Service: Root Partition
Host: localhost
Address: 127.0.0.1
State: WARNING
Info: DISK WARNING - free space: / 1355 MB (20% inode=93%)

Since it's a VM on a vSphere cluster and since the XI appliance uses LVM, it shouldn't be too much trouble to extend it, or so I thought.

Here are the steps I went through, including where I think I went wrong:
1. Shut down the appliance.
2. Edit the config in the vSphere Client. I expanded the disk from 8GB to 64GB.
3. Boot up the appliance and partition the new free space at the end of the disk as a primary partition of type Linux LVM (8e).
4. Expand VolGroup00 to include the new partition (/dev/sda3 in my case).
5. Expand the logical volume VolGroup00-LogVol00 to fill the empty space in the volume group.
6. Expand the ext3 filesystem to fill the new space in the logical volume (this is where I messed it up).
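
For anyone following along, steps 4 to 6 above can be sketched roughly as follows. This is a dry-run illustration, not necessarily the exact commands used in this post: the device and volume names are taken from the post, while the `run` and `extend_root_lv` helpers, the `APPLY` switch, and the explicit `pvcreate` step are assumptions added for the example. Nothing is executed unless `APPLY=1` is set.

```shell
#!/bin/sh
# Dry-run sketch of steps 4-6. Names come from the post; adjust for your system.
DISK_PART=${DISK_PART:-/dev/sda3}
VG=${VG:-VolGroup00}
LV=${LV:-LogVol00}

# Hypothetical helper: print the command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

extend_root_lv() {
    # Step 3 (creating the type 8e partition with fdisk) is interactive and omitted.
    run pvcreate "$DISK_PART"                 # initialise the new partition as a physical volume
    run vgextend "$VG" "$DISK_PART"           # step 4: add it to the volume group
    run lvextend -l +100%FREE "/dev/$VG/$LV"  # step 5: grow the LV into the free extents
    run resize2fs "/dev/mapper/$VG-$LV"       # step 6: grow ext3 to fill the LV
}

extend_root_lv
```

Run as-is, it only prints the four commands it would execute, which makes it easy to double-check the names before setting `APPLY=1`.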

I issued the command "resize2fs /dev/mapper/VolGroup00-LogVol00", and this is the output I got:

Code:

[root@nagiosxi ~]# resize2fs /dev/mapper/VolGroup00-LogVol00
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/mapper/VolGroup00-LogVol00 is mounted on /; on-line resizing required
Performing an on-line resize of /dev/mapper/VolGroup00-LogVol00 to 16424960 (4k) blocks.

It has been quite a while, which is why I'm worried that it has failed. Monitoring CPU, memory, and disk usage in vSphere, I can see that activity peaked for a few minutes, but it is basically flatlining now: there has been zero disk activity for the last 30 minutes.

I did try searching the wiki for info on this topic before I proceeded, but the wiki seems to be pretty much empty. I knew I should have just rolled my own Nagios box, but mgmt likes to see support contracts, "enterprise", ajaxy interfaces and pretty pictures, all of which seem to be covered by Nagios XI.
admin
Site Admin
Posts: 256
Joined: Mon Oct 12, 2009 8:21 am

Re: Out of disk space after only a week?

Post by admin »

Strange that you ran out of space that quickly. How many hosts/services are you monitoring? Are there a lot of passive service checks?

In order to see where the space might be getting used, try running the 'du -hs' command on four directories like so:

Code:

cd /usr/local/nagios
du -hs *
cd /usr/local/nagiosxi
du -hs *
cd /var/lib/mysql
du -hs *
cd /var/lib/pgsql
du -hs *

If there are large numbers somewhere in the output, you can start digging deeper into the offending directory to track down the source. It could be database size, performance graphs, or Nagios event logs that are eating up the space.
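
To make the large entries stand out, the same check can be sketched as a small loop that sorts each directory's contents by size. This is only an illustration: the `largest_entries` helper name and the default of five entries are assumptions, and the four paths are the ones listed above.

```shell
#!/bin/sh
# List the largest entries in a directory, biggest first.
# du -sk reports sizes in KiB so that sort -n can order them.
largest_entries() {
    dir=$1
    count=${2:-5}
    [ -d "$dir" ] || return 0   # silently skip directories that do not exist
    ( cd "$dir" && du -sk -- * 2>/dev/null | sort -rn | head -n "$count" )
}

for d in /usr/local/nagios /usr/local/nagiosxi /var/lib/mysql /var/lib/pgsql; do
    echo "== $d"
    largest_entries "$d"
done
```

Calling `largest_entries` again on whichever entry tops the list is one way to keep digging deeper.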

BTW, thanks for posting the notes on how you expanded the disk. I'm sure others will find the information you posted most useful if they need to upsize their drives in the future.

Ethan Galstad
President
rseiwert
Posts: 196
Joined: Wed Jun 22, 2011 10:33 pm
Location: Somewhere between Here and Now

Re: Out of disk space after only a week?

Post by rseiwert »

I know this is a year later, but I have the same problem. The thing that everyone seemed to miss about the original post is that the disk is 20% full but the inode list is 97% full. Expanding the disk will have no effect. Expanding the file system with mkfs should, but the underlying issue is lots of temp files not being cleaned up.

To fix this problem:
Remove unnecessary (old, temporary, core, or log) files from the filesystem.
Determine whether the filesystem contains a large number of small files.

The initial allocation of inodes assumes a ratio of about four data blocks per inode. If the filesystem contains mostly files that are smaller than four blocks, it runs out of inodes.
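
As a rough worked example of that ratio, the sketch below estimates an inode count from a block count. It assumes exactly four blocks per inode, as described above; the real ratio is fixed when the filesystem is created and may differ. The `estimate_inodes` helper is hypothetical, and the block count is the one resize2fs reported earlier in this thread.

```shell
#!/bin/sh
# Rough inode estimate from a block count, assuming about four data blocks
# per inode. Actual values depend on the options used at mkfs time.
estimate_inodes() {
    blocks=$1                 # filesystem size in blocks
    blocks_per_inode=${2:-4}  # assumption taken from the post above
    echo $(( blocks / blocks_per_inode ))
}

# 16424960 is the 4k block count from the resize2fs output earlier in the thread.
estimate_inodes 16424960
```

Under that assumption the resized filesystem would have roughly 4.1 million inodes; comparing an estimate like this with what `df -i` actually reports shows how much headroom is left.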
Grumpy Olde IT Guy
rseiwert
Posts: 196
Joined: Wed Jun 22, 2011 10:33 pm
Location: Somewhere between Here and Now

Re: Out of disk space after only a week?

Post by rseiwert »

BTW, here are the top inode counts under /usr/local/nagios:

181171 ./var/spool/perfdata
1254 ./share/images/logos
168 ./var/archives
124 ./libexec
106 ./share/docs/images
95 ./share/perfdata

Is there supposed to be a log rotator or cleanup process for these?
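
For reference, a per-directory file count like the one above could be produced along these lines. This is a sketch, not necessarily how the listing was generated: the `count_files_by_dir` helper is hypothetical, it counts regular files only, and it assumes no file names contain newlines.

```shell
#!/bin/sh
# Count regular files per directory under a base path, largest first.
count_files_by_dir() {
    base=${1:-.}
    limit=${2:-10}
    (
        cd "$base" || exit 1
        # Strip the file name from each path, leaving its directory,
        # then tally how often each directory appears.
        find . -xdev -type f | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -n "$limit"
    )
}

# Example: count_files_by_dir /usr/local/nagios
```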
Grumpy Olde IT Guy
rseiwert
Posts: 196
Joined: Wed Jun 22, 2011 10:33 pm
Location: Somewhere between Here and Now

Re: Out of disk space after only a week?

Post by rseiwert »

Also, can these be deleted? If so, how do you delete 181,000 files from a directory? Surely rm -f * will not work, as the expansion would be too long.
Grumpy Olde IT Guy
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Out of disk space after only a week?

Post by mguthrie »

If there are 181171 files, then you might have a permissions issue somewhere. Those files are supposed to be cleaned up automatically by PNP (the performance grapher). What are the permissions on that directory? The files are supposed to be "reaped" every 15 seconds, the results processed and dumped to the rrd files, and then the files removed. How are your performance graphs? ; )

Code:

service npcd status
service npcd restart

You can delete those files, but it's hard to say whether that performance data has already been processed into the rrd files for the performance graphs. If we fix whatever issue is preventing them from being deleted, they will get processed OK, but your CPU load is going to be pretty high until they're all completed.
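
If deleting is the route taken, one way to avoid the shell's argument-list limit is to let find remove the files itself instead of expanding a glob. The sketch below is a minimal illustration: the `purge_files` helper is hypothetical, and `-maxdepth` and `-delete` are extensions available in GNU and BSD find. Keep in mind the caveat above that any performance data not yet processed will be lost.

```shell
#!/bin/sh
# Remove regular files directly inside a directory without expanding a glob.
# Subdirectories and their contents are left alone.
purge_files() {
    dir=$1
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -maxdepth 1 -type f -delete
}

# Example (spool path from the listing earlier in the thread):
# purge_files /usr/local/nagios/var/spool/perfdata
```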
rseiwert
Posts: 196
Joined: Wed Jun 22, 2011 10:33 pm
Location: Somewhere between Here and Now

Re: Out of disk space after only a week?

Post by rseiwert »

This is a problem that's been going on for awhile. I
Grumpy Olde IT Guy
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: Out of disk space after only a week?

Post by mguthrie »

I'm thinking there was supposed to be more to that message ; )