PostgreSQL: performance descrease due to index bloatper

performancepostgresql

I'm running a PgSQL 8.1 on a CentOS 4.4 system (not upgradable unfortunately).
There's a Java app running on top of the PgSQL daemon and we got to reindex the database every 2 months or so. Also important: the database isn't growing.

It looks like the bloat is now coming faster than before and this tends to increase.

My config is available here, autovacuum daemon is enabled and running quite often:
pastebin.com/RytNj7dK

You can also find the output of this query wiki.postgresql.org/wiki/Show_database_bloat

3 hours after running reindex: http://pastebin.com/raw.php?i=75fybKyd
72 hours after running reindex: http://pastebin.com/raw.php?i=89VKd7PC

Does anyone have any idea what should I tweak to get rid of that growing bloat?

Thanks for your help

PS: due to antispam prevention system, I had to remove the first 2 http:// prefixes for my two first links.

Best Answer

Unfortunately if you have an active database (one with lots of inserts/updates/deletes) you will experience index bloat -- It's just a fact of database life. The best you can do is hope to slow the bloat down to the point where your reindex intervals are reasonable.

The best advice I can give you in that regard is to upgrade to a newer version of Postgres (8.3 or later): This is when Postgres introduced Heap-Only Tuples support.
Right now on your (8.1) system ANY update to a row is the equivalent of a delete/insert as far as the index is concerned, hence the index bloat. 8.3 and later don't touch the index unless they have to ("If the row still fits on the page it's in").

After upgrading to a version of Postgres that with HOT support you may still experience index bloat if your UPDATEs are touching indexed columns, or if your UPDATEs substantially increase the size of a row so that it has to be moved to a new page, but these situations should be relatively infrequent provided your indexing strategy is sane and your rows are relatively static in size, so the index bloat issue should be less of a problem.


Some additional general strategies for dealing with index bloat:

  1. Primary Key Indexes
    You'r pretty much out of luck here - you need to REINDEX and take the table lock.
  2. Other Indexes
    • Option 1: DROP and re-CREATE Non-Critical Indexes
      This has the advantage of not locking the table, but the disadvantage of taking away the index while it's being rebuilt.
    • Option 2: Index-Shuffle for Critical Indexes
      Rather than the Drop/Create procedure above, first create a new index, then when that is done drop the old one and rename the new one to take its place.
      This has the advantage of not locking the table, and the advantage of leaving the original index working (though bloated). The primary disadvantage is that you've got to rename an index to keep naming conventions sane -- an extra manual step.