Your database may be silently headed for trouble. PostgreSQL, like all modern MVCC-based relational databases, is subject to what’s called “bloat.” The consequences range from slowdowns and wasted space to transaction id wraparound – aka the “night of the living dead” when deleted rows come back to life.
Bloat is the disk space claimed by unneeded data rows and indices. Quinn Weaver
from Postgres Experts
explains the causes and consequences of bloat as well as its prevention and steps for emergency intervention.
Why do Postgres databases get bloat?
- Couldn’t you just not do that vacuum thing? It causes so many problems!
Vacuuming is necessary for MVCC systems like Postgres
- Multi-version concurrency control
- When you delete a row it doesn’t really go away, instead it becomes invisible to future transactions
- Transactions started before the deletion continue to see the row
- Likewise an update is just a delete followed by insert
The transaction visibility of every row is stored in the hidden (but accessible)
xmax` columns on every table
- Example of MVCC
What’s good about MVCC?
- It’s efficient
- In particular it avoids the locking found in earlier relational databases
- Works with a large number of concurrent transactions
- It’s now in use by all relational databases
What’s bad about MVCC?
- Rows persist even when they are no longer visible to any current or future transaction
- That is what we call bloat
- Indices also get bloat
- Leaf nodes point at dead rows, and inner nodes point at those superfluous leaf nodes, etc etc
- You can’t actually remove an index pointer until its entire page stops pointing at rows
- Hence indices tend to stay bloated more than tables do
What are the consequences of bloat?
- It fills up pages (you have to read more pages into memory from disk)
- Slows queries down
- Have to keep more things in memory, some of which is junk
- Bloat can in fact push the working set to be greater than ram, and you spill to disk and life sucks
- Bloated indices take more steps to traverse
- They can get really big
What can you do about bloat?
- Something needs to garbage collect dead rows
- That something is the auto-vacuum daemon
- It’s been around for about eleven years
- Usually its default settings do a pretty good job
- But if you have an especially high-traffic site it might not keep up
How do you know when bloat is happening?
- You don’t want to prematurely optimize
- Scripts at pgexperts/pgx_scripts
- Example output from the table an index bloat scripts
- Even large absolute values of bloat can be OK as long as the percentage is stable
- Speed of bloat growth is important
- Run bloat queries on a cron job and review it weekly to review the direction of change
- If you’re measuring increasing bloat then auto-vacuum is not keeping up
- You can tune auto-vacuum parameters to stay on top of bloat but it won’t save you if you’re too far gone already
If things are bad, try pg_repack
- It copies love rows to a new table, then does a quick switch in the system catalogs to make this new table take the place of the old
- It’s way better than the built-in alternative, VACUUM FULL, because repack does not require access-exclusive locks
does require an access exclusive lock during the quick switch though
- If it cannot get the lock it starts canceling queries or killing back-ends
- Repack also takes a lot of space (a duplicate table’s worth)
Another approach is flexible freeze
by PG Experts. It’s a vacuum, but it’s an extra aggressive vacuum
- It’s designed to be run as a cron job during lulls in traffic to start to catch up on bloat debt
- You can force flexible freeze to cancel its work but it will have to start from scratch next time you run it
Dealing with bloated indices
Flexible freeze does not fix indices like
- One fix is to build an index that is identical to the bloated one and then drop the bloated on (you can do it all in a single transaction)
will take a lock on your table, but you can also run
create index concurrently
which locks only ddl
- However you’ll want to check that the index built correctly
In psql run
and look at the indices for any marked “invalid.”
- You can programmatically check for invalid indices as well
- The other limitation of the index swap method is that you can’t use it for primary key indices
- Flexible freeze does not fix indices like
Preventing future bloat
- Auto-vacuum and its settings
is number of concurrent vacuum processes you can have running
- Certainly safe to raise to 5 or 6 on most databases, higher for really big databases
- Each worker gets maintenance_work_mem amount of memory, and generally a sixteenth of available ram is appropriate
- The default is 16mb which can be seriously too low!
- But if you double the number of workers to six then use an eighth of your ram instead
- How often is auto-vacuum run?
Starts for every n rows written where n =
for table] +
For huge tables the default value of
= 0.1 will lead to serious bloat situations
- For huge tables set the scale factor to 0.02 (you can set it per-table) and crank up the threshold a little
- There are analogous settings for the auto-analyze daemon and its settings are quite cheap to raise
- But not need to raise them unless you’re noticing bad query planning
Transaction ID (XID) overflow
are monotonically increasing
- The underlying number is a 32-bit int
- If it overflows then Postgres cannot tell the past from the future and deleted rows would come back from the dead!
- Total corruption
- It’s so bad that Postgres will shut down to prevent it from happening and will not start up until you manually address the problem
- The moral of the story is don’t turn auto-vacuum off
searches for rows so old that no transaction can see them and replaces their
with a magic value
- This buys you time
You can measure how close you are to XID wraparound using some scripts
- Vacuum freeze parameter overview
See Josh Berkus’ article
for more info