Tales From A Lazy Fat DBA

Fan of Oracle DB & Performance, PostgreSQL, Cassandra & much more … \,,/

  • Likes

    • 278,721
  • Archives

  • ΰ₯

  • Categories

  • Subscribe

  • Prashant Dixit is the FatDBA

  • Follow Tales From A Lazy Fat DBA on WordPress.com
  • My Twitter Feeds

  • Disclaimer!

    FatDBA or Oracle ‘Ant’ is an independent web-blog/site.The experiences, Test cases, views, and opinions expressed in this website are my own and does not reflect the views or opinions of my employer.

    This site is independent of and does not represent Oracle Corporation in any way. Oracle does not officially sponsor, approve, or endorse this site or its content.
    Product and company names mentioned in this website may be the trademarks of their respective owners.

How to simulate the deadly, infamous, misunderstood & complex ‘Transaction Wraparound Problem’ in PostgreSQL …

Posted by FatDBA on July 20, 2021

Hi Guys,

Last week one of my good friend asked me an interesting question – ‘How to replicate the transaction wraparound problem’ in PostgreSQL ?

He has a demo/POC that he needs to do for one of his customer who is interested to see how it looks like in reality and how ‘deadly’ is the problem and how to fix this. He scanned almost everything on the Google, but unfortunately found nothing available or pre-cooked …

So, this might be the only post available on the internet explaining how to exactly simulate the transaction wraparound in the PostgreSQL Database. This one was really interesting and I have never tried this before as I thought I will never see transaction ID wraparound error πŸ™‚ . So, before I jump in to the demo, we need to take care of few things. Use a system with good resources, I have performed this demo on an EDB PostgreSQL 11.11.20 database with 8 CPUs, 31 GB RAM, and it took me around 9 hours to exhaust all transactionIDs and reach the limit of 2,000,000,000.

Would like to thank Nikolay Samokhvalov for his help! πŸ™‚

Before I start, let’s do this one more time – What is a transaction ID and why it wrapsaround ?
The PostgreSQL transaction control mechanism assigns a transaction ID to every row that is modified in the database; these IDs control the visibility of that row to other concurrent transactions. Txid wraparound is a problem due to MVCC. MVCC relies on being able to take the txids of two transactions and determine which of the transactions came first.
A PostgreSQL database can have two billion ‘in-flight’ unvacuumed transactions before PostgreSQL takes dramatic action to avoid data loss. If the number of unvacuumed transactions reaches (2^31 – 10,000,000), the log starts warning that vacuuming is needed. If the number of unvacuumed transactions reaches 2,000,000,000, PostgreSQL sets the database to read-only mode and requires an offline, single-user, standalone vacuum.

Note: You have to carefully monitor your database pg_log, archiving locations as this activity will generate huge WALs. At the same time, keep an eye on the size of your log directory, specially if you are capturing all statements.

We will open multiple parallel sessions to simulate the workload and monitor the progress.

SESSION 1:

In the first session, we will first connect with the database and set idle_in_transaction_session_timeout to value 0 and start a BEGIN block and execute select txid_current(); this is to block autovacuum, as simply turning it off wouldn’t help because emegency mode kick in even with `autovacuum = off.

The idle_in_transaction_session_timeout parameter terminate any session with an open transaction that has been idle for longer than the specified duration in milliseconds. This allows any locks held by that session to be released and the connection slot to be reused; it also allows tuples visible only to this transaction to be vacuumed. So, we will have to set it to value 0 that will disable this feature.

-bash-4.2$ psql -p 5444
psql.bin (11.11.20)
Type "help" for help.

enterprisedb=# SET idle_in_transaction_session_timeout = 0;
SET
enterprisedb=#  BEGIN;
BEGIN
enterprisedb=# select txid_current();
 txid_current
--------------
         1266

enterprisedb=#
enterprisedb=#

SESSION 2:

Now, I will create a file with current transaction ID written in to it and will call the pgbench. I am doing something like below to call pgbench, below are the explanations on flags used and why I used them ..

pgbench -U enterprisedb enterprisedb -f wl.sql -c8 -j8 -T100000 -rn -P60

-f: to read from a file
-c8: 8 clients
-j8: 8 threads/jobs (adjust it as per your CPU count)
-T100000: seconds to run the test for this many seconds, rather than a fixed number of transactions per client
-rn: where r is to report the average per-statement latency (execution time from the perspective of the client) of each command after the benchmark finishes. Whereas n is to perform no vacuuming before running the test. This option is necessary if you are running a custom test scenario that does not include the standard tables pgbench_accounts, pgbench_branches, pgbench_history, and pgbench_tellers.
-P60: is to show progress report every sec seconds

-bash-4.2$ 
-bash-4.2$ echo 'select txid_current();' > wl.sql
-bash-4.2$ 
-bash-4.2$
-bash-4.2$ /usr/edb/as11/bin/pgbench -U enterprisedb enterprisedb -f wl.sql -c8 -j8 -T100000 -rn -P60
progress: 60.0 s, 51097.6 tps, lat 0.157 ms stddev 0.262
progress: 120.0 s, 51171.1 tps, lat 0.156 ms stddev 0.252
progress: 180.0 s, 51622.8 tps, lat 0.155 ms stddev 0.250
progress: 240.0 s, 51328.9 tps, lat 0.156 ms stddev 0.260
progress: 300.0 s, 50976.6 tps, lat 0.157 ms stddev 0.256
progress: 360.0 s, 51130.3 tps, lat 0.156 ms stddev 0.265
progress: 420.0 s, 50792.6 tps, lat 0.158 ms stddev 0.254
progress: 480.0 s, 51207.2 tps, lat 0.156 ms stddev 0.250
progress: 540.0 s, 51162.6 tps, lat 0.156 ms stddev 0.251
progress: 600.0 s, 51246.0 tps, lat 0.156 ms stddev 0.256
progress: 660.0 s, 51047.3 tps, lat 0.157 ms stddev 0.253
progress: 720.0 s, 51197.3 tps, lat 0.156 ms stddev 0.252
progress: 780.0 s, 51119.2 tps, lat 0.156 ms stddev 0.281
progress: 840.0 s, 50938.8 tps, lat 0.157 ms stddev 0.252
progress: 900.0 s, 51252.8 tps, lat 0.156 ms stddev 0.251
progress: 960.0 s, 51054.0 tps, lat 0.157 ms stddev 0.253
progress: 1020.0 s, 51159.5 tps, lat 0.156 ms stddev 0.254
progress: 1080.0 s, 51116.0 tps, lat 0.157 ms stddev 0.253
progress: 1140.0 s, 50947.6 tps, lat 0.157 ms stddev 0.251
progress: 1200.0 s, 51289.7 tps, lat 0.156 ms stddev 0.254
progress: 1260.0 s, 50338.7 tps, lat 0.159 ms stddev 0.254
progress: 1320.0 s, 50121.1 tps, lat 0.160 ms stddev 0.260
progress: 1380.0 s, 49936.6 tps, lat 0.160 ms stddev 0.291
progress: 1440.0 s, 50451.1 tps, lat 0.159 ms stddev 0.251
progress: 1500.0 s, 51107.4 tps, lat 0.157 ms stddev 0.248
progress: 1560.0 s, 50907.8 tps, lat 0.157 ms stddev 0.321
progress: 1620.0 s, 51033.7 tps, lat 0.157 ms stddev 0.251
progress: 1680.0 s, 51013.6 tps, lat 0.157 ms stddev 0.254
progress: 1740.0 s, 51119.4 tps, lat 0.156 ms stddev 0.252
progress: 1800.0 s, 51411.9 tps, lat 0.156 ms stddev 0.249
progress: 1860.0 s, 51099.1 tps, lat 0.157 ms stddev 0.249
progress: 1920.0 s, 51527.5 tps, lat 0.155 ms stddev 0.256
progress: 1980.0 s, 51204.3 tps, lat 0.156 ms stddev 0.267
progress: 2040.0 s, 51363.8 tps, lat 0.156 ms stddev 0.249
progress: 2100.0 s, 51563.1 tps, lat 0.155 ms stddev 0.250
progress: 2160.0 s, 51329.3 tps, lat 0.156 ms stddev 0.246
progress: 2220.0 s, 51073.4 tps, lat 0.157 ms stddev 0.276
progress: 2280.0 s, 51402.0 tps, lat 0.156 ms stddev 0.248
progress: 2340.0 s, 51364.4 tps, lat 0.156 ms stddev 0.253
progress: 2400.0 s, 50890.0 tps, lat 0.157 ms stddev 0.251
progress: 2460.0 s, 50239.6 tps, lat 0.159 ms stddev 0.296
progress: 2520.0 s, 50684.8 tps, lat 0.158 ms stddev 0.283
progress: 2580.0 s, 49826.2 tps, lat 0.161 ms stddev 0.271
progress: 2640.0 s, 51076.7 tps, lat 0.157 ms stddev 0.250
progress: 2700.0 s, 51349.6 tps, lat 0.156 ms stddev 0.252
progress: 2760.0 s, 51076.6 tps, lat 0.157 ms stddev 0.253
progress: 2820.0 s, 50854.5 tps, lat 0.157 ms stddev 0.256
progress: 2880.0 s, 49864.8 tps, lat 0.160 ms stddev 0.260
progress: 2940.0 s, 50450.1 tps, lat 0.159 ms stddev 0.264
progress: 3000.0 s, 50334.7 tps, lat 0.159 ms stddev 0.264
progress: 3060.0 s, 50514.3 tps, lat 0.158 ms stddev 0.299
progress: 3120.0 s, 50508.7 tps, lat 0.158 ms stddev 0.284
progress: 3180.0 s, 50691.3 tps, lat 0.158 ms stddev 0.288
progress: 3240.0 s, 50601.6 tps, lat 0.158 ms stddev 0.256
progress: 3300.0 s, 50280.2 tps, lat 0.159 ms stddev 0.264
progress: 3360.0 s, 50457.6 tps, lat 0.159 ms stddev 0.265
progress: 3420.0 s, 50178.3 tps, lat 0.159 ms stddev 0.390
progress: 3480.0 s, 50353.3 tps, lat 0.159 ms stddev 0.264
progress: 3540.0 s, 50133.0 tps, lat 0.160 ms stddev 0.259
progress: 3600.0 s, 50217.6 tps, lat 0.159 ms stddev 0.282
progress: 3660.0 s, 50147.8 tps, lat 0.160 ms stddev 0.279
progress: 3720.0 s, 50578.0 tps, lat 0.158 ms stddev 0.261
progress: 3780.0 s, 50523.9 tps, lat 0.158 ms stddev 0.261
progress: 3840.0 s, 50349.7 tps, lat 0.159 ms stddev 0.269
progress: 3900.0 s, 50437.5 tps, lat 0.159 ms stddev 0.287
progress: 3960.0 s, 49569.1 tps, lat 0.161 ms stddev 0.311
progress: 4020.0 s, 50337.3 tps, lat 0.159 ms stddev 0.286
progress: 4080.0 s, 50065.2 tps, lat 0.160 ms stddev 0.260
progress: 4140.0 s, 49791.4 tps, lat 0.161 ms stddev 0.264
progress: 4200.0 s, 49681.3 tps, lat 0.161 ms stddev 0.326
progress: 4260.0 s, 48034.2 tps, lat 0.167 ms stddev 0.340
progress: 4320.0 s, 50460.9 tps, lat 0.159 ms stddev 0.290
progress: 4380.0 s, 49731.0 tps, lat 0.161 ms stddev 0.300
progress: 4440.0 s, 50441.1 tps, lat 0.159 ms stddev 0.266
progress: 4500.0 s, 50259.1 tps, lat 0.159 ms stddev 0.267
progress: 4560.0 s, 50197.0 tps, lat 0.159 ms stddev 0.259
progress: 4620.0 s, 49832.1 tps, lat 0.161 ms stddev 0.271
progress: 4680.0 s, 50160.3 tps, lat 0.159 ms stddev 0.268
progress: 4740.0 s, 50578.5 tps, lat 0.158 ms stddev 0.262
progress: 4800.0 s, 50260.5 tps, lat 0.159 ms stddev 0.270
progress: 4860.0 s, 50545.1 tps, lat 0.158 ms stddev 0.256
progress: 4920.0 s, 49961.8 tps, lat 0.160 ms stddev 0.285
progress: 4980.0 s, 50450.7 tps, lat 0.159 ms stddev 0.267
progress: 5040.0 s, 50008.7 tps, lat 0.160 ms stddev 0.257
progress: 5100.0 s, 49723.2 tps, lat 0.161 ms stddev 0.428
progress: 5160.0 s, 50211.1 tps, lat 0.159 ms stddev 0.378
progress: 5220.0 s, 49791.0 tps, lat 0.161 ms stddev 0.262
progress: 5280.0 s, 50091.5 tps, lat 0.160 ms stddev 0.265
progress: 5340.0 s, 50280.3 tps, lat 0.159 ms stddev 0.266
progress: 5400.0 s, 49957.1 tps, lat 0.160 ms stddev 0.402
progress: 5460.0 s, 50101.7 tps, lat 0.160 ms stddev 0.293
progress: 5520.0 s, 50387.3 tps, lat 0.159 ms stddev 0.261
progress: 5580.0 s, 50173.2 tps, lat 0.159 ms stddev 0.278
progress: 5640.0 s, 50225.1 tps, lat 0.159 ms stddev 0.266
progress: 5700.0 s, 50186.7 tps, lat 0.159 ms stddev 0.264
progress: 5760.0 s, 49120.1 tps, lat 0.163 ms stddev 0.264
progress: 5820.0 s, 49445.3 tps, lat 0.162 ms stddev 0.262
progress: 5880.0 s, 49547.2 tps, lat 0.161 ms stddev 0.260
progress: 5940.0 s, 49730.2 tps, lat 0.161 ms stddev 0.267
progress: 6000.0 s, 50543.3 tps, lat 0.158 ms stddev 0.258
progress: 6060.0 s, 50569.1 tps, lat 0.158 ms stddev 0.258
progress: 6120.0 s, 50560.4 tps, lat 0.158 ms stddev 0.250
progress: 6180.0 s, 50572.1 tps, lat 0.158 ms stddev 0.255
progress: 6240.0 s, 50551.6 tps, lat 0.158 ms stddev 0.253
progress: 6300.0 s, 50993.8 tps, lat 0.157 ms stddev 0.254
progress: 6360.0 s, 51114.6 tps, lat 0.157 ms stddev 0.257
progress: 6420.0 s, 50762.6 tps, lat 0.158 ms stddev 0.261
progress: 6480.0 s, 50814.2 tps, lat 0.157 ms stddev 0.250
progress: 6540.0 s, 50735.7 tps, lat 0.158 ms stddev 0.257
progress: 6600.0 s, 50521.9 tps, lat 0.158 ms stddev 0.263
progress: 6660.0 s, 50829.8 tps, lat 0.157 ms stddev 0.252
progress: 6720.0 s, 50933.7 tps, lat 0.157 ms stddev 0.255
progress: 6780.0 s, 51152.9 tps, lat 0.156 ms stddev 0.252
progress: 6840.0 s, 50457.1 tps, lat 0.159 ms stddev 0.255
progress: 6900.0 s, 50542.5 tps, lat 0.158 ms stddev 0.251
progress: 6960.0 s, 50802.5 tps, lat 0.157 ms stddev 0.282
progress: 7020.0 s, 51066.1 tps, lat 0.157 ms stddev 0.270
progress: 7080.0 s, 50729.0 tps, lat 0.158 ms stddev 0.258
progress: 7140.0 s, 51242.7 tps, lat 0.156 ms stddev 0.255
progress: 7200.0 s, 50714.2 tps, lat 0.158 ms stddev 0.253
progress: 7260.0 s, 50704.7 tps, lat 0.158 ms stddev 0.323
progress: 7320.0 s, 50951.4 tps, lat 0.157 ms stddev 0.250
progress: 7380.0 s, 50604.8 tps, lat 0.158 ms stddev 0.340
progress: 7440.0 s, 51217.0 tps, lat 0.156 ms stddev 0.251
progress: 7500.0 s, 50832.4 tps, lat 0.157 ms stddev 0.251
progress: 7560.0 s, 50815.7 tps, lat 0.157 ms stddev 0.256
progress: 7620.0 s, 49936.3 tps, lat 0.160 ms stddev 0.463
progress: 7680.0 s, 50556.4 tps, lat 0.158 ms stddev 0.251
progress: 7740.0 s, 50699.8 tps, lat 0.158 ms stddev 0.255
progress: 7800.0 s, 50474.0 tps, lat 0.158 ms stddev 0.256
progress: 7860.0 s, 50401.9 tps, lat 0.159 ms stddev 0.321
progress: 7920.0 s, 51385.5 tps, lat 0.156 ms stddev 0.251
progress: 7980.0 s, 50671.9 tps, lat 0.158 ms stddev 0.248
progress: 8040.0 s, 50631.3 tps, lat 0.158 ms stddev 0.252
progress: 8100.0 s, 50741.3 tps, lat 0.158 ms stddev 0.255
progress: 8160.0 s, 50465.5 tps, lat 0.159 ms stddev 0.257
progress: 8220.0 s, 51384.9 tps, lat 0.156 ms stddev 0.212
progress: 8280.0 s, 51676.8 tps, lat 0.155 ms stddev 0.233
progress: 8340.0 s, 52049.3 tps, lat 0.154 ms stddev 0.179
progress: 8400.0 s, 52394.6 tps, lat 0.153 ms stddev 0.178
progress: 8460.0 s, 51872.3 tps, lat 0.154 ms stddev 0.192
progress: 8520.0 s, 51896.9 tps, lat 0.154 ms stddev 0.178
progress: 8580.0 s, 52187.5 tps, lat 0.153 ms stddev 0.178
progress: 8640.0 s, 51773.8 tps, lat 0.155 ms stddev 0.185
progress: 8700.0 s, 51862.7 tps, lat 0.154 ms stddev 0.184
progress: 8760.0 s, 51906.7 tps, lat 0.154 ms stddev 0.189
progress: 8820.0 s, 52038.6 tps, lat 0.154 ms stddev 0.189
progress: 8880.0 s, 52370.9 tps, lat 0.153 ms stddev 0.180
progress: 8940.0 s, 52155.2 tps, lat 0.153 ms stddev 0.185
progress: 9000.0 s, 52162.2 tps, lat 0.153 ms stddev 0.219
progress: 9060.0 s, 51900.7 tps, lat 0.154 ms stddev 0.189
progress: 9120.0 s, 51794.5 tps, lat 0.154 ms stddev 0.178
progress: 9180.0 s, 51885.1 tps, lat 0.154 ms stddev 0.180
progress: 9240.0 s, 52007.4 tps, lat 0.154 ms stddev 0.177
progress: 9300.0 s, 52168.3 tps, lat 0.153 ms stddev 0.189
progress: 9360.0 s, 51801.3 tps, lat 0.154 ms stddev 0.182
progress: 9420.0 s, 51899.8 tps, lat 0.154 ms stddev 0.173
progress: 9480.0 s, 52215.3 tps, lat 0.153 ms stddev 0.177
progress: 9540.0 s, 52129.9 tps, lat 0.153 ms stddev 0.227
progress: 9600.0 s, 51929.9 tps, lat 0.154 ms stddev 0.176
progress: 9660.0 s, 51776.6 tps, lat 0.155 ms stddev 0.181
progress: 9720.0 s, 51542.8 tps, lat 0.155 ms stddev 0.268
progress: 9780.0 s, 51352.5 tps, lat 0.156 ms stddev 0.310
progress: 9840.0 s, 51661.0 tps, lat 0.155 ms stddev 0.177
progress: 9900.0 s, 51219.8 tps, lat 0.156 ms stddev 0.260
progress: 9960.0 s, 52235.8 tps, lat 0.153 ms stddev 0.184
progress: 10020.0 s, 51904.5 tps, lat 0.154 ms stddev 0.186
progress: 10080.0 s, 52066.8 tps, lat 0.154 ms stddev 0.183
progress: 10140.0 s, 52212.5 tps, lat 0.153 ms stddev 0.209
progress: 10200.0 s, 52402.8 tps, lat 0.153 ms stddev 0.186
progress: 10260.0 s, 52194.8 tps, lat 0.153 ms stddev 0.178
progress: 10320.0 s, 52117.3 tps, lat 0.153 ms stddev 0.179
progress: 10380.0 s, 52060.5 tps, lat 0.154 ms stddev 0.181
progress: 10440.0 s, 52043.1 tps, lat 0.154 ms stddev 0.183
progress: 10500.0 s, 52358.5 tps, lat 0.153 ms stddev 0.178
progress: 10560.0 s, 51361.2 tps, lat 0.156 ms stddev 0.179
progress: 10620.0 s, 51655.7 tps, lat 0.155 ms stddev 0.184
progress: 10680.0 s, 51501.1 tps, lat 0.155 ms stddev 0.176
progress: 10740.0 s, 51315.4 tps, lat 0.156 ms stddev 0.182
progress: 10800.0 s, 52344.5 tps, lat 0.153 ms stddev 0.181
progress: 10860.0 s, 51687.0 tps, lat 0.155 ms stddev 0.182
progress: 10920.0 s, 51150.2 tps, lat 0.156 ms stddev 0.179
progress: 10980.0 s, 51341.9 tps, lat 0.156 ms stddev 0.181
progress: 11040.0 s, 51074.0 tps, lat 0.157 ms stddev 0.184
progress: 11100.0 s, 51645.4 tps, lat 0.155 ms stddev 0.277
progress: 11160.0 s, 51984.6 tps, lat 0.154 ms stddev 0.182
progress: 11220.0 s, 51993.4 tps, lat 0.154 ms stddev 0.181
progress: 11280.0 s, 51894.9 tps, lat 0.154 ms stddev 0.183
progress: 11340.0 s, 52007.2 tps, lat 0.154 ms stddev 0.186
progress: 11400.0 s, 52255.9 tps, lat 0.153 ms stddev 0.188
progress: 11460.0 s, 52221.2 tps, lat 0.153 ms stddev 0.183
progress: 11520.0 s, 51990.5 tps, lat 0.154 ms stddev 0.181
progress: 11580.0 s, 52138.3 tps, lat 0.153 ms stddev 0.178
progress: 11640.0 s, 51258.2 tps, lat 0.156 ms stddev 0.178
progress: 11700.0 s, 52268.0 tps, lat 0.153 ms stddev 0.179
progress: 11760.0 s, 51838.4 tps, lat 0.154 ms stddev 0.180
progress: 11820.0 s, 52079.2 tps, lat 0.154 ms stddev 0.185
progress: 11880.0 s, 52531.4 tps, lat 0.152 ms stddev 0.175
progress: 11940.0 s, 51711.7 tps, lat 0.155 ms stddev 0.260
progress: 12000.0 s, 52020.7 tps, lat 0.154 ms stddev 0.180
progress: 12060.0 s, 51990.1 tps, lat 0.154 ms stddev 0.241
progress: 12120.0 s, 51367.6 tps, lat 0.156 ms stddev 0.193
progress: 12180.0 s, 51316.0 tps, lat 0.156 ms stddev 0.187
progress: 12240.0 s, 50819.0 tps, lat 0.157 ms stddev 0.190
progress: 12300.0 s, 51614.6 tps, lat 0.155 ms stddev 0.182
progress: 12360.0 s, 51500.6 tps, lat 0.155 ms stddev 0.187
progress: 12420.0 s, 51715.8 tps, lat 0.155 ms stddev 0.259
progress: 12480.0 s, 51805.0 tps, lat 0.154 ms stddev 0.183
progress: 12540.0 s, 51846.2 tps, lat 0.154 ms stddev 0.183
progress: 12600.0 s, 51829.6 tps, lat 0.154 ms stddev 0.179
progress: 12660.0 s, 51411.1 tps, lat 0.156 ms stddev 0.221
progress: 12720.0 s, 51728.5 tps, lat 0.155 ms stddev 0.187
progress: 12780.0 s, 52170.5 tps, lat 0.153 ms stddev 0.181
progress: 12840.0 s, 51985.8 tps, lat 0.154 ms stddev 0.182
progress: 12900.0 s, 51723.3 tps, lat 0.155 ms stddev 0.181
progress: 12960.0 s, 52324.2 tps, lat 0.153 ms stddev 0.184
progress: 13020.0 s, 51795.5 tps, lat 0.154 ms stddev 0.189
progress: 13080.0 s, 51547.3 tps, lat 0.155 ms stddev 0.373
progress: 13140.0 s, 51592.4 tps, lat 0.155 ms stddev 0.231
progress: 13200.0 s, 52004.8 tps, lat 0.154 ms stddev 0.223
progress: 13260.0 s, 52118.3 tps, lat 0.153 ms stddev 0.184
progress: 13320.0 s, 51781.1 tps, lat 0.154 ms stddev 0.183
progress: 13380.0 s, 50621.8 tps, lat 0.158 ms stddev 0.196
progress: 13440.0 s, 50671.8 tps, lat 0.158 ms stddev 0.321
progress: 13500.0 s, 51389.4 tps, lat 0.156 ms stddev 0.181
progress: 13560.0 s, 51653.2 tps, lat 0.155 ms stddev 0.186
progress: 13620.0 s, 51491.7 tps, lat 0.155 ms stddev 0.181
progress: 13680.0 s, 51639.4 tps, lat 0.155 ms stddev 0.183
progress: 13740.0 s, 51385.3 tps, lat 0.156 ms stddev 0.188
progress: 13800.0 s, 51640.2 tps, lat 0.155 ms stddev 0.177
progress: 13860.0 s, 51820.7 tps, lat 0.154 ms stddev 0.176
progress: 13920.0 s, 52041.9 tps, lat 0.154 ms stddev 0.179
progress: 13980.0 s, 52001.0 tps, lat 0.154 ms stddev 0.178
progress: 14040.0 s, 52189.3 tps, lat 0.153 ms stddev 0.176
progress: 14100.0 s, 51897.9 tps, lat 0.154 ms stddev 0.179
progress: 14160.0 s, 52066.4 tps, lat 0.154 ms stddev 0.180
progress: 14220.0 s, 52257.9 tps, lat 0.153 ms stddev 0.178
progress: 14280.0 s, 51850.1 tps, lat 0.154 ms stddev 0.210
......
.........
...........

SESSION 3:

Now in this third session we will monitor the database to see what’s going on. You can do something like below to check where we are going.

Below SQL will show you the ‘datfrozenxid‘ which is a value for all transaction IDs before this one have been replaced with a permanent (“frozen”) transaction ID in this database. This is used to track whether the database needs to be vacuumed in order to prevent transaction ID wraparound. And to see ‘autovacuum_freeze_max_age‘ that is age at which to autovacuum a table to prevent transaction ID wraparound.

Do a watch (\watch 10) on the command to see the progress and the play live πŸ™‚

SELECT datname
    , age(datfrozenxid)
    , current_setting('autovacuum_freeze_max_age') 
FROM pg_database 
ORDER BY 2 DESC;


   datname    |   age   | current_setting
--------------+---------+-----------------
 postgres     | 3058531 | 200000000
 edb          | 3058531 | 200000000
 template1    | 3058531 | 200000000
 template0    | 3058531 | 200000000
 immortal     | 3058531 | 200000000
 darkthrone   | 3058531 | 200000000
 immortal1    | 3058531 | 200000000
 mayhem       | 3058531 | 200000000
 marduk       | 3058531 | 200000000
 burzum       | 3058531 | 200000000
 gorgoroth    | 3058531 | 200000000
 repmgr       | 3058531 | 200000000
 enterprisedb | 3058531 | 200000000
(13 rows)

I have used below query to check where we are going, current XIDs, how close we are to our target of wraparound ..

enterprisedb=# WITH max_age AS ( 
    SELECT 2000000000 as max_old_xid
        , setting AS autovacuum_freeze_max_age 
        FROM pg_catalog.pg_settings 
        WHERE name = 'autovacuum_freeze_max_age' )
, per_database_stats AS ( 
    SELECT datname
        , m.max_old_xid::int
        , m.autovacuum_freeze_max_age::int
        , age(d.datfrozenxid) AS oldest_current_xid 
    FROM pg_catalog.pg_database d 
    JOIN max_age m ON (true) 
    WHERE d.datallowconn ) 
SELECT max(oldest_current_xid) AS oldest_current_xid
    , max(ROUND(100*(oldest_current_xid/max_old_xid::float))) AS percent_towards_wraparound
    , max(ROUND(100*(oldest_current_xid/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac 
FROM per_database_stats


\watch 10

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          289007513 |                         14 |                               145
(1 row)



 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          290077770 |                         15 |                               145
(1 row)


 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          292467983 |                         15 |                               146
(1 row)


                     Mon 19 Jul 2021 08:38:01 PM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          342714029 |                         17 |                               171
(1 row)

                     Mon 19 Jul 2021 08:38:11 PM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          343223530 |                         17 |                               172
(1 row)


                     Mon 19 Jul 2021 09:17:45 PM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          406185619 |                         20 |                               203
(1 row)


 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          421779698 |                         21 |                               211
(1 row)

                     Mon 19 Jul 2021 09:28:10 PM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          438446236 |                         22 |                               219
(1 row)



                     Mon 19 Jul 2021 11:03:03 PM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          733479765 |                         37 |                               367
(1 row)



 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          763852636 |                         38 |                               382
(1 row)


                     Mon 19 Jul 2021 11:33:03 PM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          826485794 |                         41 |                               413
(1 row)


                     Tue 20 Jul 2021 12:18:07 AM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          966692697 |                         48 |                               483
(1 row)


 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
         1107823196 |                         55 |                               554
 

                     Tue 20 Jul 2021 01:09:18 AM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
         1125393734 |                         56 |                               563
(1 row)


                     Tue 20 Jul 2021 01:43:11 AM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
         1230984982 |                         62 |                               615
(1 row)





-- LETS SEE WHAT'S HAPPENING ON THE DATABASE 
enterprisedb=# \x
Expanded display is on.
enterprisedb=#
enterprisedb=# select * from pg_stat_activity;


-[ RECORD 4 ]----+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 24056
usesysid         | 10
usename          | enterprisedb
application_name | psql.bin
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:45:44.833016 +05:30
xact_start       | 20-JUL-21 09:45:57.749892 +05:30
query_start      | 20-JUL-21 09:46:01.042977 +05:30
state_change     | 20-JUL-21 09:46:01.047161 +05:30
wait_event_type  | Client
wait_event       | ClientRead
state            | idle in transaction
backend_xid      | 1195
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 5 ]----+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 25884
usesysid         | 10
usename          | enterprisedb
application_name | psql.bin
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:47:36.766729 +05:30
xact_start       | 20-JUL-21 09:52:18.941776 +05:30
query_start      | 20-JUL-21 09:52:18.941776 +05:30
state_change     | 20-JUL-21 09:52:18.941778 +05:30
wait_event_type  |
wait_event       |
state            | active
backend_xid      |
backend_xmin     | 1195
query            | select * from pg_stat_activity;
backend_type     | client backend
-[ RECORD 6 ]----+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27150
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.70284 +05:30
xact_start       | 20-JUL-21 09:52:18.941429 +05:30
query_start      | 20-JUL-21 09:52:18.941429 +05:30
state_change     | 20-JUL-21 09:52:18.94143 +05:30
wait_event_type  | IPC
wait_event       | ClogGroupUpdate
state            | active
backend_xid      | 4526420
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 7 ]----+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27151
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.704062 +05:30
xact_start       |
query_start      | 20-JUL-21 09:52:18.943293 +05:30
state_change     | 20-JUL-21 09:52:18.943411 +05:30
wait_event_type  | Client
wait_event       | ClientRead
state            | idle
backend_xid      |
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 8 ]----+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27153
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.7067 +05:30
xact_start       |
query_start      | 20-JUL-21 09:52:18.943218 +05:30
state_change     | 20-JUL-21 09:52:18.943334 +05:30
wait_event_type  |
wait_event       |
state            | idle
backend_xid      |
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 9 ]----+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27154
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.707948 +05:30
xact_start       | 20-JUL-21 09:52:18.941116 +05:30
query_start      | 20-JUL-21 09:52:18.941116 +05:30
state_change     | 20-JUL-21 09:52:18.941117 +05:30
wait_event_type  | IPC
wait_event       | ClogGroupUpdate
state            | active
backend_xid      | 4526414
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 10 ]---+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27152
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.710155 +05:30
xact_start       | 20-JUL-21 09:52:18.943403 +05:30
query_start      | 20-JUL-21 09:52:18.943403 +05:30
state_change     | 20-JUL-21 09:52:18.943405 +05:30
wait_event_type  |
wait_event       |
state            | active
backend_xid      |
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 11 ]---+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27155
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.710785 +05:30
xact_start       | 20-JUL-21 09:52:18.941493 +05:30
query_start      | 20-JUL-21 09:52:18.941493 +05:30
state_change     | 20-JUL-21 09:52:18.941494 +05:30
wait_event_type  | IPC
wait_event       | ClogGroupUpdate
state            | active
backend_xid      | 4526421
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 12 ]---+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27156
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.711762 +05:30
xact_start       |
query_start      | 20-JUL-21 09:52:18.942582 +05:30
state_change     | 20-JUL-21 09:52:18.942704 +05:30
wait_event_type  | Client
wait_event       | ClientRead
state            | idle
backend_xid      |
backend_xmin     |
query            | select txid_current();
backend_type     | client backend
-[ RECORD 13 ]---+---------------------------------
datid            | 16384
datname          | enterprisedb
pid              | 27157
usesysid         | 10
usename          | enterprisedb
application_name | pgbench
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 20-JUL-21 09:49:56.711944 +05:30
xact_start       | 20-JUL-21 09:52:18.943345 +05:30
query_start      | 20-JUL-21 09:52:18.943345 +05:30
state_change     | 20-JUL-21 09:52:18.943346 +05:30
wait_event_type  | Client
wait_event       | ClientRead
state            | active
backend_xid      | 4526447
backend_xmin     | 1195
query            | select txid_current();
backend_type     | client backend
-[ RECORD 14 ]---+---------------------------------



SESSION 4:

Open another session to check log files, if they are reporting any hints, warnings or errors
After ~1 hour of test run, I have started seeing below warnings in the log files.

2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST WARNING:  oldest xmin is far in the past
2021-07-19 20:15:26 IST HINT:  Close open transactions soon to avoid wraparound problems.
        You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST WARNING:  oldest xmin is far in the past
2021-07-19 20:15:26 IST HINT:  Close open transactions soon to avoid wraparound problems.
        You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();
2021-07-19 20:15:26 IST LOG:  statement: select txid_current();

After ~ 3 hours of test run, the warnings becoming more prominent.

-bash-4.2$ grep -Rinw /var/lib/edb/as11/data/log/ -e 'wraparound'
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:17711:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:20705:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:21160:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:21650:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:22179:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:22694:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:23229:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:23776:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:24316:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:24840:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:25391:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:25927:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:27533:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:28135:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:32986:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:33581:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:17711:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:20705:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:21160:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:21650:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:22179:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:22694:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:23229:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:23776:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:24316:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:24840:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:25391:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:25927:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:27533:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:28135:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:32986:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.
/var/lib/edb/as11/data/log/edb-2021-07-19_195300.log:33581:2021-07-19 19:53:01 IST HINT:  Close open transactions soon to avoid wraparound problems.

Now, let’s monitor system stats too to see overall load, processes their commands etc.

top - 00:19:39 up 159 days,  7:11,  6 users,  load average: 10.31, 9.97, 9.95
Tasks: 399 total,   9 running, 390 sleeping,   0 stopped,   0 zombie
%Cpu(s): 60.0 us, 38.4 sy,  0.0 ni,  1.6 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32780168 total, 14568928 free,  1259976 used, 16951264 buff/cache
KiB Swap: 16777212 total, 16685044 free,    92168 used. 23083820 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
21831 enterpr+  20   0  591764   5228   2152 S 16.8  0.0 417:57.53 /usr/edb/as11/bin/pgbench -U enterprisedb enterprisedb -f wl.sql -c8 -j8 -T100000 -rn -P+
21845 enterpr+  20   0 8667644  24604  22256 R  9.7  0.1 209:53.07 postgres: enterprisedb enterprisedb [local] SELECT
21840 enterpr+  20   0 8667644  24584  22244 R  9.4  0.1 210:36.23 postgres: enterprisedb enterprisedb [local] idle
21844 enterpr+  20   0 8667644  24876  22528 R  9.4  0.1 209:38.54 postgres: enterprisedb enterprisedb [local] idle
21841 enterpr+  20   0 8667644  24836  22496 R  8.8  0.1 209:15.88 postgres: enterprisedb enterprisedb [local] SELECT
21847 enterpr+  20   0 8667644  24600  22252 S  8.8  0.1 206:16.87 postgres: enterprisedb enterprisedb [local] idle
21846 enterpr+  20   0 8667644  24604  22256 S  8.5  0.1 210:41.59 postgres: enterprisedb enterprisedb [local] idle
20421 enterpr+  20   0  259688   2124    632 R  7.1  0.0 146:27.66 postgres: logger
21843 enterpr+  20   0 8667644  24588  22248 R  6.6  0.1 210:05.38 postgres: enterprisedb enterprisedb [local] idle
21842 enterpr+  20   0 8667644  24600  22260 R  6.6  0.1 209:48.66 postgres: enterprisedb enterprisedb [local] idle
20426 enterpr+  20   0 8666684  18736  17208 R  4.0  0.1 102:43.53 postgres: walwriter
 6858 polkitd   20   0  620236  18044   4428 S  0.9  0.1  15940:23 /usr/lib/polkit-1/polkitd --no-debug

SESSION 2:

To my surprise, after ~8 hours of test run, the session screen where I have pgbench running, started throwing some WARNINGs to perform vacuum within specified number of transactions to avoid database shutdown. It kept on throwing those direct warnings for almost an hour before it stopped processing pgbench requests and started aborting workers with a message ‘ERROR: database is not accepting commands to avoid wraparound data loss in database “postgres”.

In the end it will present you the overall pgbench stats where it fired 2146483646 statements in total.

WARNING:  database "postgres" must be vacuumed within 5092920 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 5092920 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 5092917 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
....
......
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 3178805 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 3178808 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 3178804 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 3178803 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 3178801 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 3178799 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
......
........
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 1000002 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING:  database "postgres" must be vacuumed within 1000002 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
client 0 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

client 1 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

client 7 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

client 6 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

client 2 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

client 4 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

WARNING:  database "postgres" must be vacuumed within 1000005 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
client 3 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

client 5 aborted in command 0 (SQL) of script 0; ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

transaction type: wl.sql
scaling factor: 1
query mode: simple
number of clients: 8
number of threads: 8
duration: 100000 s
number of transactions actually processed: 2146483646
latency average = 0.159 ms
latency stddev = 1.102 ms
tps = 50437.366776 (including connections establishing)
tps = 50437.386645 (excluding connections establishing)
statement latencies in milliseconds:
         0.171  select txid_current();
You have new mail in /var/spool/mail/enterprisedb
-bash-4.2$

SESSION 3:

Lets check our screen 3 to see where we are in terms on oldest current XID and percent towards wraparound, and I see XID numbers are not increasing and we have reached 107% towards wraparound (strange, why it goes beyond 100%).

enterprisedb=# WITH max_age AS ( 
    SELECT 2000000000 as max_old_xid
        , setting AS autovacuum_freeze_max_age 
        FROM pg_catalog.pg_settings 
        WHERE name = 'autovacuum_freeze_max_age' )
, per_database_stats AS ( 
    SELECT datname
        , m.max_old_xid::int
        , m.autovacuum_freeze_max_age::int
        , age(d.datfrozenxid) AS oldest_current_xid 
    FROM pg_catalog.pg_database d 
    JOIN max_age m ON (true) 
    WHERE d.datallowconn ) 
SELECT max(oldest_current_xid) AS oldest_current_xid
    , max(ROUND(100*(oldest_current_xid/max_old_xid::float))) AS percent_towards_wraparound
    , max(ROUND(100*(oldest_current_xid/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac 
FROM per_database_stats;

(1 row)
                     Tue 20 Jul 2021 06:57:51 AM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
         2146483647 |                        107 |                              1073
(1 row)


                     Tue 20 Jul 2021 06:58:01 AM IST (every 10s)

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
         2146483647 |                        107 |                              1073
(1 row)




enterprisedb=# SELECT datname
, age(datfrozenxid)
, current_setting('autovacuum_freeze_max_age')
FROM pg_database
 ORDER BY 2 DESC;

   datname    |    age     | current_setting
--------------+------------+-----------------
 postgres     | 2146483647 | 200000000
 edb          | 2146483647 | 200000000
 template1    | 2146483647 | 200000000
 template0    | 2146483647 | 200000000
 immortal     | 2146483647 | 200000000
 darkthrone   | 2146483647 | 200000000
 immortal1    | 2146483647 | 200000000
 mayhem       | 2146483647 | 200000000
 murduk       | 2146483647 | 200000000
 burzum       | 2146483647 | 200000000
 gorgoroth    | 2146483647 | 200000000
 repmgr       | 2146483647 | 200000000
 enterprisedb | 2146483647 | 200000000
(13 rows)

Let’s connect with the database and see what happens when we try to do both SELECT & UPDATE statements …

enterprisedb=# \dt+
                            List of relations
 Schema |       Name       | Type  |    Owner     |  Size   | Description
--------+------------------+-------+--------------+---------+-------------
 public | pgbench_accounts | table | enterprisedb | 13 MB   |
 public | pgbench_branches | table | enterprisedb | 40 kB   |
 public | pgbench_history  | table | enterprisedb | 40 kB   |
 public | pgbench_tellers  | table | enterprisedb | 40 kB   |
 public | scale_data       | table | enterprisedb | 6720 MB |
(5 rows)

enterprisedb=# select * from pgbench_history;
 tid | bid |  aid  | delta |           mtime           | filler
-----+-----+-------+-------+---------------------------+--------
   2 |   1 | 64444 | -1096 | 19-JUL-21 18:41:15.655738 |
   3 |   1 | 18460 | -4688 | 19-JUL-21 18:41:15.668699 |
   2 |   1 | 92762 | -4669 | 19-JUL-21 18:41:15.670741 |
   2 |   1 | 55081 |  1817 | 19-JUL-21 18:41:15.672719 |
   5 |   1 | 47743 |   803 | 19-JUL-21 18:41:15.674666 |
   5 |   1 | 23899 | -3910 | 19-JUL-21 18:41:15.67664  |
  10 |   1 | 33384 |  4005 | 19-JUL-21 18:41:15.678631 |
   6 |   1 | 88383 | -4257 | 19-JUL-21 18:41:15.680629 |
   8 |   1 | 87562 |   200 | 19-JUL-21 18:41:15.682612 |
   5 |   1 |  3957 | -2407 | 19-JUL-21 18:41:15.684595 |
(10 rows)

enterprisedb=#
enterprisedb=# update pgbench_history set aid=65555 where aid=64444;
ERROR:  database is not accepting commands to avoid wraparound data loss in database "postgres"
HINT:  Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
enterprisedb=#
enterprisedb=#
enterprisedb=#

Alright, so with transaction wraparound is already there, we only be able to run SELECT queries, but DMLS will fail with a message printed to stop the database and vacuum it in single-user mode.

So, what we have learned

  • Database will keep on running and accepting READ ONLY (SELECT) requests, error will be thrown in case if DMLs tried.
  • 2,000,000,000 seems a huge number, but in case of busy transactional database, it’s just a matter of few hours.
  • AUTOVACUUMING is good thing, at least for this scenario, let’s not discuss about its bugs and other problems.
  • Add VACUUMING a routine activity else you will soon find yourself in a big mess.

The easiest way to get the highest TXID age value back down is to force a vacuum on the entire database cluster. And the best way to do this cluster-wide vacuum is the vacuumdb binary utility that comes with PostgreSQL. I will cover it in my next post as this post is all about simulating wraparounds.

Hope It Helped!
Prashant Dixit

2 Responses to “How to simulate the deadly, infamous, misunderstood & complex ‘Transaction Wraparound Problem’ in PostgreSQL …”

  1. Oraclouddba-Nataraj said

    Excellent post.

  2. […] « How to simulate the deadly, infamous, misunderstood & complex ‘Transaction Wraparound Prob… […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: