Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Archive for September, 2018

PostgreSQL: SELECT on a big table leading to session KILL.

Posted by FatDBA on September 26, 2018

Hi Guys,

Would like to discuss one strange situation what I’ve faced while doing a general SELECT operation on one of the table which is around 1 GB in size and contains 20000000 rows of test data. After spending around less than a minute the session gets killed and kicks me out back to the Linux terminal.

And yes, the OS is RHEL 7 (64 Bit) running on VMWare.

So, this what happened …


postgres=# select * from test;
Killed
-bash-4.2$ 


As always to troubleshoot any issue i started with the instance/db logs to see what’s going on, but it speaks not much and left me confused with no reason of this session kill everytime during this SELECT operation.



 LOG:  could not send data to client: Broken pipe
 STATEMENT:  SELECT * from test;
 FATAL:  connection to client lost



Table details are here below.



postgres=#
postgres=# \dt+ test
                    List of relations
 Schema | Name | Type  |  Owner   |  Size  | Description
--------+------+-------+----------+--------+-------------
 public | test | table | postgres | 996 MB |
(1 row)


postgres=# select count(*) from test;
  count
----------
 20000000
(1 row)

Time: 2045.816 ms



I fired the same query once again to see what’s happening there on the OS, pasted below are top results for the same time when this SELECT was in run. You can use pg_top too, that’s more PostgreSQL specific.



last pid: 15382;  load avg:  4.40,  2.70,  1.65;       up 0+10:27:52                                                                                           
0 processes:
CPU states: 94.0% user,  0.0% nice,  6.0% system,  0.0% idle,  0.0% iowait
Memory: 1836M used, 15M free, 139M cached
DB activity:   0 tps,  0 rollbs/s,   0 buffer r/s,  0 hit%,      0 row r/s,    0 row w/s
DB I/O:    51 reads/s, 25940 KB/s,     0 writes/s,     0 KB/s
DB disk: 0.0 GB total, 0.0 GB free (100% used)
Swap: 1925M used, 123M free, 8916K cached


   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 15367 postgres  20   0  358572   3660   3376 D 29.8  0.2   0:33.31 postgres: postgres postgres [local] SELECT
  1721 mongod    20   0  976044   3192      0 S 19.0  0.2   4:13.26 /usr/bin/mongod -f /etc/mongod.conf
 15382 postgres  20   0 2012488 1.584g    228 S  2.4 87.6   0:02.62 psql      >>>>>>>>>>>>>>>>>>>>>>>>> Culprit



If we take a look at top results its evident that one of the postgreSQL session with PID 15382 is consuming a lot what is there under ‘RES’ column and that’s the non-swapped physical memory a task has used. It’s using around 1.5 GB of physical RAM and the CPU usage is little high as well.
This makes sense as we only have around 1.5 GB of RAM allocated to this VM machine, so every time i fires this SELECT it maxes out on memory usage, but still i would like to dig-in deep with all my WHY, WHERE, WHAT questions. And the best place to go next is system messages or logs.

Below are the system logs for the same time.



Sep 26 11:33:04 fatdba kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
Sep 26 11:33:04 fatdba kernel: ffff880027f41f60 00000000a0ebe647 ffff88006c9c7ad0 ffffffff81686e13

Sep 26 11:33:07 fatdba kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Sep 26 11:33:08 fatdba kernel: [14918]    26 14918    29067        1      13      312             0 bash
Sep 26 11:33:08 fatdba kernel: [15382]    26 15382   521305   419602     920    27070             0 psql   >>>>>>>>>>>>>>>>>> >>>>>>>>>>>> CULPRIT PROCESS
Sep 26 11:33:08 fatdba kernel: [15384]    26 15384    89643      440      64      405             0 postmaster
Sep 26 11:33:08 fatdba kernel: [15385]    26 15385    19311      259      43        0             0 pg_top
Sep 26 11:33:08 fatdba kernel: [15411]     0 15411    28811       46      11       23             0 ksmtuned
Sep 26 11:33:08 fatdba kernel: [15412]     0 15412    28811       61      11       23             0 ksmtuned
Sep 26 11:33:08 fatdba kernel: [15413]     0 15413    37148      172      29        0             0 pgrep
Sep 26 11:33:08 fatdba kernel: [15416]    26 15416    89489      303      59      215             0 postmaster
Sep 26 11:33:08 fatdba kernel: [15417]    26 15417    89511      239      57      224             0 postmaster

Sep 26 11:33:08 fatdba kernel: Out of memory: Kill process 15382 (psql) score 448 or sacrifice child
Sep 26 11:33:08 fatdba kernel: Killed process 15382 (psql) total-vm:2085220kB, anon-rss:1678260kB, file-rss:148kB, shmem-rss:0kB



Sometimes i gets shocked to see the terms OS/Softwares/Apps used while logging internal activities — “Sacrifise Child”, “Node Amnesia”, “Shoot The Other Node in Head”, “Node Suicides” … 🙂

Okay so coming back to the point, so its clear now that the VM/OS has killed this process with ID 15382 (psql) as its is consuming almost all of the system memory resources. Take a look at few of the columns total_vm, rss, swapents .. All too high and pointing towards the real reason of killing this individual session.

We increased the physical memory on this box and ran the same query again with success!

Hope It Helps
Prashant Dixit

Advertisement

Posted in Advanced | Tagged: | Leave a Comment »

Fragmentation Or Bloating in PostgreSQL – How to identify and fix it using DB Vacuuming

Posted by FatDBA on September 4, 2018

Hey Folks,

Back with another post on PostgreSQL. This time related with table fragmentation (Bloating in PG) on how to identify it and fix it using Vacuuming.

Okay, so we have this table of size 995 MBs with close to 20000000 rows and the DB (postgres default db) size is of 2855 MBs.




postgres=#
postgres=# \dt+ large_test
                         List of relations
 Schema |    Name    | Type  |  Owner   |    Size    | Description
--------+------------+-------+----------+------------+-------------
 public | large_test | table | postgres | 995 MB     |



postgres=# \l+ postgres
                                                                List of databases
   Name   |  Owner   | Encoding |  Collate   |   Ctype    | Access privileges |  Size   | Tablespace |                Description
----------+----------+----------+------------+------------+-------------------+---------+------------+--------------------------------------------
 postgres | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                   | 2855 MB | pg_default | default administrative connection database
(1 row)


Now lets do some DMLs to create the scenario.



postgres=# delete from large_test where num3 >=90;
DELETE 7324451

postgres=#
postgres=# INSERT INTO large_test (num1, num2, num3)
postgres-#              SELECT round(random()*10), random(), random()*142
postgres-#                FROM generate_series(1, 2000) s(i);
INSERT 0 2000

postgres=# delete from large_test where num3 >=90;
DELETE 729

postgres=# INSERT INTO large_test (num1, num2, num3)
postgres-#              SELECT round(random()*10), random(), random()*142
postgres-#                FROM generate_series(1, 9000) s(i);
INSERT 0 9000

postgres=#
postgres=# delete from large_test where num1 < 8;
DELETE 9514961

postgres=# delete from large_test where num1 < 10;
DELETE 2536447


Okay now with all those DMLs i am sure we will have enough of fragmentation in the database, lets check.



postgres=#
postgres=# \dt+ large_test
                       List of relations
 Schema |    Name    | Type  |  Owner   |  Size  | Description
--------+------------+-------+----------+--------+-------------
 public | large_test | table | postgres | 996 MB  |
(1 row)


Okay, the size of the table is almost the same what it was before all the DMLs. Lets see if there is any fragmentation in the database, for this i have used below custimized statement, you can also use pg_class table to get basic details on fragmentation.



SELECT
  current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
  ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::FLOAT/otta END)::NUMERIC,1) AS tbloat,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
  iname, /*ituples::bigint, ipages::bigint, iotta,*/
  ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::FLOAT/iotta END)::NUMERIC,1) AS ibloat,
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::FLOAT)) AS otta,
    COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::FLOAT)),0) AS iotta -- very rough approximation, assumes all cols
  FROM (
    SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::NUMERIC AS datahdr,
      (maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+COUNT(*)/8
          FROM pg_stats s2
          WHERE null_frac0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::NUMERIC) AS bs,
          CASE WHEN SUBSTRING(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname  'information_schema'
  LEFT JOIN pg_index i ON indrelid = cc.oid
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml
ORDER BY wastedbytes DESC;


Okay so below results which we have captured clearly shows that there is a fragmentation (Look at wastedbytes column)




 current_database | schemaname |    tablename     | tbloat | wastedbytes |               iname               | ibloat | wastedibytes
------------------+------------+------------------+--------+-------------+-----------------------------------+--------+--------------
 postgres         | public     | large_test       |   31.6 |  1011122176 | idxlargetest                      |   19.6 |    426352640
 postgres         | public     | large_test       |   31.6 |  1011122176 | idxlargetest1                     |   19.6 |    426704896


Now when the fragmentation is clear, lets do the shrink or space reclaim using VACUUM. I will first try with ‘BASIC VACUUM’ and ANALYZE the table at the same time to make optimizer statistics up-to-date.

To remove dead tuples, vacuum processing provides two modes, i.e. Concurrent VACUUM and Full VACUUM. Concurrent VACUUM, often simply called VACUUM, removes dead tuples for each page of the table file, and other transactions can read the table while this process is running. In contrast, Full VACUUM removes dead tuples and defragments live tuples the whole file, and other transactions cannot access tables while Full VACUUM is running.



postgres=#
postgres=# VACUUM (VERBOSE, ANALYZE) large_test; 
INFO:  vacuuming "public.large_test"
INFO:  index "idxlargetest" now contains 634412 row versions in 54840 pages
DETAIL:  0 index row versions were removed.
19811 index pages have been deleted, 13985 are currently reusable.
CPU 0.49s/0.03u sec elapsed 2.93 sec.
INFO:  index "idxlargetest1" now contains 634412 row versions in 54883 pages
DETAIL:  0 index row versions were removed.
52137 index pages have been deleted, 0 are currently reusable.
CPU 0.23s/0.03u sec elapsed 0.26 sec.
INFO:  "large_test": found 0 removable, 7 nonremovable row versions in 1 out of 127459 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 87 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.72s/0.07u sec elapsed 3.21 sec.
INFO:  analyzing "public.large_test"
INFO:  "large_test": scanned 30000 of 127459 pages, containing 149103 live rows and 0 dead rows; 30000 rows in sample, 633484 estimated total rows
VACUUM
postgres=#


As discussed and expected we see no change in wasted space and fragmentation still exists, see below result which matches the same what we have collected before the BASIC VACUUM.



 current_database | schemaname |    tablename     | tbloat | wastedbytes |               iname               | ibloat | wastedibytes
------------------+------------+------------------+--------+-------------+-----------------------------------+--------+--------------
 postgres         | public     | large_test       |   31.6 |  1011122176 | idxlargetest                      |   19.6 |    426352640
 postgres         | public     | large_test       |   31.6 |  1011122176 | idxlargetest1                     |   19.6 |    426704896


Okay, so time to test the FULL VACUUM which reclaims more space and does the real action of freeing up the space than plain or basic VACUUM but the only issue with it it locks the database table.



postgres=# VACUUM (FULL, VERBOSE, ANALYZE) large_test;
INFO:  vacuuming "public.large_test"
INFO:  "large_test": found 0 removable, 634412 nonremovable row versions in 127459 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 1.48s/0.50u sec elapsed 12.28 sec.
INFO:  analyzing "public.large_test"
INFO:  "large_test": scanned 4041 of 4041 pages, containing 634412 live rows and 0 dead rows; 30000 rows in sample, 634412 estimated total rows
VACUUM


Now lets see if there is any change in fragmentation levels.



postgres=# \l+ postgres
                                                                List of databases
   Name   |  Owner   | Encoding |  Collate   |   Ctype    | Access privileges |  Size   | Tablespace |                Description
----------+----------+----------+------------+------------+-------------------+---------+------------+--------------------------------------------
 postgres | postgres | UTF8     | en_US.utf8 | en_US.utf8 |                   | 1062 MB | pg_default | default administrative connection database
(1 row)

postgres=# \dt+ large_test
                      List of relations
 Schema |    Name    | Type  |  Owner   | Size  | Description
--------+------------+-------+----------+-------+-------------
 public | large_test | table | postgres | 32 MB |
(1 row)

 current_database | schemaname |    tablename     | tbloat | wastedbytes |               iname               | ibloat | wastedibytes
------------------+------------+------------------+--------+-------------+-----------------------------------+--------+--------------
 postgres         | public     | large_test       |    1.0 |       32768 | idxlargetest                      |    0.6 |            0
 postgres         | public     | large_test       |    1.0 |       32768 | idxlargetest1                     |    0.6 |            0


Yup, this time it worked after FULL VACUUMing of the database and now there isn’t any wasted or fragmented space exists in the table.


Hope It Helps!
Prashant Dixit

Posted in Advanced | Tagged: | 3 Comments »

Upgrade PostgreSQL from 9.4 to 9.6 on RHEL 7

Posted by FatDBA on September 3, 2018

Hi Mates,

Every other day while doing this POC (Data migrations between Oracle and PostgreSQL), i am facing regular challenges (But i like them ;)) , this time related with the application and database support. Project team asked us to upgrade the database to at-least 9.6 (Though we have 10 in the marked as well but i guess not yet ready for Prod’s).

Its good to update you that you can run various versions of PostgreSQL at the same time but as per the ask we have this test DB running on version 9.4 and would be upgraded to 9.6. So, before i start with the steps, lets quickly look and collect details about all objects till now created on the database as this will be used to TVT testings later on.

Okay, so we are running on PostgreSQL 9.4.19 have created few sample/test databases with tables, indexes, tablespaces created.

Environment:
Server: RHEL 7 64 Bit
Upgrade: From 9.4.19 to 9.6.10

Pre Checks:


-bash-4.2$ uname -ar
Linux fatdba.localdomain 3.10.0-514.21.1.el7.x86_64 #1 SMP Thu May 25 16:26:01 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
-bash-4.2$

postgres=# select version();
                                                    version
----------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.19 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)


elephant=# select datname from pg_database;
  datname
-----------
 template1
 template0
 postgres
 elephant
 dixitdb
(5 rows)

elephant=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# \dt+
                         List of relations
 Schema |    Name    | Type  |  Owner   |    Size    | Description
--------+------------+-------+----------+------------+-------------
 public | alee       | table | postgres | 128 kB     |
 public | events     | table | postgres | 8192 bytes |
 public | large_test | table | postgres | 995 MB     |
(3 rows)

postgres=#

Step 1: Download the required package from PostgreSQL official website (Link: https://www.postgresql.org/download/linux/redhat/), here you will find the latest repository RPM for your OS and the latest PG versions.


[root@fatdba ~]#
[root@fatdba ~]# yum install https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-redhat96-9.6-3.noarch.rpm
Loaded plugins: langpacks, ulninfo
pgdg-redhat96-9.6-3.noarch.rpm                                                                                                                                        | 4.7 kB  00:00:00
Examining /var/tmp/yum-root-fEvD8A/pgdg-redhat96-9.6-3.noarch.rpm: pgdg-redhat96-9.6-3.noarch
Marking /var/tmp/yum-root-fEvD8A/pgdg-redhat96-9.6-3.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package pgdg-redhat96.noarch 0:9.6-3 will be installed
--> Finished Dependency Resolution
cassandra/signature                                                                                                                                                   |  833 B  00:00:00
cassandra/signature                                                                                                                                                   | 2.9 kB  00:00:00 !!!
mongodb-org-3.4/7Server                                                                                                                                               | 2.5 kB  00:00:00
ol7_UEKR4/x86_64                                                                                                                                                      | 1.2 kB  00:00:00
ol7_addons/x86_64                                                                                                                                                     | 1.2 kB  00:00:00
ol7_latest/x86_64                                                                                                                                                     | 1.4 kB  00:00:00
pgdg94/7Server/x86_64                                                                                                                                                 | 4.1 kB  00:00:00

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                      Arch                                  Version                                 Repository                                                  Size
=============================================================================================================================================================================================
Installing:
 pgdg-redhat96                                noarch                                9.6-3                                   /pgdg-redhat96-9.6-3.noarch                                2.7 k

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total size: 2.7 k
Installed size: 2.7 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pgdg-redhat96-9.6-3.noarch                                                                                                                                                1/1
  Verifying  : pgdg-redhat96-9.6-3.noarch                                                                                                                                                1/1

Installed:
  pgdg-redhat96.noarch 0:9.6-3

Complete!
[root@fatdba ~]#



[root@fatdba ~]#
[root@fatdba ~]# yum install postgresql96
Loaded plugins: langpacks, ulninfo
pgdg96                                                                                                                                                                | 4.1 kB  00:00:00
(1/2): pgdg96/7Server/x86_64/group_gz                                                                                                                                 |  249 B  00:00:01
(2/2): pgdg96/7Server/x86_64/primary_db                                                                                                                               | 197 kB  00:00:01
Resolving Dependencies
--> Running transaction check
---> Package postgresql96.x86_64 0:9.6.10-1PGDG.rhel7 will be installed
--> Processing Dependency: postgresql96-libs(x86-64) = 9.6.10-1PGDG.rhel7 for package: postgresql96-9.6.10-1PGDG.rhel7.x86_64
--> Running transaction check
---> Package postgresql96-libs.x86_64 0:9.6.10-1PGDG.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                           Arch                                   Version                                               Repository                              Size
=============================================================================================================================================================================================
Installing:
 postgresql96                                      x86_64                                 9.6.10-1PGDG.rhel7                                    pgdg96                                 1.4 M
Installing for dependencies:
 postgresql96-libs                                 x86_64                                 9.6.10-1PGDG.rhel7                                    pgdg96                                 318 k

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package (+1 Dependent package)

Total download size: 1.7 M
Installed size: 8.7 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): postgresql96-libs-9.6.10-1PGDG.rhel7.x86_64.rpm                                                                                                                | 318 kB  00:00:07
(2/2): postgresql96-9.6.10-1PGDG.rhel7.x86_64.rpm                                                                                                                     | 1.4 MB  00:00:10
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                        169 kB/s | 1.7 MB  00:00:10
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : postgresql96-libs-9.6.10-1PGDG.rhel7.x86_64                                                                                                                               1/2
  Installing : postgresql96-9.6.10-1PGDG.rhel7.x86_64                                                                                                                                    2/2
  Verifying  : postgresql96-libs-9.6.10-1PGDG.rhel7.x86_64                                                                                                                               1/2
  Verifying  : postgresql96-9.6.10-1PGDG.rhel7.x86_64                                                                                                                                    2/2

Installed:
  postgresql96.x86_64 0:9.6.10-1PGDG.rhel7

Dependency Installed:
  postgresql96-libs.x86_64 0:9.6.10-1PGDG.rhel7

Complete!
[root@fatdba ~]#



[root@fatdba ~]# yum install postgresql96-server
Loaded plugins: langpacks, ulninfo
Resolving Dependencies
--> Running transaction check
---> Package postgresql96-server.x86_64 0:9.6.10-1PGDG.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================================================
 Package                                             Arch                                   Version                                             Repository                              Size
=============================================================================================================================================================================================
Installing:
 postgresql96-server                                 x86_64                                 9.6.10-1PGDG.rhel7                                  pgdg96                                 4.5 M

Transaction Summary
=============================================================================================================================================================================================
Install  1 Package

Total download size: 4.5 M
Installed size: 19 M
Is this ok [y/d/N]: y
Downloading packages:
postgresql96-server-9.6.10-1PGDG.rhel7.x86_64.rpm                                                                                                                     | 4.5 MB  00:00:11
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : postgresql96-server-9.6.10-1PGDG.rhel7.x86_64                                                                                                                             1/1
  Verifying  : postgresql96-server-9.6.10-1PGDG.rhel7.x86_64                                                                                                                             1/1

Installed:
  postgresql96-server.x86_64 0:9.6.10-1PGDG.rhel7

Complete!
[root@fatdba ~]#


Step 2: Create the new PostgreSQL directory. This you can do this by calling the initdb (Initialize DB) with setup shell present under new installed directory for 9.6


[root@fatdba ~]#
[root@fatdba ~]# /usr/pgsql-9.6/bin/postgresql96-setup initdb
Initializing database ... OK

[root@fatdba ~]#
[root@fatdba ~]#
[root@fatdba ~]#
[root@fatdba ~]# cd /var/lib/pgsql
[root@fatdba pgsql]# ls -lthr
total 0
drwxrwxrwx 4 postgres postgres 48 Sep  3 04:39 9.4
drwx------ 3 postgres postgres 29 Sep  3 04:44 tbs
drwx------ 3 postgres postgres 29 Sep  3 20:31 tbs1
drwx------ 4 postgres postgres 48 Sep  3 20:40 9.6
[root@fatdba pgsql]# su - postgres
Last login: Mon Sep  3 20:24:34 IST 2018 on pts/3
-bash-4.2$

Step 3: Before you go with real upgrade steps, its always best and advised to see if its all set and ready for the upgrade. This is one of the most important per-requsites to check the abilities.
Below at the end of the check run it says ‘Cluster is Compatible’ and this is good enough to say that the system is all set for the upgrade.


-bash-4.2$ /usr/pgsql-9.6/bin/pg_upgrade --old-bindir=/usr/pgsql-9.4/bin/ --new-bindir=/usr/pgsql-9.6/bin/ --old-datadir=/var/lib/pgsql/9.4/data/ --new-datadir=/var/lib/pgsql/9.6/data/ --check

*failure*
Consult the last few lines of "pg_upgrade_server.log" for
the probable cause of the failure.
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for roles starting with 'pg_'                      ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

*Clusters are compatible*
-bash-4.2$


Step 4: Till this time the older version is still running, so before we do the original upgrade steps that needs to be stopped.


-bash-4.2$ ps -ef|grep post
root       1378      1  0 20:04 ?        00:00:00 /usr/libexec/postfix/master -w
postfix    1380   1378  0 20:04 ?        00:00:00 qmgr -l -t unix -u
postgres   3381      1  0 20:10 ?        00:00:00 /usr/pgsql-9.4/bin/postgres -D /var/lib/pgsql/9.4/data
postgres   3382   3381  0 20:10 ?        00:00:00 postgres: logger process
postgres   3384   3381  0 20:10 ?        00:00:01 postgres: checkpointer process
postgres   3385   3381  0 20:10 ?        00:00:00 postgres: writer process
postgres   3386   3381  0 20:10 ?        00:00:01 postgres: wal writer process
postgres   3387   3381  0 20:10 ?        00:00:00 postgres: autovacuum launcher process
postgres   3388   3381  0 20:10 ?        00:00:00 postgres: stats collector process


-bash-4.2$ su - root
Password:
Last login: Mon Sep  3 20:42:59 IST 2018 from 192.168.40.1 on pts/4
[root@fatdba ~]#
[root@fatdba ~]#
[root@fatdba ~]# service postgresql-9.4 stop
Redirecting to /bin/systemctl stop  postgresql-9.4.service
[root@fatdba ~]#
[root@fatdba ~]# service postgresql-9.4 status
Redirecting to /bin/systemctl status  postgresql-9.4.service
● postgresql-9.4.service - PostgreSQL 9.4 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-9.4.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2018-09-03 20:44:59 IST; 5s ago
     Docs: https://www.postgresql.org/docs/9.4/static/
  Process: 4924 ExecStop=/usr/pgsql-9.4/bin/pg_ctl stop -D ${PGDATA} -s -m fast (code=exited, status=0/SUCCESS)
 Main PID: 3381 (code=exited, status=0/SUCCESS)

Sep 03 04:40:02 fatdba.localdomain systemd[1]: Starting PostgreSQL 9.4 database server...
Sep 03 04:40:02 fatdba.localdomain pg_ctl[3377]: LOG:  redirecting log output to logging collector process
Sep 03 04:40:02 fatdba.localdomain pg_ctl[3377]: HINT:  Future log output will appear in directory "pg_log".
Sep 03 04:40:03 fatdba.localdomain systemd[1]: Started PostgreSQL 9.4 database server.
Sep 03 20:44:58 fatdba.localdomain systemd[1]: Stopping PostgreSQL 9.4 database server...
Sep 03 20:44:59 fatdba.localdomain systemd[1]: Stopped PostgreSQL 9.4 database server.
[root@fatdba ~]#

[root@fatdba ~]# ps -ef|grep post
root       1378      1  0 20:04 ?        00:00:00 /usr/libexec/postfix/master -w
postfix    1380   1378  0 20:04 ?        00:00:00 qmgr -l -t unix -u
postfix    4113   1378  0 20:24 ?        00:00:00 pickup -l -t unix -u
postgres   4835   4708  0 20:43 pts/4    00:00:00 tail -200f pg_upgrade_server.log
root       4938   4864  0 20:45 pts/3    00:00:00 grep --color=auto post


Step 5: Let’s upgrade the database now. The upgrade took around ~ 5 minutes for a 5 GB database.


-bash-4.2$ /usr/pgsql-9.6/bin/pg_upgrade --old-bindir=/usr/pgsql-9.4/bin/ --new-bindir=/usr/pgsql-9.6/bin/ --old-datadir=/var/lib/pgsql/9.4/data/ --new-datadir=/var/lib/pgsql/9.6/data/
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for roles starting with 'pg_'                      ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows on the new cluster                        ok
Deleting files from new pg_clog                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster
                                                            ok
Copying user relation files
                                                            ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh
-bash-4.2$
-bash-4.2$

Its always good to put a TAIL on server upgrade logs.


-bash-4.2$ ls -ltrh
total 12K
drwxrwxrwx 4 postgres postgres   48 Sep  3 04:39 9.4
drwx------ 3 postgres postgres   29 Sep  3 04:44 tbs
drwx------ 3 postgres postgres   29 Sep  3 20:31 tbs1
drwx------ 4 postgres postgres   48 Sep  3 20:40 9.6
-rw------- 1 postgres postgres  179 Sep  3 20:43 pg_upgrade_utility.log
-rw------- 1 postgres postgres 1.1K Sep  3 20:43 pg_upgrade_internal.log
-rw------- 1 postgres postgres 1.6K Sep  3 20:43 pg_upgrade_server.log


-----------------------------------------------------------------
  pg_upgrade run on Mon Sep  3 20:46:48 2018
-----------------------------------------------------------------

command: "/usr/pgsql-9.4/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/9.4/data/" -o "-p 50432 -b  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/pgsql'" start >> "pg_upgrade_server.log" 2>&1
waiting for server to start....LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "pg_log".
 done
server started


command: "/usr/pgsql-9.4/bin/pg_ctl" -w -D "/var/lib/pgsql/9.4/data/" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down.... done
server stopped


command: "/usr/pgsql-9.6/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/9.6/data/" -o "-p 50432 -b -c synchronous_commit=off -c fsync=off -c full_page_writes=off  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/pgsql'" start >> "pg_upgrade_server.log" 2>&1
waiting for server to start.... LOG:  redirecting log output to logging collector process
 HINT:  Future log output will appear in directory "pg_log".
 done
server started


command: "/usr/pgsql-9.6/bin/pg_ctl" -w -D "/var/lib/pgsql/9.6/data/" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down.... done
server stopped


command: "/usr/pgsql-9.6/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/9.6/data/" -o "-p 50432 -b -c synchronous_commit=off -c fsync=off -c full_page_writes=off  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/pgsql'" start >> "pg_upgrade_server.log" 2>&1
waiting for server to start.... LOG:  redirecting log output to logging collector process
 HINT:  Future log output will appear in directory "pg_log".
 done
server started


command: "/usr/pgsql-9.6/bin/pg_ctl" -w -D "/var/lib/pgsql/9.6/data/" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down.... done
server stopped


command: "/usr/pgsql-9.6/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/9.6/data/" -o "-p 50432 -b -c synchronous_commit=off -c fsync=off -c full_page_writes=off  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/pgsql'" start >> "pg_upgrade_server.log" 2>&1
waiting for server to start.... LOG:  redirecting log output to logging collector process
 HINT:  Future log output will appear in directory "pg_log".
 done
server started


command: "/usr/pgsql-9.6/bin/pg_ctl" -w -D "/var/lib/pgsql/9.6/data/" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down.... done
server stopped

Step 6: Once the upgrade is done, we have to start & enable the new postgreSQL service.
Okay so below we can see the postmaster and its associated background processes started from new version (9.6) and are accessing the right data directory.

Note: Before you start the new service, always make sure you have all required values set in new parameter or configuration files, moved from old database to the new. For example listen_addresses and max_connections of postgresql.conf or any specific settings related with authorization defied on pg_hba.conf files.


[root@fatdba ~]# service postgresql-9.6 start
Redirecting to /bin/systemctl start  postgresql-9.6.service
[root@fatdba ~]#
[root@fatdba ~]# ps -ef|grep post
root       1378      1  0 20:04 ?        00:00:00 /usr/libexec/postfix/master -w
postfix    1380   1378  0 20:04 ?        00:00:00 qmgr -l -t unix -u
postgres   5298      1  0 20:48 ?        00:00:00 /usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data/
postgres   5301   5298  0 20:48 ?        00:00:00 postgres: logger process
postgres   5303   5298  0 20:48 ?        00:00:00 postgres: checkpointer process
postgres   5304   5298  0 20:48 ?        00:00:00 postgres: writer process
postgres   5305   5298  0 20:48 ?        00:00:00 postgres: wal writer process
postgres   5306   5298  0 20:48 ?        00:00:00 postgres: autovacuum launcher process
postgres   5307   5298  0 20:48 ?        00:00:00 postgres: stats collector process
root       5309   5228  0 20:48 pts/3    00:00:00 grep --color=auto post
[root@fatdba ~]#

[root@fatdba ~]#
[root@fatdba ~]# systemctl enable postgresql-9.6
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql-9.6.service to /usr/lib/systemd/system/postgresql-9.6.service.
[root@fatdba ~]#

Postchecks:
The upgrade script creates two new scripts under parent directory, analyze_new_cluster.sh and delete_old_cluster.sh, its good to run analyze_new_cluster.sh which performs the vaccuming or collects stats for objects and is a must in big production setups.


-bash-4.2$ ls -ltrh
total 8.0K
drwxrwxrwx 4 postgres postgres  48 Sep  3 04:39 9.4
drwx------ 4 postgres postgres  48 Sep  3 20:40 9.6
drwx------ 4 postgres postgres  52 Sep  3 20:46 tbs1
drwx------ 4 postgres postgres  52 Sep  3 20:46 tbs
-rwx------ 1 postgres postgres 135 Sep  3 20:47 delete_old_cluster.sh
-rwx------ 1 postgres postgres 755 Sep  3 20:47 analyze_new_cluster.sh

-bash-4.2$ ./analyze_new_cluster.sh
This script will generate minimal optimizer statistics rapidly
so your system is usable, and then gather statistics twice more
with increasing accuracy.  When it is done, your system will
have the default level of optimizer statistics.

If you have used ALTER TABLE to modify the statistics target for
any tables, you might want to remove them and restore them after
running this script because they will delay fast statistics generation.

If you would like default statistics as quickly as possible, cancel
this script and run:
    "/usr/pgsql-9.6/bin/vacuumdb" --all --analyze-only

vacuumdb: processing database "dixitdb": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "elephant": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "postgres": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "template1": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "dixitdb": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "elephant": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "postgres": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "template1": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "dixitdb": Generating default (full) optimizer statistics
vacuumdb: processing database "elephant": Generating default (full) optimizer statistics
vacuumdb: processing database "postgres": Generating default (full) optimizer statistics
vacuumdb: processing database "template1": Generating default (full) optimizer statistics

Done
-bash-4.2$

Post Verification or Sanity Tests.


-bash-4.2$
-bash-4.2$ psql -d elephant -U postgres
psql (9.6.10)
Type "help" for help.

elephant=#
elephant=# select version();
                                                  version
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

elephant=#
elephant=# \d+
                     List of relations
 Schema |  Name  | Type  |  Owner   |  Size   | Description
--------+--------+-------+----------+---------+-------------
 public | serbia | table | postgres | 5120 kB |
(1 row)

elephant=# \dt+
                     List of relations
 Schema |  Name  | Type  |  Owner   |  Size   | Description
--------+--------+-------+----------+---------+-------------
 public | serbia | table | postgres | 5120 kB |
(1 row)

elephant=#
elephant=#
elephant=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=#
postgres=#
postgres=# \dt+
                         List of relations
 Schema |    Name    | Type  |  Owner   |    Size    | Description
--------+------------+-------+----------+------------+-------------
 public | alee       | table | postgres | 128 kB     |
 public | events     | table | postgres | 8192 bytes |
 public | large_test | table | postgres | 995 MB     |
(3 rows)

postgres=#
postgres=#
postgres=# \q

Okay we are all good and no corruption, loss is noticied and we can consider this upgrade as SUCCESSFULL.
Now if required we can go and delete the old database (Postgres calls it a Cluster).


-bash-4.2$
-bash-4.2$ ./delete_old_cluster.sh

Hope It Helps
Prashant Dixit

Posted in Advanced | Tagged: | 1 Comment »

Postgres CREATE TABLESPACE failed with Permission denied

Posted by FatDBA on September 2, 2018

Hi Everyone,

This week i was busy preparing one Postgres database for migration purposes and have faced few basic problems while doing some of the rudimentary or elementary operations. Yesterday itself i was trying to create one Tablespace and was getting ‘Permission Denied’ for the defined path/location.
I tried to create this tablespace on different locations but failed every-time with same error related with permissions and interestingly permissions are okay (Directory owned by POSTGRES with RWX permissions). This left me confused about where exactly is the problem.

This is what i was trying and getting the error.

elephant=# CREATE TABLESPACE aleph location '/var/lib/pgsql/tbs';
ERROR:  could not set permissions on directory "/var/lib/pgsql/tbs": Permission denied
elephant=# CREATE TABLESPACE aleph location '/var/lib/pgsql';
ERROR:  could not set permissions on directory "/var/lib/pgsql": Permission denied
elephant=# CREATE TABLESPACE aleph location '/var';
ERROR:  could not set permissions on directory "/var": Permission denied

Then all of the sudden i remember something similar i have encountered in the past while working on Oracle Databases where SELinux Policy Prevents SQLPlus From Connecting to Oracle Database. So, i decided to turn off the SELinux status, and to effectively do it i ran setenforce 0 (Or you can use setenforce Permissive )
* The above commands will switch off SELinux enforcement temporarily until the machine is rebooted. If you would like to make it permanently, edit /etc/sysconfig/selinux, enter:
# vi /etc/sysconfig/selinux

And set / update it as follows:
SELINUX=disabled

[root@fatdba ~]# setenforce 0
[root@fatdba ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   permissive
Mode from config file:          enforcing
Policy version:                 26
Policy from config file:        targeted
[root@fatdba ~]#
[root@fatdba ~]#

Now after changing the setting of SELinux i once again tried the same step and BOOM, it worked!

elephant=# CREATE TABLESPACE aleph location '/var/lib/pgsql/tbs';
CREATE TABLESPACE
elephant=#
elephant=#

Hope It Helps
Prashant Dixit

Posted in Basics, troubleshooting | Tagged: | Leave a Comment »

 
%d bloggers like this: