Tales From A Lazy Fat DBA

Its all about Databases & their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

pg_cron : Probably the best way to schedule jobs within PostgreSQL database.

Posted by FatDBA on July 30, 2021

Hi Guys,

Many of databases like Oracle (DBMS_SCHEDULER), MySQL (event_scheduler), Micrsoft SQL Server (SQL Server Agent) have their own build-in job/command scheduling mechanisms, PostgreSQL still does’t have any native or in-build scheduling methods like those databases. There are few popular extensions like pg_cron, pgAgent etc. which can be used to get the scheduling functionality in your PostgreSQL database.

This post will be all about installing, configuring and doing some tests using pg_cron extension and see what all it has to offer. pg_cron which is a simple cron-based job scheduler for PostgreSQL ( >= 9.5) that runs inside the database as an extension. It uses the same grammatical arrangement as regular cron, but it allows you to schedule PostgreSQL commands directly. You can schedule your SQL Statements, stored procedures and tasks like VACUUM, VACUUM ANALYZE

Though there are few limitations with the approach

  • You can’t use pg_cron jobs to schedule a job in seconds interval. For example, you can’t schedule a cron job to run every 10 seconds. For such scenarios, it’s better to write a shell script that uses the sleep command. Still a good option to try!
  • pg_cron can run multiple jobs in parallel, but it runs at most one instance of a job at a time. If a second run is supposed to start before the first one finishes, then the second run is queued and started as soon as the first run completes.

but still a best bet …

Okay, let’s start with the installation. I am using YUM to install the package. I am doing this test on Open Source PostgreSQL 12.6.7

[root@canttowin ~]# yum install pg_cron_12.x86_64
Loaded plugins: langpacks, ulninfo

Resolving Dependencies
--> Running transaction check
---> Package pg_cron_12.x86_64 0:1.3.1-1.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
 Package                               Arch                              Version                                     Repository                         Size
=============================================================================================================================================================
Installing:
 pg_cron_12                            x86_64                            1.3.1-1.rhel7                               pgdg12                             90 k

Transaction Summary
=============================================================================================================================================================
Install  1 Package

Total download size: 90 k
Installed size: 178 k
Is this ok [y/d/N]: y
Downloading packages:
pg_cron_12-1.3.1-1.rhel7.x86_64.rpm                                                                                                   |  90 kB  00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pg_cron_12-1.3.1-1.rhel7.x86_64                                                                                                           1/1
  Verifying  : pg_cron_12-1.3.1-1.rhel7.x86_64                                                                                                           1/1

Installed:
  pg_cron_12.x86_64 0:1.3.1-1.rhel7

Complete!
You have new mail in /var/spool/mail/root
[root@canttowin ~]#

Next, I will be modifying the parameter ‘shared_preload_libraries‘ within postgresql.conf parameter file which is to specifies one or more shared libraries to be preloaded at server start. Will do a database reboot after that to make changes persistent.

[postgres@canttowin data]$
[postgres@canttowin data]$ more postgresql.conf |grep shared
#shared_preload_libraries = ''  # (change requires restart)


[postgres@canttowin data]$
[postgres@canttowin data]$ more postgresql.conf |grep shared
shared_preload_libraries = 'pg_cron'    # (change requires restart)


[postgres@canttowin bin]$
[postgres@canttowin bin]$ ./pg_ctl -D /var/lib/pgsql/12/data/ stop
waiting for server to shut down.... done
server stopped
[postgres@canttowin bin]$ ./pg_ctl -D /var/lib/pgsql/12/data/ start
waiting for server to start....2021-07-24 11:14:00.948 EDT [14074] LOG:  starting PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2021-07-24 11:14:00.948 EDT [14074] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2021-07-24 11:14:00.948 EDT [14074] LOG:  listening on IPv6 address "::", port 5432
2021-07-24 11:14:00.966 EDT [14074] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-07-24 11:14:00.982 EDT [14074] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-07-24 11:14:01.020 EDT [14074] LOG:  redirecting log output to logging collector process
2021-07-24 11:14:01.020 EDT [14074] HINT:  Future log output will appear in directory "log".
 done
server started
[postgres@canttowin bin]$

Next, if you want to use any specific database to create the extension, you have to an additional entry to configuration file with cron.database_name= ‘name of the database’

Now, you will see one more background process ‘pg_cron launcher‘ comes in to existence and can be seen running on the system.

Alright, all set. Now will create the extension on the database and will do some tests.

[postgres@canttowin bin]$
[postgres@canttowin bin]$ psql -p 5432
psql (12.6.7)
Type "help" for help.

postgres=# \conninfo
You are connected to database "postgres" as user "postgres" via socket in "/tmp" at port "5432".
postgres=#
postgres=#
postgres=#
postgres=# CREATE EXTENSION pg_cron;
CREATE EXTENSION
postgres=#
postgres=#
postgres=# GRANT USAGE ON SCHEMA cron TO postgres;
GRANT
postgres=#



-- How to check job lists.
postgres=# SELECT * FROM cron.job;
 jobid | schedule | command | nodename | nodeport | database | username | active | jobname
-------+----------+---------+----------+----------+----------+----------+--------+---------
(0 rows)

-- To schedule a VACUUM ANALYZE on each day at 11:00 AM
postgres=# SELECT cron.schedule('0 11 * * *', 'VACUUM ANALYZE');
 schedule
----------
        1
(1 row)

postgres=# SELECT * FROM cron.job;
 jobid |  schedule  |    command     | nodename  | nodeport | database | username | active | jobname
-------+------------+----------------+-----------+----------+----------+----------+--------+---------
     1 | 0 11 * * * | VACUUM ANALYZE | localhost |     5432 | postgres | postgres | t      |
(1 row)

postgres=#

postgres=#
postgres=#
postgres=# table cron.job;

 jobid |  schedule  |    command     | nodename  | nodeport | database | username | active | jobname
-------+------------+----------------+-----------+----------+----------+----------+--------+---------
     1 | 0 11 * * * | VACUUM ANALYZE | localhost |     5432 | postgres | postgres | t      |
(1 row)


Lets do some more tests, will now create a table with some test records and will try to delete them using pg_cron.

postgres=#
postgres=# CREATE TABLE article (
postgres(#     article_id bigserial primary key,
postgres(#     article_name varchar(20) NOT NULL,
postgres(#     article_desc text NOT NULL,
postgres(#     date_added timestamp default NULL
postgres(# );
CREATE TABLE
postgres=#


postgres=# insert into article (article_id, article_name, article_desc, date_added) values (1010,'vinyl','Beatles 1980 Vinyl',current_timestamp);
postgres=# insert into article (article_id, article_name, article_desc, date_added) values (1011,'Tape','Deftones 2015 Vinyl',current_timestamp);
postgres=# insert into article (article_id, article_name, article_desc, date_added) values (1019,'Tape','Deftones 2015 Vinyl',current_timestamp);



postgres=#
postgres=# select * from article;
 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------
       1010 | vinyl        | Beatles 1980 Vinyl  | 2021-07-24 12:03:30.514886
       1019 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:03:40.870081
       1011 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:03:52.046054
(3 rows)

postgres=#


postgres=# \! date
Sat Jul 24 12:20:40 EDT 2021
postgres=#



-- To delete entries those are older than 1 minute
postgres=# SELECT cron.schedule('23 12 * * *', $$DELETE FROM article WHERE date_added<now()- interval '1 Mins'$$);
 schedule
----------
       2
(1 row)

postgres=#
postgres=# SELECT * FROM cron.job;
 jobid |  schedule   |                            command                            | nodename  | nodeport | database | username | active | jobname
-------+-------------+---------------------------------------------------------------+-----------+----------+----------+----------+--------+---------
     1 | */5 * * * * | VACUUM                                                        | localhost |     5432 | postgres | postgres | t      |
     2 | 23 12 * * * | DELETE FROM article WHERE date_added<now()- interval '1 Mins' | localhost |     5432 | postgres | postgres | t      |
(2 rows)


-- This is what it is before pg_cron executing the schedule.
postgres=#
postgres=#
postgres=# select * from article;
 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------
       1010 | vinyl        | Beatles 1980 Vinyl  | 2021-07-24 12:16:00.507391
       1011 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:05.762869
       1019 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:09.960464
(3 rows)


-- Will do a watch on table records to see if pg_cron comes and delete records on specified time.
postgres=#
postgres=# \watch 5
                  Sat 24 Jul 2021 12:22:43 PM EDT (every 5s)

 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------
       1010 | vinyl        | Beatles 1980 Vinyl  | 2021-07-24 12:16:00.507391
       1011 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:05.762869
       1019 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:09.960464
(3 rows)

                  Sat 24 Jul 2021 12:22:48 PM EDT (every 5s)

 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------
       1010 | vinyl        | Beatles 1980 Vinyl  | 2021-07-24 12:16:00.507391
       1011 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:05.762869
       1019 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:09.960464
(3 rows)

                  Sat 24 Jul 2021 12:22:53 PM EDT (every 5s)

 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------
       1010 | vinyl        | Beatles 1980 Vinyl  | 2021-07-24 12:16:00.507391
       1011 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:05.762869
       1019 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:09.960464
(3 rows)

                  Sat 24 Jul 2021 12:22:58 PM EDT (every 5s)

 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------
       1010 | vinyl        | Beatles 1980 Vinyl  | 2021-07-24 12:16:00.507391
       1011 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:05.762869
       1019 | Tape         | Deftones 2015 Vinyl | 2021-07-24 12:16:09.960464
(3 rows)

                  Sat 24 Jul 2021 12:23:03 PM EDT (every 5s)

 article_id | article_name |    article_desc     |         date_added
------------+--------------+---------------------+----------------------------

Yes, it did pretty smoothly on specified time! Let’s do some more fun.

postgres=#
postgres=# CREATE OR REPLACE PROCEDURE display_message (INOUT msg TEXT)
postgres-# AS $$ BEGIN
postgres$# RAISE NOTICE 'Procedure Parameter: %', msg ;
postgres$# END ;
postgres$# $$
postgres-# LANGUAGE plpgsql ;
CREATE PROCEDURE
postgres=#
postgres=# call display_message('This is my test case');
NOTICE:  Procedure Parameter: This is my test case
         msg
----------------------
 This is my test case
(1 row)



postgres=# SELECT cron.schedule('display_message', '* * * * *', 'CALL display_message()');
 schedule
----------
       14
(1 row)

postgres=# SELECT * FROM cron.job;
 jobid |  schedule   |                            command                            | nodename  | nodeport | database | username | active |     jobname
-------+-------------+---------------------------------------------------------------+-----------+----------+----------+----------+--------+-----------------
     9 | */5 * * * * | VACUUM                                                        | localhost |     5432 | postgres | postgres | t      |
    12 | 20 12 * * * | DELETE FROM article WHERE date_added<now()- interval '1 Mins' | localhost |     5432 | postgres | postgres | t      |
    13 | 0 3 * * *   | VACUUM article                                                | localhost |     5432 | postgres | postgres | t      |
    14 | * * * * *   | CALL display_message()                                        | localhost |     5432 | postgres | postgres | t      | display_message
(4 rows)

Now, how to delete a single schedule or all/multiple.

postgres=#  SELECT cron.unschedule(2);
 unschedule
------------
 t
(1 row)


postgres=#
postgres=# SELECT cron.unschedule(jobid) FROM cron.job;
 unschedule
------------
 t
 t

(2 rows)

postgres=#
postgres=# SELECT * FROM cron.job;
 jobid | schedule | command | nodename | nodeport | database | username | active | jobname
-------+----------+---------+----------+----------+----------+----------+--------+---------
(0 rows)

postgres=#

You can check database log files (postgresql-<day of the week>.log) to see if all okay with the jobs. If any of the scheduled job failed to execute, it will log more details to the log file, example, see below screenshot where one of the job with ID 14 trying to start but failed due to ‘connection refused‘ error. So, here we have got a hint to check pg_hba file and accordingly take the action to mitigate the problem.

After we applied the change …

Hope It Helped!
Prashant Dixit

Advertisement

2 Responses to “pg_cron : Probably the best way to schedule jobs within PostgreSQL database.”

  1. RekGRpth said

    Consider https://github.com/RekGRpth/pg_task

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

 
%d bloggers like this: