PostgreSQL window functions crash course

Guide to Using Window Functions in PostgreSQL

Introduction

According to PostgreSQL documentation

Window functions provide the ability to perform calculations across sets of rows that are related to the current query row.

Window functions in PostgreSQL are a powerful tool that can significantly enhance your data analysis capabilities. They can perform calculations across a set of related rows, in a way traditional SQL queries can't. In this blog post, I'll walk you through the concept of window functions, their syntax, and several examples to better understand their practical use cases.

The primary goal of this post is that the reader after reading the post and trying all the examples will understand how to use PostgreSQL window functions.

Please also use the accompanying GitHub repository.

To follow the crash course I highly encourage you to try the examples.

Clone the github repository

First, clone the repository

got clone https://github.com/gosom/postgres-window-functions.git

Then enter the folder

cd postgres-window-functions

A docker-compose.yaml file is provided that will spin a PostgreSQL docker container that includes the example tables and the data.

To spin the container: docker-compose up -d

You can connect to the database from the command line using:

make psql

Alternatively, use your favorite PostgreSQL client using the below connection details.

Hostname: localhost
Port: 5432
Username: postgres
Password: postgres
Database: postgres

The Dataset

We'll use a hypothetical dataset from a library system, specifically a read_log table, which tracks the reading habits of various individuals:

CREATE TABLE read_log (
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    reader_name TEXT,
    book_title TEXT,
    pages INT,
    read_date DATE
);
  • id: is an auto-increment integer which is unique per row

  • reader_name: is the name of the reader

  • book_title: is the title of the book

  • read_date: is the date that the reader finished the book

Now, let's populate our read_log table with sample data:

INSERT INTO read_log (reader_name, book_title, pages, read_date) VALUES 
('Giorgos', 'The Iliad', 711, '2023-01-01'),
('Giorgos', 'The Odyssey', 800, '2023-02-15'),
('Giorgos', 'The Republic', 400, '2023-03-30'),
('Emmanouela', 'Antigone', 150, '2023-01-20'),
('Emmanouela', 'Oedipus Rex', 100, '2023-02-26'),
('Emmanouela', 'The Symposium', 200, '2023-04-10'),
('Eleni', 'The Histories', 900, '2023-02-01'),
('Eleni', 'Works and Days', 150, '2023-03-15'),
('Eleni', 'Prometheus Bound', 80, '2023-04-25'),
('Eleni', 'Metamorphoses', 480, '2023-02-15'),
('Konstantina', 'The Iliad', 711, '2023-01-22'),
('Konstantina', 'The Odyssey', 800, '2023-03-18'),
('Konstantina', 'The Symposium', 200, '2023-04-30'),
('Andreas', 'Antigone', 150, '2023-01-30'),
('Andreas', 'Oedipus Rex', 100, '2023-02-20'),
('Andreas', 'The Republic', 400, '2023-04-18'),
('Andreas', 'Metamorphoses', 480, '2023-05-08');

Understanding Window Functions

A window function performs a calculation across a set of table rows that are related to the current row. It's like an advanced version of an aggregation function (like SUM(), COUNT(), etc.), but instead of collapsing all the rows into a single output row, it maintains the separate rows.

Syntax

window_function (expression) OVER (
    [PARTITION BY partition_expression]
    [ORDER BY sort_expression]
    [frame_clause]
)

Let's break this down:

  • window function (expression) : this part is similar to regular SQL function calls. The expression is the subject of the calculation. For example, if we are using the SUM() window function, the expression would be the column we want to sum.

  • OVER(...): A window function will always include an OVER clause. It defines the 'window' or set of rows the function operates on.

  • PARTITION BY partition_expression: This is used to break the data into smaller partitions. The window function is applied within each of these partitions. If it's not specified, the window function treats all rows of the result set as a single partition.

  • ORDER BY sort_expression: This clause determines the order in which rows are processed by the window function. Rows are ordered according to this expression before the function is applied.

  • frame_clause: This clause further refines the window within a partition that the window function operates on.
    This is a more complex part, that I will cover later. For now, focus understanding the basic syntax and capabilities of window functions.

Let's see the syntax in action:

Let's suppose that we want to get all the rows from the read_log table but we additionally want in each row the tp, which is the total number of pages for all readers.

Without window functions:

One way to do it:

WITH cte AS (
    SELECT 
        sum(pages) as tp 
    FROM 
        read_log
)
SELECT 
    read_log.*, cte.tp 
FROM 
    read_log, cte;

Result set:

  id | reader_name |    book_title    | pages | read_date  |  tp  
----+-------------+------------------+-------+------------+------
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 6812
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 6812
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 6812
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 | 6812
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 | 6812
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 | 6812
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 6812
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 6812
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 6812
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 6812
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 6812
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 6812
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 6812
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 6812
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 6812
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 6812
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 6812

Using Window Function

SELECT 
    *, sum(tp) OVER() 
FROM 
    read_log;

Notice, here that we are using an empty OVER clause, we want our window function to operate in all the rows.

Result set:

 id | reader_name |    book_title    | pages | read_date  | tp  
----+-------------+------------------+-------+------------+------
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 6812
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 6812
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 6812
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 | 6812
  5 | Emmaounela  | Oedipus Rex      |   100 | 2023-02-26 | 6812
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 | 6812
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 6812
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 6812
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 6812
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 6812
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 6812
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 6812
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 6812
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 6812
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 6812
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 6812
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 6812

Our requirement has changed and now we need the tp to be the total pages by each reader_name

Without Window functions

WITH cte AS (
    SELECT 
    reader_name, sum(pages) AS tp 
    FROM 
        read_log 
    GROUP BY 
        reader_name
)
SELECT 
    read_log.*, cte.tp 
FROM 
    read_log, cte 
WHERE 
    read_log.reader_name = cte.reader_name;

Result Set:

 id | reader_name |    book_title    | pages | read_date  |  tp  
----+-------------+------------------+-------+------------+------
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 1911
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1911
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  450
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  450
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 1610
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1610
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1610
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 1711
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1711
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 1130
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 1130
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 1130
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130

With window functions

SELECT
    *, sum(pages) OVER(PARTITION BY reader_name) AS tp
FROM
    read_log;

Notice, here that we are using PARTITION BY in the OVER clause, since we want to calculate the SUM of pages per reader_name.

Result Set:

 id | reader_name |    book_title    | pages | read_date  |  tp  
----+-------------+------------------+-------+------------+------
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 1130
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 1130
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 1130
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 1610
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1610
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1610
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  450
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  450
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 1911
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1911
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1711
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 1711
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711

Note: Notice that the results are in different order. This is not an issue since we didn't have any order requirements.

Let's now change our requirement and we want the running total of pages read by each reader, ordered by the date when they finished each book.
So, for example for reader
Andreas we need an output like:

14 | Andreas | Antigone | 150 | 2023-01-30 | 150 
15 | Andreas | Oedipus Rex | 100 | 2023-02-20 | 250 
16 | Andreas | The Republic | 400 | 2023-04-18 | 650 
17 | Andreas | Metamorphoses | 480 | 2023-05-08 | 1130

Without window functions this is not so trivial:

WITH A AS (                                      
    SELECT                                                                   
        read_date,
        reader_name,
        SUM(pages) AS tp
    FROM(
        SELECT
            t1.read_date,
            t1.reader_name,
            t2.pages
        FROM 
            read_log t1
        JOIN 
            read_log t2
        ON 
            t1.reader_name = t2.reader_name
        AND t1.read_date >= t2.read_date
    ) t3
    GROUP BY
        read_date,
        reader_name
)
SELECT 
    read_log.*,
    A.tp
FROM 
    read_log
JOIN A ON 
    read_log.read_date = A.read_date
    AND read_log.reader_name = A.reader_name
;

Result Set

 id | reader_name |    book_title    | pages | read_date  |  tp  
----+-------------+------------------+-------+------------+------
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 |  711
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1511
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  150
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  250
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  7 | Eleni       | The Histories    |   900 | 2023-02-01 |  900
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1530
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1380
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 |  711
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1511
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711
 14 | Andreas     | Antigone         |   150 | 2023-01-30 |  150
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 |  250
 16 | Andreas     | The Republic     |   400 | 2023-04-18 |  650
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130

Now using Window Function

SELECT 
    *, sum(pages) OVER (PARTITION BY reader_name ORDER BY read_date) AS tp
FROM
    read_log
;

Result Set

 id | reader_name |    book_title    | pages | read_date  |  tp  
----+-------------+------------------+-------+------------+------
 14 | Andreas     | Antigone         |   150 | 2023-01-30 |  150
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 |  250
 16 | Andreas     | The Republic     |   400 | 2023-04-18 |  650
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130
  7 | Eleni       | The Histories    |   900 | 2023-02-01 |  900
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1380
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1530
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  150
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  250
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 |  711
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1511
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 |  711
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1511
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711

Please pay attention how the window function behaves differently comparing to the behavior without the ORDER BY in the previous example.
Rows are ordered according to this expression before the function is applied.

Now we want to know the moving average of pages each user reads.

Example output for reader_name Giorgos

Without window functions

WITH A AS (                                      
    SELECT                                                                   
        read_date,
        reader_name,
        AVG(pages) AS moving_avg
    FROM(
        SELECT
            t1.read_date,
            t1.reader_name,
            t2.pages
        FROM 
            read_log t1
        JOIN 
            read_log t2
        ON 
            t1.reader_name = t2.reader_name
        AND t1.read_date >= t2.read_date
    ) t3
    GROUP BY
        read_date,
        reader_name
)
SELECT 
    read_log.reader_name,
    read_log.read_date,
    A.moving_avg
FROM 
    read_log
JOIN A ON 
    read_log.read_date = A.read_date
    AND read_log.reader_name = A.reader_name
;

Result set:

 reader_name | read_date  |      moving_avg      
-------------+------------+----------------------
 Giorgos     | 2023-01-01 | 711.0000000000000000
 Giorgos     | 2023-02-15 | 755.5000000000000000
 Giorgos     | 2023-03-30 | 637.0000000000000000
 Emmanouela  | 2023-01-20 | 150.0000000000000000
 Emmanouela  | 2023-02-26 | 125.0000000000000000
 Emmanouela  | 2023-04-10 | 150.0000000000000000
 Eleni       | 2023-02-01 | 900.0000000000000000
 Eleni       | 2023-03-15 | 510.0000000000000000
 Eleni       | 2023-04-25 | 402.5000000000000000
 Eleni       | 2023-02-15 | 690.0000000000000000
 Konstantina | 2023-01-22 | 711.0000000000000000
 Konstantina | 2023-03-18 | 755.5000000000000000
 Konstantina | 2023-04-30 | 570.3333333333333333
 Andreas     | 2023-01-30 | 150.0000000000000000
 Andreas     | 2023-02-20 | 125.0000000000000000
 Andreas     | 2023-04-18 | 216.6666666666666667
 Andreas     | 2023-05-08 | 282.5000000000000000

Let's find the average pages read by each reader, considering only the current book and one book prior, so for reader Giorgos the first value should be 711, the second (711+800)/2 = 755, the third (800+400)/2=600

Without window functions

I cannot think of an obvious way to do that. I am pretty sure that this is possible although.

Using window functions

If you see the window functions syntax there is the frame_clause. It can be very helpful when you need more control over the rows that are being considered by the window function. We are going to utilize that here.

SELECT 
    reader_name, 
    read_date,
    AVG(pages) OVER(PARTITION BY reader_name ORDER BY read_date ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as moving_avg
FROM 
    read_log
;

Notice now that we use a frame_clause .

ROWS BETWEEN 1 PRECEDING AND CURRENT ROW

which further refines the 'window' that the window functions operate on.
In our case, it controls the number of rows included in the calculation.

A frame_clause can be RANGE, ROWS, GROUPS . Here we used ROWS . I am not going to explain further the frame_clause.
You can search online or read the documentation for further examples.

Result Set

 reader_name | read_date  |      moving_avg      
-------------+------------+----------------------
 Andreas     | 2023-01-30 | 150.0000000000000000
 Andreas     | 2023-02-20 | 125.0000000000000000
 Andreas     | 2023-04-18 | 250.0000000000000000
 Andreas     | 2023-05-08 | 440.0000000000000000
 Eleni       | 2023-02-01 | 900.0000000000000000
 Eleni       | 2023-02-15 | 690.0000000000000000
 Eleni       | 2023-03-15 | 315.0000000000000000
 Eleni       | 2023-04-25 | 115.0000000000000000
 Emmanouela  | 2023-01-20 | 150.0000000000000000
 Emmanouela  | 2023-02-26 | 125.0000000000000000
 Emmanouela  | 2023-04-10 | 150.0000000000000000
 Giorgos     | 2023-01-01 | 711.0000000000000000
 Giorgos     | 2023-02-15 | 755.5000000000000000
 Giorgos     | 2023-03-30 | 600.0000000000000000
 Konstantina | 2023-01-22 | 711.0000000000000000
 Konstantina | 2023-03-18 | 755.5000000000000000
 Konstantina | 2023-04-30 | 500.0000000000000000

Window Functions in Action

Now that we understand the syntax and concept of window functions, let's take a look at some examples using different window functions on our read_log table.

Ranking Functions

Ranking functions are a category of functions that assign a unique rank to each row within the window partition. This is is useful when you want to find top-N or bottom-N rows.

RANK()

Rank() function assigns a unique rank to each distinct row.

We want to rank the readers by the total number of pages they have read:

SELECT 
    reader_name, 
    SUM(pages) as total_pages, 
    RANK() OVER(ORDER BY SUM(pages) DESC) as rank
FROM 
    read_log
GROUP BY 
reader_name
;

Result Set:

 reader_name | total_pages | rank 
-------------+-------------+------
 Giorgos     |        1911 |    1
 Konstantina |        1711 |    2
 Eleni       |        1610 |    3
 Andreas     |        1130 |    4
 Emmanouela  |         450 |    5

Now I would like you to insert this into our database

INSERT INTO read_log
(reader_name, book_title, pages, read_date) 
VALUES
('Konstantina', 'The Orestia', 200, '2023-05-15');

We would like readers Giorgos and Konstatina to have read the same amount of pages.

Run the query again:

SELECT 
    reader_name, 
    SUM(pages) as total_pages, 
    RANK() OVER(ORDER BY SUM(pages) DESC) as rank
FROM 
    read_log
GROUP BY 
reader_name
;

Notice the result now:

 reader_name | total_pages | rank 
-------------+-------------+------
 Giorgos     |        1911 |    1
 Konstantina |        1911 |    1
 Eleni       |        1610 |    3
 Andreas     |        1130 |    4
 Emmanouela  |         450 |    5

Giorgos and Konstantina have the same rank but there is gap in rank column (it goes from 1 to 3).

DENSE_RANK()

DENSE_RANK() works similarly to RANK() , but it doesn't leave gaps between groups of duplicate values.

SELECT 
    reader_name, 
    SUM(pages) as total_pages, 
    DENSE_RANK() OVER(ORDER BY SUM(pages) DESC) as rank
FROM 
    read_log
GROUP BY 
reader_name
;

Result:

 reader_name | total_pages | rank 
-------------+-------------+------
 Giorgos     |        1911 |    1
 Konstantina |        1911 |    1
 Eleni       |        1610 |    2
 Andreas     |        1130 |    3
 Emmanouela  |         450 |    4

Notice that now there is no gap and reader Eleni has rank 2.

ROW_NUMBER()

ROW_NUMBER() assigns a unique number to each row in the results. This function is useful when you need a unique value for each row belonging to the partition.

SELECT 
    reader_name, 
    book_title, 
    read_date,
    ROW_NUMBER() OVER(PARTITION BY reader_name ORDER BY read_date) as read_order
FROM read_log;

This query will give a unique row number to each book read by the same reader, ordered by the date they finished each book.

Aggregate Functions

I have already shown you a couple of examples of aggregate functions SUM() and AVG() .

SUM()

SUM() can be useful to calculate running totals.

AVG()

AVG() can be useful to calculate moving averages.

You can use any builtin aggregate function as a window function.

General Purpose Window Functions

FIRST VALUE() and LAST_VALUE()

FIRST_VALUE() and LAST_VALUE() are window functions that return the first and last value of an ordered set of values. Let's see which book each reader started and ended with:

SELECT 
    DISTINCT ON (reader_name) reader_name, 
    FIRST_VALUE(book_title) OVER(PARTITION BY reader_name ORDER BY read_date) as first_book,
    LAST_VALUE(book_title) OVER(PARTITION BY reader_name 
                                ORDER BY read_date 
                                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_book

Note from the docs:

Note that first_value, last_value, and nth_value consider only the rows within the “window frame”, which by default contains the rows from the start of the partition through the last peer of the current row. This is likely to give unhelpful results for last_value and sometimes also nth_value. You can redefine the frame by adding a suitable frame specification (RANGE, ROWS or GROUPS) to the OVER clause. See Section 4.2.8 for more information about frame specifications.

...

To obtain aggregation over the whole partition, omit ORDER BY or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

The ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause ensures that the window frame includes all rows in the partition.

Based on the docs we can also write the query like:

SELECT 
    DISTINCT ON (reader_name) reader_name, 
    FIRST_VALUE(book_title) OVER(PARTITION BY reader_name ORDER BY read_date) as first_book,
    LAST_VALUE(book_title) OVER(PARTITION BY reader_name )
FROM read_log;

Result:

 reader_name |  first_book   |    last_book     
-------------+---------------+------------------
 Andreas     | Antigone      | Metamorphoses
 Eleni       | The Histories | Prometheus Bound
 Emmanouela  | Antigone      | The Symposium
 Giorgos     | The Iliad     | The Republic
 Konstantina | The Iliad     | The Orestia

LAG() and LEAD()

LAG() and LEAD() functions allow you to fetch the value of a previous row or following row within your data partition.

For example, you can use LAG() to get the book each reader read before the current one:

SELECT 
    reader_name, 
    book_title, 
    LAG(book_title) OVER(PARTITION BY reader_name ORDER BY read_date) as previous_book
FROM 
read_log;

Result:

 reader_name |    book_title    | previous_book  
-------------+------------------+----------------
 Andreas     | Antigone         | 
 Andreas     | Oedipus Rex      | Antigone
 Andreas     | The Republic     | Oedipus Rex
 Andreas     | Metamorphoses    | The Republic
 Eleni       | The Histories    | 
 Eleni       | Metamorphoses    | The Histories
 Eleni       | Works and Days   | Metamorphoses
 Eleni       | Prometheus Bound | Works and Days
 Emmanouela  | Antigone         | 
 Emmanouela  | Oedipus Rex      | Antigone
 Emmanouela  | The Symposium    | Oedipus Rex
 Giorgos     | The Iliad        | 
 Giorgos     | The Odyssey      | The Iliad
 Giorgos     | The Republic     | The Odyssey
 Konstantina | The Iliad        | 
 Konstantina | The Odyssey      | The Iliad
 Konstantina | The Symposium    | The Odyssey
 Konstantina | The Orestia      | The Symposium

You can find a list of all the builtin window functions in the docs

Conclusion

In this blog post, we covered the basics of PostgreSQL window functions. I showed you the basic syntax of window functions and how you can use them to rank data, keep running totals, find averages and check data around each row

In order to master window functions the key is practice. Get the test dataset or create your dataset, look up the documentation and try them with different examples.

References