Using sliding windows

So far the window we have used inside our query has been static. However, for calculations such as a moving average, this is not enough. A moving average needs a sliding window that moves along as data is processed.

Here is an example of how a moving average can be achieved:

test=# SELECT country, year, production,  
        min(production) 
OVER (PARTITION BY country

ORDER BY year ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) FROM t_oil WHERE year BETWEEN 1978 AND 1983 AND country IN ('Iran', 'Oman'); country | year | production | min ---------+-------+------------+------ Iran | 1978 | 5302 | 3218 Iran | 1979 | 3218 | 1479 Iran | 1980 | 1479 | 1321 Iran | 1981 | 1321 | 1321 Iran | 1982 | 2397 | 1321 Iran | 1983 | 2454 | 2397 Oman | 1978 | 314 | 295 Oman | 1979 | 295 | 285 Oman | 1980 | 285 | 285 Oman | 1981 | 330 | 285 Oman | 1982 | 338 | 330 Oman | 1983 | 391 | 338 (12 rows)

The most important thing is that a moving window should be used with an ORDER BY clause. Otherwise, there will be major problems. PostgreSQL would actually accept the query but the result would be totally wrong. Remember, feeding data to a sliding window without ordering it first will simply lead to random data.

ROWS BETWEEN 1 PRECEDING and 1 FOLLOWING defines the window. In this example, up to three rows will be in use: the current row, the one before, and the one after the current row. To illustrate how the sliding window works, check out the following example:

test=# SELECT *, array_agg(id)
OVER (ORDER BY id ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) FROM generate_series(1, 5) AS id;
id | array_agg ----+----------- 1 | {1,2} 2 | {1,2,3} 3 | {2,3,4} 4 | {3,4,5} 5 | {4,5} (5 rows)

The array_agg function will turn a list of values into a PostgreSQL array. It will help to explain how the sliding window operates.

Actually, this trivial query has some very important aspects. What you see is that the first array contains only two values. There is no entry before 1 and therefore the array is not full. PostgreSQL does not add null entries because they would be ignored by aggregates anyway. The same happens at the end of the data.

However, sliding windows offer more. There are a couple of keywords that can be used to specify the sliding window:

test=# SELECT *, 
array_agg(id) OVER (ORDER BY id ROWS BETWEEN
UNBOUNDED PRECEDING AND 0 FOLLOWING)
FROM generate_series(1, 5) AS id;

id | array_agg ----+------------- 1 | {1} 2 | {1,2} 3 | {1,2,3} 4 | {1,2,3,4} 5 | {1,2,3,4,5} (5 rows)

The UNBOUNDED PRECEDING keywords mean that everything before the current line will be in the window. The counterpart to UNBOUNDED PRECEDING is UNBOUNDED FOLLOWING. Let us look at the following example:

test=# SELECT *, 
array_agg(id) OVER (ORDER BY id ROWS BETWEEN 2 FOLLOWING AND UNBOUNDED FOLLOWING)
FROM generate_series(1, 5) AS id;

id | array_agg ----+----------- 1 | {3,4,5} 2 | {4,5} 3 | {5} 4 | 5 | (5 rows)

As you can see, it is also possible to use a window that is in the future. PostreSQL is very flexible here.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset