To demonstrate how the optimizer works, I have compiled an example. It is something that has been used by me over the years for PostgreSQL training. Let's assume that there are three tables, as follows:
CREATE TABLE a (aid int, ...); -- 100 million rows CREATE TABLE b (bid int, ...); -- 200 million rows CREATE TABLE c (cid int, ...); -- 300 million rows
Let's further assume that those tables contain millions, or maybe hundreds of millions, of rows. In addition to that, there are indexes:
CREATE INDEX idx_a ON a (aid); CREATE INDEX idx_b ON b (bid); CREATE INDEX idx_c ON c (cid);
CREATE VIEW v AS SELECT *
FROM a, b
WHERE aid = bid;
Finally, there is a view joining the first two tables together.
Let's suppose now that the end user wants to run the following query. What will the optimizer do with this query? What choices does the planner have?
SELECT *
FROM v, c
WHERE v.aid = c.cid
AND cid = 4;
Before looking at the real optimization process, we will focus on some of the options that the planner has.