Chapter 22

The Quantile Function

“There is flattery in friendship.”

– William Shakespeare

Table of Contents Chapter 22 – The Quantile Function

The Quantile Function and Syntax

A Quantile Example

A Quantile Example using DESC Mode

QUALIFY to find Products in the top Partitions

QUALIFY to find Products in the top Partitions Sorted DESC

QUALIFY to find Products in the top Partitions Sorted ASC

QUALIFY to find Products in top Partitions with Tiebreaker

Using Tertiles (Partitions of Four)

How Quantile Works

The Quantile Function and Syntax

The syntax for the Quantile function

QUANTILE (<partitions>, <column-name> ,<sort-key> [DESC | ASC])
[QUALIFY QUANTILE (<column-name>) {< | > | = | <= | >=} <number-of-rows>]

A Quantile is used to divide rows into a number of categories or grouping of roughly the same number of rows in each group. The percentile is the QUANTILE most commonly used in business. This means that the request is based on a value of 100 for the number of partitions. It is also possible to have quartiles (based on 4), tertiles (based on 3) and deciles (based on 10).

By default, both the QUANTILE column and the QUANTILE value itself will be output in ascending sequence. As in some cases, the ORDER BY clause may be used to reorder the output for display. Here the order of the output does not change the meaning of the output, unlike a summation where the values are being added together and all need to appear in the proper sequence.

A Quantile Example

This example determines the percentile for every row in the Sales table based on the daily sales amount and sorts it into sequence by the value being categorized, which here is daily sales.

SELECT Product_ID, Sale_Date, Daily_Sales
,QUANTILE (100, Daily_Sales ) AS “Quantile”
FROM Sales_Table
WHERE Product_ID < 3000
AND Sale_Date > 1000930 ;

images

Notice that the amount of 32800.50 in the first two rows has the same percentile value. They are the same value and will therefore be put into the same partition.

A Quantile Example using DESC Mode

SELECT Product_ID ,Sale_Date ,Daily_Sales
               ,QUANTILE (100, Daily_Sales , Sale_Date DESC ) AS “Quantile”
FROM Sales_Table WHERE Product_ID < 3000 AND Sale_Date >= 1000930 ;

images

Notice the first two rows. This is because the Sale date DESC, impacts the first two rows. Why? Since these rows have the same value, it uses the Sale_Date column as a tiebreaker for the sequencing and makes them different from each other. Hence, they are assigned to different values in different partitions.

QUALIFY to find Products in the top Partitions

This example uses a QUALIFY to show only products that sell in the top 60 Percentile

SELECT Product_ID, Sale_Date, Daily_Sales
       ,QUANTILE (100, Daily_Sales, Sale_Date ) as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 60 ;

images

Like the aggregate functions, OLAP functions must read all required rows before performing their operation. Therefore, the WHERE clause cannot be used. Where the aggregates use HAVING, the OLAP functions uses QUALIFY. The QUALIFY evaluates the result to determine which ones to return.

QUALIFY to find Products in the top Partitions Sorted DESC

SELECT Product_ID, Sale_Date, Daily_Sales
      ,QUANTILE (100, Daily_Sales, Sale_Date ) as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 70
ORDER BY "percentile" DESC ;

images

The ORDER BY changes the sequence of the rows being listed, not the meaning of the percentile. The above functions both determined that the highest number in the column is the highest percentile. The data value sequence ascends as the percentile ascends or descends as the percentile descends. When the sort in the QUANTILE function is changed to ASC the data value sequence changes to ascend as the percentile descends. The sequence of the percentile does not change, but the data value sequence is changed to ascend (ASC) instead of the default, which is to descend (DESC).

QUALIFY to find Products in the top Partitions Sorted ASC

SELECT Product_ID ,Sale_Date, Daily_Sales
       ,QUANTILE (100, Daily_Sales ASC, Sale_Date) as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >=70 ;

images

The example SELECT above uses the ASC to cause the data values to go contradictory to the percentile.

QUALIFY to find Products in top Partitions with Tiebreaker

SELECT Product_ID, Sale_Date, Daily_Sales
      ,QUANTILE (100, Daily_Sales ASC, Sale_Date ASC) as "Percentile"
FROM Sales_Table
QUALIFY "Percentile" >= 70 ;

images

The next SELECT modifies the above query to incorporate the sale date as a tiebreaker and reverse the ordering for the two rows with sales of $32,800.50.

Using Tertiles (Partitions of Four)

SELECT Product_ID, Sale_Date, Daily_Sales
      ,QUANTILE (4, Daily_Sales , Sale_Date ) AS "Quartiles"
FROM Sales_Table WHERE Product_ID in (1000, 2000) ;

images

Instead of 100 the example above uses a quartile (QUANTILE based on 4 partitions).

How Quantile Works

SELECT Product_ID, Sale_Date, Daily_Sales
      ,QUANTILE (4, Daily_Sales , Sale_Date ) AS "Quartiles"
FROM Sales_Table WHERE Product_ID = 1000;

images

Assigning a different value to the <partitions> indicator of the QUANTILE function changes the number of partitions established. Each Quantile partition is assigned a number starting at 0 increasing to a value that is one less than the partition number specified. So, with a Quantile of 4 the partitions are 0 through 3 and for 10, the partitions are assigned 0 through 9. Then, all the rows are distributed as evenly as possible into each partition from highest to lowest values. Normally, extra rows with the lowest value begin back in the lowest numbered partitions.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset