Chapter 5 – Partition Primary Index (PPI) Tables Subquery

CREATE TABLE Emp_Intl
(Emp_No         INTEGER,
  Dept_No         SMALLINT,
  First_Name     VARCHAR(12),
  Last_Name      CHAR(20),
  Salary             DECIMAL(10,2))
UNIQUE Primary Index( Emp_No)

images

A Unique Primary Index(UPI) spreads the rows of a table evenly across the AMPs.

Primary Index in the WHERE Clause – Single-AMP Retrieve

images

Use the Primary Index column in your SQL WHERE clause and only 1-AMP retrieves.

Review of A Non-Unique Primary Index (NUPI)

CREATE TABLE Emp_Intl
(Emp_No         INTEGER,
  Dept_No         SMALLINT,
  First_Name     VARCHAR(12),
  Last_Name     CHAR(20),
  Salary             DECIMAL(10,2))
  Primary Index( Dept_No)

images

A Non-Unique Primary Index(NUPI) will have duplicates grouped together on the same AMP so data will always be skewed (uneven). The above skew is reasonable and ok.

Primary Index in the WHERE Clause – Single-AMP Retrieve

images

Use the Primary Index column in your SQL WHERE clause and only 1-AMP retrieves.

Review of a Multi-Column Primary Index

images

A table can have only one Primary Index, but you can combine up to 64 columns together max to form one Multi-Column Primary Index.

Primary Index in the WHERE Clause – Single-AMP Retrieve

images

Use the Primary Index column in your SQL WHERE clause and only 1-AMP retrieves.

Creating a PPI Table with Simple Partitioning

images

A PPI Table is a table with a Partition on it. A PPI table has the AMPs sort (order ) the rows on the table by the Partition. This allows for people to avoid Full Table Scans. This is an example of Simple PPI. Each AMP will sort the rows they own by Dept_No.

A Visual Display of Simple Partitioning

images

An SQL Example that explains Simple Partitioning

SQL Query

SELECT *
FROM Employee_Table_PPI
WHERE Dept_No = 300 ;

PE's Explain Plan

We do an All-AMPs
retrieve by way of a
Single Partition.

The table below was Partitioned by Dept_No using Simple Partitioning. Each AMP merely reads from partition 3 which holds all Dept_No 300 employees.

images

Creating a PPI Table with RANGE_N Partitioning per Day

images

The above syntax represents a RANGE_N partition.

A Visual of Range_N Partitioning Per Day

images

There would be 365 partitions in total for an entire year of Partitioning per Day. Each AMP holds the orders assigned to them in daily partitions on the day of the order date.

An SQL Example that explains Range_N Partitioning per Day

SQL Query

SELECT *
FROM Order_Table_PPI
WHERE Order_Date BETWEEN
‘2012-01-01’ and ‘2012-01-02’ ;

PE's Explain Plan

We do an All-AMPs
retrieve by way of
2 Partitions.

images

Each AMP holds the orders assigned to them in daily partitions on the day of the order .

Creating a PPI Table with RANGE_N Partitioning per Week

images

You can order everything by day. You can also order everything by week, my month, and by year. You have many options to choose from.

A Visual of Range_N Partitioning Per Week

images

There would be 52 partitions in total for an entire year of Partitioning per Week. Each AMP holds the orders assigned to them in weekly partitions on the week of the order.

An SQL Example that explains Range_N Partitioning per Day

SQL Query

SELECT *
FROM Order_Table_PPI
WHERE Order_Date BETWEEN
‘2012-01-01’ and ‘2012-01-07’ ;

PE's Explain Plan

We do an All-AMPs
retrieve by way of
a single Partition.

images

Each AMP holds the orders assigned to them in daily partitions on the day of the order .

Creating a PPI Table with RANGE_N Partitioning per Month

images

You can order everything by day. You can also order everything by week, by month, and by year. You have many options to choose from.

A Visual of One Year of Data with Range_N Per Month

images

Each AMP sorts their rows by Month (of Order_Date).

An SQL Example explaining Range_N Partitioning per Month

SQL Query

SELECT *
FROM Order_Table
WHERE Order_Date BETWEEN
‘2012-01-01’ and ‘2012-03-31’ ;

PE's Explain Plan

We do an All-AMPs
retrieve by way of
3 Partition3.

images

The above query wants to see orders in the first quarter so each AMP reads 3 partitions.

Creating a PPI Table with CASE_N

images

The above syntax represents the CASE_ N partitioning. If an Order_Total is < 1000 it will go into Partition 1. If an Order_Total is between 1000 and 4999.99 it will go in Partition 2. The NO Case partition is for anything falling through and the UNKNOWN partition is for NULL values in the Total.

A Visual of Case_N Partitioning

PARTITION BY CASE_N
     (Order_Total < 1000,
     Order_Total < 5000,
     Order_Total < 10000,
     Order_Total < 20000,
     NO Case, UNKNOWN);

images

Three are six partitions for this table. Orders are placed in partitions based on cost.

An SQL Example that explains Range_N Partitioning per Day

SQL Query

SELECT *
FROM Order_Table_CPPI
WHERE Order_Total > 20000.00

PE's Explain Plan

We do an All-AMPs
retrieve by way of
a Single Partition.

images

All-AMPs retrieve but each only reads partition 5 which is a mere sliver of the disk.

How many partitions do you see?

CREATE TABLE Order_Table_PPI
(Order_Number   INTEGER Not Null
,Customer_Number INTEGER
,Order_Date     DATE
,Order_Total    Decimal (10,2)
) PRIMARY INDEX(Order_Number)
  PARTITION BY RANGE_N (Order_Date
   BETWEEN date '2010-01-01'
     AND date '2010-12-31'
       EACH INTERVAL '1' DAY);

How many partitions will Order_Table_PPI have at the year's end? It will have 365, unless it is leap year and then it will have 366.

Number of PPI Partitions Allowed

CREATE TABLE Order_Table_PPI
(Order_Number     INTEGER Not Null
,Customer_Number     INTEGER
,Order_Date     DATE
,Order_Total     Decimal (10,2)
) PRIMARY INDEX(Order_Number)
   PARTITION BY RANGE_N (Order_Date
     BETWEEN date '2010-01-01'
       AND date '2010-12-31'
         EACH INTERVAL '1' DAY);

Each AMP can have 65,535 partitions per table, but after Teradata V14 each AMP can have 9.223 Quintillion partitions per table.

Teradata Tables can have 65,535 partitions max, but after V14 it is 9.223 Quintillion.

How many partitions do you see?

CREATE TABLE Order_Table_CPPI
(Order_Number   INTEGER Not Null
,Customer_Number INTEGER
,Order_Date      DATE
,Order_Total     Decimal (10,2))
PRIMARY INDEX(Order_Number)
    PARTITION BY CASE_N
     (Order_Total < 1000,
     Order_Total < 5000,
     Order_Total < 10000,
     Order_Total < 20000,
     NO Case, UNKNOWN);

How many partitions do you see in the above example?

There are six partitions in the above example. Partition 1 is for Order_Totals < 1000 and Partition 2 is for Order_Totals < 5000, etc. The NO Case partition is for anything falling through the CASE statement, like an ELSE so it is for any Order_Total 20,000 or greater. The UNKNOWN is for Order_Totals that are Null.

NO CASE and UNKNOWN Partitions Together

CREATE TABLE Order_Table_CPPI
(Order_Number     INTEGER Not Null
,Customer_Number  INTEGER
,Order_Date       DATE
,Order_Total      Decimal (10,2)
)
PRIMARY INDEX(Order_Number)
    PARTITION BY CASE_N
     (Order_Total < 1000,
     Order_Total < 5000,
     Order_Total < 10000,
     Order_Total < 20000,
     NO Case OR UNKNOWN);

How many partitions do you see in the above example?

There are five partitions in the above example. Partition 1 is for Order-Totals < 1000 and Partition 2 is for Order_Totals < 5000, etc. The NO Case partition and the UNKNOWN partition are combined. So now any Order_Total that is 20,000 or greater or any Order_Total that is NULL will be in the 5th partition together.

A Visual of Case_N Partitioning

PARTITION BY CASE_N
   (Order_Total < 1000,
   Order_Total < 5000,
   Order_Total < 10000,
   Order_Total < 20000,
NO CASE OR UNKNOWN);

images

Three are five partitions for this table because we combined the NO CASE and UNKNOWN partitions together. Orders are placed in partitions based on cost.

Combining Older Data and Newer Data in PPI

images

The above example shows Older Data Partitioned by Month and Newer Data by Day.

A Visual for Combining Older Data and Newer Data in PPI

images

This example combines 5-years of monthly partitions with 1-year of daily partitions.

The SQL on Combining Older Data and Newer Data in PPI

SELECT * FROM Order_Table
WHERE Order_Date BETWEEN‘2010-01-01’ and ‘2010-01-07’ ;

We do an
All-AMPs
retrieve by way
of 7 Partitions.

images

Multi-Level Partitioning Combining Range_N and Case_N

images

This type of partitioning is called a MULTI-LEVEL PPI. It is a Partition within a Partition. The top partition is the only one you can ALTER so keep that in mind. With Multi-Level partitioning you can combine up to 15 Case_N or Range_N partitions within partitions. NO Simple Partitioning can be used in Multi-Level Partitioning.

A Visual of Multi-Level Partitioning

images

The SQL on a Multi-Level Partitioned Primary Index

SELECT * FROM Order_Table_MLPPI
WHERE Order_Date = ‘2011-01-01’ and Order_Total > 20000 ;

We do an
All-AMPs
retrieve by way
of a Single
Partition.

images

NON-Unique Primary Indexes (NUPI) in PPI

images

Why were we forced to make the Primary Index of Order_Number a Non-Unique Primary Index (NUPI)? This is because the PRIMARY INDEX (Order_Number ) was not part of the Partition. That is why most PPI tables are defined as NUPI.

PPI Table with a Unique Primary Index (UPI)

CREATE Set TABLE Order_Table_CPPI
(Order_Number    INTEGER Not Null
,Customer_Number INTEGER
,Order_Date      DATE
,Order_Total     Decimal (10,2)
) UNIQUE PRIMARY INDEX(Order_Number, Order_Total)
   PARTITION BY CASE_N
   (Order_Total < 1000,
    Order_Total < 5000,
    Order_Total < 10000,
    Order_Total < 20000,
    NO Case OR UNKNOWN);

We were able to make the Primary Index of (Order_Number, Order_Total) a Unique Primary Index (UPI) because Order_Total was part of the Primary Index and part of the Partition.

Tricks for Non-Unique Primary Indexes (NUPI)

images

CREATE UNIQUE INDEX(Order_Number)
on Order_Table_Tricks ;

USI

CREATE INDEX(Order_Number)
on Order_Table_Tricks ;

NUSI

You can create a secondary index (USI or NUSI) on the Primary Index to either ensure uniqueness or for faster retrieval. Use either one depending on your needs.

Character Based PPI for RANGE_N

CREATE TABLE Employee_Char
(Employee_No     INTEGER
,Dept_No             INTEGER
,First_Name         CHAR(20)
,Last_Name         VARCHAR(20)
,Salary                 Decimal(10,2)
) PRIMARY INDEX(Employee_No)
  PARTITION BY RANGE_N
   (Last_Name Between
    (‘A’, ‘B’,‘C’, ‘D’, ‘E’, ‘F’, ‘G’, ‘H’, ‘I’,
     ‘J’, ‘K’,‘L’, ‘M’, ‘N’, ‘O’, ‘P’, ‘Q’, ‘R’,
     ‘S’, ‘T’,‘U’, ‘V’, ‘W’, ‘X’, ‘Y’, ‘Z’
      AND ‘ZZ’, UNKNOWN)) ;

SELECT *
FROM Employee_Table_Char
WHERE Last_Name LIKE ‘C%’ ;

Character Based PPI (Teradata Version 13.10 and beyond) paves the way for Range Queries that are used in conjunction with the LIKE command.

A Visual for Character Based PPI for RANGE_N

images

Each letter in the alphabet gets their own partition to easily track Last_Name.

The SQL on Character Based PPI for RANGE_N

SELECT * FROM Employee_Char
WHERE Last_Name LIKE ‘DAV%’

We do an All-AMPs retrieve by way of a Single Partition.

images

Only one partition is read on each AMP and that is the partition for the names that start with the letter ‘D’. The query above finds all last names that start with the letters DAV.

Character Based PPI for CASE_N

CREATE TABLE Product_Table_Char
(Product_ID                  INTEGER
,Product_Name             (CHAR(20)) NOT CASESPECIFIC
,Product_Cost               Decimal(10,2)
,Product_Description      VARCHAR(100)
) PRIMARY INDEX(Product_ID)
    PARTITION BY CASE_N
    (Product_Name < ‘Apples’
    Product_Name < ‘Bananas’
    Product_Name < ‘Cantaloupe’
    Product_Name < ‘Grapes’
    Product_Name < ‘Lettuce’
    Product_Name < ‘Mangos’
    Product_Name >= ‘Mangos’ and <= ‘Tomatoes’) ;

SELECT * FROM Product_Table_Char
WHERE Product_Name BETWEEN ‘Apples’ and ‘Grapes’ ;

Character Based PPI (Teradata Version 13.10 and beyond) can use CASE_N.

Dates and Character Based Multi-Level PPI

CREATE TABLE Claims_Table_PPI
(Claim_ID         INTEGER NOT NULL
,Claim_Date      DATE NOT NULL
,State_ID          CHAR(2) NOT NULL
,CL_Details        VARCHAR(1000) NOT NULL
) PRIMARY INDEX(Claim_ID)
   PARTITION BY (RANGE_N
   (Claim_Date BETWEEN DATE ‘2010-01-01’
     AND DATE ‘2012-12-31’
     EACH INTERVAL ‘1’ DAY)
   ,RANGE_N (State_ID Between
     (‘AL’, ‘AK’,‘AZ’, ‘AR’, ‘CA’, ‘CO’, ‘CT’, ‘DE’, ‘FL’,
      ‘GA, ’HI’,‘ID’, ‘IL’, ‘IN’, ‘IA’, ‘KS’, ‘KY’, ‘LA’,
      ‘ME’, ‘MD’,‘MA’, ‘MI’, ‘MN’, ‘MS’, ‘MT’, ‘NE’,
      ‘NV, ‘NH’,‘NJ’, ‘NM’, ‘NY’, ‘NC’, ‘ND’, ‘OH’, ‘OK’,
      ‘OR’, ‘PA’,‘RI’, ‘SC’, ‘SD’, ‘TN’, ‘TX’, ‘UT’, ‘VT’,
      ‘VA’, ‘WA’, ‘WV’, ‘WI’, ‘WY’));

This Multi-Level PPI table uses both Range_N on both Dates and Character based PPI.

TIMESTAMP Partitioning

CREATE TABLE Order_Table_TS
(Order_Number            INTEGER NOT NULL
,Customer_Number      INTEGER NOT NULL
,Order_Date_Time        TIMESTAMP(0) WITH TIME ZONE
,Order_Total                 Decimal(10,2)
) PRIMARY INDEX(Order_Number)
   PARTITION BY (RANGE_N
   (CAST (Order_Date_Time as DATE AT 0)
     BETWEEN DATE ‘2011-01-01’ AND DATE ‘2011-12-31’
      EACH INTERVAL ‘1’ Month ) ;

SELECT * FROM Order_Table_TS
WHERE Order_Date_Time BETWEEN
TIMESTAMP ‘2011-01-13 00:00:00-08:00’ AND
TIMESTAMP ‘2011-01-13 23:59:59-08:00’ ;

Find all Orders Place on the day January 13, 2011

TIMESTAMP partitioning is available with Teradata V13.10. Notice the DATE at 0 in the CREATE statement. This means the Timestamp is Deterministic, which means the system will know your Time Zone and retrieve the rows from your point of time.

Using CURRENT_DATE to define a PPI

images

Both CREATE statements above demand 5 years of partitioned data from 2009-2013.

ALTER to CURRENT_DATE the next year

images

In Teradata V13.10 the system allowed for CURRENT_DATE to be utilized in the PPI CREATE statement. This allowed for easy altering of the data and deleting of partitions. The ALTER Table command can alter to reflect 5 years, now going from 1010-2014.

ALTER to CURRENT_DATE with Save

images

When 2013 comes along you can easily ALTER the table using TO CURRENT.

ALTER TABLE Order_Table_PPI3 TO CURRENT WITH INSERT INTO Order_Table_History ;

Notice we are ALTERING the table to CURRENT, but instead of deleting the data we are moving the old data into a History file called Order_Table_History.

Altering a PPI Table to Add or Drop Partitions

Creating a PPI Table

CREATE TABLE Order_TablePPI
(Order_Number           INTEGER Not Null
,Customer_Number     INTEGER
,Order_Date                DATE
,Order_Total                Decimal (10,2))
PRIMARY INDEX(Order_Number)

PARTITION BY RANGE_N
(Order_Date BETWEEN
       date '2010-01-01'
   AND date '2010-12-31'
  EACH INTERVAL ‘1’ DAY);

ALTERING a PPI Table

ALTER TABLE Order_TablePPI
  MODIFY PRIMARY INDEX
   DROP RANGE BETWEEN
     DATE ‘2010-01-01’
   AND DATE ‘2010-12-31’
    EACH INTERVAL ‘1’ DAY
   ADD RANGE BETWEEN
     DATE ‘2011-01-01’
   AND DATE ‘2011-12-31’
    EACH INTERVAL ‘1’ DAY
     WITH DELETE ;

What will happen to the 2010 data after the ALTER?

Deleting a Partition

Creating a PPI Table

CREATE TABLE Order_TablePPI
(Order_Number           INTEGER Not Null
,Customer_Number     INTEGER
,Order_Date                DATE
,Order_Total                Decimal (10,2))
PRIMARY INDEX(Order_Number)

PARTITION BY RANGE_N
  (Order_Date BETWEEN
          date '2010-01-01'
      AND date '2010-12-31'
  EACH INTERVAL ‘1’ DAY);

ALTERING a PPI Table

  ALTER TABLE Order_TablePPI
   MODIFY PRIMARY INDEX
    DROP RANGE BETWEEN
       DATE ‘2010-01-01’
   AND DATE ‘2010-12-31’
    EACH INTERVAL ‘1’ DAY
   ADD RANGE BETWEEN
       DATE ‘2011-01-01’
   AND DATE ‘2011-12-31’
     EACH INTERVAL ‘1’ DAY
      WITH DELETE ;

The 2010 data after the ALTER will be deleted from the table.

Deleting a Partition and Saving its contents

Creating a PPI Table

CREATE TABLE Order_TablePPI
(Order_Number     INTEGER Not Null
,Customer_Number  INTEGER
,Order_Date       DATE
,Order_Total      Decimal (10,2))
PRIMARY INDEX(Order_Number)

PARTITION BY RANGE_N
  (Order_Date BETWEEN
          date '2010-01-01'
      AND date '2010-12-31'
  EACH INTERVAL ‘1’ DAY);

ALTERING a PPI Table
  ALTER TABLE Order_TablePPI
   MODIFY PRIMARY INDEX
    DROP RANGE BETWEEN
     DATE ‘2010-01-01’
    AND DATE ‘2010-12-31’
     EACH INTERVAL ‘1’ DAY
   ADD RANGE BETWEEN
     DATE ‘2011-01-01’
   AND DATE ‘2011-12-31’
    EACH INTERVAL ‘1’ DAY
     WITH INSERT INTO
      Order_Table_History ;

The 2010 data after the ALTER will be deleted from the table, but the deleted data is stored in Order_Table_History.

Using the PARTITION Keyword in your SQL

SELECT PARTITION as “Partition Number”
,Count(*) as “Rows in Partition”
FROM Order_Table_PPI
GROUP BY 1
ORDER BY 1 ;

Partition Number	Rows in Partition
1	1000
2	1200
3	1100
4	1022
5	1121
6	1132
7	1300
8	1500
9	1610

You can find out how many partitions you have and the number of rows from the above query. You won't see information about empty partitions.

SQL for RANGE_N

SELECT RANGE_N (
      Calendar_Date BETWEEN DATE ‘2010-01-01’ AND DATE ‘2010-04-30’
              EACH INTERVAL ‘1’ MONTH) AS “Partition”
      ,MIN(Calendar_Date) as “Begin Date”
      ,MAX(Calendar_Date) as “End Date”
FROM Sys_Calendar.Calendar
WHERE Calendar_Date BETWEEN DATE ‘2010-01-01’ and DATE ‘2010-04-30’
GROUP BY 1
ORDER BY 1 ;

Partition	Begin Date	End Date
1	2010-01-01	2010-01-31
2	2010-02-01	2010-02-28
3	2010-03-01	2010-03-31
4	2010-04-01	2010-04-30

You can find out how many partitions you have and the number of rows from the above query. You won't see information about empty partitions.

SQL for CASE_N

SELECT CASE_N (
        Order_Total < 1000
        ,Order_Total < 5000
        ,Order_Total < 10000
        ,Order_Total < 20000
        ,NO Case
        ,UNKNOWN) as “Partition”
        ,COUNT(*) as “Counter”
FROM Order_Table_PPI
GROUP BY 1
ORDER BY 1 ;

Partition	Counter
1	100
2	125
3	140
4	110
5	3

You can find out how many partitions you have and the number of rows from the above query. You won't see information about empty partitions.

Watch the Video on Partitioning

images

Tera-Tom Trivia

Tom Coffing graduated with a Bachelor's degree in Speech Communications. Tom spent 10 years in Toastmasters competing in hundreds of speech contests. Tom won speech championships at the city, area, state, and national level.

Click on the link below or place it in your browser and watch the video on partitioning.

http://www.coffingdw.com/TbasicsV12/partitioning.wmv

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 5 – Partition Primary Index (PPI) Tables Subquery

Create new playlist

Sign In

Sign Up

Table of Contents for
Chapter 5 – Partition Primary Index (PPI) Tables Subquery