User-defined data types

PostgreSQL provides two methods for implementing user-defined data types through the following commands:

  • CREATE DOMAIN: The CREATE DOMAIN command allows developers to create a user-defined data type with constraints. This helps to make the source code more modular.
  • CREATE TYPE: The CREATE TYPE command is often used to create a composite type, which is useful in procedural languages, and is used as the return data type. Also, one can use the create type to create the ENUM type, which is useful to decrease the number of joins, specifically for lookup tables.

Often, developers tend not to use user-defined data types and use flat tables instead due to a lack of support on the driver side, such as JDBC and ODBC. Nonetheless, in JDBC, the composite data types can be retried as Java objects and parsed manually.

Domain objects, as with other database objects, should have a unique name within the schema scope. The first use case of domains is to use them for common patterns. For example, a text type that does not allow null values and does not contain spaces is a common pattern. In the web car portal, the first_name and the last_name columns in the account table are not null. They should also not contain spaces, and are defined as follows:

first_name TEXT NOT NULL,
last_name TEXT NOT NULL,
CHECK(first_name !~ 's' AND last_name !~ 's'),

One can replace the text data type and the constraints by creating a domain and using it to define the first_name and the last_name data type, as follows:

CREATE DOMAIN text_without_space_and_null AS TEXT NOT NULL CHECK (value!~ 's');

In order to test the text_without_space_and_null domain, let's use it in a table definition, and execute several INSERT statements, as follows:

CREATE TABLE test_domain (
test_att text_without_space_and_null
);

Let's finally insert several values into the table test_domain:

postgres=# INSERT INTO test_domain values ('hello');
INSERT 0 1
postgres=# INSERT INTO test_domain values ('hello world');
ERROR: value for domain text_without_space_and_null violates check constraint "text_without_space_and_null_check"
postgres=# INSERT INTO test_domain values (null);
ERROR: domain text_without_space_and_null does not allow null values

Another good use case for creating domains is to create distinct identifiers across several tables, since some people tend to use numbers instead of names to retrieve information. One can do that by creating a sequence and wrapping it with a domain:

CREATE SEQUENCE global_id_seq;
CREATE DOMAIN global_serial INT DEFAULT NEXTVAL('global_id_seq') NOT NULL;

Finally, one can alter the domain using the ALTER DOMAIN command. If a new constraint is added to a domain, it will cause all the attributes using that domain to be validated against the new constraint. One can control this by suppressing the constraint validation on old values and then cleaning up the tables individually. For example, let's assume we would like to have a constraint on the text length of the text_without_space_and_null domain, this can be done as follows:

ALTER DOMAIN text_without_space_and_null ADD CONSTRAINT text_without_space_and_null_length_chk check (length(value)<=15);

The preceding SQL statement will fail due to data violation if an attribute is using this domain and the attribute value length is more than 15 characters. So, to force the newly created data to adhere to the domain constraints and to leave the old data without validation, one can still create it, as follows:

ALTER DOMAIN text_without_space_and_null ADD CONSTRAINT text_without_ space_and_null_length_chk check (length(value)<=15) NOT VALID;

After data clean up, one can also validate the constraint for old data by invoking the ALTER DOMAIN ... VALIDATE CONSTRAINT statement. Finally, the dD+ psql meta-command can be used to describe the domain.

Composite data types are very useful for creating functions, especially when the return type is a row of several values. For example, let's assume that we would like to have a function that returns the seller_id, seller_name, number of advertisements, and the total rank for a certain customer account. The first step is to create a TYPE, as follows:

CREATE TYPE seller_information AS (seller_id INT, seller_name TEXT,number_of_advertisements BIGINT, total_rank float);

Then, we can use the newly created data TYPE as the return type of the function, as follows:

CREATE OR REPLACE FUNCTION seller_information (account_id INT ) RETURNS seller_information AS
$$
SELECT seller_account.seller_account_id, first_name || last_name as seller_name, count(*), sum(rank)::float/count(*)
FROM account INNER JOIN
seller_account ON account.account_id = seller_account.account_id LEFT JOIN
advertisement ON advertisement.seller_account_id = seller_account.seller_account_id LEFT JOIN
advertisement_rating ON advertisement.advertisement_id = advertisement_rating.advertisement_id
WHERE account.account_id = $1
GROUP BY seller_account.seller_account_id, first_name, last_name
$$
LANGUAGE SQL;

CREATE TYPE could also be used to define enums; an enum type is a special data type that enables an attribute to be assigned one of the predefined constants. The usage of the enum data types reduces the number of joins needed to create some queries; thus, it makes SQL code more compact and easier to understand. In the advertisement_rating table, we have a column with the rank name, which is defined as follows:

-- This is a part of advertisement_rating table def.
rank INT NOT NULL,
CHECK (rank IN (1,2,3,4,5)),

In the preceding example, the given code is not semantically clear. For example, some people might consider 1 to be the highest rank, while others might consider 5 to be the highest rank. To solve this, one could use the lookup table, as follows:

CREATE TABLE rank (
rank_id SERIAL PRIMARY KEY,
rank_name TEXT NOT NULL
);
INSERT INTO rank VALUES (1, 'poor') , (2, 'fair'), (3, 'good') , (4,'very good') ,( 5, 'excellent');

In the preceding approach, the user can explicitly see the rank table entries. Moreover, the rank table entries can be changed to reflect new business needs, such as to make the ranking from 1 to 10. Additionally, in this approach, changing the rank table entries will not lock the advertisement_rating table, since the ALTER TABLE command will not be needed to change the check constraint CHECK (rank IN (1, 2, 3, 4, 5)). The disadvantage of this approach lies in retrieving the information of a certain table that is linked to several lookup tables, since the tables need to be joined together. In our example, we need to join advertisement_rating and the rank table to get the semantic of rank_id . The more lookup tables, the more lengthy queries are. 

Another approach to modeling the rank is to use the enum data types, as follows:

CREATE TYPE rank AS ENUM ('poor', 'fair', 'good', 'very good','excellent');

The psql dT meta command is used to describe the enum data type. One could also use the function enum_range, as follows:

postgres=# SELECT enum_range(null::rank);
enum_range
----------------------------------------
{poor,fair,good,"very good",excellent}

The enum data type order is determined by the order of the values in the enum at the time of its creation.

postgres=# SELECT unnest(enum_range(null::rank)) order by 1 desc;
unnest
-----------
excellent
very good
good
fair
poor
(5 rows)

enum PostgreSQL data types are type safe, and different enum data types cannot be compared with each other. Moreover, enum data types can be altered, and new values can be added. 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset