Because any Rosetta item whose value is a set can be used to create a type, creating new types is the same process as creating new sets. Rosetta provides three basic mechanisms for forming sets and types: (i) listing the elements explicitly, (ii) filtering or composing existing sets, or (iii) defining functions for constructing the elements of the set. Listing the elements of a set, referred to as extension, is the simplest mechanism for defining new types. The new type is formed by simply using the set former to list the elements of the type. Filtering existing sets, referred to as comprehension, involves using the sel
operation or one of its derivatives to extract a set of items from an existing set. Finally, defining functions to construct type elements, referred to as constructive specification, involves defining functions that generate all elements of a type.
Types are specified using the expression language to define sets making Rosetta’s type system dependent. Specifically, the same language used to define expressions that use declared items is used to define their types. Most programming languages use a distinct language subset for defining types that does not include anything requiring evaluation. Although this restriction makes type checking far simpler, Rosetta uses a dependent type system due to its expressive power.
An excellent example of Rosetta’s type system’s power is using comprehension to define item properties. Defining a set by comprehension is defining a property that each member of the new set must have. Using Rosetta’s set comprehension capabilities, we can define sets that assert a property over all members of a type. This is a simple and powerful way of defining new properties for items that is difficult or impossible to achieve without dependent types.
Because Rosetta types are essentially set values, they are first-class in the language. Types can be passed as actual parameters to facets and functions, and type items can be variable or constant. They can be created, compared, observed, and transformed like any other Rosetta value using the same expression language.
When a new Rosetta type is defined, it must either be a subtype of an existing type, or a new base type that has no supertype (Table 8.1). The standard notation for Rosetta declarations applies equally to the definition of new types. Recall that the notation for a item definition is:
names ::T [[ is e ]];
Table 8.1. Forms for defining types classified by interpretability and subtype
| Constant Type | Variable Type |
---|---|---|
Base Type |
|
|
Subtype |
|
|
where names is a comma-separated list of type names, T is the set of values the new type’s value is drawn from, and e is a Rosetta expression whose value determines the values associated with the new type. Using this notation, the Rosetta type tri
can be defined using extension as a subtype of the integer
type:
tri ::set (integer) is {–1,0,1};
In this declaration, the label is tri
, the type set
(integer), and the value {–1,0,1}
. The set type former creates the powerset of its argument. Thus, tri
must be a subset of the integer
set. The is
clause defines the value of tri
as the set {–1,0,1}
, an element of the powerset of integers and a legal value for this type. Thus, we have defined a new item, tri
, that can be used as a type to define both variables and constants:
high ::tri is 1;
x ::tri;
To distinguish types from sets, Rosetta provides the keyword subtype
to indicate (i) that a new type is being defined and (ii) the supertype of the new subtype. The subtype
synonym is semantically the same as set
, but indicates the intent of a declaration. The previous definition of tri
would appear as follows in a definition:
tri ::subtype (integer) is {–1,0,1};
The most common mechanism for defining new types is using the subtype
qualifier to to specify the type expression. The notation:
x::subtype(integer) is sel (x::integer | x>=0);
declares that x
is a set of items from integer
. It is semantically equivalent to the previous notation. Types defined in this manner are referred to as interpreted subtypes because their supertype is known and the value of the new type is also known. In general, new types are defined using the following notation:
name ::subtype(T) [[ is e ]];
where name is the name of the new type and T is the associated supertype. The optional expression, e, defines the value of the new type. It is mandatory that the range of e is a subset of T in the same manner as any other item declaration. In declarations of this sort, name is considered subtype of T while T is the supertype of name. Treated as sets, the relationship name = <T must hold for all values of name.
Like traditional item declarations, the is
clause is optional, allowing the definition of a subtype whose specific value is not known. For example:
time::subtype(integer);
defines a new item called time
that is a subtype of integer
, whose specific value is not known. The difference between this definition and previous definitions is that the value of time
is not known. It is known that the values associated with time
are found in integer
, but those values are otherwise unconstrained. This supports incomplete definitions of types. Specifically, some properties of the type are known, but its specific contents are not. Types declared in this fashion are called uninterpreted subtypes because their supertypes are known, but their values are not.
At times it is useful to define a new type that is not a subtype of any existing type. Rosetta does not allow type definitions that do not refer to some kind of supertype in their declaration. To achieve the desired result, the type top
is used as the supertype. The definition:
tri ::subtype(top) is {–1,0,1};
defines a new type called tri
whose values are –1, 0, and 1 and whose supertype is top. In this definition, an item of type tri
will not inherit operations from number
or element
types.
Given the alternative definition:
tri ::subtype(integer) is {–1,0,1};
operations over integer
values are inherited by the new type. The previous definition breaks the inheritance chain forcing the user to define operations over the new type. Types defined in this manner are referred to as interpreted types, as their values are known but their supertypes are known to only be top
.
In both cases existing values are used to populate a new type. When the literal –1 is seen in a specification, some mechanism must be used to determine its associated type. The most common approach is to simply use a type assertion such as –1::tri
or –-1::integer
to specify the desired type. In some cases, the appropriate type can be inferred. However, it cannot be inferred in the general case.
At times it is desirable to define new types, where both value and supertype are unknown. This definition style defers all properties of the new type to definition using terms within the facet. Such types are defined using top
as the supertype, but omitting the is
clause and the explicitly specified value. The definition:
tri ::subtype(top);
defines a type whose supertype and value are not known. Such types are referred to as uninterpreted types or sorts in definitions. They are particularly useful at high levels of abstraction where some, but not all, properties are known.
The use of subtype(top)
is quite common in Rosetta specifications and always indicates the definition of a new type. For readability, the keyword type
is introduced as a synonym for this construct. The definitions:
tri ::type is {–1,0,1}; tri ::type;
are identical to the previous definitions, but are easier to read and interpret.
Types need not be defined starting only from elemental values, but can be defined from any type. For example, consider the definition of a type whose values are subsets of a set T
:
s::subtype(set(T));
The notation set
(T)
defines the type containing all possible subsets created from elements of T
. The notation x::set
(T)
therefore states that x
is a single subset of T
. Using the subtype
form defines s
to be an arbitrary collection of subsets of T
. The subtype
form simply says that s
may include many sets. This is fundamentally different than the declaration:
s::set(set(T));
In this case, s
is a single set of subsets from T
, rather than a set of subsets from T
. The notation set (T)
again defines a type that contains all subsets of T
. In this case, the outer set former evaluates to all possible subsets of all possible subsets from T
. These types can be further restricted, as in:
set4 ::subtype (set(T)) is sel (x::subtype (set(T)) | forall (t::x | #t=4));
where set4
is the set of subsets of T that contain exactly four elements. The notation z::set4
declares z
to be a singleton element of type set4
, or simply a subset of T
containing four elements.
One reminder about the distinction between the notations x::T
and x:: subtype
(T)
as declarations: The first says that the value of x
is a single element of type T
; the second says that the value of x
is a set of values selected from T
. If the first definition is used as a type, then only single element types are allowed.
The second explicitly allows sets. In contrast, if these statements are used as terms in a specification, they are identical.
There are two mechanisms for defining new types by extension. Using set formers, values for a type can be listed explicitly. However, those values must already exist in the Rosetta value space. Using enumerations, sets of values are defined where values must be fresh and cannot appear in Rosetta prior to the declaration.
The simplest mechanism for defining a type is to simply list its elements or combine existing items to form a set. In Rosetta, the items in a set can be listed explicitly by using the set former. For example, the definition of type bit
has the following form:
bit ::subtype(natural) is {0,1};
Because {0,1}
is an element of subtype
(natural)
, this represents a legal type definition. Further, the new bit
type inherits operations from the natural numbers. Other set formation operations, such as intersection and difference, can be used to form types in the same manner as for the set former. They are used far less often, but still remain useful. For example, the type containing only positive integers, posint
, can be defined as:
posint ::subtype (natural) is sel (x::natural | x>0);
using the selection function to filter out all values greater than 0.
As noted earlier, there is some danger in defining new types whose values are shared with other existing types. When processing specifications, Rosetta tools must determine the type of –1, 0, and 1. If the new bit
type is defined without reference to integer
, the only type that can be inferred is top
. To assist tools, type ascription can be used to indicate the type explicitly:
1::tri;
identifies 1 as the value from tri
, while:
1::integer;
identifies 1 as the value from integer
. With two types associated with 1 and no common supertype other than top
, this annotation is required in virtually all situations. Using an enumeration to define new types like this will avoid this problem.
Enumerations are types whose values can be specified by explicitly listing them. Unlike definitions using set enumeration, the values of an enumerated type cannot be defined previously and are subtypes of element
only. Defining an enumerated type is achieved by declaring the type and defining a collection of values associated with it using the enumeration
former:
colors:: type is enumeration[red,yellow,blue];
The notation yellow
defines a new item whose lexical representation is yellow
and whose value is also yellow
. Thus, yellow
becomes a new value in the current value space. Enumerations in Rosetta differ from enumerations in traditional programming languages, as the new type definition does not imply an ordering on its elements. Ordering must be introduced separately if needed. Any variable whose type is an enumerated type is an element
and thus a scalar.
Because enumeration values are fresh and cannot exist in the Rosetta type space prior to the enumeration declaration, the problem with ambiguous types for values is avoided. Ascription can be used in the same manner as described previously, but it need not be, as each enumeration value is associated with exactly one type. It is possible to define subtypes consisting of enumeration values. This reintroduces the need for ascription.
The semantics for enumerated types is defined by elaborating the type declaration to a constructed type. Described later in this chapter, constructed types define a type by specifying a collection of constructors for all elements of that type. An enumeration is simply a shorthand for defining constructed types with constant constructors.
Example 8.1. Defining and Using a Three-Valued Logic Using Enumerations
Using enumerated types, we can define a three-valued logic that adds an unknown value to the normal logical high and low values. The declaration of this type using enumerated types is:
tri ::type is enumeration [high,low,unknown];
We can now define standard operations over this type using high
, low
, and unknown
:
tri_and(x,y::tri )::tri is case x of {high} -> y | {low} -> low | {unknown} -> if y=low then low else unknown end if end case; tri_or (x,y::tri)::tri is case x of {high} -> high | {low} -> y | {unknown} -> if y=high then high else unknown end if end case;
tri_and
and try_or
define the logic of conjunction and disjunction for this three-valued logic. Other functions can be defined similarly. Using these definitions, we can define facet models for and
and or
gates, respectively:
facet tri_and_gate(x,y::input tri ; z::output tri)::state_based is begin update: z' = tri_and(x,y) end facet tri_and_gate; facet tri_or_gate(x,y::input tri; z::output tri)::state_based is begin update: z' = tri_or(x,y); end facet tri_and_gate;
Defining types by comprehension involves filtering or transforming existing types to create new subtypes. Using the sel
function or filter
functions defines new types by filtering values from old types. Using the ran
or image
functions defines new types by transforming old types.
An excellent example of defining types by comprehension is the definition of the built-in type natural
. Ideally, natural
should include the integer
values that are greater than or equal to zero. Using this fact, the standard definition for natural numbers is:
natural :: subtype(integer) is sel (x::integer | x >= 0);
In this type definition, the comprehension operator sel chooses all elements of the type integer
that satisfy the relationship x >= 0
. Because natural
is a subtype of integer
, all operations defined on integer
are also defined on natural
.
Similarly, the sel
operator can be used to form a type called byte
consisting of 8-bit bitvectors. Here the selection operator will be used again, with the length of the bitvector being checked:
byte :: subtype (bitvector) is sel (b::bitvector | #b=8);
In this definition, the selection operator chooses elements from the bitvector
type that are of length 8.
While the selection operator chooses elements from a type, the range operation transforms each element of a type. An example of using ran
to define a new type is a potential definition of the even integers. Instead of using the comprehension operator, here the ran
operation is used as an image function:
even :: subtype(integer) is ran(x::integer | 2*x);
Here the even numbers are generated by multiplying each integer
value by 2.
As demonstrated here, the selection and range operations are excellent choices for defining new types by comprehension. Other higher-order functions, such as forall
and exists
, are far less useful, as they evaluate to single boolean
values.
As noted earlier, any Rosetta set can be used as the value of a type. This has numerous advantages, not the least of which is clarity. However, it quite easy to misuse this capability to develop specifications that are exceptionally difficult to analyze. Using operations such as set union, intersection, and difference to form new types can introduce specification complexities that make analysis virtually impossible. We have already seen the need to use type annotations to disambiguate literals shared between types. New types formed from arbitrary sets must be used carefully, with full understanding of their purpose and how they impact analysis.
The constructed type definition syntax provides a mechanism for defining constructor, observer, and recognizer functions in a single definitional notation. Constructor functions create values associated with the new type. These values behave as any other values and are treated as normal forms that are not evaluated. Together, the collection of constructor functions can create every value associated with the type. The observer functions define properties associated with values of the new type. Observer functions specify and calculate properties for constructed type values. Finally, recognizer functions are special predicates that indicate the constructor used to create a value. Summarizing: (i) constructor functions create new values, (ii) observer functions define properties, and (iii) recognizer functions indicate how the value was created.
As an example of constructed type definition, consider the following definition for a binary tree of integers:
intTree :: type is data nil::empty | node(L::intTree,v:integer,R::intTree )::nonempty; end data;
The item intTree
is defined to be a new, fresh type using the type
indicator. This declaration simply says that the new type’s supertype is subtype(top)
. The is
clause is used to give the type a value and the data
keyword is used to indicate that the value of this type will be given a value.
The constructed type definition provides two constructors for intTree
: (i) the constructor function nil
and (ii) the constructor function node
. The nil
function creates an empty tree and is recognized by the boolean function empty
. The recognizer function is defined so that if its argument is the nil
constructor function, it will return true, otherwise it returns false. The node
constructor creates a nonempty tree from a value and a left and right subtree. Note that the type is recursive in that intTree
is referenced in the definition of node. The nonempty
recognizer is true whenever its argument was created by a call to the node
function. In addition, the observer functions L
, v
, and R
observe parameters of the node constructor.
A tree with one node whose value is 0 is defined using the node
constructor:
node(nil,0,nil);
The recognizers empty
and nonempty
indicate the constructor used to generate a tree. Specifically, empty
is true
if its argument is nil
and nonempty
is true
if its argument is an instantiation of the node
constructor. The recognizers nonempty
and empty
evaluate to:
nonempty(node(nil,0,nil)) == true empty(node(nil,0,nil)) == false
A balanced tree with 0 as the root, and 1 and 2 as the left and right nodes, respectively, is constructed as follows:
node(node(nil,1,nil),0,node(nil,2,nil ));
Here, the node
constructor is used to create nonempty left and right subtrees.
Parameter names are used to generate observer functions that return actual parameters from constructor functions. These functions return the actual parameter instantiation of their associated formal parameter. For example, when evaluated on the node constructor, the observers L
, R
, and v
evaluate to:
L(node(nil,0,nil )) == nil R(node(nil,0,nil )) == nil v(node(nil,0,nil )) == 0
In this sense, the constructed value behaves much like a record— so much so that Rosetta uses constructed types in lieu of traditional record structures.
The primary purposes for constructors, observers, and recognizers should be clear from these examples. Constructors create values, observers make observations over values, and recognizers partition the value set. In if
and other conditional statements, the recognizers are used to guard the use of observers. For example:
if nonempty(t) then L(t) else nil end if
determines if t
is empty and accesses the left subtree if it is not. If t
were empty, then L
would be undefined.
Constructed type definitions have the form:
T :: type is data c(parameters)::r [[ | c(parameters) ::r ]]* end data;
where T names the new type and each subsequent constructor definition specifies a constructor name, a collection of typed parameters, and a recognizer name. The format of each constructor declaration is identical to a function definition. The parameters identify the names and types of observers, and the recognizer names a predicate that is true when its argument is constructed with the associated constructor.
Like any other function, constructor functions can be curried. If this is the case, then the results of applying observer functions associated with uninstantiated parameters are not defined. However, constructor functions are not evaluated in the traditional fashion. When a constructor function is fully instantiated, it becomes a value in the Rosetta value space. Thus, it cannot be further evaluated.
The ranges of constructor functions are disjoint, implying that each element of the constructed type is created by exactly one constructor. This implies that each value associated with a constructed type can have only one form. Furthermore, each constructor name must be unique. This allows Rosetta tools to automatically determine what type a constructed value is associated with.
Example 8.2. Defining Records using constructed types
In Rosetta, no special syntax for defining records is defined, as record structures follow directly from constructed types. A record type is a constructed type with a single constructor function that associates values with parameters used as field names. A typical record type will be defined with the following constructive technique:
record::type is data recordFormer(f0::T0 | f1::T1 |... fn::Tn)::recordp; end data;
where recordFormer
is the single constructor, f0 through fn are the names of the various fields, and T0 through Tn are the types associated with those fields. The recognizer recordp
is also defined, but is largely unused, as there is only one constructor. Any constructed type definition providing only a single constructor is referred to as a record. To define a specific record type that represents Cartesian coordinates, the following notation is used:
cartesian::type is data cartFormer(x::real, y::real, z::real)::cartp; end data;
To define an item of this type, the standard Rosetta declaration syntax is used:
c ::cartesian;
Values can be associated with record items using the canonical is
form:
origin ::cartesian is cartFormer(0,0,0);
Accessing individual fields of the record is achieved by applying one of the observer functions associated with a field name. To access field y
in the record c
, the following notation is used:
y(c)
Defining a coordinate in Cartesian space using this definition is achieved by:
cartFormer(1,0,0);
Accessing elements of the constructed coordinate is achieved using observer functions:
x(cartFormer(1,0,0))==1; y(cartFormer(1,0,0))==0; z(cartFormer(1,0,0))==0;
The following notation creates a Cartesian coordinate whose x
and y
values are known, but whose z
value is not specified:
cartFormer(1,0);
cartFormer(1,0)
produces a function of one parameter that returns the completed Cartesian value. Should the function z(cartFormer(1,0))
be specified, it should not pass type checking because cartFormer(1,0)
is a function, not a Cartesian value.
Example 8.3. Semantics of Enumerations
Similar to records, it is quite easy to define enumerations using constructed types. The enumeration declaration:
colors ::type is [red,yellow,blue]
is semantically equivalent to:
colors ::type is data red()::redp | yellow()::yellowp | blue()::bluep; end data;
The constructed type defines three nullary constructors that are now values in the value space. Further, three recognizers are provided that specify what constructor produces a value. In this case, the recognizers are not particularly useful because the equals relationship performs the same function. Specifically:
redp(x) == x=red()
In a sense, an enumeration is the opposite of a record. A record is defined using one constructor with parameters to indicate record elements. An enumeration is defined using many constructors with no parameters to define different elements.
Using functions to define types allows the introduction of parameters in type definitions. Because types are first-class items in Rosetta, any function returning a set can be used to define a Rosetta parameterized type. Consider the following function definition:
word(n::natural)::subtype(bitvector) is sel (b::bitvector | #b = n);
The function signature defines a mapping from natural
to a subtype of the built-in bitvector type. The subtype is defined by the sel
operation to be those bitvectors whose lengths are equal to the parameter n
. Thus, word
evaluates to the set of bitvectors of length equal to its parameter. We can now use word
as a type definition construct. The notation:
reg::word(8);
defines reg
to be a bitvector
of length 8. The notation:
bv8::subtype(bitvector) is word(8);
defines bv8
to be the type containing all bitvectors of length 8.
When using functions as type definition tools it is important to understand that they, themselves, are not types. They generate types, but they are not usable as types prior to evaluation. The word
type is an excellent example. Altough word(8)
is a type, word
is simply a function making the declaration w::word
meaningless.
Example 8.4. Defining Generic Trees
The intTree
example demonstrates a capability for defining a tree containing integer
values that can easily be modified to define a tree containing other types. Simply replace the integer
type with the item type to be contained in the tree. Using functions and constructed types together, a general tree type definition function can be developed that allows definition of trees containing arbitrary types.
The modified tree definition has the following form:
tree(T::type) ::type is data nil ::empty | node(L::tree(T),v:T,R::tree(T ))::nonempty; end data;
This definition uses the standard function definition format, specifying a function signature and an expression. In this case, tree is defined to take a single, arbitrary type and return another type. The value of the returned type is obtained by evaluating the function’s associated expression. In this case, the data
keyword indicates a constructed type definition. If the tree definition is instantiated with a type, the result is a tree data
type that contains the instantiated type. For example:
intTree::type is tree(integer);
is semantically equal to the intTree
provided earlier in this section. If the parameter T
is replaced by integer
in the expression, the same type definition results. The advantage here is that the new tree function can easily and quickly be used to define new tree types with new contents. The tree
function is not a type, but simply a function whose result value is a type value. Unfortunately, it is not possible to define a new item t::tree
. The tree
function must be fully instantiated.