Collections

Cassandra also supports collections in its data model for storing small amounts of data. Collections are complex types that provide considerable flexibility. Three collections are supported: Set, List, and Map. The type of data stored in each collection must be declared: for example, a set of timestamps is defined as set<timestamp>, a list of text values as list<text>, a map with a text key and a text value as map<text, text>, and so on. Also, only native data types can be used in collections.

Cassandra reads a collection in its entirety; the collection is not paged internally. A collection can hold at most 64K items, and each item can be at most 64 KB in size.

To demonstrate CQL support for these collections, let us create a table in the packt keyspace with one column of each collection type and insert some data into it, as shown in the following screenshot:

Experiment on collections

Note

How to update or delete a collection?

CQL also supports updating and deleting elements in a collection. You can refer to the relevant information in DataStax's documentation at http://www.datastax.com/documentation/cql/3.1/cql/cql_using/use_collections_c.html.

As in the case of native data types, let us walk through each collection below.

Set

CQL uses sets to keep a collection of unique elements. The benefit of a set is that Cassandra automatically enforces the uniqueness of the elements, so we, as application developers, do not need to worry about it.

CQL uses curly braces ({}) to represent a set of values separated by commas. An empty set is simply {}. In the previous example, although we inserted the set as {'Lemon', 'Orange', 'Apple'}, the input order was not preserved. Why?

The reason lies in how Cassandra stores the set. Internally, Cassandra stores each element of the set as a single column whose column name is the original column name suffixed by a colon and the element value. As shown previously, the hexadecimal representations of the ASCII strings 'Apple', 'Lemon', and 'Orange' are 0x4170706c65, 0x4c656d6f6e, and 0x4f72616e6765, respectively. So they are stored in three columns named setfield:4170706c65, setfield:4c656d6f6e, and setfield:4f72616e6765. Because Cassandra always keeps the columns of a row sorted by column name, the elements of a set are sorted automatically.
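The naming scheme described above can be imitated in a few lines of Python. This is only a sketch of the idea, not Cassandra's actual storage code: the function name set_column_names is hypothetical, and sorting the raw bytes of each element is assumed to stand in for Cassandra's column-name ordering.

```python
def set_column_names(field, elements):
    """Simulate the internal column names of a set<text> column, in stored order.

    Hypothetical helper for illustration; it only mimics the naming scheme.
    """
    # A set keeps unique elements, so duplicates collapse here.
    unique = {element.encode("utf-8") for element in elements}
    # Column name = field name + ":" + hex of the element bytes,
    # listed in byte order (assumed to mirror column-name sorting).
    return [f"{field}:{raw.hex()}" for raw in sorted(unique)]


names = set_column_names("setfield", ["Lemon", "Orange", "Apple"])
for name in names:
    print(name)
# setfield:4170706c65
# setfield:4c656d6f6e
# setfield:4f72616e6765
```

Although 'Lemon' was passed first, 'Apple' comes out first because 0x41 sorts before 0x4c and 0x4f, which matches the order observed in the set.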

List

A list keeps its elements in the order in which they were added, and the same value may appear more than once. Hence it is suitable when uniqueness is not required and maintaining order is required.

CQL uses square brackets ([]) to represent a list of values separated by commas. An empty list is []. In contrast to a set, the input order of a list is preserved by Cassandra. Cassandra also stores each element of the list as a column. But this time, each column name is composed of the original column name (listfield in our example), a colon, and a UUID generated at the time of update. The element value of the list is stored in the value of the column.
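The following Python sketch mimics this layout to show why a list preserves input order and tolerates duplicates. It rests on stated assumptions: the helper list_columns is hypothetical, and a zero-padded counter stands in for the time-based UUID so that the output is deterministic.

```python
from itertools import count


def list_columns(field, elements):
    """Simulate how list elements map to (column name, value) pairs.

    Hypothetical helper; a counter replaces the time-based UUID.
    """
    ticket = count()  # assumption: each later update gets a larger identifier
    cells = {f"{field}:{next(ticket):08d}": value for value in elements}
    # Columns come back sorted by name, that is, by the time-ordered suffix.
    return sorted(cells.items())


cells = list_columns("listfield", ["Lemon", "Orange", "Apple", "Lemon"])
for name, value in cells:
    print(name, value)
```

Because the element sits in the column value rather than in the column name, the second 'Lemon' gets its own column and the values are read back in the order they were written.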

Map

A map in Cassandra is a dictionary-like data structure with keys and values. It is useful when you want to store table-like data within a single Cassandra row.

CQL also uses curly braces ({}) to represent a map of keys and values separated by commas. The key and the value of each pair are separated by a colon. An empty map is simply represented as {}. As you might expect, each key/value pair is stored in a column whose column name is composed of the original map column name followed by a colon and the key of that pair. The value of the pair is stored in the value of the column. Similar to a set, the map sorts its items automatically, in this case by key. As a result, a map can be imagined as a hybrid of a set and a list.
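The same kind of Python sketch illustrates the map layout. The helper map_columns and the sample keys and values are hypothetical, and the key is shown as plain text in the column name for readability.

```python
def map_columns(field, mapping):
    """Simulate how map entries map to (column name, value) pairs.

    Hypothetical helper for illustration only.
    """
    # The key goes into the column name, the value into the column value.
    cells = [(f"{field}:{key}", value) for key, value in mapping.items()]
    # Entries are listed in key byte order (assumed to mirror name sorting).
    return sorted(cells, key=lambda cell: cell[0].encode("utf-8"))


entries = map_columns("mapfield", {"fruit": "Apple", "drink": "Tea", "bread": "Rye"})
for name, value in entries:
    print(name, value)
# mapfield:bread Rye
# mapfield:drink Tea
# mapfield:fruit Apple
```

The keys behave like the elements of a set (unique and sorted), while each value is kept in the column value as in a list, which is the sense in which a map combines the two.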
