Cassandra collection types include the Map, List, and Set types. If collection types are discovered, the driver normalizes the native data into a set of parent-child tables. Primitive types are normalized in a parent table, while each collection type is normalized in a child table that has a foreign key relationship to the parent table. Take for example the following Cassandra table:

CREATE TABLE employee (
   empid int PRIMARY KEY,
   phone map<varchar, varint>,
   client list<varchar>,
   review set<date>);

The following employee table is a tabular representation of the native Cassandra table with data included. In this example, four distinct relational tables are created. A parent table is created based on the empid column, and a child table is created for each of the three collection types (Map, List, and Set).

Table 1. employee (native)
empid

(primary key)

phone client review
int map<varchar, varint> list<varchar> set<date>
103 home: 2855551122

mobile: 2855552347

office: 2855555566

spouse: 2855556782

Li

Kumar

Jones

2013-12-07

2015-01-22

2016-01-10

105 home: 2855555678

mobile: 2855553335

office: 2855555462

Yanev

Bishop

Bogdanov

2015-01-22

2016-01-12

The Parent Table

The parent table is comprised of the primitive integer type column EMPID and takes its name from the native table. A SQL statement would identify the column as EMPLOYEE.EMPID.

Table 2. EMPLOYEE (relational parent)
EMPID

(primary key)

int
103
105

A SQL insert on the EMPLOYEE parent table would take the form:

INSERT INTO EMPLOYEE (EMPID) VALUES (107)

The Map Child Table

The Map collection is normalized into a three column child table called EMPLOYEE_PHONE. The name of the table is formulated by concatenating the name of the native table and the name of the Map column. A foreign key relationship to the parent table is maintained via the EMPLOYEE_EMPID column, and the Map's key value pairs are resolved into separate KEYCOL and VALUECOL columns. In a SQL statement, these columns would be identified as the EMPLOYEE_PHONE.EMPLOYEE_EMPID, EMPLOYEE_PHONE.KEYCOL, and EMPLOYEE_PHONE.VALUECOL, respectively.

Table 3. EMPLOYEE_PHONE (relational child of the map column)
EMPLOYEE_EMPID

(foreign key)

KEYCOL VALUECOL
int varchar varint
103 home 2855551122
103 mobile 2855552347
103 office 2855555566
103 spouse 2855556782
105 home 2855555678
105 mobile 2855553335
105 office 2855555462

A SQL insert on the EMPLOYEE_PHONE child table would take the form1:

INSERT INTO EMPLOYEE_PHONE (EMPLOYEE_EMPID, KEYCOL, VALUECOL)
   VALUES (107, 'mobile', 2855552391)

The List Child Table

The List collection is normalized into a three column child table called EMPLOYEE_CLIENT. The name of the table is formulated by concatenating the name of the native table and the name of the List column. A foreign key relationship to the parent table is maintained via the EMPLOYEE_EMPID column; the order of the elements in the List is maintained via the CURRENT_LIST_INDEX column; and the elements themselves are contained in the CLIENT column. SQL statements would identify these columns as EMPLOYEE_CLIENT.EMPLOYEE_EMPID, EMPLOYEE_CLIENT.CURRENT_LIST_INDEX, and EMPLOYEE_CLIENT.CLIENT, respectively.

Table 4. EMPLOYEE_CLIENT (relational child of the list column)
EMPLOYEE_EMPID

(foreign key)

CURRENT_LIST_INDEX CLIENT
int int varchar
103 0 Li
103 1 Kumar
103 2 Jones
105 0 Yanev
105 1 Bishop
105 2 Bogdanov

A SQL insert on the EMPLOYEE_CLIENT child table would take the form2:

INSERT INTO EMPLOYEE_CLIENT (EMPLOYEE_EMPID, CLIENT) VALUES (107, 'Nelson')

The Set Child Table

The Set collection is normalized into a two column child table called EMPLOYEE_REVIEW. The name of the table is formulated by concatenating the name of the native table and the name of the Set column. A foreign key relationship to the parent table is maintained via the EMPLOYEE_EMPID column, while the elements of the Set are given in natural order in the REVIEW column. In this child table, SQL statements would identify these columns as EMPLOYEE_REVIEW.EMPLOYEE_EMPID and EMPLOYEE_REVIEW.REVIEW

Table 5. EMPLOYEE_REVIEW (relational child of the set column)
EMPLOYEE_EMPID

(foreign key)

REVIEW
int date
103 2013-12-07
103 2015-01-22
103 2016-01-10
105 2015-01-22
105 2016-01-12

A SQL insert on the EMPLOYEE_CLIENT child table would take the form3:

INSERT INTO EMPLOYEE_REVIEW (EMPLOYEE_EMPID, REVIEW) VALUES (107, '2015-01-20')

Update Support

Update is supported for primitive types, non-nested Tuple types, and non-nested user-defined types. Update is also supported for value columns (VALUECOL) in non-nested Map types. The driver does not support updates on List types, Set types, or key columns (KEYCOL) in Map types because the values in each are part of the primary key of their respective child tables and primary key columns cannot be updated. If an Update is attempted when not allowed, the driver issues the following error message:

[DataDirect][Cassandra ODBC Driver][Cassandra]syntax error or access rule violation: UPDATE not permitted for column: column_name

1 The driver supports an insert on a child table prior to an insert on a parent table, circumventing referential integrity constraints associated with traditional RDBMS. To maintain integrity between parent and child tables, it is recommended that an insert first be performed on the parent table for each foreign key value added to the child. If such an insert is not first performed, the driver automatically inserts a row into the parent table that contains only the primary key values and NULL values for all non-primary key columns.
2 The driver supports an insert on a child table prior to an insert on a parent table, circumventing referential integrity constraints associated with traditional RDBMS. To maintain integrity between parent and child tables, it is recommended that an insert first be performed on the parent table for each foreign key value added to the child. If such an insert is not first performed, the driver automatically inserts a row into the parent table that contains only the primary key values and NULL values for all non-primary key columns.
3 The driver supports an insert on a child table prior to an insert on a parent table, circumventing referential integrity constraints associated with traditional RDBMS. To maintain integrity between parent and child tables, it is recommended that an insert first be performed on the parent table for each foreign key value added to the child. If such an insert is not first performed, the driver automatically inserts a row into the parent table that contains only the primary key values and NULL values for all non-primary key columns.