Normalizing Native Data
- Last Updated: October 27, 2015
- 2 minute read
- DataDirect Connectors
- JDBC
- MongoDB 6.0
- Documentation
For example, the collection residents
contains the array vehicles and subdocuments in the address document (or object). The collection's JSON structure can
be rendered as follows:
{"_id": "ajx363",
"name": "Sydney Smith",
"address": {"street": "101 Main Street", "city": "Raleigh", "state": "NC"},
"county": "Wake",
"vehicles: ["car", "boat", "bicycle"]
}
{"_id": "tzn525",
"name": "Cora Welch",
"address": {"street": "191 First Street", "city": "Chapel Hill", "state": "NC"},
"county": "Orange",
"vehicles": ["scooter", "truck", "bicycle"]
}
The collection has been decompounded into separate but related tables. The
normalization of the residents collection yields one parent
table and two child tables. In the parent table, the _id
field is mapped as a primary key column, and fields with other simple types are mapped as
relational columns. The parent table adopts the name RESIDENTS and takes the following form:
| _ID (PK) | NAME | COUNTY |
|---|---|---|
| ajx363 | Sydney Smith | Wake |
| tzn525 | Cora Welch | Orange |
The information in the vehicles array maps
to a RESIDENTS_VEHICLES child table. In this table, a column
which refers back to the parent table is mapped as a concatenation of the parent table and the
_id field: RESIDENTS_ID.
(This column establishes a foreign key relationship to the parent table based on the _id field.) The vehicles array is
mapped to the column VEHICLES. Each value in the VEHICLES column corresponds to an element in the vehicles array. In addition, the driver generates a third column to
establish a primary key for the child table: VEHICLES_GENERATED_ID. (This automatically generated column enables the driver to
normalize nested arrays to any depth. The format of the unique keys are internal to the
driver.) This child table would take the following form:
| RESIDENTS_ID (FK) | VEHICLES | VEHICLES_GENERATED_ID (PK) |
|---|---|---|
| ajx363 | car | unique key |
| ajx363 | boat | unique key |
| ajx363 | bicycle | unique key |
| tzn525 | bicycle | unique key |
| tzn525 | scooter | unique key |
| tzn525 | truck | unique key |
The information in the address object maps
to a RESIDENTS_ADDRESS child table. In this table, a column
which refers back to the parent table is mapped as a concatenation of the parent name and the
_id field: RESIDENTS_ID.
(This column establishes a foreign key relationship to the parent table and functions as a
primary key within the child table.) Each subdocument found in the address object is mapped as a column. This table would take the following
form:
| RESIDENTS_ID (PK & FK) | STREET | CITY | STATE |
|---|---|---|---|
| ajx363 | 101 Main Street | Raleigh | NC |
| tzn525 | 191 First Street | Chapel Hill | NC |