Normalizing native data
- Last Updated: October 27, 2020
- 2 minute read
- DataDirect Connectors
- ODBC
- MongoDB 8.0
- Documentation
For example, the collection residents contains the array vehicles and subdocuments in the address document (or object). The collection's JSON structure
can be rendered as follows:
{"_id": "ajx363",
"name": "Sydney Smith",
"address": {"street": "101 Main Street", "city": "Raleigh", "state": "NC"},
"county": "Wake",
"vehicles: ["car", "boat", "bicycle"]
}
{"_id": "tzn525",
"name": "Cora Welch",
"address": {"street": "191 First Street", "city": "Chapel Hill", "state": "NC"},
"county": "Orange",
"vehicles": ["scooter", "truck", "bicycle"]
}
The collection has been decompounded into
separate but related tables. The normalization of the residents collection yields one parent table
and two child tables. In the parent table, the _id field is mapped as a primary key column, and fields with
other simple types are mapped as relational columns. The parent table
adopts the name RESIDENTS and takes the
following form:
| _ID (PK) | NAME | COUNTY |
|---|---|---|
| ajx363 | Sydney Smith | Wake |
| tzn525 | Cora Welch | Orange |
The information in the vehicles array maps to a RESIDENTS_VEHICLES child table. In this table, a column which
refers back to the parent table is mapped as a concatenation of the parent
table and the _id field: RESIDENTS_ID. (This column establishes a
foreign key relationship to the parent table based on the _id field.) The vehicles array is mapped to the column VEHICLES. Each value in the VEHICLES column corresponds to an element in the vehicles array. In addition, the driver
generates a third column to establish a primary key for the child table:
VEHICLES_GENERATED_ID. (This
automatically generated column enables the driver to normalize nested
arrays to any depth. The format of the unique keys are internal to the
driver.) This child table would take the following form:
| RESIDENTS_ID (FK) | VEHICLES | VEHICLES_GENERATED_ID (PK) |
|---|---|---|
| ajx363 | car | unique key |
| ajx363 | boat | unique key |
| ajx363 | bicycle | unique key |
| tzn525 | bicycle | unique key |
| tzn525 | scooter | unique key |
| tzn525 | truck | unique key |
The information in the address object maps to a RESIDENTS_ADDRESS child table. In this table, a column which
refers back to the parent table is mapped as a concatenation of the parent
name and the _id field: RESIDENTS_ID. (This column establishes a
foreign key relationship to the parent table and functions as a primary
key within the child table.) Each subdocument found in the address object is mapped as a column. This
table would take the following form:
| RESIDENTS_ID (PK & FK) | STREET | CITY | STATE |
|---|---|---|---|
| ajx363 | 101 Main Street | Raleigh | NC |
| tzn525 | 191 First Street | Chapel Hill | NC |