Third normal form

The third rule of normalization is that you must remove columns that can be derived from existing columns. A table is in the third normal form when it contains only independent columns, that is, columns not derived from other columns.

The following table shows an Order table with a Total After Tax column that is calculated from adding a 10% tax to the Order Amount column.

Table 1. Order table with derived column
Order Number (Primary key) Order Date Order Amount Total After Tax Cust Num (Foreign key)
M31 3/19/05 $400.87 $441.74 101
M98 8/13/05 $3,000.90 $3,300.99 101
M129 2/9/05 $919.45 $1011.39 101
M56 5/14/04 $1,000.50 $1,100.55 102
M37 12/25/04 $299.89 $329.87 103
M140 3/15/04 $299.89 $329.87 103
M41 4/2/04 $2,300.56 $2,530.61 104

To reduce this table to the third normal form, eliminate the Total After Tax column because it is a dependent column that changes when the Order Amount or tax changes. For your report, you can create an algorithm to obtain the amount for Total After Tax. You need only keep the source value because you can always derive dependent values. Similarly, if you have an Employee table, you do not need to include an Age column if you already have a Date of Birth column, because you can always calculate the age from the date of birth.

A table that is in the third normal form gives you these advantages:

  • It uses disk space more efficiently because no unnecessary data is stored
  • It contains only the necessary columns because superfluous columns are removed

Although a database normalized to the third normal form is desirable because it provides a high level of consistency, it might impact performance when you implement the database. If this occurs, consider denormalizing these tables.