MIS 515: Data Management
Spring Semester 2004
Prof
O.Petkova
Normalization is a method for organizing data elements
in a database into tables.
Normalization
Avoids
Normalization
is a three stage process – After the first stage, the data is said to be in
first normal form, after the second, it is in second normal form, after the third,
it is in third normal form
1.
Begin with a list of all of the fields that must appear in the
database. Think of this as one big
table.
2.
Do not include computed fields
3.
One place to begin getting this information is from a printed document
used by the system.
4.
Additional attributes besides those for the entities described on
the document can be added to the database.
See
Sales Order from below:

Fields
in the original data table will be as follows:
ORDERS
SalesOrderNo, Date, CustomerNo, CustomerName, CustomerAdd, ClerkNo, ClerkName, ItemNo,
Description, Qty, UnitPrice
Think
of this as the baseline – one large table
Some definitions:
A
table is in 2NF if it is in 1NF and the partial dependencies are eliminated.
Third
A
table is in 3NF if it in 2NF and the transitive dependencies are eliminated.
1NF: (Primary key is determined and
multiple values are eliminated); obviously the combination of SalesOrderNo and
ItemNo is
the primary key.
ORDERS
SalesOrderNo, Date, CustomerNo, CustomerName, CustomerAdd, ClerkNo, ClerkName, ItemNo,
Description, Qty, UnitPrice
From 1NF to 2NF:
From
the table ORDERS we have to identify the partial dependencies:
There
is a partial dependency (attribute depending on another attribute,
that is a part of the primary key) between SalesOrderNo
and Date, CustomerNo, CustomerName,
CustomerAdd, ClerkNo, ClerkName; so we can eliminate it in a separate table:
SALES
SALES
SalesOrderNo, Date, CustomerNo, CustomerName, CustomerAdd, ClerkNo, ClerkName (2NF)
There
is another partial dependency between IntemNo and Desription, UnitPrice; we can
eliminate it into a separate table: ITEMS
ITEMS
ItemNo, Description, UnitPrice (2NF)
What
is left from the original ORDERS table is:
ORDERS
SalesOrderNo, ItemNo, QTY (2NF)
We can claim that
the three tables SALES, ITEMS and ORDERS are in 2NF, because they are in 1NF
and they don’t contain partial dependencies.
From 2NF to 3NF:
We
explore the three 2NF tables above for transitive dependencies (attribute
depending on another attribute that is not a part of the primary key). Obviously,
tables ITEMS and ORDERS have no problems, but in table SALES we can identify
two transitive dependencies:
CustomerName and Customer Address (CustomerAdd)
depend on CusomerNo
and
ClerkName depends on ClerkNo.
We
shall eliminate these dependences by removing them from SALES and creating two
additional tables:
CUSTOMERS
CustomerNo, CustomerName, CustomerAdd (3NF)
and
CLERKS
ClerkNo, ClerkName (3NF)
The
table SALES will be transformed as follows:
SALES
SalesOrderNo, Date, CustomerNo, ClerkNo (3NF)
Please pay attention that we didn’t remove all the information
about customers and clerks from the SALES table, but kept the two tables
primary keys as foreign keys in SALES.
ITEMS and ORDERS
are also in 3NF, because they don’t have transitive dependencies.
To conclude: We
transformed SALES into five 3NF tables:
SALES
CLERKS
CUSTOMERS
ITEMS
ORDERS.