In this tutorial, l’m going to discuss the topic of normalization. normalization is the technique of minimizing redundancy and anomaly in the data. It leads to creation of more tables. There a many normal forms but l’m going to discuss only the first 3 normal forms because they are the most important. To begin with, let’s look at the topic of functional dependency because normalization is built around this concept.

Functional dependency is when one attribute depends on another attribute.  For example, letter grade depends on mark the student gets in a course. So we say, the mark determines the grade or the grade depends on the mark.

There are three types of functional dependencies; Partial, Total and Transitive Functional Dependencies.

Partial functional dependency is when an attribute depend on only some of the components of the attribute it depends on.Let’s see an example, let’s say we have a table for course and it has a  composite primary key consisting of a courseid,rm#. then there are other attributes in the course table such as coursename,time, room name.Now if we look carefully, the course name depends on the course id so if our primary key for the table is courseid and rm#, that means the course name depends on only one component of the primary key. This is called a partial functional dependency. By the way, the functional dependency is in terms of the primary key of the table. The primary key can be composite or just a single component key. If it’s a single component primary key, then every other component will totally depend on it but if it’s a composite primary key, some columns can be partially functional dependent and some can be totally dependent on the key. Total functional dependency simply means the attribute is totally dependent on all the components of the primary key. Transitive Functional dependency results when one attribute depends on another attribute and another attribute depends on the attribute that depends on the other attribute. That’s a long one there. Lets use  this demonstration to clarify things

A->B->C

So let’s say, B depends on A and C depends on B, then we say C is transitive functional dependent on A.

Ok, so knowing all these dependencies, we can go into normalization now and you’ll understand why we need to understand the functional dependencies first.

The first form of normalization is called First Normal form and it’s shorten as 1NF.

In 1NF, we have to make sure there is single value in each cell, rows are not duplicated, primary key present in the table and  All attributes are functionally dependent on the Primary Key. Easy right?  Here is a table in 1NF.

Students(sid, cid, mark, studentname, rm#)—-first normal because primary key present and all columns are functionally dependent on the primary key.

The second form of normalization is called Second Normal form and it’s shorten as 2NF.

In 2NF, we have to make sure the table is already in 1NF. Then we remove all Partial functional dependencies existing in the table by creating new tables.  From the above student table, check this out

sid , cid–>name —partial dependency because name depends only on sid. by the way sid is student id and cid is course id. You don’t need a course id to know a student name. so the name depends on only one component of the composite primary key of sid,cid. To push this table into 2NF, we have to break the partial functional dependency by creating a new table to take care of the partial dependencies. Let us also assume knowing the course id will allow us to tell the room number where that course will be taught, that means rm# depends only on the cid component of the composite primary key. Other columns that have no problem will be left as it is. That means in 2NF, we have these tables below.

Students(sid, cid, mark)

Course_Room(cid, rm#), Primary key here can be cid

Student_name(sid,name), Primary key here can be sid

The new tables have no partial functional dependencies. So the tables are in 2NF.

The third Normal form, shortened as 3NF is where the table already exists in the 2NF and you remove the transitive functional dependencies that exist between attributes. Assume again, we have a table student which looks like this

student(sid,cid,name,rm)

Let’s say in this case, rm is the room or dormitory of the student. Now, if we can find the name of the student using the sid, then name  depends on sid, also we can find the room of the student by looking up their name. This means rm is transitive functionally dependent on sid. So we breaking  up this relationship and creating a new table  as follows,

(sid, name)

(sid, cid, marks)

(name, rm)

There are other normal forms but l’m only ending this tutorial here because the first three are the most important. You can normalise as far as you want but in some situations, you have to denormalise for efficient working of the database. So normalisation has advantages and disadvantages.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

Name *