Data Normalization University Management System

Data Normalization is the process widely used and implemented in industry . While this concepts has it’s use it comes with a price. However , Designing a DBMS is fundamental and of utmost importance to the success of any software. You can think of DBMS as human brain and this is true because DBMS design is what controls everything in software .  From storage of information to its retrieval , DBMS do everything for a software that our brain do. Hence , it’s very important for our DBMS to be able to clear paths for retrieval of information and its storage. In any case , it should avoid redundant data and anomalies. Check the links before going through Data Normalization With Example .

Data Normalization

Process of removing redundancy , data duplication and elimination of anomalies and preventing loss of information.

If you are still unable to understand terms used above . Read these for better understanding and also there will be other terms which we have already discussed separately. Following are few articles which are pre-requisites to this article. If you are a beginner you can go through them or else you can proceed

  1. Role Of Keys In DBMS
  2. Database Terminologies
  3. Data Modelling
  4. Anomalies 
  5. Redundancy  

Normalization was also proposed by Dr. Codd and his 12 rules are the fundamentals to every DBMS design for decades now.  Now that the context is built , we can proceed with the Normalization with example.

We will discuss four types of Normal Forms

  1. First Normal Form 1NF
  2. Second Normal Form 2NF
  3. Third Normal Form 3 NF

FIRST NORMAL FORM 1NF

Table is considered to be in First Normal Form 1NF only if it contains no repeating groups . Taking the example from ERD case study of University management system . See below

 

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000,48758Karachi,Lahore4Botany
3Nouman259887, 6556Faislabad,New York5English
4Jerry21266 ,5555Dubai, Abu Dhabi6Physical Education

Two Columns , Zip Code and City they both have repeating groups in them. Can you imagine a way to identify which city has which zip code of two? Hence, we need to remove them and convert them in a way that each record can be recognised independently.

First Normal Form will be like this

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000Karachi4Botany
2Faiz2148758Lahore4Botany
3Nouman259887Faislabad5English
3Nouman256556New York5English
4Jerry21266Dubai6Physical Education
4Jerry215555Abu Dhabi6Physical Education

No Repeating Groups in table for it to be in 1NF. Above table completes the condition

SECOND NORMAL FORM 2NF

Table is 2NF only and only if it already has 1NF and then it must be absolute that there should not be partial dependency of any column on primary key. This part can be confusing , so let’s try to spend some more time here.

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000Karachi4Botany
2Faiz2148758Lahore4Botany
3Nouman259887Faislabad5English
3Nouman256556New York5English
4Jerry21266Dubai6Physical Education
4Jerry215555Abu Dhabi6Physical Education

See the table above , it is in 1NF but let’s find other candidate keys in the table of Studnet. You have Student_ID which is the primary key and we have another key which is department’s foreign key Department_ID. We can use both these keys to find details of student . right? Let’s try to find dependency of every column on our composite key.

Student_ID,Department_ID -> Name , Zip Code & Department Name they both need these two keys to be able to identify them as to which student they belong to.

Student_ID,Department_ID -> City & Age they both do not depend completely on our composite primary key rather their dependency is partial.

Remove Partial dependency from table for it to be in 2NF

See below

Student_IDNameZipDept_IDDepartment Name
2Faiz00004Botany
2Faiz487584Botany
3Nouman98875English
3Nouman65565English
4Jerry2666Physical Education
4Jerry55556Physical Education

Student_IDZip CodeCity
20000Karachi
248758Lahore
39887Faislabad
36556New York
4266Dubai
45555Abu Dhabi

Student_IDNameAge
2Faiz21
3Nouman25
4Jerry21

We now have two more tables Student Age & Zip Code.

THIRD NORMAL FORM 3NF

Can you guess the first condition for a table to be in 3NF? Yes , it has to be in 2NF first. Easy Peasy! Who thought Normalization could be this much fun?

Other condition is that for a table to be in 3NF  every column should only be dependant on primary key and no other column  this is called transitive dependency . You can say we don’t want to have candidate keys in 3NF. From above tables , Student table has transitive dependency for Department Name. This column depends on Student ID and Department ID. hence , we will create a new table with Department name and remove the department name column.

Student Table at 3NF will be

Student_IDNameZipDept_ID
2Faiz00004
2Faiz487584
3Nouman98875
3Nouman65565
4Jerry2666
4Jerry55556

All other tables already satisfying all conditions. No transitive dependency columns in a table for it to be 3NF.

So we have converted this long table

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000,48758Karachi,Lahore4Botany
3Nouman259887, 6556Faislabad,New York5English
4Jerry21266 ,5555Dubai, Abu Dhabi6Physical Education

  Into these well-designed tables

Student_IDZip CodeCity
20000Karachi
248758Lahore
39887Faislabad
36556New York
4266Dubai
45555Abu Dhabi

Student_IDNameAge
2Faiz21
3Nouman25
4Jerry21

Student_IDNameZipDept_ID
2Faiz00004
2Faiz487584
3Nouman98875
3Nouman65565
4Jerry2666
4Jerry55556

 

How to Normalize Data ?

Follow these steps For Normalization :

  1. No Repeating Groups will give us 1NF
  2. Remove Partial Dependency and 1 NF will give us 2NF
  3. No Transitive Dependency and 2NF will give us 3NF

Now that we are done with data normalization , We would like to share an insight . If you are a professional you already know this but if you are a student know this . In Industrial database people hardly go to 3NF and let alone to Boyce and Codd’s Normal form or fourth and fifth normal form.

You can see the reason as to why , Normalization creates too many tables and query costs increase dramatically which is something we don’t want while retrieving data. It reduces redundancy ,duplication but increases tables and dependency between them. Hence , people tend to avoid them sometimes.

In above example we have Age table with only column , that is unnecessary in practical terms . It will just add one more join to simple information. Anyways  , you should know these facts as well to understand completely what data Normalization is .

We hope this was helpful. You can follow us on facebook and subscribe us by e-mail to get the latest articles. If you likes this please , share with your fellows and if there are mistakes . Please , feel free to correct them in comments. Afterall , we are all but Students at the end of the day!

Your One Dollar Will keep us moving.

Happy Learning!

1 thought on “Data Normalization University Management System

Leave a Reply