Data Normalization is the process widely used and implemented in industry . While this concepts has it’s use it comes with a price. However , Designing a DBMS is fundamental and of utmost importance to the success of any software. You can think of DBMS as human brain and this is true because DBMS design is what controls everything in software . From storage of information to its retrieval , DBMS do everything for a software that our brain do. Hence , it’s very important for our DBMS to be able to clear paths for retrieval of information and its storage. In any case , it should avoid redundant data and anomalies. Check the links before going through Data Normalization With Example .
Process of removing redundancy , data duplication and elimination of anomalies and preventing loss of information.
If you are still unable to understand terms used above . Read these for better understanding and also there will be other terms which we have already discussed separately. Following are few articles which are pre-requisites to this article. If you are a beginner you can go through them or else you can proceed
Normalization Concept in DBMS was introduced to solve the data anomalies that were causing halts and serious data problems to the developers.
Also Read : Data Anomalies in DBMS
Normalization was also proposed by Dr. Codd and his 12 rules are the fundamentals to every DBMS design for decades now. Now that the context is built , we can proceed with the Normalization with example.
We will discuss four types of Normal Forms
- First Normal Form 1NF
- Second Normal Form 2NF
- Third Normal Form 3 NF
FIRST NORMAL FORM 1NF
Table is considered to be in First Normal Form 1NF only if it contains no repeating groups . Taking the example from ERD case study of University management system . See below
Two Columns , Zip Code and City they both have repeating groups in them. Can you imagine a way to identify which city has which zip code of two? Hence, we need to remove them and convert them in a way that each record can be recognised independently.
First Normal Form will be like this
No Repeating Groups in table for it to be in 1NF. Above table completes the condition
SECOND NORMAL FORM 2NF
Table is 2NF only and only if it already has 1NF and then it must be absolute, that there should not be partial dependency of any column on primary key. This part can be confusing , so let’s try to spend some more time here.
See the table above , It is in 1NF but let’s find other candidate keys in the table of Student. You have Student_ID which is the primary key and we have another key which is department’s foreign key Department_ID.
We can use both these keys to find details of student . Right? Let’s try to find dependency of every column on our composite key.
Student_ID,Department_ID -> Name , Zip Code & Department Name they both need these two keys to be able to identify them as to which student they belong to.
Student_ID,Department_ID -> City & Age, they both do not depend completely on our composite primary key rather their dependency is partial. City Depends on Zip Code and Age is relevant to Student_ID
Remove Partial dependency from table for it to be in 2NF
See below two tables Department Name Student Table:
THIRD NORMAL FORM 3NF
Can you guess the first condition for a table to be in 3NF? Yes , it has to be in 2NF first. Easy Peasy! Who thought Normalization could be this much fun?
Other condition is that for a table to be in 3NF every column should only be dependant on primary key and no other column this is called transitive dependency .
You can say we don’t want to have candidate keys in 3NF. From above tables , Student table has transitive dependency for Department Name. This column depends on Student ID and Department ID. Hence , we will create a new table with Department name and remove the department name column.
Student Table at 3NF will be
All other tables already satisfying all conditions. No transitive dependency columns in a table for it to be 3NF.
So we have converted this long table
Into these well-designed tables
Can you answer what is 3nf in dbms?
Answer is pretty straight forward .
Table with no transitive Dependency, no partial Dependency & does not have repeating groups is 3nf table
Data Normalization Rules
To sum it up and answer the question of How to normalize tables? Follow below normalization rules:
- No Repeating Groups in data will give us 1NF
- Remove Partial Dependency and 1 NF will give us 2NF
- No Transitive Dependency and 2NF will give us 3NF
Now that we are done with data normalization , We would like to share an insight . If you are a professional you already know this but if you are a student know this :
In Industrial database people hardly go to 3NF and let alone to Boyce and Codd’s Normal form or fourth and fifth normal form.
You can see the reason as to why , Normalization creates too many tables and query costs increase dramatically which is something we don’t want while retrieving data. It reduces redundancy ,duplication but increases tables and dependency between them. Hence , people tend to avoid them sometimes.
In above example, we have Age table with only column , that is unnecessary in practical terms . It will just add one more join to simple information. Anyways , you should know these facts as well to understand completely what data Normalization is .
We hope this was helpful. You can follow us on facebook and subscribe us by e-mail to get the latest articles. If you likes this please , share with your fellows and if there are mistakes . Please , feel free to correct them in comments. Afterall , we are all but Students at the end of the day!