Database normalization is defined as the rules that help in building relational database. In the database building process, the redundant data is eliminated and the related data is stored in the table. Basically, this process is used to organize data so that its performance is optimum. The process of normalization involves breaking a single table into two or more tables and defining the relationships between them. The purpose of isolating data in more than one table is to add, delete, and modify fields in one table and then propagate the data through the rest of the database.
Normalization guidelines are divided into five normal forms:
The first normal form describes the basic rules to normalize a database. It follows the rules listed below:
Consider the following Student table:
In Student table, the Course field is a multi-valued attribute and has no single value for any field. Now, consider this Student table:
The Course1, Course2, Course3 fields represent repeating groups and to brought it to First Normal Form we will store the data in following ways.
In the first two designs of the data, the students enrolled for a certain course is difficult. Say I want to do the following:
Tell me all of the students enrolled in course 4100. In the first design, you'll have to pull all of the course data and parse it somehow. And in the second design, you'll have to check 3 different fields for course 4100. In the final design, a simple Select StudentID from Student where Course=100.
Second Normal Form
The second normal form is defined to remove duplicate data from a table. These are as follows:
Normalization guidelines are divided into five normal forms:
- 1st Normal Form
- 2nd Normal Form
- 3rd Normal Form
- 4th Normal Form
- 5th Normal Form
The first normal form describes the basic rules to normalize a database. It follows the rules listed below:
- Remove duplicate columns from a table.
- Segregate tables with a specific group of data.
Consider the following Student table:
StudentID | Course |
12345 | 4100,4200,4300 |
54321 | 2100,2200,2300 |
In Student table, the Course field is a multi-valued attribute and has no single value for any field. Now, consider this Student table:
StudentID | Course1 | Course2 | Course3 |
12345 | 4100 | 4200 | 4300 |
54321 | 2100 | 2200 | 2300 |
The Course1, Course2, Course3 fields represent repeating groups and to brought it to First Normal Form we will store the data in following ways.
StudentID | Course |
12345 | 4100 |
12345 | 4200 |
12345 | 4300 |
54321 | 2100 |
54321 | 2200 |
54321 | 2300 |
In the first two designs of the data, the students enrolled for a certain course is difficult. Say I want to do the following:
Tell me all of the students enrolled in course 4100. In the first design, you'll have to pull all of the course data and parse it somehow. And in the second design, you'll have to check 3 different fields for course 4100. In the final design, a simple Select StudentID from Student where Course=100.
Second Normal Form
The second normal form is defined to remove duplicate data from a table. These are as follows:
- It includes all rules of the first normal form.
- Remove the data from the table that applies to multiple rows and then arrange it in the other table by creating relations.
Third Normal Form
The third normal form is defined to remove the columns that do not depend upon constraints. It has the following rules:
- It includes all rules of the second normal form.
- It removes those columns that do not depend upon the Primary keys.
No comments:
Post a Comment