Normalization :In the field of relational database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies that could lead to a loss of integrity.
Or
Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating two factors: redundancy and inconsistent dependency.
Following rules are applied when we normalize relations.
1.Eliminating repeating groups
2.Eliminating repeating data
3.Eliminating columns not depending on key.
4.Isolate independent multiple relationship
5.Isolate semantically related multiple relationships
Types
1N (first normalization):- A table is said to be in 1N if there is no repeating groups (fields or tuple for all records) in individual tables. For example
Here actually there is one group named class which is repeated and which should not be in table of two dimensions so let’s make this table in 1N and that is like,
Now this table is in 1N.
2N (second normalization):- A table is said to be in 2N if it is in 1N and no records are functionally dependant other than primary keys. Let’s take an example,
In above table, the field class is not functionally dependant on primary key(student #) so it is not in 2N. To bring this in 2N let’s break this into two tables given below.
Students
Registration
Now we can see the above tables are in 2N completely.
3N (3rd normalization):- A table is said to be in 3N if it is in 2N and if there is no field which does not depend on key. For example,
Students
In the above example, Adv-Room (the advisor's office number) is functionally dependent on the Advisor attribute. The solution is to move that attribute from the Students table to the Faculty table, as shown below: students
Faculty
Now we can see the tables (broken ones) are in 3N.
another example can be like,
1 NF:
A TABLE IS SAID TO BE IN 1n IF there is not repeating groups or information in a table..Here, repeating group means a set of columns that stores similar information that repeats in the same table.
Let’s consider following SQL commands.
Create table contacts
(
Contact Id Integer not null,
L_name varchar(20) not null,
F_name varchar(20)
Contact_date1 date,
Contact_desc1 varchar(50),
Contact_date2 date,
Contact_desc2 varchar(50),
);
We can see here in this table that there is a repeating group of date and description.
Now to convert into 1 N, we have to make a new table and shift that repeating group into new table.
Like,
Create table contacts
(
Contact_ID integer not null,
L_name varchar(20) not null,
F_name varchar(20)
);
Create table conversation
(
Contact_id integer not null,
Contact_date date,
Contact_desc varchar(50)
);
Now we have eliminated the repeating groups and established relationship between them by using primary key and hence table is in 1N.
2N:-
A table is said to be in 2 N if it is in 1N and there is no redundant data in table i.e. if a value of column is dependent on one column but not another.
For example:
Create table employee
(
Emp_no integer not null,
L-name varchar(20) not null,
F_name varchar(20),
Dept_code integer,
Description varchar(50),
);
This table contains redundant data i.e. the value of column(field) “description” depends on dept_code but does not depend on primary key “emp_no”.So let’s make a new table and shift them into that.
Create table employee
(
Emp_no integer not null,
L_name varchar(20) not null,
F_name varchar(20),
Dept_code integer,
);
Create table department
(
Dept_code integer not null,
Description varchar(50) not null,
);
We got two tables named employee and department with fields. Both the tables are related by primary key called dept_code. Now there is no redundancy and table is in 2N.
3N:
A table is said to be in 3N if it is in 2N and there is no column which is/are not dependent on key.
To be more clear, let’s see an example of table given below.
Create table contacts
(
Contact_ID integer not null,
L_name varchar(20) not null,
F_name varchar(20),
Company_name varchar(20),
Company_location varchar(50),
);
Clearly we can see here that company location differs on an individual basis and does not depend on the key value and should be removed to another table. It can be shown like,
Create table contacts
(
Contact_id integer not null,
L_name varchar(20) not null,
F_name varchar(20),
Copany_id integer,
);
Create table company
(
Company_id integer not null,
Company_name varchar(20),
Company_location varchar(50),
);