SQL Interview Questions
SQL
BackendWeb DevelopmentData ScienceQuestion 24
What is normalization? Explain different normal forms.
Answer:
Normalization is a database design technique used to minimize data redundancy and ensure data integrity by organizing tables and their relationships. The process involves dividing a database into two or more tables and defining relationships between them to eliminate anomalies during data operations such as insertions, deletions, and updates.
Objectives of Normalization:
- Eliminate Redundant Data: Ensure that data is stored only once.
- Ensure Data Integrity: Maintain consistency and accuracy of data across the database.
- Improve Database Efficiency: Optimize database performance by reducing the amount of duplicated data.
Different Normal Forms:
Normalization involves applying a series of rules called normal forms (NF) to the database. Each normal form has specific requirements that a table must satisfy.
1. First Normal Form (1NF)
A table is in the first normal form if:
- All columns contain atomic (indivisible) values.
- Each column contains values of a single type.
- Each column has a unique name.
- The order in which data is stored does not matter.
Example:
- A table containing a list of customers with columns for
CustomerID
,CustomerName
,PhoneNumbers
is not in 1NF ifPhoneNumbers
contains multiple values. It must be split into individual rows or another table to achieve 1NF.
CustomerID | CustomerName | PhoneNumber
1 | John Doe | 123-456-7890
1 | John Doe | 234-567-8901
2. Second Normal Form (2NF)
A table is in the second normal form if:
- It is in 1NF.
- All non-key attributes are fully functionally dependent on the primary key.
Example:
- A table containing
OrderID
,CustomerID
,CustomerName
is not in 2NF becauseCustomerName
depends only onCustomerID
and not onOrderID
. This requires creating a separate table for customers.
Orders Table:
OrderID | CustomerID | OrderDate
1 | 1 | 2023-01-01
2 | 1 | 2023-02-01
Customers Table:
CustomerID | CustomerName
1 | John Doe
3. Third Normal Form (3NF)
A table is in the third normal form if:
- It is in 2NF.
- All non-key attributes are not only fully functionally dependent on the primary key but also non-transitively dependent on the primary key (i.e., there are no transitive dependencies).
Example:
- A table containing
OrderID
,CustomerID
,CustomerName
,CustomerAddress
is not in 3NF becauseCustomerAddress
depends onCustomerID
, not directly onOrderID
. This requires creating a separate table for customer details.
Orders Table:
OrderID | CustomerID | OrderDate
1 | 1 | 2023-01-01
Customers Table:
CustomerID | CustomerName | CustomerAddress
1 | John Doe | 123 Main St
4. Boyce-Codd Normal Form (BCNF)
A table is in BCNF if:
- It is in 3NF.
- For every functional dependency (X → Y), X should be a super key.
Example:
- A table with
CourseID
,InstructorID
,InstructorName
where each instructor teaches only one course is not in BCNF becauseInstructorName
depends onInstructorID
butInstructorID
is not a super key.
Instructors Table:
InstructorID | InstructorName
1 | Dr. Smith
Courses Table:
CourseID | InstructorID | CourseName
1 | 1 | Math
5. Fourth Normal Form (4NF)
A table is in the fourth normal form if:
- It is in BCNF.
- It has no multi-valued dependencies.
Example:
- A table with
StudentID
,CourseID
,Hobby
where students can have multiple hobbies and enroll in multiple courses is not in 4NF because there are multiple values forHobby
andCourseID
.
Students Table:
StudentID | Hobby
1 | Chess
1 | Soccer
Courses Table:
StudentID | CourseID
1 | Math
1 | Science
6. Fifth Normal Form (5NF)
A table is in the fifth normal form if:
- It is in 4NF.
- It cannot be decomposed into smaller tables without losing information (no join dependency).
Example:
- This is more theoretical and rarely encountered in practical scenarios. It deals with reconstructing a table from smaller tables without any loss of data.
Summary
Normalization is a crucial aspect of database design aimed at reducing redundancy and ensuring data integrity. Each normal form builds upon the previous one, adding more stringent requirements to eliminate potential anomalies and inefficiencies. Understanding and applying these normal forms can lead to a more robust, efficient, and maintainable database structure.