SQL Interview Questions
SQL
BackendWeb DevelopmentData ScienceQuestion 36
What is the difference between RANK()
and DENSE_RANK()
?
Answer:
RANK()
and DENSE_RANK()
are both window functions in SQL that assign a rank to each row within a partition of a result set. However, they handle ties (i.e., rows with the same values in the ordering column) differently.
RANK()
The RANK()
function assigns a unique rank to each distinct value in the order specified. If there are ties, it assigns the same rank to all tied rows but skips the subsequent ranks.
Example
Given a table scores
:
student_id | score |
---|---|
1 | 95 |
2 | 85 |
3 | 95 |
4 | 75 |
5 | 85 |
Using RANK()
:
SELECT student_id, score,
RANK() OVER (ORDER BY score DESC) AS rank
FROM scores;
Result:
student_id | score | rank |
---|---|---|
1 | 95 | 1 |
3 | 95 | 1 |
2 | 85 | 3 |
5 | 85 | 3 |
4 | 75 | 5 |
Explanation:
- Students with scores 95 are ranked 1.
- The next rank after the tie is 3 (skipping 2).
- Students with scores 85 are ranked 3.
- The next rank is 5 (skipping 4).
DENSE_RANK()
The DENSE_RANK()
function also assigns a rank to each distinct value in the order specified. If there are ties, it assigns the same rank to all tied rows, but it does not skip any ranks.
Example
Using DENSE_RANK()
on the same table:
SELECT student_id, score,
DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rank
FROM scores;
Result:
student_id | score | dense_rank |
---|---|---|
1 | 95 | 1 |
3 | 95 | 1 |
2 | 85 | 2 |
5 | 85 | 2 |
4 | 75 | 3 |
Explanation:
- Students with scores 95 are ranked 1.
- The next rank is 2 (no skipping).
- Students with scores 85 are ranked 2.
- The next rank is 3.
Summary of Differences
Aspect | RANK() | DENSE_RANK() |
---|---|---|
Tie Handling | Assigns the same rank to tied rows and skips subsequent ranks. | Assigns the same rank to tied rows and does not skip ranks. |
Resulting Ranks | May have gaps if there are ties. | Consecutive ranks, no gaps. |
Use Case | Useful when you need to understand the relative position with gaps. | Useful when you need consecutive ranking without gaps. |
Conclusion
The choice between RANK()
and DENSE_RANK()
depends on how you want to handle ties and gaps in the ranking sequence. Use RANK()
when you want to have gaps in the ranking sequence after ties, and use DENSE_RANK()
when you want consecutive ranks without gaps. Understanding the differences helps in selecting the appropriate function based on the specific requirements of your data analysis.