How Grouping Works in SQL

One of our subscribers reached us and asked How Grouping works in SQL ?  So today , this is what we are going to do . We will discuss in detail about grouping in SQL and how it actually works behind the scenes. Also , this is our first post in Questions category and we will be doing more as there are bundle of requests in the queue. So if you want to contribute and like us to write something for you , please do not hesitate.

How Grouping works in SQL?

Grouping summarises the whole data set into smaller chunks of data where each row represents a group . Group is when we combine a group of things, elements under some rule. While working with SQL reports , a lot of times we will come across scenarios where we want to summarise the data. For this purpose we use MS SQL Group By Clause

Also see : MS SQL Group By Clause

because this clause summarise the data into groups . Time to dig in now.

Prepare the data

Examples and queries that we will see here will run on our University Management System database . If you want to use these SQL queries for practice , create the same database , you need to just copy paste the script from here .

Case Study : Grade Reports from University Management System

We are asked to get the top 5 students with top grades . One best query writing practice is to try and visualise the end result and then build the query around it. To do this , you need first to see the underlying data. If you run the below SQL query , you will have a good look at the data we have.

SELECT P.FirstName,C.Title,Grade
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID

How Grouping Works in SQL

 

Remember our objective. We need first five students with top marks out of these forty rows. Give a brief and a long look at the table. You are seeing three columns here. Student names , course names and respective grades. Objective is to find the top 5 students .

Let’s break up the table and visualise what is we want. We want two things

  1. Student Names
  2. Student Grades

Now it is clear that we want top achievers which are actually the students who have obtained highest marks. How can you find that? By adding marks in all courses of each student.

Aggregate functions in SQL will help us to add the grades . See the Grades columns , its a column of numbers.

How Grouping Works in SQL examples

Can you guess of which aggregate functions in SQL we will use? SQL Count () or SQL Sum() ?

See : Difference Between SUM() & Count ()

It has to be SQL Group By SUM  which will do the job for us. Now , we have already determined that when Grouping clause is in effect each row represents
a group . So to achieve the objective remove Course column from the query

Read : MS SQL Group By Clause 


SELECT P.FirstName,GradeFROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID

How Grouping Works in SQL example 2

See the table again. Let’s find total marks of each student. Run this query


SELECT P.FirstName,SUM(GRADE) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID


It will give us out this

How Grouping Works in SQL example 2

We have applied the aggregate function but haven’t told our SQL Server on what basis it should be grouped. We want to group the grades against each student that is why we are writing FirstName in query . To apply SQL aggregate function of SUM on grades column you need to put First Name in Group By Clause. When you do that it will create groups for each student like this


SELECT P.FirstName,SUM(GRADE) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID
GROUP BY P.FirstName

How Grouping Works in SQL example 4

All students are appearing only once because all same elements are grouped together . Each student was taking up more than one course so SQL Server first combines the elements into group. In our case , unique student names.  Now it will apply SQL Sum for each group. Combine both steps and SQL Server handles bot these steps like this

How Grouping Works in SQL example 5

This is the final output you will see

How Grouping Works in SQL example 6

See how SQL Server reduces the table into groups and then by the use of aggregate functions in SQL it gives us results for those groups. We will use TOP clause with Order By Clause to find the results

RESULT QUERY


SELECT TOP (5) P.FirstName,SUM(GRADE) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID
GROUP BY P.FirstName
ORDER BY SUM(GRADE) DESC

How Grouping Works in SQL example 7

If you use TOP clause always use Order By clause with it to get correct data. Because , results can be random , so if you apply Order By then the order will be consistent. Topic for another day!

ALL YOU NEED TO KNOW ABOUT GROUPING

  1. Grouping is applied on all columns as combination in the same order they appear in query.
  2. Each row represents a unique row.
  3. Filtering on grouping is done by using HAVING Clause
  4. Use ORDER BY clause to arrange the data set in a order.

if you can find more points about grouping , do comment and we will add this in our list.

I hope now you understand about How Grouping Works in SQL because this has been a brief articleNow if you like this article , do share with your friends and colleagues and recommend us . Oh! and if the sponsored content is bothering you , apologies for that too because it’s the only source of revenue for this blog.

See you with next post very soon!

 

Author: Awais

Professional Data Analyst and Business Intelligence Developer with experience of delivering industrial projects for Supply Chain and Insurance Industry . Sharing all my experience and insights in databases and data warehousing and open to learn from fellows ! Happy Reading and Learning!

Leave a Reply