SQL Index Tutorial | SQL Optimisation Part 1

SQL index is a distinct structure in the database that is built using the CREATE INDEX statement. Mostly , DBAs love & hate SQL indexes at the same time due to the tricky nature they posses. They can either speed up your databases or slow it down , depending on how correctly you have implemented them. People familiar with programming languages understand the behaviour of pointers very well. An index is the pointer in database.

WHAT IS SQL INDEX

As said , SQL indexes are a distinct data structure such as B+tree or balanced tree where reference to the original data is stored with the keys.  It requires its own disk space. Creating an index does not change the table data; it just creates a new data structure that refers to the table. You can also put it as SQL Index are special lookup tables that the database search engine can use to speed up data retrieval.

SQL INDEX CREATE Command

Syntax to create SQL Index is as follows :

 CREATE INDEX index_name ON table_name;

You define index on columns and you can do it two ways

*Single-Column Indexes

Declare index on one column only :

 CREATE INDEX index_name ON table_name (column1) ;

*Composite Indexes ( Multiple Columns)

Declare index on more than one column

 CREATE INDEX index_name ON table_name (column1) ;

Implicit Indexes

Implicit indexes are indexes that server creates automatically  upon creation of the object . Indexes are automatically created for primary key constraints and unique constraints.

SQL INDEX DROP Command

You can Drop index with the below code

 DROP INDEX index_name;

4 Scenarios WHEN SQL Index is Not An Option

  1. When tables are too small
  2. Tables have frequent, large batch updates or insert operations.
  3. When columns contain a high number of NULL values.
  4. Manipulation frequency of Column(s) is high.

SQL indexes are very important when it comes to the optimisation of database and enhancing query performance. The selection criteria for index and the columns which should be selected for index , needs a special consideration. This is the reason we have decided to go deep with this topic and write separate posts :

  1. How Does an SQL Index work.
  2. Clustered & Non Clustered Indexes.
  3. Fragmentation of Indexes

We will see how an SQL index sort data and how it helps to optimise the performance of the database. How data is organised in data structures and what will happen upon each read /write operation.

At the end we will discuss types if indexes with practical examples because then you will be able to understand why DBAs love & hate SQL .

Stay Tuned for next articles!

Since you are here. Support BI Talks BI in efforts to meet the Maintenance costs so that we can keep serving you up with the latest articles.

Advertisements

Table vs View – All You Need To Know

Table vs View is a topic of debate among DBAs and beginners because the similarities they both share . We will discuss various aspects of both tables and views . Such as:

  1. Difference between Table vs View.
  2. Table vs View Performance.
  3. Advantages of Views in SQL

In SQL Beginner’s Guide we have developed basic understanding about views. This article will discuss advance aspects of the topic. In case you need some background reading

Read : What is View in SQL ?

Difference between Table vs View

A table is where you store your data. The table actually occupies space on disk and is created by defining columns , data types & Constraints . They are the entities and responsible to store data . From our sample university database . Below is the Course Table

Table Vs Views
Table

This table is created by a following DDL statement :

CREATE TABLE [dbo].[Course](
[CourseID] [int] NOT NULL,
[Title] [nvarchar](100) NOT NULL,
[Credits] [int] NOT NULL,
[DepartmentID] [int] NOT NULL,
)

View  is a created by DML statement (SELECT Statement) which doesn’t occupy space on disk . View is literally a query . Nothing more nothing less . What it does is it executes the query defined in it’s structure and display the result which looks a lot like table.

Run this query in our Sample database


CREATE VIEW CourseView AS 
SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]

SELECT *
FROM [School].[dbo].[CourseView] 

This query will produce the same result

SQL Subquery
CourseView

Difference is in the behaviour these two behave and the purpose they serve.

We create views to avoid writing same piece of code multiple times on multiple places . For instance : You want to see the departments name against the Department ID with  courses from above table . The query will be as follows :

SELECT [CourseID]
,[Title]
,[Credits]
,D.[DepartmentID] , D.Name
FROM [School].[dbo].[Course] C

LEFT JOIN [School].[dbo].Department D
ON C.DepartmentID=D.DepartmentID

When to use Views?

Now you will write this query once , execute it and then what will you do if you need this one more time? will you write this piece of code again? What if your information is scattered over 10 different tables  & you need to execute this in 20 different places? This Scenario is common in Enterprise Applications. You may say , this is not a problem because we can save the files and call them every time. Problem is that when a query  executes , it goes through various stages in Database engine .

Two main stages are

  1. Query Parsing
  2. Query Plan

If you run this query 20 times. Your database engine will parse the query & make the query plan 20 times ( I know smart engines may cache frequent queries -for the sake of argument , let’s think they won’t) . Therefore , your query will acquire & release locks approx. 20 times . Therefore ,This will create bottlenecks for other operations as well.

Instead of all this just create a VIEW . This way for the same query , database engine will create query plan and parse the query only once. Then whenever you call view it will simply execute the query to return results.

CREATE VIEW CourseView AS
SELECT [CourseID]
,[Title]
,[Credits]
,D.[DepartmentID] , D.Name
FROM [School].[dbo].[Course] C

LEFT JOIN [School].[dbo].Department D
ON C.DepartmentID=D.DepartmentID

Next time if we have to execute it , we will simply call CourseView to display result.

Views can be updated , deleted & inserted though with a lot of restrictions. The base table for these operations should be same . More complex the view more difficult it is to do DDL operations.

CAN we have insert or update statement in view?

No we cannot do this . For such scenarios , use Stored Procedures 

CAN VIEWS HAVE INDEXES?

Yes , index views are called materialised view. Very useful for reporting purposes & highly functional .
We will soon write on materialised views as they are related to query optimisation section . The series is in the queue .

Table vs View Performance

This question is frequently asked in interviews & comes up frequently in our daily work. Answer to the question is as dynamic as it can be. This is one of those questions that doesn’t have one right answer because –It Depends

If the table and the view have the same structure and data in it, a view is slower than a normal query, because the view has to be generated from the table every time you query it. This is a step which is not necessary in a normal query. Therefore, view has normally only a piece of data from a table, which in turn makes  view faster than a normal query.

To further make it easy for you remember this

  1. It depends on the query- How well written it is.
  2. Do not use like operator in SQL Views because It will take millions of years if you’re data is big
  3. It depends on how many Joins are use to fetch data. Normally , if there are more tables . DBAs usually use multiple views to optimise speed. For example if the information is coming from 15 different tables. DBA may use 8 tables in one View lets call it View A and then later use the View A to form View B with remaining 7 tables.
  4. It depends on how well indexed your base table is. If the base table has poor index . View can’t do much.
  5. Advantages of Views in SQL
  6. Simplify the complex queries .For instance , reporting team wants to access the data. Usually , well managed databases can be large & complex . Sometimes , DBAs create views and make the data available hiding the complex underlying structure for other users. Though this practice is not recommended but still in the wake of meeting deadlines . Developers can be rough on naming columns and tables conventions.
  7. Computed Columns /Derived Columns can be created in Views. Great way to display reports. Write queries on multiple tables and create bulky reports with simple Views
  8. A database view provides extra security layer. Security is a vital part of any relational database management system. The database view offers additional protection for a database management system. The database view allows you to create the read-only view to expose read-only data to specific users. Users can only retrieve data in read-only view but cannot update it because he wont’t have access to data.
  9. Limits the access of data to certain users. For instance , your client may not want developers to access finance data or sensitive information.Because , this may create data breach. Hence , they need to have limited access.  This can be easily manage with Views.

Table vs View is one of the favourite topics among DBAs. Hopefully , you have had enough content here to understand the difference. In case , you want to add anything . Feel free to comment below!

Lock Modes in SQL Server Part 2

Lock Modes in SQL Server ensures the concurrency and data integrity for multiple users. This is essential because we want all users to be able to use the database without any delays. Every thing to know about Levels of Locking in SQL Server read below link

Read More : Locking in SQL Server Part 1 

Lock Modes

Following are the lock modes which that can engage with the resources :

  • Exclusive (X)
  • Shared (S)
  • Update (U)

Exclusive lock (X)

This lock type, when imposed, will ensure that a page or row will be reserved exclusively for the transaction that imposed the exclusive lock, as long as the transaction holds the lock.

The exclusive lock will be imposed by the transaction when it wants to modify the page or row data, which is in the case of DML statements DELETE, INSERT and UPDATE. An exclusive lock can be imposed to a page or row only if there is no other shared or exclusive lock imposed already on the target. This practically means that only one exclusive lock can be imposed to a page or row, and once imposed no other lock can be imposed on locked resources

Shared lock (S) 

This lock type, when imposed, will reserve a page or row to be available only for reading, which means that any other transaction will be prevented to modify the locked record as long as the lock is active. However, a shared lock can be imposed by several transactions at the same time over the same page or row and in that way several transactions can share the ability for data reading since the reading process itself will not affect anyhow the actual page or row data. In addition, a shared lock will allow write operations, but no DDL changes will be allowed

Update lock (U)

This lock is similar to an exclusive lock but is designed to be more flexible in a way. An update lock can be imposed on a record that already has a shared lock. In such a case, the update lock will impose another shared lock on the target row. Once the transaction that holds the update lock is ready to change the data, the update lock (U) will be transformed to an exclusive lock (X). It is important to understand that update lock is asymmetrical in regards of shared locks. While the update lock can be imposed on a record that has the shared lock, the shared lock cannot be imposed on the record that already has the update lock

Intent locks (I)

This lock is a means used by a transaction to inform another transaction about its intention to acquire a lock. The purpose of such lock is to ensure data modification to executed properly by preventing another transaction to acquire a lock on the next in hierarchy object. In practice, when a transaction wants to acquire a lock on the row, it will acquire an intent lock on a table, which is a higher hierarchy object. By acquiring the intent lock, the transaction will not allow other transactions to acquire the exclusive lock on that table (otherwise, exclusive lock imposed by some other transaction would cancel the row lock).

This is an important lock type from the performance aspect as the SQL Server database engine will inspect intent locks only at the table level to check if it is possible for transaction to acquire a lock in a safe manner in that table, and therefore intent lock eliminates need to inspect each row/page lock in a table to make sure that transaction can acquire lock on entire table

This is all the basics about locking in SQL Server . See you with next article.

 

Locking in SQL Server

Locking in SQL Server is mechanism to control and manage concurrency maintaining data integrity and validation. This allows multiple users to access  and use database concurrently.

Consider the scenario and let me paint the complete picture here. When we talk about concurrency in DBMS , we are talking about DBMS allowing multiple users conducting multiple transactions and operations all at the same time. The operation can be read , update , delete or create. All of this , should execute on correct data. If Transaction T1 updates a row transaction T2 should be able to read the updated data. All transactions should satisfy their ACID properties.

Must-Read : SQL Server Concurrency & Transaction Management

100 transactions are trying to use the same table. Some of these transactions will only read data others will change the data. There has to be some way or mechanism in place , which should decide whether to show committed data or uncommitted data to other transactions. Maybe , we need to make sure that unless a transaction is not complete , no other transaction can read the data. Depending on situation , may be some other approach can be used. However , To do this SQL Server offers isolation levels.

Also-Read  : SQL Server Isolation Levels

Now that all the pieces are in place and if you are familiar with the topics from above . Can you tell about transaction life cycle?

Transaction Life Cycle

Locking in SQL Server

Transaction has ACID properties and its execution is very simple. If a transaction is successful and all the operations in it are complete. Transaction is successful and it gets commit. If not, all operations which were previously completed in that transaction will face a rollback.

Lets take two transactions here .

Transaction T1


SELECT * FROM Students

Transaction T2


delete FROM Students

SQL Server take these two transactions to execute them concurrently  checks the isolation level . Default Isolation level for SQL Server is Read committed isolation level , so naturally T2 won’t be able to delete data unless T1 is complete. SQL Server does it by using LOCKS. 

Locking in SQL Server Explained

Mechanism to control and manage concurrency maintaining data integrity and validation. This allows multiple users to access  and use database concurrently.

Default Isolation level of SQL Server will lock the data for Transaction T1 and will only allow T2 to access the same data once T1 is complete. This way it will manage the concurrency and data integrity.

Now we have tables , diskpages in database . Tables have rows. A transaction may engage all database , or simply a row or may be just a diskpages in database. SQL Server decides upon seeing the transaction what to lock and what not. Because if a transaction is reading only a row of one table , it won’t make sense to block all the tables . Similarly , if one table is involved in the transaction other tables should be available to execute any operations. This way concurrency is achieved and at the same time data validation and integrity also remains intact.

Lock Granularity

Lock granularity tells us about the level of lock use. We have following levels

  1. Database Level Locks
  2. Table Level Locks
  3. Page Level Locks
  4. Row-Level Locks
  5. Field-Level Locks

Database Level Lock

  1. Good for batch processing but unsuitable for online multi-user DBMSs
  2. T1 and T2 can not access the same database concurrently even if they use different tables

Locking in SQL Server

See how two transaction behaves when this lock is in action? Whole database is locked until T1 is complete.

Table Level Locks

  1. T1 and T2 can access the same database concurrently as long as they use different tables
  2. Can cause bottlenecks when many transactions are trying to access the same table (even if the transactions want to access different parts of the table and would not interfere with each other)
  3. Not suitable for multi-user DBMSs

Locking in SQL Server

Here T1 requires row 5 to be updated , so the lock and after that T2 gets the lock .

Page-Level Lock

  1. An entire disk page is locked (a table can span several pages and each page can contain several rows of one or more tables)
  2. Most frequently used multi-user DBMS locking method

Locking in SQL Server

Row-Level Lock

  1. Concurrent transactions can access different rows of the same table even if the rows are located on the same page
  2. Improves data availability but with high overhead (each row has a lock that must be read and written to)

Locking in SQL Server

Field-Level Lock

  1. Allows concurrent transactions to access the same row as long as they require the use of different fields with that row
  2. Most flexible lock buy requires an extremely high level of overhead

These diagrams illustrates how concurrency and locks works together on the basis of isolation levels to make sure that data is always consistent . Each transaction exhibits ACID properties.

 

SQL Server Isolation Levels

SQL Server Isolation Levels helps in concurrency & transaction management by preventing concurrency problems . In an environment where we have multiple users modifying and accessing the same information , it is essential that they all can work at the same time without any delays. They all should be able to read correct data and modify data at will.

Also see : SQL Server Concurrency & Transaction Management

 

Commit & Rollback

In transaction’s ACID Properties we talked about , how a transaction has to change the state of the database and complete it’s task and leave the system in a consistent state . What would happen , if system fails while a transaction is still under process? How will transaction leave the system in consistent state ?

For example , One transaction has these two tasks :

  1. Delete Rows from one table
  2. Update rows from another table.

You can assign as many tasks as possible to a transaction in SQL Server. How to do that is Topic for another post another day. So we have two tasks in a transaction. Now during it’s execution if task 1 is completed and while it was updating the database system fails. It is unable to update the data , so what now?

COMMIT

Commit in SQL Server is when a change is made permanent into database.

ROLLBACK

Rollback undo all the changes made into database upto a certain point.

Coming back to our question now.  When a transaction fails to complete , it will rollback the changes made by task one of the transaction. This way it manages its ACID properties . It will leave the database in a reliable and consistent state by undoing task 1.

If everything goes well and both tasks are completed by transaction. These changes will then be committed in database and then they can’t be reverted by  rollback.

Transaction Dependencies Problems For Concurrency

A transaction should always and always follow ACID Properties  , but sometimes a transaction may end up exhibit other behaviours . These behaviours are the concurrency problems . SQL Server use Isolation levels to provide control over them. Let’s take a look at the few of them

  1. Dirty Reads
  2. Lost Updates
  3. Repeatable Reads
  4. Phantom Reads

A little explanation would be nice before we dive into Isolation levels and then we will see which isolation levels manage these problems.

Dirty Reads

When a transaction tries to read a data which is being modified by some other concurrent transaction which is not committed yet. There is a  risk, that this other transaction may never be committed,  leaving the original transaction with wrong data.

Consider an example of our University Management System , for example university management decides to check how many students belong to a Lahore city. They will run a SQL query to get the data. At the very same time a student A puts a request to change his city from Karachi to Lahore because he has moved .  Now think the result university management will get , should it contain that student A who has requested his city to change? Answer is Yes.

However , there is healthy chance Student A is not a part of the result university management is going to get from the SQL Query. This is a dirty read.

Lost Updates :

Occur when two transaction read the same data and both manipulate the data, changing its value and then both try to update the original data to the new value. The second process might overwrite the first update completely.

Repeatable reads:

Repeatable reads condition occur when a transaction tries to read a data multiple times and and between the two reads, another transaction modified that data. Therefore when the original transaction tries to read that data second time, it find different value for that data. In other words, the original transaction reads two different values for the same data.

Phantom reads:

When a transaction needs to execute a same query twice(or multiple times) and it gets different set of rows from the result of its first execution of the query. This happens if some another transaction adds or deletes rows between the two executions of the query and these change in rows becomes the the part of final result.

SQL Server Isolation Levels

SQL Server has six isolation levels .

  1. Read Uncommitted
  2. Read Committed
  3. Repeatable Read
  4. Serializable
  5. Snapshot

All these isolation levels helps in preventing one or more above problems . Remember Higher the level of isolation level you set the lesser will be the availability of data for other users. Trade off is the word you can use to describe this . To gain some you have to lose some. lovely!

Read Uncommitted

SQL Tips :-  When learning SQL Server or any other DBMS and database terms always focus on the term’s meaning. It will be self-explanatory.

This Isolation level allows other transactions to read data which is not committed yet. This means if Transaction A is writing the data and Transaction B comes in , it will be able to read the data changed by Transaction A.  Regardless of the fact that it may receive Rollback command and all the changes made by Transaction A will be reverted.

This can cause dirty reads, lost updates & phantom reads. This is Isolation level with least restrictions and more concurrency .

Read Committed Isolation Level

Read committed isolation level doesn’t allow Transaction B to read data changed by Transaction A until it is committed. This reduces concurrency but prevents dirty . Phantom reads and repeatable reads are still a possibility.

SQL Server Read Committed is the default isolation level in it’s settings. However , you can change that. This is second least isolation level.

Repeatable Read Isolation Level

This SQL Server Isolation level behaves like read committed and also it will prevent any other transaction to modify the data which another transaction is reading until it is completely free. This eliminates dirty reads and repeatable reads  . Phantom reads are still a possibility.

Serializable Isolation Level

SQL Server Serializable Isolation level is the strongest isolation level .  If Transaction A is changing data on table Students and Transaction B comes to read the data from Transaction A  . SQL Server will not allow transaction B to read the data until Transaction is not complete. Similarly , If Transaction A is reading the data from table Students . Transaction B come to change the data in table Student. SQL Server will make sure Transaction A complete itself before allowing Transaction A to do anything. It will completely remove phantom reads!

Snapshot Isolation Level

Highest level of isolation level and least concurrency .  A transaction recognise only data which is committed before the start of the transaction. Any modification of the data after the transaction has started ,  is not visible to any statements of the currently executing transaction.  Row versions are created for each transaction and stored in temp db which handles these transactions. This  isolation level eliminates dirty reads, lost updates, repeatable reads and Phantom reads problem.

 

In the next article on Locks , we will see how SQL Server is managing all these Isolation levels and how it keeps data clean and allow multiple transaction to work at the same time.

Share your insights and feedback in comments if there is more to add here. Do share this with your friends on Social Media and oh you can follow us on twitter , facebook as well.

See you with our next article!
Cheers!
Happy Learning!

 

SQL Server Concurrency & Transaction Management

SQL Server Concurrency & Transaction Management  is a process which allows multiple users to access data and perform multiple operations maintaining data integrity and smooth execution all at the same time. Microsoft SQL Server manages concurrency control by using locks to help protect data. The locks control how multiple users can access and change shared data at the same time without conflicting with each other.

This a concept which needs to be in developer’s mind all the time while writing the scripts to avoid any problems that may occur due to execution. Such as data loss , dirty reads and deadlocks which can eventually through out the whole system into darkness.

Hello subscribers and readers , I hope you guys are doing great and apologies for being absent for a long time. We all need a break from our routine from time-to-time. Don’t We?  Hopefully , you will see more posts in month of May as there are few new series on SQL Server that are in queue . In meantime , if you wan’t to ask any question or would like to have a post on a topic . Feel free to Contact us and we will publish it in our Questions Category.

SQL Server Concurrency

SQL Server Concurrency manages number of operations simultaneously  without letting them interfering with each other. Concurrency comes into play when a process that is changing data prevents other processes from reading that data . Similarly ,  when a process that is reading data prevents other processes from changing that data concurrency comes in to ensure safe execution of transactions .Concurrency is also affected when multiple processes are attempting to change the same data at the same time .

SQL Server Concurrency & Transaction Management Case Study

Think of the SQL Server Concurrency as a system which is responsible to make sure that operations such as read , write and update  always operate on correct data. An enterprise application manages multiple operations from hundreds of users all at the same time . Now if someone , adds a new record in the application. SQL Server tries to add the details in sql database . At the very same time it is possible that a manager is trying to access the reports. SQL Server then access data to fill out the reports.

Now you have two operations , one is changing the data and the other is reading the data. Can you see the problem here? The read operation and write operation both could be trying to access same data . Now can you guess which one will be able to access the information first? Read or Write?

SQL Server concurrency and Transaction management takes care of taking such decisions without slowing down the process. It makes sure that both these operations are performing correctly and that there is no data loss during writing and correct data is present in reports.

To understand the SQL Server Concurrency first we need to understand what is a Transaction Management .

Transaction Management

Transaction is a  logical unit of work that must be either entirely complete or not. Successful transaction changes the database from one consistent state to another

Using case study example , a user who wants to read the data when ask the system to do so. SQL Server treats the task as a single transaction. Similarly for the write operations our SQL Server will treat as another transaction. A transaction has ACID properties.

ACID Properties

Atomicity:  Requires that all operations (SQL requests) of a transaction be completed; if not, then the transaction is aborted. A transaction is treated as a single, indivisible, logical unit of work

Consistency: ensures that a transaction won’t allow the system to arrive at an incorrect logical state – the data must always be logically correct.  Consistency is honored even in the event of a system failure.

Isolation: Data used during execution of a transaction cannot be used by second transaction until first one is completed

Durability:Indicates permanence of database’s consistent state.When a transaction is complete, the database reaches a consistent state. That state can not be lost even if the system fails. If a system failure occurs while a transaction is in progress, the transaction is completely undone, leaving no partial effects on data.

These four principles govern how SQL Server or any other DBMS handles a transaction.

How Does SQL Server Concurrency Work?

Take a long breath and let everything settle in . We read about concurrency and why is it so important for SQL Server to manage multiple operations at a time and then we read about transactions and its properties.  Now we know that every task or single instruction (transaction) has to satisfy all four conditions in order to be successful . Otherwise transaction will be incomplete.

From our Case Study , imagine hundreds of write /update transactions coming in . At the same time , few users are viewing the reports to make important decisions. Different users accessing same data for different purposes. SQL Server Concurrency & Transaction management will make sure that all the tasks are properly aligned and that each transaction is completed before the other transaction.

Some transactions may change the state of the data and some may not . So it is very crucial for the SQL Server to maintain its ACID property even if the system fails. SQL Server Concurrency & Transaction management is achieved by the use of locks .  To understand the locks we need to understand about concurrency and transactions . DBAs can choose how these locks work by using isolation levels. The two topics that we will cover in Part 2 of this article.

I hope this proves useful. You can follow us on twitter , facebook or subscribe us to get the second part of this article and many more such articles in your mailbox. Feel free to comment below , if anything requires further clarification or if i have make a mistake. I still feel a bit rusty after my vacations.

See you with our next article!
Cheers!
Happy Learning!

SQL Subquery | SQL Beginner Guide

SQL Subquery is another tool SQL Server has which can be used to fetch data from other tables to be used in main query according to the needs.  Inner query and nested query are other names of SQL Subquery.

SQL Subquery can be used with the SELECT, INSERT, UPDATE, and DELETE statements along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.

The idea is that a subquery returns data for one column which can be further utilised in outer query.

Too much to read , let’s just start with examples and we will clear the points as we go. Sounds fair?

Prepare the data

Examples and queries that we will see here will run on our University Management System database . If you want to use these SQL queries for practice , create the same database , you need to just copy paste the script from here .

SQL Subquery Examples

Take a look at the tables and data we have and then see the query :

We want to see the courses who belong to a department with budget more than 200000.00!


SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN
(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget] > 200000.00)

Result is

 

This query will return all departments with budget more than 200000.00!To understand this clearly ,  let’s discuss these terms and see how they exist in an SQL query

  1. Inner query
  2. Outer Query

From the above SQL queries you can find these three

SQL Subquery/Inner Query/Nested Query


(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget] > 200000.00)

 

Outer Query


SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN

 

If you read carefully the complete query you will realise that we have basically filtered the data from other table and used the result rows in WHERE clause. Here is how SQL Server handles this

Our SQL Subquery produced this

These two rows then were matched one-by-one with outer query

One could argue that the same can be achieved by SQL Joins , you are right . SQL Joins could give us much more flexibility than SQL Subquery but it depends highly on the situation as well as requirements .

See : SQL Joins Explained

When to Use SQL Subquery versus a Standard Join?

There cannot be any universal rule to answer this question . We can adopt some best sql practices to make sure that we don’t end up worrying about tuning the same basic SQL queries a year later. A stage where you have to spend a good amount of time to go through the whole thing.

More often than not main concern of DBAs and data developers is performance of SQL query. So use of SQL Subquery or SQL Join can be decided if we keep performance as our deciding factor.

Independent Subquery

If a subquery can be executed without outer query entirely it’s own then it’s independent subquery. Such queries are evaluated once and the result is then used by outer query.

Dependent Subquery

If a subquery cannot be executed without outer query and you get a syntax error because it refers to some tables outside of the subquery then its a dependent subquery. Such queries can be very slow . Reason is that they have to get the values from outer query as well. This with example is discussed at the bottom in correlated query exmaple . So if a outer table has 10,000 rows the inner query will be executed 10,000 times.

To answer our question many DBAs avoid dependant subquery and change it to JOIN .

Hopefully , after this you can decide which tool to pick and what to do with such a scenario.

Types of SQL Subquery

Following are few types of SQL subquery with exmaples

Single row subquery : Returns zero or one row.


SELECT [DepartmentID]
,[Name]
,[Budget]
,[StartDate]
,[Administrator]
FROM [School].[dbo].[Department]
WHERE [Administrator] = (SELECT PERSONID FROM PERSON WHERE PERSONID=2)

This produces only one row.

Multiple row subquery : Returns one or more rows.


SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN
(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget] > 200000.00)

Correlated subquery

Reference one or more columns in the outer SQL statement. The subquery is known as a correlated subquery because it is related to the outer SQL statement.

From our previous articles How Grouping Works in SQL .  We use the example where we concluded the total marks of each student. It looked like this SQL Subquery

Can you find me the students who has above average grades? Use this query


SELECT FirstName, TotalGrades
FROM (

SELECT P.FirstName,SUM(GRADE) AS TotalGrades
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID

GROUP BY P.FirstName) A
WHERE A.TotalGrades > (
SELECT AVG(GRADE) AS TotalGrades
FROM [School].[dbo].[StudentGrade] SG

)

Run this query and it will return students with above average marks.Also this is dependant query

 

SQL Subquery Correlated example

How to Use SQL Subquery with SELECT Statement

SELECT [CourseID] ,[Title] ,[Credits] ,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN
(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget]> 200000.00)

 

How to Use SQL Subquery with the INSERT Statement


INSERT INTO [dbo].[Department]
([DepartmentID]
,[Name]
,[Budget]
,[StartDate]
,[Administrator])

VALUES  (5,'Computer', 800000, '2007-09-01 00:00:00.000',(SELECT PERSONID FROM PERSON WHERE FIRSTNAME='Awais'))

 

Use SQL Subquery with the UPDATE Statement


UPDATE [dbo].[Department]
SET [Name] = 'Computer Science'
WHERE [Administrator] =(SELECT PERSONID FROM PERSON WHERE LastName='Harui')

 

Use SQL Subquery with the DELETE Statement


DELETE [dbo].[Department]

WHERE [Administrator] =(SELECT PERSONID FROM PERSON WHERE LastName='Harui')

 

SQL Subquery Rules :

  • SQL Subquery must be enclosed within parentheses.
  • Can have only one column in the SELECT clause, unless multiple columns are in the main query for the subquery to compare its selected columns.
  • An ORDER BY command cannot be used in a subquery, although the main query can use an ORDER BY. The GROUP BY command can be used to perform the same function as the ORDER BY in a subquery.
  • SQL Subquery that return more than one row can only be used with multiple value operators such as the IN operator.
  • A subquery cannot be immediately enclosed in a set function.
  • The BETWEEN operator cannot be used with a subquery. However, the BETWEEN operator can be used within the subquery.

This is pretty much about subqueries and if you find it useful . Share it with your firends and colleagues and let us know about how effective it is.

We are moving towards advanced topics now and also taking up the guests posts as well. So feel free to contact and have a good time learning.

Happy Learning!
Cheers

How Grouping Works in SQL

One of our subscribers reached us and asked How Grouping works in SQL ?  So today , this is what we are going to do . We will discuss in detail about grouping in SQL and how it actually works behind the scenes. Also , this is our first post in Questions category and we will be doing more as there are bundle of requests in the queue. So if you want to contribute and like us to write something for you , please do not hesitate.

How Grouping works in SQL?

Grouping summarises the whole data set into smaller chunks of data where each row represents a group . Group is when we combine a group of things, elements under some rule. While working with SQL reports , a lot of times we will come across scenarios where we want to summarise the data. For this purpose we use MS SQL Group By Clause

Also see : MS SQL Group By Clause

because this clause summarise the data into groups . Time to dig in now.

Prepare the data

Examples and queries that we will see here will run on our University Management System database . If you want to use these SQL queries for practice , create the same database , you need to just copy paste the script from here .

Case Study : Grade Reports from University Management System

We are asked to get the top 5 students with top grades . One best query writing practice is to try and visualise the end result and then build the query around it. To do this , you need first to see the underlying data. If you run the below SQL query , you will have a good look at the data we have.

SELECT P.FirstName,C.Title,Grade
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID

How Grouping Works in SQL

 

Remember our objective. We need first five students with top marks out of these forty rows. Give a brief and a long look at the table. You are seeing three columns here. Student names , course names and respective grades. Objective is to find the top 5 students .

Let’s break up the table and visualise what is we want. We want two things

  1. Student Names
  2. Student Grades

Now it is clear that we want top achievers which are actually the students who have obtained highest marks. How can you find that? By adding marks in all courses of each student.

Aggregate functions in SQL will help us to add the grades . See the Grades columns , its a column of numbers.

How Grouping Works in SQL examples

Can you guess of which aggregate functions in SQL we will use? SQL Count () or SQL Sum() ?

See : Difference Between SUM() & Count ()

It has to be SQL Group By SUM  which will do the job for us. Now , we have already determined that when Grouping clause is in effect each row represents
a group . So to achieve the objective remove Course column from the query

Read : MS SQL Group By Clause 


SELECT P.FirstName,GradeFROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID

How Grouping Works in SQL example 2

See the table again. Let’s find total marks of each student. Run this query


SELECT P.FirstName,SUM(GRADE) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID


It will give us out this

How Grouping Works in SQL example 2

We have applied the aggregate function but haven’t told our SQL Server on what basis it should be grouped. We want to group the grades against each student that is why we are writing FirstName in query . To apply SQL aggregate function of SUM on grades column you need to put First Name in Group By Clause. When you do that it will create groups for each student like this


SELECT P.FirstName,SUM(GRADE) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID
GROUP BY P.FirstName

How Grouping Works in SQL example 4

All students are appearing only once because all same elements are grouped together . Each student was taking up more than one course so SQL Server first combines the elements into group. In our case , unique student names.  Now it will apply SQL Sum for each group. Combine both steps and SQL Server handles bot these steps like this

How Grouping Works in SQL example 5

This is the final output you will see

How Grouping Works in SQL example 6

See how SQL Server reduces the table into groups and then by the use of aggregate functions in SQL it gives us results for those groups. We will use TOP clause with Order By Clause to find the results

RESULT QUERY


SELECT TOP (5) P.FirstName,SUM(GRADE) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID
GROUP BY P.FirstName
ORDER BY SUM(GRADE) DESC

How Grouping Works in SQL example 7

If you use TOP clause always use Order By clause with it to get correct data. Because , results can be random , so if you apply Order By then the order will be consistent. Topic for another day!

ALL YOU NEED TO KNOW ABOUT GROUPING

  1. Grouping is applied on all columns as combination in the same order they appear in query.
  2. Each row represents a unique row.
  3. Filtering on grouping is done by using HAVING Clause
  4. Use ORDER BY clause to arrange the data set in a order.

if you can find more points about grouping , do comment and we will add this in our list.

I hope now you understand about How Grouping Works in SQL because this has been a brief articleNow if you like this article , do share with your friends and colleagues and recommend us . Oh! and if the sponsored content is bothering you , apologies for that too because it’s the only source of revenue for this blog.

See you with next post very soon!

 

MS SQL Query Execution Order Examples | SQL Beginner Guide

MS SQL Query Execution Order or SQL order of operations determines the sequence in which SQL clauses are executed by database engine . This helps in writing the error free and optimised queries .This SQL Query Processing is applied from a basic query to a thousands lines of queries , our database engine uses this execution order . See below the MS SQL Query Execution Order

  1. FROM
  2. ON
  3. OUTER
  4. WHERE
  5. GROUP BY
  6. HAVING
  7. SELECT
  8. DISTINCT
  9. ORDER BY
  10. TOP

Also see : SQL SELECT Query & Statements

Recommended : MS SQL Group By Clause

Also see : MS SQL Having Clause

One of the biggest differences & advantages of SQL Server is the way it processes SQL Code. Where other programming languages have sequential order i.e. from top to bottom . MS SQL Query Execution Order is unique and systematic phases. Each phase is responsible for the successful execution of next phase . Each phase creates a virtual tables and feed it to the next phase and at the end all the pieces of puzzle are put together to display a result.

Let’s see some examples & we will apply the MS SQL Query Execution Order on the examples to see how it all goes behind the scene .

MS SQL Query Execution Order Examples


SELECT CourseID ,[StudentID] 

FROM [School].[dbo].[StudentGrade] 

WHERE [StudentID] in (7,8,9) 

First of all “FROM [School].[dbo].[StudentGrade]” clause will apply to the query

FROM Clause BI Talks BI

This table has 40 rows , we are only showing 10 here but SQL Server will obviously takes the complete table. So , SQL Server now has 40 rows in it’s virtual table. Next phase applicable is WHERE clause.

SQL Server will now execute this line WHERE [StudentID] in (7,8,9) . Result will be as follows

WHERE Clause BI Talks BI

Out of those 40 rows Where clause have filtered students with StudentID 7,8,9. At the end SELECT will select the columns which will be displayed.

So , you see how SQL Server handles the query. Can you write the order for the following query?

 
SELECT SUM([Grade]) As TotalGrades , StudentID 
FROM [StudentGrade] 
GROUP BY StudentID 
HAVING SUM(Grade); 

3 Things To Know About MS SQL Query Execution Order

  1. Aliases for the column cannot be used in WHERE clause. Because WHERE is executed first & SQL Server has no way of knowing about your alias.
  2. For INNER JOIN ON is like a WHERE clause. So if you put your filter criteria in ON clause it will behave as WHERE. For details read this from Pinal Dave. Huge fan of his blog for years.
  3. Virtual tables formed in each phases are logical only.

 

This is pretty much it about MS SQL Query Execution , if you like this article share with your friends 0n Social Media & recommend them to read us.

We are moving to advance tutorials on SQL in future, so if you have some certain topics in mind . I’ll advise you to comment them below and we will surely add them in the series as well.

See you next time

May the Force be with you!

 

MS SQL Having Clause | SQL Beginner Guide

Filtering the Groups in MS SQL ( Transact SQL-TSQL)  requires MS SQL Having clause . We discussed in detail about sets & groups in our post on Group by clause.   In order to be able to practice the examples that we will discuss , i recommend you to use our sample database. It’s a script you just have to copy paste in your MS SQL Server Management Studio & everything will be ready.

Also see : How to Install MS SQL Server Management Studio 

MS SQL Having Clause

MS SQL Having clause is used to apply a filter on groups that are formed as a result of group by clause.

SYNTAX


SELECT aggregatefunction(columname), columname2

FROM tablename

GROUP BY columnname2

HAVING condition

These four are mandatory steps to follow for MS SQL Having clause to work. You can use ORDER BY & WHERE as well if needed be.

Before we jump to the examples , I want to clear one more confusion here which is one of the biggest mistake people make while working with MS SQL Group by and MS SQL Having group clauses.

MS SQL Having VS Where

While working with groups you may often come across scenarios where you have to decide between where & having clause. Understanding what they do will help you to write a sound query .

Where is used when we want to apply filtering on a data set or simply when we want to use them on a simple data table. For example , I want to see how many courses are studied by any particular student or all students. Let those Students have Student ID (7,8,9) .See below


SELECT CourseID
,[StudentID]

FROM [School].[dbo].[StudentGrade]
WHERE [StudentID] in (7,8,9)

This is the data without grouping . Data looks like this now

Where example

Now time to apply grouping. My requirement was that I needed to see how many courses are studied by any particular student or all students.  You can see that StudentID 7 & 8 has two rows while 9 has only one.

Here you can count them but imagine you have a data set of thousands of rows. Counting them won’t be an option. So Let’s apply aggregate function to make our life easier.


SELECT COUNT([CourseID]) CourseID
,[StudentID]

FROM [School].[dbo].[StudentGrade]
WHERE [StudentID] in (7,8,9)
GROUP BY [StudentID]

This gives us

Grouping SQL

Much easier to count now right?

So this completes my requirement of how many courses are studied by any particular student or all students! Now comes the interesting part ! Are you ready?

My next requirement is to find the students who are studying 2 courses . Let’s see again the original data

Where example

Can you use a where clause? try & you will find the error saying An aggregate may not appear in the WHERE clause….. because you cannot use aggregated functions with WHERE & you certainly cannot apply a WHERE clause on GROUPED data.

Write this query now


SELECT COUNT([CourseID]) CourseCount
,[StudentID]
FROM [School].[dbo].[StudentGrade]
WHERE [StudentID] in (7,8,9)
GROUP BY [StudentID]
HAVING COUNT([CourseID])=2

 

Her’s how it works . See CourseCount was a column which was grouped so applying a Where clause on it wasn’t possible so we used HAVING Clause.

Where is working on StudentID because that is not a grouped column .

So the difference between where & having is simple . Where is applied use for simple set & Having is used with grouping.

MS SQL Group By & Having Clause In SQL

COUNT()

SELECT COUNT (Title) AS TotalCourses , CourseID
FROM [Course]
GROUP BY CourseID
HAVING COUNT(CourseID)=1 ;

SQL GROUP BY SUM()

SELECT SUM([Grade]) As TotalGrades , StudentID
FROM [StudentGrade]
GROUP BY StudentID
HAVING SUM(Grade);

GROUP BY COUNT() MULTIPLE COLUMNS

SELECT C.Title,COUNT([StudentID]) AS TotalStudentsinCourse ,DepartmentID
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
GROUP BY C.Title ,DepartmentID
HAVING COUNT([StudentID]);

So in last three of our articles we have learned what is aggregated data , about SQL Server functions on aggregation and how to apply filtering groups in SQL.

Also see

  1. SQL Server Aggregate Functions 
  2. Groups in SQL 

I have received a request from one of our readers about how grouping works in actual. Very interesting article is on its way! See you next time with our first Question & Answers post.

MS SQL Group By Clause | SQL Beginner Guide

MS SQL Group By Clause in SQL is used with aggregate functions to group data based on columns. Before , we jump to details of the Group By clause let’s try to understand about grouping.

SQL is based on set-theory and whenever you’re composing a query you treat a table as a whole & not row-by-row. Similarly , when we use aggregate functions we have to group the sets . This way you will have aggregated view of the set.

What is the Difference Between a Group & Set?

What is a SET in SQL

A set is collection of elements . In our sample database , we have a set of courses with the students who have taken them formed up as a table Studentgrade. See below

What do you see? Collection of different courses right?

But as a whole they’re all courses! Wonderful you’re learning already!

What is a Group in SQL

Group is also a set but it is enclosed under certain conditions. What those conditions might be is entirely up to us. We humans , learn by examples.

Let’s apply a condition on our StudentGrade table to see how many students have taken up a particular course. This we’ll form a Group!

SQL Group By

 

This is just a snapshot of 40 rows of data. Run this code


SELECT C.Title,COUNT([StudentID]) AS TotalStudentsinCourse

FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID

GROUP BY C.Title

This will return SQL Grouping Example

This is what we did here. We have applied a condition on a set of courses & student with their grades & formed a group of Number of students who have taken up a course.

Another way to explain it is this – > Group is based on Course Name & they are grouped together on a condition which is total number of students in a course

So , the difference between two is very simple now .

Set is collection of elements & group is a set based on any condition.

MS SQL Group By Clause

Aggregate functions (SUM () , COUNT(),AVG(),MIN(),MAX() ) requires data to be grouped by Group By Clause based on any one or more than one columns.

MS SQL Group By Clause appears at the end of the SELECT query .

SYNTAX

SELECT aggregatefunctions(columnname1) , columnname2
FROM tablename
GROUP BY columns2

Let’s use the example we have

SELECT C.Title,COUNT([StudentID]) AS TotalStudentsinCourse
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
GROUP BY C.Title

This is the result
SQL Grouping Example

5 Things to Know About MS SQL Group By Clause

  1. In aggregate functions if we use only one column on which aggregation is applied , then it’s not necessary to have a GROUP BY clause . But if you want to introduce another column then it becomes necessary
  2. When you apply Group by clause on a column , all the values in columns will appear only once. Since , they are now in groups. See above example of Courses
  3. All the columns that you will introduce in SELECT query , also will appear in GROUP BY clause .
  4. Grouping will be for all the columns in the group by clause .
  5. Group data should be filter by HAVING clause

We will write about HAVING Clause in next article.

Time to apply some aggregate functions and apply all the theory we have learned today!

GROUP BY EXAMPLES

SQL GROUP BY

COUNT()

SELECT COUNT (Title) AS TotalCourses , CourseID
FROM [Course]
GROUP BY CourseID;

 

SQL GROUP BY SUM()

SELECT SUM([Grade]) As TotalGrades , StudentID
FROM [StudentGrade]
GROUP BY StudentID

GROUP BY COUNT() MULTIPLE COLUMNS

SELECT C.Title,COUNT([StudentID]) AS TotalStudentsinCourse ,DepartmentID
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
GROUP BY C.Title ,DepartmentID

Aggregate functions & Group by functions are responsible when we want to summarise the data for our reports. Both of these concepts are comprehensively use by the ETL developers & query writers to do the reporting. So if you’re looking to be in any of these areas you have to have a great understanding of SETs , Groups & then , understanding of aggregate functions with Group by clause should be a piece of cake.
I hope you have enjoyed this article. Do share on your Social media & subscribe us for the latest articles.

See you with HAVING Clause article next time

Cheers!

SQL Server Aggregate Functions | SQL Beginner Guide

SQL Server offers several built-in functions. SQL Aggregate functions are among the ones most widely used in reporting & data joining scenarios. Following are few SQL Aggregate functions we will discuss today

  1. COUNT ()                  returns the total number of values in a given column
  2. SUM ()                       returns the sum of the numeric values in a given column
  3. AVG  ()                       returns the average value of a given column
  4. MIN()                         returns the smallest value in a given column
  5. MAX ()                       returns the largest value in a given column

What is Aggregate

Anytime when I start learning about anything , my best practice is to learn meaning of the word used. So , let’s start with aggregate  and it’s meaning.

If you search google it says

A whole formed by combining several separate elements

This couldn’t be more precise than this. In SQL Aggregate is when we combine a column to give out the single value result. For instance , you may be interested in average grades of a student. You may want to see the total number of courses taken up by a Student. Maximum marks obtained , minimum marks obtained & many such scenarios can come up where you want to see aggregated form rather than a long 100 rows of data. For such purposes SQL Server has Aggregate Functions.

SQL Server Aggregate Functions

If you’re looking to work in database , data warehouse or Business Intelligence you have to have a very strong grip on this topic. Reason is simple , databases are built so that they can be used later to retrieve data for reports. People who make decisions only see the aggregated data. Then if necessary , they go for detailed data.

I will use our sample database for the demonstration of these functions. I recommend you to copy paste the code in your SQL Server Management Studio and just execute the code. You will have a sample database to practice it on.

The SQL COUNT(), AVG() and SUM() Functions

SQL COUNT( ) Function

It is use for counting the number of rows for any columns. It only counts them regardless of what type of data they have. Let’s use Course table for this function. In case you don’t have the sample database. Look at the table below.

sql table

 

SQL COUNT Syntax

SELECT COUNT    (columnname)
FROM  tablename

COUNT() Examples

Just put column name and you will have the count of rows for that columns. Using our sample database let’s see total courses we have .


SELECT COUNT (Title) AS TotalCourses

FROM [Course];

Count function in sql

Our column Title has 10 rows according to our query which is correct.

 

What if you want to see how many courses have 4 credit hours . Simply introduce Where clause

SELECT COUNT (Title) AS CoursesWithCreditH
FROM [Course]
WHERE Credits =4;
Result :
Count where

You can find total number of rows in a table by using this function as well.


SELECT COUNT (*) As NumberOfRows
FROM [Course]

Result will be 10.

SQL Server Average Function AVG()

This function returns the average value of a numeric column. Simply returns the average

SYNTAX

SELECT AVG (columnname)

FROM tablename

AVG() Examples

For this we will use StudentGrade table. See it’s data belowsql server average function

If you want to see the average grades of students for all subjects. Run this query


SELECT AVG([Grade]) As AverageGrades
FROM [StudentGrade]

Can you introduce a WHERE clause ? Comment below if you can!

SUM() Function In SQL

This SQL aggregate function sums up all the values in the column & returns a single value

SYNTAX

SELECT SUM (columnname)

FROM tablename

SUM() Example

Write a query that returns total grades obtained by the whole class.


SELECT SUM([Grade]) As TotalGrades
FROM [StudentGrade]

sql sum

Introduce a StudentID in where clause so that we can see the total marks by a student


SELECT SUM([Grade]) As TotalGrades
FROM [StudentGrade]
WHERE StudentID=2

sql sum count

Difference Between SUM() & Count ()

Database developers often mix Sum with Count. These two functions have very clear names & their purpose is also exactly as written. The mistake is committed in it’s execution. Sometimes , we want the total number of rows in SQL table. You should use count function for that & when you want to sum the values in it then Use Sum Function.

See below


SELECT COUNT(*) As TotalRows
FROM [StudentGrade]

This will give you total number of rows which is 40, reason is simple because now you are counting the number of rows in table.

Now execute this query


SELECT COUNT(Grade) As TotalRows
FROM [StudentGrade]

Result will be 36 rows. You know why ? Because our Column Grade has 4 NULLS in it.

Now consider this . What if you want to see the total marks obtained by the StudentID 2. Will you use Count or SUM?

Best way is to run queries on it.  When we run this query


SELECT COUNT(Grade) As TotalRows
FROM [StudentGrade]
WHERE StudentID=2

It returns

Does this answer my question. I wanted to see the total marks not the total rows.

Let’s Run SUM query


SELECT SUM(Grade) As TotalGrades
FROM [StudentGrade]
WHERE StudentID=2

It Returns

sql sum vs count

This is actual answer.

So keep in mind the difference between Total Number of Rows is Count & Total of a column in SUM

 

MIN () & MAX () FUNCTION

Min() Stands for minimum & Max () Stands for Maximum. They are used to find minimum/maximum values in a column

SYNTAX MIN()

SELECT MIN(columnname)

FROM tablename

SYNTAX MAX()

SELECT MAX(columnname)

FROM tablename

MIN() Example


SELECT MIN(Grade) As TotalGrades
FROM [StudentGrade]

This returns minimum marks obtained by a student

MAX() Example


SELECT MAX(Grade) As TotalGrades
FROM [StudentGrade]

SQL max()
This returns maximum marks obtained.

SQL Aggregate functions are super useful in summary reports , in ETL , in data auditing & plenty more. You may have noticed that in all Examples I all only used one column. That is not co-incidence, If you introduce a second column in SELECT query you have to use GROUP BY clause & HAVING Clause for slicing the data. Group By & Having is our next article. We will learn how to use SQL aggregate functions to find useful data & a report.

Hope you have enjoyed this article & please do comments if you think there is a better explanation needed. See you next time

Happy Learning!
Cheers!