SQL Server Isolation Levels

SQL Server Isolation Levels helps in concurrency & transaction management by preventing concurrency problems . In an environment where we have multiple users modifying and accessing the same information , it is essential that they all can work at the same time without any delays. They all should be able to read correct data and modify data at will.

Also see : SQL Server Concurrency & Transaction Management

 

Commit & Rollback

In transaction’s ACID Properties we talked about , how a transaction has to change the state of the database and complete it’s task and leave the system in a consistent state . What would happen , if system fails while a transaction is still under process? How will transaction leave the system in consistent state ?

For example , One transaction has these two tasks :

  1. Delete Rows from one table
  2. Update rows from another table.

You can assign as many tasks as possible to a transaction in SQL Server. How to do that is Topic for another post another day. So we have two tasks in a transaction. Now during it’s execution if task 1 is completed and while it was updating the database system fails. It is unable to update the data , so what now?

COMMIT

Commit in SQL Server is when a change is made permanent into database.

ROLLBACK

Rollback undo all the changes made into database upto a certain point.

Coming back to our question now.  When a transaction fails to complete , it will rollback the changes made by task one of the transaction. This way it manages its ACID properties . It will leave the database in a reliable and consistent state by undoing task 1.

If everything goes well and both tasks are completed by transaction. These changes will then be committed in database and then they can’t be reverted by  rollback.

Transaction Dependencies Problems For Concurrency

A transaction should always and always follow ACID Properties  , but sometimes a transaction may end up exhibit other behaviours . These behaviours are the concurrency problems . SQL Server use Isolation levels to provide control over them. Let’s take a look at the few of them

  1. Dirty Reads
  2. Lost Updates
  3. Repeatable Reads
  4. Phantom Reads

A little explanation would be nice before we dive into Isolation levels and then we will see which isolation levels manage these problems.

Dirty Reads

When a transaction tries to read a data which is being modified by some other concurrent transaction which is not committed yet. There is a  risk, that this other transaction may never be committed,  leaving the original transaction with wrong data.

Consider an example of our University Management System , for example university management decides to check how many students belong to a Lahore city. They will run a SQL query to get the data. At the very same time a student A puts a request to change his city from Karachi to Lahore because he has moved .  Now think the result university management will get , should it contain that student A who has requested his city to change? Answer is Yes.

However , there is healthy chance Student A is not a part of the result university management is going to get from the SQL Query. This is a dirty read.

Lost Updates :

Occur when two transaction read the same data and both manipulate the data, changing its value and then both try to update the original data to the new value. The second process might overwrite the first update completely.

Repeatable reads:

Repeatable reads condition occur when a transaction tries to read a data multiple times and and between the two reads, another transaction modified that data. Therefore when the original transaction tries to read that data second time, it find different value for that data. In other words, the original transaction reads two different values for the same data.

Phantom reads:

When a transaction needs to execute a same query twice(or multiple times) and it gets different set of rows from the result of its first execution of the query. This happens if some another transaction adds or deletes rows between the two executions of the query and these change in rows becomes the the part of final result.

SQL Server Isolation Levels

SQL Server has six isolation levels .

  1. Read Uncommitted
  2. Read Committed
  3. Repeatable Read
  4. Serializable
  5. Snapshot

All these isolation levels helps in preventing one or more above problems . Remember Higher the level of isolation level you set the lesser will be the availability of data for other users. Trade off is the word you can use to describe this . To gain some you have to lose some. lovely!

Read Uncommitted

SQL Tips :-  When learning SQL Server or any other DBMS and database terms always focus on the term’s meaning. It will be self-explanatory.

This Isolation level allows other transactions to read data which is not committed yet. This means if Transaction A is writing the data and Transaction B comes in , it will be able to read the data changed by Transaction A.  Regardless of the fact that it may receive Rollback command and all the changes made by Transaction A will be reverted.

This can cause dirty reads, lost updates & phantom reads. This is Isolation level with least restrictions and more concurrency .

Read Committed Isolation Level

Read committed isolation level doesn’t allow Transaction B to read data changed by Transaction A until it is committed. This reduces concurrency but prevents dirty . Phantom reads and repeatable reads are still a possibility.

SQL Server Read Committed is the default isolation level in it’s settings. However , you can change that. This is second least isolation level.

Repeatable Read Isolation Level

This SQL Server Isolation level behaves like read committed and also it will prevent any other transaction to modify the data which another transaction is reading until it is completely free. This eliminates dirty reads and repeatable reads  . Phantom reads are still a possibility.

Serializable Isolation Level

SQL Server Serializable Isolation level is the strongest isolation level .  If Transaction A is changing data on table Students and Transaction B comes to read the data from Transaction A  . SQL Server will not allow transaction B to read the data until Transaction is not complete. Similarly , If Transaction A is reading the data from table Students . Transaction B come to change the data in table Student. SQL Server will make sure Transaction A complete itself before allowing Transaction A to do anything. It will completely remove phantom reads!

Snapshot Isolation Level

Highest level of isolation level and least concurrency .  A transaction recognise only data which is committed before the start of the transaction. Any modification of the data after the transaction has started ,  is not visible to any statements of the currently executing transaction.  Row versions are created for each transaction and stored in temp db which handles these transactions. This  isolation level eliminates dirty reads, lost updates, repeatable reads and Phantom reads problem.

 

In the next article on Locks , we will see how SQL Server is managing all these Isolation levels and how it keeps data clean and allow multiple transaction to work at the same time.

Share your insights and feedback in comments if there is more to add here. Do share this with your friends on Social Media and oh you can follow us on twitter , facebook as well.

See you with our next article!
Cheers!
Happy Learning!

 

Advertisements

SQL Server Concurrency & Transaction Management

SQL Server Concurrency & Transaction Management  is a process which allows multiple users to access data and perform multiple operations maintaining data integrity and smooth execution all at the same time. Microsoft SQL Server manages concurrency control by using locks to help protect data. The locks control how multiple users can access and change shared data at the same time without conflicting with each other.

This a concept which needs to be in developer’s mind all the time while writing the scripts to avoid any problems that may occur due to execution. Such as data loss , dirty reads and deadlocks which can eventually through out the whole system into darkness.

Hello subscribers and readers , I hope you guys are doing great and apologies for being absent for a long time. We all need a break from our routine from time-to-time. Don’t We?  Hopefully , you will see more posts in month of May as there are few new series on SQL Server that are in queue . In meantime , if you wan’t to ask any question or would like to have a post on a topic . Feel free to Contact us and we will publish it in our Questions Category.

SQL Server Concurrency

SQL Server Concurrency manages number of operations simultaneously  without letting them interfering with each other. Concurrency comes into play when a process that is changing data prevents other processes from reading that data . Similarly ,  when a process that is reading data prevents other processes from changing that data concurrency comes in to ensure safe execution of transactions .Concurrency is also affected when multiple processes are attempting to change the same data at the same time .

SQL Server Concurrency & Transaction Management Case Study

Think of the SQL Server Concurrency as a system which is responsible to make sure that operations such as read , write and update  always operate on correct data. An enterprise application manages multiple operations from hundreds of users all at the same time . Now if someone , adds a new record in the application. SQL Server tries to add the details in sql database . At the very same time it is possible that a manager is trying to access the reports. SQL Server then access data to fill out the reports.

Now you have two operations , one is changing the data and the other is reading the data. Can you see the problem here? The read operation and write operation both could be trying to access same data . Now can you guess which one will be able to access the information first? Read or Write?

SQL Server concurrency and Transaction management takes care of taking such decisions without slowing down the process. It makes sure that both these operations are performing correctly and that there is no data loss during writing and correct data is present in reports.

To understand the SQL Server Concurrency first we need to understand what is a Transaction Management .

Transaction Management

Transaction is a  logical unit of work that must be either entirely complete or not. Successful transaction changes the database from one consistent state to another

Using case study example , a user who wants to read the data when ask the system to do so. SQL Server treats the task as a single transaction. Similarly for the write operations our SQL Server will treat as another transaction. A transaction has ACID properties.

ACID Properties

Atomicity:  Requires that all operations (SQL requests) of a transaction be completed; if not, then the transaction is aborted. A transaction is treated as a single, indivisible, logical unit of work

Consistency: ensures that a transaction won’t allow the system to arrive at an incorrect logical state – the data must always be logically correct.  Consistency is honored even in the event of a system failure.

Isolation: Data used during execution of a transaction cannot be used by second transaction until first one is completed

Durability:Indicates permanence of database’s consistent state.When a transaction is complete, the database reaches a consistent state. That state can not be lost even if the system fails. If a system failure occurs while a transaction is in progress, the transaction is completely undone, leaving no partial effects on data.

These four principles govern how SQL Server or any other DBMS handles a transaction.

How Does SQL Server Concurrency Work?

Take a long breath and let everything settle in . We read about concurrency and why is it so important for SQL Server to manage multiple operations at a time and then we read about transactions and its properties.  Now we know that every task or single instruction (transaction) has to satisfy all four conditions in order to be successful . Otherwise transaction will be incomplete.

From our Case Study , imagine hundreds of write /update transactions coming in . At the same time , few users are viewing the reports to make important decisions. Different users accessing same data for different purposes. SQL Server Concurrency & Transaction management will make sure that all the tasks are properly aligned and that each transaction is completed before the other transaction.

Some transactions may change the state of the data and some may not . So it is very crucial for the SQL Server to maintain its ACID property even if the system fails. SQL Server Concurrency & Transaction management is achieved by the use of locks .  To understand the locks we need to understand about concurrency and transactions . DBAs can choose how these locks work by using isolation levels. The two topics that we will cover in Part 2 of this article.

I hope this proves useful. You can follow us on twitter , facebook or subscribe us to get the second part of this article and many more such articles in your mailbox. Feel free to comment below , if anything requires further clarification or if i have make a mistake. I still feel a bit rusty after my vacations.

See you with our next article!
Cheers!
Happy Learning!

SQL Subquery | SQL Beginner Guide

SQL Subquery is another tool SQL Server has which can be used to fetch data from other tables to be used in main query according to the needs.  Inner query and nested query are other names of SQL Subquery.

SQL Subquery can be used with the SELECT, INSERT, UPDATE, and DELETE statements along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.

The idea is that a subquery returns data for one column which can be further utilised in outer query.

Too much to read , let’s just start with examples and we will clear the points as we go. Sounds fair?

Prepare the data

Examples and queries that we will see here will run on our University Management System database . If you want to use these SQL queries for practice , create the same database , you need to just copy paste the script from here .

SQL Subquery Examples

Take a look at the tables and data we have and then see the query :

We want to see the courses who belong to a department with budget more than 200000.00!


SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN
(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget] > 200000.00)

Result is

 

This query will return all departments with budget more than 200000.00!To understand this clearly ,  let’s discuss these terms and see how they exist in an SQL query

  1. Inner query
  2. Outer Query

From the above SQL queries you can find these three

SQL Subquery/Inner Query/Nested Query


(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget] > 200000.00)

 

Outer Query


SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN

 

If you read carefully the complete query you will realise that we have basically filtered the data from other table and used the result rows in WHERE clause. Here is how SQL Server handles this

Our SQL Subquery produced this

These two rows then were matched one-by-one with outer query

One could argue that the same can be achieved by SQL Joins , you are right . SQL Joins could give us much more flexibility than SQL Subquery but it depends highly on the situation as well as requirements .

See : SQL Joins Explained

When to Use SQL Subquery versus a Standard Join?

There cannot be any universal rule to answer this question . We can adopt some best sql practices to make sure that we don’t end up worrying about tuning the same basic SQL queries a year later. A stage where you have to spend a good amount of time to go through the whole thing.

More often than not main concern of DBAs and data developers is performance of SQL query. So use of SQL Subquery or SQL Join can be decided if we keep performance as our deciding factor.

Independent Subquery

If a subquery can be executed without outer query entirely it’s own then it’s independent subquery. Such queries are evaluated once and the result is then used by outer query.

Dependent Subquery

If a subquery cannot be executed without outer query and you get a syntax error because it refers to some tables outside of the subquery then its a dependent subquery. Such queries can be very slow . Reason is that they have to get the values from outer query as well. This with example is discussed at the bottom in correlated query exmaple . So if a outer table has 10,000 rows the inner query will be executed 10,000 times.

To answer our question many DBAs avoid dependant subquery and change it to JOIN .

Hopefully , after this you can decide which tool to pick and what to do with such a scenario.

Types of SQL Subquery

Following are few types of SQL subquery with exmaples

Single row subquery : Returns zero or one row.


SELECT [DepartmentID]
,[Name]
,[Budget]
,[StartDate]
,[Administrator]
FROM [School].[dbo].[Department]
WHERE [Administrator] = (SELECT PERSONID FROM PERSON WHERE PERSONID=2)

This produces only one row.

Multiple row subquery : Returns one or more rows.


SELECT [CourseID]
,[Title]
,[Credits]
,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN
(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget] > 200000.00)

Correlated subquery

Reference one or more columns in the outer SQL statement. The subquery is known as a correlated subquery because it is related to the outer SQL statement.

From our previous articles How Grouping Works in SQL .  We use the example where we concluded the total marks of each student. It looked like this SQL Subquery

Can you find me the students who has above average grades? Use this query


SELECT FirstName, TotalGrades
FROM (

SELECT P.FirstName,SUM(GRADE) AS TotalGrades
FROM [School].[dbo].[StudentGrade] SG
LEFT JOIN [School].[dbo].Course C
ON C.CourseID=SG.CourseID
LEFT JOIN [dbo].Person P
ON P.PersonID=SG.StudentID

GROUP BY P.FirstName) A
WHERE A.TotalGrades > (
SELECT AVG(GRADE) AS TotalGrades
FROM [School].[dbo].[StudentGrade] SG

)

Run this query and it will return students with above average marks.Also this is dependant query

 

SQL Subquery Correlated example

How to Use SQL Subquery with SELECT Statement

SELECT [CourseID] ,[Title] ,[Credits] ,[DepartmentID]
FROM [School].[dbo].[Course]
WHERE [DepartmentID] IN
(SELECT [DepartmentID]
FROM [Department]
WHERE [Budget]> 200000.00)

 

How to Use SQL Subquery with the INSERT Statement


INSERT INTO [dbo].[Department]
([DepartmentID]
,[Name]
,[Budget]
,[StartDate]
,[Administrator])

VALUES  (5,'Computer', 800000, '2007-09-01 00:00:00.000',(SELECT PERSONID FROM PERSON WHERE FIRSTNAME='Awais'))

 

Use SQL Subquery with the UPDATE Statement


UPDATE [dbo].[Department]
SET [Name] = 'Computer Science'
WHERE [Administrator] =(SELECT PERSONID FROM PERSON WHERE LastName='Harui')

 

Use SQL Subquery with the DELETE Statement


DELETE [dbo].[Department]

WHERE [Administrator] =(SELECT PERSONID FROM PERSON WHERE LastName='Harui')

 

SQL Subquery Rules :

  • SQL Subquery must be enclosed within parentheses.
  • Can have only one column in the SELECT clause, unless multiple columns are in the main query for the subquery to compare its selected columns.
  • An ORDER BY command cannot be used in a subquery, although the main query can use an ORDER BY. The GROUP BY command can be used to perform the same function as the ORDER BY in a subquery.
  • SQL Subquery that return more than one row can only be used with multiple value operators such as the IN operator.
  • A subquery cannot be immediately enclosed in a set function.
  • The BETWEEN operator cannot be used with a subquery. However, the BETWEEN operator can be used within the subquery.

This is pretty much about subqueries and if you find it useful . Share it with your firends and colleagues and let us know about how effective it is.

We are moving towards advanced topics now and also taking up the guests posts as well. So feel free to contact and have a good time learning.

Happy Learning!
Cheers

Features Of Database Management System?

We have all heard so much about Database management systems that sometimes we confuse the two things. One is RDBMS which is  Dr. Codd’s data model which has 12 rules & the other is the software which is used to manage the RDBMS. Since , we humans love acronyms & like to shorten down everything these two terms are often used together . We have removed R from RDBMS  and started to use DBMS . Trust me it does confuse even the experienced people sometimes. Because S in both terms have different meaning. For RDBMS it is System and for DBMS it is Software. With this out-of-the-way let’s discuss features of database management system ( the software)

 

FEATURES OF DATABASE MANAGEMENT SYSTEMS

We will discuss few very important ones to familiarise ourselves with the basics of database management system.

Data Dictionary Management

Actual dictionary has meanings of words in it. Similarly , a DBMS has a dictionary where it stores information about all tables , columns and relationship between them & the change in them.  This saves you time to track & manage all of this by yourselves

Data Storage Management

Data is to be stored on disk of our computers right? Now , imagine you have to create the right structures or program the arrays , data list and create data structures to handle the data and the pressure of it’s growing size? It will be too much to handle . So a DBMS do this job for us . We just follow our designs & data structures and data storage is handled by our DBMS.

Data Transformation and Presentation

Modern databases have users from all parts of the world. You use facebook  & you see that it gives you language options or time & date options according to your region . Now this is so spotless that most of miss the genius work of programmers who do it. DBMS helps to achive this task by providing options to transform data in multiple formats & then present it to the appropriate users.

Date format in England is dd/mm/yyyy & in United States it is mm/dd/yyyy . DBMS should be able to manage this change & present it to both American & English user their time formats. Howz that for a requirement 😉

Security Management

DBMS mustn’t allow breach of user’s access .  DBMS should have appropriate security rules that can restrict access to tables , schema & other database objects. Why is this so important?
My client wouldn’t like their employees to access HR information. Such as salaries of other employees &  their contract details. Such data should only be accessible for HR department. Makes sense? 

Backup and Recovery Management

Sometimes , names are enough to show what does it mean. DBMS should be able to take backups and provide methods to recover the lost data . Sometimes , maybe some data developer accidentally removes your production table . Accidents happen all the time? DBMS should have logs & other mechanism than can support data recovery!

Resigning employees or a sleepy developers tend to make mistakes more! DBMS should be able to  help us out!

Data Integrity Management

Wallmart , Amazon have thousands of transactions every second if not minutes. Their orders coming in , inventory coming up on the screen . Now it’s the job of DBMS to make sure that data integrity is not affected in any case. Remember , about primary keys? If DBMS allows duplication of primary key . Can you guess what will happen?

Let me tell you. You could log in in your account and end up seeing my account. Why? because DBMS failed to manage your user id against your account. Imagine the damage. Someone’s credit card shopping is billed to you.

So DBMS should be able to maintain data integrity , data concurrency & data consistency!

Database Access Languages and Application Programming Interfaces

You can access database by a query language. Standard Query Language is one example and mostly used around the world.

If you have more features to add & comment them below. We will add it to this article. If you have enjoyed reading this. Please , share this with your friends & colleagues. Subscribe Us to receive latest articles in your inbox.

Write to us if you want to add any topic & we will surely do so.

Happy Learning!

Cheers!

 

Data Normalization University Management System

Data Normalization is the process widely used and implemented in industry . While this concepts has it’s use it comes with a price. However , Designing a DBMS is fundamental and of utmost importance to the success of any software. You can think of DBMS as human brain and this is true because DBMS design is what controls everything in software .  From storage of information to its retrieval , DBMS do everything for a software that our brain do. Hence , it’s very important for our DBMS to be able to clear paths for retrieval of information and its storage. In any case , it should avoid redundant data and anomalies. Check the links before going through Data Normalization With Example .

Data Normalization

Process of removing redundancy , data duplication and elimination of anomalies and preventing loss of information.

If you are still unable to understand terms used above . Read these for better understanding and also there will be other terms which we have already discussed separately. Following are few articles which are pre-requisites to this article. If you are a beginner you can go through them or else you can proceed

  1. Role Of Keys In DBMS
  2. Database Terminologies
  3. Data Modelling
  4. Redundancy  

Normalization Concept in DBMS was introduced to solve the data anomalies that were causing halts and serious data problems to the developers.

Also Read : Data Anomalies in DBMS

 

Normalization was also proposed by Dr. Codd and his 12 rules are the fundamentals to every DBMS design for decades now.  Now that the context is built , we can proceed with the Normalization with example.

We will discuss four types of Normal Forms

  1. First Normal Form 1NF
  2. Second Normal Form 2NF
  3. Third Normal Form 3 NF

FIRST NORMAL FORM 1NF

Table is considered to be in First Normal Form 1NF only if it contains no repeating groups . Taking the example from ERD case study of University management system . See below

 

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000,48758Karachi,Lahore4Botany
3Nouman259887, 6556Faislabad,New York5English
4Jerry21266 ,5555Dubai, Abu Dhabi6Physical Education

Two Columns , Zip Code and City they both have repeating groups in them. Can you imagine a way to identify which city has which zip code of two? Hence, we need to remove them and convert them in a way that each record can be recognised independently.

First Normal Form will be like this

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000Karachi4Botany
2Faiz2148758Lahore4Botany
3Nouman259887Faislabad5English
3Nouman256556New York5English
4Jerry21266Dubai6Physical Education
4Jerry215555Abu Dhabi6Physical Education

No Repeating Groups in table for it to be in 1NF. Above table completes the condition

SECOND NORMAL FORM 2NF

Table is 2NF only and only if it already has 1NF and then it must be absolute that there should not be partial dependency of any column on primary key. This part can be confusing , so let’s try to spend some more time here.

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000Karachi4Botany
2Faiz2148758Lahore4Botany
3Nouman259887Faislabad5English
3Nouman256556New York5English
4Jerry21266Dubai6Physical Education
4Jerry215555Abu Dhabi6Physical Education

See the table above , it is in 1NF but let’s find other candidate keys in the table of Studnet. You have Student_ID which is the primary key and we have another key which is department’s foreign key Department_ID. We can use both these keys to find details of student . right? Let’s try to find dependency of every column on our composite key.

Student_ID,Department_ID -> Name , Zip Code & Department Name they both need these two keys to be able to identify them as to which student they belong to.

Student_ID,Department_ID -> City & Age they both do not depend completely on our composite primary key rather their dependency is partial.

Remove Partial dependency from table for it to be in 2NF

See below

Student_IDNameZipDept_IDDepartment Name
2Faiz00004Botany
2Faiz487584Botany
3Nouman98875English
3Nouman65565English
4Jerry2666Physical Education
4Jerry55556Physical Education

Student_IDZip CodeCity
20000Karachi
248758Lahore
39887Faislabad
36556New York
4266Dubai
45555Abu Dhabi

Student_IDNameAge
2Faiz21
3Nouman25
4Jerry21

We now have two more tables Student Age & Zip Code.

THIRD NORMAL FORM 3NF

Can you guess the first condition for a table to be in 3NF? Yes , it has to be in 2NF first. Easy Peasy! Who thought Normalization could be this much fun?

Other condition is that for a table to be in 3NF  every column should only be dependant on primary key and no other column  this is called transitive dependency . You can say we don’t want to have candidate keys in 3NF. From above tables , Student table has transitive dependency for Department Name. This column depends on Student ID and Department ID. hence , we will create a new table with Department name and remove the department name column.

Student Table at 3NF will be

Student_IDNameZipDept_ID
2Faiz00004
2Faiz487584
3Nouman98875
3Nouman65565
4Jerry2666
4Jerry55556

All other tables already satisfying all conditions. No transitive dependency columns in a table for it to be 3NF.

So we have converted this long table

Student_IDNameAgeZip CodeCityDept_IDDepartment Name
2Faiz210000,48758Karachi,Lahore4Botany
3Nouman259887, 6556Faislabad,New York5English
4Jerry21266 ,5555Dubai, Abu Dhabi6Physical Education

  Into these well-designed tables

Student_IDZip CodeCity
20000Karachi
248758Lahore
39887Faislabad
36556New York
4266Dubai
45555Abu Dhabi

Student_IDNameAge
2Faiz21
3Nouman25
4Jerry21

Student_IDNameZipDept_ID
2Faiz00004
2Faiz487584
3Nouman98875
3Nouman65565
4Jerry2666
4Jerry55556

Can you answer what is 3nf in dbms?

Answer is pretty straight forward . Table with no transitive Dependency, no partial Dependency & does not have repeating groups is 3nf table

data normalization rules

To sum it up and answer the question of How to normalize tables? Follow below normalization rules:

  • No Repeating Groups in data will give us 1NF
  • Remove Partial Dependency and 1 NF will give us 2NF
  • No Transitive Dependency and 2NF will give us 3NF

Now that we are done with data normalization , We would like to share an insight . If you are a professional you already know this but if you are a student know this . In Industrial database people hardly go to 3NF and let alone to Boyce and Codd’s Normal form or fourth and fifth normal form.

You can see the reason as to why , Normalization creates too many tables and query costs increase dramatically which is something we don’t want while retrieving data. It reduces redundancy ,duplication but increases tables and dependency between them. Hence , people tend to avoid them sometimes.

In above example we have Age table with only column , that is unnecessary in practical terms . It will just add one more join to simple information. Anyways  , you should know these facts as well to understand completely what data Normalization is .

We hope this was helpful. You can follow us on facebook and subscribe us by e-mail to get the latest articles. If you likes this please , share with your fellows and if there are mistakes . Please , feel free to correct them in comments. Afterall , we are all but Students at the end of the day!

Your One Dollar Will keep us moving.

Happy Learning!

Entity Relation Diagram| University Management System Part 2

In Part 1 for ERD Case Study University Management System  we discussed how can you crack any case study and convert it into DBMS .  We did that step by step and had a wonderful feedback .  In another article on best practices for ERD , we discussed the pitfalls that we need to avoid.  Time to implement them on our ERD .

This is our Current ERD .

ERD University Management System

PROBLEMS
  1. Entity Research Project and Faculty has Many-to-Many relationship.
  2. Faculty and Course has Many-to-Many relationship.
  3. Student and Course has Many-To-Many relationship
  4. Student has multi-varied attribute
SOLUTION
  1. Change multi-varied attribute into a entity.
  2. Create a Associative Entity and reduce many-to-many into one-to-many and pass they keys from both.

See below ERD Now .

 

We have following new associative entities .

  1. Researchers
  2. Faculty Courses
  3. Student Enrolled.

ERDs can get very complex and at enterprise level they are very difficult to draw and almost impossible to understand as they get so big. Consider this diagram , even though we kept it simple still it became a difficult diagram to go through. Even with this issue , it is always considered best to have a sketch.

This ERD is now ready for data modelling and after that Database Developers can code them and prepare to connect them with applications to store and retrieve data.

If you have mastered this ERD practice , to write a SQL query will be a child’s play for you . As a matter of fact , you will have a great understanding of DBMS and you’ll eventually understand about flow and management of data.

Next articles , will cover about Normalization with examples and see how a DBMS is optimised and avoids anomalies .  That will be the last practical article on designing DBMS with best practices . After that we will start moving towards Query Writing and how a database engine handles all operations.  Some very exciting and logical stuff is in queue.

If you have enjoyed this article or learned anything , like the post and give us feedback so that we will know how else can these articles be improved!

Stay Tuned and Happy Learning!

ERD Best Practices | Resolving DBMS Relationships Issues

Previously , we resolved a case study by creating an ERD for University Management System . Today , we will focus on ERD best practices and resolving dbms relationship issues . This can make our life a whole lot easier . So far , we have established the fact that an Entity-relation diagram is the first step in designing a DBMS . After an ERD , we usually move forward with relational modelling . Hence , it’s important that our ERD should be very well documented and well designed and it should be able to avoid some of the major pitfalls such as anomalies in dbms.

RESOLVING MANY-TO-MANY RELATIONSHIPS :

When we talk about resolving relationships in DBMS , we are adapting pro-active approach and making our design error-prone even before implementing the design. How’s that  for the best practices. Never allow a relationship that is many-to-many , instead reduce it one-to-many and create an associative Entity You’re probably like for a reason , pretty understandable let’s go for it. 

ERD University Management System

You can see Student and Course has many-to-many relationship between them .

Problem

In DBMS , one objective is that data is organised in a way that data should be able to identify all records in a table with minimum redundancy . Student and Course will have a problem that there could be a repetition of set of keys when they will be joined.

Many-To-Many means many courses related to more than one student and vice versa. If you want to see How many courses does a student takes. You will end up with something like this

How many more columns will you add as still you’re not sure as exactly how many more courses a Student will take. Will you add more columns ? because you cannot have more rows for Faiz as Student_ID is  Primary Key.

Solution:

Create a bridge table ( junction table ) also known as Associative entity and pass both Student_ID and Course_ID as Foreign key into it with a Primary Key of its own. The above model will now look like this

MULTI-VALUED ATTRIBUTES

Change multi-valued attributes to separate entities and provide its foreign key in place of  attribute . This way you can have a better control over such attributes.

These two are the main practices that no matter what you should follow so that you don’t end up with problems in future , where you have to change the design again and again.

We have applied these rules to resolve dbms relationships issues in the ERD Case Study | University Management System Part 2 System.. Then we’ll do a relational modelling for it too. If this was helpful , share with your friends and follow us for more articles to come .

Anomalies In DBMS

Anomalies in DBMS are the taboos . You do them and instantly your colleagues will curse you every time they have to do some operations on your DBMS. My rule is , that I always try to understand the meaning of word before understanding the concept, since English is not my native language . This practice has opened many doors for me, as designing a DBMS is exactly how we do things is real life. My advise for you is to do same .

Anomalies In DBMS

An anomaly is an abnormality, a blip on the screen of life that doesn’t fit with the rest of the pattern.

If you have followed the previous articles , by now you should be able to design a database management system. By designing ERD for Case Study of University Management System , you are at a stage where you just need to do the data modelling and you’re good to go for implementing the system. Easy Peasy.  Right? 

Remember Dr. Codd’s Rules? Physical and logical data independence and integrity of a database management is the most important part of it. Easy to say and very difficult to do . However , this is where you will thank this article because that is precisely what we have today . So let’s begin.

If you implement current ERD of University Management System , you will end up with multiple problems. It may be that you insert a student directly into department when you don’t have the student in the Student table. This will cause a lot of problems in updating and retrieval of records. We call these behaviour an Anomaly in DBMS.

Like Redundancy  anomalies are also very important topic for exams if you’re a student, and for interviews if you are looking for a job . Following are the ones we should be concerned about .

  1. Insert Anomaly
  2. Delete Anomaly
  3. Update Anomaly

INSERT Anomaly in Database

An Insert Anomaly occurs when attributes cannot be inserted into the database without the presence of other attributes. Usually when a child is inserted without parent.

Jerry is a new Student with department id 6. There is no Department with this Dept_ID 6. Hence , the anomaly. The usual behaviour should be a new department id with 6 and only then Student could have it.

UPDATE Anomaly in Database

When duplicated data is updated at one instance and not across all instances where it was duplicated. That’s an update anomaly .  See below English department has now Dept_ID 8 , but unfortunately it was not updated in Student table.

DELETE Anomaly in Database

Now if someone decides to delete Computer Science department , he may end up deleting all student’s data who had the department of Computer Science. So to say deletion of some attribute which causes deletion of other attributes is deletion anomaly.

These anomalies are addressed by Normalization . The normalization makes sure that all these three issues and other possible be addressed at the time of designing.  We will discuss them in our usual style in next few days.

If you like this and understood it , share with your friends and fellows.

Happy Learning and Cheers!

Role of Keys In Database Management System | Data Modelling

Best thing about being a business intelligence consultant is that my work is closely related to real-life scenarios and objects that I use daily . Every time when i come across new concepts or even problems , I have always been able to map in my world . Such is our topic for today-Role of Keys In Database Management System .

When we hear the word “KEY” the first thing that pops-up is an image of a door. Right? After reading this you will think of relation/tables now . Just as a key is used to access the contents behind the door , which can be a room , car or a security safe. Keys in DBMS is used to access the contents/tuples ( Rows ) of tables . That should be enough with the analogy I think . Time to jump to explanations and let’s try to understand different types of Keys used .

Before you read further , this article is a pre-requisite for coming articles on Normalisation & Joins .We already have articles on what is a database? Database Management System , Data Modelling , Database Relationship and most recently , We solved a case study as an example for University Management System by designing an ERD . You can go through them as well .

DBMS Keys are as follows  :

  1. Primary Key
  2. Foreign Key
  3. Composite Key/ Composite Primary Key
  4. Candidate Key
  5. Key-Attribute & Non-Key Attribute
PRIMARY KEY :

Attribute which is unique and can identify all records in table. This is one of the most important attributes . Primary key in another table , has a potential to be a foreign key provided they have same data types and cannot be NULL
Hence  , while designing a DBMS database designers keeps all primary keys of same data type and data length which is a good practice .

Refer to image at the bottom for visual example

FOREIGN KEY :

Attribute which is referenced to the primary key attribute of another table. This is used to create relationships between two tables. For example , In below image Dept_ID in Student table is a foreign key from Table Department which also has the same Attribute with exact same data type and data length. Important point is that , that Student.Dept_ID is mapped to the Department.Dept_ID and both tables have these attributes in it.

Foreign Key can be duplicated and NULL.

COMPOSITE PRIMARY KEY

Some cases demand that a table is identified by combining more than one column. The result is the composite primary key. It has same properties as of Primary key.

Student.StudentID and Student.EnrollNumber we can use these two as a composite key. Can we add the Studnet.Roll Number ? Comment below with your answer and reason

CANDIDATE KEY

Selection of a primary key can sometimes be a tricky thing to do . A primary key should not only be able to identify each record but also it should be usable across all tables as a foreign key , if needed be. Hence , table can have more than one attribute with this property . These attributes are candidate keys. They are also called alternate or secondary keys

Can you identify more keys in below image? Awaiting your comment

KEY ATTRIBUTE & NON KEY ATTRIBUTE :

All attributes that are some kind of keys are key attributes and rest of them are non-key attributes.

 

See below diagram and find all keys . In case you have any question just feel free to contact us!

 

Now that you know these keys and how they behave. We are ready to move towards Normalisation , We will learn normalisation in detail with the University Management Case Study.  We will apply these concepts on ERD and Relational Model.

If this was useful and helpful  Share with your friends and Subscribe to get the latest articles in your mailbox!

 

Entity Relation Diagram | University Management System Part 1

Exciting stuff today people. Today , we are actually going to get our hands-dirty and do a hands-on practice for designing a database management system by creating ERD Case Study for University Management System . In case , you need an overview of previous articles :

  1. What is a database?
  2. Database management system
  3. DBMS Models & Relationships
  4. Data Modelling & Crow’s Foot Notation

In all these articles , we have tried to keep things simple and use visual examples to explain everything . You can go through them and decide for yourself if they help you!

We will design a University Database Management System  today , because you’ll be able to understand the complete requirements for such scenario. Before we rush towards designing  , I want to introduce you to a very simple technique to crack case study.  Use this and the whole entity-relation diagram will be created before you know it.

There are four steps in designing a ERD for a DBMS .

  1. Identify Entity and members
  2. Decide relationships and Cardinality and Modality
  3. Draw Entities separately
  4. Connect relationships and entities

 

Now you know what to do with the user requirement ,  Just go through the entire case study first and then apply these four steps on it.

ERD Case Study :

In a University  , there are several departments and each department has a head of department who belongs to Faculty. Department have a name , phone extension , specific mailing address and Students that belong to the department. Students can belong to only one Department at a time and Department can have more than one or no Student

Students and faculty have names and unique identification numbers , with address , age , gender and other information. Student studies different Courses offered by University . Faculty teaches these Courses . In each semester one student can take more than one course and Faculty can teach more than one courses . Faculty members can teach in multiple Departments. Each course can be taught by many faculty members or no one

Faculty members are also working on multiple research projects. These projects are funded by government and university. One project can have more than one faculty member and one faculty member can work on more than one project

 

Huff , Looks like a long task . Lets apply our four steps on this requirement. Think of them and study this requirement again.

Identify Entity and Members  ;

Long ago ,  we told you how to identify entities . Remember? No? No worries  .  You can find it  ERD Terminologies . Start identifying nouns in above statement and make them bold characters.

In a University  , there are several departments and each department has a head of department who belongs to Faculty. Department have a name , phone extension , specific mailing address and Students that belong to the department. Students can belong to only one Department at a time and Department can have more than one or no Student

Students and faculty have names and unique identification numbers , with address , age , gender and other information. Student studies different Courses offered by University . Faculty teaches these Courses . In each semester one student can take more than one course and Faculty can teach more than one courses . Faculty members can teach in multiple Departments. Each course can be taught by many faculty members or no one

Faculty members are also working on multiple research projects. These projects are funded by government and university. One project can have more than one faculty member and one faculty member can work on more than one project

Decide Relationships , Cardinality and Modality

No idea how to do it? Go  ERD Terminologies ERD Terminologies. Simply identify verbs and identify them . Let’s make them bold Italic characters,

In a University  , there are several departments and each department has a head of department who belongs to Faculty. Department have a name , phone extension , specific mailing address and Students that belong to the department. Students can belong to only one Department at a time and Department can have more than one or no Student

Students and faculty have names and unique identification numbers , with address , age , gender and other information. Student studies different Courses offered by University . Faculty teaches these Courses . In each semester one student can take more than one course and Faculty can teach more than one courses . Faculty members can teach in multiple Departments. Each course can be taught by many faculty members or no one

Faculty members are also working on multiple research projects. These projects are funded by government and university. One project can have more than one faculty member and one faculty member can work on more than one project

Draw Entities  & Attribute Separately :

You may wonder about the members , as they can sometimes are missing , so we add the missing attributes that are not in the requirement by the knowledge of the industry we are designing the DBMS for . A primary key is a must attribute for  every entity .

Student have Name , age , gender , address , phone Number , Roll Number , Semester , Course_ID and Student_ID. Faculty have Name , age , gender , address , phone Number ,  Semester , Course_ID, Grade , Salary , Faculty_ID and designation. Course have Name , Code , Student_ID , Faculty_ID , Department_ID and Course_ID. Department have Name , Student_ID , Faculty_ID and Department_ID. Research Project – Project_ID, Faculty_ID , Name , Duration.

See Below the diagram

ERD University Management System
Complete ERD

Create Relationships Between Entities.

We know the relationships from above steps and also what will be the cardinality and modality. Using Crow’s Foot Notation  . We have combined them .

ERD Case Study

If we have to read this diagram , this is how it will go . Reading symbols at the other end

Student-Department ; One Student Belongs to One and Only one Department

Department-Student : Department can have more than one student and no Student. ( *Business Logic ) .

Student-Course  : Student must have one course and can have more than one.

Course-Student : One course can be offered to many students or no students at all

Course-Faculty : One course can be taught by many faculty member or no one (* Business Logic )

Faculty-Course : One faculty member will teach one course or more than one courses

Can you do the same for Research Project and Faculty? Comment below . If you can.

*Business Logic : While you design an ERD  . Never ever do anything with common sense and stick strictly with the logic , common sense says if department doesn’t have a student, how can it exist? But the user requirement says it can, Hence it will. Always stick with the user requirements

Concluding this long article , we have finally connected all the dots ( previous articles ) and develop this article in which we designed a ERD. Next step is to normalise this ERD and resolve relationships .

Also See : ERD Case Study Part 2 University Management Systems

If you liked this . Please  , share with your fellows and subscribe us for more articles. We will soon be going towards query writing as well.

Happy Learning!
Cheers

Data Modelling | Database Management Systems

Data Modelling  is the process of organising data elements into entities and mapping them into real world objects . This defines how different entities will behave together and what will be the relationship among them. Database Management Systems are designed to store information and data for softwares . Before , implementation of the system the DBMS is logically structured and designed. Because, this decides how data will be stored , processed and interact.

Previously ,we studied about proposed data models and last of it was relational database which is being mostly used in industry now . Business analyst collects requirements from client with careful and elaborate discussions. Therefore , This leads to data modelling diagrams and decisions about the business logic .
Data Modelling comes in pretty much handy in implementing those decisions . Think of data modelling as a road map into your software solution. Because , that’s the role of it in DBMS implementation.

Data Modelling

Concluding above discussion and a simple definition for Data Modelling could be :

 The logical structure of a database and  determining the manner in which is  stored, organised and manipulated

Two Data Models are

  1. Entity-Relationship Model ( ERD Diagrams ) 
  2. Relational Model 
ENTITY-RELATIONSHIP MODEL

This model is designed by translating real-world scenarios into database model by creating  entities ,  relationships ,  attributes and constraints.

   Entity & Members

Entities ( real-world ) objects and their attributes (properties ) . Example Student is an entity and his Name , age , address , gender etc are attributes

   Relationships

The logical association and way of interaction between entities enforced by using cardinalities and modalities . Example , Student studies in a University . When we link them we need to define how should they interact. Because , data modelling requires that a relationship between entities must have a minimum & maximum association. For instance , here we need to know how many students can be in a university at maximum and minimum.

  • One-to-One
  • One-to-Many
  • Many-to-Many
  • Many-to-One

This is all linked together by Crow’s Foot Notation .  Check a saple ERD for a University Management System

 

RELATIONAL MODEL :

This model is more scientific . Hence , its implementation is very wide and practical in the industry . Here , entities are known as relation/table . Attributes/members are converted into columns. Every members has its own domain data in it. See below for the explanation :

 

Table : Relation also  referred as entity in ERD Model

Attributes : Properties of entities

Columns :   For storing attribute data

Rows      :   For unique data against all attributes

 

These relations in a DBMS are usually combine together by Joins , Normalised and constraints for reporting , storage and support of application.

Also See : Joins Explained

An example for Relational Model given below. We will use same example which we were using for an ERD .

Here we define attribute name with the data type and relationship between them with cardinality and modality.

Must-Read : Entity Relationship Diagram | University Case Study

In the above , link we have a case study.  We will first breakup the scenario into elements & make an ERD out of it. Classic example of data modelling .

After this we have normalisation and  we have used same example of ERD and apply normalization on it.

Once you are done with these articles. You will be able to translate the complete user requirement into database management system because data modelling is an essential to all other concepts.

If this was helpful  , follow us or subscribe and share with your friends.

Happy Learning!
Cheers!

 

Crow’s Foot Notation

When we design a database management system , we set some basic rules and define a limit and relation between different entities . In other words , we decide in what capacity the entities will interact each other.  Cardinality & Modality are two concepts that take care of these rules. Hence , today’s article is about how to link entities with each other.Crow’s Foot Notation is used to represent them.

Supposedly , you landed here directly and need to go through what i am saying. Go through these articles first and then come back .

  1. What is a Database?
  2. Database Relationships.
  3.  ERD Terminologies

If you are not a beginner  , then please continue . After this article , we will learn how to do an ERD  ( Entity-Relation Diagram ) because we have covered every aspect of the terms and concepts we have to use in ERD .

Cardinality

What will be the maximum relationship between entities . This sometimes is also referred as mandatory part of relationship. Let’s clear this up with an example.

If we talk about a relation between Student and University. I can say that Awais studied in a Government College University Lahore , now by rule Awais can be a student of only one university at max. Right? What about University? University can have one and more than one students.

Relationship between Student-University , hence is one-to-many.  What is the maximum relationship between two entities? It will either be one or many.

   Representation :

Crow's Foot Notation

 

 Modality

What will be the minimum relationship between entities. This is referred as optional part of relationship that entities can and cannot have .  Taking another example of Awais’ University.

Awais’s University gives out Annual Scholarships to its brightest and sharpest . Relationship between Student and Scholarship could be , any student may or may not have a Scholarship , but  Scholarship  will be given to at least one student .  So Student may or may not have a Scholarship but Scholarship will have one or many students.

Representation :

Crow's Foot Notation

Combining examples and defining relationships between them :

Crow's Foot Notation

Reading Method :

Student-University : Read the relationship from Student to University . Cardinality and Modality symbols will be on University side. One Student at max in One University and at least in One University.

University-Student : One University can have more than one Student and at least one .

Comment below the rest of the relationships , if you understand the concept here.

 Relationships between entities are defined by cardinalities and modalities . Symbol at right side for maximum and left for minimum.

Crow’s Foot Notation Visually

Crow's Foot Notation

 

Got Questions for Us ? Do not hesitate to comment below and if this has helped . Share with your friends and community  , knowledge increases when we share with others.

Next Article , will be on an ERD diagram . How to create one and translation of a case study into ERD.  Very exciting stuff. Subscribe for more articles . See you next time.