Correlated and scalar subqueries in SQL

This recipe explains correlated and scalar subqueries in SQL using SQL server

Correlated and scalar subqueries in SQL

In this tutorial, let us understand what are subqueries and what are the different types of subqueries with clear examples.

Generally, subqueries are nothing but a query within a query. There are a lot of ways in which we can write subqueries in a query. Subqueries can sometimes be called as inner queries and the main queries are called as outer queries. Subqueries can be inserted in various places like SELECT, FROM, or WHERE clause of the main query.

There are three main types of subqueries

  1. Scalar subquery : A subquery that returns just one row and one column. Or in general terms, a subquery that returns just one value
  2. Multiple row / column subqueries : A subquery that returns multiple rows or multiple columns or both.
  3. Correlated subquery : A subquery that depends on the results of the outer query.

Let us understand Scalar and Correlated subqueries in detail with an example in this tutorial.

In this tutorial we are using a simple employee table which has 7 columns and 49 rows. The sample data is shown below.

Scalar subqueries

Scalar subqueries are fairly easy to understand. Let us consider an example where we want to find all the employees from the employees table whose salary is greater than the average salary. The best way to find this out is by using the concept of scalar subquery. Take a look at the following query.

In the WHERE clause of the query, we can see another simple query which just returns the average of all the salaries. If this query is executed separately, the result obtained is 5818 which the average of all the salaries. This value can now be used as a condition for the WHERE clause in the outer query.

The result of the above query is

EMPLOYEE_ID FIRST_NAME LAST_NAME JOB_TYPE SALARY MANAGER_ID DEPARTMENT_ID
4 Michael Hartstein MK_MAN 13000 100 20
5 Pat Fay MK_REP 6000 201 20
6 Susan Mavris HR_REP 6500 101 40
7 Hermann Baer PR_REP 10000 101 70
8 Shelley Higgins AC_MGR 12008 101 110
9 William Gietz AC_ACCOUNT 8300 205 110
10 Neena Kochhar AD_VP 17000 100 90
11 Lex De Haan AD_VP 17000 100 90
12 Alexander Hunold IT_PROG 9000 102 60
13 Bruce Ernst IT_PROG 6000 103 60
17 Nancy Greenberg FI_MGR 12008 101 100
18 Daniel Faviet FI_ACCOUNT 9000 108 100
19 John Chen FI_ACCOUNT 8200 108 100
20 Ismael Sciarra FI_ACCOUNT 7700 108 100
21 Jose Manuel Urman FI_ACCOUNT 7800 108 100
22 Luis Popp FI_ACCOUNT 6900 108 100
23 Den Raphaely PU_MAN 11000 100 30
29 Matthew Weiss ST_MAN 8000 100 50
30 Adam Fripp ST_MAN 8200 100 50
31 Payam Kaufling ST_MAN 7900 100 50
32 Shanta Vollman ST_MAN 6500 100 50

Scalar subqueries can be used within various clauses such as SELECT, FROM, WHERE, JOIN, UPDATE, DELETE INSERT and so on. Let us try to reframe the same example by replacing the subquery within a different clause.

The above query also returns the same results as that of the previous query.

Correlated subqueries

Correlated subqueries can be used in scenarios where a subquery needs to be executed for each candidate row in the outer query. In correlated subqueries, the subquery returns a different set of results for each candidate row considered by the main query. This way the outer query also depends on the inner query.

The best way to understand this is by considering an example. In the previous example we saw how to fetch the employee details of those whose salary is greater than the overall average salary. In this example, let us try to find out the employees whose salary is greater than the average salary of their department.

In the above query, we can see that the DEPARTMENT_ID of the outer table is being used inside the WHERE clause of the inner query. Here, for each row corresponding to the outer query,we calculate the average salary of that department and compare it with the current salary.

The result of the above query is

EMPLOYEE_ID FIRST_NAME LAST_NAME JOB_TYPE SALARY MANAGER_ID DEPARTMENT_ID
4 Michael Hartstein MK_MAN 13000 100 20
23 Den Raphaely PU_MAN 11000 100 30
29 Matthew Weiss ST_MAN 8000 100 50
30 Adam Fripp ST_MAN 8200 100 50
31 Payam Kaufling ST_MAN 7900 100 50
32 Shanta Vollman ST_MAN 6500 100 50
33 Kevin Mourgos ST_MAN 5800 100 50
12 Alexander Hunold IT_PROG 9000 102 60
13 Bruce Ernst IT_PROG 6000 103 60
17 Nancy Greenberg FI_MGR 12008 101 100
18 Daniel Faviet FI_ACCOUNT 9000 108 100
8 Shelley Higgins AC_MGR 12008 101 110

Hope this tutorial was helpful!

What Users are saying..

profile image

Jingwei Li

Graduate Research assistance at Stony Brook University
linkedin profile url

ProjectPro is an awesome platform that helps me learn much hands-on industrial experience with a step-by-step walkthrough of projects. There are two primary paths to learn: Data Science and Big Data.... Read More

Relevant Projects

Word2Vec and FastText Word Embedding with Gensim in Python
In this NLP Project, you will learn how to use the popular topic modelling library Gensim for implementing two state-of-the-art word embedding methods Word2Vec and FastText models.

End-to-End ML Model Monitoring using Airflow and Docker
In this MLOps Project, you will learn to build an end to end pipeline to monitor any changes in the predictive power of model or degradation of data.

PyTorch Project to Build a GAN Model on MNIST Dataset
In this deep learning project, you will learn how to build a GAN Model on MNIST Dataset for generating new images of handwritten digits.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

Personalized Medicine: Redefining Cancer Treatment
In this Personalized Medicine Machine Learning Project you will learn to classify genetic mutations on the basis of medical literature into 9 classes.

Build Piecewise and Spline Regression Models in Python
In this Regression Project, you will learn how to build a piecewise and spline regression model from scratch in Python to predict the points scored by a sports team.

Deep Learning Project for Text Detection in Images using Python
CV2 Text Detection Code for Images using Python -Build a CRNN deep learning model to predict the single-line text in a given image.

Classification Projects on Machine Learning for Beginners - 1
Classification ML Project for Beginners - A Hands-On Approach to Implementing Different Types of Classification Algorithms in Machine Learning for Predictive Modelling

AWS MLOps Project for ARCH and GARCH Time Series Models
Build and deploy ARCH and GARCH time series forecasting models in Python on AWS .