Tuesday 11 June 2024

Understanding Joins in SQL Server

Leave a Comment

As we know SQL Server has been created based on two mathematical concepts, they are Set Theory and Predicate Logic. In set theory, the cartesian product is the basic operation. Joins in SQL Server also works in the same way as the Cartesian product.

In mathematics, the Cartesian Product of sets A and B is defined as the set of all ordered pairs (x, y) such that x belongs to A and y belongs to B.

For example, if A = {1, 2} and B = {3, 4, 5}, then the Cartesian Product of A and B is {(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)}.

When we apply joins between tables, the same cartesian product will happen first.

Joins are required to return the data from multiple tables. These tables should have common functionally similar columns to have a join condition between tables.

We will understand the joins after taking a look at cross-join (cartesian product) first.

CROSS Join

When we apply cross join between two tables(TableA and TableB), every row in TableA will have a combination with every row in TableB.

Let's take an example and look at it.

IF OBJECT_ID('dbo.ProductCategory') IS NOT NULL
    DROP TABLE dbo.ProductCategory;
CREATE TABLE dbo.ProductCategory (
    ProductCategoryId INT PRIMARY KEY IDENTITY,
    CategoryName VARCHAR(500)
);
GO
INSERT INTO dbo.ProductCategory (CategoryName) VALUES ('Fruits');
INSERT INTO dbo.ProductCategory (CategoryName) VALUES ('Vegetables');
INSERT INTO dbo.ProductCategory (CategoryName) VALUES ('Water');
INSERT INTO dbo.ProductCategory (CategoryName) VALUES ('Dairy Based Food');
INSERT INTO dbo.ProductCategory (CategoryName) VALUES ('Meat');
GO
SELECT * FROM dbo.ProductCategory;
GO
IF OBJECT_ID('dbo.Product') IS NOT NULL
    DROP TABLE dbo.Product;
CREATE TABLE dbo.Product (
    ProductId INT PRIMARY KEY IDENTITY,
    ProductName VARCHAR(500),
    ProductCategoryId INT NOT NULL
);
GO
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Apple', 1);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Mango', 1);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Capsicum', 2);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Tomato', 2);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Milk', 4);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Curd', 4);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Chicken', 5);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Mutton', 5);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Pasta', 50);
INSERT INTO dbo.Product (ProductName, ProductCategoryId) VALUES ('Brown Rice', 65);
GO
SELECT * FROM dbo.Product;
GO
SELECT * FROM dbo.ProductCategory
CROSS JOIN dbo.Product;
GO

In general, we refer to the left table as the ProductCategory table and the right table as the Product table. Every row from the ProductCategory table will combine each row with each row from the product table.

Below is the one-part result of the cross-join. Please run the above script to check the full resultset.


Inner Join

when we apply INNER JOIN between tables, the result is going to be only the rows which are satisfying the given condition. Non-matching rows will be ignored.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM dbo.ProductCategory PC
INNER JOIN dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId;
LEFT OUTER Join/ LEFT Join

When we apply LEFT JOIN between tables, the result is going to be only the rows which are satisfying the given condition plus Non-matching rows from left side table, null values will be returned for the corresponding rows from the right side table.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM dbo.ProductCategory PC
RIGHT OUTER Join / RIGHT Join

when we apply RIGHT JOIN between tables, the result is going to be only the rows which are satisfying the given condition plus Non-matching rows from the RIGHT side table, null values will be returned for the corresponding rows from the LEFT side table.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM dbo.ProductCategory PC
RIGHT JOIN dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId;
FULL OUTER Join / FULL Join

when we apply FULL JOIN between tables, the result is going to be only the rows which are satisfying the given condition plus Non-matching rows from left side table, plus non-matching rows from right table, null values will be returned for the corresponding rows from the both side tables.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM dbo.ProductCategory PC
FULL JOIN dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId;
SELF JOIN in SQL Server

When we join a table with itself, it is called a "self join". It is essentially an inner join where both the left and right tables are the same table. We use a self join when a logical relationship exists between two columns within the same table.

Example
List out all the employees along with their managers.

In the same way, we can join more than two tables as well. The only thing we need to find out is the common table column between tables and the columns to be returned.

There is another important difference to understand between the join condition and the where condition.

what is the above difference/how does the above difference behave in the case of inner join and left join?

Two important points here.

  1. Matching Predicate
  2. Filtering Predicate

Let's use the below query to understand the above two points.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM
    dbo.ProductCategory PC
INNER JOIN
    dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId
WHERE
    PC.CategoryName = 'Meat';

When we write any condition in the ON clause it becomes a Matching Predicate. The condition we write in where clause is Filtering Predicate.

Let's modify the above query and try to understand what will happen.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM
    dbo.ProductCategory PC
INNER JOIN
    dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId AND PC.CategoryName = 'Meat';

In the case of inner join, it does not make any difference.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM
    dbo.ProductCategory PC
INNER JOIN
    dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId
WHERE
    PC.CategoryName = 'Meat';

Now move the filter condition from where clause to on as below.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM
    dbo.ProductCategory PC
INNER JOIN
    dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId AND PC.CategoryName = 'Meat';

Now we understand that there is no difference in the case of writing conditions in where clause or on clause when we use inner join. But in the case of left join, there is an important difference.

SELECT
    PC.ProductCategoryId,
    PC.CategoryName,
    P.ProductId,
    P.ProductName
FROM
    dbo.ProductCategory PC
LEFT JOIN
    dbo.Product P ON PC.ProductCategoryId = P.ProductCategoryId
WHERE
    PC.CategoryName = 'Meat';

Now move the condition from where clause to on clause.


See the surprising result.

So, please note that the condition we write in on clause is matching predicate, it won't filter the data, if there is no match from the right side table, it will give a null value. You have to write your filter conditions always in the where clause to filter the data. Joins can be used to update statements and delete statements as well. When you write update/delete statements based on left join be careful. Please test your query to understand whether it is giving the correct result or not.

Best SQL 2022 Hosting Recommendation

One of the most important things when choosing a good SQL 2019 hosting is the feature and reliability. HostForLIFE is the leading provider of Windows hosting and affordable SQL 2019, their servers are optimized for PHP web applications. The performance and the uptime of the hosting service are excellent and the features of the web hosting plan are even greater than what many hosting providers ask you to pay for. 

At HostForLIFEASP.NET, customers can also experience fast SQL 2019 hosting. The company invested a lot of money to ensure the best and fastest performance of the datacenters, servers, network and other facilities. Its datacenters are equipped with the top equipments like cooling system, fire detection, high speed Internet connection, and so on. That is why HostForLIFEASP.NET guarantees 99.9% uptime for SQL 2019. And the engineers do regular maintenance and monitoring works to assure its Orchard hosting are security and always up.

0 comments:

Post a Comment