TSql Practical SQL Examples (Joining Tables)

 

This is the 3rd post on practical T-SQL; this post focuses on joining tables this time to continue the first one on SELECT, where you would start with…

 

The DimProduct in AdventureWorks

SQL normalization makes your data stored in relate tables; to relate them the technique is called performing joins on tables. There are quite a few variations on the join technique. All of the variations (inner joins and outer joins and more) are covered here…

 

AdventureWorksDW2008 is not a normalized database. It’s denormalized into a star schema (it’s a snowflaked star). Star and snowflake schemas are used in data warehousing and building SSAS cubes. The same product name EnglishProductName can appear more than once if other columns in the record are different—for example, the same product can have a different price at different times because is a slowly changing dimension table. Read this post about that

select EnglishProductName, ProductSubcategoryKey from DimProduct

where ProductSubcategoryKey is
not
null

EnglishProductName ProductSubcategoryKey

————————————————– ———————

HL Road Frame – Black, 58 14

HL Road Frame – Red, 58 14

Sport-100 Helmet, Red 31

Sport-100 Helmet, Red 31

N> the columns are related through the ProductSubcategoryKey

— cross join

SELECT EnglishProductName, EnglishProductSubcategoryName

FROM dbo.DimProduct, dbo.DimProductSubcategory

–(22422 row(s) affected)

SELECT
COUNT(EnglishProductName)
FROM dbo.DimProduct

———–

–606

SELECT
COUNT(EnglishProductSubcategoryName)
FROM dbo.DimProductSubcategory

–37

In this case the two tables are separated by a comma in the From clause. This syntax is called a cross join (or cartesian product). It shows every single POSSIBLE combination of product name and product subcategory name—even when the combination is wrong aka for some rows the pair product, subcategory has no meaning

 

— it does not conform to the standard ANSI SQL rules

SELECT EnglishProductName, EnglishProductSubcategoryName

FROM dbo.DimProduct ,

dbo.DimProductSubcategory

WHERE DimProduct.ProductSubcategoryKey = DimProductSubcategory.ProductSubcategoryKey

Avoid this way of joining as it would be not portable and suppported in the future

 

— join right way

SELECT EnglishProductName,EnglishProductSubcategoryName FROM dbo.DimProduct

JOIN dbo.DimProductSubcategory

ON dbo.DimProduct.ProductSubcategoryKey = dbo.DimProductSubcategory.ProductSubcategoryKey


HL Bottom Bracket Bottom Brackets

Road-750 Black, 44 Road Bikes

Road-750 Black, 48 Road Bikes

Road-750 Black, 52 Road Bikes

 

(397 row(s) affected)

 

The column ProductSubcategoryKey is in the DimProductSubcategory table the primary key and in the DimProduct table it’s a foreign key.

The DimProductSubcategory table is called the ONE table. The DimProduct table is sometimes called the MANY table, because one subcategory can have many products…

 

— joing better readibility

SELECT EnglishProductName, EnglishProductSubcategoryName

FROM dbo.DimProduct AS P


JOIN dbo.DimProductSubcategory AS S ON P.ProductSubcategoryKey = S.ProductSubcategoryKey

Use the aslias for the table to avoid a too long query

N> if you want to display the value of ProductSubcategoryKey too you need to fully qualified it (ex S. ProductSubcategoryKey) because is common between the joined tables

 

— joing 3 tables

SELECT EnglishProductName AS Product, EnglishProductSubcategoryName AS SubCategory , EnglishProductCategoryName AS Category

FROM dbo.DimProduct AS P


JOIN dbo.DimProductSubcategory AS S ON P.ProductSubcategoryKey = S.ProductSubcategoryKey


JOIN dbo.DimProductCategory AS C ON S.ProductCategoryKey = C.ProductCategoryKey

Mountain-500 Black, 44 Mountain Bikes Bikes

Mountain-500 Black, 48 Mountain Bikes Bikes

Mountain-500 Black, 52 Mountain Bikes Bikes

LL Bottom Bracket Bottom Brackets Components

ML Bottom Bracket Bottom Brackets Components

HL Bottom Bracket Bottom Brackets Components

Road-750 Black, 44 Road Bikes Bikes

Road-750 Black, 48 Road Bikes Bikes

After the FROM Clause list all the pair tables that have the shared column to use for the join

N> Order By clause must come after the Where clause after the Join part of the query, so the ordered is JOIN … ON … WHERE … ORDER BY

 

— check for ‘Adjustable Race’

SELECT p.EnglishProductName, ProductSubcategoryKey FROM dbo.DimProduct AS P

WHERE p.EnglishProductName =
‘Adjustable Race’

–EnglishProductName ProductSubcategoryKey

—————————————————- ———————

–Adjustable Race NULL

SELECT p.EnglishProductName, s.EnglishProductSubcategoryName FROM dbo.DimProduct AS P

INNER
JOIN dbo.DimProductSubcategory AS S

ON p.ProductSubcategoryKey = s.ProductSubcategoryKey

WHERE p.EnglishProductName =
‘Adjustable Race’

— EnglishProductName EnglishProductSubcategoryName

—————————————————- ————————————————–

–(0 row(s) affected)

Why you can’t find the ‘Adjustable Race’ product in the inner join results? Just because the inner join is a match of common values of the P and S tables on ProductSubcategoryKey, but for ‘Adjustable Race’ ProductSubcategoryKey balue is NULL, this value will not match any ProductSubcategoryKey of the ProductSubcategoryKey table, so it’s a kind of Orphan product…

— outer left join

SELECT P.EnglishProductName AS P, S.EnglishProductSubcategoryName AS S

FROM dbo.DimProduct AS P

LEFT
OUTER
JOIN dbo.DimProductSubcategory AS S

ON P.ProductSubcategoryKey = S.ProductSubcategoryKey


Seat Tube NULL

Top Tube NULL

Tension Pulley NULL

Rear Derailleur Cage NULL

HL Road Frame – Black, 58 Road Frames

HL Road Frame – Red, 58 Road Frames

Sport-100 Helmet, Red Helmets

Sport-100 Helmet, Red Helmets

Inner joins return matching records; outer joins return matching and nonmatching records; a left outer join returns all the records from the left table. The left table is the table before the join operator. This is sometimes called the preserved table.

 

— right

SELECT
TOP 3 S.EnglishProductSubcategoryName AS S , P.EnglishProductName AS P

FROM dbo.DimProductSubcategory AS S

RIGHT
OUTER
JOIN dbo.DimProduct AS P

ON P.ProductSubcategoryKey = S.ProductSubcategoryKey

go

SELECT
TOP 3 P.EnglishProductName AS P, S.EnglishProductSubcategoryName AS S

FROM dbo.DimProductSubcategory AS S

RIGHT
OUTER
JOIN dbo.DimProduct AS P

ON P.ProductSubcategoryKey = S.ProductSubcategoryKey

S P

————————————————– ————————————————–

NULL Adjustable Race

NULL Bearing Ball

NULL BB Ball Bearing

 

(3 row(s) affected)

 

P S

————————————————– ————————————————–

Adjustable Race NULL

Bearing Ball NULL

BB Ball Bearing NULL

 

(3 row(s) affected)

 

Using RIGHT OUTER JOIN
you can have the same result (mirrored in this example)

 

— right join to get orpahn product sub category

SELECT P.EnglishProductName, S.EnglishProductSubcategoryName FROM dbo.DimProduct AS P

RIGHT
OUTER
JOIN dbo.DimProductSubcategory AS S

ON P.ProductSubcategoryKey = S.ProductSubcategoryKey

WHERE s.ProductSubcategoryKey IS
NULL

EnglishProductName EnglishProductSubcategoryName

————————————————– ————————————————–

 

(0 row(s) affected)

 

You can use a full outer join returns all records from both the left table and the right table, including matched and mismatched records.

 

SELECT EmployeeKey, ParentEmployeeKey, LastName +
‘ ‘
+ FirstName

FROM dbo.DimEmployee

EmployeeKey ParentEmployeeKey

———– —————– —————————————————————————————————–

1 18 Gilbert Guy

2 7 Brown Kevin

3 14 Tamburello Roberto

4 3 Walters Rob

5 3 Walters Rob

6 267 D’Hers Thierry

A table may be joined to itself. It has a foreign key (ParentEmployeeKey) that points to the primary key (EmployeeKey ) in the same table, in this case we can define self-joins.

 

— self join , employee and related parent

SELECT E.EmployeeKey, E.LastName +
‘ ‘
+ E.FirstName, E.ParentEmployeeKey,

P.LastName +
‘ ‘
+ P.FirstName

FROM dbo.DimEmployee E INNER
JOIN

dbo.DimEmployee P

ON E.ParentEmployeeKey = P.EmployeeKey

ORDER
BY E.EmployeeKey

EmployeeKey ParentEmployeeKey

———– —————————————————————————————————– —————– —————————————————————————————————–

1 Gilbert Guy 18 Brown Jo

2 Brown Kevin 7 Bradley David

3 Tamburello Roberto 14 Duffy Terri

4 Walters Rob 3 Tamburello Roberto

5 Walters Rob 3 Tamburello Roberto

 

N> CTEs possibly provide a more elegant (but more difficult) approach

 

— chef

SELECT e.EmployeeKey, p.ParentEmployeeKey FROM dbo.DimEmployee AS E

RIGHT
OUTER
JOIN dbo.DimEmployee AS P

ON E.EmployeeKey = P.EmployeeKey

WHERE p.ParentEmployeeKey IS
NULL

EmployeeKey ParentEmployeeKey

———– —————–

112 NULL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s