SQL distinct: description, examples, properties

Often, when retrieving information from tables with SQL, the user receives redundant data: rows that are exact duplicates of one another. To avoid this, use the DISTINCT keyword in the SELECT clause. In this article we look at examples of using this keyword, as well as situations in which it is better not to use it.

Before turning to specific examples, let us create the two tables we will need in the database.

Preparing tables

Imagine that our database stores information about wallpaper in two tables. The first is the Oboi table (wallpaper) with the fields id (unique identifier), type (type of wallpaper: paper, vinyl, etc.), color, struct and price. The second is the Ostatki table (remaining stock) with the fields id_oboi (a reference to the unique identifier in the Oboi table) and count (the number of rolls in the warehouse).

Let us fill the tables with data. We add nine records to the wallpaper table:

Oboi

Id  Type                Color       Struct    Price
1   Paper               Multicolor  Embossed  56.9
2   Double-layer paper  Beige       Smooth    114.8
3   Vinyl               Orange      Embossed  504
4   Non-woven fabrics   Beige       Embossed  1020.9
5   Double-layer paper  Beige       Smooth    150.6
6   Paper               Multicolor  Smooth    95.4
7   Vinyl               Brown       Smooth    372
8   Non-woven fabrics   White       Embossed  980.1
9   Fabric              Pink        Smooth    1166.5

The stock table also contains nine records:

Ostatki

Id_oboi  Count
1        8
2        12
3        24
4        9
5        16
6        7
7        24
8        32
9        11
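For readers who want to follow along, here is a minimal sketch that loads both tables into an in-memory SQLite database using Python's standard sqlite3 module. The choice of SQLite, the column types (INTEGER, TEXT, REAL), the primary key and the helper name build_db are assumptions made for illustration; the article itself does not say which DBMS or schema it uses.

```python
import sqlite3

# Rows copied from the Oboi and Ostatki tables above.
OBOI_ROWS = [
    (1, "Paper", "Multicolor", "Embossed", 56.9),
    (2, "Double-layer paper", "Beige", "Smooth", 114.8),
    (3, "Vinyl", "Orange", "Embossed", 504),
    (4, "Non-woven fabrics", "Beige", "Embossed", 1020.9),
    (5, "Double-layer paper", "Beige", "Smooth", 150.6),
    (6, "Paper", "Multicolor", "Smooth", 95.4),
    (7, "Vinyl", "Brown", "Smooth", 372),
    (8, "Non-woven fabrics", "White", "Embossed", 980.1),
    (9, "Fabric", "Pink", "Smooth", 1166.5),
]
OSTATKI_ROWS = [(1, 8), (2, 12), (3, 24), (4, 9), (5, 16),
                (6, 7), (7, 24), (8, 32), (9, 11)]


def build_db():
    """Return an in-memory SQLite database holding both tables (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the article only names the fields.
    conn.execute(
        "CREATE TABLE Oboi ("
        "id INTEGER PRIMARY KEY, type TEXT, color TEXT, struct TEXT, price REAL)"
    )
    conn.execute(
        "CREATE TABLE Ostatki ("
        "id_oboi INTEGER REFERENCES Oboi(id), count INTEGER)"
    )
    conn.executemany("INSERT INTO Oboi VALUES (?, ?, ?, ?, ?)", OBOI_ROWS)
    conn.executemany("INSERT INTO Ostatki VALUES (?, ?)", OSTATKI_ROWS)
    conn.commit()
    return conn


conn = build_db()
oboi_total = conn.execute("SELECT COUNT(*) FROM Oboi").fetchone()[0]
ostatki_total = conn.execute("SELECT COUNT(*) FROM Ostatki").fetchone()[0]
print(oboi_total, ostatki_total)  # prints: 9 9
```

Other DBMSs may need different type names, and some may require quoting column names such as count or type.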

Let's start by describing how to use distinct in SQL.

Placing DISTINCT in the SELECT clause

The DISTINCT keyword is placed immediately after the SELECT keyword. It applies to all the columns listed in the SELECT clause at once, because it removes from the query result the rows that are identical in every selected column. It is therefore enough to write "SELECT DISTINCT" once at the start of the query. The only exception is DISTINCT inside aggregate functions, which is covered a little later.

Keep in mind that most DBMSs will not accept a query like this:

SELECT DISTINCT Ostatki.count, DISTINCT Oboi.*
FROM Oboi
INNER JOIN Ostatki ON Oboi.id = Ostatki.id_oboi

The same is true whether the keyword is specified several times, as here, or only once but before the second, third or any later column in the list. In both cases you will get a syntax error.
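The rejection described above can be reproduced with a short script. This is only a sketch against SQLite (an assumed DBMS, accessed through Python's sqlite3 module) with empty, simplified versions of the two tables; other systems word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, empty tables: only the parser's reaction matters here.
conn.execute("CREATE TABLE Oboi (id INTEGER PRIMARY KEY, type TEXT, color TEXT)")
conn.execute("CREATE TABLE Ostatki (id_oboi INTEGER, count INTEGER)")

bad_query = """
    SELECT DISTINCT Ostatki.count, DISTINCT Oboi.*
    FROM Oboi
    INNER JOIN Ostatki ON Oboi.id = Ostatki.id_oboi
"""
try:
    conn.execute(bad_query)
    rejected = False
except sqlite3.OperationalError as exc:
    # SQLite reports syntax errors as OperationalError.
    rejected = True
    print("rejected:", exc)

# The same query with a single DISTINCT right after SELECT is accepted.
good_query = """
    SELECT DISTINCT Ostatki.count, Oboi.*
    FROM Oboi
    INNER JOIN Ostatki ON Oboi.id = Ostatki.id_oboi
"""
rows = conn.execute(good_query).fetchall()
print(rejected, len(rows))
```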

Using DISTINCT in standard queries

If the tables are properly designed and filled, a single table will not contain completely identical rows. Running "SELECT DISTINCT *" against a single table is therefore rarely useful.

Suppose we need to find out which types of wallpaper we have and, for convenience, sort the result by type:

SELECT Oboi.type
FROM Oboi
ORDER BY type

And we get the result:

Type
Double-layer paper
Double-layer paper
Fabric
Non-woven fabrics
Non-woven fabrics
Paper
Paper
Vinyl
Vinyl

As you can see, the result contains duplicate rows. If we add DISTINCT to the SELECT clause:

SELECT DISTINCT Oboi.type
FROM Oboi
ORDER BY type

Then we get the result without repetitions:

Type
Double-layer paper
Fabric
Non-woven fabrics
Paper
Vinyl
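The two queries can be compared side by side with a small script. It is a sketch that assumes SQLite via Python's sqlite3 module and a cut-down Oboi table holding only the id and type columns; the type values are the nine from the wallpaper table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Oboi (id INTEGER PRIMARY KEY, type TEXT)")
types = ["Paper", "Double-layer paper", "Vinyl", "Non-woven fabrics",
         "Double-layer paper", "Paper", "Vinyl", "Non-woven fabrics", "Fabric"]
conn.executemany("INSERT INTO Oboi (type) VALUES (?)", [(t,) for t in types])

# Without DISTINCT every row is returned, duplicates included.
all_types = [row[0] for row in
             conn.execute("SELECT Oboi.type FROM Oboi ORDER BY type")]
# With DISTINCT each type appears once.
unique_types = [row[0] for row in
                conn.execute("SELECT DISTINCT Oboi.type FROM Oboi ORDER BY type")]

print(len(all_types), len(unique_types))  # prints: 9 5
```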

Thus, if the data was entered into the tables correctly, we can tell a customer who calls or asks in person straight away that the store has no liquid, fiberglass or acrylic wallpaper. Given that a store's assortment is usually not limited to a hundred kinds of wallpaper, looking through a list of non-unique types would be quite laborious.

Using DISTINCT inside aggregate functions

DISTINCT can be used with any aggregate function. For MIN and MAX, however, it has no effect, and when calculating a sum or an average it is hard to imagine a situation in which repeated values should not be counted.

Suppose we want to know how full our warehouse is, so we run a query that calculates the total number of rolls in stock:

SELECT SUM(Ostatki.count)
FROM Ostatki

The query returns 143. If we change it to:

SELECT SUM(DISTINCT Ostatki.count)
FROM Ostatki

Then we get only 119, because the wallpapers with identifiers 3 and 7 are in stock in the same quantity (24 rolls each), so that value is added only once. This answer is obviously wrong.
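The arithmetic is easy to verify. The sketch below assumes SQLite via Python's sqlite3 module and loads only the Ostatki table; it also runs MIN and MAX with and without DISTINCT to show that the keyword makes no difference for them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Ostatki (id_oboi INTEGER, count INTEGER)")
conn.executemany(
    "INSERT INTO Ostatki VALUES (?, ?)",
    [(1, 8), (2, 12), (3, 24), (4, 9), (5, 16),
     (6, 7), (7, 24), (8, 32), (9, 11)],
)

# Every row contributes to the plain sum.
total = conn.execute("SELECT SUM(Ostatki.count) FROM Ostatki").fetchone()[0]
# With DISTINCT the repeated value 24 is added only once.
distinct_total = conn.execute(
    "SELECT SUM(DISTINCT Ostatki.count) FROM Ostatki"
).fetchone()[0]
print(total, distinct_total)  # prints: 143 119

# MIN and MAX are unaffected by DISTINCT.
plain = conn.execute(
    "SELECT MIN(Ostatki.count), MAX(Ostatki.count) FROM Ostatki"
).fetchone()
deduplicated = conn.execute(
    "SELECT MIN(DISTINCT Ostatki.count), MAX(DISTINCT Ostatki.count) FROM Ostatki"
).fetchone()
print(plain == deduplicated)  # prints: True
```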

Most often, DISTINCT is used with the COUNT function. For example, we can easily find out how many unique types of wallpaper we have:

SELECT COUNT(DISTINCT Oboi.type)
FROM Oboi

The result is 5: ordinary and double-layer paper, vinyl, fabric and non-woven. Everyone has surely seen an advertisement like "Only we have more than 20 different kinds of wallpaper!", which means not that the store has a couple of dozen rolls in total, but that it stocks wallpaper of many different modern types.

Interestingly, a single query can contain several COUNT functions, with or without DISTINCT. This is the only situation in which DISTINCT can appear in a SELECT clause more than once.
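As an illustration of several COUNT calls in one query, here is a sketch that assumes SQLite via Python's sqlite3 module and an Oboi table reduced to the id, type and color columns. The combined query is an example constructed for this purpose, not one taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Oboi (id INTEGER PRIMARY KEY, type TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO Oboi (type, color) VALUES (?, ?)",
    [("Paper", "Multicolor"), ("Double-layer paper", "Beige"),
     ("Vinyl", "Orange"), ("Non-woven fabrics", "Beige"),
     ("Double-layer paper", "Beige"), ("Paper", "Multicolor"),
     ("Vinyl", "Brown"), ("Non-woven fabrics", "White"),
     ("Fabric", "Pink")],
)

# DISTINCT appears twice here, each time inside its own COUNT.
row = conn.execute(
    "SELECT COUNT(Oboi.type), "
    "COUNT(DISTINCT Oboi.type), "
    "COUNT(DISTINCT Oboi.color) "
    "FROM Oboi"
).fetchone()
items, unique_types, unique_colors = row
print(items, unique_types, unique_colors)  # prints: 9 5 6
```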

When to avoid DISTINCT

DISTINCT should be avoided in two cases:

  1. You are selecting from tables and are sure that the values in each are unique. In this case DISTINCT is pointless and only puts extra load on the server or the client (depending on the DBMS).
  2. You are afraid of losing data you need. Let us explain.

Suppose your boss asks you for a list of the wallpaper you have, with only two columns: type and color. Out of habit, you add DISTINCT:

SELECT DISTINCT Oboi.type, Oboi.color
FROM Oboi
ORDER BY Oboi.type

And you lose some data:

Type                Color
Double-layer paper  Beige
Fabric              Pink
Non-woven fabrics   Beige
Non-woven fabrics   White
Paper               Multicolor
Vinyl               Brown
Vinyl               Orange

It may look as though we have only one item of each kind of paper wallpaper (ordinary and double-layer), although even our small table holds two items of each (the result without DISTINCT):

Type                Color
Double-layer paper  Beige
Double-layer paper  Beige
Fabric              Pink
Non-woven fabrics   White
Non-woven fabrics   Beige
Paper               Multicolor
Paper               Multicolor
Vinyl               Brown
Vinyl               Orange

So, as with any part of a query, DISTINCT must be used with care, and whether to apply it depends on the task at hand.
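To see exactly which rows the type and color query collapses, the following sketch compares the two result sets. It assumes SQLite via Python's sqlite3 module and an Oboi table reduced to the id, type and color columns; the use of collections.Counter to find repeated pairs is a convenience chosen for this example.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Oboi (id INTEGER PRIMARY KEY, type TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO Oboi (type, color) VALUES (?, ?)",
    [("Paper", "Multicolor"), ("Double-layer paper", "Beige"),
     ("Vinyl", "Orange"), ("Non-woven fabrics", "Beige"),
     ("Double-layer paper", "Beige"), ("Paper", "Multicolor"),
     ("Vinyl", "Brown"), ("Non-woven fabrics", "White"),
     ("Fabric", "Pink")],
)

all_rows = conn.execute(
    "SELECT Oboi.type, Oboi.color FROM Oboi ORDER BY Oboi.type"
).fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT Oboi.type, Oboi.color FROM Oboi ORDER BY Oboi.type"
).fetchall()
print(len(all_rows), len(distinct_rows))  # prints: 9 7

# Pairs that occur more than once are the ones DISTINCT folds into one row.
collapsed = {pair: n for pair, n in Counter(all_rows).items() if n > 1}
print(collapsed)
```

The two pairs reported in collapsed are the paper and double-layer paper items, each of which exists twice in the table with different textures and prices.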

The alternative to DISTINCT

The opposite of DISTINCT is the ALL keyword, which keeps duplicate rows. But since the DBMS returns all rows by default, ALL is more of an explicit marker than a keyword that changes anything. This covers when DISTINCT is used in SQL and when applying it is advisable. As it turns out, even such a simple keyword carries a very real risk of losing data and drawing inaccurate conclusions.
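The point that ALL only spells out the default behaviour can be checked with a short script. This sketch assumes SQLite via Python's sqlite3 module and an Oboi table reduced to the id and type columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Oboi (id INTEGER PRIMARY KEY, type TEXT)")
types = ["Paper", "Double-layer paper", "Vinyl", "Non-woven fabrics",
         "Double-layer paper", "Paper", "Vinyl", "Non-woven fabrics", "Fabric"]
conn.executemany("INSERT INTO Oboi (type) VALUES (?)", [(t,) for t in types])

# ORDER BY id keeps the row order deterministic for the comparison.
with_all = conn.execute("SELECT ALL type FROM Oboi ORDER BY id").fetchall()
default = conn.execute("SELECT type FROM Oboi ORDER BY id").fetchall()
print(with_all == default, len(with_all))  # prints: True 9
```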


Copyright © 2018 en.birmiss.com. Theme powered by WordPress.