+---------------------------+------------------------------------------+
| artist_name | album_name |
+---------------------------+------------------------------------------+
| New Order | Retro - John McCready FAN |
| New Order | Substance (Disc 2) |
| New Order | Retro - Miranda Sawyer POP |
| New Order | Retro - New Order / Bobby Gillespie LIVE |
| New Order | Power, Corruption & Lies |
| New Order | Substance 1987 (Disc 1) |
| New Order | Brotherhood |
| Nick Cave & The Bad Seeds | Let Love In |
| Miles Davis | Live Around The World |
| Miles Davis | In A Silent Way |
| The Rolling Stones | Exile On Main Street |
| The Stone Roses | Second Coming |
| Kylie Minogue | Light Years |
+---------------------------+------------------------------------------+
13 rows inset sec)
Because
none of the rows are identical, no duplicates are removed using DISTINCT. You can check this by rephrasing the query to omit the
DISTINCT
clause; you’ll get the same output.
To
remove duplicates, MySQL needs to sort the output. If indexes are available that are in the same order as required for the sort—or the data itself is in an order that’s useful—this process has very little overhead. However, for large tables and without an easy way of accessing the data in the right order, sorting can be very slow. You should use
DISTINCT
(and other aggregate functions) with caution on large data sets. If you douse it, you can
check its behavior using the EXPLAIN
statement discussed in Chapter 8.
The GROUP BY ClauseThe
GROUP BYclause sorts data into groups for the purpose of aggregation. It’s similar to
ORDER BY, but it occurs much earlier
in the query process GROUP BYis used to organize the data before other clauses—such as WHERE, ORDER BY, and functions—are applied. In contrast, ORDER BY
is applied last—after the query has been resolved—to reorganize the query output for display.
An example will help
you understand what GROUP BYis used for. Suppose you want to know how many albums we own by each artist. Using the techniques you’ve learned so far, you
could perform an INNER JOINbetween artist and album, and use an
ORDER
BY artist_name clause to organize the artists into an order to make it easy for you to count. Here’s the query that you’d use:
mysql>
SELECT artist_name FROMShare with your friends: