Joins: inner, outer, left, right
In (My)SQL, join is a means for combining records from two tables into a single set which can be either returned as is or used in another join. In order to perform the operation a join has to define the relationship between records in either table, as well as the way it will evaluate the relationship. The relationship itself is created through a set of conditions that are part of the join and usually are put inside ON clause. The rest is determined through a join type, which can either be an inner join or an outer join.
The SQL clauses that set the respective join type in a query are [INNER] JOIN and {LEFT | RIGHT} [OUTER] JOIN. As you can see the actual keywords INNER and OUTER are optional and can be omitted, however outer joins require specifying the direction – either left or right (more on that later).
Examples of queries using joins:
SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.id = 1;
SELECT * FROM users LEFT JOIN files ON files.owner_id = users.id WHERE users.id = 1;
Inner join, outer join
The primary difference between the two basic types (each has several subtypes) is in making the decision whether joining of two rows was successful or not, which essentially determines whether the combined row can be returned or not.
Inner joins require that a row from the first table has a match in the second table based on the join conditions. In means that the first query from the example above will only return any rows if files table contains at least one record where owner_id is 1 (has to be equal to users.id by the join conditions and users.id is filtered in WHERE to accept only that one value). Otherwise it will return no rows at all, even if users contains a valid user record. Assuming there are two users, but only one has any files:
mysql> SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.id = 1;
+----+--------------+---------+----+----------+------------------+
| id | name | enabled | id | owner_id | filename |
+----+--------------+---------+----+----------+------------------+
| 1 | Albin Kolano | 1 | 1 | 1 | audit report.doc |
+----+--------------+---------+----+----------+------------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.id = 2;
Empty set (0.01 sec)
Outer joins, on the other hand, consider a join successful even if no records from the second table meet the join conditions (i.e. whether there are any matches or not). In such case outer join sets all values in the missing columns to NULL. The second query from the example will return rows whenever there are matches in users and regardless of the contents of files table.
mysql> SELECT * FROM users LEFT JOIN files ON files.owner_id = users.id WHERE users.id = 1; +----+--------------+---------+------+----------+------------------+ | id | name | enabled | id | owner_id | filename | +----+--------------+---------+------+----------+------------------+ | 1 | Albin Kolano | 1 | 1 | 1 | audit report.doc | +----+--------------+---------+------+----------+------------------+ 1 row in set (0.00 sec) mysql> SELECT * FROM users LEFT JOIN files ON files.owner_id = users.id WHERE users.id = 2; +----+-------------------+---------+------+----------+----------+ | id | name | enabled | id | owner_id | filename | +----+-------------------+---------+------+----------+----------+ | 2 | Nadzieja Surowiec | 1 | NULL | NULL | NULL | +----+-------------------+---------+------+----------+----------+ 1 row in set (0.00 sec)
Left join, right join
Unlike inner joins, outer joins require that the join direction is specified. Inner join is a symmetrical and bi-directional relationship, which means A JOIN B produces the same result as B JOIN A. That is not true for outer joins, because they accept when for a record in A there is no matching record in B and in such case the reverse operation is impossible as it would have to start with the non-existing record in B. This is the reason why setting the direction is necessary. A LEFT JOIN B finds matches for rows from table A in table B, while A RIGHT JOIN B finds matches for records from B in A.
In practice there is very little or even no real purpose for using RIGHT JOIN and in majority of cases everyone just sticks to using LEFT JOIN whenever they need outer join.
When does the join type matter?
Choosing the appropriate type depends on the logic you are trying to implement.
You have to use inner join when mandatory pieces of information are located in both tables and partial information is considered incomplete or even useless. The case of this could be listing user’s files based on the earlier example:
mysql> SELECT *
-> FROM users
-> JOIN files
-> ON files.owner_id = users.id
-> WHERE users.name = 'Nadzieja Surowiec'
-> AND users.enabled = 1;
Empty set (0.00 sec)
The query finds the user’s record in users table and verifies that they are allowed to use the service through the value of users.enabled column and then searches for their files in files table. If there are no matches in either table, the query does not return any result, which is the correct behavior. If outer join was used in this case, a useless partial result could be returned or even incorrect result:
mysql> SELECT *
-> FROM users
-> LEFT JOIN files
-> ON files.owner_id = users.id
-> WHERE users.name = 'Nadzieja Surowiec'
-> AND users.enabled = 1;
+----+-------------------+---------+------+----------+----------+
| id | name | enabled | id | owner_id | filename |
+----+-------------------+---------+------+----------+----------+
| 2 | Nadzieja Surowiec | 1 | NULL | NULL | NULL |
+----+-------------------+---------+------+----------+----------+
1 row in set (0.00 sec)
mysql> SELECT COUNT(1)
-> FROM users
-> LEFT JOIN files
-> ON files.owner_id = users.id
-> WHERE users.name = 'Nadzieja Surowiec'
-> AND users.enabled = 1;
+----------+
| COUNT(1) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)
The application relying on such queries not only would not be able to make anything out of such file information where file data is all set to NULL values, but also it would have to include additional and in fact redundant logic to filter out such results. The row count in this case is correct, as the query returned a single row, but it does not represent how many files the user has, so it is not a valid information that the application could use.
Outer join must be used to perform a join with a table, which holds information that is only optional for the result.
In our example we are working with a query that lists user’s files and we already established that the join between users and files has to be inner join. But let’s give our users the opportunity to choose a custom icon for any file if they want to. The information could be kept in a separate table called file_icon.
Now, for each listed file we also want to see if user has set a custom icon for that file and return the icon name if they have set it. The icon information is entirely optional, so we want the query to return rows regardless of whether there is an entry for the given file in file_icon or not. Therefore we have to use outer join for this particular task.
mysql> SELECT *
-> FROM users
-> JOIN files
-> ON files.owner_id = users.id
-> LEFT JOIN file_icon
-> ON file_icon.file_id = files.id
-> WHERE users.name = 'Albin Kolano'
-> AND users.enabled = 1;
+----+--------------+---------+----+----------+------------------+---------+------------------+
| id | name | enabled | id | owner_id | filename | file_id | icon_image |
+----+--------------+---------+----+----------+------------------+---------+------------------+
| 1 | Albin Kolano | 1 | 1 | 1 | audit report.doc | 1 | MS-Word-Icon.png |
| 1 | Albin Kolano | 1 | 2 | 1 | stats-201104.xls | NULL | NULL |
+----+--------------+---------+----+----------+------------------+---------+------------------+
The outer join allowed us to grab the complete list of user’s files and along with their icons if any were set. If we used inner join instead, the result would be missing the record of stats-201104.xls file.
To Be Continued
More on dealing with practical problems when designing join queries will be covered in a separate post.
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Can MySQL use primary key values from a secondary index?
In the article about the role of a primary key, I mentioned that a secondary index in an InnoDB table consists not only of the values of its member columns, but also values of the table’s primary key are concatenated to the index. I.e. the primary key contents is part of every other index.
Assuming the following table structure:
CREATE TABLE `bets` ( `id` int(10) unsigned NOT NULL, `user_id` int(10) unsigned NOT NULL, `game_id` int(10) unsigned NOT NULL, ... PRIMARY KEY (`id`), KEY `user_id` (`user_id`) ) ENGINE=InnoDB
Here is the visualization:

If MySQL could use in queries these implicitly added values, it would maybe allow to save some space on listing the primary key columns at the end of an index explicitly. Let’s check various cases.
Row filtering
mysql> EXPLAIN
-> SELECT *
-> FROM bets
-> JOIN games
-> ON games.id = bets.id
-> WHERE bets.user_id = 111
-> AND bets.id > 3476G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: bets
type: ref
possible_keys: user_id
key: user_id
key_len: 4
ref: const
rows: 22
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: games
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: game.bets.id
rows: 1
Extra:
Both key_len and ref fields indicate that only one four bytes long column is used from the user_id index. MySQL cannot use the primary key values in a secondary index for filtering in WHERE clause.
Sorting with ORDER BY
mysql> EXPLAIN
-> SELECT *
-> FROM bets
-> JOIN games
-> ON games.id = bets.game_id
-> WHERE bets.user_id = 111
-> ORDER BY bets.id DESCG
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: bets
type: ref
possible_keys: user_id
key: user_id
key_len: 4
ref: const
rows: 22
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: games
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: game.bets.game_id
rows: 1
Extra:
Extra only returns Using where, but there is no Using filesort. It means ORDER BY will be optimized using the hidden primary key data from the secondary index.
Aggregating with GROUP BY
mysql> EXPLAIN
-> SELECT *
-> FROM bets
-> JOIN games
-> ON games.id = bets.game_id
-> WHERE bets.user_id = 111
-> GROUP BY bets.idG
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: bets
type: ref
possible_keys: user_id
key: user_id
key_len: 4
ref: const
rows: 22
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: games
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: game.bets.game_id
rows: 1
Extra:
Also in this case Extra neither shows Using filesort nor Using temporary, which would indicate no index is used for grouping. Therefore MySQL can optimize GROUP BY on the concatenated primary key values.
Covering index
mysql> EXPLAIN
-> SELECT bets.id
-> FROM bets
-> WHERE bets.user_id = 111G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: bets
type: ref
possible_keys: user_id
key: user_id
key_len: 4
ref: const
rows: 22
Extra: Using index
The query execution plan confirms through Using index that it will only need index contents to return result. MySQL can read and return the hidden primary key values to avoid the additional data lookup.
Summary
In InnoDB tables each entry of a secondary index always contains a copy of the corresponding primary key value. These values may in some cases be used to the benefit of query execution plan:
- for ORDER BY on the primary key column(s)
- for GROUP BY on the primary key column(s)
- when returning the primary key column(s) values in the SELECT list
MySQL cannot use them, however, to optimize filtering in WHERE.
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Meet the MySQL Experts Podcast: MySQL Replication Global Transaction Identifiers & HA Utilities
0
0
1
223
1273
Homework
10
2
1494
14.0
Normal
0
false
false
false
EN-US
JA
X-NONE
In the latest episode
of our “Meet The MySQL Experts” podcast, Luis Soares,
Engineering Manager of MySQL Replication discusses the new Global Transaction
Identifiers (GTIDs) that are part of the latest MySQL 5.6 Development Release. We are also joined by Chuck Bell who
discusses how the new MySQL HA utilities use GTIDs to create a self-healing
replication topology.
In the podcast, we cover how GTIDs and the HA utilities are implemented, how they are
configured and considerations for their use.
You can also learn
more from Luis’ blog on GTIDs in MySQL 5.6 and Chuck’s blog on the HA utilities.
Of course, GTIDs are
just one of the major new features of MySQL replication. For a complete
overview, take a look at our DevZone article: MySQL 5.6 Replication – Enabling the Next Generation of Web & Cloud
Services.
You can try out MySQL
5.6 and GTIDs by downloading the Development Release (select Development Release tab)
Enjoy the GTID podcast and let us know what topics you would like covered in future podcasts! Also check out the library of Meet the MySQL Experts podcasts
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
MySQL Handshake and Encryption
Interestingly, I have given the presentation on MySQL and Security at least 4 times in the past 6 weeks* and it was only last night, with the sharp minds at Baron’s Central Virginia MySQL Meetup Group (sadly Baron was not there!), that someone asked about when encryption happens in the MySQL handshake.
We had been talking about how MySQL authenticates users, and how if there are no ACL’s set for a given host, MySQL will reject connections from that host – even “telnet host 3306″ will be refused – and that’s when a clever audience member asked where in the handshake process encryption started. Is it before the username is sent? Before the password is sent? Does it encrypt all traffic, even the handshake traffic?
I think that’s an excellent question, and I know there’s a few sharp minds out there who probably know the answer….otherwise I will research the answer this weekend, when I’m back home in Boston.
* Effective MySQL User Group, as part of a tutorial for Percona Live: MySQL Conference and Expo , at the Professional IT Community Conference last week, and last night at the Central Virginia MySQL Meetup Group
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Bug.mysql.com and Contributions!
Oracle enhanced the bugs.mysql.com site to provide a better experience for users to submit contributions !
A new ‘Contributions’ tab has been added to the bugs.mysql.com user interface. This tab will allow users to have a defined space for their contributions. An Oracle Contributor Agreement (OCA) will still be required for all contributions. If needed, the OCA FAQ is posted here.
Please take advantage of this new feature when you help support and enhance MySQL !
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Connector/Net 6.4
Connector/Net 6.4 (6.4.5 GA, published on Friday, 18 May 2012)
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Log Buffer #272, A Carnival of the Vanities for DBAs
It is evident and beyond doubt now that the new media technologies like Twitter and Facebook are not going to wipe-out the blogs, rather they are complimenting each other very nicely and it seems they were made for each other. This Log Buffer Edition enhances this match, and presents you Log Buffer #272. Oracle: It [...]
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Upcoming conferences to learn more about MySQL Cluster & Replication
There are a couple of conferences coming up where you can expect to learn about the latest developments in MySQL Cluster and MySQL Replication (as well as what else is happening in MySQL 5.6).
The first is the Oracle MySQL Innovation Day which is being held in Oracle HQ at Redwood Shores. This is an all-day event on 5th June – unfortunately I won’t be able to attend this one but there will be lots of great Cluster and replication sessions. If you can’t make it out to California then there will be a live Webcast. You can register here to attend in person or join the webcast.
The second is MySQL Connect – this runs the weekend before Oracle OpenWorld in San Francisco; it’s not until 29th September but it’s worth registering now to get the early bird pricing and save $500 (end 13th July). There are lots of great sessions lined up both from the MySQL experts within Oracle and users and community members.
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
Database is in Transition. try the statement later: SQL Server
I was trying to take the database offline to perform some maintenance on one of our QA server.
We have got a strange error “Database is in Transition. try the statement later Error 952“
Even i was not able to run the sp_who and sp_who2 also as it was also giving the same error.
Resolution: What i have tried?
I closed the SSMS (SQL Server Management Studio) and restarted it again.
Then to confirm and check for any blocking query, i ran sp_who and sp_who2 and it ran successfully.
Then to make Database offline/online i have used the below command and it worked like a charm :)
GO
ALTER DATABASE <db_name>
SET OFFLINE WITH ROLLBACK IMMEDIATE
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/
How having many tables affects MySQL memory usage?
You could say: what could be the reason for having really big number of tables? Just design the application properly! It’s not always that easy. And this post isn’t really about arguing whether having many tables is good or not, it’s about what happens in terms of memory usage if you already reached that point.
Btw what do I mean by *many*? From my experience it’s tens of thousends or even millions rather than hundreds.
The inspiration for me to write this post was strong desire to try out the latest declared improvements in that area announced to be done in MySQL 5.6. _1
What I did was a very simple test where I loaded sql dump of databases and tables definitions only into different versions of MySQL. During the load time I was monitoring memory usage of mysqld process (RSS value from ps aux) as a function of tables number. Additionaly I compared disk usage in case you choose single innodb table space versus file per table, also in relation to number of tables.
Settings I used for test were all default and I chose lowest possible innodb buffer pool:
innodb_buffer_pool_size = 8M
except when I started to test Percona versions I realized that the lowest allowed value there was 32MB, this warning appeared in error log after the service start:
“[Warning] option ‘innodb-buffer-pool-size’: signed value 8388608 adjusted to 33554432″
So both Percona versions I tested had 32MB and all others had 8MB but I think it wasn’t that important in this particular test.
Additionally for Percona versions I used the
innodb_dict_size_limit
variable introduced already in 2009 _2.
The sample table set that I used has around 50 different tables with various number of fields and indexes.
Here are the results of creating more than a million tables:
I wonder what is the difference between the patch that Percona did few years ago (and which was available starting from 5.0.77-b13 of Percona Server) and latest fix available in MySQL 5.6.x branch. Any way It’s really nice to see it among all other great improvements that are happening in 5.6.x development line.
OK, now let’s see how the disk usage is affected in this simple test by just creating empty tables in two scenarios: single InnoDB ibdata file and when innodb_file_per_table option is set. In this case I observed no differences between various MySQL versions, so you can see only one here.
Related links:
http://www.mysqlperformanceblog.com/2010/05/06/how-much-memory-innodb-dictionary-can-take/
http://fromdual.com/how-mysql-behaves-with-many-schemata-tables-and-partitions
http://www.percona.com/doc/percona-server/5.5/management/innodb_dict_size_limit.html?id=percona-server:features:innodb_dict_size_limit
http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-performance-table-cache
http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage-continued/
http://ma.tt/2006/03/wordpress-and-lyceum/
References:
- http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage/
- http://www.mysqlperformanceblog.com/2009/02/11/limiting-innodb-data-dictionary/
PlanetMySQL Voting:
Vote UP /
Vote DOWN
Read more at: http://planet.mysql.com/

