?

Log in

No account? Create an account
SQL Tip of the Day - The Cover Story
October 2013
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
 
Thu, May. 26th, 2005 11:57 am
SQL Tip of the Day

Doing a NOT IN to find records without equivalents is considerably faster than doing an OUTER JOIN combined with a IS NULL where clause (assuming proper indexing). Thus to find books without authors:
SELECT title
FROM titles LEFT OUTER JOIN authors
   ON titles.authorid = authors.authorid
WHERE authorname IS NULL

is slower than
SELECT title, authorname
FROM titles
WHERE authorid NOT IN (SELECT authorid FROM authors)

Later!

Tags:
Current Mood: cheerful cheerful

11CommentReplyShare

purly
purly
...
Thu, May. 26th, 2005 05:23 pm (UTC)

I think that would depend on the size of the data set on the left and right. ;)


ReplyThread
sirroxton
sirroxton
Adam Augusta
Thu, May. 26th, 2005 05:30 pm (UTC)

That's weird.

I would have expected the performance to be comparable. That is, assuming you changed the last line of the first query from:
WHERE authorname IS NULL
To:
WHERE authors.authorid IS NULL

I mean, either way, you're just lining up a couple of indexes, right? :)


ReplyThread
bronzite
bronzite
Robert Bronzite
Thu, May. 26th, 2005 05:55 pm (UTC)

I mean, either way, you're just lining up a couple of indexes, right? :)

Actually not, although it would appear that way at first glance. Because this is an outer join, the connection requires an index scan for each row of the left table. Now, on the set I was testing when I discovered this, there was an 800,000 row table being joined to a 191 row table, and due to the cardinality difference, it generated nearly 60 million rows of searching. The difference between the runs times was 372 seconds for the first query vs. less than one for the second.


ReplyThread Parent
sirroxton
sirroxton
Adam Augusta
Thu, May. 26th, 2005 06:29 pm (UTC)

An index scan for each row? *boggle* My DBMS-fu is weak.


ReplyThread Parent
bronzite
bronzite
Robert Bronzite
Thu, May. 26th, 2005 06:38 pm (UTC)

Well, it still needs to conduct the join somehow. It could cartesian the two tables together and do the join that way, but dear god, the overhead. It would defeat the point of having the indexes there. An Index Scan is expensive, but not nearly so as a table scan.


ReplyThread Parent
jalawingedone
jalawingedone
Thu, May. 26th, 2005 06:38 pm (UTC)

Well, adding you as a friend has totally kicked the geekness of my lj up another ten notches.


ReplyThread
bronzite
bronzite
Robert Bronzite
Thu, May. 26th, 2005 06:52 pm (UTC)

Yeah, that'll happen, but only on technical fronts. Don't expect too much Tolkien-y goodness springing forth from here :-).


ReplyThread Parent
jalawingedone
jalawingedone
Thu, May. 26th, 2005 07:20 pm (UTC)

Well, it was really bad because Jeremiah came over to my house monday for my birthday and my younger brother needed help on science homework. It was calculator programing, and they were talking code. I'm surrounded by technical geekiness, which is not my strongest suit.


ReplyThread Parent
bronzite
bronzite
Robert Bronzite
Thu, May. 26th, 2005 07:28 pm (UTC)

Dating a CS can be a harrowing experience for those not indoctrinated, especially when they start to get into groups, just ask Kate. You'll get used to it.

Or you'll enter a failure mode.

Whatever works.


ReplyThread Parent
ultimatepsi
ultimatepsi
Kate Nineteen
Thu, May. 26th, 2005 10:21 pm (UTC)

Actually, don't ask me. I prefer CS majors.

And eventually, one learns which speech libraries you need to include.


ReplyThread Parent
petercooperjr
petercooperjr
Peter Cooper Jr.
Thu, Jun. 2nd, 2005 11:57 pm (UTC)

I hope you're talking about Oracle... In MS SQL Server, the following command (which is how I'd solve the problem) is definitely fastest. I think it has the same execution plan as the first thing you had, but I'd need to test it to make sure.

SELECT title
FROM titles
WHERE NOT EXISTS (
SELECT 1
FROM authors
WHERE authors.authorid = titles.authorid
)


Basically, it's a matter of just asking SQL what you really mean.


ReplyThread