SQL and Database: Nulls AND data?

I have a query that I’m working on…

I’m trying to get a list where I’m excluding a comment, but I want the NULLs. If I just run the query, I get everything. When I add the following condition, not only does the comment I don’t want disappear, but all of the NULLs disappear too. Why?

WHERE SC.Comment != ‘NO COC’

At first, I thought that != was invalid T-SQL, but apparently != and <> are identical. TIL.

Now, that said, NULL is a major pain in the ass. NULL is completely indeterminate - it’s not “equal to” anything, not even itself. Which also means that it’s not “not equal to” anything.

So if you want to exclude those NO COC records but include the NULL records, you need to do this:

WHERE (SC.Comment <> 'NO COC' or SC.Comment IS NULL)

I put it in parens in case you need to add other predicates to the WHERE clause.


Now, for the more hardcore stuff.

Like I said above, NULL is a pain in the ass, and the above construct could really wreak havoc with your execution plan for the query because it makes indexes less useful. I wish I could find the blog post I read about it now, but depending on other factors (you’ll quickly learn that the DBA’s favorite answer to anything is “it depends”), you may get much better performance by breaking it into two different queries. The important part here is that whatever and wherever here have to be identical, and any additional predicates in the WHERE clause have to be copied & pasted as well.

SELECT whatever FROM whatever WHERE SC.Comment <> 'NO COC'
UNION
SELECT whatever FROM whatever WHERE SC.Comment IS NULL

YMMV, of course - you’ll want to try it both ways and look at the I/O statistics & execution plans to see if there’s a significant difference.

That will only return lines where the comment is NULL, though, won’t it? If there are other comments that Nabiki wants to see, that would exclude them as well, I think. (Of course I’m an SQL newbie, so I could be totally wrong).

What about using NOT LIKE? So the statement would now look like:

WHERE SC.Comment NOT LIKE ‘%NO COC%’

I don’t know that that would work any better, but it’s worth a try. :slight_smile:

That is exactly what happened. I only got NULL returns. Unfortunately, the NOT LIKE ‘%NO COC%’ got absolutely nothing back.

Edit: Wait, there were no results due to the dates. I widened the date range and got the other comments, but no NULLS when I used the NOT LIKE.

I wonder if there’s something else in the query that’s messing with it, because I thought for sure that was the way to go (and my hatred for NULL now grows).

What about the UNION version?

The UNION version got me no results. That may be because I did a subquery. My original query looks like this.

SELECT S.Field1
, S.Field2
, SC.Comment

FROM dbo.X AS S WITH (NOLOCK)
LEFT OUTER JOIN dbo.SC AS SC WITH (NOLOCK)
ON
S.field = SC.field

WHERE S.Field3 = 0
AND S.Createdate > DATEADD(d, -2, CURRENT_TIMESTAMP)
AND S.Field4 LIKE (‘11%’)
AND LEN (S.Field4) = 8
AND SC.Comment != ‘%NO COC%’

ORDER BY S.Field1

Try this:

SELECT *
FROM (
	SELECT S.Field1
		,S.Field2
		,SC.Comment
	FROM dbo.X AS S
	LEFT OUTER JOIN dbo.SC AS SC ON S.field = SC.field
	WHERE S.Field3 = 0
		AND S.Createdate > DATEADD(d, - 2, CURRENT_TIMESTAMP)
		AND S.Field4 LIKE ('11%')
		AND LEN(S.Field4) = 8
		AND SC.Comment not like '%NO COC%'
	
	UNION
	
	SELECT S.Field1
		,S.Field2
		,SC.Comment
	FROM dbo.X AS S
	LEFT OUTER JOIN dbo.SC AS SC ON S.field = SC.field
	WHERE S.Field3 = 0
		AND S.Createdate > DATEADD(d, - 2, CURRENT_TIMESTAMP)
		AND S.Field4 LIKE ('11%')
		AND LEN(S.Field4) = 8
		AND SC.Comment IS NULL
	) AS A
ORDER BY A.Field1

EDIT: Revised in light of my stupidly missing the NOT LIKE requirement

SELECT S.Field1
		,S.Field2
		,SC.Comment
	FROM dbo.X AS S
	LEFT OUTER JOIN dbo.SC AS SC ON S.field = SC.field
	WHERE S.Field3 = 0
		AND S.Createdate > DATEADD(d, - 2, CURRENT_TIMESTAMP)
		AND S.Field4 LIKE ('11%')
		AND LEN(S.Field4) = 8
		AND (SC.Comment IS NULL or SC.Comment not like '%NO COC%')
    ORDER BY S.Field1

I can’t help but think that there’s a more optimal way to do this, but I should be doing my monthly SOX reviews right now.

Notice that I removed WITH (NOLOCK). NOLOCK is not a magic turbo button. NOLOCK is, in 99.99999% of cases, the wrong thing to use and very dangerous because you stand a very, very real risk of dirty reads (TL;DR: You’ll get data that exists in uncommitted transactions, and if those transactions are rolled back, your query will return results that don’t actually exist in the table when it’s all over). Your database should probably be set for the READ COMMITTED snapshot isolation level; this will eliminate the need for NOLOCK (in fact, NOLOCK may override this setting) and make your queries a lot safer. Use NOLOCK only when you know it’s 100% OK to do so. See also http://stackoverflow.com/questions/1452996/is-the-nolock-sql-server-hint-bad-practice & http://blogs.msdn.com/b/davidlean/archive/2009/04/06/sql-server-nolock-hint-other-poor-ideas.aspx

The one thing I see wrong is that “SC.Comment != ‘%NO COC%’” just won’t work. Any time you add the “%” mask, you automatically have to use “like”:

SC.Comment not like ‘%NO COC%’

It’s the law. Besides that, I’ll have to play with it some more. I can only re-emphasize Dak’s “pain in the ass” determination.

That appears to have worked. As Brazil mentioned, I got rid of the %'s around the NO COC.

Oh dip. I didn’t even pick up on that, @Brazil. Good catch. I’ve edited my post accordingly.

I was just fortunate to come into it with no preconceptions after Dak had done all the hard work. I have to say that when I first read the “NO COC”, I figured Nabiki was about to pull some SQL-based women’s lib joke on us.

2 Likes

Get your mind out of the gutter! :wink:

OK, now the real question: which version runs better? The one with the OR, or the one with the UNION (assuming identical resultsets)?

I would think the OR, since it only has to query the data once, while the UNION has to query it twice.

Run 'em both, with I/O stats turned on & the actual execution plan - it’s the only way to know which way is faster with your data & indexes. like and NULL can really screw with execution plans, and there are cases where the UNION will get better results.

The wildcard at the beginning of the like is especially troublesome, because it makes any index on that field less useful. If the string you’re searching for starts with NO COC, use likeNO COC%` instead and any indexes that use that field will become much more useful because the index can be searched.