ANSI_NULLS: When indexes break a query

How often does adding an index in SQL Server break something? Let's say it's under normal operating conditions: no corruption, and no unique or other constraints being added.

As a consultant with 15+ years of experience, it's just not something I've seen very often. It's so rare that even "seasoned pros" like yours truly can run into it and be surprised.

What I'm talking about here is the conflict between the ANSI_NULLS setting and certain types of indexes that, when combined, causes a query to simply error out. The error message is actually very clear and descriptive. It's even documented, which would be great, if I had thought I needed to check there first!

Microsoft has a good primer here on the ANSI_NULLS setting and its effect on Boolean logic. Several people have written detailed explanations elsewhere, but please just try to keep ANSI_NULLS ON for any new work. This story involves some legacy code that has been around for a long time and whose author is now unknown.

The Error:
DELETE failed because the following SET options have incorrect settings: ‘ANSI_NULLS’. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.

In this case it was a delete statement in a stored procedure that started to fail and I had just added an index on a computed column.

The Cause:
That error message pretty much said it all. A new index was added to a computed column on a table, and this seemingly unrelated delete statement in a stored procedure started to fail. Which led me to question: where exactly was ANSI_NULLS being set incorrectly (to OFF)?

The uses_ansi_nulls flag can be set in various places, such as on the table itself, on stored procedures, and on various other objects.
In this case, the table had uses_ansi_nulls set to 1, but the stored procedure was set to 0.

It's important to note here that uses_ansi_nulls is not part of the stored procedure's definition, so you won't find it after the CREATE PROC. It's part of the schema object and can be viewed through system views such as sys.sql_modules and sys.tables. It is set when the object is created or altered, based on the session setting at that time, which can be changed explicitly via SET ANSI_NULLS. Most people will likely just roll with the defaults that SSMS or Azure Data Studio provides and not even realize this option is there.
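To see where the flag landed, here is a minimal pair of queries against those views (using the table and procedure names from the repro further down):

-- 1 = created with ANSI_NULLS ON, 0 = created with ANSI_NULLS OFF
SELECT name, uses_ansi_nulls
FROM sys.tables
WHERE name = 'Person'

SELECT OBJECT_NAME(object_id) AS [object], uses_ansi_nulls
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.PopulatePerson')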

The fix:
In this case, I had to remove the index until I can go back and test the stored procedure altered with ANSI_NULLS ON. Sure, I could have just run an ALTER PROC with the same definition under the correct session options, but that carries the risk of changing the semantics of code I hadn't reviewed yet. I could imagine a scenario where the original developer had not used "IS NULL" anywhere in the WHERE clauses, and my simple change would potentially change which data gets modified. Not something I wanted to do right now!

The lesson learned:
Check the uses_ansi_nulls setting for any objects dependent on the table where I'm about to add the index. This won't catch ad-hoc statements, but it's a start. Beyond that, have a test environment and a test plan!
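One way to do that check (a rough sketch, not a polished audit, again using the repro's table name) is to ask sys.dm_sql_referencing_entities which modules reference the table, and join to sys.sql_modules for the flag:

-- any row returned is a module created with ANSI_NULLS OFF that
-- references dbo.Person and could start failing once the index exists
SELECT
	re.referencing_schema_name,
	re.referencing_entity_name,
	m.uses_ansi_nulls
FROM sys.dm_sql_referencing_entities('dbo.Person', 'OBJECT') AS re
JOIN sys.sql_modules AS m
	ON m.object_id = re.referencing_id
WHERE m.uses_ansi_nulls = 0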

The repro:
Here's a code snippet that shows how to reproduce this error using a trivial table and computed column. Keep in mind that this is just the computed column scenario. The filtered index scenario is just as likely (I sketch that variant right after the repro), and then there's XML, Spatial, and Query Notifications (as mentioned in the error), which I haven't tried to reproduce.

SET ANSI_NULLS ON
GO

CREATE TABLE dbo.Person
(
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
FullName AS (CONCAT(FirstName, ' ', LastName))
)
GO

CREATE INDEX IX_Person_FullName on dbo.Person(FullName)
GO

SET ANSI_NULLS OFF
GO

CREATE OR ALTER PROC dbo.PopulatePerson
as
BEGIN
	DELETE FROM dbo.Person

	INSERT INTO dbo.Person (FirstName, LastName)
	VALUES ('Harold', 'Potter')
END
GO

SET ANSI_NULLS ON
GO

EXEC dbo.PopulatePerson
GO
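
And the filtered index variant, as a sketch (I haven't run this exact one, but the same ANSI_NULLS requirement applies to filtered indexes): swap the computed-column index above for something like this, and the DELETE in dbo.PopulatePerson should fail with the same error.

-- hypothetical filtered-index variant of the repro above
CREATE INDEX IX_Person_LastName ON dbo.Person (LastName)
WHERE FirstName <> ''
GO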

Additional reading / references:
SQL Docs: Indexes on Computed Columns
SQL Docs: CREATE INDEX > Filtered Indexes

Can you import this JSON?

In a recent project, I got the opportunity to learn way more about the SQL Server (2016 and later) JSON functions than I would have ever imagined. The task was to "get it imported". Much like a staging table, the goal here was ingestion, not normal form or relational design (although some of that came along for the ride). The file is the full content from the source each time, so there is no append logic here.

The JSON document arrived resembling something like this:

{
    "sales": [
        {
            "pointOfSale": "Register 2",
            "guid": "ac93eaa6-e3eb-46f4-a613-c0ee201681d9",
            "clerk": 1122,
            "end": "2020-07-27T10:20:07.863",
            "completed": true,
            "amount": 48.99,
            "schedule": {
                "scheduleName": "First Shift",
                "supervisor": 448
            },
            "departments": [
                {
                    "guid": "50a6091c-e675-4a2b-a4eb-0f6a343fbc36",
                    "name": "Seasonal",
                    "events": [
                        {
                            "type": "Browsing",
                            "duration": 60
                        },
                        {
                            "type": "Inquiry",
                            "duration": 200,
                            "employeeSkills": [
                                "Sales",
                                "Spanish"
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

This file is just one sales array item. The real document would have hundreds more.

How did I break this down?

First, I determined the hierarchy and which sections of the document need tables of their own. Next, I decided on data types and the key for each table, which would also be used for the parent-to-child relationships. JSON arrays (represented by square brackets) indicate where a child table structure begins. In this document, there is an object hierarchy like
sales > departments > events > employeeSkills.
These will be the four tables I demo here.

Getting the JSON document into a variable

The first task is to get the JSON contents in memory. For the purposes of this post, that document is in a file located on my local file system. I get that into a variable like this:

DECLARE @JsonDoc VARCHAR(MAX) =
(
SELECT 
	BulkColumn
FROM 
	OPENROWSET (BULK 'C:\JsonDemo\Sales.json', SINGLE_CLOB) JsonDoc
)

Here I am dealing with ASCII data. Accordingly, I have chosen the VARCHAR data type and the SINGLE_CLOB option in OPENROWSET. If I were handling Unicode data, these would be NVARCHAR and SINGLE_NCLOB, respectively.
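For reference, the Unicode version would look something like this (same file path assumed, and the file itself would need to be saved as Unicode):

-- Unicode variant: NVARCHAR plus SINGLE_NCLOB
DECLARE @JsonDoc NVARCHAR(MAX) =
(
SELECT 
	BulkColumn
FROM 
	OPENROWSET (BULK 'C:\JsonDemo\Sales.json', SINGLE_NCLOB) JsonDoc
)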

Enter OPENJSON

This is an amazing system table-valued function. It takes the JSON text and an optional JSON path expression as inputs. In its most basic form, it returns key, value, and type columns. To understand this, I found it helpful to see the output on the first item in our topmost array. In my document, that is sales[0]:

SELECT * 
FROM 
OPENJSON(@JsonDoc, '$.sales[0]')

In JSON schema terms, the type's value has the following meaning:

0 = null
1 = string
2 = number
3 = boolean
4 = array
5 = object

Sales table structure

I mapped the string, number, and boolean typed values directly into columns with the best fit SQL Server data types. guid mapped directly into a uniqueidentifier, and lengths needed to be chosen for the other strings. The values from the JSON number type need to be reviewed as they can flow into int, decimal, and other numeric types. For the schedule object, I examined it further and found it to have a 1:1 relationship to the Sales table, so I simply extended Sales and added the schedule object’s columns to the end of the Sales table. To examine the contents of the first schedule, I used this:

SELECT * FROM 
OPENJSON(@JsonDoc, '$.sales[0].schedule')

This could be normalized in a later process. For now, I know I’m dealing with another string and number (which seems to be an int).

The departments value is of type array, which indicates another table is needed, so I set it aside to handle with its own query (covered below).

Here is what the code looks like for Sales:

SELECT *
FROM 
OPENJSON(@JsonDoc, '$.sales')
WITH (
	pointOfSale VARCHAR(50),
	[guid] UNIQUEIDENTIFIER,
	clerk INT,
	[end] DATETIME,
	completed BIT,
	amount DECIMAL(18,2),
	scheduleName VARCHAR(50) '$.schedule.scheduleName',
	supervisor INT '$.schedule.supervisor'
)

Here, I'm using the WITH clause to build the column list. When a column name is not found at the first level of the path provided to OPENJSON, an additional column_path must be provided. The $ in the column_path is scoped to each item returned by OPENJSON, so in this case the $ in supervisor's path refers to each element of the '$.sales' array.

First level Child Table – Departments

Here, the code introduces the CROSS APPLY concept:

SELECT
departments.[guid],
departments.[name],
CONVERT(UNIQUEIDENTIFIER, JSON_VALUE(sales.value, '$.guid')) AS salesGuid
FROM 
OPENJSON(@JsonDoc, '$.sales') AS sales
CROSS APPLY OPENJSON(sales.[value], '$.departments') 
WITH (
	[guid] UNIQUEIDENTIFIER,
	[name] VARCHAR(50)
) AS departments;

Here, I used the WITH clause for the departments table and then used JSON_VALUE to extract the key (salesGuid) from the parent table.

Second level Child Table – Events

This was mostly the same pattern as the departments table until I realized there is no key. I didn't recognize this as a problem until I saw that the array below it, employeeSkills, would need to be linked back to the event. For this purpose, I included the key from OPENJSON as a surrogate. Here, I relied on JSON_VALUE completely rather than the WITH clause column list.

SELECT
CONVERT(INT, [events].[key]) AS eventKey,
CONVERT(VARCHAR(50), JSON_VALUE([events].value, '$.type')) AS [type],
CONVERT(INT, JSON_VALUE([events].value, '$.duration')) AS [duration],
CONVERT(UNIQUEIDENTIFIER, JSON_VALUE(departments.value, '$.guid')) AS departmentsGuid
FROM 
OPENJSON(@JsonDoc, '$.sales') AS sales
CROSS APPLY OPENJSON(sales.[value], '$.departments') AS departments
CROSS APPLY OPENJSON(departments.[value], '$.events') AS [events];

Third level Child Table – EmployeeSkills

This one was interesting. It’s just an array of strings, with no column names. Turns out our old friend “value” is here to help. I didn’t need JSON_VALUE with this part, because there is no JSON object structure.

SELECT
CONVERT(INT, [events].[key]) AS eventKey,
CONVERT(UNIQUEIDENTIFIER, JSON_VALUE(departments.value, '$.guid')) AS departmentsGuid,
CONVERT(VARCHAR(50), [employeeSkills].[value]) AS [skill]
FROM 
OPENJSON(@JsonDoc, '$.sales') AS sales
CROSS APPLY OPENJSON(sales.[value], '$.departments') AS departments
CROSS APPLY OPENJSON(departments.[value], '$.events') AS [events]
CROSS APPLY OPENJSON([events].[value], '$.employeeSkills') AS [employeeSkills]

Bringing it together

Running all four queries on this document produces four tables that can be joined together in a relational way.
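
For example, assuming the four result sets get materialized into tables named dbo.Sales, dbo.Departments, dbo.[Events], and dbo.EmployeeSkills (my placeholder names, not anything the JSON dictates), the keys line up like this:

SELECT
	s.pointOfSale,
	d.[name] AS department,
	e.[type] AS eventType,
	sk.[skill]
FROM dbo.Sales AS s
JOIN dbo.Departments AS d
	ON d.salesGuid = s.[guid]
JOIN dbo.[Events] AS e
	ON e.departmentsGuid = d.[guid]
LEFT JOIN dbo.EmployeeSkills AS sk -- not every event has skills
	ON sk.departmentsGuid = e.departmentsGuid
	AND sk.eventKey = e.eventKey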

For the final product, I made each table a SELECT INTO with a preceding DROP IF EXISTS and wrapped it all into a Stored Procedure.
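
As a sketch of that pattern (the procedure name here is my own placeholder), the Sales piece looks something like this; the other three queries follow the same DROP-then-SELECT INTO shape:

CREATE OR ALTER PROC dbo.ImportSalesJson
AS
BEGIN
	DECLARE @JsonDoc VARCHAR(MAX) =
	(
		SELECT BulkColumn
		FROM OPENROWSET (BULK 'C:\JsonDemo\Sales.json', SINGLE_CLOB) JsonDoc
	)

	-- full reload each run: drop and rebuild the target table
	DROP TABLE IF EXISTS dbo.Sales;

	SELECT *
	INTO dbo.Sales
	FROM OPENJSON(@JsonDoc, '$.sales')
	WITH (
		pointOfSale VARCHAR(50),
		[guid] UNIQUEIDENTIFIER,
		clerk INT,
		[end] DATETIME,
		completed BIT,
		amount DECIMAL(18,2),
		scheduleName VARCHAR(50) '$.schedule.scheduleName',
		supervisor INT '$.schedule.supervisor'
	)

	-- ...repeat for dbo.Departments, dbo.[Events], and dbo.EmployeeSkills
END
GO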

There you have it: with four queries and all that learning, this JSON has been put into relational form! I hope these examples will help others as they face a similar challenge.