Writes, optimizes, and debugs T-SQL queries. Explains SQL Server internals, troubleshoots performance issues, and guides database administration tasks including backup/restore, high availability, security, and index design. Use when the user asks about T-SQL syntax, SQL Server administration, query performance, stored procedures, indexes, locking, transactions, backup/restore, high availability, security, or any MSSQL-related topic — even without saying 'SQL Server' explicitly. Also trigger on terms like SSMS, tempdb, bcp, sqlcmd, MSSQL, sp_executesql, NOLOCK, columnstore, Hekaton, RCSI, param sniffing, or execution plan.
100
100%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Load this file when the user asks about:
Collation conflict errors on JOINs or UNION#temp table column defaults to tempdb collation)master, model, tempdb)COLLATE DATABASE_DEFAULT pattern_BIN vs _BIN2 collations for Always Encrypted or binary sortA collation defines:
| Aspect | Meaning |
|---|---|
| Character repertoire | Which characters are supported (e.g., Latin, Unicode) |
| Code page | For non-Unicode (char/varchar), the 8-bit encoding used |
| Sort order | Dictionary, binary, or binary-2 |
| Comparison rules | Case, accent, kana, width, variation-selector sensitivity |
Collation applies to:
char, varchar — single-byte, code-page-dependentnchar, nvarchar — Unicode (UCS-2 / UTF-16); collation controls sort/compare, not storagetext, ntext — deprecated; inherit column collationCollation does not apply to int, datetime, or other non-character types.
SQL_Latin1_General_CP1_CI_AS
│ │ │ │ │
│ │ │ │ └── AS = Accent Sensitive
│ │ │ └───── CI = Case Insensitive
│ │ └────────── CP1 = Code Page 1252
│ └──────────────────────── Latin1_General = locale/language
└──────────────────────────── SQL_ prefix = SQL Server legacy collation| Prefix | Type | Notes |
|---|---|---|
SQL_ | SQL Server legacy | Older sort rules; different from Windows OS sort; avoid for new objects |
| (no prefix) | Windows collation | Aligns with Windows NLS; recommended for most use cases |
Prefer Windows collations (e.g., Latin1_General_CI_AS) over SQL_ collations for new databases. SQL legacy collations sort char/varchar differently from nchar/nvarchar, which causes subtle JOIN inconsistencies.
| Collation | CI/CS | AI/AS | Notes |
|---|---|---|---|
SQL_Latin1_General_CP1_CI_AS | CI | AS | SQL Server install default (US); legacy |
Latin1_General_CI_AS | CI | AS | Windows equivalent; preferred |
Latin1_General_CS_AS | CS | AS | Case-sensitive variant |
Latin1_General_CI_AI | CI | AI | Case and accent insensitive |
Latin1_General_BIN | N/A | N/A | Binary sort (deprecated in favor of BIN2) |
Latin1_General_BIN2 | N/A | N/A | Byte-by-byte Unicode code point sort; required for Always Encrypted deterministic |
Latin1_General_100_CI_AS_SC | CI | AS | Supplementary characters support (SC) |
Latin1_General_100_CI_AS_SC_UTF8 | CI | AS | UTF-8 storage (2019+) |
Japanese_CI_AS | CI | AS | Japanese locale; kana insensitive by default |
Japanese_CS_AS_KS_WS | CS | AS | Kana and width sensitive |
[!NOTE] SQL Server 2019 UTF-8 collations (
_UTF8suffix) allowvarchar/charto store full Unicode using UTF-8 encoding. This reduces storage for ASCII-heavy data compared tonvarchar. Requires SQL Server 2019+. 1
| Suffix | Attribute | Meaning |
|---|---|---|
_CI | Case Insensitive | 'A' = 'a' |
_CS | Case Sensitive | 'A' <> 'a' |
_AI | Accent Insensitive | 'e' = 'é' |
_AS | Accent Sensitive | 'e' <> 'é' |
_KI | Kana Insensitive | Hiragana = Katakana |
_KS | Kana Sensitive | Hiragana ≠ Katakana |
_WI | Width Insensitive | Half-width = Full-width (Japanese) |
_WS | Width Sensitive | Half-width ≠ Full-width |
_VSI | Variation Selector Insensitive | Base char = char + variation selector |
_VSS | Variation Selector Sensitive | (2017+ for Japanese collations) |
_SC | Supplementary Characters | Surrogate pair support (emoji, rare Unicode) |
_BIN | Binary sort | Legacy byte-by-byte per char; use BIN2 instead |
_BIN2 | Binary-2 sort | Byte-by-byte on code point; deterministic, fastest comparison |
_UTF8 | UTF-8 encoding | varchar/char stored as UTF-8 (2019+) |
-- All collations and their properties
SELECT name,
description,
COLLATIONPROPERTY(name, 'CodePage') AS code_page,
COLLATIONPROPERTY(name, 'LCID') AS lcid,
COLLATIONPROPERTY(name, 'ComparisonStyle') AS comparison_style
FROM sys.fn_helpcollations()
WHERE name LIKE 'Latin1%'
ORDER BY name;
-- Check a specific collation's properties
SELECT COLLATIONPROPERTY('Latin1_General_CI_AS', 'CodePage') AS code_page, -- 1252
COLLATIONPROPERTY('Latin1_General_CI_AS', 'LCID') AS lcid, -- 1033
COLLATIONPROPERTY('Latin1_General_CI_AS', 'ComparisonStyle') AS style; -- 196609Collation is applied and inherited in a hierarchy:
Server (instance default)
└── Database (default for new objects)
└── Column (explicit or inherited from database)
└── Expression (COLLATE clause overrides)Set at install time. Controls:
master, model, tempdb) collations#temp table columns that inherit from tempdb-- Check server collation
SELECT SERVERPROPERTY('Collation') AS server_collation;[!WARNING] Changing server collation post-install is complex and risky — requires rebuilding system databases. Avoid if at all possible.
Controls the default collation for new columns and the collation of varchar/nvarchar literals in queries against that database.
-- Check database collation
SELECT name, collation_name
FROM sys.databases
WHERE name = DB_NAME();
-- Set at create time
CREATE DATABASE Inventory
COLLATE Latin1_General_CI_AS;
-- Change existing (see Section 10 for full workflow)
ALTER DATABASE Inventory
COLLATE Latin1_General_CS_AS;Overrides the database default for a specific column.
CREATE TABLE dbo.Products (
ProductID int NOT NULL PRIMARY KEY,
ProductCode varchar(20) COLLATE Latin1_General_CS_AS NOT NULL, -- case-sensitive SKU
ProductName nvarchar(200) NOT NULL -- inherits DB collation
);
-- Add column with explicit collation
ALTER TABLE dbo.Products
ADD InternalRef varchar(10) COLLATE Latin1_General_BIN2 NOT NULL
DEFAULT '';| Scenario | Recommended collation | Reason |
|---|---|---|
| US English, case-insensitive (most common) | Latin1_General_CI_AS | Windows collation, predictable |
| Case-sensitive login names / codes | Latin1_General_CS_AS or _BIN2 | Strict equality |
| Always Encrypted deterministic | Latin1_General_BIN2 | Required by AE |
| Multi-language (Western European) | Latin1_General_100_CI_AS_SC | SC for emoji/supplementary chars |
| Japanese text | Japanese_CI_AS | Correct kana/width defaults |
| UTF-8 storage optimization (2019+) | Latin1_General_100_CI_AS_SC_UTF8 | Saves storage for ASCII-heavy Unicode |
| Legacy SQL Server default (do not use for new DBs) | SQL_Latin1_General_CP1_CI_AS | Only keep for compatibility |
_BIN (legacy binary)_BIN applies binary sort only to the last character of each varchar value; the preceding characters use a dictionary sort. This is confusing and not truly binary. Always use _BIN2 when you need binary-order comparison.
Override the collation at expression or column level in any query:
-- Make a CI column comparison CS for this query only
SELECT *
FROM dbo.Users
WHERE Username = 'Admin' COLLATE Latin1_General_CS_AS;
-- Case-insensitive search on a CS column
SELECT *
FROM dbo.Products
WHERE ProductCode = 'ABC-001' COLLATE Latin1_General_CI_AS;
-- Accent-insensitive search
SELECT *
FROM dbo.Customers
WHERE LastName = 'Müller' COLLATE Latin1_General_CI_AI;-- Sort by collation different from column's collation
SELECT ProductName
FROM dbo.Products
ORDER BY ProductName COLLATE Latin1_General_CS_AS;When you need a column or expression to use the current database's collation dynamically (e.g., in a stored proc that might run in different databases):
-- Safe cross-database temp table comparison (see Section 9)
SELECT t.Name
FROM #TempResults t
JOIN dbo.Reference r
ON t.Name COLLATE DATABASE_DEFAULT = r.Name;When joining columns from two databases with different collations, SQL Server raises:
Msg 468, Level 16: Cannot resolve the collation conflict between
"Latin1_General_CS_AS" and "Latin1_General_CI_AS" in the equal to operation.-- Join across databases with different collations
SELECT a.CustomerID,
b.OrderID
FROM DBa.dbo.Customers a
JOIN DBb.dbo.Orders b
ON a.CustomerCode = b.CustomerCode COLLATE Latin1_General_CI_AS;Apply the COLLATE clause to the side with the less restrictive collation, or pick a shared target collation. Applying it to the right-hand side is conventional but either works.
[!WARNING] Adding
COLLATEin aJOINpredicate can prevent index seeks on the collated expression — the optimizer may not be able to use an index whose key was built with a different collation. Test execution plans after adding explicit collation.
-- Fix UNION collation conflict
SELECT Name COLLATE Latin1_General_CI_AS FROM DBa.dbo.People
UNION ALL
SELECT Name COLLATE Latin1_General_CI_AS FROM DBb.dbo.People;This is one of the most common collation bugs in production code.
Columns in #temp tables without explicit COLLATE inherit the server collation (via tempdb). If your database collation differs from the server collation, comparisons silently change behavior or raise errors.
-- tempdb collation: SQL_Latin1_General_CP1_CI_AS (server default)
-- UserDB collation: Latin1_General_CS_AS (explicitly set)
CREATE TABLE #Results (Name varchar(100)); -- inherits tempdb/server CI_AS
INSERT #Results SELECT Name FROM dbo.Users;
-- This comparison is CS because dbo.Users.Name is CS,
-- but #Results.Name is CI — may silently return wrong results
SELECT * FROM #Results WHERE Name = 'admin';DATABASE_DEFAULT-- Option A: explicit collation matching your database
CREATE TABLE #Results (
Name varchar(100) COLLATE Latin1_General_CS_AS NOT NULL
);
-- Option B: COLLATE DATABASE_DEFAULT — resolves to the
-- database that executes the CREATE TABLE statement
CREATE TABLE #Results (
Name varchar(100) COLLATE DATABASE_DEFAULT NOT NULL
);[!NOTE]
COLLATE DATABASE_DEFAULTis the safest portable option — it resolves to whatever the current database's collation is at execution time, avoiding hard-coded collation names.
Table variables are created in the user's database context (not tempdb), so they inherit the database collation by default. The mismatch problem is less common with table variables, but explicit collation is still good practice.
ALTER DATABASE ... COLLATE changes the default collation for new objects, but does not change existing columns.
-- Step 1: change the database default
ALTER DATABASE Inventory
COLLATE Latin1_General_CI_AS;
-- Step 2 (manual): rebuild existing varchar/nvarchar columns
-- For each column that should use the new collation:
ALTER TABLE dbo.Products
ALTER COLUMN ProductName nvarchar(200) COLLATE Latin1_General_CI_AS NOT NULL;ALTER TABLE ... ALTER COLUMN ... COLLATE <new>-- Find all varchar/nvarchar columns not using the new target collation
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name,
OBJECT_NAME(c.object_id) AS table_name,
c.name AS column_name,
c.collation_name
FROM sys.columns c
WHERE c.collation_name IS NOT NULL -- only character columns
AND c.collation_name <> 'Latin1_General_CI_AS'
AND OBJECTPROPERTY(c.object_id, 'IsUserTable') = 1
ORDER BY schema_name, table_name, column_name;[!WARNING] Changing column collation is an offline schema change — it rebuilds the column in the table. On large tables, plan for a maintenance window or use online schema change tools (e.g.,
sp_rename-swap approach).
| Scenario | Default collation | Notes |
|---|---|---|
| New Azure SQL Database (Portal) | SQL_Latin1_General_CP1_CI_AS | Same as on-prem install default |
| Serverless database | Same as elastic pool / template | Inherited |
| Contained database (recommended) | Set explicitly at CREATE time | Cannot be changed after creation |
| Azure SQL Managed Instance | Set at instance creation | Cannot be changed post-creation |
[!NOTE] Azure SQL For Azure SQL Database, the collation can be set when creating the database with
CREATE DATABASE ... COLLATE. After creation,ALTER DATABASE ... COLLATEis not supported for user databases on Azure SQL Database (Hyperscale and others). Plan the collation before provisioning.
-- Azure SQL: set collation at create time
CREATE DATABASE MyAppDB
COLLATE Latin1_General_100_CI_AS_SC;For Azure SQL Managed Instance, the instance-level collation is set at provisioning and cannot be changed. Individual database collations can differ from the instance collation.
An index on a character column is built using that column's collation. If you apply COLLATE in a query predicate to override the column's collation, SQL Server cannot use the index for a seek — the index was built with different comparison semantics.
-- Index on ProductCode is Latin1_General_CS_AS
-- This CANNOT use the index as a seek:
SELECT * FROM dbo.Products
WHERE ProductCode = 'abc-001' COLLATE Latin1_General_CI_AS;
-- Results in an Index Scan (or Table Scan) — must compare every row
-- This CAN use the index:
SELECT * FROM dbo.Products
WHERE ProductCode = 'abc-001'; -- uses column's native collationA computed column expression inherits the collation of its inputs. If you want to index a computed column that involves a COLLATE override, the expression must be deterministic and the COLLATE must be to the same or a compatible collation.
-- Create computed column for case-insensitive lookup on CS column
ALTER TABLE dbo.Products
ADD ProductCodeCI AS (ProductCode COLLATE Latin1_General_CI_AS) PERSISTED;
CREATE INDEX IX_Products_CodeCI ON dbo.Products (ProductCodeCI);
-- Now this can use the index:
SELECT * FROM dbo.Products
WHERE ProductCodeCI = 'abc-001'; -- matches computed column expressionFiltered index WHERE predicates use the column's collation. A predicate like WHERE Status = 'active' on a CS column will only filter 'active' (not 'Active').
-- Server collation
SELECT SERVERPROPERTY('Collation') AS server_collation;
-- All database collations
SELECT name,
collation_name,
compatibility_level
FROM sys.databases
ORDER BY name;
-- All character columns and their collations in current database
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name,
OBJECT_NAME(c.object_id) AS table_name,
c.name AS column_name,
c.collation_name,
t.name AS data_type,
c.max_length
FROM sys.columns c
JOIN sys.types t ON t.system_type_id = c.system_type_id
AND t.user_type_id = c.system_type_id -- exclude UDTs here; adjust if needed
WHERE c.collation_name IS NOT NULL
ORDER BY schema_name, table_name, column_name;
-- Find columns with non-default collation
DECLARE @db_collation sysname = CAST(DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS sysname);
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS schema_name,
OBJECT_NAME(c.object_id) AS table_name,
c.name AS column_name,
c.collation_name
FROM sys.columns c
WHERE c.collation_name IS NOT NULL
AND c.collation_name <> @db_collation
AND OBJECTPROPERTY(c.object_id, 'IsUserTable') = 1
ORDER BY schema_name, table_name;
-- Find all collation names available on this instance
SELECT name,
description
FROM sys.fn_helpcollations()
ORDER BY name;
-- tempdb collation (source of temp table inheritance)
SELECT collation_name FROM sys.databases WHERE name = 'tempdb';
-- Check current database's collation
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS current_db_collation;SQL_Latin1_General_CP1_CI_AS mismatch with Latin1_General_CI_ASThese two collations look equivalent but sort varchar data differently for characters outside ASCII. They will cause Msg 468 collation conflicts when joining. Pick one and stick with it. Prefer the Windows collation (no SQL_ prefix) for new databases.
tempdb collationSee Section 9. Always use COLLATE DATABASE_DEFAULT or an explicit collation on #temp table string columns. Failing to do so is a common source of hard-to-diagnose bugs, especially after restoring to a different server.
Applying COLLATE to a column reference in a WHERE or JOIN prevents the optimizer from using indexes built on that column. Use computed + persisted columns instead for frequently queried alternative-collation lookups.
_BIN2 collationAlways Encrypted requires string column (varchar, char, etc.) collations to be binary-code point (_BIN2). Forgetting this causes AE setup failures or runtime errors. 2
_UTF8 collations store varchar as UTF-8. ASCII characters (0–127) take 1 byte (same as code page 1252). Characters U+0080–U+07FF take 2 bytes; U+0800+ take 3 bytes. This is more efficient than nvarchar (2 bytes/char) for ASCII-dominant data but can be larger for CJK-heavy text. Profile before migrating. 1
ALTER DATABASE ... COLLATE does NOT change existing columnsDevelopers often believe changing the database collation retroactively fixes all columns. It only affects new columns created afterward. Existing columns must be altered individually (see Section 10).
_SC)Without _SC, surrogate pairs (emoji, rare Unicode, e.g., U+10000–U+10FFFF) are treated as two separate characters. LEN(N'😀') returns 2 without _SC, 1 with _SC. String functions (SUBSTRING, LEFT, RIGHT, LEN, CHARINDEX) all behave differently. 3
ORDER BY with CI collation sorts 'A' and 'a' together — order between them is undefinedWith a case-insensitive collation, ORDER BY Name will put 'Alice' before 'alice' or 'ALICE' — but which exact order is not guaranteed. Add a secondary ORDER BY Name COLLATE Latin1_General_BIN2 if you need a deterministic tie-break.
LIKE patterns are collation-awareLIKE 'A%' on a CI column matches 'Alice' and 'alice'. On a CS column it matches only 'Alice%'. Wildcard behavior (especially [A-Z]) is also collation-dependent — [A-Z] on a CI collation may include accented letters between A and Z in the locale's sort order.
This is a data-movement operation. For large tables it can cause significant blocking and log growth. Consider using ONLINE options where available, or offline maintenance windows.
COLLATE DATABASE_DEFAULT in stored procedures is resolved at parse time in some contextsIn dynamic SQL or cross-database calls, DATABASE_DEFAULT resolves to the database where the statement executes, which may differ from the database where the stored procedure was created. This is usually what you want but can be surprising.
Remote queries via linked servers return data with the remote server's collation. If the remote server uses a different collation, implicit collation conflicts arise in local JOINs. Always test collation behavior when federating data across linked servers.
references/25-null-handling.md — NULL behavior in comparisons (distinct from collation issues)references/16-security-encryption.md — Always Encrypted _BIN2 collation requirementreferences/34-tempdb.md — tempdb sizing and collation inheritancereferences/45-linked-servers.md — cross-server collation conflictsreferences/53-migration-compatibility.md — collation considerations during upgradesCollation and Unicode Support - SQL Server — UTF-8 collations introduced in SQL Server 2019: storage behaviour, trade-offs, and compatibility level requirements ↩ ↩2
Always Encrypted - SQL Server — documents that Always Encrypted does not support string columns with collations other than binary-code point (_BIN2) collations ↩
Collation and Unicode Support - SQL Server — comprehensive reference covering collation concepts, naming conventions, sensitivity attributes, supplementary characters, and BIN2 binary sort ↩
references