Skip to content

Commit 3a9ad39

Browse files
committed
[docs] SQL Server HA failover
1 parent 1ca7d51 commit 3a9ad39

File tree

1 file changed

+97
-0
lines changed

1 file changed

+97
-0
lines changed

doc/user/content/ingest-data/sql-server/self-hosted.md

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -276,6 +276,103 @@ available(also for PostgreSQL)."
276276
277277
{{% sql-server-direct/next-steps %}}
278278
279+
## High Availability
280+
281+
### Using SQL Server Always On Availability Groups
282+
283+
To make your SQL Server source resilient to database failovers, configure
284+
Materialize to connect through a SQL Server [Always On Availability Group (AG)
285+
listener](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/listeners-client-connectivity-application-failover).
286+
When a failover occurs, SQL Server drops the existing connection and routes new
287+
connections to the new primary replica transparently.
288+
289+
#### Prerequisites
290+
291+
Before connecting Materialize to an AG, ensure:
292+
293+
1. **Your AG listener is configured and accessible.** Materialize must connect
294+
via the listener DNS name, not individual node hostnames.
295+
296+
1. **CDC is enabled on all potential primary replicas.** SQL Server's Change
297+
Data Capture metadata is **not** replicated across AG nodes.
298+
299+
1. **CDC capture and cleanup jobs exist on all potential primary replicas.**
300+
After a role change, the new primary must have these jobs to continue
301+
replicating changes.
302+
303+
SQL Server CDC metadata, including capture and cleanup jobs, **does not
304+
replicate** to AG secondary replicas. After a failover, you must ensure the new
305+
primary has CDC enabled and the required jobs are running.
306+
307+
**Recommended approach:** Create an automated script or SQL Agent job that runs
308+
on each potential primary after a role change:
309+
310+
```sql
311+
USE YourDatabase;
312+
313+
-- Enable CDC if not already enabled
314+
IF NOT EXISTS (SELECT 1 FROM sys.databases WHERE name = 'YourDatabase' AND is_cdc_enabled = 1)
315+
BEGIN
316+
EXEC sys.sp_cdc_enable_db;
317+
END
318+
319+
-- Enable CDC on tables (if not already enabled)
320+
IF NOT EXISTS (SELECT 1 FROM cdc.change_tables WHERE source_object_id = OBJECT_ID('schema.table_name'))
321+
BEGIN
322+
EXEC sys.sp_cdc_enable_table
323+
@source_schema = 'schema',
324+
@source_name = 'table_name',
325+
@role_name = NULL;
326+
END
327+
328+
-- Create capture job if it doesn't exist
329+
IF NOT EXISTS (SELECT 1 FROM msdb.dbo.cdc_jobs WHERE job_type = 'capture')
330+
BEGIN
331+
EXEC sys.sp_cdc_add_job @job_type = 'capture';
332+
END
333+
334+
-- Create cleanup job if it doesn't exist
335+
IF NOT EXISTS (SELECT 1 FROM msdb.dbo.cdc_jobs WHERE job_type = 'cleanup')
336+
BEGIN
337+
EXEC sys.sp_cdc_add_job @job_type = 'cleanup';
338+
-- Extend retention to cover expected failover + recovery time
339+
EXEC sys.sp_cdc_change_job @job_type = 'cleanup', @retention = 43200;
340+
END
341+
```
342+
343+
{{< note >}}
344+
Adjust the `@retention` value based on your expected recovery time. The default
345+
retention is ~3 days (4320 minutes). If CDC change data is pruned before
346+
Materialize can ingest it after a failover, you must [drop and recreate the
347+
source](/sql/drop-source/) to trigger a new snapshot.
348+
{{< /note >}}
349+
350+
#### Connecting to an AG listener
351+
352+
Create your SQL Server connection using the **AG listener** as the host:
353+
354+
```mzsql
355+
CREATE SECRET sqlserver_pass AS '<SQL_SERVER_PASSWORD>';
356+
357+
CREATE CONNECTION sqlserver_ag TO SQL SERVER (
358+
HOST 'my-ag-listener.example.com', -- AG listener DNS name
359+
PORT 1433,
360+
USER 'materialize',
361+
PASSWORD SECRET sqlserver_pass,
362+
DATABASE '<DATABASE_NAME>'
363+
);
364+
365+
CREATE SOURCE mz_source
366+
FROM SQL SERVER CONNECTION sqlserver_ag
367+
FOR ALL TABLES;
368+
```
369+
370+
When the AG fails over to a new primary, Materialize will:
371+
372+
1. Detect the dropped connection
373+
1. Reconnect to the AG listener (now pointing to the new primary)
374+
1. Resume ingestion from the last persisted [LSN](#monitoring-source-progress)
375+
279376
## Considerations
280377

281378
{{% include-md file="shared-content/sql-server-considerations.md" %}}

0 commit comments

Comments
 (0)