How to resolve SQL server replication Data missmatch


Identifying and fixing data mismatches in SQL Server replication can be challenging, but there are several steps we can follow.

1. Identify the Data Mismatch:

First, determine which tables or columns are affected and what type of mismatch is occurring (e.g., missing rows, different values).

2. Check the replication configuration:

Verify the replication agent settings, subscription properties, and distribution database settings to ensure that the replication configuration is correctly configured.

3. Check the replication status:

Use SQL Server Management Studio (SSMS) or replication monitoring tools to monitor replication status and look for any errors or warnings.

4. Compare Source and Destination Data:

Compare the data in the source and destination tables using tools such as Redgate’s SQL Data Compare or other data comparison tools to identify discrepancies.

5. Check for conflicts:

Check the replication conflict tables for conflict information and resolve it accordingly if the same row is updated on both the publisher and subscriber.

6. Verify schema changes:

Ensure that schema changes (e.g., table structure, constraints, etc.) are correctly replicated to the subscriber. Misconfiguration of the schema can lead to data mismatches.

7. Check Data Transformation:

You should review any data transformations that are used during replication (e.g., custom stored procedures, triggers) for potential issues that could lead to data mismatches.

8. Check Data Types and Collations:

Make sure that the data types and collations are consistent between the source and destination databases. Mismatched data types can result in conversion errors.

9. Validate Primary Keys and Indexes:

It is important to ensure that primary keys and indexes are replicated correctly. If primary key constraints are not replicated correctly, data mismatches may occur.

10. Restart the Replication Agents:

Restarting the replication agents may resolve transient issues causing data mismatches in some cases.

11. Resynchronize the subscriber:

You may consider reinitializing the subscriber from a fresh snapshot of the publisher’s data if the data mismatch is significant.

12. Analyze replication logs and error messages:

Inspect the SQL Server Error Logs and replication-specific logs for any error messages or warnings that may provide insight into the cause of the data mismatch.

Remember to take proper backups of your databases before making any significant changes to avoid data loss.

So far, from my experience, I have seen that the data validation errors are reported for a mismatch in row count for the table(s).  And, it is fairly easy to determine the error and fix when it comes to resolving it. Without wasting much time, let’s talk about various ways to deal with this very error.

The most common, and indeed planned for, problem with replication is that of LATENCY. Latency in this context simply refers to the delay between publisher and subscriber (via distributor) and taken in isolation with no other symptoms can be perfectly normal. Latency can be caused by various factors, including geographical distance between nodes; network traffic or bandwidth constraints; transactional load on the publisher; and system resources. Less common causes, but by no means rare, are problems such as stalled agents, failed jobs and unexpected locking / other resource conflicts.

Here’s one such query you can run which uses some of these tables. The following query will pull all information about distribution agents that are actively distributing

transactions (in progress) and will provide useful information such as the delivery rate (txs/sec). I’ve added WITH (NOLOCK) from habit when dealing with system tables but it’s not strictly necessary in this case since this is a history table. Feel free to amend and add/remove columns or filters:

SELECT      da.name, da.publisher_db, da.subscription_type,

dh.runstatus, dh.delivery_rate, dh.start_time, dh.duration

FROM        dbo.MSdistribution_history dh WITH (NOLOCK)

INNER JOIN  dbo.msdistribution_agents da WITH (NOLOCK)

ON          dh.agent_id = da.id

WHERE       dh.runstatus = 3 — 3 means ‘in progress’, table explanation here:

http://msdn.microsoft.com/en-us/library/ms179878.aspx

AND         dh.start_time BETWEEN DATEADD(dd,-30,GETDATE()) AND GETDATE()

ORDER BY    dh.start_time DESC

Fixing Data Validation Errors in Replication:

  • Skip missing Transaction
  • Ignore Data Consistency Issue
  • Drop and Republish articles which are out of sync

Basically, when one has data validation errors in replication it is about inconsistency or out of sync articles. First, one has to find out what exactly the error is. Based on the error reported, use the tablediff utility to compare the records and find out the mismatch records, this utility will also create tsql statement that can help fix the data validation. Row count can be manually validated as well in case the article is small. Once a difference is found, manually insert the record(s) and it will fix the issue.

When a validation or inconsistency reported due to the key violation of duplicate value, find out the offending records. And then, one can create another agent profile and use skiperrors parameter to resolve this error. Also, one can also drop and republish the article that has reported an error and fix the issue.

I hope this tips will help you fix the data validation issue.