SalesOS.

Duplicate Detection & Merge

Find and resolve duplicate records with AI-powered matching and guided merge workflows.

Duplicate records are one of the most common data quality problems in any CRM. They lead to fragmented customer histories, inaccurate reporting, and wasted effort when multiple reps unknowingly work the same account. SalesOS provides a comprehensive duplicate detection and merge system that uses AI-powered matching to identify probable duplicates and offers guided workflows to resolve them without losing critical data.

How Duplicate Detection Works

SalesOS continuously monitors your database for records that appear to represent the same entity. The system analyzes contacts, accounts, and leads using multiple detection methods and assigns a confidence score to each potential duplicate pair. When duplicates are identified, they appear in your Duplicate Detection dashboard for review and resolution.

Detection runs automatically in the background on a schedule (every 6 hours by default) and can also be triggered manually, for example after you import data or when you want to audit a specific segment of your database.

Detection Methods

SalesOS combines three detection approaches to maximize accuracy while minimizing false positives.

Exact Match

The system compares normalized field values for direct equality. Normalization removes differences in casing, whitespace, and common formatting variations before comparison. Fields evaluated for exact match include:

  • Email address -- The strongest single identifier. Email matches are assigned very high confidence.
  • Phone number -- Normalized to strip country codes, dashes, spaces, and parentheses before comparison.
  • Company domain -- The domain portion of website URLs is extracted and compared.
  • Tax ID / Registration number -- For accounts with government registration identifiers on file.
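The normalization step described above can be sketched as follows. This is a minimal illustration, not SalesOS's actual implementation: the function names, the default country code of "1", and the rule of dropping a leading "www." are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse


def normalize_email(value: str) -> str:
    """Trim whitespace and lowercase an email address."""
    return value.strip().lower()


def normalize_phone(value: str, country_code: str = "1") -> str:
    """Keep digits only, then drop an assumed leading country code."""
    digits = re.sub(r"\D", "", value)
    # Assumption: national numbers are 10 digits, so anything longer
    # that starts with the country code carries that prefix.
    if len(digits) > 10 and digits.startswith(country_code):
        digits = digits[len(country_code):]
    return digits


def extract_domain(url: str) -> str:
    """Return the host of a website URL without a leading 'www.'."""
    candidate = url.strip().lower()
    if "://" not in candidate:
        candidate = "//" + candidate  # lets urlparse read the text as a host
    host = urlparse(candidate).hostname or ""
    return host[4:] if host.startswith("www.") else host
```

Two records are an exact match on a field when their normalized values are equal, for example when normalize_phone returns the same string for "+1 (415) 555-0100" and "415-555-0100".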

Fuzzy Match

Fuzzy matching identifies records that are similar but not identical. This catches common scenarios like misspellings, abbreviations, and partial entries. SalesOS uses token-based similarity algorithms on:

  • Contact names -- Handles transposed first/last names, nicknames (e.g., "Rob" vs "Robert"), and typos.
  • Company names -- Recognizes abbreviations ("Inc" vs "Incorporated"), dropped words, and common misspellings.
  • Addresses -- Matches on partial address similarity, recognizing that "123 Main St" and "123 Main Street, Suite 200" likely refer to the same location.

The fuzzy match engine returns a similarity score between 0 and 100. Only pairs exceeding your configured threshold (default: 70) are surfaced as potential duplicates.
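To make the 0 to 100 score concrete, here is a small sketch of one token-based measure (Jaccard overlap of word sets). The article does not say which algorithm SalesOS uses, so treat the measure, the ALIASES table, and the function names as assumptions chosen for illustration.

```python
import re

# Hypothetical alias table for abbreviations and nicknames.
ALIASES = {
    "inc": "incorporated",
    "corp": "corporation",
    "st": "street",
    "rob": "robert",
}


def tokens(text: str) -> set[str]:
    """Lowercase, split into words, and expand known aliases."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {ALIASES.get(word, word) for word in words}


def similarity(a: str, b: str) -> int:
    """Jaccard overlap of the two token sets, scaled to 0-100."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0
    return round(100 * len(ta & tb) / len(ta | tb))


def is_candidate(a: str, b: str, threshold: int = 70) -> bool:
    """Only pairs scoring above the threshold are surfaced."""
    return similarity(a, b) > threshold
```

Because the comparison works on sets of tokens, "Smith, John" and "John Smith" score 100 regardless of word order. This simple measure scores "123 Main St" against "123 Main Street, Suite 200" at 60, below the default threshold of 70, so matching partial addresses would need a more forgiving measure than the one sketched here.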

AI Confidence Scoring

Beyond field-level comparison, SalesOS applies a machine learning model that evaluates the overall likelihood two records represent the same entity. The AI model considers:

  • The combination of matching fields (email match + name match is stronger than name match alone).
  • Historical merge decisions made by your team, which train the model to reflect your organization's standards.
  • Contextual signals such as shared deals, overlapping activity history, or mutual relationships.
  • Industry-specific patterns (e.g., franchise locations that share a parent company name but are distinct entities).

The AI model produces a composite confidence score displayed as a percentage alongside each duplicate pair.

Duplicate Detection Dashboard

Navigate to Settings > Data Quality > Duplicate Detection or access the dashboard from the notification bell when new duplicates are detected.

Dashboard Overview

The dashboard provides a summary of your current duplicate landscape:

Metric | Description
Pending Duplicates | Total number of unresolved duplicate pairs awaiting review
High Confidence | Pairs scoring 90% or above -- likely true duplicates
Medium Confidence | Pairs scoring 70-89% -- probable duplicates requiring verification
Low Confidence | Pairs scoring 50-69% -- possible duplicates that need careful review
Resolved This Month | Number of duplicate pairs resolved (merged or dismissed) in the current month
Auto-Merged | Duplicates resolved automatically by rules you have configured
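The three confidence bands in the table can be expressed as a small lookup. The function name is hypothetical, and the "Not surfaced" label for scores below 50 is an assumption based on the default minimum confidence threshold.

```python
def confidence_band(score: float) -> str:
    """Map a confidence percentage to the dashboard band it falls in."""
    if score >= 90:
        return "High Confidence"
    if score >= 70:
        return "Medium Confidence"
    if score >= 50:
        return "Low Confidence"
    # Assumption: below the default minimum threshold, pairs are hidden.
    return "Not surfaced"
```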

Filtering and Sorting

Use the filter controls at the top of the dashboard to narrow the duplicate list:

  • Entity Type -- Filter by Contacts, Accounts, or Leads.
  • Confidence Range -- Slider to set minimum and maximum confidence thresholds.
  • Detection Method -- Show only pairs detected by a specific method (exact, fuzzy, or AI).
  • Date Detected -- Filter by when the duplicate was first identified.
  • Assigned Reviewer -- Show duplicates assigned to a specific team member for review.

Sort the list by confidence score (highest first is the default), date detected, or record age.

Reviewing Duplicate Pairs

Click on any duplicate pair in the dashboard to open the side-by-side comparison view. This view displays both records with their field values aligned in rows, making it easy to identify differences and decide which values to keep.

Comparison View Layout

  • Left panel -- Record A with all populated fields.
  • Right panel -- Record B with all populated fields.
  • Center column -- Match indicators showing which fields are identical, similar, or conflicting.
  • Bottom bar -- Action buttons for Merge, Dismiss, or Flag for Review.

Field-Level Indicators

Each field row is color-coded:

  • Green -- Values are identical or functionally equivalent.
  • Yellow -- Values are similar but not identical (fuzzy match).
  • Red -- Values conflict and require a manual decision.
  • Gray -- Field is populated in one record but empty in the other.
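One way to think about the color coding is as a four-way classification of each field pair. The sketch below is illustrative only: it uses Python's difflib ratio as a stand-in for the similarity check, and the 0.7 cut-off and the function name are assumptions rather than documented SalesOS behavior.

```python
from difflib import SequenceMatcher
from typing import Optional


def field_indicator(a: Optional[str], b: Optional[str], similar_at: float = 0.7) -> str:
    """Classify a pair of field values as green, yellow, red, or gray."""
    left = (a or "").strip().lower()
    right = (b or "").strip().lower()
    if left == right:
        return "green"   # identical after trimming and lowercasing
    if not left or not right:
        return "gray"    # populated on one side only
    ratio = SequenceMatcher(None, left, right).ratio()
    return "yellow" if ratio >= similar_at else "red"
```

In this sketch "Jon Smith" against "John Smith" comes out yellow, while "Acme" against "Globex" comes out red and needs a manual decision.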

Dismissing False Positives

If two records are not actually duplicates, click Not a Duplicate to dismiss the pair. Dismissed pairs are excluded from future detection runs unless the records are subsequently modified in ways that trigger re-evaluation. You can optionally add a reason for dismissal, which helps train the AI model.

Merge Workflow

When you confirm that two records are duplicates, the merge workflow guides you through consolidating them into a single, complete record.

Selecting the Master Record

The first step is choosing which record becomes the master (surviving) record. SalesOS suggests a master based on:

  • Record completeness -- The record with more populated fields is preferred.
  • Activity recency -- The record with more recent interactions is preferred.
  • Relationship depth -- The record with more associated deals, activities, and notes is preferred.

You can override the suggestion and select either record as the master.
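The suggestion logic can be pictured as ranking the two records on the three criteria above. The article does not say how the criteria are weighted against each other, so this sketch assumes a simple priority order (completeness, then recency, then relationship depth); the Record type and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    record_id: str
    fields: dict
    last_activity: date
    related_count: int  # deals + activities + notes linked to the record


def suggest_master(a: Record, b: Record) -> Record:
    """Prefer the more complete record, then the more recently active, then the better connected."""

    def rank(record: Record) -> tuple:
        populated = sum(1 for value in record.fields.values() if value not in (None, ""))
        return (populated, record.last_activity, record.related_count)

    # max() returns the first argument on a full tie, so record A wins ties.
    return max((a, b), key=rank)
```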

Field-Level Merge Choices

For each field where the two records differ, you choose which value to keep:

  • Keep master value -- Retain the value from the master record.
  • Keep duplicate value -- Overwrite the master's value with the duplicate's value.
  • Keep both -- For multi-value fields (like tags or phone numbers), combine values from both records.
  • Manual entry -- Type a custom value if neither record has the correct information.

Fields where both records have the same value are automatically kept without requiring a choice.
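The four choices can be modeled as a per-field instruction applied on top of the master record. This is a sketch under assumptions: the choice labels ("master", "duplicate", "both", "manual") and the function signature are invented for the example, and "both" expects list values such as tags.

```python
def merge_fields(master: dict, duplicate: dict, choices: dict, manual: dict = None) -> dict:
    """Build the merged field set from two records and per-field choices."""
    manual = manual or {}
    merged = dict(master)
    for name in set(master) | set(duplicate):
        m, d = master.get(name), duplicate.get(name)
        if m == d:
            continue  # identical values are kept without a choice
        choice = choices.get(name, "master")
        if choice == "duplicate":
            merged[name] = d
        elif choice == "both":
            # Combine multi-value fields, keeping order and dropping repeats.
            merged[name] = list(dict.fromkeys(list(m or []) + list(d or [])))
        elif choice == "manual":
            merged[name] = manual[name]
        # choice == "master": nothing to do, the master value is already there
    return merged
```

For example, choosing "duplicate" for the name and "both" for tags turns a master named "Rob Lee" tagged ["vip"] and a duplicate named "Robert Lee" tagged ["vip", "emea"] into a record named "Robert Lee" tagged ["vip", "emea"].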

Relationship Reassignment

All relationships from the duplicate record are transferred to the master:

  • Activities -- Calls, emails, meetings, and notes are reassigned to the master contact or account.
  • Deals -- Opportunities linked to the duplicate are relinked to the master.
  • Tasks -- Open and completed tasks transfer to the master.
  • Files and Attachments -- Documents associated with the duplicate are moved.
  • Custom object associations -- Any custom object linked to the duplicate is reassigned.

Completing the Merge

Click Merge Records to execute the consolidation. The duplicate record is soft-deleted (retained internally for audit purposes) and all its relationships are transferred. A success notification confirms the merge with a link to the merged master record.
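Conceptually, completing a merge does two things: it re-points every relationship from the duplicate to the master, and it marks the duplicate as deleted without removing it. The sketch below shows that idea with plain dictionaries; the data shapes and the "deleted" and "merged_into" keys are assumptions, not the SalesOS data model.

```python
def complete_merge(master_id: str, duplicate_id: str, records: dict, links: list) -> dict:
    """Transfer all links to the master, then soft-delete the duplicate."""
    for link in links:
        if link["record_id"] == duplicate_id:
            link["record_id"] = master_id
    # Soft delete: the duplicate stays in storage for audit and undo.
    records[duplicate_id]["deleted"] = True
    records[duplicate_id]["merged_into"] = master_id
    return records[master_id]
```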

Bulk Merge Operations

For large datasets -- especially after imports -- you may have dozens or hundreds of duplicate pairs to resolve. The bulk merge feature accelerates this process.

Selecting Pairs for Bulk Merge

From the dashboard, check the boxes next to multiple duplicate pairs, or use Select All filtered results. Then click Bulk Merge.

Bulk Merge Options

  • Auto-select master -- Let SalesOS choose the master record for each pair based on the completeness and recency heuristics.
  • Prefer newer record -- Always keep the more recently created record as master.
  • Prefer older record -- Always keep the original record as master.
  • Field resolution strategy -- Choose a default for conflicting fields: prefer master, prefer more complete, or prefer most recent update.
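The two date-based master options above amount to comparing creation dates. A minimal sketch, assuming each record carries a "created" date and using made-up strategy labels:

```python
from datetime import date


def pick_master(pair: tuple, strategy: str) -> dict:
    """Choose the master of a pair by creation date."""
    a, b = pair
    if strategy == "prefer_newer":
        return a if a["created"] >= b["created"] else b
    if strategy == "prefer_older":
        return a if a["created"] <= b["created"] else b
    raise ValueError(f"unknown strategy: {strategy}")
```

The auto-select option is left out here because it depends on the completeness and recency heuristics rather than on a single date comparison.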

Review and Confirm

Before execution, SalesOS displays a summary showing how many pairs will be merged and any pairs where conflicts cannot be auto-resolved. Pairs requiring manual intervention are flagged and excluded from the bulk operation until resolved individually.

Click Execute Bulk Merge to process all selected pairs. A progress bar shows completion status, and a summary report is generated when done.

Prevention Rules

Rather than only cleaning up duplicates after the fact, SalesOS can prevent them from being created in the first place.

Block on Create

Configure rules that check for existing matches before allowing a new record to be saved:

  • Exact email match -- Block creation of a contact if an identical email already exists.
  • Domain match -- Warn when creating an account with a domain that already exists in the system.
  • Name + Company match -- Flag potential duplicates when a contact shares the same name and company as an existing record.

Behavior Options

For each prevention rule, choose the enforcement level:

  • Block -- Prevent the record from being created entirely. The user sees an error with a link to the existing record.
  • Warn -- Display a warning modal showing the potential duplicate, but allow the user to proceed if they confirm the records are distinct.
  • Log Only -- Allow creation but log the potential duplicate for later review in the dashboard.
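Putting rules and enforcement levels together, a create-time check might look like the sketch below. The Rule type, the level labels, and the decision to let the strictest triggered level win are assumptions made for illustration; the article does not describe how SalesOS combines multiple triggered rules.

```python
from dataclasses import dataclass
from typing import Callable

SEVERITY = {"log_only": 1, "warn": 2, "block": 3}


@dataclass
class Rule:
    name: str
    matches: Callable[[dict, dict], bool]  # (new record, existing record) -> bool
    level: str  # "block", "warn", or "log_only"


def evaluate(new: dict, existing: list, rules: list) -> tuple:
    """Return the strictest triggered level ('allow' if none) and the matches found."""
    outcome = "allow"
    hits = []
    for rule in rules:
        for record in existing:
            if rule.matches(new, record):
                hits.append((rule.name, record["id"]))
                if SEVERITY[rule.level] > SEVERITY.get(outcome, 0):
                    outcome = rule.level
    return outcome, hits
```

The list of hits is what a warning modal or error message would use to link to the existing record.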

Import-Time Prevention

When importing records via CSV or integration sync, prevention rules are applied in batch. Records flagged as potential duplicates are routed to a staging area where you can review and decide whether to create, merge, or skip each one.

Merge History and Undo

Every merge operation is recorded in the merge history log, accessible from Settings > Data Quality > Merge History.

Merge History Log

Each entry shows:

  • Date and time of the merge.
  • Performed by -- The user who executed the merge.
  • Master record -- Link to the surviving record.
  • Merged record -- The record that was absorbed.
  • Fields changed -- Summary of which field values were selected.

Undoing a Merge

Within 30 days of a merge, you can undo the operation:

  1. Find the merge in the history log.
  2. Click Undo Merge.
  3. Confirm the restoration.

SalesOS recreates the duplicate record with its original field values and reassigns its relationships back to it. Note that any changes made to the master record after the merge are preserved -- only the merged-in data and relationships are separated back out.

After 30 days, the undo option expires and the merge becomes permanent. This retention window is configurable by administrators under Settings > Data Quality > Retention.
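The undo window is a simple date comparison. In this sketch the function name is hypothetical, and treating a merge that is exactly 30 days old as still undoable is an assumption; the article does not specify the boundary.

```python
from datetime import datetime, timedelta


def can_undo(merged_at: datetime, now: datetime, retention_days: int = 30) -> bool:
    """True while the merge is still inside the retention window."""
    return now - merged_at <= timedelta(days=retention_days)
```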

Configuration Options

Administrators can fine-tune duplicate detection behavior from Settings > Data Quality > Configuration.

Detection Sensitivity

Setting | Description | Default
Minimum confidence threshold | Pairs below this score are not surfaced | 50%
Auto-merge threshold | Pairs at or above this score can be auto-merged | 95%
Fuzzy match sensitivity | Controls how strict the text similarity algorithm is | 70
Detection frequency | How often the background scan runs | Every 6 hours
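For teams that keep a record of their settings alongside other configuration, the defaults in the table can be captured in a small structure. The class and attribute names are hypothetical, and the validation rules (scores between 0 and 100, auto-merge threshold not below the minimum threshold) are reasonable assumptions rather than documented constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionConfig:
    min_confidence: int = 50        # percent
    auto_merge_threshold: int = 95  # percent
    fuzzy_sensitivity: int = 70     # 0-100 similarity score
    scan_interval_hours: int = 6

    def __post_init__(self):
        for name in ("min_confidence", "auto_merge_threshold", "fuzzy_sensitivity"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be between 0 and 100, got {value}")
        if self.auto_merge_threshold < self.min_confidence:
            raise ValueError("auto_merge_threshold cannot be below min_confidence")
        if self.scan_interval_hours <= 0:
            raise ValueError("scan_interval_hours must be positive")
```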

Scope Settings

  • Entity types to scan -- Choose which record types are included in detection (Contacts, Accounts, Leads, or all).
  • Field weights -- Adjust how much each field contributes to the confidence score. For example, increase the weight of email to make email matches dominant.
  • Exclusion rules -- Define patterns to exclude from detection (e.g., generic email domains like gmail.com should not cause account-level matches).
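To see how field weights shift a score, consider a plain weighted average of per-field similarity scores. This is a simplified stand-in, not the AI model described earlier; the function name and the default weight of 1.0 for unlisted fields are assumptions.

```python
def weighted_confidence(field_scores: dict, weights: dict) -> float:
    """Weighted average of per-field scores (each 0-100)."""
    total_weight = sum(weights.get(name, 1.0) for name in field_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights.get(name, 1.0) for name, score in field_scores.items())
    return weighted_sum / total_weight
```

With an email score of 100 and a name score of 60, equal weights give 80.0. Raising the email weight to 3 gives (3 x 100 + 1 x 60) / 4 = 90.0, which is how a heavier email weight makes email matches dominate.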

Notification Settings

  • Notify on high-confidence duplicates -- Send an alert when duplicates scoring above your threshold are detected.
  • Weekly digest -- Receive a weekly summary of unresolved duplicates.
  • Assign reviewers -- Route duplicates to specific team members based on record ownership or entity type.

Auto-Merge Rules

For organizations with high data volume, you can configure automatic merging for pairs that meet strict criteria:

  • Confidence score at or above the auto-merge threshold (default 95%).
  • No conflicting field values (all differences are empty-vs-populated, not value-vs-value).
  • Both records are owned by the same user or team.

Auto-merged records still appear in the merge history log and can be undone within the retention window.
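The three auto-merge criteria can be read as a single eligibility check. In the sketch below the record shape (an "owner" key and a "fields" dictionary) is an assumption, and ownership by the same user or team is simplified to comparing one owner value.

```python
def eligible_for_auto_merge(a: dict, b: dict, confidence: float, threshold: float = 95) -> bool:
    """Apply the three auto-merge criteria to a duplicate pair."""
    if confidence < threshold:
        return False
    if a["owner"] != b["owner"]:
        return False
    for name in set(a["fields"]) | set(b["fields"]):
        left, right = a["fields"].get(name), b["fields"].get(name)
        # A conflict is value-vs-value; empty-vs-populated is allowed.
        if left and right and left != right:
            return False
    return True
```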

Best Practices

  • Start with high-confidence pairs. When first enabling duplicate detection, work through the 90%+ confidence pairs first. These are almost certainly true duplicates and resolving them quickly improves your data quality baseline.

  • Establish field authority rules. Decide which data source is authoritative for each field. For example, if your marketing automation tool is the source of truth for email addresses, configure merge rules to prefer values from records created via that integration.

  • Enable prevention rules for key fields. At minimum, enable a warning on exact email match. This single rule prevents the most common form of duplicate creation -- re-entering a contact whose email already exists.

  • Review and train regularly. The AI model improves when you dismiss false positives and confirm true duplicates. Spending a few minutes each week reviewing medium-confidence pairs trains the system to better match your organization's data patterns.

  • Audit after imports. Run a manual detection scan after every bulk import. Even with import-time prevention enabled, edge cases can slip through, and catching them immediately is easier than finding them weeks later.

  • Use bulk merge judiciously. Bulk merge is powerful, and although individual merges can be undone within the retention window, reversing a large batch is laborious. Start with a small batch (10-20 pairs) to verify that your auto-selection and field resolution settings produce the expected results before processing hundreds of pairs.

  • Document your merge policies. Create a brief internal guide specifying which record type takes priority, how to handle conflicting phone numbers or addresses, and when to escalate to an admin. Consistency across your team prevents the same types of duplicates from recurring.

  • Monitor the weekly digest. Even if your data is currently clean, new duplicates arise from manual entry, integrations syncing overlapping data, or form submissions from existing customers. The weekly digest keeps you aware without requiring daily dashboard visits.