D365 Data Migration: What the Docs Skip
After 50+ implementations, here's what actually works for migrating data into Dataverse — the tools, the phases, the gotchas with lookups and deduplication, and the mistakes I still see teams make.
Every Dynamics 365 implementation has a data migration. Every single one. And it’s always underestimated. The project plan says “data migration — 2 weeks” like it’s one task. It’s not. It’s a project inside the project, and if you get it wrong, go-live slips or — worse — you go live with bad data that nobody trusts.
I’ve done this more than fifty times across different industries and Dynamics modules. Here’s what I’ve learned.
The Migration Phases
Data migration is not “load the CSV.” It’s a pipeline with distinct phases, and skipping any of them will cost you.
Extract → Transform → Validate → Load → Verify
That’s it. Five phases. Every migration follows this pattern, whether you’re moving 500 contacts from a spreadsheet or 12 million transactions from a legacy ERP.
Extract
Get the data out of the source system. This sounds simple until you realize the source system is a 15-year-old SQL Server database with 400 tables, no documentation, and column names like CUST_FLG_2.
Rules for extraction:
- Export to CSV or SQL staging tables. Not Excel — Excel corrupts data types silently. A phone number like `0044123456` becomes `44123456` the moment Excel touches it.
- Include every column, even ones you think you won’t need. You’ll be asked about them later.
- Record row counts for every table. You’ll use these for verification at the end.
- Timestamp the extract. If migration takes two weeks, you need to know exactly when the data was pulled.
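The row-count and timestamp rules are easy to automate at extract time. A minimal Python sketch (the table names, file name, and sqlite3 stand-in for the source database are all illustrative):

```python
import csv
import sqlite3
from datetime import datetime, timezone

def write_extract_manifest(conn, tables, path="extract_manifest.csv"):
    """Record a row count and a pull timestamp for every extracted table."""
    pulled_at = datetime.now(timezone.utc).isoformat()
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["table", "row_count", "pulled_at_utc"])
        for table in tables:
            count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            writer.writerow([table, count, pulled_at])

# In-memory stand-in for the legacy source database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?)", [(1,), (2,), (3,)])
write_extract_manifest(conn, ["customers"])
```

The manifest becomes the baseline for the Verify phase: if the counts at the end don’t match this file, something got dropped along the way.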
Transform
This is where you reshape the source data to match Dataverse’s schema. It’s always the hardest phase.
Common transformations:
- Field mapping. Source column `CUST_NM` maps to `name` on the Account table. Build a mapping document. Every field, every table. No exceptions.
- Lookup resolution. The source system stores a customer type as a string, “Premium.” Dataverse stores it as an option set integer value, `100000001`. You need to map every single value.
- Date formats. The source uses `MM/DD/YYYY`. Dataverse expects ISO 8601. Or the source has dates like `00/00/1900` for “unknown” — Dataverse won’t accept that.
- Concatenation and splitting. Full name in one field needs to become first name + last name. Or three address lines need to become a composite address.
- Currency conversion. Dataverse currency fields require a transaction currency ID. If your source doesn’t track currency, you need to assign one.
I do all transformation in a staging database. SQL Server. Not in the migration tool itself. The migration tool’s job is to load clean, transformed data — not to do the heavy lifting of data reshaping.
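To make the transform phase concrete, here’s the shape of one row-level transform, sketched in Python for illustration (in practice this logic lives in staging SQL; the column names, option set values, and sentinel date are assumptions):

```python
from datetime import datetime

# Hypothetical mapping table — in practice this comes from the mapping document
CUSTOMER_TYPE_MAP = {"Premium": 100000001, "Standard": 100000000}

def transform_row(row):
    """Reshape one source row into the Dataverse-ready shape."""
    out = {}
    out["name"] = row["CUST_NM"].strip()
    # Option set: every source string must resolve, or the row gets flagged
    out["customertypecode"] = CUSTOMER_TYPE_MAP[row["CUST_TYPE"]]
    # Dates: MM/DD/YYYY -> ISO 8601; treat the "unknown" sentinel as null
    raw = row["CREATED_DT"]
    out["createdon"] = (
        None if raw == "00/00/1900"
        else datetime.strptime(raw, "%m/%d/%Y").date().isoformat()
    )
    return out

print(transform_row({"CUST_NM": " Contoso ", "CUST_TYPE": "Premium",
                     "CREATED_DT": "03/15/2019"}))
# {'name': 'Contoso', 'customertypecode': 100000001, 'createdon': '2019-03-15'}
```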
Validate
Before you load anything, validate the transformed data against Dataverse’s constraints.
- Required fields populated?
- String lengths within Dataverse column limits?
- Option set values match exactly?
- Lookup targets exist in the destination? (You can’t create a Contact with an Account lookup if that Account hasn’t been loaded yet.)
- No duplicate records based on your deduplication key?
- Dates within Dataverse’s acceptable range (January 1, 1753 through December 31, 9999)?
Write SQL queries against your staging database that flag every violation. Fix them before loading. Finding data problems during the load phase means restarting the load, and that burns hours.
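Here’s the shape of one such flagging query, run through sqlite3 standing in for the staging database (the table, columns, and the 160-character limit are made up for the example — use your actual Dataverse column metadata):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_account (legacy_id TEXT, name TEXT)")
conn.executemany("INSERT INTO staging_account VALUES (?, ?)", [
    ("A1", "Contoso"),
    ("A2", None),           # violation: required field missing
    ("A3", "X" * 200),      # violation: exceeds the (assumed) 160-char limit
])

# Flag every row that would fail a Dataverse constraint before the load starts
violations = conn.execute("""
    SELECT legacy_id,
           CASE WHEN name IS NULL THEN 'missing required name'
                ELSE 'name exceeds 160 chars' END AS problem
    FROM staging_account
    WHERE name IS NULL OR LENGTH(name) > 160
""").fetchall()

print(violations)
# [('A2', 'missing required name'), ('A3', 'name exceeds 160 chars')]
```

Build one of these per constraint, keep them in source control, and re-run the whole set before every load.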
Load
Actually importing data into Dataverse. This is where your tool choice matters. More on tools below.
Verify
After loading, verify:
- Row counts. Source table had 45,000 records → target entity should have 45,000 records.
- Spot checks. Pull 20 random records. Compare every field between source and destination.
- Lookup integrity. Every lookup field actually resolves to the correct parent record.
- Business logic. If there are plugins or flows that fire on create, did they produce the expected results? Or did you disable them during migration (you should have) and now need to run catch-up logic?
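The row-count check is mechanical enough to script. A minimal sketch, comparing counts recorded at extract time against counts queried from Dataverse (the numbers here are hypothetical):

```python
def verify_counts(source_counts, target_counts):
    """Return every table whose loaded row count doesn't match the extract."""
    return {
        table: (src, target_counts.get(table, 0))
        for table, src in source_counts.items()
        if target_counts.get(table, 0) != src
    }

mismatches = verify_counts(
    {"account": 45000, "contact": 120000},   # from the extract manifest
    {"account": 45000, "contact": 119987},   # queried from Dataverse post-load
)
print(mismatches)  # {'contact': (120000, 119987)}
```

A non-empty result means records were dropped or rejected somewhere, and you find out now instead of from a confused user.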
Reference Data vs. Transactional Data
This distinction drives your entire migration sequence.
Reference data is the stuff everything else points to: currencies, business units, teams, users, territories, price lists, product catalogs, tax codes. This gets loaded first. Always. If it’s not there, every transactional record with a lookup to it will fail.
Transactional data is the operational stuff: accounts, contacts, opportunities, cases, orders, invoices, activities. This goes second, and the load order matters because of parent-child relationships.
A typical load order for a Sales migration:
- Currencies, business units, teams
- Users (mapped to Azure AD / Entra ID accounts)
- Territories, price lists, products
- Accounts
- Contacts (lookup to Account)
- Opportunities (lookup to Account, Contact)
- Opportunity Products (lookup to Opportunity, Product)
- Quotes, Orders, Invoices
- Activities — emails, phone calls, tasks (lookup to regarding record)
Get this order wrong and you’ll spend the whole day staring at lookup resolution failures.
The Tools
Here’s where I have opinions.
Data Import Wizard (Built-in)
The wizard built into Dynamics 365. You upload a CSV, map columns, and click Import.
When to use it: Small datasets under a few thousand records. One-off reference data loads. When a power user needs to import their own data and you’re not going to be around.
When to avoid it: Anything over 10,000 records. Anything with complex lookup resolution. Anything that needs to be repeatable.
The wizard has terrible error handling. When a record fails, you get a vague “this record could not be processed” error in a system view that’s buried three clicks deep. Finding out why it failed requires detective work. For bulk operations, this is unacceptable.
Microsoft’s documentation makes it sound like the Import Wizard is a real migration tool. It’s not. It’s a convenience feature.
KingswaySoft SSIS Integration Toolkit
This is what I use for almost every migration. KingswaySoft makes SSIS (SQL Server Integration Services) components specifically for Dynamics 365 and Dataverse. You build data flows in Visual Studio using the SSIS designer.
When to use it: Any migration of meaningful size. Complex transformations. Repeatable loads (dev → test → UAT → production). Migrations from SQL databases, other CRMs, flat files — anything.
Why I prefer it:
- Lookup resolution is built in. You configure a lookup component and it resolves text values to GUIDs automatically. “Contoso” → the GUID of the Contoso Account record. It caches the results, so it’s fast.
- Error handling is excellent. Failed records go to an error output with the actual Dataverse error message. You can redirect them to a table or file and fix them.
- Performance. It supports multi-threaded writes. I’ve loaded millions of records in hours, not days.
- Repeatable. The same SSIS package runs against Dev, Test, UAT, and Prod. Just change the connection string.
- Upsert support. It can match on alternate keys and create-or-update in one pass. This is critical for delta migrations.
The downside: It requires Visual Studio with SSIS tools, which means a developer machine. Non-technical people can’t use it. The learning curve for SSIS is real — if you’ve never built a data flow package, budget time for ramping up.
KingswaySoft isn’t free. The license costs money. It’s worth every penny. I’ve seen teams try to save a few hundred dollars by building custom console apps instead, and then spend ten times that amount in developer hours debugging their homegrown import logic.
XrmToolBox — Data Import Manager
XrmToolBox has several data migration plugins. The Data Import Manager by Aiden Kaskela is the one worth knowing about.
When to use it: Mid-size migrations where SSIS is overkill but the built-in wizard is too limited. Quick one-off loads during development. When you need a visual tool but don’t want to install Visual Studio.
It handles CSV/Excel imports with field mapping, lookup resolution, and decent error logging. It’s not as powerful as KingswaySoft for complex scenarios, but it covers 70% of what most people need.
Dataverse SDK / Web API (Custom Code)
Writing a C# console app (or a .NET tool) that uses the Dataverse SDK (Microsoft.PowerPlatform.Dataverse.Client) to push records through the API.
When to use it: When you need total control. Complex conditional logic during the load (“if this field is X, create two records instead of one”). Migrations that involve calling external APIs or doing lookups against non-Dataverse sources mid-flight.
When to avoid it: When a tool like KingswaySoft already does what you need. Custom code means custom bugs.
If you go this route, use ExecuteMultipleRequest for batching. Don’t create records one at a time — the round-trip overhead will destroy your throughput. Batch sizes of 100-200 work well. Above that, you risk hitting the 2-minute execution timeout.
For very large datasets, use the Elastic Tables Bulk API or the CreateMultiple / UpdateMultiple messages available since 2023. These are significantly faster than ExecuteMultiple because they bypass some of the per-record pipeline overhead.
```csharp
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Messages;

var request = new ExecuteMultipleRequest
{
    Requests = new OrganizationRequestCollection(),
    Settings = new ExecuteMultipleSettings
    {
        ContinueOnError = true,  // don't abort the batch on one bad record
        ReturnResponses = true   // needed to inspect per-record results
    }
};

foreach (var entity in batch)
{
    request.Requests.Add(new CreateRequest { Target = entity });
}

var response = (ExecuteMultipleResponse)service.Execute(request);

// Check response.IsFaulted and iterate response.Responses
// for individual record errors
```
Power Automate
When to use it for migration: Almost never.
Power Automate is great for ongoing integrations. It is terrible for bulk data migration. The throughput is low. The concurrency limits are tight. Error handling for bulk operations is primitive — if record 4,500 out of 50,000 fails, good luck finding it in the flow run history.
I’ve seen consultants build migration flows in Power Automate because “the client doesn’t have a developer.” This always ends badly. The flow runs for 19 hours, hits the daily action limit, migrates half the data, and fails silently on the rest.
If you genuinely can’t use SSIS or custom code, Power Automate with a manual trigger and Apply to each on a SharePoint list can work for very small datasets (under 5,000 records). But at that point, the Import Wizard is probably easier.
Handling Lookups and Relationships
Lookups are the number one source of migration failures. A lookup field stores a GUID reference to another record. Your source data doesn’t have GUIDs — it has names, codes, or IDs from the legacy system.
Strategy 1: Alternate Keys
Create an alternate key on the target table using the legacy system’s ID. When you import, match on the alternate key instead of the GUID. KingswaySoft supports this natively. With the SDK, you use UpsertRequest with a KeyAttributeCollection.
This is my preferred approach. It’s clean, it’s fast, and it makes delta migrations trivial.
Strategy 2: Pre-built Lookup Map
Before loading, query Dataverse for all existing records in the lookup target table. Build a dictionary: legacy ID → Dataverse GUID. Use it during the transform phase to stamp the correct GUID onto every record.
This works but gets unwieldy when you have 15 lookup fields per record pointing to 8 different tables.
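The dictionary itself is trivial; the part worth getting right is failing loudly on ambiguous keys instead of silently picking one. A sketch (field names and the shortened GUIDs are illustrative):

```python
def build_lookup_map(records, legacy_field, guid_field):
    """Build legacy ID -> Dataverse GUID, failing loudly on duplicate keys."""
    mapping = {}
    for rec in records:
        key = rec[legacy_field]
        if key in mapping:
            raise ValueError(f"duplicate legacy key: {key}")
        mapping[key] = rec[guid_field]
    return mapping

# Records as they'd come back from a query against the loaded Account table
accounts = [
    {"legacy_id": "C-001", "accountid": "guid-contoso"},
    {"legacy_id": "C-002", "accountid": "guid-fabrikam"},
]
lookup = build_lookup_map(accounts, "legacy_id", "accountid")

# Transform phase: stamp the resolved GUID onto each child record
contact_row = {"legacy_account": "C-002"}
contact_row["parentcustomerid"] = lookup[contact_row["legacy_account"]]
print(contact_row["parentcustomerid"])  # guid-fabrikam
```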
Strategy 3: Name Matching
Match on the Name field. “Contoso Ltd” in the source must match “Contoso Ltd” in Dataverse exactly. Case-sensitive. No trailing spaces.
This breaks constantly. Someone typed “Contoso Ltd.” with a period in the source. Or there are two records named “Contoso Ltd” and the tool picks the wrong one. Use name matching only for reference data with guaranteed unique names (like countries or currencies).
The Dependency Order Problem
If you’re loading Accounts that have a Parent Account lookup, you have a circular dependency. Account A is the parent of Account B, but both are in the same import file.
Solutions:
- Two-pass load. First pass: create all Accounts without the Parent Account field. Second pass: update the Parent Account lookup using an alternate key match.
- Topological sort. Order the records so parents come before children. Only works if the hierarchy is acyclic.
Two-pass is simpler and more reliable. I use it every time.
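The two-pass split is just a partitioning of the source rows. A sketch of the idea (field names are illustrative; pass 2 would run as an upsert matching on the alternate key):

```python
def split_two_pass(accounts):
    """Pass 1: create every Account without the self-referencing lookup.
    Pass 2: update only the rows that need Parent Account stamped on."""
    pass1 = [{k: v for k, v in a.items() if k != "parent_legacy_id"}
             for a in accounts]
    pass2 = [{"legacy_id": a["legacy_id"],
              "parent_legacy_id": a["parent_legacy_id"]}
             for a in accounts if a.get("parent_legacy_id")]
    return pass1, pass2

accounts = [
    {"legacy_id": "A", "name": "Parent Co", "parent_legacy_id": None},
    {"legacy_id": "B", "name": "Child Co", "parent_legacy_id": "A"},
]
creates, parent_updates = split_two_pass(accounts)
print(len(creates), len(parent_updates))  # 2 1
```

Because pass 1 creates every record, pass 2 never hits a missing parent, regardless of the order rows appear in the file.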
Deduplication Strategies
If the source data has duplicates (it does — every legacy system has duplicates), you need to decide: clean them before migration or after?
Before is always better. Deduplication after migration means dealing with merged records, re-pointed lookups, lost activities, and confused users.
Identifying Duplicates
Define a match rule. Common ones:
| Entity | Match Fields | Notes |
|---|---|---|
| Account | Name + City + State | Watch for abbreviations (“St.” vs “Street”) |
| Contact | Email address | Most reliable for contacts |
| Contact | First name + Last name + Company | Catches contacts without email |
| Lead | Email or Company + Name | Leads are the messiest — expect 10-20% duplicates |
Run the match rule in SQL against your staging database. Flag duplicates. Have the business decide which record to keep — the one with the most recent activity date is usually right, but not always.
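The match rule is a `GROUP BY ... HAVING` query. Here’s the email rule for contacts, run through sqlite3 standing in for the staging database (table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_contact (legacy_id TEXT, email TEXT)")
conn.executemany("INSERT INTO staging_contact VALUES (?, ?)", [
    ("C1", "jane@contoso.com"),
    ("C2", "jane@contoso.com"),   # duplicate under the email match rule
    ("C3", "bob@fabrikam.com"),
])

# Flag every email that appears more than once; the business picks the survivor
dupes = conn.execute("""
    SELECT email, COUNT(*) AS n
    FROM staging_contact
    WHERE email IS NOT NULL
    GROUP BY email
    HAVING COUNT(*) > 1
""").fetchall()

print(dupes)  # [('jane@contoso.com', 2)]
```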
Dataverse Duplicate Detection
Dataverse has built-in duplicate detection rules. I disable them during migration and enable them afterward. Here’s why:
During migration, duplicate detection fires on every create and either blocks the record or creates a duplicate detection job. Both are bad. Blocking records means your batch job errors out. Creating detection jobs means you have 50,000 pending “potential duplicate” records to review after migration — and nobody will review them.
Clean the data before it goes in. Turn off duplicate detection during the load. Turn it back on for ongoing data entry.
Data Quality Issues You Will Hit
After doing this for 14 years, here’s my list of things that go wrong in every single migration:
Phone numbers. They’re stored as free-text in every legacy system. You’ll see (555) 123-4567, 555.123.4567, 5551234567, +1-555-123-4567, and call Jenny all in the same column. Decide on a format and normalize everything during transform.
Email addresses. Invalid ones everywhere. john@companycom, n/a, blank spaces, multiple addresses crammed into one field. Validate with a regex during the transform phase. Move invalid ones to a notes field or a custom “legacy email” text field.
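The phone and email cleanups above can be sketched in a few lines. This is a deliberately loose email pattern (the minimum-digit threshold and the regex are judgment calls, not a standard):

```python
import re

def normalize_phone(raw):
    """Strip everything but digits; treat free-text junk as no phone number."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) >= 7 else None

# Loose check: something@something.something, no whitespace
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def valid_email(raw):
    return bool(EMAIL_RE.match(raw.strip()))

print(normalize_phone("(555) 123-4567"))  # 5551234567
print(normalize_phone("call Jenny"))      # None
print(valid_email("john@companycom"))     # False
```

Records that fail these checks go to the “legacy email” field or a notes field, as described above, rather than getting silently dropped.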
Addresses. Free-text nightmare. “123 Main St, Suite 200” in one field vs. structured street/city/state/zip in Dataverse. Parsing addresses programmatically is unreliable. If the data set is small enough, have someone fix them manually. For large sets, use a best-effort parser and accept some will be wrong.
Status fields. The source system has 12 custom statuses. Dataverse has Active/Inactive plus a Status Reason with 4 values. You need a mapping table for every status value. The business will argue about these mappings for days. Start the conversation early.
Dates with no timezone. Legacy date fields rarely store timezone. Dataverse stores everything in UTC. If the source system is in Eastern Time, every date needs to be adjusted. Miss this and all your dates are 4-5 hours off.
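The timezone adjustment is one line once you interpret the naive timestamp in the source system’s zone. A sketch using Python’s `zoneinfo` (the source zone is an assumption you must confirm per system):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def to_utc_iso(naive_local, source_tz="America/New_York"):
    """Interpret a timezone-naive legacy timestamp in the source system's
    local zone, then convert it to the UTC value Dataverse stores."""
    local = naive_local.replace(tzinfo=ZoneInfo(source_tz))
    return local.astimezone(ZoneInfo("UTC")).isoformat()

# 9 AM Eastern on a January date (EST, UTC-5) lands at 14:00 UTC
print(to_utc_iso(datetime(2024, 1, 15, 9, 0)))  # 2024-01-15T14:00:00+00:00
```

Note that using an IANA zone (not a fixed offset) also gets daylight saving transitions right, which a blanket “add 5 hours” does not.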
Null vs. empty string. In Dataverse, null and empty string are different things. Some migration tools treat them the same. This causes subtle bugs — like a search filter for “email is not empty” returning records that have an empty string but not null.
Rich text / HTML fields. If you’re migrating notes or description fields, check for HTML encoding. Angle brackets, ampersands, special characters — they all need escaping or they’ll render as broken HTML in the Dataverse form.
Performance Considerations
For small migrations (under 100,000 records total), performance doesn’t matter much. Use whatever tool you want.
For large migrations (millions of records), performance is the difference between a 4-hour migration window and a “we need to push go-live.”
Throughput Numbers
Rough guidelines based on my experience:
| Method | Throughput (records/hour) | Notes |
|---|---|---|
| Import Wizard | 5,000–10,000 | Single-threaded, no tuning possible |
| KingswaySoft SSIS | 50,000–200,000 | Depends on thread count and record complexity |
| SDK ExecuteMultiple | 30,000–80,000 | Batch size and thread count matter |
| SDK CreateMultiple | 100,000–300,000+ | Newer, faster, fewer round trips |
| Power Automate | 1,000–5,000 | Throttled, API limits, not designed for this |
These are rough numbers. Actual throughput depends on record complexity (how many lookups to resolve), plugins and flows firing on the target table, and your Dataverse instance’s API limits.
Speeding Things Up
Disable plugins and flows on target tables during migration. This is the single biggest performance improvement you can make. A table with three plugins and two Power Automate flows will import five times slower than the same table with everything disabled.
Document exactly what you disabled. Re-enable everything after migration. Miss one and you’ll have silent production bugs.
Disable duplicate detection rules. Already mentioned, but it’s a performance point too. Each rule adds processing time per record.
Disable auditing on tables being migrated. Auditing writes to the audit log for every create/update. On a million-record import, that’s a million audit log entries you don’t need.
Increase API limits if possible. Dataverse enforces API protection limits (5,000+ requests per user per 5 minutes for enterprise licenses). For migration, use a dedicated application user (service principal) with higher limits. Contact Microsoft support for temporary limit increases on large migrations.
Run during off-hours. Dataverse is a shared service. Running a heavy migration during business hours affects other users. The 2 AM migration window exists for a reason.
Parallelize carefully. Multiple threads improve throughput, but too many threads trigger throttling (HTTP 429 responses). Start with 4-8 threads and increase until you see 429s, then back off. KingswaySoft handles retry-on-429 automatically. With custom code, you need to implement exponential backoff yourself.
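The backoff schedule for custom code is simple to get right up front. A sketch (the base, cap, and retry count are starting points, not prescriptions; when Dataverse returns a `Retry-After` header, prefer that value over the computed delay):

```python
import random

def backoff_delays(max_retries=5, base=2.0, cap=60.0, jitter=False):
    """Exponential backoff schedule for HTTP 429 retries: base * 2^attempt,
    capped, with optional jitter to avoid thundering-herd retries."""
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            delay += random.uniform(0, 1)  # spread retries across threads
        delays.append(delay)
    return delays

print(backoff_delays())  # [2.0, 4.0, 8.0, 16.0, 32.0]
```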
Rollback Planning
Things go wrong. You need a plan for undoing a bad migration.
Option 1: Backup and restore. Take a Dataverse backup before migration starts. If the migration is catastrophically bad, restore the entire environment. This is nuclear — it rolls back everything, not just the migrated data. Only use it when you’re migrating into a clean environment.
Option 2: Delete and reload. If the migrated data is identifiable (tagged with a specific import batch ID or created-by user), you can bulk-delete it and start over. Add a custom field like Migration Batch to every table you’re migrating. Set it during import. If you need to roll back, delete all records where Migration Batch = "Batch-2026-04-04".
Option 3: Compensating updates. For delta migrations where you’re updating existing records, capture the before-state in your staging database. If the migration is bad, run an update to restore the original values.
My approach: I always use Option 2 for initial loads and Option 3 for delta updates. I’ve never had to restore a full environment backup for a migration, and I’d like to keep it that way.
The Migration Runs: Dev, Test, Rehearsal, Production
Never run the migration once. Run it at least four times:
- Dev migration. Test your packages, find the obvious errors. Data quality is terrible. Everything breaks. This is expected.
- Test migration. Fix the errors from dev. Run again. Fewer errors. Validate with the business team.
- Rehearsal migration. Run the exact production migration plan, step by step, against a copy of production. Time everything. This tells you how long the real migration will take and whether your migration window is long enough.
- Production migration. Execute the plan. No improvising.
The rehearsal is non-negotiable. I’ve seen teams skip it because “we already tested in dev and test.” Then production migration takes three times longer than expected because the data volumes are different, the plugins behave differently with production security roles, and the lookup resolution hits records that don’t exist in lower environments.
What I Wish Microsoft Would Fix
The documentation for data migration in Dataverse is scattered and incomplete. There’s no single “here’s how to do a proper data migration” guide. You get a page about the Import Wizard, a page about the SDK, and nothing about the actual methodology.
The Import Wizard error messages are still terrible in 2026. “A record with this value already exists” — which value? Which field? The error should include the field name, the value, and the duplicate record ID. It doesn’t.
The API throttling documentation gives you the limits but doesn’t tell you what realistic throughput to expect. Every team has to discover their actual throughput through trial and error.
And the duplicate detection rules interface hasn’t been meaningfully updated in a decade. You can define rules, but there’s no way to preview matches before publishing. You publish the rule, run it, and hope the match criteria are right.
My Standard Migration Toolkit
For every migration project, this is my setup:
- KingswaySoft SSIS Integration Toolkit — primary load tool
- SQL Server — staging database for transforms and validation
- XrmToolBox — Data Import Manager for quick one-off loads, SQL 4 CDS for querying Dataverse with SQL syntax during verification
- Excel — only for the mapping document and for the business team to review sample data. Never as a data transport format
- A custom SQL validation script library — reusable queries that check for orphaned lookups, missing required fields, duplicate matches, and data type violations
If you’re starting a Dynamics 365 implementation and data migration is on the plan, get KingswaySoft licensed in the first week. Not the last week. Build your SSIS packages early, iterate on them through the project, and by the time you’re doing the production rehearsal, you’ll have packages that work. Teams that wait until the last month to start migration work are the ones who delay go-live.
Related articles
Writing Dataverse Plugins in C#: What the SDK Docs Leave Out
When Power Automate isn't fast enough and business rules aren't powerful enough, you write a plugin. Here's how to build, debug, and deploy server-side logic in Dataverse — with real code and real opinions.
Dynamics 365 Solution Design: Stop Putting Everything in One Solution
How you structure your Dynamics 365 solutions determines how painful your deployments will be. Here's a layered approach that actually scales.
5 Plugin Errors You'll Hit in Dataverse (and What's Actually Wrong)
Exact error messages, root causes, and fixes for the five Dataverse plugin errors you will hit sooner or later. Save yourself the 2am debugging session.