Hello GREGOIRE MANSIO,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand that you would like the Azure Synapse Copy Data activity to ignore the internal GUID column (id_dwh) when writing to the sink.
Three proven patterns are listed below; choose the one that best fits your use case:
Pattern A (Mapping Data Flow Upsert):
Use a Mapping Data Flow where your source (JSON) feeds a Derived Column that sets id_dwh = uuid(), then write to the Synapse sink with Upsert keyed on client_id. This works because uuid() generates a GUID per row inside Data Flows, avoiding Synapse’s restriction on default expressions, and you explicitly map only the columns you intend to write (including id_dwh), so the service won’t attempt to insert NULLs into the GUID column. See Data Flow sink/mapping capabilities and the uuid() function reference.
Pattern B (Staging table + MERGE):
Keep your Copy activity simple by loading JSON into a staging table that mirrors the source (no GUID column), then run a Script/Stored Procedure that upserts into the target with MERGE on client_id. On inserts, generate the GUID inline rather than with a default, for example:
MERGE dbo.Target AS T
USING dbo.Staging AS S
    ON T.client_id = S.client_id
WHEN MATCHED THEN
    UPDATE SET T.col1 = S.col1, T.col2 = S.col2
WHEN NOT MATCHED BY TARGET THEN
    INSERT (client_id, col1, col2, id_dwh)
    VALUES (S.client_id, S.col1, S.col2, NEWID());
MERGE is GA for Synapse Dedicated SQL pools and is the recommended approach for upsert logic; consider hash distribution for performance when using WHEN NOT MATCHED [BY TARGET]. Also update statistics after loads for optimal plans.
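As a rough sketch of the hash distribution and statistics advice above, the target could be declared as a hash-distributed table and its statistics refreshed after each load. The column types and the choice of client_id as the distribution column are assumptions for illustration only; adapt them to your real schema.

```sql
-- Sketch only: the column types below are assumed, not taken from your schema.
-- Hash-distribute the target on the MERGE join key (client_id).
CREATE TABLE dbo.Target
(
    client_id INT              NOT NULL,
    col1      NVARCHAR(100)    NULL,
    col2      NVARCHAR(100)    NULL,
    id_dwh    UNIQUEIDENTIFIER NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(client_id),
    CLUSTERED COLUMNSTORE INDEX
);

-- After each load/MERGE, refresh statistics so the optimizer sees current row counts.
UPDATE STATISTICS dbo.Target;
```

If the target table already exists with a different distribution, you would need to rebuild it (for example with CREATE TABLE AS SELECT) rather than run the CREATE TABLE above.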
Pattern C (Copy-only with a view or explicit mapping):
If you must stay with Copy Upsert only, create a view over the target table that omits id_dwh, point the sink to that view, and key on client_id. This prevents the service from pushing NULLs into the GUID column; afterwards, run a quick Script to backfill id_dwh for new rows:
UPDATE dbo.Target SET id_dwh = NEWID() WHERE id_dwh IS NULL;
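For the view approach described above, a minimal sketch could look like the following. The view name dbo.Target_NoGuid is hypothetical, and col1/col2 stand in for your real columns as in the MERGE example.

```sql
-- Sketch only: dbo.Target_NoGuid is a hypothetical name.
-- The view exposes every target column except id_dwh, so the Copy sink never sees the GUID column.
CREATE VIEW dbo.Target_NoGuid
AS
SELECT client_id, col1, col2
FROM dbo.Target;
```

Note that the backfill step only works if id_dwh allows NULLs on the base table, since rows written through the view arrive without a value for it.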
Alternatively, keep the sink on the base table but set explicit column mappings that exclude id_dwh to avoid the “column count mismatch”/NULL insert behavior. These workarounds address two Synapse realities: (1) DEFAULT expressions like NEWID() are not supported in Dedicated SQL pools (only constants), and (2) Copy/PolyBase paths require careful mapping when sink and source schemas differ.
I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.
Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.