Hi,
Thanks for reaching out to Microsoft Q&A.
It seems your issue is a single DP Agent bottleneck under concurrent, high-volume JDBC loads from Azure Databricks to SAP Datasphere (DSP). When the agent is overloaded, it disconnects for about 10 minutes, causing cascading task chain failures.
Recommendations:
- Scale DP Agents: Deploy multiple DP Agents (ideally 1 per DSP space or space group) to distribute load and remove the single point of failure.
- Throttle Databricks concurrency: Reduce Spark write parallelism using coalesce() or repartition() and limit the number of concurrent JDBC writers (see the JDBC write sketch after this list).
- Use JDBC batching: Set a batch size of 1,000 to 5,000 rows and disable auto-commit to reduce transaction overhead on the DP Agent.
- Add retry and backoff logic: Implement exponential backoff retries (for example 30s -> 90s -> 180s) in your orchestration to handle temporary disconnects gracefully (see the retry sketch after this list).
- Stagger task chains: Schedule chain start times to avoid overlapping execution across spaces.
- Improve monitoring: Track DP Agent CPU, memory, and connection counts; alert on rising load before disconnections occur.
- Long-term: Decouple Databricks from DSP by staging data in ADLS and letting DSP import from there, which gives you scalability and resilience (see the ADLS staging sketch below).
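
As a minimal sketch of the throttled, batched JDBC write, assuming `df` is the DataFrame you are loading: the endpoint, schema/table name, secret scope, and the SAP HANA JDBC driver class are placeholders you would replace with your own values, and the partition count and batch size should be tuned against your DP Agent's capacity.

```python
# Throttled, batched JDBC write from Databricks (placeholder endpoint/credentials).
# numPartitions caps the number of concurrent JDBC connections the write opens;
# batchsize groups rows per round trip to cut transaction overhead on the DP Agent.
jdbc_url = "jdbc:sap://<dsp-host>:<port>"   # placeholder DSP/HANA Cloud endpoint

(df
 .repartition(4)                            # limit concurrent writer tasks to 4
 .write
 .format("jdbc")
 .option("url", jdbc_url)
 .option("dbtable", "TARGET_SCHEMA.TARGET_TABLE")
 .option("user", dbutils.secrets.get("my-scope", "dsp-user"))
 .option("password", dbutils.secrets.get("my-scope", "dsp-password"))
 .option("driver", "com.sap.db.jdbc.Driver")  # assumes the SAP HANA JDBC driver is attached
 .option("batchsize", 5000)                 # 1,000-5,000 rows per batch
 .option("numPartitions", 4)                # hard cap on parallel JDBC connections
 .mode("append")
 .save())
```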
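For the retry and backoff logic, a simple Python sketch; the `load_fn` callable and the 30s/90s/180s schedule are illustrative, not an existing API:

```python
import time

def write_with_backoff(load_fn, delays=(30, 90, 180)):
    """Retry a load with backoff.

    load_fn -- callable that performs one JDBC load attempt (hypothetical).
    delays  -- seconds to wait before each retry; re-raises after the last one.
    """
    for attempt, delay in enumerate([0, *delays]):
        if delay:
            time.sleep(delay)                 # back off before retrying
        try:
            return load_fn()
        except Exception as err:              # narrow to JDBC/connection errors in practice
            if attempt == len(delays):
                raise                         # retries exhausted: let the task chain fail visibly
            print(f"Load attempt {attempt + 1} failed ({err}); retrying...")

# Usage (hypothetical load function):
# write_with_backoff(lambda: run_jdbc_load(df))
```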
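And for the long-term decoupling, the Databricks side would simply land files in ADLS and let DSP import them on its own schedule, so spikes in Databricks parallelism never reach the DP Agent. The storage account, container, and path below are placeholders:

```python
# Stage the data in ADLS Gen2 instead of writing to DSP directly (placeholder path).
staging_path = "abfss://staging@<storage-account>.dfs.core.windows.net/dsp/target_table/"

(df
 .write
 .mode("overwrite")
 .parquet(staging_path))
```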
Expected Outcome: This approach stabilizes DSP ingestion, isolates failures, and restores reliability for SAC dashboards with minimal architectural disruption.
Please 'Upvote' (thumbs-up) and 'Accept as Answer' if the reply was helpful. This will benefit other community members who face the same issue.