When you host your app with Azure Functions in the Flex Consumption plan, you can control how updates are deployed to running instances. A site update occurs whenever you deploy code, modify application settings, or change other configuration properties. The Flex Consumption plan provides a site configuration setting (`siteUpdateStrategy`) that you can use to control whether your function app experiences downtime during these updates and how in-progress executions are handled.
The Flex Consumption plan currently supports these update strategies:
- Recreate: The platform restarts all running instances after replacing your code with the latest version. This approach might cause brief downtime while instances are recycled, and it matches the default behavior of other Azure Functions hosting plans.
- Rolling update (preview): Provides zero-downtime deployments by draining and replacing instances in batches. In-progress executions complete naturally without forced termination.
Important
The rolling update strategy is currently in preview and isn't recommended for production apps. Review the current limitations and considerations before enabling this strategy in any production app.
Strategy comparison
This table compares the two site update strategies:
| Consideration | Recreate | Rolling update |
|---|---|---|
| Downtime | Brief downtime as your app scales out from zero after the restart | No period of downtime |
| In-progress executions | Forcefully terminated | Allowed to complete within the 60-minute scale-in grace period (HTTP functions limited to 230-second timeout) |
| Speed | Faster - instances are restarted immediately | Slower - instances are updated in batches at regular intervals |
| Backward compatibility | Not necessary as one version runs at a time | Changes must be backward compatible, especially with stateful workloads or breaking changes |
| How to set | Default behavior, consistent with other hosting plans | Opt-in configuration |
| Use when... | ✔ You need fast deployments. ✔ Brief downtime is acceptable. ✔ You're deploying breaking changes and need a clean restart. ✔ Your functions are stateless and can handle interruptions. | ✔ You require zero-downtime deployments. ✔ You have long-running or critical functions that can't be interrupted. ✔ Your changes are backward-compatible. ✔ You must preserve in-progress executions. |
Update strategy behaviors
These steps describe how each strategy applies a site update:

Recreate strategy:

1. A site update (code or configuration changes) is applied to your function app.
2. The recreate strategy is triggered to update running instances with the new changes.
3. The platform forcefully restarts all live and draining instances.
4. The scaling system immediately begins provisioning new instances with the updated version (original instances might still be deprovisioning in the background).

Rolling update strategy:

1. A site update (code or configuration changes) is applied to your function app.
2. The rolling update strategy is triggered to update running instances with the new changes.
3. The platform assigns all live instances to batches.
4. At regular intervals, the platform drains one batch of instances. Draining prevents instances from accepting new events while allowing in-progress executions to complete (up to the one-hour maximum execution time).
5. Simultaneously, the scaling platform provisions new instances running the updated version to replace the draining capacity.
6. This process continues until all live instances are running the updated version.
These are the key characteristics of each strategy:

Recreate strategy:

- Brief downtime: Your app is unavailable while instances restart and scale out.
- Execution interruption: In-progress executions are terminated immediately.
- No completion signal: Monitor instance logs to track when original instances stop emitting logs.

Rolling update strategy:

- Zero downtime: Deployments are done in batches so that executions complete without forced termination.
- Asynchronous operations: Draining and scale-out happen simultaneously without waiting for each other to complete. The scale-out isn't guaranteed to occur before the next drain interval.
- Overlapping updates: You can initiate additional rolling updates while one is in progress. All non-latest instances are drained, and only the newest version is scaled out.
- Dynamic scaling: The platform adjusts instance count based on current demand during the update.
- Platform-managed capacity: When demand increases, the platform provisions more instances than it drains. When demand decreases, it creates only the necessary instances to meet current needs. This approach ensures continuous availability while optimizing resource usage.
Rolling update strategy considerations
Keep these current behaviors and limitations in mind when using the rolling update strategy. This list is maintained during the preview period and could change as the feature approaches general availability (GA).
- Platform-managed parameters: The platform controls the parameters (such as batch count, instances per batch, number of batches, and drain intervals) that determine rolling update behaviors. These parameters might change before GA to optimize performance and reliability.
- No real-time monitoring: There's currently no visibility into how many instances are draining, how many batches remain, or current progress percentages.
- No completion signal: The platform doesn't emit a signal when an update finishes. You can monitor instance logs to estimate when an update completes (a sample query follows this list).
- Single-instance scenarios: Apps running on one instance experience brief downtime similar to recreate, though in-progress executions still complete.
- Durable Functions: Because mixing versions during updates can cause unexpected behavior in a Durable orchestration, use an explicit orchestration version match strategy.
- Infrastructure as Code: Deploying code and configuration changes together triggers multiple rolling updates that might overlap.
- Backward compatibility: Make sure that your changes work with the previous version during the rolling update transition period.
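Because there's no built-in progress or completion signal, instance logs are your main source of insight. As a rough, best-effort sketch (not an official progress API), the following Application Insights query lists the instances that emitted logs recently; after a deployment, the update has most likely finished once none of the original instance IDs appear in the results. It assumes your functions log frequently enough that an active instance shows up within the look-back window.

```kusto
// Best-effort check: which instances emitted logs in the last five minutes?
// An instance that is running but idle or not logging won't appear here.
traces
| where timestamp > ago(5m)
| where cloud_RoleInstance != ""
| summarize LastSeen = max(timestamp), TraceCount = count() by InstanceId = cloud_RoleInstance
| order by LastSeen desc
```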
Configure your update strategy
You can set the update strategy for your app by using the `siteUpdateStrategy` site setting, which is a child of `functionAppConfig`. By default, `siteUpdateStrategy.type` is set to `Recreate`. Currently, only Bicep and ARM templates with API version 2023-12-01 or later support changing this property.
```bicep
functionAppConfig: {
  ...
  siteUpdateStrategy: {
    type: 'RollingUpdate'
  }
  ...
}
```
Changes to the site update strategy take effect at the next site update. For example, the site update that changes `type` from `Recreate` to `RollingUpdate` is itself applied by using the recreate strategy. All subsequent site updates then use rolling updates.
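For context, this is a minimal sketch of where that fragment sits in a complete Flex Consumption function app definition. The resource name, runtime, and scale values are illustrative placeholders, and the `flexPlan`, `storageAccount`, and `location` symbols are assumed to be defined elsewhere in your template; only the `siteUpdateStrategy` block is specific to this feature.

```bicep
// Illustrative only: a Flex Consumption function app that opts in to rolling updates.
resource functionApp 'Microsoft.Web/sites@2023-12-01' = {
  name: 'my-flex-function-app'   // placeholder name
  location: location
  kind: 'functionapp,linux'
  properties: {
    serverFarmId: flexPlan.id    // assumed Flex Consumption plan resource
    functionAppConfig: {
      runtime: {
        name: 'dotnet-isolated'  // placeholder runtime and version
        version: '8.0'
      }
      scaleAndConcurrency: {
        maximumInstanceCount: 100
        instanceMemoryMB: 2048
      }
      deployment: {
        storage: {
          type: 'blobContainer'
          value: '${storageAccount.properties.primaryEndpoints.blob}deployments'  // assumed storage account
          authentication: {
            type: 'SystemAssignedIdentity'
          }
        }
      }
      siteUpdateStrategy: {
        type: 'RollingUpdate'
      }
    }
  }
}
```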
Monitoring site updates
During the public preview, there's no built-in completion signal for site updates. You can use KQL queries in Application Insights as a best-effort approach to estimate rolling update progress.
Monitoring rolling update progress
These KQL queries provide a best-effort estimate of rolling update progress by tracking instance turnover in Application Insights logs. This approach has significant limitations and shouldn't be relied upon for production automation:
```kusto
// Rolling update completion check
let deploymentStart = datetime('2025-10-30T19:00:00Z'); // Set to your deployment start time
let checkInterval = 10s; // How often you run this query
let buffer = 30s; // Safety buffer for instance detection
//
// Get original instances (active before deployment)
let originalInstances =
    traces
    | where timestamp between ((deploymentStart - buffer) .. deploymentStart)
    | where cloud_RoleInstance != ""
    | summarize by InstanceId = cloud_RoleInstance;
//
// Get currently active instances
let currentInstances =
    traces
    | where timestamp >= now() - checkInterval
    | where cloud_RoleInstance != ""
    | summarize by InstanceId = cloud_RoleInstance;
//
// Check completion status
currentInstances
| join kind=leftouter (originalInstances | extend IsOriginal = true) on InstanceId
| extend IsOriginal = isnotnull(IsOriginal)
| summarize
    OriginalStillActiveInstances = make_set_if(InstanceId, IsOriginal),
    NewInstances = make_set_if(InstanceId, not(IsOriginal)),
    OriginalStillActiveCount = countif(IsOriginal),
    NewCount = countif(not(IsOriginal)),
    TotalOriginal = toscalar(originalInstances | count)
| extend
    RollingUpdateComplete = iff(OriginalStillActiveCount == 0, "YES", "NO"),
    PercentComplete = round(100.0 * (1.0 - todouble(OriginalStillActiveCount) / todouble(TotalOriginal)), 1)
| project RollingUpdateComplete, PercentComplete, OriginalStillActiveCount, NewCount
```
How to use this query for estimation:
- Paste this query in the Logs blade of the Application Insights resource associated with your function app.
- Set `deploymentStart` to the timestamp when your site update returns success.
- Run the query periodically to estimate progress. Set the polling interval to be at least as long as your average function execution time, and ensure the `checkInterval` variable in the query matches this polling frequency.
- The query returns approximate values:
  - `RollingUpdateComplete`: Best estimate of whether all original instances are replaced
  - `PercentComplete`: Estimated percentage of original instances that are replaced
  - `OriginalStillActiveCount`: Estimated number of original instances still running
  - `NewCount`: Number of new instances currently active
Keep these limitations in mind when using these queries:
- Timing gap: The `deploymentStart` time represents when your site update returns success, but the actual rolling update might not start immediately. During this gap, any scale-out events provision instances running the original version. Because the query only tracks instances active at `deploymentStart`, it doesn't monitor these new original-version instances, potentially causing false completion signals.
- Log-based detection: This approach relies on application logs to infer instance state rather than directly querying instance status. Instances might be running but not actively logging, leading to false completion signals when original instances are still active but didn't emit logs within the `checkInterval` window.
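One way to partially compensate for these limitations is to look at instance turnover over time instead of a single point-in-time snapshot. The following query is a best-effort sketch with the same log-based caveats; it bins the Application Insights traces by minute so you can watch old instance IDs fade out as new ones appear during the update window.

```kusto
// Best-effort timeline of instance turnover around a deployment.
// Each row shows the distinct instances that emitted logs during that minute.
traces
| where timestamp > ago(1h)
| where cloud_RoleInstance != ""
| summarize ActiveInstanceCount = dcount(cloud_RoleInstance),
            Instances = make_set(cloud_RoleInstance)
    by bin(timestamp, 1m)
| order by timestamp asc
```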
Recommendation for production: Use rolling updates when zero-downtime deployments are critical. Ensure your deployment pipelines don't require waiting for update completion before proceeding to subsequent steps. Use recreate when you need faster, more predictable update timing and can tolerate brief downtime.
FAQ
I'm used to deployment slots for zero downtime deployments. How do rolling updates differ?
- Unlike deployment slots, rolling updates require no additional infrastructure. Set `siteUpdateStrategy.type` to `RollingUpdate` for zero-downtime deployments.
- Rolling updates preserve in-progress executions, while deployment slots terminate them during swaps. Certain site properties and sticky settings can't be swapped and require modifying the production slot directly.
- Unlike deployment slots, rolling updates don't provide a separate environment for you to canary test changes or route a percentage of live traffic to. If you need these features, use a plan that supports deployment slots, like Elastic Premium, or manage separate Flex Consumption apps behind a traffic manager.
How do I roll back a site update?
- There's currently no feature to roll back a site update. If a rollback is necessary, initiate another site update with the previous state of code or configuration.
How are timer triggers handled?
- Timer triggers maintain their singleton nature. After the instance running a timer-triggered function is marked for draining, new timer executions run on an instance with the latest version.
I'm seeing runtime errors during the rolling update. What went wrong?
- If new instances fail to start or encounter runtime errors, the issue is likely in the application code, dependencies, configuration settings, or environment variables that you modified.
- To resolve the issue, redeploy your last known healthy version to restore the runtime. Then test your proposed changes in a development or staging environment before reattempting. Review error logs to identify what specific change caused the issue.