Realtime API doesn’t stream response.function_call_arguments.delta events

Question


Daria Smirnova 0

Description:

We are using the gpt-4o-mini-realtime-preview and gpt-realtime models via the Azure OpenAI Realtime API (tested with API versions 2024-10-01-preview and 2025-04-01-preview). When streaming responses with a function call, we only receive response.text.delta events — but never any response.function_call_arguments.delta events.

We also tested the newer API version, which only returned error events and no deltas at all. This prevents us from receiving function call arguments in real time.

Expected behavior:

Both response.text.delta and response.function_call_arguments.delta events should stream during a Realtime session, as described in the OpenAI Realtime API documentation.

What we’ve tried:

• Tested across different API versions (2024-10-01-preview, 2025-04-01-preview).

• Verified session creation and permissions (all succeed).

• Confirmed function call definition format matches the latest documentation.

We suspect this limitation is on Azure's side: we ran the same tests against a direct OpenAI Realtime connection and received all the expected deltas.

One more finding: we also could not receive conversation.item.input_audio_transcription events containing the user speech transcript. As in the previous test, a direct OpenAI Realtime connection had no such issue.

2 answers


Answer 1

Hello Daria Smirnova,

It appears that the Azure OpenAI Realtime API is not streaming response.function_call_arguments.delta events for you, even though it does stream response.text.delta events, with models such as gpt-4o-mini-realtime-preview and gpt-realtime. You also note that this works when connecting to OpenAI’s Realtime API directly but not through Azure, and that you cannot receive conversation.item.input_audio_transcription events for user speech transcripts through Azure either.

Based on your description and troubleshooting steps:

• This behavior suggests there is a limitation or a gap in the current Azure OpenAI Realtime API implementation regarding streaming of function_call_arguments.delta events and possibly audio transcription events.

• Since it works as expected on OpenAI’s direct API, and not via Azure, the issue is likely with Azure’s current API layer or model deployment in their service.

• You’ve already verified API versions, permissions, and function call definitions, which rules out most client-side misconfigurations.

Recommended Steps:

• Report the issue via Azure support channels or the feedback form in the Azure portal, referencing your use case and the fact that it works directly with OpenAI.

• Monitor the official Azure OpenAI Service documentation and release notes for updates on real-time API capabilities.

• Meanwhile, if streaming function call arguments is critical, you may need to consider using the direct OpenAI API (if possible within your deployment requirements) until Azure’s API adds or fixes this streaming functionality.
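When reporting the issue, it may help to attach a tally of the event types actually received in a session. The following is only a minimal sketch: it assumes the server events have already been decoded into plain Python dicts (for example from raw WebSocket JSON), and the helper name summarize_event_types and the EXPECTED tuple are hypothetical, not part of any SDK.

```python
from collections import Counter

# Event types the question expects to see during a session (illustrative list).
EXPECTED = (
    "response.text.delta",
    "response.function_call_arguments.delta",
)


def summarize_event_types(events, expected=EXPECTED):
    """Count events by their "type" field and list expected types never seen.

    `events` is any iterable of dicts, each a decoded server event.
    Returns (counts, missing), where counts is a Counter and missing is a list.
    """
    counts = Counter(event.get("type", "<missing type>") for event in events)
    missing = [etype for etype in expected if counts[etype] == 0]
    return counts, missing
```

Running this over the events captured from an Azure session and from a direct OpenAI session gives two summaries that can be compared side by side in a support request.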

This looks like a service-side limitation rather than something you can resolve from your end. Please accept the answer if you find it helpful.

Best Regards,

Jerald Felix

Answer 2

Hi Daria,

You’re observing this behavior because OpenAI and Azure OpenAI use the same models (e.g., gpt-4o-mini-realtime-preview) but have different deployment configurations, leading to differences in event behavior.
Below is example code that returns response.function_call_arguments.delta events using gpt-4o-mini-realtime-preview:

import asyncio
from openai import AsyncAzureOpenAI
from azure.identity.aio import DefaultAzureCredential, get_bearer_token_provider

async def main() -> None:
    # Authenticate using Entra ID
    credential = DefaultAzureCredential()
    token_provider = get_bearer_token_provider(
        credential, "https://cognitiveservices.azure.com/.default"
    )

    # Initialize Azure OpenAI Realtime client
    client = AsyncAzureOpenAI(
        azure_endpoint="<AI-FOUNDRY-ENDPOINT>",
        azure_ad_token_provider=token_provider,
        api_version="2024-10-01-preview",
    )

    # Connect to Realtime API session
    # (on Azure, `model` is the name of your model deployment)
    async with client.beta.realtime.connect(
        model="gpt-4o-mini-realtime-preview"
    ) as connection:
        
        # Configure session with function/tool definition
        await connection.session.update(
            session={
                "modalities": ["text"],
                "tools": [
                    {
                        "type": "function",
                        "name": "get_weather",
                        "description": "Get the current weather for a location",
                        "parameters": {
                            "type": "object",
                            "properties": {
                                "location": {
                                    "type": "string",
                                    "description": "City name"
                                },
                                "unit": {
                                    "type": "string",
                                    "enum": ["celsius", "fahrenheit"]
                                }
                            },
                            "required": ["location"]
                        }
                    }
                ]
            }
        )

        # Send hardcoded user message that should trigger function call
        await connection.conversation.item.create(
            item={
                "type": "message",
                "role": "user",
                "content": [
                    {"type": "input_text", "text": "What's the weather in San Francisco?"}
                ],
            }
        )

        # Request response
        await connection.response.create()

        # Stream events - only print function_call_arguments.delta
        async for event in connection:
            if event.type == "response.function_call_arguments.delta":
                print(event.delta, end="", flush=True)
            elif event.type == "response.done":
                print("\n--- Done ---")
                break

    await credential.close()

if __name__ == "__main__":
    asyncio.run(main())

Below is an example of the events returned by the server when the above script is executed:

SessionCreatedEvent(
    event_id='event_001',
    session=Session(
        id='sess_001',
        instructions='[SYSTEM_INSTRUCTIONS]',
        model='gpt-4o-mini-realtime-preview-2024-12-17',
        modalities=['audio', 'text'],
        voice='alloy',
        temperature=0.8,
        tools=[]
    ),
    type='session.created'
)

SessionUpdatedEvent(
    event_id='event_002',
    session=Session(
        id='sess_001',
        modalities=['text'],
        tools=[
            Tool(
                name='get_weather',
                description='Get the current weather for a location',
                parameters={
                    'type': 'object',
                    'properties': {
                        'location': {'type': 'string', 'description': 'City name'},
                        'unit': {'type': 'string', 'enum': ['celsius', 'fahrenheit']}
                    },
                    'required': ['location']
                }
            )
        ]
    ),
    type='session.updated'
)

ConversationItemCreatedEvent(
    event_id='event_003',
    item=ConversationItem(
        id='item_001',
        type='message',
        role='user',
        content=[ConversationItemContent(type='input_text', text="What's the weather in San Francisco?")],
        status='completed'
    ),
    type='conversation.item.created'
)

ResponseCreatedEvent(
    event_id='event_004',
    response=RealtimeResponse(
        id='resp_001',
        status='in_progress',
        output=[]
    ),
    type='response.created'
)

ResponseOutputItemAddedEvent(
    event_id='event_005',
    response_id='resp_001',
    output_index=0,
    item=ConversationItem(
        id='item_002',
        type='function_call',
        name='get_weather',
        call_id='call_001',
        arguments='',
        status='in_progress'
    ),
    type='response.output_item.added'
)

ConversationItemCreatedEvent(
    event_id='event_006',
    previous_item_id='item_001',
    item=ConversationItem(
        id='item_002',
        type='function_call',
        name='get_weather',
        call_id='call_001',
        status='in_progress'
    ),
    type='conversation.item.created'
)

ResponseFunctionCallArgumentsDeltaEvent(
    event_id='event_007',
    response_id='resp_001',
    item_id='item_002',
    output_index=0,
    call_id='call_001',
    delta='{"',
    type='response.function_call_arguments.delta'
)

# [Additional delta events for: location, ":", San,  Francisco, ", unit, ":", c, elsius, "}]

ResponseFunctionCallArgumentsDoneEvent(
    event_id='event_008',
    response_id='resp_001',
    item_id='item_002',
    output_index=0,
    call_id='call_001',
    name='get_weather',
    arguments='{"location":"San Francisco","unit":"celsius"}',
    type='response.function_call_arguments.done'
)

ResponseOutputItemDoneEvent(
    event_id='event_009',
    response_id='resp_001',
    output_index=0,
    item=ConversationItem(
        id='item_002',
        type='function_call',
        name='get_weather',
        call_id='call_001',
        arguments='{"location":"San Francisco","unit":"celsius"}',
        status='completed'
    ),
    type='response.output_item.done'
)

ResponseDoneEvent(
    event_id='event_010',
    response=RealtimeResponse(
        id='resp_001',
        status='completed',
        usage=RealtimeResponseUsage(
            input_tokens=172,
            output_tokens=21,
            total_tokens=193
        )
    ),
    type='response.done'
)

Also note that:
The server response.text.delta event is returned when the model-generated text is updated.

The server response.function_call_arguments.delta event is returned when the model-generated function call arguments are updated.
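Since the arguments arrive as string fragments, a client usually has to buffer them until the matching response.function_call_arguments.done event. The sketch below shows one possible way to do that. It assumes events are plain dicts carrying the type, call_id, delta and arguments fields seen in the sample output above; the class name FunctionCallAccumulator is hypothetical. With the SDK's typed event objects you would read the same fields as attributes instead.

```python
import json
from collections import defaultdict


class FunctionCallAccumulator:
    """Buffers streamed argument fragments per call_id until the call is done."""

    def __init__(self):
        self._buffers = defaultdict(list)

    def handle(self, event):
        """Feed one server event (a dict).

        Returns (call_id, parsed_arguments) when a function call completes,
        otherwise None.
        """
        etype = event.get("type")
        if etype == "response.function_call_arguments.delta":
            # Each delta is a fragment of the JSON-encoded arguments string.
            self._buffers[event["call_id"]].append(event["delta"])
            return None
        if etype == "response.function_call_arguments.done":
            call_id = event["call_id"]
            fragments = self._buffers.pop(call_id, [])
            # Prefer the full string on the done event; fall back to the joined deltas.
            raw = event.get("arguments") or "".join(fragments)
            return call_id, json.loads(raw)
        return None
```

In the event loop of the script above, each event could be passed to handle() and the function invoked once a non-None result comes back.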

Here is the relevant documentation:
https://free.blessedness.top/en-us/azure/ai-foundry/openai/realtime-audio-reference#server-events

Feel free to accept this as an answer.
Thank you for reaching out to the Microsoft Q&A portal.
