Azure Private Link lets you connect to services in Azure by using a private endpoint. A private endpoint is a private IP address that's accessible only within a specific virtual network and subnet.
This article explains how to set up and use Private Link and private endpoints with the Speech service. This article then describes how to remove private endpoints later, but still use the Speech resource.
Note
Before you proceed, review how to use virtual networks with Azure AI services.
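Setting up an AI Foundry resource for Speech for the private endpoint scenarios requires performing the following tasks:
- Create a custom domain name
- Turn on private endpoints
- Adjust existing applications and solutions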
Private endpoints and Virtual Network service endpoints
Azure provides private endpoints and Virtual Network service endpoints for traffic that tunnels via the private Azure backbone network. The purpose and underlying technologies of these endpoint types are similar. But there are differences between the two technologies. We recommend that you learn about the pros and cons of both before you design your network.
There are a few things to consider when you decide which technology to use:
- Both technologies ensure that traffic between the virtual network and the Speech resource doesn't travel over the public internet.
- A private endpoint provides a dedicated private IP address for your Speech resource. This IP address is accessible only within a specific virtual network and subnet. You have full control of the access to this IP address within your network infrastructure.
- Virtual Network service endpoints don't provide a dedicated private IP address for the Speech resource. Instead, they encapsulate all packets sent to the Speech resource and deliver them directly over the Azure backbone network.
- Both technologies support on-premises scenarios. By default, when they use Virtual Network service endpoints, Azure service resources secured to virtual networks can't be reached from on-premises networks. But you can change that behavior.
- Virtual Network service endpoints are often used to restrict the access for an AI Foundry resource for Speech based on the virtual networks from which the traffic originates.
- For Azure AI services, enabling the Virtual Network service endpoint forces the traffic for all Azure AI Foundry resources to go through the private backbone network. That requires explicit network access configuration. (For more information, see Configure virtual networks and the Speech resource networking settings.) Private endpoints don't have this limitation and provide more flexibility for your network configuration. You can access one resource through the private backbone and another through the public internet by using the same subnet of the same virtual network.
- Private endpoints incur extra costs. Virtual Network service endpoints are free.
- Private endpoints require extra DNS configuration.
- One Speech resource can work simultaneously with both private endpoints and Virtual Network service endpoints.
We recommend that you try both endpoint types before you make a decision about your production design.
For more information, see these resources:
- Azure Private Link and private endpoint documentation
- Virtual Network service endpoints documentation
This article describes how to use private endpoints with the Speech service. The use of Virtual Network service endpoints is described in a separate article.
Create a custom domain name
Caution
An AI Foundry resource for Speech with a custom domain name enabled uses a different way to interact with Speech service. You might have to adjust your application code for both of these scenarios: with private endpoint and without private endpoint.
Follow these steps to create a custom subdomain name for Azure AI services for your Speech resource.
Caution
When you turn on a custom domain name, the operation is not reversible. The only way to go back to the regional name is to create a new Speech resource.
If your Speech resource has a lot of associated custom models and projects created via Speech Studio, we strongly recommend trying the configuration with a test resource before you modify the resource used in production.
To create a custom domain name using the Azure portal, follow these steps:
- Go to the Azure portal and sign in to your Azure account. 
- Select the required Speech resource. 
- In the Resource Management group on the left pane, select Networking. 
- On the Firewalls and virtual networks tab, select Generate Custom Domain Name. A new right panel appears with instructions to create a unique custom subdomain for your resource. 
- In the Generate Custom Domain Name panel, enter a custom domain name. Your full custom domain will look like: https://{your custom name}.cognitiveservices.azure.com. Remember that after you create a custom domain name, it cannot be changed. After you've entered your custom domain name, select Save.
- After the operation finishes, in the Resource management group, select Keys and Endpoint. Confirm that the new endpoint name of your resource starts this way: https://{your custom name}.cognitiveservices.azure.com.
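The final confirmation step can also be scripted. The following is a minimal sketch, assuming you copied the endpoint value from Keys and Endpoint into a string. The helper name endpoint_uses_custom_name is hypothetical and is not part of any Azure SDK.

```python
from urllib.parse import urlparse

# Suffix taken from the custom domain pattern shown in the steps above.
CUSTOM_DOMAIN_SUFFIX = ".cognitiveservices.azure.com"


def endpoint_uses_custom_name(endpoint: str, custom_name: str) -> bool:
    """Return True if `endpoint` is https://{custom_name}.cognitiveservices.azure.com."""
    parsed = urlparse(endpoint)
    expected_host = f"{custom_name}{CUSTOM_DOMAIN_SUFFIX}".lower()
    # urlparse lowercases the host name, so the comparison is case-insensitive.
    return parsed.scheme == "https" and (parsed.hostname or "") == expected_host
```

For example, endpoint_uses_custom_name("https://my-private-link-speech.cognitiveservices.azure.com/", "my-private-link-speech") returns True, while a regional endpoint URL returns False.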
Turn on private endpoints
We recommend using the private DNS zone attached to the virtual network with the necessary updates for the private endpoints. You can create a private DNS zone during the provisioning process. If you're using your own DNS server, you might also need to change your DNS configuration.
Decide on a DNS strategy before you provision private endpoints for a production Speech resource. And test your DNS changes, especially if you use your own DNS server.
Use one of the following articles to create private endpoints. These articles use a web app as a sample resource to make available through private endpoints.
- Create a private endpoint by using the Azure portal
- Create a private endpoint by using Azure PowerShell
- Create a private endpoint by using Azure CLI
Use these parameters instead of the parameters in the article that you chose:
| Setting | Value | 
|---|---|
| Resource type | Microsoft.CognitiveServices/accounts | 
| Resource | <your-speech-resource-name> | 
| Target sub-resource | account | 
DNS for private endpoints: Review the general principles of DNS for private endpoints in Azure AI Foundry resources. Then confirm that your DNS configuration is working correctly by performing the checks described in the following sections.
Resolve DNS from the virtual network
This check is required.
Follow these steps to test the custom DNS entry from your virtual network:
- Sign in to a virtual machine located in the virtual network to which you attached your private endpoint. 
- Open a Windows command prompt or a Bash shell, run nslookup, and confirm that it successfully resolves your resource's custom domain name.
  C:\>nslookup my-private-link-speech.cognitiveservices.azure.com
  Server:  UnKnown
  Address:  168.63.129.16
  Non-authoritative answer:
  Name:    my-private-link-speech.privatelink.cognitiveservices.azure.com
  Address:  172.28.0.10
  Aliases:  my-private-link-speech.cognitiveservices.azure.com
- Confirm that the IP address matches the IP address of your private endpoint. 
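If you prefer to automate this check, the steps above can be sketched in Python with only the standard library. This is an illustrative sketch, not an official tool: the function names are hypothetical, and the private endpoint IP address (172.28.0.10 in the sample output) is something you look up yourself in the private endpoint's settings.

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Resolve `hostname` with the system resolver and return its IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def resolves_to_private_endpoint(addresses: list[str], private_endpoint_ip: str) -> bool:
    """Return True if the private endpoint IP is among the resolved addresses."""
    expected = ipaddress.ip_address(private_endpoint_ip)
    return any(ipaddress.ip_address(address) == expected for address in addresses)


# Example usage, to be run from a machine inside the virtual network:
#   addresses = resolve_ipv4("my-private-link-speech.cognitiveservices.azure.com")
#   print(resolves_to_private_endpoint(addresses, "172.28.0.10"))
```

The resolution and the comparison are kept in separate functions so that the comparison can be checked without network access.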
Resolve DNS from other networks
Perform this check only if you've turned on either the All networks option or the Selected Networks and Private Endpoints access option in the Networking section of your resource.
If you plan to access the resource by using only a private endpoint, you can skip this section.
- Sign in to a computer attached to a network allowed to access the resource. 
- Open a Windows command prompt or Bash shell, run nslookup, and confirm that it successfully resolves your resource's custom domain name.
  C:\>nslookup my-private-link-speech.cognitiveservices.azure.com
  Server:  UnKnown
  Address:  fe80::1
  Non-authoritative answer:
  Name:    vnetproxyv1-weu-prod.westeurope.cloudapp.azure.com
  Address:  13.69.67.71
  Aliases:  my-private-link-speech.cognitiveservices.azure.com
            my-private-link-speech.privatelink.cognitiveservices.azure.com
            westeurope.prod.vnet.cog.trafficmanager.net
Note
The resolved IP address points to a virtual network proxy endpoint, which dispatches the network traffic to the private endpoint for the Speech resource. The behavior is different for a resource with a custom domain name but without private endpoints. See the DNS configuration section later in this article for details.
Adjust an application to use an AI Foundry resource for Speech with a private endpoint
An AI Foundry resource for Speech with a custom domain interacts with the Speech service in a different way. This is true for a custom-domain-enabled Speech resource both with and without private endpoints. Information in this section applies to both scenarios.
Follow instructions in this section to adjust existing applications and solutions to use an AI Foundry resource for Speech with a custom domain name and a private endpoint turned on.
This section explains how to use such a resource with the Speech service REST APIs and the Speech SDK.
Note
An AI Foundry resource for Speech without private endpoints that uses a custom domain name also has a special way of interacting with the Speech service. This way differs from the scenario of an AI Foundry resource for Speech that uses a private endpoint. This is important to consider because you may decide to remove private endpoints later. See Adjust an application to use an AI Foundry resource for Speech without private endpoints later in this article.
Speech resource with a custom domain name and a private endpoint: Usage with the REST APIs
We use my-private-link-speech.cognitiveservices.azure.com as a sample Speech resource DNS name (custom domain) for this section.
Speech service has REST APIs for Speech to text and Text to speech. Consider the following information for the private-endpoint-enabled scenario.
Speech to text has two REST APIs. Each API serves a different purpose, uses different endpoints, and requires a different approach when you're using it in the private-endpoint-enabled scenario.
The Speech to text REST APIs are:
- Speech to text REST API, which is used for Batch transcription and custom speech.
- Speech to text REST API for short audio, which is used for real-time speech to text.
Usage of the Speech to text REST API for short audio and the Text to speech REST API in the private endpoint scenario is the same. It's equivalent to the Speech SDK case described later in this article.
Speech to text REST API uses a different set of endpoints, so it requires a different approach for the private-endpoint-enabled scenario.
The next subsections describe both cases.
Speech to text REST API
Usually, Speech resources use Azure AI services regional endpoints for communicating with the Speech to text REST API. These endpoints have the following naming format:
{region}.api.cognitive.microsoft.com.
This is a sample request URL:
https://westeurope.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions
Note
See this article for Azure Government and Microsoft Azure operated by 21Vianet endpoints.
After you turn on a custom domain for an AI Foundry resource for Speech (which is necessary for private endpoints), that resource will use the following DNS name pattern for the basic REST API endpoint:
{your custom name}.cognitiveservices.azure.com
That means that in our example, the REST API endpoint name is:
my-private-link-speech.cognitiveservices.azure.com
And the sample request URL needs to be converted to:
https://my-private-link-speech.cognitiveservices.azure.com/speechtotext/v3.1/transcriptions
This URL should be reachable from the virtual network with the private endpoint attached, provided that DNS resolution is configured correctly.
After you turn on a custom domain name for an AI Foundry resource for Speech, you typically replace the host name in all request URLs with the new custom domain host name. All other parts of the request (like the path /speechtotext/v3.1/transcriptions in the earlier example) remain the same.
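The host name replacement described above can be sketched as a small helper. This is a minimal illustration that uses only the Python standard library. The function name to_custom_domain_url is hypothetical, and the sample host name is the one used in this section.

```python
from urllib.parse import urlsplit, urlunsplit


def to_custom_domain_url(request_url: str, custom_host: str) -> str:
    """Replace only the host name of `request_url`; path and query stay unchanged."""
    parts = urlsplit(request_url)
    return urlunsplit((parts.scheme, custom_host, parts.path, parts.query, parts.fragment))


regional_url = "https://westeurope.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions"
custom_url = to_custom_domain_url(
    regional_url, "my-private-link-speech.cognitiveservices.azure.com"
)
print(custom_url)
# https://my-private-link-speech.cognitiveservices.azure.com/speechtotext/v3.1/transcriptions
```

Keeping the conversion in one place makes it easier to update every request URL in an application consistently.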
Tip
Some customers develop applications that use the region part of the regional endpoint's DNS name (for example, to send the request to the Speech resource deployed in the particular Azure region).
A custom domain for an AI Foundry resource for Speech contains no information about the region where the resource is deployed. So the application logic described earlier will not work and needs to be altered.
Speech to text REST API for short audio and Text to speech REST API
The Speech to text REST API for short audio and the Text to speech REST API use two types of endpoints:
- Azure AI services regional endpoints for communicating with the Azure AI services REST API to obtain an authorization token
- Special endpoints for all other operations
Note
See this article for Azure Government and Azure operated by 21Vianet endpoints.
The detailed description of the special endpoints and how their URL should be transformed for a private-endpoint-enabled Speech resource is provided in the Construct endpoint URL subsection about usage with the Speech SDK. The same principle described for the SDK applies for the Speech to text REST API for short audio and the Text to speech REST API.
Review that subsection, and then see the following example. The example describes the Text to speech REST API. Usage of the Speech to text REST API for short audio is fully equivalent.
Note
When you're using the Speech to text REST API for short audio and Text to speech REST API in private endpoint scenarios, use a resource key passed through the Ocp-Apim-Subscription-Key header. (See details for Speech to text REST API for short audio and Text to speech REST API.)
Using an authorization token and passing it to the special endpoint via the Authorization header works only if you've turned on the All networks access option in the Networking section of your Speech resource. In other cases, you get either a Forbidden or a BadRequest error when you try to obtain an authorization token.
Text to speech REST API usage example
We use West Europe as a sample Azure region and my-private-link-speech.cognitiveservices.azure.com as a sample Speech resource DNS name (custom domain). The custom domain name my-private-link-speech.cognitiveservices.azure.com in our example belongs to the Speech resource created in the West Europe region.
To get the list of the voices supported in the region, perform the following request:
https://westeurope.tts.speech.microsoft.com/cognitiveservices/voices/list
See more details in the Text to speech REST API documentation.
For the private-endpoint-enabled Speech resource, the endpoint URL for the same operation needs to be modified. The same request looks like this:
https://my-private-link-speech.cognitiveservices.azure.com/tts/cognitiveservices/voices/list
See a detailed explanation in the Construct endpoint URL subsection for the Speech SDK.
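The transformation in this example can be sketched in Python. The sketch is inferred from the single voices-list example above (the host is replaced and the service offering, tts, becomes the first path segment), so treat the general rule as an assumption. The function names and the region pattern in the regular expression are hypothetical. The request is only built, not sent, and it passes the resource key through the Ocp-Apim-Subscription-Key header as recommended in the earlier note.

```python
import re
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Matches regional hosts such as westeurope.tts.speech.microsoft.com.
# The region pattern is an assumption; the offering values come from this article.
_REGIONAL_HOST = re.compile(
    r"^(?P<region>[a-z0-9]+)\.(?P<offering>s2s|stt|tts|voice)\.speech\.microsoft\.com$"
)


def to_private_endpoint_url(regional_url: str, custom_host: str) -> str:
    """Move the service offering from the host name into the path and swap the host."""
    parts = urlsplit(regional_url)
    match = _REGIONAL_HOST.match((parts.hostname or "").lower())
    if match is None:
        raise ValueError(f"not a regional Speech endpoint: {regional_url}")
    path = f"/{match.group('offering')}{parts.path}"
    return urlunsplit((parts.scheme, custom_host, path, parts.query, parts.fragment))


def build_voices_list_request(url: str, speech_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that authenticates with the resource key."""
    return urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": speech_key})


private_url = to_private_endpoint_url(
    "https://westeurope.tts.speech.microsoft.com/cognitiveservices/voices/list",
    "my-private-link-speech.cognitiveservices.azure.com",
)
print(private_url)
# https://my-private-link-speech.cognitiveservices.azure.com/tts/cognitiveservices/voices/list
```

To send the request, you could pass the result of build_voices_list_request to urllib.request.urlopen from a machine that can reach the private endpoint.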
Speech resource with a custom domain name and a private endpoint: Usage with the Speech SDK
Using the Speech SDK with a custom domain name and private-endpoint-enabled Speech resources requires you to review and likely change your application code.
We use my-private-link-speech.cognitiveservices.azure.com as a sample Speech resource DNS name (custom domain) for this section.
Construct endpoint URL
Usually in SDK scenarios (and in the speech to text REST API for short audio and text to speech REST API scenarios), Speech resources use the dedicated regional endpoints for different service offerings. The DNS name format for these endpoints is:
{region}.{speech service offering}.speech.microsoft.com
An example DNS name is:
westeurope.stt.speech.microsoft.com
All possible values for the region (first element of the DNS name) are listed in Speech service supported regions. (See this article for Azure Government and Azure operated by 21Vianet endpoints.) The following table presents the possible values for the Speech service offering (second element of the DNS name):
| DNS name value | Speech service offering | 
|---|---|
| s2s | Speech Translation | 
| stt | Speech to text | 
| tts | Text to speech | 
| voice | Custom voice | 
So the earlier example (westeurope.stt.speech.microsoft.com) stands for a Speech to text endpoint in West Europe.
Private-endpoint-enabled endpoints communicate with Speech service via a special proxy. Because of that, you must change the endpoint connection URLs.
A "standard" endpoint URL looks like:
{region}.{speech service offering}.speech.microsoft.com/{URL path}
A private endpoint URL looks like:
{your custom name}.cognitiveservices.azure.com/{URL path}
The Speech SDK automatically configures the /{URL path} depending on the service used.
Therefore, you need to configure only the base URL ({your custom name}.cognitiveservices.azure.com) as described.
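To make the two DNS name formats concrete, here is a small Python sketch that splits a regional DNS name into its region and service offering and builds the base host name for a custom domain. The names RegionalEndpoint, parse_regional_host, and private_endpoint_base are hypothetical helpers. The offering values are the ones from the table in this section.

```python
from typing import NamedTuple

# DNS name values and offerings, copied from the table in this section.
OFFERINGS = {
    "s2s": "Speech Translation",
    "stt": "Speech to text",
    "tts": "Text to speech",
    "voice": "Custom voice",
}
REGIONAL_SUFFIX = ".speech.microsoft.com"
CUSTOM_SUFFIX = ".cognitiveservices.azure.com"


class RegionalEndpoint(NamedTuple):
    region: str
    offering: str


def parse_regional_host(host: str) -> RegionalEndpoint:
    """Split {region}.{speech service offering}.speech.microsoft.com into its parts."""
    host = host.lower()
    if not host.endswith(REGIONAL_SUFFIX):
        raise ValueError(f"unexpected suffix in {host!r}")
    labels = host[: -len(REGIONAL_SUFFIX)].split(".")
    if len(labels) != 2 or labels[1] not in OFFERINGS:
        raise ValueError(f"unexpected DNS name format: {host!r}")
    return RegionalEndpoint(region=labels[0], offering=labels[1])


def private_endpoint_base(custom_name: str) -> str:
    """Return the base host name for a custom-domain-enabled Speech resource."""
    return f"{custom_name}{CUSTOM_SUFFIX}"


endpoint = parse_regional_host("westeurope.stt.speech.microsoft.com")
print(endpoint.region, OFFERINGS[endpoint.offering])
print(private_endpoint_base("my-private-link-speech"))
```

Note that the custom domain host name carries neither the region nor the offering, which is why code that reads the region from the host name needs another source for that information.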
Modifying applications
Follow these steps to modify your code:
- Determine the application endpoint URL from the Keys and Endpoint menu of your resource in the Azure portal. In this example, it would be my-private-link-speech.cognitiveservices.azure.com.
- Create a SpeechConfig instance by using an endpoint URL:
  - Modify the endpoint that you determined, as described in the earlier Construct endpoint URL section.
  - Modify how you create the instance of SpeechConfig. Most likely, your application is using something like this:
    var config = SpeechConfig.FromSubscription(speechKey, azureRegion);
    This example doesn't work for a private-endpoint-enabled Speech resource because of the host name and URL changes that we described in the previous sections. If you try to run your existing application without any modifications by using the key of a private-endpoint-enabled resource, you get an authentication error (401).
    To make it work, modify how you instantiate the SpeechConfig class and use "from endpoint"/"with endpoint" initialization. Suppose we have the following two variables defined:
    - speechKey contains the key of the private-endpoint-enabled Speech resource.
    - endPoint contains the modified endpoint URL (using the type required by the corresponding programming language). In our example, this variable should contain: wss://my-private-link-speech.cognitiveservices.azure.com
  - Create a SpeechConfig instance:
    C#:
    var config = SpeechConfig.FromEndpoint(endPoint, speechKey);
    C++:
    auto config = SpeechConfig::FromEndpoint(endPoint, speechKey);
    Java:
    SpeechConfig config = SpeechConfig.fromEndpoint(endPoint, speechKey);
    Python:
    import azure.cognitiveservices.speech as speechsdk
    config = speechsdk.SpeechConfig(endpoint=endPoint, subscription=speechKey)
    Objective-C:
    SPXSpeechConfiguration *config = [[SPXSpeechConfiguration alloc] initWithEndpoint:endPoint subscription:speechKey];
    TypeScript:
    import * as sdk from "microsoft.cognitiveservices.speech.sdk";
    const config: sdk.SpeechConfig = sdk.SpeechConfig.fromEndpoint(new URL(endPoint), speechKey);
After this modification, your application should work with the private-endpoint-enabled Speech resources. We're working on more seamless support of private endpoint scenarios.
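As an illustration of the endpoint modification in these steps, the following Python sketch derives the wss:// endpoint string from the value shown in the Azure portal. The helper name to_sdk_endpoint is hypothetical, and the suffix check is an assumption based on the custom domain pattern used throughout this article.

```python
from urllib.parse import urlsplit

CUSTOM_SUFFIX = ".cognitiveservices.azure.com"


def to_sdk_endpoint(portal_endpoint: str) -> str:
    """Turn the portal endpoint (with or without a scheme) into wss://{host}."""
    candidate = portal_endpoint if "://" in portal_endpoint else f"https://{portal_endpoint}"
    host = urlsplit(candidate).hostname
    if not host or not host.endswith(CUSTOM_SUFFIX):
        raise ValueError(f"not a custom domain endpoint: {portal_endpoint!r}")
    return f"wss://{host}"


endPoint = to_sdk_endpoint("https://my-private-link-speech.cognitiveservices.azure.com/")
print(endPoint)
# wss://my-private-link-speech.cognitiveservices.azure.com

# The resulting string is what the steps above pass to the Speech SDK, for example:
#   config = speechsdk.SpeechConfig(endpoint=endPoint, subscription=speechKey)
```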
Speech resource with a custom domain name and without private endpoints: Usage with the Speech SDK
Using the Speech SDK with custom-domain-enabled Speech resources without private endpoints is equivalent to the configuration described with private endpoints in this document.
Use of Speech Studio
Speech Studio is a web portal with tools for building and integrating Azure AI Speech service in your application. When you work in Speech Studio projects, network connections and API calls to the corresponding Speech resource are made on your behalf. Working with private endpoints, virtual network service endpoints, and other network security options can limit the availability of Speech Studio features. You normally use Speech Studio when working with features like custom speech, custom voice, and audio content creation.
Reaching Speech Studio web portal from a Virtual network
To use Speech Studio from a virtual machine within an Azure Virtual network, you must allow outgoing connections to the required set of service tags for this virtual network. See details here.
Access to the Speech resource endpoint doesn't imply access to the Speech Studio web portal. Access to the Speech Studio web portal via private endpoints or Virtual Network service endpoints is not supported.
Working with Speech Studio projects
This section describes working with the different kinds of Speech Studio projects under the different network security options of the Speech resource. It assumes that the web browser can already connect to Speech Studio. You set the network security options of the Speech resource in the Azure portal:
- Go to the Azure portal and sign in to your Azure account.
- Select the Speech resource.
- In the Resource Management group in the left pane, select Networking > Firewalls and virtual networks.
- Select one option from All networks, Selected Networks and Private Endpoints, or Disabled.
Custom speech, Custom voice and Audio Content Creation
The following table describes custom speech/custom voice/audio content creation project accessibility per Speech resource Networking > Firewalls and virtual networks security setting.
Note
If you allow only private endpoints via the Networking > Private endpoint connections tab, then you can't use Speech Studio with the Speech resource. You can still use the Speech resource outside of Speech Studio.
| Speech resource network security setting | Speech Studio project accessibility | 
|---|---|
| All networks | No restrictions | 
| Selected Networks and Private Endpoints | Accessible from allowed public IP addresses | 
| Disabled | Not accessible | 
If you select Selected Networks and Private Endpoints, you see a tab with Virtual networks and Firewall access configuration options. In the Firewall section, you must allow at least one public IP address and use this address for the browser connection with Speech Studio.
If you allow only access via Virtual network, then in effect you don't allow access to the Speech resource through Speech Studio. You can still use the Speech resource outside of Speech Studio.
To use custom speech without relaxing network access restrictions on your production Speech resource, consider one of these workarounds:
- Create another Speech resource for development that can be used on a public network. Prepare your custom model in Speech Studio on the development resource, and then copy the model to your production resource. See the Models_CopyTo REST request with Speech to text REST API.
- You have the option to not use Speech Studio for custom speech. Use the Speech to text REST API for all custom speech operations.
To use custom voice without relaxing network access restrictions on your production Speech resource, consider one of these workarounds:
- Create another Speech resource for development that can be used on a public network. Prepare your custom model in Speech Studio on the development resource, then submit an Azure support ticket to request assistance with copying the model to your production resource.
- You have the option to not use Speech Studio for custom voice. Use the Custom voice REST API directly for all custom voice operations with your production resource.
Adjust an application to use an AI Foundry resource for Speech without private endpoints
In this article, we noted several times that enabling a custom domain for an AI Foundry resource for Speech is irreversible. Such a resource uses a different way of communicating with Speech service, compared to the ones that are using regional endpoint names.
This section explains how to use an AI Foundry resource for Speech with a custom domain name but without any private endpoints with the Speech service REST APIs and Speech SDK. This might be a resource that was once used in a private endpoint scenario, but then had its private endpoints deleted.
DNS configuration
Remember how a custom domain DNS name of the private-endpoint-enabled Speech resource is resolved from public networks. In this case, the IP address resolved points to a proxy endpoint for a virtual network. That endpoint is used for dispatching the network traffic to the private-endpoint-enabled Azure AI Foundry resource.
However, when all resource private endpoints are removed (or right after you enable the custom domain name), the CNAME record of the Speech resource is reprovisioned. It now points to the corresponding Azure AI services regional endpoint.
So the output of the nslookup command looks like this:
C:\>nslookup my-private-link-speech.cognitiveservices.azure.com
Server:  UnKnown
Address:  fe80::1
Non-authoritative answer:
Name:    apimgmthskquihpkz6d90kmhvnabrx3ms3pdubscpdfk1tsx3a.cloudapp.net
Address:  13.93.122.1
Aliases:  my-private-link-speech.cognitiveservices.azure.com
          westeurope.api.cognitive.microsoft.com
          cognitiveweprod.trafficmanager.net
          cognitiveweprod.azure-api.net
          apimgmttmdjylckcx6clmh2isu2wr38uqzm63s8n4ub2y3e6xs.trafficmanager.net
          cognitiveweprod-westeurope-01.regional.azure-api.net
Compare it with the output in the Resolve DNS from other networks section.
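The difference between the two nslookup outputs in this article can be expressed as a simple check on the alias chain. The following Python sketch is only a heuristic based on those two sample outputs: it treats a privatelink alias as a sign that private endpoints are provisioned, and an api.cognitive.microsoft.com alias as a sign that the name resolves to a regional endpoint. The function name and the returned labels are hypothetical.

```python
def classify_dns_aliases(aliases: list[str]) -> str:
    """Classify a DNS alias chain as 'private-endpoint', 'regional', or 'unknown'."""
    names = [alias.lower().rstrip(".") for alias in aliases]
    # A privatelink alias appears when private endpoints exist for the resource.
    if any(".privatelink." in name for name in names):
        return "private-endpoint"
    # Without private endpoints, the chain goes through the regional endpoint.
    if any(name.endswith(".api.cognitive.microsoft.com") for name in names):
        return "regional"
    return "unknown"


with_private_endpoint = [
    "my-private-link-speech.cognitiveservices.azure.com",
    "my-private-link-speech.privatelink.cognitiveservices.azure.com",
    "westeurope.prod.vnet.cog.trafficmanager.net",
]
without_private_endpoint = [
    "my-private-link-speech.cognitiveservices.azure.com",
    "westeurope.api.cognitive.microsoft.com",
    "cognitiveweprod.trafficmanager.net",
]
print(classify_dns_aliases(with_private_endpoint))     # private-endpoint
print(classify_dns_aliases(without_private_endpoint))  # regional
```

You can copy the alias list from nslookup output. Whether a programmatic lookup such as socket.gethostbyname_ex returns the full alias chain depends on the platform and resolver, so verify that before you rely on it.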
Speech resource with a custom domain name and without private endpoints: Usage with the REST APIs
Speech to text REST API
Speech to text REST API usage is fully equivalent to the case of private-endpoint-enabled Speech resources.
Speech to text REST API for short audio and Text to speech REST API
In this case, usage of the Speech to text REST API for short audio and usage of the Text to speech REST API have no differences from the general case, with one exception. (See the following note.) You should use both APIs as described in the Speech to text REST API for short audio and Text to speech REST API documentation.
Note
When you're using the Speech to text REST API for short audio and Text to speech REST API in custom domain scenarios, use an API key passed through the Ocp-Apim-Subscription-Key header. (See details for Speech to text REST API for short audio and Text to speech REST API.)
Using an authorization token and passing it to the special endpoint via the Authorization header works only if you've turned on the All networks access option in the Networking section of your Speech resource. In other cases, you get either a Forbidden or a BadRequest error when you try to obtain an authorization token.
Simultaneous use of private endpoints and Virtual Network service endpoints
You can use private endpoints and Virtual Network service endpoints to access the same Speech resource simultaneously. To enable this simultaneous use, select the Selected Networks and Private Endpoints option in the networking settings of the Speech resource in the Azure portal. Other options aren't supported for this scenario.
Pricing
For pricing details, see Azure Private Link pricing.
