Transcriptions - Transcribe

Service: Azure AI Services

API Version: 2024-11-15

Synchronous transcription of an audio file.

POST {endpoint}/speechtotext/transcriptions:transcribe?api-version=2024-11-15

URI Parameters

Name	In	Required	Type	Description
audio	formData	True	file (binary)	The content of the audio file to be transcribed. The audio must be shorter than 2 hours, and the file must be smaller than 250 MB.
definition	formData		string	Metadata for a transcription request. This field contains a JSON-serialized object of type `TranscribeDefinition`.
endpoint	path	True	string	Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com).
api-version	query	True	string	The requested API version.
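
The limits on `audio` can be checked before uploading. The following is a minimal sketch and not part of the API: the function name `audio_limit_problems` is hypothetical, and reading "250 MB" as decimal megabytes is an assumption (the stricter interpretation).

```python
# Limits taken from the `audio` parameter description above.
MAX_BYTES = 250 * 1000 * 1000   # assumption: "250 MB" means decimal megabytes
MAX_SECONDS = 2 * 60 * 60       # 2 hours


def audio_limit_problems(size_bytes, duration_seconds):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    # The description says "smaller than" and "shorter than", so the
    # limits themselves are treated as already too large.
    if size_bytes >= MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; must be smaller than {MAX_BYTES}")
    if duration_seconds >= MAX_SECONDS:
        problems.append(f"audio is {duration_seconds} s; must be shorter than {MAX_SECONDS} s")
    return problems


print(audio_limit_problems(size_bytes=1_000_000, duration_seconds=90.0))
```

The duration has to come from whatever audio tooling is already in use; this sketch only compares numbers.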

Request Header

Media Types: "multipart/form-data"

Name	Required	Type	Description
Ocp-Apim-Subscription-Key	True	string	Provide your Cognitive Services account key here.
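
As a rough illustration of how the form fields and the header fit together, the sketch below assembles the multipart request with Python's standard library without sending it. The helper name `build_transcribe_request`, the default file name, and the `application/octet-stream` part type are assumptions. The structure of `TranscribeDefinition` is not described in this section, so `definition` is passed through as an opaque dictionary.

```python
import json
import urllib.request
import uuid

API_VERSION = "2024-11-15"


def build_transcribe_request(endpoint, key, audio, definition=None, filename="audio.wav"):
    """Build (but do not send) the multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []

    # Required `audio` form field carrying the raw file bytes.
    audio_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(audio_head.encode("utf-8") + audio + b"\r\n")

    # Optional `definition` form field: a JSON-serialized object.
    if definition is not None:
        definition_head = (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="definition"\r\n\r\n'
        )
        parts.append(
            definition_head.encode("utf-8")
            + json.dumps(definition).encode("utf-8")
            + b"\r\n"
        )

    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    url = (
        f"{endpoint.rstrip('/')}/speechtotext/transcriptions:transcribe"
        f"?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; on success the response body should be a `TranscribeResult` JSON document.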

Responses

Name	Type	Description
200 OK	TranscribeResult	OK
Other Status Codes	Error	An error occurred.

Security

Ocp-Apim-Subscription-Key

Provide your Cognitive Services account key here.

Type: apiKey
In: header

Examples

Transcribe an audio file

Sample request

HTTP

POST {endpoint}/speechtotext/transcriptions:transcribe?api-version=2024-11-15

Sample response

Status code: 200

{
  "durationMilliseconds": 2000,
  "combinedPhrases": [
    {
      "text": "Weather"
    }
  ],
  "phrases": [
    {
      "offsetMilliseconds": 40,
      "durationMilliseconds": 320,
      "text": "Weather",
      "words": [
        {
          "text": "weather",
          "offsetMilliseconds": 40,
          "durationMilliseconds": 320
        }
      ],
      "locale": "en-US",
      "confidence": 0.78983736
    }
  ]
}
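
A response of this shape can be read with any JSON parser. The sketch below parses the sample above and pulls out the transcript and the word timings; the helper names `full_transcript` and `word_spans` are hypothetical.

```python
import json

# The sample response shown above, condensed onto fewer lines.
SAMPLE = """
{
  "durationMilliseconds": 2000,
  "combinedPhrases": [{"text": "Weather"}],
  "phrases": [
    {
      "offsetMilliseconds": 40,
      "durationMilliseconds": 320,
      "text": "Weather",
      "words": [
        {"text": "weather", "offsetMilliseconds": 40, "durationMilliseconds": 320}
      ],
      "locale": "en-US",
      "confidence": 0.78983736
    }
  ]
}
"""


def full_transcript(result):
    """Join the per-channel transcripts into one string."""
    return " ".join(item["text"] for item in result.get("combinedPhrases", []))


def word_spans(result):
    """Return (text, start_ms, end_ms) per word; `words` may be absent."""
    spans = []
    for phrase in result.get("phrases", []):
        for word in phrase.get("words", []):
            start = word["offsetMilliseconds"]
            spans.append((word["text"], start, start + word["durationMilliseconds"]))
    return spans


result = json.loads(SAMPLE)
print(full_transcript(result))
print(word_spans(result))
```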

Definitions

Name	Description
ChannelCombinedPhrases	The full transcript per channel.
DetailedErrorCode	Detailed error code enum.
Error	An error returned by the service.
ErrorCode	High-level error codes.
InnerError	Inner error carrying a detailed error code; can be nested.
Phrase	A transcribed phrase.
TranscribeResult	The result of the transcribe operation.
Word	Time-stamped word in the display form.

ChannelCombinedPhrases

Object

The full transcript per channel.

Name	Type	Description
channel	integer (int32)	The 0-based channel index. Only present if channel separation is enabled.
text	string	The transcribed text.

DetailedErrorCode

Enumeration

DetailedErrorCode

Value	Description
InvalidParameterValue	Invalid parameter value.
InvalidRequestBodyFormat	Invalid request body format.
EmptyRequest	Empty request.
MissingInputRecords	Missing input records.
InvalidDocument	Invalid document.
ModelVersionIncorrect	Model version incorrect.
InvalidDocumentBatch	Invalid document batch.
UnsupportedLanguageCode	Unsupported language code.
DataImportFailed	Data import failed.
InUseViolation	In use violation.
InvalidLocale	Invalid locale.
InvalidBaseModel	Invalid base model.
InvalidAdaptationMapping	Invalid adaptation mapping.
InvalidDataset	Invalid dataset.
InvalidTest	Invalid test.
FailedDataset	Failed dataset.
InvalidModel	Invalid model.
InvalidTranscription	Invalid transcription.
InvalidPayload	Invalid payload.
InvalidParameter	Invalid parameter.
EndpointWithoutLogging	Endpoint without logging.
InvalidPermissions	Invalid permissions.
InvalidPrerequisite	Invalid prerequisite.
InvalidProductId	Invalid product ID.
InvalidSubscription	Invalid subscription.
InvalidProject	Invalid project.
InvalidProjectKind	Invalid project kind.
InvalidRecordingsUri	Invalid recordings URI.
OnlyOneOfUrlsOrContainerOrDataset	Only one of URLs, container, or dataset may be specified.
ExceededNumberOfRecordingsUris	Exceeded number of recordings URIs.
InvalidChannels	Invalid channels.
ModelMismatch	Model mismatch.
ProjectGenderMismatch	Project gender mismatch.
ModelDeprecated	Model deprecated.
ModelExists	Model exists.
ModelNotDeployable	Model not deployable.
EndpointNotUpdatable	Endpoint not updatable.
SingleDefaultEndpoint	Single default endpoint.
EndpointCannotBeDefault	Endpoint cannot be default.
InvalidModelUri	Invalid model URI.
SubscriptionNotFound	Subscription not found.
QuotaViolation	Quota violation.
UnsupportedDelta	Unsupported delta.
UnsupportedFilter	Unsupported filter.
UnsupportedPagination	Unsupported pagination.
UnsupportedDynamicConfiguration	Unsupported dynamic configuration.
UnsupportedOrderBy	Unsupported order by.
NoUtf8WithBom	No UTF-8 with BOM.
ModelDeploymentNotCompleteState	Model deployment not complete state.
SkuLimitsExist	SKU limits exist.
DeployingFailedModel	Deploying failed model.
UnsupportedTimeRange	Unsupported time range.
InvalidLogDate	Invalid log date.
InvalidLogId	Invalid log ID.
InvalidLogStartTime	Invalid log start time.
InvalidLogEndTime	Invalid log end time.
InvalidTopForLogs	Invalid top for logs.
InvalidSkipTokenForLogs	Invalid skip token for logs.
DeleteNotAllowed	Delete not allowed.
Forbidden	Forbidden.
DeployNotAllowed	Deploy not allowed.
UnexpectedError	Unexpected error.
InvalidCollection	Invalid collection.
InvalidCallbackUri	Invalid callback URI.
InvalidSasValidityDuration	Invalid SAS validity duration.
InaccessibleCustomerStorage	Inaccessible customer storage.
UnsupportedClassBasedAdaptation	Unsupported class based adaptation.
InvalidWebHookEventKind	Invalid webhook event kind.
InvalidTimeToLive	Invalid time to live.
InvalidSourceAzureResourceId	Invalid source Azure resource ID.
ModelCopyAuthorizationExpired	Expired ModelCopyAuthorization.
EndpointLoggingNotSupported	Endpoint logging not supported.
NoLanguageIdentified	Language Identification did not recognize any language.
MultipleLanguagesIdentified	Language Identification recognized multiple languages. No dominant language could be determined.
InvalidAudioFormat	The format of input audio is not supported.
BadChannelConfiguration	There is a mismatch between audio channels in the data, in the configuration, or the requirements of the application.
InvalidChannelSpecification	The selection of channels in the transcription request is not supported (for example, neither 0 nor 1 has been selected).
AudioLengthLimitExceeded	The audio file is longer than the maximum allowed duration.
EmptyAudioFile	The audio file is empty.

Error

Object

Error

Name	Type	Description
code	ErrorCode	High-level error code.
details	Error[]	Additional supportive details regarding the error and/or expected policies.
innerError	InnerError	Inner error in the format that conforms to the Cognitive Services API Guidelines, available at https://microsoft.sharepoint.com/%3Aw%3A/t/CognitiveServicesPMO/EUoytcrjuJdKpeOKIK_QRC8BPtUYQpKBi8JsWyeDMRsWlQ?e=CPq8ow. It contains the required properties ErrorCode and message, and the optional properties target, details (key-value pairs), and inner error (which can be nested).
message	string	High-level error message.
target	string	The source of the error. For example, "documents" or "document id" in the case of an invalid document.

ErrorCode

Enumeration

ErrorCode

Value	Description
InvalidRequest	Representing the invalid request error code.
InvalidArgument	Representing the invalid argument error code.
InternalServerError	Representing the internal server error error code.
ServiceUnavailable	Representing the service unavailable error code.
NotFound	Representing the not found error code.
PipelineError	Representing the pipeline error error code.
Conflict	Representing the conflict error code.
InternalCommunicationFailed	Representing the internal communication failed error code.
Forbidden	Representing the forbidden error code.
NotAllowed	Representing the not allowed error code.
Unauthorized	Representing the unauthorized error code.
UnsupportedMediaType	Representing the unsupported media type error code.
TooManyRequests	Representing the too many requests error code.
UnprocessableEntity	Representing the unprocessable entity error code.

InnerError

Object

InnerError

Name	Type	Description
code	DetailedErrorCode	Detailed error code.
details	object	Additional supportive details regarding the error and/or expected policies.
innerError	InnerError	Inner error in the format that conforms to the Cognitive Services API Guidelines, available at https://microsoft.sharepoint.com/%3Aw%3A/t/CognitiveServicesPMO/EUoytcrjuJdKpeOKIK_QRC8BPtUYQpKBi8JsWyeDMRsWlQ?e=CPq8ow. It contains the required properties ErrorCode and message, and the optional properties target, details (key-value pairs), and inner error (which can be nested).
message	string	High-level error message.
target	string	The source of the error. For example, "documents" or "document id" in the case of an invalid document.
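
Because `innerError` can nest, a client usually has to walk the chain to reach the most specific code. The sketch below does that under the assumption that the error body is the `Error` object itself; the helper name `error_chain` is hypothetical, and the payload is invented for illustration using enum values listed above.

```python
def error_chain(error):
    """Return [(code, message), ...] from the outermost error inward."""
    chain = []
    node = error
    # Follow `innerError` links until one is missing or not an object.
    while isinstance(node, dict):
        chain.append((node.get("code"), node.get("message")))
        node = node.get("innerError")
    return chain


# Hypothetical payload; the outer message text is invented.
example = {
    "code": "InvalidArgument",
    "message": "The request could not be processed.",
    "innerError": {
        "code": "InvalidAudioFormat",
        "message": "The format of input audio is not supported.",
    },
}

for depth, (code, message) in enumerate(error_chain(example)):
    print("  " * depth + f"{code}: {message}")
```

The last entry of the returned list holds the innermost, most detailed code.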

Phrase

Object

A transcribed phrase.

Name	Type	Description
channel	integer (int32)	The 0-based channel index. Only present if channel separation is enabled.
confidence	number (float)	The confidence value for the phrase.
durationMilliseconds	integer (int32)	The duration of the phrase in milliseconds.
locale	string	The locale of the phrase.
offsetMilliseconds	integer (int32)	The start offset of the phrase in milliseconds.
speaker	integer (int32)	A unique integer number that is assigned to each speaker detected in the audio without particular order. Only present if speaker diarization is enabled.
text	string	The transcribed text of the phrase.
words	Word[]	The words that make up the phrase. Only present if word-level timestamps are enabled.
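
Several `Phrase` fields are only present when the matching feature is enabled, so client code should not assume they exist. As a minimal sketch, the hypothetical helper `text_by_speaker` below groups phrase text by `speaker` and optionally drops low-confidence phrases; the sample phrases are invented for illustration.

```python
from collections import defaultdict


def text_by_speaker(phrases, min_confidence=0.0):
    """Group phrase text by speaker; the key is None without diarization."""
    grouped = defaultdict(list)
    for phrase in phrases:
        confidence = phrase.get("confidence")
        # Phrases that carry no confidence value are kept.
        if confidence is not None and confidence < min_confidence:
            continue
        grouped[phrase.get("speaker")].append(phrase["text"])
    return dict(grouped)


# Invented phrases, limited to fields from the table above.
phrases = [
    {"speaker": 1, "text": "Hello.", "confidence": 0.9},
    {"speaker": 2, "text": "Hi there.", "confidence": 0.8},
    {"speaker": 1, "text": "Mumble.", "confidence": 0.2},
]

print(text_by_speaker(phrases, min_confidence=0.5))
```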

TranscribeResult

Object

The result of the transcribe operation.

Name	Type	Description
combinedPhrases	ChannelCombinedPhrases[]	The full transcript for each channel.
durationMilliseconds	integer (int32)	The duration of the audio in milliseconds.
phrases	Phrase[]	The transcription results segmented into phrases.

Word

Object

Time-stamped word in the display form.

Name	Type	Description
durationMilliseconds	integer (int32)	The duration of the word in milliseconds.
offsetMilliseconds	integer (int32)	The start offset of the word in milliseconds.
text	string	The recognized word, including punctuation.
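
The millisecond fields of `Word` convert directly into display timestamps. The sketch below shows one way to do that; the helper names and the `HH:MM:SS.mmm` output format are assumptions, not something the API prescribes.

```python
def format_timestamp(milliseconds):
    """Format a millisecond offset as HH:MM:SS.mmm."""
    seconds, ms = divmod(milliseconds, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def word_cue(word):
    """Render one Word object as 'start --> end text'."""
    start = word["offsetMilliseconds"]
    end = start + word["durationMilliseconds"]
    return f"{format_timestamp(start)} --> {format_timestamp(end)} {word['text']}"


# The word from the sample response in the Examples section.
print(word_cue({"text": "weather", "offsetMilliseconds": 40, "durationMilliseconds": 320}))
```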
