Is it possible to use Azure Video Indexer to explain video content and quickly locate a video clip?

Yu Cai 100 Reputation points
2023-11-27T18:09:41.58+00:00

Hello, is it possible to use Azure Video Indexer to explain video content? For example, I want to use it to identify car accidents, monitor grocery store surveillance video, and so on. How can I train Azure Video Indexer to do this? For instance, in the video below, Azure Video Indexer mistakes the dog for a bicycle, although I included "dog" in the tag list. Can I fine-tune Video Indexer so that it will not mistake the dog for a bicycle?

User's image

I could not find details on how to do this in https://free.blessedness.top/en-us/azure/azure-video-indexer/customize-brands-model-overview. If Microsoft provided a video tutorial on how to use these tools, it would be easier for us.

User's image

Azure AI Video Indexer
An Azure video analytics service that uses AI to extract actionable insights from stored videos.

1 answer

  1. Sina Salam 25,761 Reputation points Volunteer Moderator
    2025-09-29T15:22:16.87+00:00

    Hello Yu Cai,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are looking for ways to use Azure Video Indexer to explain video content and quickly locate a video clip.

    Azure AI Video Indexer (AVI) cannot be trained or fine-tuned directly. It is not a trainable model in the traditional ML sense: it uses pre-trained models for tasks such as object detection, speech recognition, and sentiment analysis, so you cannot retrain it to stop labeling a dog as a bicycle. For tasks like shoplifting detection, you need to build a custom pipeline that uses AVI for indexing and your own model for classification. You can work around AVI’s limitations by integrating a custom model through Azure Logic Apps together with Azure OpenAI or Azure AI Computer Vision, using the following steps:

    1. Index the video using AVI.
    2. Extract frames or object metadata using the AVI API.
    3. Send frames to your custom model (hosted on Azure AI).
    4. Classify objects/actions using your model (e.g., detect shoplifting).
    5. Patch AVI insights with corrected labels via API.
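As a rough illustration of steps 2, 4, and 5, here is a minimal Python sketch that works on an index JSON that has already been downloaded. It locates the time ranges for a label and builds JSON Patch style operations for labels that a custom model disagrees with. The JSON layout (`videos[].insights.labels[].instances[]`), the patch path format, and the names `find_label_clips`, `build_label_patches`, and `toy_reclassify` are assumptions made for illustration; check them against your own index output and the Video Indexer API reference before relying on them.

```python
from typing import Callable, List, Tuple


def find_label_clips(index: dict, label: str) -> List[Tuple[str, str]]:
    """Return (start, end) time ranges where `label` appears in the index."""
    clips = []
    for video in index.get("videos", []):
        for item in video.get("insights", {}).get("labels", []):
            if item.get("name", "").lower() == label.lower():
                for inst in item.get("instances", []):
                    clips.append((inst["start"], inst["end"]))
    return clips


def build_label_patches(
    index: dict, reclassify: Callable[[str, dict], str]
) -> List[dict]:
    """Build JSON Patch 'replace' operations for labels the custom model corrects."""
    patches = []
    for v_idx, video in enumerate(index.get("videos", [])):
        labels = video.get("insights", {}).get("labels", [])
        for l_idx, item in enumerate(labels):
            new_name = reclassify(item["name"], item)
            if new_name != item["name"]:
                patches.append({
                    "op": "replace",
                    # Assumed path layout; verify against the real index JSON.
                    "path": f"/videos/{v_idx}/insights/labels/{l_idx}/name",
                    "value": new_name,
                })
    return patches


# Hand-written sample data, not real AVI output.
sample_index = {
    "videos": [{
        "insights": {
            "labels": [
                {"id": 1, "name": "bicycle",
                 "instances": [{"start": "0:00:02", "end": "0:00:05"}]},
                {"id": 2, "name": "grass",
                 "instances": [{"start": "0:00:00", "end": "0:00:08"}]},
            ]
        }
    }]
}


def toy_reclassify(name: str, item: dict) -> str:
    """Stand-in for your own model: here it simply maps 'bicycle' to 'dog'."""
    return "dog" if name == "bicycle" else name


if __name__ == "__main__":
    print(find_label_clips(sample_index, "bicycle"))
    print(build_label_patches(sample_index, toy_reclassify))
```

In a real pipeline, `toy_reclassify` would be replaced by a call to your hosted model on frames taken from each instance's time range, and the resulting operations would be sent to the service's index update endpoint, assuming it accepts patches in this shape.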

    These links give more detail on the steps above and on extending AVI with custom models, Logic Apps, and OpenAI: https://github.com/Azure-Samples/azure-video-indexer-samples/blob/master/BringYourOwn-Samples/README.MD and https://www.youtube.com/watch?v=yMqJufR9Rfs

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting this answer and accepting it if it is helpful.

