Azure Video Indexer Insights Changed

Chris Pressley 0 Reputation points
2023-10-26T06:47:43.16+00:00

Hi, we migrated to Video Indexer from Media Services a few months ago, and things worked fine. We recently re-tested the same video, and where we used to get a long list of fragments, each containing events and spaced at a set interval, we now get one large fragment with a long list of events in it. Has this changed recently?

I was expecting the algorithm to behave consistently and return the same JSON structure it has returned previously:

Before and After

Azure AI Video Indexer
Azure

1 answer

  1. Sina Salam 25,761 Reputation points Volunteer Moderator
    2025-09-29T15:56:24.7966667+00:00

    Hello Chris Pressley,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that your Azure Video Indexer insights output has changed.

    If you must have the exact old JSON structure (not just fragments at a given interval), follow this reproducible diagnostic flow:

    1. Fetch the canonical insights using the Get Video Index (Videos/{videoId}/Index) API and save the response.
    2. Fetch the faces artifact via Get Video Artifact Download Url (type=Faces) and save it.
    3. Record the exact videoId, the UTC timestamps when each indexing job ran, the indexingPreset used on upload (if any), and your account location and accountId. These are the fields Microsoft support will ask for.
    4. Re-index the video, explicitly setting indexingPreset to Standard or Advanced (match the previous environment or try both), then re-download the artifacts and compare the outputs. Example upload query parameter: &indexingPreset=Advanced (the API sample in the docs uses indexingPreset=Default; the other valid values correspond to Basic/Standard/Advanced in the Indexing Configuration documentation). A sketch of these API calls follows this list.
    5. If re-indexing with the same preset does not restore the old format, open a Microsoft support ticket with the details recorded in step 3.
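
    To make these steps concrete, here is a minimal sketch of steps 1, 2 and 4 against the Video Indexer REST API. It assumes you already have an account access token; location, accountId, videoId and accessToken are placeholders, and the exact endpoint paths and query parameters should be confirmed in the API portal before relying on them.

    // Minimal sketch of the diagnostic API calls (steps 1, 2 and 4).
    // All parameters (location, accountId, videoId, accessToken) are placeholders.
    using System.Net.Http;
    using System.Threading.Tasks;
    public static class VideoIndexerDiagnostics
    {
        static readonly HttpClient http = new HttpClient();
        const string BaseUrl = "https://api.videoindexer.ai";
        // Step 1: download the canonical insights JSON (Get Video Index).
        public static Task<string> GetVideoIndexAsync(string location, string accountId, string videoId, string accessToken) =>
            http.GetStringAsync($"{BaseUrl}/{location}/Accounts/{accountId}/Videos/{videoId}/Index?accessToken={accessToken}");
        // Step 2: get the download URL for the faces artifact (Get Video Artifact Download Url).
        public static Task<string> GetFacesArtifactUrlAsync(string location, string accountId, string videoId, string accessToken) =>
            http.GetStringAsync($"{BaseUrl}/{location}/Accounts/{accountId}/Videos/{videoId}/ArtifactUrl?type=Faces&accessToken={accessToken}");
        // Step 4: re-index the video with an explicit indexing preset, then re-download and compare.
        public static async Task ReIndexAsync(string location, string accountId, string videoId, string accessToken, string preset = "Advanced")
        {
            var url = $"{BaseUrl}/{location}/Accounts/{accountId}/Videos/{videoId}/ReIndex?indexingPreset={preset}&accessToken={accessToken}";
            using var response = await http.PutAsync(url, null);
            response.EnsureSuccessStatusCode();
        }
    }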

    Alternatively, if your old code expects N fragments at a fixed interval but the artifact now contains one big fragment with many events, you can recreate the fragments by bucketing the events into fixed time slices, as in the helper below.

    // Requires: Newtonsoft.Json (Json.NET)
    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Net.Http;
    using System.Threading.Tasks;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;
    public class VideoIndexerHelpers
    {
        static HttpClient http = new HttpClient();
        // Helper: parse a time token that can be "00:00:01.234" OR an integer tick value (string or number).
        static double ParseTimeToSeconds(JToken t, double timescaleFallback = 10000000.0)
        {
            if (t == null) return 0.0;
            // If it's a string with colon -> parse as TimeSpan
            var s = t.Type == JTokenType.String ? (string)t : null;
            if (!string.IsNullOrEmpty(s) && s.Contains(":"))
            {
                if (TimeSpan.TryParse(s, out var ts)) return ts.TotalSeconds;
                // In some outputs there's a fractional seconds format with '.' — TimeSpan.Parse usually handles it.
            }
            // Otherwise, if the token is numeric, treat it as a tick count and divide by the
            // timescale (the caller passes the timescale found in the JSON, or the 100-ns default).
            if (double.TryParse(t.ToString(), NumberStyles.Float, CultureInfo.InvariantCulture, out var ticks))
            {
                return ticks / timescaleFallback;
            }
            // fallback zero
            return 0.0;
        }
        // Main: download artifact JSON and bucket into fixed fragments
        public static async Task<string> RebucketFacesToFragmentsAsync(
            string artifactUrl, // direct download url for faces.json OR GetVideoIndex endpoint
            double fragmentSeconds = 2.0,
            double timescale = 10000000.0 // common default ticks-per-second - we'll override if JSON contains a timescale field
        )
        {
            var txt = await http.GetStringAsync(artifactUrl);
            var root = JToken.Parse(txt);
            // Try to detect a top-level timescale field (common names: "timescale", "timeScale", "TimeScale")
            var topTimescale = root.SelectToken("timescale") ?? root.SelectToken("timeScale") ?? root.SelectToken("TimeScale");
            if (topTimescale != null && double.TryParse(topTimescale.ToString(), out var tsVal))
            {
                timescale = tsVal;
            }
            // Collect all face events (generic): look for face entries and their instances
            var events = new List<(double start, double end, JObject instance, string faceId)>();
            // Common layout: root["faces"] -> array of faces -> each has "instances" -> each instance has "start"/"end" or "startTime"/"endTime" or numeric ticks
            var facesArray = root.SelectToken("faces") as JArray ?? root.SelectToken("Faces") as JArray;
            if (facesArray != null)
            {
                foreach (var face in facesArray)
                {
                    var faceId = face["id"]?.ToString() ?? face["faceId"]?.ToString() ?? "";
                    var instances = face["instances"] as JArray ?? face["faceInstances"] as JArray;
                    if (instances == null) continue;
                    foreach (var inst in instances)
                    {
                        // find start and end keys
                        var startToken = inst["start"] ?? inst["startTime"] ?? inst["start_time"] ?? inst["offset"];
                        var endToken = inst["end"] ?? inst["endTime"] ?? inst["end_time"] ?? inst["duration"];
                        double start = ParseTimeToSeconds(startToken, timescale);
                        double end = ParseTimeToSeconds(endToken, timescale);
                        // If the only available value came from a "duration" key (or the parsed end
                        // precedes the start), treat it as a duration relative to start, not an absolute time.
                        bool endIsDuration = inst["end"] == null && inst["endTime"] == null && inst["end_time"] == null && inst["duration"] != null;
                        if (endIsDuration || (end > 0 && end < start))
                        {
                            end = start + end;
                        }
                        if (end <= 0) end = start + 0.1; // fallback tiny duration
                        events.Add((start, end, (JObject)inst, faceId));
                    }
                }
            }
            else
            {
                // If we cannot find faces[] top-level, try to find any "fragments" / "events" -> fall back to scanning the tree for "instances"
                foreach (var inst in root.SelectTokens("$..instances[*]"))
                {
                    var startToken = inst["start"] ?? inst["startTime"];
                    var endToken = inst["end"] ?? inst["endTime"];
                    double start = ParseTimeToSeconds(startToken, timescale);
                    double end = ParseTimeToSeconds(endToken, timescale);
                    // store the instance object itself as the event payload
                    events.Add((start, end, inst as JObject ?? new JObject(), ""));
                }
            }
            // Build fragments dictionary
            var fragments = new SortedDictionary<int, JObject>();
            foreach (var ev in events)
            {
                var idx = (int)Math.Floor(ev.start / fragmentSeconds);
                if (!fragments.ContainsKey(idx))
                {
                    var fragStart = idx * fragmentSeconds;
                    var fragObj = new JObject
                    {
                        ["start"] = fragStart,
                        ["end"] = fragStart + fragmentSeconds,
                        ["events"] = new JArray()
                    };
                    fragments[idx] = fragObj;
                }
                // store a compact event record
                var eobj = new JObject
                {
                    ["faceId"] = ev.faceId,
                    ["start"] = ev.start,
                    ["end"] = ev.end,
                    ["rawInstance"] = ev.instance
                };
                ((JArray)fragments[idx]["events"]).Add(eobj);
            }
            // Create final array ordered by fragment index
            var outArray = new JArray();
            foreach (var kv in fragments)
            {
                outArray.Add(kv.Value);
            }
            var output = new JObject
            {
                ["fragmentDurationSeconds"] = fragmentSeconds,
                ["generatedAtUtc"] = DateTime.UtcNow.ToString("o"),
                ["fragments"] = outArray
            };
            return output.ToString(Formatting.Indented);
        }
    }
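
    For reference, here is a hypothetical call site for the helper above; the placeholder URL must be replaced with the SAS download URL returned by the Get Video Artifact Download Url call:

    // Hypothetical usage of the re-bucketing helper; 2-second buckets roughly mimic
    // the old fixed-interval fragments. Replace the placeholder URL before running.
    using System;
    using System.Threading.Tasks;
    public static class RebucketExample
    {
        public static async Task Main()
        {
            var artifactUrl = "<SAS download URL for the faces artifact>";
            var rebucketedJson = await VideoIndexerHelpers.RebucketFacesToFragmentsAsync(artifactUrl, fragmentSeconds: 2.0);
            Console.WriteLine(rebucketedJson);
            // Compare the output against your saved pre-migration JSON to confirm the
            // structure matches what your downstream code expects.
        }
    }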
    

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.

