Azure Video Indexer Insights Changed

Chris Pressley 0 Reputation points
2023-10-26T06:47:43.16+00:00

Hi, we migrated to Video Indexer from Media Services a few months ago, and things worked fine. We recently re-tested the same video, and where we used to get a long list of fragments, each containing events and spaced at a set interval, we now get one large fragment with a long list of events in it. Has this changed recently?

I was expecting the algorithm to behave consistently and return the same JSON structure it has returned previously:

Before and After

Azure AI Video Indexer
Azure

1 answer

  1. Sina Salam 25,761 Reputation points Volunteer Moderator
    2025-09-29T15:56:24.7966667+00:00

    Hello Chris Pressley,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that your Azure Video Indexer insights output has changed.

    If you must have the exact old JSON structure (not just fragments at a given interval), follow this reproducible diagnostic flow:

    1. Fetch the canonical insights using the Get Video Index (Videos/{videoId}/Index) API and save the response.
    2. Fetch the faces artifact via Get Video Artifact Download Url (type=Faces) and save it.
    3. Record the exact videoId, the UTC timestamps when each indexing job ran, the indexingPreset used on upload (if any), and your account location and accountId. These are the fields Microsoft support will ask for.
    4. Re-index the video, explicitly setting indexingPreset to Standard or Advanced (match the previous environment or try both), then re-download the artifacts and compare the outputs. Example upload query parameter: &indexingPreset=Advanced (the API sample in the docs uses indexingPreset=Default; the other valid values correspond to Basic/Standard/Advanced in the Indexing Configuration documentation). A sketch of these API calls follows this list.
    5. If re-indexing with the same preset does not restore the old format, open a Microsoft support ticket with the details recorded in step 3.
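
    To make these steps concrete, here is a minimal sketch of steps 1, 2 and 4 against the Video Indexer REST API. It assumes you already have an account access token; location, accountId, videoId and accessToken are placeholders, and the exact endpoint paths and query parameters should be confirmed in the API portal before relying on them.

    // Minimal sketch of the diagnostic API calls (steps 1, 2 and 4).
    // All parameters (location, accountId, videoId, accessToken) are placeholders.
    using System.Net.Http;
    using System.Threading.Tasks;
    public static class VideoIndexerDiagnostics
    {
        static readonly HttpClient http = new HttpClient();
        const string BaseUrl = "https://api.videoindexer.ai";
        // Step 1: download the canonical insights JSON (Get Video Index).
        public static Task<string> GetVideoIndexAsync(string location, string accountId, string videoId, string accessToken) =>
            http.GetStringAsync($"{BaseUrl}/{location}/Accounts/{accountId}/Videos/{videoId}/Index?accessToken={accessToken}");
        // Step 2: get the download URL for the faces artifact (Get Video Artifact Download Url).
        public static Task<string> GetFacesArtifactUrlAsync(string location, string accountId, string videoId, string accessToken) =>
            http.GetStringAsync($"{BaseUrl}/{location}/Accounts/{accountId}/Videos/{videoId}/ArtifactUrl?type=Faces&accessToken={accessToken}");
        // Step 4: re-index the video with an explicit indexing preset, then re-download and compare.
        public static async Task ReIndexAsync(string location, string accountId, string videoId, string accessToken, string preset = "Advanced")
        {
            var url = $"{BaseUrl}/{location}/Accounts/{accountId}/Videos/{videoId}/ReIndex?indexingPreset={preset}&accessToken={accessToken}";
            using var response = await http.PutAsync(url, null);
            response.EnsureSuccessStatusCode();
        }
    }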

    Alternatively, if your old code expects N fragments at a fixed interval but the artifact now contains one big fragment with many events, you can recreate the fragments by bucketing the events into fixed time slices, as in the helper below.

    // Requires: Newtonsoft.Json (Json.NET)
    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Net.Http;
    using System.Threading.Tasks;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;
    public class VideoIndexerHelpers
    {
        static HttpClient http = new HttpClient();
        // Helper: parse a time token that can be "00:00:01.234" OR an integer tick value (string or number).
        static double ParseTimeToSeconds(JToken t, double timescaleFallback = 10000000.0)
        {
            if (t == null) return 0.0;
            // If it's a string with colon -> parse as TimeSpan
            var s = t.Type == JTokenType.String ? (string)t : null;
            if (!string.IsNullOrEmpty(s) && s.Contains(":"))
            {
                if (TimeSpan.TryParse(s, out var ts)) return ts.TotalSeconds;
                // In some outputs there's a fractional seconds format with '.' — TimeSpan.Parse usually handles it.
            }
            // Otherwise, if the token is numeric, treat it as a tick count and divide by the
            // timescale (the caller passes the timescale found in the JSON, or the 100-ns default).
            if (double.TryParse(t.ToString(), NumberStyles.Float, CultureInfo.InvariantCulture, out var ticks))
            {
                return ticks / timescaleFallback;
            }
            // fallback zero
            return 0.0;
        }
        // Main: download artifact JSON and bucket into fixed fragments
        public static async Task<string> RebucketFacesToFragmentsAsync(
            string artifactUrl, // direct download url for faces.json OR GetVideoIndex endpoint
            double fragmentSeconds = 2.0,
            double timescale = 10000000.0 // common default ticks-per-second - we'll override if JSON contains a timescale field
        )
        {
            var txt = await http.GetStringAsync(artifactUrl);
            var root = JToken.Parse(txt);
            // Try to detect a top-level timescale field (common names: "timescale", "timeScale", "TimeScale")
            var topTimescale = root.SelectToken("timescale") ?? root.SelectToken("timeScale") ?? root.SelectToken("TimeScale");
            if (topTimescale != null && double.TryParse(topTimescale.ToString(), out var tsVal))
            {
                timescale = tsVal;
            }
            // Collect all face events (generic): look for face entries and their instances
            var events = new List<(double start, double end, JObject instance, string faceId)>();
            // Common layout: root["faces"] -> array of faces -> each has "instances" -> each instance has "start"/"end" or "startTime"/"endTime" or numeric ticks
            var facesArray = root.SelectToken("faces") as JArray ?? root.SelectToken("Faces") as JArray;
            if (facesArray != null)
            {
                foreach (var face in facesArray)
                {
                    var faceId = face["id"]?.ToString() ?? face["faceId"]?.ToString() ?? "";
                    var instances = face["instances"] as JArray ?? face["faceInstances"] as JArray;
                    if (instances == null) continue;
                    foreach (var inst in instances)
                    {
                        // find start and end keys
                        var startToken = inst["start"] ?? inst["startTime"] ?? inst["start_time"] ?? inst["offset"];
                        var endToken = inst["end"] ?? inst["endTime"] ?? inst["end_time"] ?? inst["duration"];
                        double start = ParseTimeToSeconds(startToken, timescale);
                        double end = ParseTimeToSeconds(endToken, timescale);
                        // If the only available value came from a "duration" key (or the parsed end
                        // precedes the start), treat it as a duration relative to start, not an absolute time.
                        bool endIsDuration = inst["end"] == null && inst["endTime"] == null && inst["end_time"] == null && inst["duration"] != null;
                        if (endIsDuration || (end > 0 && end < start))
                        {
                            end = start + end;
                        }
                        if (end <= 0) end = start + 0.1; // fallback tiny duration
                        events.Add((start, end, (JObject)inst, faceId));
                    }
                }
            }
            else
            {
                // If we cannot find faces[] top-level, try to find any "fragments" / "events" -> fall back to scanning the tree for "instances"
                foreach (var inst in root.SelectTokens("$..instances[*]"))
                {
                    var startToken = inst["start"] ?? inst["startTime"];
                    var endToken = inst["end"] ?? inst["endTime"];
                    double start = ParseTimeToSeconds(startToken, timescale);
                    double end = ParseTimeToSeconds(endToken, timescale);
                    // store the instance object itself as the event payload
                    events.Add((start, end, inst as JObject ?? new JObject(), ""));
                }
            }
            // Build fragments dictionary
            var fragments = new SortedDictionary<int, JObject>();
            foreach (var ev in events)
            {
                var idx = (int)Math.Floor(ev.start / fragmentSeconds);
                if (!fragments.ContainsKey(idx))
                {
                    var fragStart = idx * fragmentSeconds;
                    var fragObj = new JObject
                    {
                        ["start"] = fragStart,
                        ["end"] = fragStart + fragmentSeconds,
                        ["events"] = new JArray()
                    };
                    fragments[idx] = fragObj;
                }
                // store a compact event record
                var eobj = new JObject
                {
                    ["faceId"] = ev.faceId,
                    ["start"] = ev.start,
                    ["end"] = ev.end,
                    ["rawInstance"] = ev.instance
                };
                ((JArray)fragments[idx]["events"]).Add(eobj);
            }
            // Create final array ordered by fragment index
            var outArray = new JArray();
            foreach (var kv in fragments)
            {
                outArray.Add(kv.Value);
            }
            var output = new JObject
            {
                ["fragmentDurationSeconds"] = fragmentSeconds,
                ["generatedAtUtc"] = DateTime.UtcNow.ToString("o"),
                ["fragments"] = outArray
            };
            return output.ToString(Formatting.Indented);
        }
    }
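
    For reference, here is a hypothetical call site for the helper above; the placeholder URL must be replaced with the SAS download URL returned by the Get Video Artifact Download Url call:

    // Hypothetical usage of the re-bucketing helper; 2-second buckets roughly mimic
    // the old fixed-interval fragments. Replace the placeholder URL before running.
    using System;
    using System.Threading.Tasks;
    public static class RebucketExample
    {
        public static async Task Main()
        {
            var artifactUrl = "<SAS download URL for the faces artifact>";
            var rebucketedJson = await VideoIndexerHelpers.RebucketFacesToFragmentsAsync(artifactUrl, fragmentSeconds: 2.0);
            Console.WriteLine(rebucketedJson);
            // Compare the output against your saved pre-migration JSON to confirm the
            // structure matches what your downstream code expects.
        }
    }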
    

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.

