Documents processed are 0 when Azure Search is indexing documents from Blob storage

Prashant abkari 0 Reputation points
2024-08-20T08:09:09.3066667+00:00

I followed the tutorial linked below to the end, making the necessary changes such as the API key and the connection string for my data source, and creating the index and indexer. After the indexer runs, the number of documents processed is 0. I have 2 JSON files in the container that I am reading from, and I am unsure why the processed count shows 0. I don't have any complex networking or security rules configured.

https://free.blessedness.top/en-us/azure/search/search-semi-structured-data

POST {{baseUrl}}/indexers?api-version=2024-07-01  HTTP/1.1
  Content-Type: application/json
  api-key: {{apiKey}}

    {
      "name" : "ny-philharmonic-indexer",
      "dataSourceName" : "employerdata-container",
      "targetIndexName" : "ny-philharmonic-index",
      "parameters" : {
        "configuration" : {
          "parsingMode" : "jsonArray",
          "documentRoot" : "/programs"
        }
      },
      "fieldMappings" : []
    }

POST {{baseUrl}}/indexes?api-version=2024-07-01  HTTP/1.1
  Content-Type: application/json
  api-key: {{apiKey}}

    {
      "name": "ny-philharmonic-index",  
      "fields": [
        {"name": "programID", "type": "Edm.String", "key": true, "searchable": true, "retrievable": true, "filterable": true, "facetable": true, "sortable": true},
        {"name": "orchestra", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "facetable": true, "sortable": true},
        {"name": "season", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "facetable": true, "sortable": true},
        { "name": "concerts", "type": "Collection(Edm.ComplexType)", 
          "fields": [
            { "name": "eventType", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false},
            { "name": "Location", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true },
            { "name": "Venue", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true },
            { "name": "Date", "type": "Edm.String", "searchable": false, "retrievable": true, "filterable": true, "sortable": false, "facetable": true },
            { "name": "Time", "type": "Edm.String", "searchable": false, "retrievable": true, "filterable": true, "sortable": false, "facetable": true }
          ]
        },
        { "name": "works", "type": "Collection(Edm.ComplexType)", 
          "fields": [
            { "name": "ID", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false},
            { "name": "composerName", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true },
            { "name": "workTitle", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true },
            { "name": "conductorName", "type": "Edm.String", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true },
            { "name": "soloists", "type": "Collection(Edm.String)", "searchable": true, "retrievable": true, "filterable": true, "sortable": false, "facetable": true }
          ]
        }
      ]
    }
Azure AI Search

1 answer

  1. Amira Bedhiafi 39,106 Reputation points Volunteer Moderator
    2025-10-20T18:29:40.88+00:00

    Hello Prashant!

    Thank you for posting on Microsoft Learn Q&A.

    parsingMode: "jsonArray" tells the blob indexer that each blob is a top level JSON array and that each array element is one search document. In this mode, documentRoot is not used.

    If your files look like this (an object with a programs array inside):

    {
      "programs": [
        { "programID": "1943-01-23A", "orchestra": "...", "works": [ ... ] },
        { "programID": "1943-01-24B", "orchestra": "...", "works": [ ... ] }
      ]
    }
    

    then jsonArray combined with "/programs" is not the correct setup: the indexer scans for a top-level array, finds none, and reports 0 documents processed.
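
Before changing the indexer, it can help to confirm which shape each blob actually has. The following is a minimal Python sketch, assuming the blob has been downloaded and is available as text; `describe_json_shape` is a hypothetical helper written for this illustration and is not part of any Azure SDK.

```python
import json


def describe_json_shape(text: str) -> dict:
    """Report whether a JSON document is a top-level array or an object
    that holds arrays under one or more of its top-level keys.

    Hypothetical helper for local inspection only; it does not call Azure.
    """
    data = json.loads(text)
    if isinstance(data, list):
        # The whole file is an array: each element is a candidate document.
        return {"top_level": "array", "element_count": len(data), "nested_arrays": {}}
    if isinstance(data, dict):
        # Map each top-level key that holds an array to that array's length.
        nested = {f"/{key}": len(value) for key, value in data.items() if isinstance(value, list)}
        return {"top_level": "object", "element_count": None, "nested_arrays": nested}
    # Scalars (string, number, bool, null) contain no documents to split.
    return {"top_level": type(data).__name__, "element_count": None, "nested_arrays": {}}


if __name__ == "__main__":
    sample = '{"programs": [{"programID": "1943-01-23A"}, {"programID": "1943-01-24B"}]}'
    print(describe_json_shape(sample))
```

If `nested_arrays` comes back empty or lists a path other than `/programs`, the files do not match the layout the tutorial's sample data uses.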

    If your blob is an object with a nested programs array (as above), use parsingMode: "json" and point documentRoot at the array elements:

    "parameters": {
      "configuration": {
        "parsingMode": "json",
        "documentRoot": "/programs/*"
      }
    }
    

    The /* means emit one document per element in the programs array.

    https://free.blessedness.top/en-us/azure/search/search-how-to-index-azure-blob-json

    If instead your blob is a top-level array, such as:

    [
      { "programID": "1943-01-23A", "orchestra": "...", "works": [ ... ] },
      { "programID": "1943-01-24B", "orchestra": "...", "works": [ ... ] }
    ]
    

    then keep:

    "parameters": {
      "configuration": {
        "parsingMode": "jsonArray"
      }
    }
    

    and remove documentRoot.
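
If the processed count is still 0 after a change, the indexer's last execution result usually carries more detail than the count alone. Below is a hedged Python sketch for reading it. It assumes the Get Indexer Status REST operation at `/indexers/{name}/status` and a response containing `lastResult` with `itemsProcessed`, `itemsFailed`, `errors[].errorMessage`, and `warnings[].message`; those field names are assumptions to verify against the REST reference for your API version, and both function names are hypothetical.

```python
from urllib.parse import quote


def indexer_status_url(base_url: str, indexer_name: str, api_version: str = "2024-07-01") -> str:
    """Build the status URL for an indexer (path shape is an assumption to verify)."""
    name = quote(indexer_name, safe="")
    return f"{base_url.rstrip('/')}/indexers/{name}/status?api-version={api_version}"


def summarize_last_run(status: dict) -> dict:
    """Pull the fields that explain a zero count out of a parsed status response.

    Field names are assumed from the Get Indexer Status response shape.
    """
    last = status.get("lastResult") or {}
    return {
        "status": last.get("status"),
        "items_processed": last.get("itemsProcessed", 0),
        "items_failed": last.get("itemsFailed", 0),
        "errors": [e.get("errorMessage") for e in (last.get("errors") or [])],
        "warnings": [w.get("message") for w in (last.get("warnings") or [])],
    }


if __name__ == "__main__":
    # Hand-written sample; not output captured from a real service.
    sample = {"lastResult": {"status": "success", "itemsProcessed": 0, "itemsFailed": 0}}
    print(indexer_status_url("https://example.search.windows.net", "ny-philharmonic-indexer"))
    print(summarize_last_run(sample))
```

The request itself would be sent with the same `api-key` header used in the question's requests; a run that reports success with 0 items processed and no errors suggests the indexer found no documents at the configured root.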

    https://free.blessedness.top/en-us/azure/search/search-how-to-index-azure-blob-storage
