Hello Tho Le!
Thank you for posting on Microsoft Learn Q&A.
I think you should keep one indexer (Blob/ADLS) and pull the table metadata during enrichment so each chunk gets the file metadata.
Add a custom Web API skill that receives metadata_storage_path and looks up the metadata in your Azure Table (via an Azure Function), then use a Shaper skill to merge the returned metadata object onto each chunk object.
Output field mappings then write both the chunk fields (text/vector) and the replicated metadata fields into the same index document. The result: one index, one indexer, chunk-level documents with all the metadata, and no key conflict.
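If it helps, here is a minimal sketch of what that Azure Function could look like (Python, v1 programming model). Everything named here is an assumption used to illustrate the custom skill contract: the table name FileMetadata, the PartitionKey/RowKey scheme, and the metadata columns (department, doc_type) are placeholders for your own schema.

```python
# Minimal sketch of the Azure Function behind the custom Web API skill.
# Assumptions: table "FileMetadata", PartitionKey "files", RowKey = an
# encoded form of metadata_storage_path (Table RowKeys cannot contain "/"),
# and metadata columns "department" / "doc_type" - adjust to your schema.
import json
import azure.functions as func
from azure.data.tables import TableClient

TABLE_CONN_STR = "<your-storage-connection-string>"  # read from app settings in practice

def encode_key(path: str) -> str:
    # Placeholder: use whatever scheme you used when writing the table rows,
    # e.g. base64 or replacing "/" with another allowed character.
    return path.replace("/", "|")

def main(req: func.HttpRequest) -> func.HttpResponse:
    table = TableClient.from_connection_string(TABLE_CONN_STR, table_name="FileMetadata")
    results = []
    # Custom skill contract: a "values" array in, a "values" array out,
    # matched by recordId.
    for record in req.get_json().get("values", []):
        path = record["data"].get("metadata_storage_path", "")
        try:
            entity = table.get_entity(partition_key="files", row_key=encode_key(path))
            data = {"department": entity.get("department"),
                    "doc_type": entity.get("doc_type")}
            results.append({"recordId": record["recordId"], "data": data})
        except Exception as e:
            results.append({"recordId": record["recordId"], "data": {},
                            "errors": [{"message": str(e)}]})
    return func.HttpResponse(json.dumps({"values": results}),
                             mimetype="application/json")
```

The Shaper skill then wraps the returned fields into a single object that your output field mappings can project onto each chunk document.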
Alternatively, you can store the metadata alongside the content source so you can still use a single blob indexer:
- Option A: put the file-level metadata directly on each blob as custom blob metadata (user-defined key/value pairs), which the blob indexer can surface and map into index fields.
- Option B: drop a sidecar JSON per file that your skillset reads and merges before or while chunking. Either way you end up with one indexer and chunk documents that already include the metadata (a short sketch for Option A follows this list).
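For Option A, a rough sketch with the azure-storage-blob SDK; the container name, blob name, and metadata keys are placeholders:

```python
# Sketch for Option A: stamp file-level metadata onto each blob so the
# blob indexer can surface it without any join. Names are placeholders.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<your-storage-connection-string>")
blob = service.get_blob_client(container="docs", blob="reports/q1.pdf")

# User-defined blob metadata; the indexer can map these keys to index fields.
blob.set_blob_metadata({"department": "finance", "doc_type": "report"})
```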
Another option is to chunk the documents before indexing (Databricks/ADF/Azure Function) and write a JSONL/Parquet dataset where each row contains:
key: "<metadata_storage_path>#<chunk_no>"
path: "<metadata_storage_path>"
chunk_no: <n>
text: ...
vector: ...
… + all metadata columns
Point a single indexer at this dataset. You fully control the key (path#chunk_no) and avoid joining altogether.
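A minimal pre-chunking sketch in plain Python (fixed-size character chunks, a placeholder embed() function and metadata dict; swap in your own splitter, embedding model, and metadata source):

```python
# Sketch: pre-chunk a document and write one JSONL row per chunk with a
# deterministic key "<path>#<chunk_no>". embed() is a placeholder.
import json

def chunk_text(text: str, size: int = 1000, overlap: int = 100):
    step = size - overlap
    for i in range(0, len(text), step):
        yield text[i:i + size]

def embed(text: str) -> list[float]:
    raise NotImplementedError("call your embedding model here")

def write_chunks(path: str, text: str, metadata: dict, out_file) -> None:
    for n, chunk in enumerate(chunk_text(text)):
        row = {
            "key": f"{path}#{n}",   # unique, stable chunk key
            "path": path,           # original document path
            "chunk_no": n,
            "text": chunk,
            "vector": embed(chunk),
            **metadata,             # replicate every file-level metadata column
        }
        out_file.write(json.dumps(row) + "\n")
```

One caveat: Azure AI Search document keys only allow letters, digits, underscores, dashes, and the equals sign, so in practice you would encode the path#chunk_no value (for example URL-safe Base64) before using it as the key.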
If you must keep two indexes, have the chunk index documents include the original path as a field (in addition to chunk_id), and keep a separate metadata index keyed by metadata_storage_path.
At query time, run the vector search on the chunk index, take the top chunks, and join them to the metadata index in your app by metadata_storage_path. This avoids re-indexing, but the join lives in application code.
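A sketch of that application-side join with the azure-search-documents SDK (11.4+); the index names, field names, and the assumption that the chunk's path value equals the metadata index key are all placeholders to adjust:

```python
# Sketch: vector-search the chunk index, then fetch file metadata by key
# and merge in application code. Index/field names are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

endpoint, api_key = "https://<service>.search.windows.net", "<api-key>"
chunk_client = SearchClient(endpoint, "chunks-index", AzureKeyCredential(api_key))
meta_client = SearchClient(endpoint, "metadata-index", AzureKeyCredential(api_key))

def search_with_metadata(query_vector: list[float], top: int = 5) -> list[dict]:
    hits = chunk_client.search(
        search_text=None,
        vector_queries=[VectorizedQuery(vector=query_vector,
                                        k_nearest_neighbors=top,
                                        fields="vector")],
        select=["chunk_id", "path", "text"],
        top=top,
    )
    joined = []
    for hit in hits:
        # App-side join: look up the file-level document by its key
        # (here assumed to equal the chunk's "path" value).
        file_meta = meta_client.get_document(key=hit["path"])
        joined.append({"chunk": dict(hit), "metadata": file_meta})
    return joined
```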