IndexingParametersConfiguration interface

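Package: @azure/search-documents

A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.

Properties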

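allowSkillsetToReadFileData	If true, creates a path /document/file_data that represents the original file data downloaded from the blob data source. This lets you pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.
dataToExtract	Specifies the data to extract from Azure Blob Storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application file, or to image files such as .jpg and .png in Azure blobs.
delimitedTextDelimiter	For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").
delimitedTextHeaders	For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.
documentRoot	For JSON arrays, given a structured or semi-structured document, specifies the path to the array.
excludedFileNameExtensions	Comma-delimited list of file name extensions to ignore when processing from Azure Blob Storage. For example, you could exclude ".png, .mp4" to skip those files during indexing.
executionEnvironment	Specifies the environment in which the indexer should execute.
failOnUnprocessableDocument	For Azure blobs, set to false if you want indexing to continue when a document fails indexing.
failOnUnsupportedContentType	For Azure blobs, set to false if you want indexing to continue when an unsupported content type is encountered and you don't know all the content types (file extensions) in advance.
firstLineContainsHeaders	For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.
imageAction	Determines how to process embedded images and image files in Azure Blob Storage. Setting "imageAction" to any value other than "none" requires that a skillset also be attached to the indexer.
indexedFileNameExtensions	Comma-delimited list of file name extensions to select when processing from Azure Blob Storage. For example, you could limit indexing to the application files ".docx, .pptx, .msg" to include only those file types.
indexStorageMetadataOnlyForOversizedDocuments	For Azure blobs, set this property to true to still index the storage metadata of blobs whose content is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://docs.microsoft.com/azure/search/search-limits-quotas-capacity.
parsingMode	Represents the parsing mode for indexing from an Azure blob data source.
pdfTextRotationAlgorithm	Determines the algorithm used to extract text from PDF files in Azure Blob Storage.
queryTimeout	Increases the timeout beyond the 5-minute default for Azure SQL Database data sources, specified in the format "hh:mm:ss".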

Property Details

allowSkillsetToReadFileData

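If true, creates a path /document/file_data that represents the original file data downloaded from the blob data source. This lets you pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.

allowSkillsetToReadFileData?: boolean

Property Value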

boolean

dataToExtract

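Specifies the data to extract from Azure Blob Storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application file, or to image files such as .jpg and .png in Azure blobs.

dataToExtract?: "storageMetadata" | "allMetadata" | "contentAndMetadata"

Property Value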

"storageMetadata" | "allMetadata" | "contentAndMetadata"

delimitedTextDelimiter

For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").

delimitedTextDelimiter?: string

Property Value

string

delimitedTextHeaders

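For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.

delimitedTextHeaders?: string

Property Value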

string
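
The CSV-related properties are usually set together, so a small sketch may help. `CsvParsingConfig` below is a local, hypothetical stand-in that mirrors only the CSV fields documented on this page; in a real project the type would be `IndexingParametersConfiguration` from `@azure/search-documents`. The helper `buildCsvConfig` is not part of the SDK, and it assumes that explicit headers and `firstLineContainsHeaders` are alternatives.

```typescript
// Local stand-in for the CSV-related subset of IndexingParametersConfiguration.
// Assumption: real code would use the type exported by "@azure/search-documents".
interface CsvParsingConfig {
  parsingMode: "delimitedText";
  delimitedTextDelimiter?: string;
  delimitedTextHeaders?: string;
  firstLineContainsHeaders?: boolean;
}

// Hypothetical helper (not part of the SDK): builds a CSV configuration from
// a delimiter and an optional list of column headers.
function buildCsvConfig(delimiter: string, headers?: string[]): CsvParsingConfig {
  // The property is documented as a single-character delimiter.
  if (delimiter.length !== 1) {
    throw new Error(`Expected a single-character delimiter, got "${delimiter}"`);
  }
  const csvConfig: CsvParsingConfig = {
    parsingMode: "delimitedText",
    delimitedTextDelimiter: delimiter,
  };
  if (headers && headers.length > 0) {
    // Headers are passed as one comma-delimited string.
    csvConfig.delimitedTextHeaders = headers.join(",");
  } else {
    // Without explicit headers, assume the first non-blank line holds them.
    csvConfig.firstLineContainsHeaders = true;
  }
  return csvConfig;
}

const pipeDelimited = buildCsvConfig("|", ["id", "title", "rating"]);
console.log(pipeDelimited.delimitedTextHeaders); // id,title,rating
```

Rejecting multi-character delimiters up front is a design choice of this sketch: it surfaces the mistake locally instead of at indexer run time.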

documentRoot

For JSON arrays, given a structured or semi-structured document, specifies the path to the array.

documentRoot?: string

Property Value

string
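
To make "path to the array" concrete, the following sketch resolves a path inside a parsed JSON value. It is an illustration only: the real resolution happens inside the service, the function `resolveDocumentRoot` is hypothetical, and the slash-separated syntax ("/level1/level2") is an assumption based on the usual form of this setting.

```typescript
// Illustration only: walks a slash-separated path such as "/level1/level2"
// inside a parsed JSON value and returns the array found there.
function resolveDocumentRoot(source: unknown, documentRoot: string): unknown[] {
  let current: unknown = source;
  const segments = documentRoot.split("/").filter((s) => s.length > 0);
  for (const segment of segments) {
    if (typeof current !== "object" || current === null || Array.isArray(current)) {
      throw new Error(`Cannot descend into "${segment}"`);
    }
    current = (current as Record<string, unknown>)[segment];
  }
  if (!Array.isArray(current)) {
    throw new Error(`"${documentRoot}" does not point to an array`);
  }
  return current;
}

// Sample blob content (made up for this example).
const blobContent = {
  level1: { level2: [{ id: "1", text: "first" }, { id: "2", text: "second" }] },
};

// Configuration fragment using the documented property names.
const jsonArrayConfig = { parsingMode: "jsonArray" as const, documentRoot: "/level1/level2" };

const documentsFound = resolveDocumentRoot(blobContent, jsonArrayConfig.documentRoot);
console.log(documentsFound.length); // 2
```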

excludedFileNameExtensions

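Comma-delimited list of file name extensions to ignore when processing from Azure Blob Storage. For example, you could exclude ".png, .mp4" to skip those files during indexing.

excludedFileNameExtensions?: string

Property Value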

string

executionEnvironment

Specifies the environment in which the indexer should execute.

executionEnvironment?: "standard" | "private"

Property Value

"standard" | "private"

failOnUnprocessableDocument

For Azure blobs, set to false if you want indexing to continue when a document fails indexing.

failOnUnprocessableDocument?: boolean

Property Value

boolean

failOnUnsupportedContentType

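For Azure blobs, set to false if you want indexing to continue when an unsupported content type is encountered and you don't know all the content types (file extensions) in advance.

failOnUnsupportedContentType?: boolean

Property Value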

boolean

firstLineContainsHeaders

For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.

firstLineContainsHeaders?: boolean

Property Value

boolean

imageAction

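Determines how to process embedded images and image files in Azure Blob Storage. Setting "imageAction" to any value other than "none" requires that a skillset also be attached to the indexer.

imageAction?: "none" | "generateNormalizedImages" | "generateNormalizedImagePerPage"

Property Value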

"none" | "generateNormalizedImages" | "generateNormalizedImagePerPage"
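
Because any value other than "none" requires a skillset, a pre-flight check can catch the mismatch before the indexer is created. The sketch below uses a local stand-in type (`ImageConfig`) instead of the SDK interface, and the functions `requiresSkillset` and `checkImageConfig` are hypothetical helpers, not SDK calls.

```typescript
type ImageAction = "none" | "generateNormalizedImages" | "generateNormalizedImagePerPage";

// Local stand-in for the image-related subset of IndexingParametersConfiguration.
interface ImageConfig {
  imageAction?: ImageAction;
  dataToExtract?: "storageMetadata" | "allMetadata" | "contentAndMetadata";
}

// True when the configuration asks the indexer to process images.
function requiresSkillset(cfg: ImageConfig): boolean {
  return cfg.imageAction !== undefined && cfg.imageAction !== "none";
}

// Hypothetical pre-flight check: returns a list of problems, empty when
// the configuration and the skillset name are consistent.
function checkImageConfig(cfg: ImageConfig, skillsetName?: string): string[] {
  const problems: string[] = [];
  if (requiresSkillset(cfg) && !skillsetName) {
    problems.push(`imageAction "${cfg.imageAction}" requires a skillset on the indexer`);
  }
  return problems;
}

const imageConfig: ImageConfig = {
  imageAction: "generateNormalizedImages",
  dataToExtract: "contentAndMetadata",
};
console.log(checkImageConfig(imageConfig).length); // 1
console.log(checkImageConfig(imageConfig, "my-skillset").length); // 0
```

The skillset name "my-skillset" is a placeholder for whatever skillset is attached to the indexer.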

indexedFileNameExtensions

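Comma-delimited list of file name extensions to select when processing from Azure Blob Storage. For example, you could limit indexing to the application files ".docx, .pptx, .msg" to include only those file types.

indexedFileNameExtensions?: string

Property Value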

string
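
Both extension properties take a single comma-delimited string, so application code that holds the extensions as an array needs to join them. The helper `toExtensionList` below is a hypothetical convenience, not an SDK function; normalizing to lowercase and adding a leading dot are assumptions of this sketch rather than documented requirements.

```typescript
// Hypothetical helper: turns an array of extensions into the comma-delimited
// string expected by indexedFileNameExtensions / excludedFileNameExtensions.
function toExtensionList(extensions: string[]): string {
  return extensions
    .map((ext) => ext.trim().toLowerCase())
    .filter((ext) => ext.length > 0)
    // Add the leading dot when the caller left it out.
    .map((ext) => (ext.charAt(0) === "." ? ext : "." + ext))
    .join(",");
}

// Configuration fragment using the documented property names.
const fileFilterConfig = {
  indexedFileNameExtensions: toExtensionList(["docx", ".pptx", " .MSG "]),
  excludedFileNameExtensions: toExtensionList([".png", ".mp4"]),
};

console.log(fileFilterConfig.indexedFileNameExtensions); // .docx,.pptx,.msg
console.log(fileFilterConfig.excludedFileNameExtensions); // .png,.mp4
```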

indexStorageMetadataOnlyForOversizedDocuments

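For Azure blobs, set this property to true to still index the storage metadata of blobs whose content is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://docs.microsoft.com/azure/search/search-limits-quotas-capacity.

indexStorageMetadataOnlyForOversizedDocuments?: boolean

Property Value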

boolean
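
The three blob error-handling flags (failOnUnprocessableDocument, failOnUnsupportedContentType, and indexStorageMetadataOnlyForOversizedDocuments) are often adjusted together when a container holds mixed content. A minimal sketch, assuming a local stand-in type and a hypothetical helper `withLenientBlobHandling` that is not part of the SDK:

```typescript
// Local stand-in for the blob error-handling subset of the interface.
interface BlobToleranceConfig {
  failOnUnprocessableDocument?: boolean;
  failOnUnsupportedContentType?: boolean;
  indexStorageMetadataOnlyForOversizedDocuments?: boolean;
}

// Hypothetical helper: returns a copy of the configuration that keeps
// indexing past unprocessable, unsupported, and oversized blobs.
function withLenientBlobHandling(base: BlobToleranceConfig): BlobToleranceConfig {
  return {
    ...base,
    failOnUnprocessableDocument: false,
    failOnUnsupportedContentType: false,
    indexStorageMetadataOnlyForOversizedDocuments: true,
  };
}

const lenientConfig = withLenientBlobHandling({});
console.log(lenientConfig);
```

Returning a copy rather than mutating the input keeps a stricter base configuration reusable for other indexers.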

parsingMode

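Represents the parsing mode for indexing from an Azure blob data source.

parsingMode?: "text" | "default" | "delimitedText" | "json" | "jsonArray" | "jsonLines"

Property Value

"text" | "default" | "delimitedText" | "json" | "jsonArray" | "jsonLines"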

pdfTextRotationAlgorithm

Determines the algorithm used to extract text from PDF files in Azure Blob Storage.

pdfTextRotationAlgorithm?: "none" | "detectAngles"

Property Value

"none" | "detectAngles"

queryTimeout

Increases the timeout beyond the 5-minute default for Azure SQL Database data sources, specified in the format "hh:mm:ss".

queryTimeout?: string

Property Value

string
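
Since the timeout is a string in "hh:mm:ss" form, it can be convenient to derive it from a number of seconds. The function `toQueryTimeout` below is a hypothetical helper written for this example, not an SDK function.

```typescript
// Hypothetical helper: formats a whole number of seconds as "hh:mm:ss".
function toQueryTimeout(totalSeconds: number): string {
  if (totalSeconds < 0 || Math.floor(totalSeconds) !== totalSeconds) {
    throw new RangeError("totalSeconds must be a non-negative integer");
  }
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  // Zero-pad each part to two digits.
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return pad(hours) + ":" + pad(minutes) + ":" + pad(seconds);
}

// Configuration fragment: raise the timeout from the 5-minute default to 10 minutes.
const sqlTimeoutConfig = { queryTimeout: toQueryTimeout(10 * 60) };
console.log(sqlTimeoutConfig.queryTimeout); // 00:10:00
```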
