Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Databricks SQL
Databricks Runtime 13.2 and above
Returns the combined schema of all JSON strings in a group in DDL format.
Syntax
schema_of_json_agg(jsonStr [, options] ) [FILTER ( WHERE cond ) ]
This function can also be invoked as a window function using the OVER clause.
Arguments
jsonStr: ASTRINGliteral withJSON.options: An optionalMAPliteral with keys and values beingSTRING. For details on options, seefrom_jsonfunction.cond: An optionalBOOLEANexpression filtering the rows used for aggregation.
Returns
A STRING holding a definition of an array of structs with n fields of strings where the column names are derived from the distinct set of JSON keys .
The field values hold the derived formatted SQL types.
The schema of each record is merged together by field name.
When two fields with the same name have a different type across records, Azure Databricks uses the least common type.
When no such type exists, the type is derived as a STRING.
For example, INT and DOUBLE become DOUBLE, while STRUCT<i INT> and STRING become STRING.
The schema obtained from reading a column of JSON data is the same as the one derived from the following.
SELECT * FROM json.`/my/data`;
To derive the schema of a single JSON string, use schema_of_json function.
Examples
> SELECT schema_of_json_agg(a) FROM VALUES('{"foo": "bar"}') AS data(a);
STRUCT<foo: STRING>
> SELECT schema_of_json_agg(a) FROM VALUES('[1]') AS data(a);
ARRAY<BIGINT>
> CREATE TEMPORARY VIEW data(a) AS VALUES
('{"foo": "bar", "wing": {"ding": "dong"}}'),
('{"top": "level", "wing": {"stop": "go"}}')
> SELECT schema_of_json_agg(a) FROM data;
STRUCT<foo: STRING,top: STRING,wing: STRUCT<ding: STRING, stop: STRING>>