Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:  Databricks SQL
 Databricks SQL  Databricks Runtime 13.2 and above
 Databricks Runtime 13.2 and above
Returns the combined schema of all JSON strings in a group in DDL format.
Syntax
schema_of_json_agg(jsonStr [, options] ) [FILTER ( WHERE cond ) ]
This function can also be invoked as a window function using the OVER clause.
Arguments
- jsonStr: A- STRINGliteral with- JSON.
- options: An optional- MAPliteral with keys and values being- STRING. For details on options, see- from_jsonfunction.
- cond: An optional- BOOLEANexpression filtering the rows used for aggregation.
Returns
A STRING holding a definition of an array of structs with n fields of strings where the column names are derived from the distinct set of JSON keys .
The field values hold the derived formatted SQL types.
The schema of each record is merged together by field name.
When two fields with the same name have a different type across records, Azure Databricks uses the least common type.
When no such type exists, the type is derived as a STRING.
For example, INT and DOUBLE become DOUBLE, while STRUCT<i INT> and STRING become STRING.
The schema obtained from reading a column of JSON data is the same as the one derived from the following.
SELECT * FROM json.`/my/data`;
To derive the schema of a single JSON string, use schema_of_json function.
Examples
> SELECT schema_of_json_agg(a) FROM VALUES('{"foo": "bar"}') AS data(a);
  STRUCT<foo: STRING>
> SELECT schema_of_json_agg(a) FROM VALUES('[1]') AS data(a);
  ARRAY<BIGINT>
> CREATE TEMPORARY VIEW data(a) AS VALUES
  ('{"foo": "bar",   "wing": {"ding": "dong"}}'),
  ('{"top": "level", "wing": {"stop": "go"}}')
> SELECT schema_of_json_agg(a) FROM data;
  STRUCT<foo: STRING,top: STRING,wing: STRUCT<ding: STRING, stop: STRING>>