Note
Support for this Databricks Runtime version has ended. For the end-of-support date, see End-of-support history. For all supported Databricks Runtime versions, see Databricks Runtime release notes versions and compatibility.
The following release notes provide information about Databricks Runtime 12.1, powered by Apache Spark 3.3.1.
Databricks released this version in January 2023.
New features and improvements
- Delta Lake table features supported for protocol management
- Predictive I/O for updates is in public preview
- Catalog Explorer is now available to all personas
- Support for multiple stateful operators in a single streaming query
- Support for protocol buffers is in Public Preview
- Support for Confluent Schema Registry authentication
- Support for sharing table history with Delta Sharing shares
- Support for streaming with Delta Sharing shares
- Table version using timestamp now supported for Delta Sharing tables in catalogs
- Support for WHEN NOT MATCHED BY SOURCE for MERGE INTO
- Optimized statistics collection for CONVERT TO DELTA
- Unity Catalog support for undropping tables
Delta Lake table features supported for protocol management
Azure Databricks has introduced support for Delta Lake table features, which use granular flags to specify which features a given table supports. See Delta Lake feature compatibility and protocols.
Predictive I/O for updates is in public preview
Predictive I/O now accelerates DELETE, MERGE, and UPDATE operations for Delta tables with deletion vectors enabled on Photon-enabled compute. See What is predictive I/O?.
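Predictive I/O applies only to Delta tables that have deletion vectors enabled. As a minimal sketch (the table name is a placeholder, and delta.enableDeletionVectors is assumed to be the relevant Delta table property), deletion vectors can be enabled before running DML:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Enable deletion vectors on an existing Delta table (table name is hypothetical).
spark.sql("""
  ALTER TABLE main.default.events
  SET TBLPROPERTIES ('delta.enableDeletionVectors' = true)
""")

# Subsequent DELETE, MERGE, and UPDATE statements on this table are candidates
# for predictive I/O acceleration when running on Photon-enabled compute.
spark.sql("DELETE FROM main.default.events WHERE event_date < '2022-01-01'")
```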
Catalog Explorer is now available to all personas
Catalog Explorer is now available to all Azure Databricks personas when using Databricks Runtime 7.3 LTS and above.
Support for multiple stateful operators in a single streaming query
Users can now chain stateful operators in append mode within a single streaming query. Not all operators are fully supported: stream-stream time interval joins and flatMapGroupsWithState cannot be chained with other stateful operators.
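As an illustrative sketch (the rate sources, column names, and sink are placeholders, not taken from the release notes), a stream-stream equi-join followed by a windowed aggregation, two stateful operators, can now run in one append-mode query:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.getOrCreate()

# Two toy event streams; in practice these would be Kafka or other sources.
impressions = (
    spark.readStream.format("rate").option("rowsPerSecond", 5).load()
    .withColumnRenamed("value", "ad_id")
    .withWatermark("timestamp", "10 minutes")
)
clicks = (
    spark.readStream.format("rate").option("rowsPerSecond", 5).load()
    .withColumnRenamed("value", "ad_id")
    .withWatermark("timestamp", "10 minutes")
)

# Stateful operator 1: stream-stream equi-join (not a time interval join).
joined = impressions.alias("i").join(clicks.alias("c"), "ad_id")

# Stateful operator 2: windowed aggregation chained after the join.
counts = joined.groupBy(window(col("i.timestamp"), "5 minutes")).count()

query = (
    counts.writeStream
    .outputMode("append")          # chaining stateful operators requires append mode
    .format("memory")
    .queryName("chained_counts")
    .start()
)
```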
Support for protocol buffers is in Public Preview
You can use the from_protobuf and to_protobuf functions to exchange data between binary and struct types. See Read and write protocol buffers.
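A minimal PySpark sketch of the round trip between binary and struct columns (the Kafka source, descriptor file path, and message name are hypothetical placeholders; see Read and write protocol buffers for the documented signatures):

```python
from pyspark.sql import SparkSession
from pyspark.sql.protobuf.functions import from_protobuf, to_protobuf

spark = SparkSession.builder.getOrCreate()

# Placeholder source with a binary protobuf payload in the "value" column.
input_df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

desc_file = "/dbfs/schemas/events.desc"  # compiled protobuf descriptor (hypothetical path)

# Binary -> struct
parsed = input_df.select(
    from_protobuf(input_df.value, "Event", descFilePath=desc_file).alias("event")
)

# Struct -> binary
serialized = parsed.select(
    to_protobuf(parsed.event, "Event", descFilePath=desc_file).alias("value")
)
```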
Support for Confluent Schema Registry authentication
Azure Databricks integration with Confluent Schema Registry now supports external schema registry addresses with authentication. This feature is available for from_avro, to_avro, from_protobuf, and to_protobuf functions. See Protobuf or Avro.
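A hedged sketch of what authenticated schema registry access might look like (the option keys follow the pattern in the Protobuf and Avro documentation and should be verified there; the registry address, subject, and credentials are placeholders):

```python
from pyspark.sql import SparkSession
from pyspark.sql.protobuf.functions import from_protobuf

spark = SparkSession.builder.getOrCreate()

# Placeholder stream with a binary protobuf column named "value".
input_df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Confluent Schema Registry options with basic authentication (all values are placeholders).
schema_registry_options = {
    "schema.registry.subject": "events-value",
    "schema.registry.address": "https://psrc-xxxxx.confluent.cloud",
    "confluent.schema.registry.basic.auth.credentials.source": "USER_INFO",
    "confluent.schema.registry.basic.auth.user.info": "<api-key>:<api-secret>",
}

# When schema registry options are supplied, the message schema is resolved
# from the registry instead of a descriptor file.
parsed = input_df.select(
    from_protobuf(input_df.value, options=schema_registry_options).alias("event")
)
```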
Support for sharing table history with Delta Sharing shares
You can now share a table with full history using Delta Sharing, allowing recipients to perform time travel queries and query the table using Spark Structured Streaming. WITH HISTORY is recommended instead of CHANGE DATA FEED, although the latter continues to be supported. See ALTER SHARE and Add tables to a share.
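For example, a provider might add a table to a share with history enabled (share, catalog, schema, and table names are placeholders; see ALTER SHARE for the full syntax):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Share a table together with its full history so recipients can time travel
# and run Structured Streaming reads against it.
spark.sql("""
  ALTER SHARE my_share
  ADD TABLE my_catalog.my_schema.my_table
  WITH HISTORY
""")
```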
Support for streaming with Delta Sharing shares
Spark Structured Streaming now works with the format deltasharing on a source Delta Sharing table that has been shared using WITH HISTORY.
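A minimal recipient-side sketch (the profile file path, share coordinates, and console sink are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Stream from a shared table; the source table must be shared using WITH HISTORY.
stream_df = (
    spark.readStream
    .format("deltaSharing")
    .load("dbfs:/path/to/config.share#my_share.my_schema.my_table")
)

query = stream_df.writeStream.format("console").start()
```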
Table version using timestamp now supported for Delta Sharing tables in catalogs
You can now use the SQL syntax TIMESTAMP AS OF in SELECT statements to specify the version of a Delta Sharing table that's mounted in a catalog. Tables must be shared using WITH HISTORY.
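For example (catalog, schema, table name, and timestamp are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Query the shared table as of a point in time; the table must be shared WITH HISTORY.
df = spark.sql("""
  SELECT *
  FROM shared_catalog.shared_schema.shared_table TIMESTAMP AS OF '2023-01-15 00:00:00'
""")
```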
Support for WHEN NOT MATCHED BY SOURCE for MERGE INTO
You can now add WHEN NOT MATCHED BY SOURCE clauses to MERGE INTO to update or delete rows in the chosen table that don't have matches in the source table based on the merge condition. The new clause is available in SQL, Python, Scala, and Java. See MERGE INTO.
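A minimal sketch (target and source are hypothetical Delta tables) that updates matched rows, inserts new rows, and deletes target rows with no match in the source:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
  MERGE INTO target AS t
  USING source AS s
  ON t.key = s.key
  WHEN MATCHED THEN
    UPDATE SET t.value = s.value
  WHEN NOT MATCHED THEN
    INSERT (key, value) VALUES (s.key, s.value)
  WHEN NOT MATCHED BY SOURCE THEN
    DELETE
""")
```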
Optimized statistics collection for CONVERT TO DELTA
Statistics collection for the CONVERT TO DELTA operation is now much faster, reducing the number of workloads that might need to fall back to NO STATISTICS for efficiency.
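For example (the Parquet path and partition column are placeholders), a conversion that collects statistics by default; NO STATISTICS remains available to skip collection entirely:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Convert a partitioned Parquet directory in place; statistics are collected by default.
spark.sql("""
  CONVERT TO DELTA parquet.`/mnt/data/events`
  PARTITIONED BY (event_date DATE)
""")

# To skip statistics collection entirely, NO STATISTICS can still be specified:
# CONVERT TO DELTA parquet.`/mnt/data/events` NO STATISTICS PARTITIONED BY (event_date DATE)
```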
Unity Catalog support for undropping tables
This feature was initially released in Public Preview. It is GA as of October 25, 2023.
You can now undrop a dropped managed or external table in an existing schema within seven days of dropping. See UNDROP and SHOW TABLES DROPPED.
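For example (catalog, schema, and table names are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# List recently dropped tables that are still within the seven-day window.
spark.sql("SHOW TABLES DROPPED IN my_catalog.my_schema").show(truncate=False)

# Restore one of them by name (or by ID, using UNDROP TABLE ... WITH ID).
spark.sql("UNDROP TABLE my_catalog.my_schema.my_table")
```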
Library upgrades
- Upgraded Python libraries:
- filelock from 3.8.0 to 3.8.2
- platformdirs from 2.5.4 to 2.6.0
- setuptools from 58.0.4 to 61.2.0
- Upgraded R libraries:
- Upgraded Java libraries:
- io.delta.delta-sharing-spark_2.12 from 0.5.2 to 0.6.2
- org.apache.hive.hive-storage-api from 2.7.2 to 2.8.1
- org.apache.parquet.parquet-column from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-common from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-encoding from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-format-structures from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-hadoop from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.apache.parquet.parquet-jackson from 1.12.3-databricks-0001 to 1.12.3-databricks-0002
- org.tukaani.xz from 1.8 to 1.9
Apache Spark
Databricks Runtime 12.1 includes Apache Spark 3.3.1. This release includes all Spark fixes and improvements included in Databricks Runtime 12.0 (EoS), as well as the following additional bug fixes and improvements made to Spark:
- [SPARK-41405] [SC-119769][12.1.0] Revert “[SC-119411][sql] Centralize the column resolution logic” and “[SC-117170][spark-41338][SQL] Resolve outer references and normal columns in the same analyzer batch”
- [SPARK-41405] [SC-119411][sql] Centralize the column resolution logic
- [SPARK-41859] [SC-119514][sql] CreateHiveTableAsSelectCommand should set the overwrite flag correctly
- [SPARK-41659] [SC-119526][connect][12.X] Enable doctests in pyspark.sql.connect.readwriter
- [SPARK-41858] [SC-119427][sql] Fix ORC reader perf regression due to DEFAULT value feature
- [SPARK-41807] [SC-119399][core] Remove non-existent error class: UNSUPPORTED_FEATURE.DISTRIBUTE_BY
- [SPARK-41578] [12.x][sc-119273][SQL] Assign name to _LEGACY_ERROR_TEMP_2141
- [SPARK-41571] [SC-119362][sql] Assign name to _LEGACY_ERROR_TEMP_2310
- [SPARK-41810] [SC-119373][connect] Infer names from a list of dictionaries in SparkSession.createDataFrame
- [SPARK-40993] [SC-119504][spark-41705][CONNECT][12.x] Move Spark Connect documentation and script to dev/ and Python documentation
- [SPARK-41534] [SC-119456][connect][SQL][12.x] Setup initial client module for Spark Connect
- [SPARK-41365] [SC-118498][ui][3.3] Stages UI page fails to load for proxy in specific yarn environment
- [SPARK-41481] [SC-118150][core][SQL] Reuse INVALID_TYPED_LITERAL instead of _LEGACY_ERROR_TEMP_0020
- [SPARK-41049] [SC-119305][sql] Revisit stateful expression handling
- [SPARK-41726] [SC-119248][sql] Remove OptimizedCreateHiveTableAsSelectCommand
- [SPARK-41271] [SC-118648][sc-118348][SQL] Support parameterized SQL queries by sql()
- [SPARK-41066] [SC-119344][connect][PYTHON] Implement DataFrame.sampleBy and DataFrame.stat.sampleBy
- [SPARK-41407] [SC-119402][sc-119012][SQL][all tests] Pull out v1 write to WriteFiles
- [SPARK-41565] [SC-118868][sql] Add the error class UNRESOLVED_ROUTINE
- [SPARK-41668] [SC-118925][sql] DECODE function returns wrong results when passed NULL
- [SPARK-41554] [SC-119274] fix changing of Decimal scale when scale decreased by m…
- [SPARK-41065] [SC-119324][connect][PYTHON] Implement DataFrame.freqItems and DataFrame.stat.freqItems
- [SPARK-41742] [SC-119404][spark-41745][CONNECT][12.x] Reenable doc tests and add missing column alias to count()
- [SPARK-41069] [SC-119310][connect][PYTHON] Implement DataFrame.approxQuantile and DataFrame.stat.approxQuantile
- [SPARK-41809] [SC-119367][connect][PYTHON] Make function from_json support DataType Schema
- [SPARK-41804] [SC-119382][sql] Choose correct element size in InterpretedUnsafeProjection for array of UDTs
- [SPARK-41786] [SC-119308][connect][PYTHON] Deduplicate helper functions
- [SPARK-41745] [SC-119378][spark-41789][12.X] Make createDataFrame support list of Rows
- [SPARK-41344] [SC-119217][sql] Make error clearer when table not found in SupportsCatalogOptions catalog
- [SPARK-41803] [SC-119380][connect][PYTHON] Add missing function log(arg1, arg2)
- [SPARK-41808] [SC-119356][connect][PYTHON] Make JSON functions support options
- [SPARK-41779] [SC-119275][spark-41771][CONNECT][python] Make __getitem__ support filter and select
- [SPARK-41783] [SC-119288][spark-41770][CONNECT][python] Make column op support None
- [SPARK-41440] [SC-119279][connect][PYTHON] Avoid the cache operator for general Sample.
- [SPARK-41785] [SC-119290][connect][PYTHON] Implement GroupedData.mean
- [SPARK-41629] [SC-119276][connect] Support for Protocol Extensions in Relation and Expression
- [SPARK-41417] [SC-118000][core][SQL] Rename _LEGACY_ERROR_TEMP_0019 to INVALID_TYPED_LITERAL
- [SPARK-41533] [SC-119342][connect][12.X] Proper Error Handling for Spark Connect Server / Client
- [SPARK-41292] [SC-119357][connect][12.X] Support Window in pyspark.sql.window namespace
- [SPARK-41493] [SC-119339][connect][PYTHON] Make csv functions support options
- [SPARK-39591] [SC-118675][ss] Async Progress Tracking
- [SPARK-41767] [SC-119337][connect][PYTHON][12.x] Implement Column.{withField, dropFields}
- [SPARK-41068] [SC-119268][connect][PYTHON] Implement DataFrame.stat.corr
- [SPARK-41655] [SC-119323][connect][12.X] Enable doctests in pyspark.sql.connect.column
- [SPARK-41738] [SC-119170][connect] Mix ClientId in SparkSession cache
- [SPARK-41354] [SC-119194][connect] Add RepartitionByExpression to proto
- [SPARK-41784] [SC-119289][connect][PYTHON] Add missing __rmod__ in Column
- [SPARK-41778] [SC-119262][sql] Add an alias “reduce” to ArrayAggregate
- [SPARK-41067] [SC-119171][connect][PYTHON] Implement DataFrame.stat.cov
- [SPARK-41764] [SC-119216][connect][PYTHON] Make the internal string op name consistent with FunctionRegistry
- [SPARK-41734] [SC-119160][connect] Add a parent message for Catalog
- [SPARK-41742] [SC-119263] Support df.groupBy().agg({“*”:”count”})
- [SPARK-41761] [SC-119213][connect][PYTHON] Fix arithmetic ops: __neg__, __pow__, __rpow__
- [SPARK-41062] [SC-118182][sql] Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE
- [SPARK-41751] [SC-119211][connect][PYTHON] Fix Column.{isNull, isNotNull, eqNullSafe}
- [SPARK-41728] [SC-119164][connect][PYTHON][12.x] Implement unwrap_udt function
- [SPARK-41333] [SC-119195][spark-41737] Implement GroupedData.{min, max, avg, sum}
- [SPARK-41751] [SC-119206][connect][PYTHON] Fix Column.{bitwiseAND, bitwiseOR, bitwiseXOR}
- [SPARK-41631] [SC-101081][sql] Support implicit lateral column alias resolution on Aggregate
- [SPARK-41529] [SC-119207][connect][12.X] Implement SparkSession.stop
- [SPARK-41729] [SC-119205][core][SQL][12.x] Rename _LEGACY_ERROR_TEMP_0011 to UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES
- [SPARK-41717] [SC-119078][connect][12.X] Deduplicate print and repr_html at LogicalPlan
- [SPARK-41740] [SC-119169][connect][PYTHON] Implement Column.name
- [SPARK-41733] [SC-119163][sql][SS] Apply tree-pattern based pruning for the rule ResolveWindowTime
- [SPARK-41732] [SC-119157][sql][SS] Apply tree-pattern based pruning for the rule SessionWindowing
- [SPARK-41498] [SC-119018] Propagate metadata through Union
- [SPARK-41731] [SC-119166][connect][PYTHON][12.x] Implement the column accessor
- [SPARK-41736] [SC-119161][connect][PYTHON] pyspark_types_to_proto_types should supports ArrayType
- [SPARK-41473] [SC-119092][connect][PYTHON] Implement format_number function
- [SPARK-41707] [SC-119141][connect][12.X] Implement Catalog API in Spark Connect
- [SPARK-41710] [SC-119062][connect][PYTHON] Implement Column.between
- [SPARK-41235] [SC-119088][sql][PYTHON] High-order function: array_compact implementation
- [SPARK-41518] [SC-118453][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_2422
- [SPARK-41723] [SC-119091][connect][PYTHON] Implement sequence function
- [SPARK-41703] [SC-119060][connect][PYTHON] Combine NullType and typed_null in Literal
- [SPARK-41722] [SC-119090][connect][PYTHON] Implement 3 missing time window functions
- [SPARK-41503] [SC-119043][connect][PYTHON] Implement Partition Transformation Functions
- [SPARK-41413] [SC-118968][sql] Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but join expressions are compatible
- [SPARK-41700] [SC-119046][connect][PYTHON] Remove FunctionBuilder
- [SPARK-41706] [SC-119094][connect][PYTHON] pyspark_types_to_proto_types should supports MapType
- [SPARK-41702] [SC-119049][connect][PYTHON] Add invalid column ops
- [SPARK-41660] [SC-118866][sql] Only propagate metadata columns if they are used
- [SPARK-41637] [SC-119003][sql] ORDER BY ALL
- [SPARK-41513] [SC-118945][sql] Implement an accumulator to collect per mapper row count metrics
- [SPARK-41647] [SC-119064][connect][12.X] Deduplicate docstrings in pyspark.sql.connect.functions
- [SPARK-41701] [SC-119048][connect][PYTHON] Make column op support decimal
- [SPARK-41383] [SC-119015][spark-41692][SPARK-41693] Implement rollup, cube and pivot
- [SPARK-41635] [SC-118944][sql] GROUP BY ALL
- [SPARK-41645] [SC-119057][connect][12.X] Deduplicate docstrings in pyspark.sql.connect.dataframe
- [SPARK-41688] [SC-118951][connect][PYTHON] Move Expressions to expressions.py
- [SPARK-41687] [SC-118949][connect] Deduplicate docstrings in pyspark.sql.connect.group
- [SPARK-41649] [SC-118950][connect] Deduplicate docstrings in pyspark.sql.connect.window
- [SPARK-41681] [SC-118939][connect] Factor GroupedData out to group.py
- [SPARK-41292] [SC-119038][spark-41640][SPARK-41641][connect][PYTHON][12.x] Implement Window functions
- [SPARK-41675] [SC-119031][sc-118934][CONNECT][python][12.X] Make Column op support datetime
- [SPARK-41672] [SC-118929][connect][PYTHON] Enable the deprecated functions
- [SPARK-41673] [SC-118932][connect][PYTHON] Implement Column.astype
- [SPARK-41364] [SC-118865][connect][PYTHON] Implement broadcast function
- [SPARK-41648] [SC-118914][connect][12.X] Deduplicate docstrings in pyspark.sql.connect.readwriter
- [SPARK-41646] [SC-118915][connect][12.X] Deduplicate docstrings in pyspark.sql.connect.session
- [SPARK-41643] [SC-118862][connect][12.X] Deduplicate docstrings in pyspark.sql.connect.column
- [SPARK-41663] [SC-118936][connect][PYTHON][12.x] Implement the rest of Lambda functions
- [SPARK-41441] [SC-118557][sql] Support Generate with no required child output to host outer references
- [SPARK-41669] [SC-118923][sql] Early pruning in canCollapseExpressions
- [SPARK-41639] [SC-118927][sql][PROTOBUF] : Remove ScalaReflectionLock from SchemaConverters
- [SPARK-41464] [SC-118861][connect][PYTHON] Implement DataFrame.to
- [SPARK-41434] [SC-118857][connect][PYTHON] Initial LambdaFunction implementation
- [SPARK-41539] [SC-118802][sql] Remap stats and constraints against output in logical plan for LogicalRDD
- [SPARK-41396] [SC-118786][sql][PROTOBUF] OneOf field support and recursion checks
- [SPARK-41528] [SC-118769][connect][12.X] Merge namespace of Spark Connect and PySpark API
- [SPARK-41568] [SC-118715][sql] Assign name to _LEGACY_ERROR_TEMP_1236
- [SPARK-41440] [SC-118788][connect][PYTHON] Implement DataFrame.randomSplit
- [SPARK-41583] [SC-118718][sc-118642][CONNECT][protobuf] Add Spark Connect and protobuf into setup.py with specifying dependencies
- [SPARK-27561] [SC-101081][12.x][SQL] Support implicit lateral column alias resolution on Project
- [SPARK-41535] [SC-118645][sql] Set null correctly for calendar interval fields in InterpretedUnsafeProjection and InterpretedMutableProjection
- [SPARK-40687] [SC-118439][sql] Support data masking built-in function 'mask'
- [SPARK-41520] [SC-118440][sql] Split AND_OR TreePattern to separate AND and OR TreePatterns
- [SPARK-41349] [SC-118668][connect][PYTHON] Implement DataFrame.hint
- [SPARK-41546] [SC-118541][connect][PYTHON] pyspark_types_to_proto_types should support StructType.
- [SPARK-41334] [SC-118549][connect][PYTHON] Move SortOrder proto from relations to expressions
- [SPARK-41387] [SC-118450][ss] Assert current end offset from Kafka data source for Trigger.AvailableNow
- [SPARK-41508] [SC-118445][core][SQL] Rename _LEGACY_ERROR_TEMP_1180 to UNEXPECTED_INPUT_TYPE and remove _LEGACY_ERROR_TEMP_1179
- [SPARK-41319] [SC-118441][connect][PYTHON] Implement Column.{when, otherwise} and Function when with UnresolvedFunction
- [SPARK-41541] [SC-118460][sql] Fix call to wrong child method in SQLShuffleWriteMetricsReporter.decRecordsWritten()
- [SPARK-41453] [SC-118458][connect][PYTHON] Implement DataFrame.subtract
- [SPARK-41248] [SC-118436][sc-118303][SQL] Add “spark.sql.json.enablePartialResults” to enable/disable JSON partial results
- [SPARK-41437] Revert “[SC-117601][sql] Do not optimize the input query twice for v1 write fallback”
- [SPARK-41472] [SC-118352][connect][PYTHON] Implement the rest of string/binary functions
- [SPARK-41526] [SC-118355][connect][PYTHON] Implement Column.isin
- [SPARK-32170] [SC-118384] [CORE] Improve the speculation through the stage task metrics.
- [SPARK-41524] [SC-118399][ss] Differentiate SQLConf and extraOptions in StateStoreConf for its usage in RocksDBConf
- [SPARK-41465] [SC-118381][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1235
- [SPARK-41511] [SC-118365][sql] LongToUnsafeRowMap support ignoresDuplicatedKey
- [SPARK-41409] [SC-118302][core][SQL] Rename _LEGACY_ERROR_TEMP_1043 to WRONG_NUM_ARGS.WITHOUT_SUGGESTION
- [SPARK-41438] [SC-118344][connect][PYTHON] Implement DataFrame.colRegex
- [SPARK-41437] [SC-117601][sql] Do not optimize the input query twice for v1 write fallback
- [SPARK-41314] [SC-117172][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1094
- [SPARK-41443] [SC-118004][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1061
- [SPARK-41506] [SC-118241][connect][PYTHON] Refactor LiteralExpression to support DataType
- [SPARK-41448] [SC-118046] Make consistent MR job IDs in FileBatchWriter and FileFormatWriter
- [SPARK-41456] [SC-117970][sql] Improve the performance of try_cast
- [SPARK-41495] [SC-118125][connect][PYTHON] Implement collection functions: P~Z
- [SPARK-41478] [SC-118167][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1234
- [SPARK-41406] [SC-118161][sql] Refactor error message for NUM_COLUMNS_MISMATCH to make it more generic
- [SPARK-41404] [SC-118016][sql] Refactor ColumnVectorUtils#toBatch to make ColumnarBatchSuite#testRandomRows test more primitive dataType
- [SPARK-41468] [SC-118044][sql] Fix PlanExpression handling in EquivalentExpressions
- [SPARK-40775] [SC-118045][sql] Fix duplicate description entries for V2 file scans
- [SPARK-41492] [SC-118042][connect][PYTHON] Implement MISC functions
- [SPARK-41459] [SC-118005][sql] fix thrift server operation log output is empty
- [SPARK-41395] [SC-117899][sql] InterpretedMutableProjection should use setDecimal to set null values for decimals in an unsafe row
- [SPARK-41376] [SC-117840][core][3.3] Correct the Netty preferDirectBufs check logic on executor start
- [SPARK-41484] [SC-118159][sc-118036][CONNECT][python][12.x] Implement collection functions: E~M
- [SPARK-41389] [SC-117426][core][SQL] Reuse WRONG_NUM_ARGS instead of _LEGACY_ERROR_TEMP_1044
- [SPARK-41462] [SC-117920][sql] Date and timestamp type can up cast to TimestampNTZ
- [SPARK-41435] [SC-117810][sql] Change to call invalidFunctionArgumentsError for curdate() when expressions is not empty
- [SPARK-41187] [SC-118030][core] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen
- [SPARK-41360] [SC-118083][core] Avoid BlockManager re-registration if the executor has been lost
- [SPARK-41378] [SC-117686][sql] Support Column Stats in DS v2
- [SPARK-41402] [SC-117910][sql][CONNECT][12.x] Override prettyName of StringDecode
- [SPARK-41414] [SC-118041][connect][PYTHON][12.x] Implement date/timestamp functions
- [SPARK-41329] [SC-117975][connect] Resolve circular imports in Spark Connect
- [SPARK-41477] [SC-118025][connect][PYTHON] Correctly infer the datatype of literal integers
- [SPARK-41446] [SC-118024][connect][PYTHON][12.x] Make createDataFrame support schema and more input dataset types
- [SPARK-41475] [SC-117997][connect] Fix lint-scala command error and typo
- [SPARK-38277] [SC-117799][ss] Clear write batch after RocksDB state store's commit
- [SPARK-41375] [SC-117801][ss] Avoid empty latest KafkaSourceOffset
- [SPARK-41412] [SC-118015][connect] Implement Column.cast
- [SPARK-41439] [SC-117893][connect][PYTHON] Implement DataFrame.melt and DataFrame.unpivot
- [SPARK-41399] [SC-118007][sc-117474][CONNECT] Refactor column related tests to test_connect_column
- [SPARK-41351] [SC-117957][sc-117412][CONNECT][12.x] Column should support != operator
- [SPARK-40697] [SC-117806][sc-112787][SQL] Add read-side char padding to cover external data files
- [SPARK-41349] [SC-117594][connect][12.X] Implement DataFrame.hint
- [SPARK-41338] [SC-117170][sql] Resolve outer references and normal columns in the same analyzer batch
- [SPARK-41436] [SC-117805][connect][PYTHON] Implement collection functions: A~C
- [SPARK-41445] [SC-117802][connect] Implement DataFrameReader.parquet
- [SPARK-41452] [SC-117865][sql] to_char should return null when format is null
- [SPARK-41444] [SC-117796][connect] Support read.json()
- [SPARK-41398] [SC-117508][sql] Relax constraints on Storage-Partitioned Join when partition keys after runtime filtering do not match
- [SPARK-41228] [SC-117169][sql] Rename & Improve error message for COLUMN_NOT_IN_GROUP_BY_CLAUSE.
- [SPARK-41381] [SC-117593][connect][PYTHON] Implement count_distinct and sum_distinct functions
- [SPARK-41433] [SC-117596][connect] Make Max Arrow BatchSize configurable
- [SPARK-41397] [SC-117590][connect][PYTHON] Implement part of string/binary functions
- [SPARK-41382] [SC-117588][connect][PYTHON] Implement product function
- [SPARK-41403] [SC-117595][connect][PYTHON] Implement DataFrame.describe
- [SPARK-41366] [SC-117580][connect] DF.groupby.agg() should be compatible
- [SPARK-41369] [SC-117584][connect] Add connect common to servers' shaded jar
- [SPARK-41411] [SC-117562][ss] Multi-Stateful Operator watermark support bug fix
- [SPARK-41176] [SC-116630][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1042
- [SPARK-41380] [SC-117476][connect][PYTHON][12.x] Implement aggregation functions
- [SPARK-41363] [SC-117470][connect][PYTHON][12.x] Implement normal functions
- [SPARK-41305] [SC-117411][connect] Improve Documentation for Command proto
- [SPARK-41372] [SC-117427][connect][PYTHON] Implement DataFrame TempView
- [SPARK-41379] [SC-117420][ss][PYTHON] Provide cloned spark session in DataFrame in user function for foreachBatch sink in PySpark
- [SPARK-41373] [SC-117405][sql][ERROR] Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION
- [SPARK-41358] [SC-117417][sql] Refactor ColumnVectorUtils#populate method to use PhysicalDataType instead of DataType
- [SPARK-41355] [SC-117423][sql] Workaround hive table name validation issue
- [SPARK-41390] [SC-117429][sql] Update the script used to generate register function in UDFRegistration
- [SPARK-41206] [SC-117233][sc-116381][SQL] Rename the error class _LEGACY_ERROR_TEMP_1233 to COLUMN_ALREADY_EXISTS
- [SPARK-41357] [SC-117310][connect][PYTHON][12.x] Implement math functions
- [SPARK-40970] [SC-117308][connect][PYTHON] Support List[Column] for Join's on argument
- [SPARK-41345] [SC-117178][connect] Add Hint to Connect Proto
- [SPARK-41226] [SC-117194][sql][12.x] Refactor Spark types by introducing physical types
- [SPARK-41317] [SC-116902][connect][PYTHON][12.x] Add basic support for DataFrameWriter
- [SPARK-41347] [SC-117173][connect] Add Cast to Expression proto
- [SPARK-41323] [SC-117128][sql] Support current_schema
- [SPARK-41339] [SC-117171][sql] Close and recreate RocksDB write batch instead of just clearing
- [SPARK-41227] [SC-117165][connect][PYTHON] Implement DataFrame cross join
- [SPARK-41346] [SC-117176][connect][PYTHON] Implement asc and desc functions
- [SPARK-41343] [SC-117166][connect] Move FunctionName parsing to server side
- [SPARK-41321] [SC-117163][connect] Support target field for UnresolvedStar
- [SPARK-41237] [SC-117167][sql] Reuse the error class UNSUPPORTED_DATATYPE for _LEGACY_ERROR_TEMP_0030
- [SPARK-41309] [SC-116916][sql] Reuse INVALID_SCHEMA.NON_STRING_LITERAL instead of _LEGACY_ERROR_TEMP_1093
- [SPARK-41276] [SC-117136][sql][ML][mllib][PROTOBUF][python][R][ss][AVRO] Optimize constructor use of StructType
- [SPARK-41335] [SC-117135][connect][PYTHON] Support IsNull and IsNotNull in Column
- [SPARK-41332] [SC-117131][connect][PYTHON] Fix nullOrdering in SortOrder
- [SPARK-41325] [SC-117132][connect][12.X] Fix missing avg() for GroupBy on DF
- [SPARK-41327] [SC-117137][core] Fix SparkStatusTracker.getExecutorInfos by switch On/OffHeapStorageMemory info
- [SPARK-41315] [SC-117129][connect][PYTHON] Implement DataFrame.replace and DataFrame.na.replace
- [SPARK-41328] [SC-117125][connect][PYTHON] Add logical and string API to Column
- [SPARK-41331] [SC-117127][connect][PYTHON] Add orderBy and drop_duplicates
- [SPARK-40987] [SC-117124][core] BlockManager#removeBlockInternal should ensure the lock is unlocked gracefully
- [SPARK-41268] [SC-117102][sc-116970][CONNECT][python] Refactor “Column” for API Compatibility
- [SPARK-41312] [SC-116881][connect][PYTHON][12.x] Implement DataFrame.withColumnRenamed
- [SPARK-41221] [SC-116607][sql] Add the error class INVALID_FORMAT
- [SPARK-41272] [SC-116742][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_2019
- [SPARK-41180] [SC-116760][sql] Reuse INVALID_SCHEMA instead of _LEGACY_ERROR_TEMP_1227
- [SPARK-41260] [SC-116880][python][SS][12.x] Cast NumPy instances to Python primitive types in GroupState update
- [SPARK-41174] [SC-116609][core][SQL] Propagate an error class to users for invalid format of to_binary()
- [SPARK-41264] [SC-116971][connect][PYTHON] Make Literal support more datatypes
- [SPARK-41326] [SC-116972] [CONNECT] Fix deduplicate is missing input
- [SPARK-41316] [SC-116900][sql] Enable tail-recursion wherever possible
- [SPARK-41297] [SC-116931] [CONNECT] [PYTHON] Support String Expressions in filter.
- [SPARK-41256] [SC-116932][sc-116883][CONNECT] Implement DataFrame.withColumn(s)
- [SPARK-41182] [SC-116632][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1102
- [SPARK-41181] [SC-116680][sql] Migrate the map options errors onto error classes
- [SPARK-40940] [SC-115993][12.x] Remove Multi-stateful operator checkers for streaming queries.
- [SPARK-41310] [SC-116885][connect][PYTHON] Implement DataFrame.toDF
- [SPARK-41179] [SC-116631][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1092
- [SPARK-41003] [SC-116741][sql] BHJ LeftAnti does not update numOutputRows when codegen is disabled
- [SPARK-41148] [SC-116878][connect][PYTHON] Implement DataFrame.dropna and DataFrame.na.drop
- [SPARK-41217] [SC-116380][sql] Add the error class FAILED_FUNCTION_CALL
- [SPARK-41308] [SC-116875][connect][PYTHON] Improve DataFrame.count()
- [SPARK-41301] [SC-116786] [CONNECT] Homogenize Behavior for SparkSession.range()
- [SPARK-41306] [SC-116860][connect] Improve Connect Expression proto documentation
- [SPARK-41280] [SC-116733][connect] Implement DataFrame.tail
- [SPARK-41300] [SC-116751] [CONNECT] Unset schema is interpreted as Schema
- [SPARK-41255] [SC-116730][sc-116695] [CONNECT] Rename RemoteSparkSession
- [SPARK-41250] [SC-116788][sc-116633][CONNECT][python] DataFrame. toPandas should not return optional pandas dataframe
- [SPARK-41291] [SC-116738][connect][PYTHON] DataFrame.explain should print and return None
- [SPARK-41278] [SC-116732][connect] Clean up unused QualifiedAttribute in Expression.proto
- [SPARK-41097] [SC-116653][core][SQL][ss][PROTOBUF] Remove redundant collection conversion base on Scala 2.13 code
- [SPARK-41261] [SC-116718][python][SS] Fix issue for applyInPandasWithState when the columns of grouping keys are not placed in order from earliest
- [SPARK-40872] [SC-116717][3.3] Fallback to original shuffle block when a push-merged shuffle chunk is zero-size
- [SPARK-41114] [SC-116628][connect] Support local data for LocalRelation
- [SPARK-41216] [SC-116678][connect][PYTHON] Implement DataFrame.{isLocal, isStreaming, printSchema, inputFiles}
- [SPARK-41238] [SC-116670][connect][PYTHON] Support more built-in datatypes
- [SPARK-41230] [SC-116674][connect][PYTHON] Remove str from Aggregate expression type
- [SPARK-41224] [SC-116652][spark-41165][SPARK-41184][connect] Optimized Arrow-based collect implementation to stream from server to client
- [SPARK-41222] [SC-116625][connect][PYTHON] Unify the typing definitions
- [SPARK-41225] [SC-116623] [CONNECT] [PYTHON] Disable unsupported functions.
- [SPARK-41201] [SC-116526][connect][PYTHON] Implement DataFrame.SelectExpr in Python client
- [SPARK-41203] [SC-116258] [CONNECT] Support Dataframe.tansform in Python client.
- [SPARK-41213] [SC-116375][connect][PYTHON] Implement DataFrame.__repr__ and DataFrame.dtypes
- [SPARK-41169] [SC-116378][connect][PYTHON] Implement DataFrame.drop
- [SPARK-41172] [SC-116245][sql] Migrate the ambiguous ref error to an error class
- [SPARK-41122] [SC-116141][connect] Explain API can support different modes
- [SPARK-41209] [SC-116584][sc-116376][PYTHON] Improve PySpark type inference in _merge_type method
- [SPARK-41196] [SC-116555][sc-116179] [CONNECT] Homogenize the protobuf version across the Spark connect server to use the same major version.
- [SPARK-35531] [SC-116409][sql] Update hive table stats without unnecessary convert
- [SPARK-41154] [SC-116289][sql] Incorrect relation caching for queries with time travel spec
- [SPARK-41212] [SC-116554][sc-116389][CONNECT][python] Implement DataFrame.isEmpty
- [SPARK-41135] [SC-116400][sql] Rename UNSUPPORTED_EMPTY_LOCATION to INVALID_EMPTY_LOCATION
- [SPARK-41183] [SC-116265][sql] Add an extension API to do plan normalization for caching
- [SPARK-41054] [SC-116447][ui][CORE] Support RocksDB as KVStore in live UI
- [SPARK-38550] [SC-115223] Revert “[SQL][core] Use a disk-based store to save more debug information for live UI”
- [SPARK-41173] [SC-116185][sql] Move require() out from the constructors of string expressions
- [SPARK-41188] [SC-116242][core][ML] Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for spark executor JVM processes
- [SPARK-41130] [SC-116155][sql] Rename OUT_OF_DECIMAL_TYPE_RANGE to NUMERIC_OUT_OF_SUPPORTED_RANGE
- [SPARK-41175] [SC-116238][sql] Assign a name to the error class _LEGACY_ERROR_TEMP_1078
- [SPARK-41106] [SC-116073][sql] Reduce collection conversion when create AttributeMap
- [SPARK-41139] [SC-115983][sql] Improve error class: PYTHON_UDF_IN_ON_CLAUSE
- [SPARK-40657] [SC-115997][protobuf] Require shading for Java class jar, improve error handling
- [SPARK-40999] [SC-116168] Hint propagation to subqueries
- [SPARK-41017] [SC-116054][sql] Support column pruning with multiple nondeterministic Filters
- [SPARK-40834] [SC-114773][sql] Use SparkListenerSQLExecutionEnd to track final SQL status in UI
- [SPARK-41118] [SC-116027][sql] to_number/try_to_number should return null when format is null
- [SPARK-39799] [SC-115984][sql] DataSourceV2: View catalog interface
- [SPARK-40665] [SC-116210][sc-112300][CONNECT] Avoid embedding Spark Connect in the Apache Spark binary release
- [SPARK-41048] [SC-116043][sql] Improve output partitioning and ordering with AQE cache
- [SPARK-41198] [SC-116256][ss] Fix metrics in streaming query having CTE and DSv1 streaming source
- [SPARK-41199] [SC-116244][ss] Fix metrics issue when DSv1 streaming source and DSv2 streaming source are co-used
- [SPARK-40957] [SC-116261][sc-114706] Add in memory cache in HDFSMetadataLog
- [SPARK-40940] Revert “[SC-115993] Remove Multi-stateful operator checkers for streaming queries.”
- [SPARK-41090] [SC-116040][sql] Throw Exception for db_name.view_name when creating temp view by Dataset API
- [SPARK-41133] [SC-116085][sql] Integrate UNSCALED_VALUE_TOO_LARGE_FOR_PRECISION into NUMERIC_VALUE_OUT_OF_RANGE
- [SPARK-40557] [SC-116182][sc-111442][CONNECT] Code Dump 9 Commits
- [SPARK-40448] [SC-114447][sc-111314][CONNECT] Spark Connect build as Driver Plugin with Shaded Dependencies
- [SPARK-41096] [SC-115812][sql] Support reading parquet FIXED_LEN_BYTE_ARRAY type
- [SPARK-41140] [SC-115879][sql] Rename the error class _LEGACY_ERROR_TEMP_2440 to INVALID_WHERE_CONDITION
- [SPARK-40918] [SC-114438][sql] Mismatch between FileSourceScanExec and Orc and ParquetFileFormat on producing columnar output
- [SPARK-41155] [SC-115991][sql] Add error message to SchemaColumnConvertNotSupportedException
- [SPARK-40940] [SC-115993] Remove Multi-stateful operator checkers for streaming queries.
- [SPARK-41098] [SC-115790][sql] Rename GROUP_BY_POS_REFERS_AGG_EXPR to GROUP_BY_POS_AGGREGATE
- [SPARK-40755] [SC-115912][sql] Migrate type check failures of number formatting onto error classes
- [SPARK-41059] [SC-115658][sql] Rename _LEGACY_ERROR_TEMP_2420 to NESTED_AGGREGATE_FUNCTION
- [SPARK-41044] [SC-115662][sql] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR
- [SPARK-40973] [SC-115132][sql] Rename _LEGACY_ERROR_TEMP_0055 to UNCLOSED_BRACKETED_COMMENT
Maintenance updates
See Databricks Runtime 12.1 maintenance updates.
System environment
- Operating System: Ubuntu 20.04.5 LTS
- Java: Zulu 8.64.0.19-CA-linux64
- Scala: 2.12.14
- Python: 3.9.5
- R: 4.2.2
- Delta Lake: 2.2.0
Installed Python libraries
| Library | Version | Library | Version | Library | Version |
|---|---|---|---|---|---|
| argon2-cffi | 21.3.0 | argon2-cffi-bindings | 21.2.0 | asttokens | 2.0.5 |
| attrs | 21.4.0 | backcall | 0.2.0 | backports.entry-points-selectable | 1.2.0 |
| beautifulsoup4 | 4.11.1 | black | 22.3.0 | bleach | 4.1.0 |
| boto3 | 1.21.32 | botocore | 1.24.32 | certifi | 2021.10.8 |
| cffi | 1.15.0 | chardet | 4.0.0 | charset-normalizer | 2.0.4 |
| click | 8.0.4 | cryptography | 3.4.8 | cycler | 0.11.0 |
| Cython | 0.29.28 | dbus-python | 1.2.16 | debugpy | 1.5.1 |
| decorator | 5.1.1 | defusedxml | 0.7.1 | distlib | 0.3.6 |
| docstring-to-markdown | 0.11 | entrypoints | 0.4 | executing | 0.8.3 |
| facets-overview | 1.0.0 | fastjsonschema | 2.16.2 | filelock | 3.8.2 |
| fonttools | 4.25.0 | idna | 3.3 | ipykernel | 6.15.3 |
| ipython | 8.5.0 | ipython-genutils | 0.2.0 | ipywidgets | 7.7.2 |
| jedi | 0.18.1 | Jinja2 | 2.11.3 | jmespath | 0.10.0 |
| joblib | 1.1.0 | jsonschema | 4.4.0 | jupyter-client | 6.1.12 |
| jupyter_core | 4.11.2 | jupyterlab-pygments | 0.1.2 | jupyterlab-widgets | 1.0.0 |
| kiwisolver | 1.3.2 | MarkupSafe | 2.0.1 | matplotlib | 3.5.1 |
| matplotlib-inline | 0.1.2 | mccabe | 0.7.0 | mistune | 0.8.4 |
| mypy-extensions | 0.4.3 | nbclient | 0.5.13 | nbconvert | 6.4.4 |
| nbformat | 5.3.0 | nest-asyncio | 1.5.5 | nodeenv | 1.7.0 |
| notebook | 6.4.8 | numpy | 1.21.5 | packaging | 21.3 |
| pandas | 1.4.2 | pandocfilters | 1.5.0 | parso | 0.8.3 |
| pathspec | 0.9.0 | patsy | 0.5.2 | pexpect | 4.8.0 |
| pickleshare | 0.7.5 | Pillow | 9.0.1 | pip | 21.2.4 |
| platformdirs | 2.6.0 | plotly | 5.6.0 | pluggy | 1.0.0 |
| prometheus-client | 0.13.1 | prompt-toolkit | 3.0.20 | protobuf | 3.19.4 |
| psutil | 5.8.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
| pure-eval | 0.2.2 | pyarrow | 7.0.0 | pycparser | 2.21 |
| pyflakes | 2.5.0 | Pygments | 2.11.2 | PyGObject | 3.36.0 |
| pyodbc | 4.0.32 | pyparsing | 3.0.4 | pyright | 1.1.283 |
| pyrsistent | 0.18.0 | python-dateutil | 2.8.2 | python-lsp-jsonrpc | 1.0.0 |
| python-lsp-server | 1.6.0 | pytz | 2021.3 | pyzmq | 22.3.0 |
| requests | 2.27.1 | requests-unixsocket | 0.2.0 | rope | 0.22.0 |
| s3transfer | 0.5.0 | scikit-learn | 1.0.2 | scipy | 1.7.3 |
| seaborn | 0.11.2 | Send2Trash | 1.8.0 | setuptools | 61.2.0 |
| six | 1.16.0 | soupsieve | 2.3.1 | ssh-import-id | 5.10 |
| stack-data | 0.2.0 | statsmodels | 0.13.2 | tenacity | 8.0.1 |
| terminado | 0.13.1 | testpath | 0.5.0 | threadpoolctl | 2.2.0 |
| tokenize-rt | 4.2.1 | tomli | 1.2.2 | tornado | 6.1 |
| traitlets | 5.1.1 | typing_extensions | 4.1.1 | ujson | 5.1.0 |
| unattended-upgrades | 0.1 | urllib3 | 1.26.9 | virtualenv | 20.8.0 |
| wcwidth | 0.2.5 | webencodings | 0.5.1 | whatthepatch | 1.0.3 |
| wheel | 0.37.0 | widgetsnbextension | 3.6.1 | yapf | 0.31.0 |
Installed R libraries
R libraries are installed from the Microsoft CRAN snapshot on 2022-11-11.
| Library | Version | Library | Version | Library | Version |
|---|---|---|---|---|---|
| arrow | 10.0.0 | askpass | 1.1 | assertthat | 0.2.1 |
| backports | 1.4.1 | base | 4.2.2 | base64enc | 0.1-3 |
| bit | 4.0.4 | bit64 | 4.0.5 | blob | 1.2.3 |
| boot | 1.3-28 | brew | 1.0-8 | brio | 1.1.3 |
| broom | 1.0.1 | bslib | 0.4.1 | cachem | 1.0.6 |
| callr | 3.7.3 | caret | 6.0-93 | cellranger | 1.1.0 |
| chron | 2.3-58 | class | 7.3-20 | cli | 3.4.1 |
| clipr | 0.8.0 | clock | 0.6.1 | cluster | 2.1.4 |
| codetools | 0.2-18 | colorspace | 2.0-3 | commonmark | 1.8.1 |
| compiler | 4.2.2 | config | 0.3.1 | cpp11 | 0.4.3 |
| crayon | 1.5.2 | credentials | 1.3.2 | curl | 4.3.3 |
| data.table | 1.14.4 | datasets | 4.2.2 | DBI | 1.1.3 |
| dbplyr | 2.2.1 | desc | 1.4.2 | devtools | 2.4.5 |
| diffobj | 0.3.5 | digest | 0.6.30 | downlit | 0.4.2 |
| dplyr | 1.0.10 | dtplyr | 1.2.2 | e1071 | 1.7-12 |
| ellipsis | 0.3.2 | evaluate | 0.18 | fansi | 1.0.3 |
| farver | 2.1.1 | fastmap | 1.1.0 | fontawesome | 0.4.0 |
| forcats | 0.5.2 | foreach | 1.5.2 | foreign | 0.8-82 |
| forge | 0.2.0 | fs | 1.5.2 | future | 1.29.0 |
| future.apply | 1.10.0 | gargle | 1.2.1 | generics | 0.1.3 |
| gert | 1.9.1 | ggplot2 | 3.4.0 | gh | 1.3.1 |
| gitcreds | 0.1.2 | glmnet | 4.1-4 | globals | 0.16.1 |
| glue | 1.6.2 | googledrive | 2.0.0 | googlesheets4 | 1.0.1 |
| gower | 1.0.0 | graphics | 4.2.2 | grDevices | 4.2.2 |
| grid | 4.2.2 | gridExtra | 2.3 | gsubfn | 0.7 |
| gtable | 0.3.1 | hardhat | 1.2.0 | haven | 2.5.1 |
| highr | 0.9 | hms | 1.1.2 | htmltools | 0.5.3 |
| htmlwidgets | 1.5.4 | httpuv | 1.6.6 | httr | 1.4.4 |
| ids | 1.0.1 | ini | 0.3.1 | ipred | 0.9-13 |
| isoband | 0.2.6 | iterators | 1.0.14 | jquerylib | 0.1.4 |
| jsonlite | 1.8.3 | KernSmooth | 2.23-20 | knitr | 1.40 |
| labeling | 0.4.2 | later | 1.3.0 | lattice | 0.20-45 |
| lava | 1.7.0 | lifecycle | 1.0.3 | listenv | 0.8.0 |
| lubridate | 1.9.0 | magrittr | 2.0.3 | markdown | 1.3 |
| MASS | 7.3-58 | Matrix | 1.5-1 | memoise | 2.0.1 |
| methods | 4.2.2 | mgcv | 1.8-41 | mime | 0.12 |
| miniUI | 0.1.1.1 | ModelMetrics | 1.2.2.2 | modelr | 0.1.9 |
| munsell | 0.5.0 | nlme | 3.1-160 | nnet | 7.3-18 |
| numDeriv | 2016.8-1.1 | openssl | 2.0.4 | parallel | 4.2.2 |
| parallelly | 1.32.1 | pillar | 1.8.1 | pkgbuild | 1.3.1 |
| pkgconfig | 2.0.3 | pkgdown | 2.0.6 | pkgload | 1.3.1 |
| plogr | 0.2.0 | plyr | 1.8.7 | praise | 1.0.0 |
| prettyunits | 1.1.1 | pROC | 1.18.0 | processx | 3.8.0 |
| prodlim | 2019.11.13 | profvis | 0.3.7 | progress | 1.2.2 |
| progressr | 0.11.0 | promises | 1.2.0.1 | proto | 1.0.0 |
| proxy | 0.4-27 | ps | 1.7.2 | purrr | 0.3.5 |
| r2d3 | 0.2.6 | R6 | 2.5.1 | ragg | 1.2.4 |
| randomForest | 4.7-1.1 | rappdirs | 0.3.3 | rcmdcheck | 1.4.0 |
| RColorBrewer | 1.1-3 | Rcpp | 1.0.9 | RcppEigen | 0.3.3.9.3 |
| readr | 2.1.3 | readxl | 1.4.1 | recipes | 1.0.3 |
| rematch | 1.0.1 | rematch2 | 2.1.2 | remotes | 2.4.2 |
| reprex | 2.0.2 | reshape2 | 1.4.4 | rlang | 1.0.6 |
| rmarkdown | 2.18 | RODBC | 1.3-19 | roxygen2 | 7.2.1 |
| rpart | 4.1.19 | rprojroot | 2.0.3 | Rserve | 1.8-11 |
| RSQLite | 2.2.18 | rstudioapi | 0.14 | rversions | 2.1.2 |
| rvest | 1.0.3 | sass | 0.4.2 | scales | 1.2.1 |
| selectr | 0.4-2 | sessioninfo | 1.2.2 | shape | 1.4.6 |
| shiny | 1.7.3 | sourcetools | 0.1.7 | sparklyr | 1.7.8 |
| SparkR | 3.3.1 | spatial | 7.3-11 | splines | 4.2.2 |
| sqldf | 0.4-11 | SQUAREM | 2021.1 | stats | 4.2.2 |
| stats4 | 4.2.2 | stringi | 1.7.8 | stringr | 1.4.1 |
| survival | 3.4-0 | sys | 3.4.1 | systemfonts | 1.0.4 |
| tcltk | 4.2.2 | testthat | 3.1.5 | textshaping | 0.3.6 |
| tibble | 3.1.8 | tidyr | 1.2.1 | tidyselect | 1.2.0 |
| tidyverse | 1.3.2 | timechange | 0.1.1 | timeDate | 4021.106 |
| tinytex | 0.42 | tools | 4.2.2 | tzdb | 0.3.0 |
| urlchecker | 1.0.1 | usethis | 2.1.6 | utf8 | 1.2.2 |
| utils | 4.2.2 | uuid | 1.1-0 | vctrs | 0.5.0 |
| viridisLite | 0.4.1 | vroom | 1.6.0 | waldo | 0.4.0 |
| whisker | 0.4 | withr | 2.5.0 | xfun | 0.34 |
| xml2 | 1.3.3 | xopen | 1.0.0 | xtable | 1.8-4 |
| yaml | 2.3.6 | zip | 2.2.2 |
Installed Java and Scala libraries (Scala 2.12 cluster version)
| Group ID | Artifact ID | Version |
|---|---|---|
| antlr | antlr | 2.7.7 |
| com.amazonaws | amazon-kinesis-client | 1.12.0 |
| com.amazonaws | aws-java-sdk-autoscaling | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudformation | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudfront | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudhsm | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudsearch | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudtrail | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudwatch | 1.12.189 |
| com.amazonaws | aws-java-sdk-cloudwatchmetrics | 1.12.189 |
| com.amazonaws | aws-java-sdk-codedeploy | 1.12.189 |
| com.amazonaws | aws-java-sdk-cognitoidentity | 1.12.189 |
| com.amazonaws | aws-java-sdk-cognitosync | 1.12.189 |
| com.amazonaws | aws-java-sdk-config | 1.12.189 |
| com.amazonaws | aws-java-sdk-core | 1.12.189 |
| com.amazonaws | aws-java-sdk-datapipeline | 1.12.189 |
| com.amazonaws | aws-java-sdk-directconnect | 1.12.189 |
| com.amazonaws | aws-java-sdk-directory | 1.12.189 |
| com.amazonaws | aws-java-sdk-dynamodb | 1.12.189 |
| com.amazonaws | aws-java-sdk-ec2 | 1.12.189 |
| com.amazonaws | aws-java-sdk-ecs | 1.12.189 |
| com.amazonaws | aws-java-sdk-efs | 1.12.189 |
| com.amazonaws | aws-java-sdk-elasticache | 1.12.189 |
| com.amazonaws | aws-java-sdk-elasticbeanstalk | 1.12.189 |
| com.amazonaws | aws-java-sdk-elasticloadbalancing | 1.12.189 |
| com.amazonaws | aws-java-sdk-elastictranscoder | 1.12.189 |
| com.amazonaws | aws-java-sdk-emr | 1.12.189 |
| com.amazonaws | aws-java-sdk-glacier | 1.12.189 |
| com.amazonaws | aws-java-sdk-glue | 1.12.189 |
| com.amazonaws | aws-java-sdk-iam | 1.12.189 |
| com.amazonaws | aws-java-sdk-importexport | 1.12.189 |
| com.amazonaws | aws-java-sdk-kinesis | 1.12.189 |
| com.amazonaws | aws-java-sdk-kms | 1.12.189 |
| com.amazonaws | aws-java-sdk-lambda | 1.12.189 |
| com.amazonaws | aws-java-sdk-logs | 1.12.189 |
| com.amazonaws | aws-java-sdk-machinelearning | 1.12.189 |
| com.amazonaws | aws-java-sdk-opsworks | 1.12.189 |
| com.amazonaws | aws-java-sdk-rds | 1.12.189 |
| com.amazonaws | aws-java-sdk-redshift | 1.12.189 |
| com.amazonaws | aws-java-sdk-route53 | 1.12.189 |
| com.amazonaws | aws-java-sdk-s3 | 1.12.189 |
| com.amazonaws | aws-java-sdk-ses | 1.12.189 |
| com.amazonaws | aws-java-sdk-simpledb | 1.12.189 |
| com.amazonaws | aws-java-sdk-simpleworkflow | 1.12.189 |
| com.amazonaws | aws-java-sdk-sns | 1.12.189 |
| com.amazonaws | aws-java-sdk-sqs | 1.12.189 |
| com.amazonaws | aws-java-sdk-ssm | 1.12.189 |
| com.amazonaws | aws-java-sdk-storagegateway | 1.12.189 |
| com.amazonaws | aws-java-sdk-sts | 1.12.189 |
| com.amazonaws | aws-java-sdk-support | 1.12.189 |
| com.amazonaws | aws-java-sdk-swf-libraries | 1.11.22 |
| com.amazonaws | aws-java-sdk-workspaces | 1.12.189 |
| com.amazonaws | jmespath-java | 1.12.189 |
| com.chuusai | shapeless_2.12 | 2.3.3 |
| com.clearspring.analytics | stream | 2.9.6 |
| com.databricks | Rserve | 1.8-3 |
| com.databricks | jets3t | 0.7.1-0 |
| com.databricks.scalapb | compilerplugin_2.12 | 0.4.15-10 |
| com.databricks.scalapb | scalapb-runtime_2.12 | 0.4.15-10 |
| com.esotericsoftware | kryo-shaded | 4.0.2 |
| com.esotericsoftware | minlog | 1.3.0 |
| com.fasterxml | classmate | 1.3.4 |
| com.fasterxml.jackson.core | jackson-annotations | 2.13.4 |
| com.fasterxml.jackson.core | jackson-core | 2.13.4 |
| com.fasterxml.jackson.core | jackson-databind | 2.13.4.2 |
| com.fasterxml.jackson.dataformat | jackson-dataformat-cbor | 2.13.4 |
| com.fasterxml.jackson.datatype | jackson-datatype-joda | 2.13.4 |
| com.fasterxml.jackson.datatype | jackson-datatype-jsr310 | 2.13.4 |
| com.fasterxml.jackson.module | jackson-module-paranamer | 2.13.4 |
| com.fasterxml.jackson.module | jackson-module-scala_2.12 | 2.13.4 |
| com.github.ben-manes.caffeine | caffeine | 2.3.4 |
| com.github.fommil | jniloader | 1.1 |
| com.github.fommil.netlib | core | 1.1.2 |
| com.github.fommil.netlib | native_ref-java | 1.1 |
| com.github.fommil.netlib | native_ref-java-natives | 1.1 |
| com.github.fommil.netlib | native_system-java | 1.1 |
| com.github.fommil.netlib | native_system-java-natives | 1.1 |
| com.github.fommil.netlib | netlib-native_ref-linux-x86_64-natives | 1.1 |
| com.github.fommil.netlib | netlib-native_system-linux-x86_64-natives | 1.1 |
| com.github.luben | zstd-jni | 1.5.2-1 |
| com.github.wendykierp | JTransforms | 3.1 |
| com.google.code.findbugs | jsr305 | 3.0.0 |
| com.google.code.gson | gson | 2.8.6 |
| com.google.crypto.tink | tink | 1.6.1 |
| com.google.flatbuffers | flatbuffers-java | 1.12.0 |
| com.google.guava | guava | 15.0 |
| com.google.protobuf | protobuf-java | 2.6.1 |
| com.h2database | h2 | 2.0.204 |
| com.helger | profiler | 1.1.1 |
| com.jcraft | jsch | 0.1.50 |
| com.jolbox | bonecp | 0.8.0.RELEASE |
| com.lihaoyi | sourcecode_2.12 | 0.1.9 |
| com.microsoft.azure | azure-data-lake-store-sdk | 2.3.9 |
| com.ning | compress-lzf | 1.1 |
| com.sun.mail | javax.mail | 1.5.2 |
| com.tdunning | json | 1.8 |
| com.thoughtworks.paranamer | paranamer | 2.8 |
| com.trueaccord.lenses | lenses_2.12 | 0.4.12 |
| com.twitter | chill-java | 0.10.0 |
| com.twitter | chill_2.12 | 0.10.0 |
| com.twitter | util-app_2.12 | 7.1.0 |
| com.twitter | util-core_2.12 | 7.1.0 |
| com.twitter | util-function_2.12 | 7.1.0 |
| com.twitter | util-jvm_2.12 | 7.1.0 |
| com.twitter | util-lint_2.12 | 7.1.0 |
| com.twitter | util-registry_2.12 | 7.1.0 |
| com.twitter | util-stats_2.12 | 7.1.0 |
| com.typesafe | config | 1.2.1 |
| com.typesafe.scala-logging | scala-logging_2.12 | 3.7.2 |
| com.uber | h3 | 3.7.0 |
| com.univocity | univocity-parsers | 2.9.1 |
| com.zaxxer | HikariCP | 4.0.3 |
| commons-cli | commons-cli | 1.5.0 |
| commons-codec | commons-codec | 1.15 |
| commons-collections | commons-collections | 3.2.2 |
| commons-dbcp | commons-dbcp | 1.4 |
| commons-fileupload | commons-fileupload | 1.3.3 |
| commons-httpclient | commons-httpclient | 3.1 |
| commons-io | commons-io | 2.11.0 |
| commons-lang | commons-lang | 2.6 |
| commons-logging | commons-logging | 1.1.3 |
| commons-pool | commons-pool | 1.5.4 |
| dev.ludovic.netlib | arpack | 2.2.1 |
| dev.ludovic.netlib | blas | 2.2.1 |
| dev.ludovic.netlib | lapack | 2.2.1 |
| info.ganglia.gmetric4j | gmetric4j | 1.0.10 |
| io.airlift | aircompressor | 0.21 |
| io.delta | delta-sharing-spark_2.12 | 0.6.2 |
| io.dropwizard.metrics | metrics-core | 4.1.1 |
| io.dropwizard.metrics | metrics-graphite | 4.1.1 |
| io.dropwizard.metrics | metrics-healthchecks | 4.1.1 |
| io.dropwizard.metrics | metrics-jetty9 | 4.1.1 |
| io.dropwizard.metrics | metrics-jmx | 4.1.1 |
| io.dropwizard.metrics | metrics-json | 4.1.1 |
| io.dropwizard.metrics | metrics-jvm | 4.1.1 |
| io.dropwizard.metrics | metrics-servlets | 4.1.1 |
| io.netty | netty-all | 4.1.74.Final |
| io.netty | netty-buffer | 4.1.74.Final |
| io.netty | netty-codec | 4.1.74.Final |
| io.netty | netty-common | 4.1.74.Final |
| io.netty | netty-handler | 4.1.74.Final |
| io.netty | netty-resolver | 4.1.74.Final |
| io.netty | netty-tcnative-classes | 2.0.48.Final |
| io.netty | netty-transport | 4.1.74.Final |
| io.netty | netty-transport-classes-epoll | 4.1.74.Final |
| io.netty | netty-transport-classes-kqueue | 4.1.74.Final |
| io.netty | netty-transport-native-epoll-linux-aarch_64 | 4.1.74.Final |
| io.netty | netty-transport-native-epoll-linux-x86_64 | 4.1.74.Final |
| io.netty | netty-transport-native-kqueue-osx-aarch_64 | 4.1.74.Final |
| io.netty | netty-transport-native-kqueue-osx-x86_64 | 4.1.74.Final |
| io.netty | netty-transport-native-unix-common | 4.1.74.Final |
| io.prometheus | simpleclient | 0.7.0 |
| io.prometheus | simpleclient_common | 0.7.0 |
| io.prometheus | simpleclient_dropwizard | 0.7.0 |
| io.prometheus | simpleclient_pushgateway | 0.7.0 |
| io.prometheus | simpleclient_servlet | 0.7.0 |
| io.prometheus.jmx | collector | 0.12.0 |
| jakarta.annotation | jakarta.annotation-api | 1.3.5 |
| jakarta.servlet | jakarta.servlet-api | 4.0.3 |
| jakarta.validation | jakarta.validation-api | 2.0.2 |
| jakarta.ws.rs | jakarta.ws.rs-api | 2.1.6 |
| javax.activation | activation | 1.1.1 |
| javax.el | javax.el-api | 2.2.4 |
| javax.jdo | jdo-api | 3.0.1 |
| javax.transaction | jta | 1.1 |
| javax.transaction | transaction-api | 1.1 |
| javax.xml.bind | jaxb-api | 2.2.11 |
| javolution | javolution | 5.5.1 |
| jline | jline | 2.14.6 |
| joda-time | joda-time | 2.10.13 |
| net.java.dev.jna | jna | 5.8.0 |
| net.razorvine | pickle | 1.2 |
| net.sf.jpam | jpam | 1.1 |
| net.sf.opencsv | opencsv | 2.3 |
| net.sf.supercsv | super-csv | 2.2.0 |
| net.snowflake | snowflake-ingest-sdk | 0.9.6 |
| net.snowflake | snowflake-jdbc | 3.13.22 |
| net.sourceforge.f2j | arpack_combined_all | 0.1 |
| org.acplt.remotetea | remotetea-oncrpc | 1.1.2 |
| org.antlr | ST4 | 4.0.4 |
| org.antlr | antlr-runtime | 3.5.2 |
| org.antlr | antlr4-runtime | 4.8 |
| org.antlr | stringtemplate | 3.2.1 |
| org.apache.ant | ant | 1.9.2 |
| org.apache.ant | ant-jsch | 1.9.2 |
| org.apache.ant | ant-launcher | 1.9.2 |
| org.apache.arrow | arrow-format | 7.0.0 |
| org.apache.arrow | arrow-memory-core | 7.0.0 |
| org.apache.arrow | arrow-memory-netty | 7.0.0 |
| org.apache.arrow | arrow-vector | 7.0.0 |
| org.apache.avro | avro | 1.11.0 |
| org.apache.avro | avro-ipc | 1.11.0 |
| org.apache.avro | avro-mapred | 1.11.0 |
| org.apache.commons | commons-collections4 | 4.4 |
| org.apache.commons | commons-compress | 1.21 |
| org.apache.commons | commons-crypto | 1.1.0 |
| org.apache.commons | commons-lang3 | 3.12.0 |
| org.apache.commons | commons-math3 | 3.6.1 |
| org.apache.commons | commons-text | 1.10.0 |
| org.apache.curator | curator-client | 2.13.0 |
| org.apache.curator | curator-framework | 2.13.0 |
| org.apache.curator | curator-recipes | 2.13.0 |
| org.apache.derby | derby | 10.14.2.0 |
| org.apache.hadoop | hadoop-client-api | 3.3.4-databricks |
| org.apache.hadoop | hadoop-client-runtime | 3.3.4 |
| org.apache.hive | hive-beeline | 2.3.9 |
| org.apache.hive | hive-cli | 2.3.9 |
| org.apache.hive | hive-jdbc | 2.3.9 |
| org.apache.hive | hive-llap-client | 2.3.9 |
| org.apache.hive | hive-llap-common | 2.3.9 |
| org.apache.hive | hive-serde | 2.3.9 |
| org.apache.hive | hive-shims | 2.3.9 |
| org.apache.hive | hive-storage-api | 2.8.1 |
| org.apache.hive.shims | hive-shims-0.23 | 2.3.9 |
| org.apache.hive.shims | hive-shims-common | 2.3.9 |
| org.apache.hive.shims | hive-shims-scheduler | 2.3.9 |
| org.apache.httpcomponents | httpclient | 4.5.13 |
| org.apache.httpcomponents | httpcore | 4.4.14 |
| org.apache.ivy | ivy | 2.5.0 |
| org.apache.logging.log4j | log4j-1.2-api | 2.18.0 |
| org.apache.logging.log4j | log4j-api | 2.18.0 |
| org.apache.logging.log4j | log4j-core | 2.18.0 |
| org.apache.logging.log4j | log4j-slf4j-impl | 2.18.0 |
| org.apache.mesos | mesos-shaded-protobuf | 1.4.0 |
| org.apache.orc | orc-core | 1.7.6 |
| org.apache.orc | orc-mapreduce | 1.7.6 |
| org.apache.orc | orc-shims | 1.7.6 |
| org.apache.parquet | parquet-column | 1.12.3-databricks-0002 |
| org.apache.parquet | parquet-common | 1.12.3-databricks-0002 |
| org.apache.parquet | parquet-encoding | 1.12.3-databricks-0002 |
| org.apache.parquet | parquet-format-structures | 1.12.3-databricks-0002 |
| org.apache.parquet | parquet-hadoop | 1.12.3-databricks-0002 |
| org.apache.parquet | parquet-jackson | 1.12.3-databricks-0002 |
| org.apache.thrift | libfb303 | 0.9.3 |
| org.apache.thrift | libthrift | 0.12.0 |
| org.apache.xbean | xbean-asm9-shaded | 4.20 |
| org.apache.yetus | audience-annotations | 0.13.0 |
| org.apache.zookeeper | zookeeper | 3.6.2 |
| org.apache.zookeeper | zookeeper-jute | 3.6.2 |
| org.checkerframework | checker-qual | 3.5.0 |
| org.codehaus.jackson | jackson-core-asl | 1.9.13 |
| org.codehaus.jackson | jackson-mapper-asl | 1.9.13 |
| org.codehaus.janino | commons-compiler | 3.0.16 |
| org.codehaus.janino | janino | 3.0.16 |
| org.datanucleus | datanucleus-api-jdo | 4.2.4 |
| org.datanucleus | datanucleus-core | 4.1.17 |
| org.datanucleus | datanucleus-rdbms | 4.1.19 |
| org.datanucleus | javax.jdo | 3.2.0-m3 |
| org.eclipse.jetty | jetty-client | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-continuation | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-http | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-io | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-jndi | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-plus | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-proxy | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-security | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-server | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-servlet | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-servlets | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-util | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-util-ajax | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-webapp | 9.4.46.v20220331 |
| org.eclipse.jetty | jetty-xml | 9.4.46.v20220331 |
| org.eclipse.jetty.websocket | websocket-api | 9.4.46.v20220331 |
| org.eclipse.jetty.websocket | websocket-client | 9.4.46.v20220331 |
| org.eclipse.jetty.websocket | websocket-common | 9.4.46.v20220331 |
| org.eclipse.jetty.websocket | websocket-server | 9.4.46.v20220331 |
| org.eclipse.jetty.websocket | websocket-servlet | 9.4.46.v20220331 |
| org.fusesource.leveldbjni | leveldbjni-all | 1.8 |
| org.glassfish.hk2 | hk2-api | 2.6.1 |
| org.glassfish.hk2 | hk2-locator | 2.6.1 |
| org.glassfish.hk2 | hk2-utils | 2.6.1 |
| org.glassfish.hk2 | osgi-resource-locator | 1.0.3 |
| org.glassfish.hk2.external | aopalliance-repackaged | 2.6.1 |
| org.glassfish.hk2.external | jakarta.inject | 2.6.1 |
| org.glassfish.jersey.containers | jersey-container-servlet | 2.36 |
| org.glassfish.jersey.containers | jersey-container-servlet-core | 2.36 |
| org.glassfish.jersey.core | jersey-client | 2.36 |
| org.glassfish.jersey.core | jersey-common | 2.36 |
| org.glassfish.jersey.core | jersey-server | 2.36 |
| org.glassfish.jersey.inject | jersey-hk2 | 2.36 |
| org.hibernate.validator | hibernate-validator | 6.1.0.Final |
| org.javassist | javassist | 3.25.0-GA |
| org.jboss.logging | jboss-logging | 3.3.2.Final |
| org.jdbi | jdbi | 2.63.1 |
| org.jetbrains | annotations | 17.0.0 |
| org.joda | joda-convert | 1.7 |
| org.jodd | jodd-core | 3.5.2 |
| org.json4s | json4s-ast_2.12 | 3.7.0-M11 |
| org.json4s | json4s-core_2.12 | 3.7.0-M11 |
| org.json4s | json4s-jackson_2.12 | 3.7.0-M11 |
| org.json4s | json4s-scalap_2.12 | 3.7.0-M11 |
| org.lz4 | lz4-java | 1.8.0 |
| org.mariadb.jdbc | mariadb-java-client | 2.7.4 |
| org.mlflow | mlflow-spark | 1.27.0 |
| org.objenesis | objenesis | 2.5.1 |
| org.postgresql | postgresql | 42.3.3 |
| org.roaringbitmap | RoaringBitmap | 0.9.25 |
| org.roaringbitmap | shims | 0.9.25 |
| org.rocksdb | rocksdbjni | 6.24.2 |
| org.rosuda.REngine | REngine | 2.1.0 |
| org.scala-lang | scala-compiler_2.12 | 2.12.14 |
| org.scala-lang | scala-library_2.12 | 2.12.14 |
| org.scala-lang | scala-reflect_2.12 | 2.12.14 |
| org.scala-lang.modules | scala-collection-compat_2.12 | 2.4.3 |
| org.scala-lang.modules | scala-parser-combinators_2.12 | 1.1.2 |
| org.scala-lang.modules | scala-xml_2.12 | 1.2.0 |
| org.scala-sbt | test-interface | 1.0 |
| org.scalacheck | scalacheck_2.12 | 1.14.2 |
| org.scalactic | scalactic_2.12 | 3.0.8 |
| org.scalanlp | breeze-macros_2.12 | 1.2 |
| org.scalanlp | breeze_2.12 | 1.2 |
| org.scalatest | scalatest_2.12 | 3.0.8 |
| org.slf4j | jcl-over-slf4j | 1.7.36 |
| org.slf4j | jul-to-slf4j | 1.7.36 |
| org.slf4j | slf4j-api | 1.7.36 |
| org.spark-project.spark | unused | 1.0.0 |
| org.threeten | threeten-extra | 1.5.0 |
| org.tukaani | xz | 1.9 |
| org.typelevel | algebra_2.12 | 2.0.1 |
| org.typelevel | cats-kernel_2.12 | 2.1.1 |
| org.typelevel | macro-compat_2.12 | 1.1.1 |
| org.typelevel | spire-macros_2.12 | 0.17.0 |
| org.typelevel | spire-platform_2.12 | 0.17.0 |
| org.typelevel | spire-util_2.12 | 0.17.0 |
| org.typelevel | spire_2.12 | 0.17.0 |
| org.wildfly.openssl | wildfly-openssl | 1.0.7.Final |
| org.xerial | sqlite-jdbc | 3.8.11.2 |
| org.xerial.snappy | snappy-java | 1.1.8.4 |
| org.yaml | snakeyaml | 1.24 |
| oro | oro | 2.0.8 |
| pl.edu.icm | JLargeArrays | 1.5 |
| software.amazon.ion | ion-java | 1.0.2 |
| stax | stax-api | 1.0.1 |