ClickHouse v21.10 Release Notes

Release Date: 2021-10-14
    Backward Incompatible Change

    • The following MergeTree table-level settings now do nothing: replicated_max_parallel_sends, replicated_max_parallel_sends_for_table, replicated_max_parallel_fetches, replicated_max_parallel_fetches_for_table. They never worked well and were replaced with max_replicated_fetches_network_bandwidth, max_replicated_sends_network_bandwidth and background_fetches_pool_size. #28404 (alesapin).
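
As a rough illustration of the migration, the throttling that the removed settings attempted can be expressed with the bandwidth settings named above. This is only a sketch: the table db.events is hypothetical, the limits are arbitrary, and it assumes the two bandwidth settings are applied as MergeTree table-level settings measured in bytes per second. background_fetches_pool_size is configured at server level rather than per table and is not shown.

```sql
-- Assumption: db.events is a hypothetical ReplicatedMergeTree table.
-- Limit replicated fetches for this table to about 100 MiB/s (value in bytes per second).
ALTER TABLE db.events MODIFY SETTING max_replicated_fetches_network_bandwidth = 104857600;

-- Limit replicated sends for this table to about 50 MiB/s.
ALTER TABLE db.events MODIFY SETTING max_replicated_sends_network_bandwidth = 52428800;
```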

    New Feature

    • Add support for creating user-defined functions (UDF) as lambda expressions. Syntax: CREATE FUNCTION {function_name} as ({parameters}) -> {function core}. Example: CREATE FUNCTION plus_one as (a) -> a + 1. Author: @Realist007. #27796 (Maksim Kita) #23978 (Realist007).
    • Added Executable storage engine and executable table function. It enables data processing with external scripts in streaming fashion. #28102 (Maksim Kita) (ruct).
    • Added ExecutablePool storage engine. Similar to Executable but it's using a pool of long running processes. #28518 (Maksim Kita).
    • Add ALTER TABLE ... MATERIALIZE COLUMN query. #27038 (Vladimir Chebotarev).
    • Support for partitioned write into s3 table function. #23051 (Vladimir Chebotarev).
    • Support lz4 compression format (in addition to gz, bz2, xz, zstd) for data import / export. #25310 (Bharat Nallan).
    • Allow positional arguments under setting enable_positional_arguments. Closes #2592. #27530 (Kseniia Sumarokova).
    • Accept user settings related to file formats in SETTINGS clause in CREATE query for s3 tables. This closes #27580. #28037 (Nikita Mikhaylov).
    • Allow SSL connection for RabbitMQ engine. #28365 (Kseniia Sumarokova).
    • Add getServerPort function to allow getting server port. When the port is not used by the server, throw an exception. #27900 (Amos Bird).
    • Add conversion functions between "snowflake id" and DateTime, DateTime64. See #27058. #27704 (jasine).
    • Add function SHA512. #27830 (zhanglistar).
    • Add log_queries_probability setting that allows user to write to query_log only a sample of queries. Closes #16609. #27527 (Nikolay Degterinsky).
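
Several of the features listed above can be tried from clickhouse-client. The following is a minimal sketch, not taken from the release itself: the table demo_events and its data are hypothetical, and the snowflake function names (dateTimeToSnowflake, snowflakeToDateTime) are the ones introduced for this release as far as the 21.10 documentation describes them.

```sql
-- User-defined function as a lambda expression (example from the notes).
CREATE FUNCTION plus_one AS (a) -> a + 1;
SELECT plus_one(41);  -- returns 42

-- Hypothetical table used by the remaining statements.
CREATE TABLE demo_events
(
    id UInt64,
    customer String,
    amount Float64
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO demo_events VALUES (1, 'alice', 9.5), (2, 'bob', 20), (3, 'alice', 0.5);

-- Add a computed column, then write it into existing parts with MATERIALIZE COLUMN.
ALTER TABLE demo_events ADD COLUMN amount_cents UInt64 MATERIALIZED toUInt64(amount * 100);
ALTER TABLE demo_events MATERIALIZE COLUMN amount_cents;

-- Positional arguments: GROUP BY 1 refers to customer, ORDER BY 2 to total.
SET enable_positional_arguments = 1;
SELECT customer, sum(amount) AS total
FROM demo_events
GROUP BY 1
ORDER BY 2 DESC;

-- SHA512 returns raw bytes, so hex() is used to make the digest printable.
SELECT hex(SHA512('ClickHouse'));

-- Round trip between a DateTime and a snowflake id.
SELECT
    dateTimeToSnowflake(toDateTime('2021-10-14 00:00:00', 'UTC')) AS snowflake_id,
    snowflakeToDateTime(snowflake_id, 'UTC') AS ts;

-- Write roughly 10% of queries to system.query_log.
SET log_queries_probability = 0.1;

-- Port of the native TCP interface; throws if the named port is not in use.
SELECT getServerPort('tcp_port');
```

The Executable engines, the s3 partitioned write and the RabbitMQ SSL option need external scripts or services, so they are left out of this sketch.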

    Experimental Feature

    • Add web type of disks to store readonly tables on web server in form of static files. See #23982. #25251 (Kseniia Sumarokova). This is mostly needed to facilitate testing of operation on shared storage and for easy importing of datasets. Not recommended to use before release 21.11.
    • Added new commands BACKUP and RESTORE. #21945 (Vitaly Baranov). This is under development and not intended to be used in current version.

    Performance Improvement

    • Speed up sumIf and countIf aggregation functions. #28272 (Raúl Marín).
    • Create virtual projection for minmax indices. Now, when allow_experimental_projection_optimization is enabled, queries will use minmax index instead of reading the data when possible. #26286 (Amos Bird).
    • Introducing two checks in sequenceMatch and sequenceCount that allow for early exit when some deterministic part of the sequence pattern is missing from the events list. This change unlocks many queries that would previously fail due to reaching operations cap, and generally speeds up the pipeline. #27729 (Jakub Kuklis).
    • Enhance primary key analysis with always monotonic information of binary functions, notably non-zero constant division. #28302 (Amos Bird).
    • Make hasAll filter condition leverage bloom filter data-skipping indexes. #27984 (Braulio Valdivielso Martínez).
    • Speed up data parts loading by delaying table startup process. #28313 (Amos Bird).
    • Fixed possible excessive number of conditions moved from WHERE to PREWHERE (optimization controlled by the setting optimize_move_to_prewhere). #28139 (lthaooo).
    • Enable optimize_distributed_group_by_sharding_key by default. #28105 (Azat Khuzhin).
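
To make two of the items above concrete, here is a small sketch of the query shapes they affect. The table demo_docs, its index name and its data are hypothetical; whether the index is actually used for a given query depends on the data and can be checked in the server logs.

```sql
-- Hypothetical table with a bloom filter data-skipping index on an array column.
CREATE TABLE demo_docs
(
    id UInt64,
    tags Array(String),
    views UInt32,
    INDEX tags_bf tags TYPE bloom_filter GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO demo_docs VALUES
    (1, ['clickhouse', 'olap'], 10),
    (2, ['olap'], 5),
    (3, ['clickhouse', 'olap', 'sql'], 7);

-- A hasAll condition can now use the bloom filter index to skip granules.
SELECT count() FROM demo_docs WHERE hasAll(tags, ['clickhouse', 'olap']);  -- returns 2

-- Conditional aggregates of the kind this release speeds up.
SELECT
    countIf(has(tags, 'sql')) AS sql_docs,
    sumIf(views, has(tags, 'clickhouse')) AS clickhouse_views
FROM demo_docs;
```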

    Improvement

    • Check the cluster name before creating a Distributed table; do not allow creating a table with an incorrect cluster name. Fixes #27832. #27927 (tavplubix).
    • Add aggregate function quantileBFloat16Weighted similarly to other quantile...Weighted functions. This closes #27745. #27758 (Ivan Novitskiy).
    • Allow to create dictionaries with empty attributes list. #27905 (Maksim Kita).
    • Add interactive documentation in clickhouse-client about how to reset the password. This is useful in the scenario when a user has installed ClickHouse, set up the password and instantly forgotten it. See #27750. #27903 (alexey-milovidov).
    • Support the case when the data is enclosed in an array in JSONAsString input format. Closes #25517. #25633 (Kruglov Pavel).
    • Add new column last_queue_update_exception to system.replicas table. #26843 (nvartolomei).
    • Support reconnections on failover for MaterializedPostgreSQL tables. Closes #28529. #28614 (Kseniia Sumarokova).
    • Generate a unique server UUID on first server start. #20089 (Bharat Nallan).
    • Introduce connection_wait_timeout (default to 5 seconds, 0 - do not wait) setting for MySQL engine. #28474 (Azat Khuzhin).
    • Do not allow creating MaterializedPostgreSQL with bad arguments. Closes #28423. #28430 (Kseniia Sumarokova).
    • Use real tmp file instead of predefined "rows_sources" for vertical merges. This avoids generating garbage directories in tmp disks. #28299 (Amos Bird).
    • Added libhdfs3_conf in server config instead of export env LIBHDFS3_CONF in clickhouse-server.service. This is for configuration of interaction with HDFS. #28268 (Zhichang Yu).
    • Fix removing of parts in a Temporary state which can lead to an unexpected exception (Part %name% doesn't exist). Fixes #23661. #28221 (Azat Khuzhin).
    • Fix zookeeper_log.address (before the first patch in this PR the address was always ::) and reduce the number of getpeername(2) calls for this column (getpeername() was called each time an entry was added to zookeeper_log; the address is now cached in the ZooKeeper client to avoid this). #28212 (Azat Khuzhin).
    • Support implicit conversions between index in operator [] and key of type Map (e.g. different Int types, String and FixedString). #28096 (Anton Popov).
    • Support ON CONFLICT clause when inserting into PostgreSQL table engine or table function. Closes #27727. #28081 (Kseniia Sumarokova).
    • Lower restrictions for Enum data type to allow attaching compatible data. Closes #26672. #28028 (Dmitry Novik).
    • Add a setting empty_result_for_aggregation_by_constant_keys_on_empty_set to control the behavior of grouping by constant keys on empty set. This is to bring back the old behaviour of #6842. #27932 (Amos Bird).
    • Added replication_wait_for_inactive_replica_timeout setting. It allows to specify how long to wait for inactive replicas to execute ALTER/OPTIMIZE/TRUNCATE query (default is 120 seconds). If replication_alter_partitions_sync is 2 and some replicas are not active for more than replication_wait_for_inactive_replica_timeout seconds, then UNFINISHED will be thrown. #27931 (tavplubix).
    • Support lambda argument for APPLY column transformer which allows applying functions with more than one argument. This is for #27877. #27901 (Amos Bird).
    • Enable tcp_keep_alive_timeout by default. #27882 (Azat Khuzhin).
    • Improve remote query cancellation (in case the remote server terminated abnormally). #27881 (Azat Khuzhin).
    • Use Multipart copy upload for large S3 objects. #27858 (ianton-ru).
    • Allow symlink traversal for library dictionary path. #27815 (Kseniia Sumarokova).
    • Now ALTER MODIFY COLUMN from T to Nullable(T) doesn't require a mutation. #27787 (victorgao).
    • Don't silently ignore errors and don't count delays in ReadBufferFromS3. #27484 (Vladimir Chebotarev).
    • Improve ALTER ... MATERIALIZE TTL by recalculating metadata only without actual TTL action. #27019 (lthaooo).
    • Allow reading the list of custom top level domains without a new line at EOF. #28213 (Azat Khuzhin).
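
A few of the improvements above change what can be written in a query. The statements below are a sketch under assumptions: the table demo_readings is hypothetical, the 30 second timeout is arbitrary, and the two replication settings only have an effect on replicated tables.

```sql
-- APPLY with a lambda: round every selected column to one decimal place.
SELECT * APPLY(x -> round(x, 1))
FROM (SELECT 1.234 AS a, 5.678 AS b);

-- Weighted quantile using the bfloat16-based estimator (value, weight).
SELECT quantileBFloat16Weighted(0.5)(number, number + 1) FROM numbers(10);

-- Hypothetical table for the ALTER example.
CREATE TABLE demo_readings
(
    id UInt64,
    temp Float64
)
ENGINE = MergeTree
ORDER BY id;

-- Changing T to Nullable(T) no longer requires a mutation.
ALTER TABLE demo_readings MODIFY COLUMN temp Nullable(Float64);

-- Wait for all replicas on ALTER/OPTIMIZE/TRUNCATE, but give inactive ones at most 30 seconds.
SET replication_alter_partitions_sync = 2;
SET replication_wait_for_inactive_replica_timeout = 30;
```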

    Bug Fix

    • Fix cases when reading compressed data from carbon-clickhouse fails with 'attempt to read after end of file'. Closes #26149. #28150 (FArthur-cmd).
    • Fix checking access grants when executing GRANT WITH REPLACE statement with ON CLUSTER clause. This PR improves fix #27001. #27983 (Vitaly Baranov).
    • Allow selecting with extremes = 1 from a column of the type LowCardinality(UUID). #27918 (Vitaly Baranov).
    • Fix PostgreSQL-style cast (:: operator) with negative numbers. #27876 (Anton Popov).
    • After #26864. Fix shutdown of NamedSessionStorage: session contexts stored in NamedSessionStorage are now destroyed before destroying the global context. #27875 (Vitaly Baranov).
    • Bugfix for windowFunnel "strict" mode. This fixes #27469. #27563 (achimbab).
    • Fix infinite loop while reading truncated bzip2 archive. #28543 (Azat Khuzhin).
    • Fix UUID overlap in DROP TABLE for internal DDL from MaterializedMySQL. MaterializedMySQL is an experimental feature. #28533 (Azat Khuzhin).
    • Fix There is no subcolumn error when selecting from tables that have Nested columns and scalar columns with a dot in the name and the same prefix as Nested (e.g. n.id UInt32, n.arr1 Array(UInt64), n.arr2 Array(UInt64)). #28531 (Anton Popov).
    • Fix bug which can lead to error Existing table metadata in ZooKeeper differs in sorting key expression. after ALTER of ReplicatedVersionedCollapsingMergeTree. Fixes #28515. #28528 (alesapin).
    • Fixed possible ZooKeeper watches leak (minor issue) on background processing of distributed DDL queue. Closes #26036. #28446 (tavplubix).
    • Fix missing quoting of table names in MaterializedPostgreSQL engine. Closes #28316. #28433 (Kseniia Sumarokova).
    • Fix the wrong behaviour of non joined rows from nullable column. Close #27691. #28349 (vdimir).
    • Fix NOT-IN index optimization when not all key columns are used. This fixes #28120. #28315 (Amos Bird).
    • Fix intersecting parts caused by a new part having been replaced with an empty part. #28310 (Azat Khuzhin).
    • Fix inconsistent result in queries with ORDER BY and Merge tables with enabled setting optimize_read_in_order. #28266 (Anton Popov).
    • Fix possible read of uninitialized memory for queries with Nullable(LowCardinality) type and the setting extremes set to 1. Fixes #28165. #28205 (Nikolai Kochetov).
    • Multiple small fixes for projections. See detailed description in the PR. #28178 (Amos Bird).
    • Fix extremely rare segfaults on shutdown due to incorrect order of context/config reloader shutdown. #28088 (nvartolomei).
    • Fix handling null value with type of Nullable(String) in function JSONExtract. This fixes #27929 and #27930. This was introduced in https://github.com/ClickHouse/ClickHouse/pull/25452 . #27939 (Amos Bird).
    • Multiple fixes for the new clickhouse-keeper tool. Fix a rare bug in clickhouse-keeper when the client can receive a watch response before request-response. #28197 (alesapin). Fix incorrect behavior in clickhouse-keeper when list watches (getChildren) triggered with set requests for children. #28190 (alesapin). Fix rare case when changes of clickhouse-keeper settings may lead to lost logs and a server hang. #28360 (alesapin). Fix bug in clickhouse-keeper which can lead to endless logs when rotate_logs_interval decreased. #28152 (alesapin).

    Build/Testing/Packaging Improvement

    • Enable Thread Fuzzer in Stress Test. Thread Fuzzer is a ClickHouse feature that makes it possible to test more permutations of thread scheduling and discover more potential issues. This closes #9813. This closes #9814. This closes #9515. This closes #9516. #27538 (alexey-milovidov).
    • Add new log level test for testing environments. It is even more verbose than the default trace. #28559 (alesapin).
    • Print out git status information at CMake configure stage. #28047 (Braulio Valdivielso Martínez).
    • Temporarily switched ubuntu apt repository to mirror ru.archive.ubuntu.com as the default one (archive.ubuntu.com) is not responding from our CI. #28016 (Ilya Yatsishin).