ClickHouse v21.10 Release Notes

Release Date: 2021-10-14
  • Backward Incompatible Change

    • The following MergeTree table-level settings now do nothing: replicated_max_parallel_sends, replicated_max_parallel_sends_for_table, replicated_max_parallel_fetches, replicated_max_parallel_fetches_for_table. They never worked well and were replaced with max_replicated_fetches_network_bandwidth, max_replicated_sends_network_bandwidth and background_fetches_pool_size. #28404 (alesapin).

    New Feature

    • Add feature for creating user-defined functions (UDF) as lambda expressions. Syntax: CREATE FUNCTION {function_name} as ({parameters}) -> {function core}. Example: CREATE FUNCTION plus_one as (a) -> a + 1. Authors @Realist007. #27796 (Maksim Kita) #23978 (Realist007).
    • Added Executable storage engine and executable table function. They enable data processing with external scripts in a streaming fashion. #28102 (Maksim Kita) (ruct).
    • Added ExecutablePool storage engine. It is similar to Executable but uses a pool of long-running processes. #28518 (Maksim Kita).
    • Add ALTER TABLE ... MATERIALIZE COLUMN query. #27038 (Vladimir Chebotarev).
    • Support for partitioned write into s3 table function. #23051 (Vladimir Chebotarev).
    • Support lz4 compression format (in addition to gz, bz2, xz, zstd) for data import / export. #25310 (Bharat Nallan).
    • Allow positional arguments under setting enable_positional_arguments. Closes #2592. #27530 (Kseniia Sumarokova).
    • Accept user settings related to file formats in SETTINGS clause in CREATE query for s3 tables. This closes #27580. #28037 (Nikita Mikhaylov).
    • Allow SSL connection for RabbitMQ engine. #28365 (Kseniia Sumarokova).
    • Add getServerPort function to allow getting server port. When the port is not used by the server, throw an exception. #27900 (Amos Bird).
    • Add conversion functions between "snowflake id" and DateTime, DateTime64. See #27058. #27704 (jasine).
    • Add function SHA512. #27830 (zhanglistar).
    • Add log_queries_probability setting that allows user to write to query_log only a sample of queries. Closes #16609. #27527 (Nikolay Degterinsky).
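The arithmetic behind the "snowflake id" conversions listed above can be sketched in Python. This is only a model, assuming the widely used Twitter Snowflake layout (milliseconds since the epoch 1288834974657 stored above the low 22 bits); the helper names are hypothetical and are not the ClickHouse function names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Twitter Snowflake layout. The bits above the low 22 hold
# milliseconds since the Twitter epoch; the low 22 bits hold machine and
# sequence numbers.
TWITTER_EPOCH_MS = 1288834974657
TIMESTAMP_SHIFT = 22
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_snowflake(moment: datetime) -> int:
    """Return the smallest snowflake id that could be generated at `moment`.

    `moment` must be timezone-aware.
    """
    unix_ms = (moment - UNIX_EPOCH) // timedelta(milliseconds=1)
    return (unix_ms - TWITTER_EPOCH_MS) << TIMESTAMP_SHIFT


def snowflake_to_datetime(snowflake_id: int) -> datetime:
    """Extract the millisecond timestamp embedded in a snowflake id."""
    unix_ms = (snowflake_id >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


moment = datetime(2021, 8, 15, 10, 57, 56, tzinfo=timezone.utc)
first_id = datetime_to_snowflake(moment)
# Any id sharing the same timestamp bits maps back to the same moment.
assert snowflake_to_datetime(first_id | 0x3FFFFF) == moment
```

The low 22 bits are discarded on the way back, so the conversion to a timestamp is lossy by design.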

    Experimental Feature

    • Added web type of disks to store read-only tables on a web server in the form of static files. See #23982. #25251 (Kseniia Sumarokova). This is mostly needed to facilitate testing of operation on shared storage and for easy importing of datasets. Not recommended to use before release 21.11.
    • Added new commands BACKUP and RESTORE. #21945 (Vitaly Baranov). This is under development and not intended to be used in the current version.

    🐎 Performance Improvement

    • Speed up sumIf and countIf aggregation functions. #28272 (Raúl Marín).
    • Create virtual projection for minmax indices. Now, when allow_experimental_projection_optimization is enabled, queries will use minmax index instead of reading the data when possible. #26286 (Amos Bird).
    • Introduce two checks in sequenceMatch and sequenceCount that allow for early exit when some deterministic part of the sequence pattern is missing from the events list. This change unlocks many queries that would previously fail due to reaching the operations cap, and generally speeds up the pipeline. #27729 (Jakub Kuklis).
    • Enhance primary key analysis with always monotonic information of binary functions, notably non-zero constant division. #28302 (Amos Bird).
    • Make hasAll filter condition leverage bloom filter data-skipping indexes. #27984 (Braulio Valdivielso Martínez).
    • Speed up data parts loading by delaying table startup process. #28313 (Amos Bird).
    • Fixed possible excessive number of conditions moved from WHERE to PREWHERE (optimization controlled by the setting optimize_move_to_prewhere). #28139 (lthaooo).
    • Enable optimize_distributed_group_by_sharding_key by default. #28105 (Azat Khuzhin).
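The idea of an early exit in sequenceMatch and sequenceCount can be pictured with a small Python model. It is a sketch under stated assumptions, not the actual implementation: each event is represented as the set of condition numbers that hold for it, every (?N) in the pattern is treated as a condition that must occur at least once, and both helper names are hypothetical.

```python
import re


def required_conditions(pattern: str) -> set:
    """Collect the condition numbers N referenced as (?N) in the pattern."""
    return {int(number) for number in re.findall(r"\(\?(\d+)\)", pattern)}


def can_skip_matching(pattern: str, events: list) -> bool:
    """Return True when the pattern cannot match, so matching can be skipped.

    `events` is a list of sets; each set holds the condition numbers that
    are true for one event.
    """
    seen = set().union(*events)
    # If any required condition never occurs, no full match is possible.
    return not required_conditions(pattern) <= seen


# Condition 3 never occurs, so the expensive matching step can be skipped.
print(can_skip_matching("(?1)(?2).*(?3)", [{1}, {2}, {1, 2}]))
```

A cheap presence check like this runs in time linear in the number of events, whereas full pattern matching may hit the operations cap on long event lists.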

    Improvement

    • Check cluster name before creating Distributed table, do not allow to create a table with incorrect cluster name. Fixes #27832. #27927 (tavplubix).
    • Add aggregate function quantileBFloat16Weighted similarly to other quantile...Weighted functions. This closes #27745. #27758 (Ivan Novitskiy).
    • Allow to create dictionaries with empty attributes list. #27905 (Maksim Kita).
    • Add interactive documentation in clickhouse-client about how to reset the password. This is useful in the scenario when a user has installed ClickHouse, set up the password and instantly forgot it. See #27750. #27903 (alexey-milovidov).
    • Support the case when the data is enclosed in array in JSONAsString input format. Closes #25517. #25633 (Kruglov Pavel).
    • Add new column last_queue_update_exception to system.replicas table. #26843 (nvartolomei).
    • Support reconnections on failover for MaterializedPostgreSQL tables. Closes #28529. #28614 (Kseniia Sumarokova).
    • Generate a unique server UUID on first server start. #20089 (Bharat Nallan).
    • Introduce connection_wait_timeout (default to 5 seconds, 0 - do not wait) setting for MySQL engine. #28474 (Azat Khuzhin).
    • Do not allow creating MaterializedPostgreSQL with bad arguments. Closes #28423. #28430 (Kseniia Sumarokova).
    • Use real tmp file instead of predefined "rows_sources" for vertical merges. This avoids generating garbage directories in tmp disks. #28299 (Amos Bird).
    • Added libhdfs3_conf in server config instead of export env LIBHDFS3_CONF in clickhouse-server.service. This is for configuration of interaction with HDFS. #28268 (Zhichang Yu).
    • Fix removing of parts in a Temporary state which can lead to an unexpected exception (Part %name% doesn't exist). Fixes #23661. #28221 (Azat Khuzhin).
    • Fix zookeeper_log.address (before the first patch in this PR the address was always ::) and reduce the number of getpeername(2) calls for this column (getpeername() was called each time an entry was added to zookeeper_log; the address is now cached in the ZooKeeper client to avoid this). #28212 (Azat Khuzhin).
    • Support implicit conversions between index in operator [] and key of type Map (e.g. different Int types, String and FixedString). #28096 (Anton Popov).
    • Support ON CONFLICT clause when inserting into PostgreSQL table engine or table function. Closes #27727. #28081 (Kseniia Sumarokova).
    • Lower restrictions for Enum data type to allow attaching compatible data. Closes #26672. #28028 (Dmitry Novik).
    • Add a setting empty_result_for_aggregation_by_constant_keys_on_empty_set to control the behavior of grouping by constant keys on empty set. This is to bring back the old behaviour of #6842. #27932 (Amos Bird).
    • Added replication_wait_for_inactive_replica_timeout setting. It specifies how long to wait for inactive replicas to execute an ALTER/OPTIMIZE/TRUNCATE query (default is 120 seconds). If replication_alter_partitions_sync is 2 and some replicas are not active for more than replication_wait_for_inactive_replica_timeout seconds, then UNFINISHED will be thrown. #27931 (tavplubix).
    • Support lambda argument for APPLY column transformer which allows applying functions with more than one argument. This is for #27877. #27901 (Amos Bird).
    • Enable tcp_keep_alive_timeout by default. #27882 (Azat Khuzhin).
    • Improve remote query cancellation (in case the remote server terminated abnormally). #27881 (Azat Khuzhin).
    • Use Multipart copy upload for large S3 objects. #27858 (ianton-ru).
    • Allow symlink traversal for library dictionary path. #27815 (Kseniia Sumarokova).
    • Now ALTER MODIFY COLUMN T to Nullable(T) doesn't require mutation. #27787 (victorgao).
    • Don't silently ignore errors and don't count delays in ReadBufferFromS3. #27484 (Vladimir Chebotarev).
    • Improve ALTER ... MATERIALIZE TTL by recalculating metadata only without actual TTL action. #27019 (lthaooo).
    • Allow reading the list of custom top level domains without a new line at EOF. #28213 (Azat Khuzhin).

    Bug Fix

    • Fix cases when reading compressed data from carbon-clickhouse fails with 'attempt to read after end of file'. Closes #26149. #28150 (FArthur-cmd).
    • Fix checking access grants when executing GRANT WITH REPLACE statement with ON CLUSTER clause. This PR improves fix #27001. #27983 (Vitaly Baranov).
    • Allow selecting with extremes = 1 from a column of the type LowCardinality(UUID). #27918 (Vitaly Baranov).
    • Fix PostgreSQL-style cast (:: operator) with negative numbers. #27876 (Anton Popov).
    • After #26864. Fix shutdown of NamedSessionStorage: session contexts stored in NamedSessionStorage are now destroyed before destroying the global context. #27875 (Vitaly Baranov).
    • Bugfix for windowFunnel "strict" mode. This fixes #27469. #27563 (achimbab).
    • Fix infinite loop while reading truncated bzip2 archive. #28543 (Azat Khuzhin).
    • Fix UUID overlap in DROP TABLE for internal DDL from MaterializedMySQL. MaterializedMySQL is an experimental feature. #28533 (Azat Khuzhin).
    • Fix the There is no subcolumn error when selecting from tables that have Nested columns and scalar columns with a dot in the name and the same prefix as Nested (e.g. n.id UInt32, n.arr1 Array(UInt64), n.arr2 Array(UInt64)). #28531 (Anton Popov).
    • Fix bug which can lead to error Existing table metadata in ZooKeeper differs in sorting key expression. after ALTER of ReplicatedVersionedCollapsingMergeTree. Fixes #28515. #28528 (alesapin).
    • Fixed possible ZooKeeper watches leak (minor issue) on background processing of distributed DDL queue. Closes #26036. #28446 (tavplubix).
    • Fix missing quoting of table names in MaterializedPostgreSQL engine. Closes #28316. #28433 (Kseniia Sumarokova).
    • Fix the wrong behaviour of non joined rows from nullable column. Close #27691. #28349 (vdimir).
    • Fix NOT-IN index optimization when not all key columns are used. This fixes #28120. #28315 (Amos Bird).
    • Fix intersecting parts caused by a new part having been replaced with an empty part. #28310 (Azat Khuzhin).
    • Fix inconsistent result in queries with ORDER BY and Merge tables with enabled setting optimize_read_in_order. #28266 (Anton Popov).
    • Fix possible read of uninitialized memory for queries with Nullable(LowCardinality) type and the setting extremes set to 1. Fixes #28165. #28205 (Nikolai Kochetov).
    • Multiple small fixes for projections. See detailed description in the PR. #28178 (Amos Bird).
    • Fix extremely rare segfaults on shutdown due to incorrect order of context/config reloader shutdown. #28088 (nvartolomei).
    • Fix handling null value with type of Nullable(String) in function JSONExtract. This fixes #27929 and #27930. This was introduced in https://github.com/ClickHouse/ClickHouse/pull/25452 . #27939 (Amos Bird).
    • Multiple fixes for the new clickhouse-keeper tool. Fix a rare bug in clickhouse-keeper when the client can receive a watch response before request-response. #28197 (alesapin). Fix incorrect behavior in clickhouse-keeper when list watches (getChildren) triggered with set requests for children. #28190 (alesapin). Fix rare case when changes of clickhouse-keeper settings may lead to lost logs and a server hang. #28360 (alesapin). Fix bug in clickhouse-keeper which can lead to endless logs when rotate_logs_interval decreased. #28152 (alesapin).

    Build/Testing/Packaging Improvement

    • ⏱ Enable Thread Fuzzer in Stress Test. Thread Fuzzer is ClickHouse feature that allows to test more permutations of thread scheduling and discover more potential issues. This closes #9813. This closes #9814. This closes #9515. This closes #9516. #27538 (alexey-milovidov).
    • βž• Add new log level test for testing environments. It is even more verbose than the default trace. #28559 (alesapin).
    • πŸ”§ Print out git status information at CMake configure stage. #28047 (Braulio Valdivielso MartΓ­nez).
    • 0️⃣ Temporarily switched ubuntu apt repository to mirror ru.archive.ubuntu.com as the default one (archive.ubuntu.com) is not responding from our CI. #28016 (Ilya Yatsishin).

Previous changes from v21.9

  • Backward Incompatible Change

    • Do not output trailing zeros in text representation of Decimal types. Example: 1.23 will be printed instead of 1.230000 for decimal with scale 6. This closes #15794. It may introduce slight incompatibility if your applications somehow relied on the trailing zeros. Serialization in output formats can be controlled with the setting output_format_decimal_trailing_zeros. Implementation of toString and casting to String is changed unconditionally. #27680 (alexey-milovidov).
    • Do not allow to apply parametric aggregate function with -Merge combinator to aggregate function state if state was produced by aggregate function with different parameters. For example, state of fooState(42)(x) cannot be finalized with fooMerge(s) or fooMerge(123)(s), parameters must be specified explicitly like fooMerge(42)(s) and must be equal. It does not affect some special aggregate functions like quantile and sequence* that use parameters for finalization only. #26847 (tavplubix).
    • Under clickhouse-local, always treat local addresses with a port as remote. #26736 (Raúl Marín).
    • Fix the issue that in case of some sophisticated query with column aliases identical to the names of expressions, bad cast may happen. This fixes #25447. This fixes #26914. This fix may introduce backward incompatibility: if there are different expressions with identical names, exception will be thrown. It may break some rare cases when enable_optimize_predicate_expression is set. #26639 (alexey-milovidov).
    • Now, scalar subquery always returns Nullable result if its type can be Nullable. It is needed because in case of empty subquery its result should be Null. Previously, it was possible to get error about incompatible types (type deduction does not execute scalar subquery, and it could use not-nullable type). Scalar subquery with empty result which can't be converted to Nullable (like Array or Tuple) now throws error. Fixes #25411. #26423 (Nikolai Kochetov).
    • Introduce syntax for here documents. Example SELECT $doc$ VALUE $doc$. #26671 (Maksim Kita). This change is backward incompatible if in query there are identifiers that contain $ #28768.
    • ⚑️ Now indices can handle Nullable types, including isNull and isNotNull. #12433 and #12455 (Amos Bird) and #27250 (Azat Khuzhin). But this was done with on-disk format changes, and even though new server can read old data, old server cannot. Also, in case you have MINMAX data skipping indices, you may get Data after mutation/merge is not byte-identical error, since new index will have .idx2 extension while before it was .idx. That said, that you should not delay updating all existing replicas, in this case, otherwise, if old replica (<21.9) will download data from new replica with 21.9+ it will not be able to apply index for downloaded part.

    New Feature

    • Implementation of short circuit function evaluation, closes #12587. Add settings short_circuit_function_evaluation to configure short circuit function evaluation. #23367 (Kruglov Pavel).
    • Add support for INTERSECT, EXCEPT, ANY, ALL operators. #24757 (Kirill Ershov). (Kseniia Sumarokova).
    • Add support for encryption at the virtual file system level (data encryption at rest) using AES-CTR algorithm. #24206 (Latysheva Alexandra). (Vitaly Baranov) #26733 #26377 #26465.
    • Added natural language processing (NLP) functions for tokenization, stemming, lemmatizing and search in synonyms extensions. #24997 (Nikolay Degterinsky).
    • Added integration with S2 geometry library. #24980 (Andr0901). (Nikita Mikhaylov).
    • Add SQLite table engine, table function, database engine. #24194 (Arslan Gumerov). (Kseniia Sumarokova).
    • Added support for custom query for MySQL, PostgreSQL, ClickHouse, JDBC, Cassandra dictionary source. Closes #1270. #26995 (Maksim Kita).
    • Add shared (replicated) storage of user, roles, row policies, quotas and settings profiles through ZooKeeper. #27426 (Kevin Michel).
    • Add compression for INTO OUTFILE that automatically chooses the compression algorithm. Closes #3473. #27134 (Filatenkov Artur).
    • Add INSERT ... FROM INFILE similarly to SELECT ... INTO OUTFILE. #27655 (Filatenkov Artur).
    • Added complex_key_range_hashed dictionary. Closes #22029. #27629 (Maksim Kita).
    • Support expressions in JOIN ON section. Close #21868. #24420 (Vladimir C).
    • When a client connects to the server, it receives information about all warnings that were already collected by the server. (It can be disabled by using option --no-warnings). Add system.warnings table to collect warnings about server configuration. #26246 (Filatenkov Artur). #26282 (Filatenkov Artur).
    • Allow using constant expressions from with and select in aggregate function parameters. Close #10945. #27531 (abel-cheng).
    • Add tupleToNameValuePairs, a function that turns a named tuple into an array of pairs. #27505 (Braulio Valdivielso Martínez).
    • Add support for bzip2 compression method for import/export. Closes #22428. #27377 (Nikolay Degterinsky).
    • Added bitmapSubsetOffsetLimit(bitmap, offset, cardinality_limit) function. It creates a subset of the bitmap that starts at position offset and contains at most cardinality_limit elements. #27234 (DHBin).
    • Add column default_database to system.users. #27054 (kevin wan).
    • Supported cluster macros inside table functions 'cluster' and 'clusterAllReplicas'. #26913 (polyprogrammist).
    • Add new functions currentRoles(), enabledRoles(), defaultRoles(). #26780 (Vitaly Baranov).
    • New functions currentProfiles(), enabledProfiles(), defaultProfiles(). #26714 (Vitaly Baranov).
    • Add functions that return (initial_)query_id of the current query. This closes #23682. #26410 (Alexey Boykov).
    • Add REPLACE GRANT feature. #26384 (Caspian).
    • EXPLAIN query now has EXPLAIN ESTIMATE ... mode that will show information about read rows, marks and parts from MergeTree tables. Closes #23941. #26131 (fastio).
    • Added system.zookeeper_log table. All actions of ZooKeeper client are logged into this table. Implements #25449. #26129 (tavplubix).
    • Zero-copy replication for ReplicatedMergeTree over HDFS storage. #25918 (Zhichang Yu).
    • Allow to insert Nested type as array of structs in Arrow, ORC and Parquet input format. #25902 (Kruglov Pavel).
    • Add a new data type Date32 (stores data as Int32) that supports the same date range as DateTime64. Support loading Parquet date32 into ClickHouse Date32. Add a new function toDate32, similar to toDate. #25774 (LiuNeng).
    • Allow setting default database for users. #25268. #25687 (kevin wan).
    • Add an optional parameter to MongoDB engine to accept connection string options and support SSL connection. Closes #21189. Closes #21041. #22045 (Omar Bazaraa).
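The semantics of bitmapSubsetOffsetLimit from the list above can be illustrated with plain Python collections. This is a sketch only, assuming the bitmap behaves like an ordered set of integers and that offset counts positions in that order; the function below is a hypothetical stand-in, not the ClickHouse implementation.

```python
def bitmap_subset_offset_limit(bitmap, offset, cardinality_limit):
    """Return up to `cardinality_limit` elements starting at position `offset`."""
    # A bitmap holds distinct values, so duplicates are collapsed first.
    ordered = sorted(set(bitmap))
    return ordered[offset:offset + cardinality_limit]


bitmap = list(range(34)) + [100, 200, 500]
print(bitmap_subset_offset_limit(bitmap, 10, 10))  # [10, 11, ..., 19]
print(bitmap_subset_offset_limit(bitmap, 35, 10))  # [200, 500]
```

Note that offset is a position within the bitmap rather than a value, which is why the second call skips 100 and returns only the last two elements.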

    Experimental Feature

    • Added a compression codec AES_128_GCM_SIV which encrypts columns instead of compressing them. #19896 (PHO). Will be rewritten, do not use.
    • Rename MaterializeMySQL to MaterializedMySQL. #26822 (tavplubix).

    🐎 Performance Improvement

    • Improve the performance of fast queries when max_execution_time = 0 by reducing the number of clock_gettime system calls. #27325 (filimonov).
    • 🐎 Specialize date time related comparison to achieve better performance. This fixes #27083 . #27122 (Amos Bird).
    • 🐎 Share file descriptors in concurrent reads of the same files. There is no noticeable performance difference on Linux. But the number of opened files will be significantly (10..100 times) lower on typical servers and it makes operations easier. See #26214. #26768 (alexey-milovidov).
    • πŸ‘Œ Improve latency of short queries, that require reading from tables with large number of columns. #26371 (Anton Popov).
    • πŸ— Don't build sets for indices when analyzing a query. #26365 (RaΓΊl MarΓ­n).
    • Vectorize the SUM of Nullable integer types with native representation (David Manzanares, RaΓΊl MarΓ­n). #26248 (RaΓΊl MarΓ­n).
    • Compile expressions involving columns with Enum types. #26237 (Maksim Kita).
    • Compile aggregate functions groupBitOr, groupBitAnd, groupBitXor. #26161 (Maksim Kita).
    • Improved memory usage with better block size prediction when reading empty DEFAULT columns. Closes #17317. #25917 (Vladimir Chebotarev).
    • Reduce memory usage and number of read rows in queries with ORDER BY primary_key. #25721 (Anton Popov).
    • Enable distributed_push_down_limit by default. #27104 (Azat Khuzhin).
    • Make toTimeZone monotonic when timeZone is a constant value, to support partition pruning. #26261 (huangzhaowei).

    Improvement

    • Mark window functions as ready for general use. Remove the allow_experimental_window_functions setting. #27184 (Alexander Kuzmenkov).
    • Improve compatibility with non-whole-minute timezone offsets. #27080 (Raúl Marín).
    • If file descriptor in File table is regular file - allow to read multiple times from it. It allows clickhouse-local to read multiple times from stdin (with multiple SELECT queries or subqueries) if stdin is a regular file like clickhouse-local --query "SELECT * FROM table UNION ALL SELECT * FROM table" ... < file. This closes #11124. Co-authored with (alexey-milovidov). #25960 (BoloniniD).
    • Remove duplicate index analysis and avoid possible invalid limit checks during projection analysis. #27742 (Amos Bird).
    • Enable query parameters to be passed in the body of HTTP requests. #27706 (Hermano Lustosa).
    • Disallow arrayJoin on partition expressions. #27648 (Raúl Marín).
    • Log client IP address if authentication fails. #27514 (Misko Lee).
    • Use bytes instead of strings for binary data in the GRPC protocol. #27431 (Vitaly Baranov).
    • Send response with error message if HTTP port is not set and user tries to send HTTP request to TCP port. #27385 (Braulio Valdivielso Martínez).
    • Add _CAST function for internal usage, which will not preserve type nullability, but non-internal cast will preserve according to setting cast_keep_nullable. Closes #12636. #27382 (Kseniia Sumarokova).
    • Add setting log_formatted_queries to log additional formatted query into system.query_log. It's useful for normalized query analysis because functions like normalizeQuery and normalizeQueryKeepNames don't parse/format queries in order to achieve better performance. #27380 (Amos Bird).
    • Add two settings max_hyperscan_regexp_length and max_hyperscan_regexp_total_length to prevent huge regexp being used in hyperscan related functions, such as multiMatchAny. #27378 (Amos Bird).
    • Memory consumed by bitmap aggregate functions now is taken into account for memory limits. This closes #26555. #27252 (alexey-milovidov).
    • Add 10 seconds cache for S3 proxy resolver. #27216 (ianton-ru).
    • Split global mutex into individual regexp construction. This helps avoid huge regexp construction blocking other related threads. #27211 (Amos Bird).
    • Support schema for PostgreSQL database engine. Closes #27166. #27198 (Kseniia Sumarokova).
    • Track memory usage in clickhouse-client. #27191 (Filatenkov Artur).
    • Try recording query_kind in system.query_log even when query fails to start. #27182 (Amos Bird).
    • Added column replica_is_active, which maps each replica name to its active status, to the system.replicas table. Closes #27138. #27180 (Maksim Kita).
    • Allow to pass query settings via server URI in Web UI. #27177 (kolsys).
    • Add a new metric called MaxPushedDDLEntryID, which is the maximum DDL entry ID that the current node has pushed to ZooKeeper. #27174 (Fuwang Hu).
    • Improved the checks for node existence and for empty-string nodes when clickhouse-keeper creates a znode. #27125 (小路).
    • Merge JOIN correctly handles empty set in the right. #27078 (Vladimir C).
    • Now functions can be shard-level constants, which means that if such a function is executed in the context of a distributed table, it generates a normal column; otherwise it produces a constant value. Notable functions are: hostName(), tcpPort(), version(), buildId(), uptime(), etc. #27020 (Amos Bird).
    • Updated extractAllGroupsHorizontal - upper limit on the number of matches per row can be set via optional third argument. #26961 (Vasily Nemkov).
    • Expose RocksDB statistics via system.rocksdb table. Read rocksdb options from ClickHouse config (rocksdb... keys). NOTE: ClickHouse does not rely on RocksDB, it is just one of the additional integration storage engines. #26821 (Azat Khuzhin).
    • Less verbose internal RocksDB logs. NOTE: ClickHouse does not rely on RocksDB, it is just one of the additional integration storage engines. This closes #26252. #26789 (alexey-milovidov).
    • Changing default roles affects new sessions only. #26759 (Vitaly Baranov).
    • Watchdog is disabled in docker by default. Fix for not handling ctrl+c. #26757 (Mikhail f. Shiryaev).
    • SET PROFILE now applies constraints too if they're set for a passed profile. #26730 (Vitaly Baranov).
    • Improve handling of KILL QUERY requests. #26675 (Raúl Marín).
    • mapPopulateSeries function supports Map type. #26663 (Ildus Kurbangaliev).
    • Fix excessive (x2) connect attempts with skip_unavailable_shards. #26658 (Azat Khuzhin).
    • Avoid hanging clickhouse-benchmark if connection fails (i.e. on EMFILE). #26656 (Azat Khuzhin).
    • Allow more threads to be used by the Kafka engine. #26642 (feihengye).
    • Add round-robin support for clickhouse-benchmark (it does not differ from the regular multi host/port run except for statistics report). #26607 (Azat Khuzhin).
    • Executable dictionaries (executable, executable_pool) enable creation with DDL query using clickhouse-local. Closes #22355. #26510 (Maksim Kita).
    • Set client query kind for mysql and postgresql compatibility protocol handlers. #26498 (anneji-dev).
    • Apply LIMIT on the shards for queries like SELECT * FROM dist ORDER BY key LIMIT 10 with distributed_push_down_limit=1. Avoid running Distinct/LIMIT BY steps for queries like SELECT DISTINCT sharding_key FROM dist ORDER BY key. Now distributed_push_down_limit is respected by optimize_distributed_group_by_sharding_key optimization. #26466 (Azat Khuzhin).
    • Updated protobuf to 3.17.3. Changelogs are available on https://github.com/protocolbuffers/protobuf/releases. #26424 (Ilya Yatsishin).
    • Enable use_hedged_requests setting that allows to mitigate tail latencies on large clusters. #26380 (alexey-milovidov).
    • Improve behaviour with non-existing host in user allowed host list. #26368 (ianton-ru).
    • Add ability to set Distributed directory monitor settings via CREATE TABLE (i.e. CREATE TABLE dist (key Int) Engine=Distributed(cluster, db, table) SETTINGS monitor_batch_inserts=1 and similar). #26336 (Azat Khuzhin).
    • Save server address in history URLs in web UI if it differs from the origin of web UI. This closes #26044. #26322 (alexey-milovidov).
    • Add events to profile calls to sleep / sleepEachRow. #26320 (Raúl Marín).
    • Allow to reuse connections of shards among different clusters. It also avoids creating new connections when using cluster table function. #26318 (Amos Bird).
    • Control the execution period of clearing old temporary directories with a parameter that has a default value. #26212. #26313 (fastio).
    • Add a setting function_range_max_elements_in_block to tune the safety threshold for data volume generated by function range. This closes #26303. #26305 (alexey-milovidov).
    • Check hash function at table creation, not at sampling. Add a MergeTree setting for this: if someone created a table with an incorrect sampling column but sampling is never used, disable this setting to start the server without an exception. #26256 (zhaoyu).
    • Added output_format_avro_string_column_pattern setting to put specified String columns to Avro as string instead of default bytes. Implements #22414. #26245 (Ilya Golshtein).
    • Add information about column sizes in system.columns table for Log and TinyLog tables. This closes #9001. #26241 (Nikolay Degterinsky).
    • Don't throw exception when querying system.detached_parts table if there is custom disk configuration and detached directory does not exist on some disks. This closes #26078. #26236 (alexey-milovidov).
    • Check for non-deterministic functions in keys, including constant expressions like now(), today(). This closes #25875. This closes #11333. #26235 (alexey-milovidov).
    • Convert timestamp and timestamptz data types to DateTime64 in PostgreSQL table engine. #26234 (jasine).
    • Apply aggressive IN index analysis for projections so that better projection candidate can be selected. #26218 (Amos Bird).
    • Remove GLOBAL keyword for IN when scalar function is passed. In previous versions, if the user specified GLOBAL IN f(x), an exception was thrown. #26217 (Amos Bird).
    • Add error id (like BAD_ARGUMENTS) to exception messages. This closes #25862. #26172 (alexey-milovidov).
    • Fix incorrect output with --progress option for clickhouse-local. Progress bar will be cleared once it gets to 100% - same as it is done for clickhouse-client. Closes #17484. #26128 (Kseniia Sumarokova).
    • Add merge_selecting_sleep_ms setting. #26120 (lthaooo).
    • Remove complicated usage of Linux AIO with one block readahead and replace it with plain simple synchronous IO with O_DIRECT. In previous versions, the setting min_bytes_to_use_direct_io may not work correctly if max_threads is greater than one. Reading with direct IO (that is disabled by default for queries and enabled by default for large merges) will work in less efficient way. This closes #25997. #26003 (alexey-milovidov).
    • Flush Distributed table on REPLACE TABLE query. Resolves #24566 - Do not replace (or create) table on [CREATE OR] REPLACE TABLE ... AS SELECT query if insertion into new table fails. Resolves #23175. #25895 (tavplubix).
    • 🌲 Add views column to system.query_log containing the names of the (materialized or live) views executed by the query. Adds a new log table (system.query_views_log) that contains information about each view executed during a query. Modifies view execution: When an exception is thrown while executing a view, any view that has already startedwill continue running until it finishes. This used to be the behaviour under parallel_view_processing=true and now it's always the same behaviour. - Dependent views now report reading progress to the context. #25714 (RaΓΊl MarΓ­n).
    • Do connection draining asynchonously upon finishing executing distributed queries. A new server setting is added max_threads_for_connection_collector which specifies the number of workers to recycle connections in background. If the pool is full, connection will be drained synchronously but a bit different than before: It's drained after we send EOS to client, query will succeed immediately after receiving enough data, and any exception will be logged instead of throwing to the client. Added setting drain_timeout (3 seconds by default). Connection draining will disconnect upon timeout. #25674 (Amos Bird).
    • Support for multiple includes in configuration. It is possible to include users configuration, remote servers configuration from multiple sources. Simply place <include /> element with from_zk, from_env or incl attribute and it will be replaced with the substitution. #24404 (nvartolomei).
    • Fix multiple block insertion into distributed table with insert_distributed_one_random_shard = 1. This is a marginal feature. Mark as improvement. #23140 (Amos Bird).
    • 👌 Support LowCardinality and FixedString keys/values for Map type. #21543 (hexiaoting).
    • Enable reloading of local disk config. #19526 (taiyang-li).
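
As a rough illustration of the views column and the system.query_views_log table described in the list above, the queries below show one way to inspect them. This is only a sketch: it assumes query logging and the query views log are enabled on the server, the selected columns (view_name, view_type, view_duration_ms, status) come from the ClickHouse documentation for system.query_views_log rather than from these notes, and the LIMIT values are arbitrary.

```sql
-- Recent finished queries that triggered at least one view
-- (assumes system.query_log is enabled on the server).
SELECT event_time, query, views
FROM system.query_log
WHERE type = 'QueryFinish' AND notEmpty(views)
ORDER BY event_time DESC
LIMIT 5;

-- Per-view details for recent queries
-- (assumes the query views log is enabled in the server configuration).
SELECT initial_query_id, view_name, view_type, view_duration_ms, status
FROM system.query_views_log
ORDER BY event_time DESC
LIMIT 10;
```

Joining the two tables on the query identifier (query_id in system.query_log, initial_query_id in system.query_views_log) would tie each view execution back to the query that triggered it.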

    🐛 Bug Fix

    • 🛠 Fix a couple of bugs that may cause replicas to diverge. #27808 (tavplubix).
    • 🛠 Fix a rare bug in DROP PART which can lead to the error Unexpected merged part intersects drop range. #27807 (alesapin).
    • Prevent crashes for some formats when NULL (tombstone) message was coming from Kafka. Closes #19255. #27794 (filimonov).
    • 🛠 Fix column filtering with union distinct in subquery. Closes #27578. #27689 (Kseniia Sumarokova).
    • 🛠 Fix bad type cast when functions like arrayHas are applied to arrays of LowCardinality of Nullable of different non-numeric types like DateTime and DateTime64. In previous versions a bad cast occurred. In the new version it leads to an exception. This closes #26330. #27682 (alexey-milovidov).
    • 🛠 Fix postgresql table function resulting in non-closing connections. Closes #26088. #27662 (Kseniia Sumarokova).
    • 🛠 Fixed another case of Unexpected merged part ... intersecting drop range ... error. #27656 (tavplubix).
    • 🛠 Fix an error with aliased column in Distributed table. #27652 (Vladimir C).
    • After setting max_memory_usage* to non-zero value it was not possible to reset it back to 0 (unlimited). It's fixed. #27638 (tavplubix).
    • 🛠 Fixed underflow of the time value when constructing it from components. Closes #27193. #27605 (Vasily Nemkov).
    • 🛠 Fix crash during projection materialization when some parts contain missing columns. This fixes #27512. #27528 (Amos Bird).
    • 🛠 Fix the metric BackgroundMessageBrokerSchedulePoolTask, which was possibly mistyped. #27452 (Ben).
    • 🛠 Fix distributed queries with zero shards and aggregation. #27427 (Azat Khuzhin).
    • Compatibility when /proc/meminfo does not contain KB suffix. #27361 (Mike Kot).
    • 🛠 Fix incorrect result for query with row-level security, PREWHERE and LowCardinality filter. Fixes #27179. #27329 (Nikolai Kochetov).
    • 🛠 Fixed incorrect validation of partition id for MergeTree tables that were created with old syntax. #27328 (tavplubix).
    • 🛠 Fix MySQL protocol when using parallel formats (CSV / TSV). #27326 (Raúl Marín).
    • 🛠 Fix Cannot find column error for queries with sampling. Was introduced in #24574. Fixes #26522. #27301 (Nikolai Kochetov).
    • 🛠 Fix errors like Expected ColumnLowCardinality, gotUInt8 or Bad cast from type DB::ColumnVector<char8_t> to DB::ColumnLowCardinality for some queries with LowCardinality in PREWHERE. And more importantly, fix the lack of whitespace in the error message. Fixes #23515. #27298 (Nikolai Kochetov).
    • Fix distributed_group_by_no_merge = 2 with distributed_push_down_limit = 1 or optimize_distributed_group_by_sharding_key = 1 with LIMIT BY and LIMIT OFFSET. This is an obscure combination of settings that no one is using. #27249 (Azat Khuzhin).
    • 🛠 Fix mutation stuck on invalid partitions in non-replicated MergeTree. #27248 (Azat Khuzhin).
    • In case of ambiguity, lambda functions prefer their arguments to other aliases or identifiers. #27235 (Raúl Marín).
    • 🛠 Fix column structure in merge join, close #27091. #27217 (Vladimir C).
    • 🛠 In rare cases system.detached_parts table might contain incorrect information for some parts, it's fixed. Fixes #27114. #27183 (tavplubix).
    • 🛠 Fix uninitialized memory in functions multiSearch* with empty array, close #27169. #27181 (Vladimir C).
    • 🛠 Fix synchronization in GRPCServer. This PR fixes #27024. #27064 (Vitaly Baranov).
    • Fixed cache, complex_key_cache, ssd_cache, complex_key_ssd_cache configuration parsing. Options allow_read_expired_keys, max_update_queue_size, update_queue_push_timeout_milliseconds, query_wait_timeout_milliseconds were not parsed for dictionaries with non cache type. #27032 (Maksim Kita).
    • 🛠 Fix possible mutation stuck due to race with DROP_RANGE. #27002 (Azat Khuzhin).
    • 🛠 Now partition ID in queries like ALTER TABLE ... PARTITION ID xxx is validated for correctness. Fixes #25718. #26963 (alesapin).
    • 🛠 Fix "Unknown column name" error with multiple JOINs in some cases, close #26899. #26957 (Vladimir C).
    • 🛠 Fix reading of custom TLDs (stops processing with lower buffer or bigger file). #26948 (Azat Khuzhin).
    • 🛠 Fix error Missing columns: 'xxx' when DEFAULT column references other non materialized column without DEFAULT expression. Fixes #26591. #26900 (alesapin).
    • 🛠 Fix loading of dictionary keys in library-bridge for library dictionary source. #26834 (Kseniia Sumarokova).
    • 🛠 Aggregate function parameters might be lost when applying some combinators causing exceptions like Conversion from AggregateFunction(topKArray, Array(String)) to AggregateFunction(topKArray(10), Array(String)) is not supported. It's fixed. Fixes #26196 and #26433. #26814 (tavplubix).
    • Add event_time_microseconds value for REMOVE_PART in system.part_log. In previous versions it was not set. #26720 (Azat Khuzhin).
    • 📇 Do not remove data on ReplicatedMergeTree table shutdown to avoid creating data to metadata inconsistency. #26716 (nvartolomei).
    • 🛠 Sometimes SET ROLE could work incorrectly, this PR fixes that. #26707 (Vitaly Baranov).
    • 🛠 Some fixes for parallel formatting (https://github.com/ClickHouse/ClickHouse/issues/26694). #26703 (Raúl Marín).
    • 🛠 Fix potential nullptr dereference in window functions. This fixes #25276. #26668 (Alexander Kuzmenkov).
    • 🛠 Fix clickhouse-client history file conversion (when upgrading from the format of a 3-year-old version of clickhouse-client) if the file is empty. #26589 (Azat Khuzhin).
    • 🛠 Fix incorrect function names of groupBitmapAnd/Or/Xor (can be displayed in some occasions). This fixes. #26557 (Amos Bird).
    • ⚡️ Update chown cmd check in clickhouse-server docker entrypoint. It fixes the bug that cluster pod restart failed (or timeout) on kubernetes. #26545 (Ky Li).
    • 🛠 Fix crash in RabbitMQ shutdown in case RabbitMQ setup was not started. Closes #26504. #26529 (Kseniia Sumarokova).
    • 🛠 Fix issues with CREATE DICTIONARY query if dictionary name or database name was quoted. Closes #26491. #26508 (Maksim Kita).
    • 🛠 Fix broken column name resolution after rewriting column aliases. This fixes #26432. #26475 (Amos Bird).
    • 🛠 Fix some fuzzed msan crash. Fixes #22517. #26428 (Nikolai Kochetov).
    • 🔀 Fix infinite non-joined block stream in partial_merge_join, close #26325. #26374 (Vladimir C).
    • 🛠 Fix possible crash when logging in as a dropped user. This PR fixes #26073. #26363 (Vitaly Baranov).
    • Fix optimize_distributed_group_by_sharding_key for multiple columns (leads to incorrect result w/ optimize_skip_unused_shards=1/allow_nondeterministic_optimize_skip_unused_shards=1 and multiple columns in sharding key expression). #26353 (Azat Khuzhin).
    • 🛠 Fixed rare bug in lost replica recovery that may cause replicas to diverge. #26321 (tavplubix).
    • 🛠 Fix zstd decompression (for import/export in zstd framing format that is unrelated to tables data) in case there are escape sequences at the end of internal buffer. Closes #26013. #26314 (Kseniia Sumarokova).
    • 🛠 Fix logical error on join with totals, close #26017. #26250 (Vladimir C).
    • Remove excessive newline in thread_name column in system.stack_trace table. This fixes #24124. #26210 (alexey-milovidov).
    • 🛠 Fix potential crash if more than one untuple expression is used. #26179 (alexey-milovidov).
    • 👻 Don't throw exception in toString for Nullable Enum if Enum does not have a value for zero, close #25806. #26123 (Vladimir C).
    • 🛠 Fixed incorrect sequence_id in MySQL protocol packets that ClickHouse sends on exception during query execution. It might cause MySQL client to reset connection to ClickHouse server. Fixes #21184. #26051 (tavplubix).
    • Fix for the case that cutToFirstSignificantSubdomainCustom()/cutToFirstSignificantSubdomainCustomWithWWW()/firstSignificantSubdomainCustom() returns incorrect type for consts, and hence optimize_skip_unused_shards does not work. #26041 (Azat Khuzhin).
    • 🛠 Fix possible mismatched header when using normal projection with prewhere. This fixes #26020. #26038 (Amos Bird).
    • Fix sharding_key from column w/o function for remote() (before select * from remote('127.1', system.one, dummy) leads to Unknown column: dummy, there are only columns . error). #25824 (Azat Khuzhin).
    • 🛠 Fixed Not found column ... and Missing column ... errors when selecting from MaterializeMySQL. Fixes #23708, #24830, #25794. #25822 (tavplubix).
    • Fix optimize_skip_unused_shards_rewrite_in for non-UInt64 types (may select incorrect shards eventually or throw Cannot infer type of an empty tuple or Function tuple requires at least one argument). #25798 (Azat Khuzhin).
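
The lambda name-resolution item in the list above (#27235) can be illustrated with a small query. This is a hypothetical example, not taken from the pull request: the alias x and the array literal are made up for illustration, and the comments reflect the behaviour described in the note rather than verified output.

```sql
-- x is both a column alias and the name of the lambda argument.
-- Per the note above, x inside the lambda should resolve to the lambda
-- argument, so each array element is doubled instead of using the alias value 10.
SELECT
    10 AS x,
    arrayMap(x -> x * 2, [1, 2, 3]) AS doubled;
```

Giving the lambda argument a name that does not clash with any alias avoids the ambiguity on any server version.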

    🏗 Build/Testing/Packaging Improvement

    • ✅ Now we run stateful and stateless tests in random timezones. Fixes #12439. Reading String as DateTime and writing DateTime as String in Protobuf format now respect timezone. Reading UInt16 as DateTime in Arrow and Parquet formats now treats it as Date and then converts it to DateTime with respect to DateTime's timezone, because Date is serialized in Arrow and Parquet as UInt16. GraphiteMergeTree now respects time zone for rounding of times. Fixes #5098. Author: @alexey-milovidov. #15408 (alesapin).
    • ✅ clickhouse-test supports SQL tests with Jinja2 templates. #26579 (Vladimir C).
    • ➕ Add support for build with clang-13. This closes #27705. #27714 (alexey-milovidov). #27777 (Sergei Semin).
    • ➕ Add CMake options to build with or without specific CPU instruction set. This is for #17469 and #27509. #27508 (alexey-milovidov).
    • 🛠 Fix linking of auxiliary programs when using dynamic libraries. #26958 (Raúl Marín).
    • ⚑️ Update RocksDB to 2021-07-16 master. #26411 (alexey-milovidov).