ClickHouse v20.5.1.3364 Release Notes

Release Date: 2020-05-14

Previous changes from v20.4.2.9

  • Backward Incompatible Change

    • 🌲 System tables (e.g. system.query_log, system.trace_log, system.metric_log) now use the compact data part format for parts smaller than 10 MiB. The compact data part format is supported since version 20.3. If you are going to downgrade to a version earlier than 20.3, you should manually delete table data for system logs in /var/lib/clickhouse/data/system/.
    • 🛠 When string comparison involves FixedString and the compared arguments are of different sizes, do the comparison as if the smaller string is padded to the length of the larger. This is intended for SQL compatibility if we imagine that the FixedString data type corresponds to SQL CHAR. This closes #9272. #10363 (alexey-milovidov)
    • 👉 Make SHOW CREATE TABLE multiline. Now it is more readable and more like MySQL. #10049 (Azat Khuzhin)
    • ➕ Added a setting validate_polygons that is used in pointInPolygon function and enabled by default. #9857 (alexey-milovidov)
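
The FixedString comparison rule above can be illustrated with a minimal Python sketch. It is not ClickHouse code: the function name fixed_string_compare is hypothetical, and padding with zero bytes is an assumption based on how FixedString values are stored.

```python
def fixed_string_compare(left: bytes, right: bytes) -> int:
    """Compare two byte strings as if the shorter one were padded to the
    length of the longer one. Returns -1, 0 or 1.

    Assumption: the padding byte is zero, as in FixedString storage.
    """
    width = max(len(left), len(right))
    a = left.ljust(width, b"\x00")
    b = right.ljust(width, b"\x00")
    return (a > b) - (a < b)


# A 2-byte value and the same value stored in a 4-byte column compare equal.
print(fixed_string_compare(b"ab", b"ab\x00\x00"))
```

Under this rule the width of the column no longer affects whether two values are considered equal, which is the behaviour an SQL CHAR user would expect.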

    🆕 New Feature

    • ➕ Add support for secured connection from ClickHouse to Zookeeper #10184 (Konstantin Lebedev)
    • 👌 Support custom HTTP handlers. See ISSUES-5436 for description. #7572 (Winter Zhang)
    • ➕ Add MessagePack Input/Output format. #9889 (Kruglov Pavel)
    • ➕ Add Regexp input format. #9196 (Kruglov Pavel)
    • ➕ Added output format Markdown for embedding tables in markdown documents. #10317 (Kruglov Pavel)
    • ➕ Added support for custom settings section in dictionaries. Also fixes issue #2829. #10137 (Artem Streltsov)
    • ➕ Added custom settings support in DDL-queries for CREATE DICTIONARY #10465 (Artem Streltsov)
    • ➕ Add simple server-wide memory profiler that will collect allocation contexts when server memory usage becomes higher than the next allocation threshold. #10444 (alexey-milovidov)
    • Add setting always_fetch_merged_part which prevents a replica from merging parts by itself and makes it always prefer downloading merged parts from other replicas. #10379 (alesapin)
    • ➕ Add function JSONExtractKeysAndValuesRaw which extracts raw data from JSON objects #10378 (hcz)
    • ➕ Add memory usage from OS to system.asynchronous_metrics. #10361 (alexey-milovidov)
    • ➕ Added generic variants for functions least and greatest. Now they work with arbitrary number of arguments of arbitrary types. This fixes #4767 #10318 (alexey-milovidov)
    • Now ClickHouse controls timeouts of dictionary sources on its side. Two new settings were added to the cache dictionary configuration: strict_max_lifetime_seconds, which is max_lifetime by default, and query_wait_timeout_milliseconds, which is one minute by default. The first setting is also useful with the allow_read_expired_keys setting (to forbid reading very expired keys). #10337 (Nikita Mikhaylov)
    • Add log_queries_min_type to filter which entries will be written to query_log #10053 (Azat Khuzhin)
    • ➕ Added function isConstant. This function checks whether its argument is constant expression and returns 1 or 0. It is intended for development, debugging and demonstration purposes. #10198 (alexey-milovidov)
    • ➕ Add joinGetOrNull to return NULL when key is missing instead of returning the default value. #10094 (Amos Bird)
    • Consider NULL to be equal to NULL in IN operator, if the option transform_null_in is set. #10085 (achimbab)
    • ➕ Add ALTER TABLE ... RENAME COLUMN for MergeTree table engines family. #9948 (alesapin)
    • 👌 Support parallel distributed INSERT SELECT. #9759 (vxider)
    • Add ability to query Distributed over Distributed (w/o distributed_group_by_no_merge) ... #9923 (Azat Khuzhin)
    • ➕ Add function arrayReduceInRanges which aggregates array elements in given ranges. #9598 (hcz)
    • ➕ Add Dictionary Status on prometheus exporter. #9622 (Guillaume Tassery)
    • ➕ Add function arrayAUC #8698 (taiyang-li)
    • 👌 Support DROP VIEW statement for better TPC-H compatibility. #9831 (Amos Bird)
    • ➕ Add 'strict_order' option to windowFunnel() #9773 (achimbab)
    • 👌 Support DATE and TIMESTAMP SQL operators, e.g. SELECT date '2001-01-01' #9691 (Artem Zuikov)
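
As a rough illustration of the arrayReduceInRanges entry above, here is a minimal Python sketch of aggregating array elements in given ranges. It is not the ClickHouse implementation: the helper name array_reduce_in_ranges is hypothetical, and it assumes each range is an (index, length) pair with a 1-based index.

```python
from typing import Callable, List, Sequence, Tuple


def array_reduce_in_ranges(
    aggregate: Callable[[Sequence[int]], int],
    ranges: Sequence[Tuple[int, int]],
    values: Sequence[int],
) -> List[int]:
    """Apply `aggregate` to each slice of `values` described by `ranges`.

    Assumption: a range is (index, length) and the index is 1-based.
    """
    results = []
    for index, length in ranges:
        start = index - 1  # convert the 1-based index to a Python offset
        results.append(aggregate(values[start:start + length]))
    return results


data = [1000000, 200000, 30000, 4000, 500, 60, 7]
print(array_reduce_in_ranges(sum, [(1, 5), (2, 3), (3, 4), (4, 4)], data))
# prints [1234500, 234000, 34560, 4567]
```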

    Experimental Feature

    • ➕ Added experimental database engine Atomic. It supports non-blocking DROP and RENAME TABLE queries and atomic EXCHANGE TABLES t1 AND t2 query #7512 (tavplubix)
    • 🎉 Initial support for ReplicatedMergeTree over S3 (it works in suboptimal way) #10126 (Pavel Kovalenko)

    🐛 Bug Fix

    • 🛠 Fixed incorrect scalar results inside inner query of MATERIALIZED VIEW in case if this query contained dependent table #10603 (Nikolai Kochetov)
    • Fixed bug, which caused HTTP requests to get stuck on client closing connection when readonly=2 and cancel_http_readonly_queries_on_client_close=1. #10684 (tavplubix)
    • 🛠 Fix segfault in StorageBuffer when exception is thrown on server startup. Fixes #10550 #10609 (tavplubix)
    • The query SYSTEM DROP DNS CACHE now also drops caches used to check if user is allowed to connect from some IP addresses #10608 (tavplubix)
    • 🛠 Fix usage of multiple IN operators with an identical set in one query. Fixes #10539 #10686 (Anton Popov)
    • 🛠 Fix crash in generateRandom with nested types. Fixes #10583. #10734 (Nikolai Kochetov)
    • 🛠 Fix data corruption for LowCardinality(FixedString) key column in SummingMergeTree which could have happened after merge. Fixes #10489. #10721 (Nikolai Kochetov)
    • Fix logic for aggregation_memory_efficient_merge_threads setting. #10667 (palasonic1)
    • 🛠 Fix disappearing totals. Totals could have been filtered if the query had a JOIN or a subquery with an external WHERE condition. Fixes #10674 #10698 (Nikolai Kochetov)
    • Fix the lack of parallel execution of remote queries with distributed_aggregation_memory_efficient enabled. Fixes #10655 #10664 (Nikolai Kochetov)
    • 🛠 Fix possible incorrect number of rows for queries with LIMIT. Fixes #10566, #10709 #10660 (Nikolai Kochetov)
    • 🛠 Fix index corruption, which may occur in some cases after merging compact parts into another compact part. #10531 (Anton Popov)
    • 🛠 Fix the situation when a mutation finished all parts but hung with is_done=0. #10526 (alesapin)
    • 🛠 Fix overflow at beginning of unix epoch for timezones with fractional offset from UTC. Fixes #9335. #10513 (alexey-milovidov)
    • 👍 Better diagnostics for input formats. Fixes #10204 #10418 (tavplubix)
    • 🛠 Fix numeric overflow in simpleLinearRegression() over large integers #10474 (hcz)
    • 🛠 Fix use-after-free in Distributed shutdown, avoid waiting for sending all batches #10491 (Azat Khuzhin)
    • ➕ Add CA certificates to clickhouse-server docker image #10476 (filimonov)
    • 🛠 Fix a rare endless loop that might have occurred when using the addressToLine function or AggregateFunctionState columns. #10466 (Alexander Kuzmenkov)
    • 🖐 Handle zookeeper "no node error" during distributed query #10050 (Daniel Chen)
    • 🛠 Fix bug when server cannot attach table after column's default was altered. #10441 (alesapin)
    • 0️⃣ Implicitly cast the default expression type to the column type for the ALIAS columns #10563 (Azat Khuzhin)
    • 📇 Don't remove metadata directory if ATTACH DATABASE fails #10442 (Winter Zhang)
    • 🛠 Avoid dependency on system tzdata. Fixes loading of Africa/Casablanca timezone on CentOS 8. Fixes #10211 #10425 (alexey-milovidov)
    • 🛠 Fix some issues if data is inserted with quorum and then gets deleted (DROP PARTITION, TTL, etc.). It led to stuck INSERTs or false-positive exceptions in SELECTs. Fixes #9946 #10188 (Nikita Mikhaylov)
    • Check the number and type of arguments when creating BloomFilter index #9623 #10431 (Winter Zhang)
    • Prefer fallback_to_stale_replicas over skip_unavailable_shards, otherwise when both settings specified and there are no up-to-date replicas the query will fail (patch from @alex-zaitsev ) #10422 (Azat Khuzhin)
    • 🛠 Fix the issue when a query with ARRAY JOIN, ORDER BY and LIMIT may return incomplete result. Fixes #10226. #10427 (Vadim Plakhtinskiy)
    • ➕ Add database name to dictionary name after DETACH/ATTACH. Fixes system.dictionaries table and SYSTEM RELOAD query #10415 (Azat Khuzhin)
    • 🛠 Fix possible incorrect result for extremes in processors pipeline. #10131 (Nikolai Kochetov)
    • Fix possible segfault when the setting distributed_group_by_no_merge is enabled (introduced in 20.3.7.46 by #10131). #10399 (Nikolai Kochetov)
    • 🛠 Fix wrong flattening of Array(Tuple(...)) data types. Fixes #10259 #10390 (alexey-milovidov)
    • 🛠 Fix column names of constants inside JOIN that may clash with names of constants outside of JOIN #9950 (Alexander Kuzmenkov)
    • 🛠 Fix order of columns after Block::sortColumns() #10826 (Azat Khuzhin)
    • 🛠 Fix possible Pipeline stuck error in ConcatProcessor which may happen in remote query. #10381 (Nikolai Kochetov)
    • 🛠 Don't make disk reservations for aggregations. Fixes #9241 #10375 (Azat Khuzhin)
    • 🛠 Fix wrong behaviour of datetime functions for timezones that have changed between positive and negative offsets from UTC (e.g. Pacific/Kiritimati). Fixes #7202 #10369 (alexey-milovidov)
    • 🛠 Avoid infinite loop in dictIsIn function. Fixes #515 #10365 (alexey-milovidov)
    • 0️⃣ Disable GROUP BY sharding_key optimization by default and fix it for WITH ROLLUP/CUBE/TOTALS #10516 (Azat Khuzhin)
    • 🛠 Check for error code when checking parts and don't mark part as broken if the error is like "not enough memory". Fixes #6269 #10364 (alexey-milovidov)
    • 👉 Show information about not loaded dictionaries in system tables. #10234 (Vitaly Baranov)
    • 🛠 Fix nullptr dereference in StorageBuffer if server was shutdown before table startup. #10641 (alexey-milovidov)
    • 🛠 Fixed DROP vs OPTIMIZE race in ReplicatedMergeTree. DROP could leave some garbage in the replica path in ZooKeeper if there was a concurrent OPTIMIZE query. #10312 (tavplubix)
    • 🛠 Fix 'Logical error: CROSS JOIN has expressions' error for queries that mix comma joins and named joins. Fixes #9910 #10311 (Artem Zuikov)
    • Fix queries with max_bytes_before_external_group_by. #10302 (Artem Zuikov)
    • Fix the issue with limiting maximum recursion depth in parser in certain cases. This fixes #10283. This fix may introduce a minor incompatibility: long and deep queries via clickhouse-client may refuse to work, and you should adjust the settings max_query_size and max_parser_depth accordingly. #10295 (alexey-milovidov)
    • 👍 Allow to use count(*) with multiple JOINs. Fixes #9853 #10291 (Artem Zuikov)
    • Fix error Pipeline stuck with max_rows_to_group_by and group_by_overflow_mode = 'break'. #10279 (Nikolai Kochetov)
    • 🛠 Fix 'Cannot add column' error while creating range_hashed dictionary using DDL query. Fixes #10093. #10235 (alesapin)
    • 🛠 Fix rare possible exception Cannot drain connections: cancel first. #10239 (Nikolai Kochetov)
    • 🛠 Fixed bug where ClickHouse would throw "Unknown function lambda." error message when user tries to run ALTER UPDATE/DELETE on tables with ENGINE = Replicated*. Check for nondeterministic functions now handles lambda expressions correctly. #10237 (Alexander Kazakov)
    • 🛠 Fixed reasonably rare segfault in StorageSystemTables that happens when SELECT ... FROM system.tables is run on a database with Lazy engine. #10209 (Alexander Kazakov)
    • 🛠 Fix possible infinite query execution when the query actually should stop on LIMIT, while reading from infinite source like system.numbers or system.zeros. #10206 (Nikolai Kochetov)
    • 🛠 Fixed "generateRandom" function for Date type. This fixes #9973. Fix an edge case when dates with year 2106 are inserted to MergeTree tables with old-style partitioning but partitions are named with year 1970. #10218 (alexey-milovidov)
    • 🛠 Convert types if the table definition of a View does not correspond to the SELECT query. This fixes #10180 and #10022 #10217 (alexey-milovidov)
    • 🛠 Fix parseDateTimeBestEffort for strings in RFC-2822 when day of week is Tuesday or Thursday. This fixes #10082 #10214 (alexey-milovidov)
    • 🛠 Fix column names of constants inside JOIN that may clash with names of constants outside of JOIN. #10207 (alexey-milovidov)
    • 🛠 Fix move-to-prewhere optimization in presence of arrayJoin functions (in certain cases). This fixes #10092 #10195 (alexey-milovidov)
    • 🛠 Fix issue with separator appearing in SCRAMBLE for native mysql-connector-java (JDBC) #10140 (BohuTANG)
    • 🛠 Fix using the current database for an access checking when the database isn't specified. #10192 (Vitaly Baranov)
    • 🛠 Fix ALTER of tables with compact parts. #10130 (Anton Popov)
    • Add the ability to relax the restriction on non-deterministic functions usage in mutations with allow_nondeterministic_mutations setting. #10186 (filimonov)
    • 🛠 Fix DROP TABLE invoked for dictionary #10165 (Azat Khuzhin)
    • Convert blocks if structure does not match when doing INSERT into Distributed table #10135 (Azat Khuzhin)
    • The number of rows was logged incorrectly (as sum across all parts) when inserted block is split by parts with partition key. #10138 (alexey-milovidov)
    • ➕ Add some arguments check and support identifier arguments for MySQL Database Engine #10077 (Winter Zhang)
    • Fix incorrect index_granularity_bytes check while creating new replica. Fixes #10098. #10121 (alesapin)
    • 🛠 Fix bug in CHECK TABLE query when the table contains skip indices. #10068 (alesapin)
    • 🛠 Fix Distributed-over-Distributed with the only one shard in a nested table #9997 (Azat Khuzhin)
    • 🛠 Fix possible rows loss for queries with JOIN and UNION ALL. Fixes #9826, #10113. ... #10099 (Nikolai Kochetov)
    • 🛠 Fix bug in dictionary when local clickhouse server is used as source. It could cause memory corruption if types in dictionary and source are not compatible. #10071 (alesapin)
    • 🛠 Fixed replicated tables startup when updating from an old ClickHouse version where /table/replicas/replica_name/metadata node doesn't exist. Fixes #10037. #10095 (alesapin)
    • Fix error Cannot clone block with columns because block has 0 columns ... While executing GroupingAggregatedTransform. It happened when setting distributed_aggregation_memory_efficient was enabled, and distributed query read aggregating data with mixed single and two-level aggregation from different shards. #10063 (Nikolai Kochetov)
    • 🛠 Fix deadlock when a database with a materialized view failed to attach at startup #10054 (Azat Khuzhin)
    • 🛠 Fix a segmentation fault that could occur in GROUP BY over string keys containing trailing zero bytes (#8636, #8925). ... #10025 (Alexander Kuzmenkov)
    • 🛠 Fix wrong results of distributed queries when alias could override qualified column name. Fixes #9672 #9714 #9972 (Artem Zuikov)
    • 🛠 Fix possible deadlock in SYSTEM RESTART REPLICAS #9955 (tavplubix)
    • 🛠 Fix the number of threads used for remote query execution (performance regression, since 20.3). This happened when query from Distributed table was executed simultaneously on local and remote shards. Fixes #9965 #9971 (Nikolai Kochetov)
    • 🛠 Fixed DeleteOnDestroy logic in ATTACH PART which could lead to automatic removal of attached part and added few tests #9410 (Vladimir Chebotarev)
    • 🛠 Fix a bug with ON CLUSTER DDL queries freezing on server startup. #9927 (Gagan Arneja)
    • 🛠 Fix bug in which the necessary tables weren't retrieved at one of the processing stages of queries to some databases. Fixes #9699. #9949 (achulkov2)
    • 🛠 Fix 'Not found column in block' error when JOIN appears with TOTALS. Fixes #9839 #9939 (Artem Zuikov)
    • 🛠 Fix parsing multiple hosts set in the CREATE USER command #9924 (Vitaly Baranov)
    • 🛠 Fix TRUNCATE for Join table engine (#9917). #9920 (Amos Bird)
    • 🛠 Fix race condition between drop and optimize in ReplicatedMergeTree. #9901 (alesapin)
    • Fix DISTINCT for Distributed when optimize_skip_unused_shards is set. #9808 (Azat Khuzhin)
    • 🛠 Fix "scalar doesn't exist" error in ALTERs (#9878). ... #9904 (Amos Bird)
    • Fix error with qualified names in distributed_product_mode='local'. Fixes #4756 #9891 (Artem Zuikov)
    • 👻 For INSERT queries, shards now clamp the settings from the initiator to their constraints instead of throwing an exception. This fix allows sending INSERT queries to a shard with different constraints. This change improves fix #9447. #9852 (Vitaly Baranov)
    • Add some retries when committing offsets to Kafka broker, since it can reject the commit if during offsets.commit.timeout.ms there were not enough replicas available for the __consumer_offsets topic #9884 (filimonov)
    • 🛠 Fix Distributed engine behavior when virtual columns of the underlying table used in WHERE #9847 (Azat Khuzhin)
    • 🛠 Fixed some cases when timezone of the function argument wasn't used properly. #9574 (Vasily Nemkov)
    • Fix 'Different expressions with the same alias' error when query has PREWHERE and WHERE on distributed table and SET distributed_product_mode = 'local'. #9871 (Artem Zuikov)
    • 🛠 Fix mutations excessive memory consumption for tables with a composite primary key. This fixes #9850. #9860 (alesapin)
    • Fix calculating grants for introspection functions from the setting allow_introspection_functions. #9840 (Vitaly Baranov)
    • Fix max_distributed_connections (w/ and w/o Processors) #9673 (Azat Khuzhin)
    • 🛠 Fix possible exception Got 0 in totals chunk, expected 1 on client. It happened for queries with JOIN in case if right joined table had zero rows. Example: select * from system.one t1 join system.one t2 on t1.dummy = t2.dummy limit 0 FORMAT TabSeparated;. Fixes #9777. ... #9823 (Nikolai Kochetov)
    • 🛠 Fix 'COMMA to CROSS JOIN rewriter is not enabled or cannot rewrite query' error in case of subqueries with COMMA JOIN out of tables lists (i.e. in WHERE). Fixes #9782 #9830 (Artem Zuikov)
    • Fix server crashing when optimize_skip_unused_shards is set and expression for key can't be converted to its field type #9804 (Azat Khuzhin)
    • 🛠 Fix empty string handling in splitByString. #9767 (hcz)
    • 🛠 Fix broken ALTER TABLE DELETE COLUMN query for compact parts. #9779 (alesapin)
    • Fixed missing rows_before_limit_at_least for queries over http (with processors pipeline). Fixes #9730 #9757 (Nikolai Kochetov)
    • 🛠 Fix excessive memory consumption in ALTER queries (mutations). This fixes #9533 and #9670. #9754 (alesapin)
    • 🛠 Fix possible permanent "Cannot schedule a task" error. #9154 (Azat Khuzhin)
    • 🛠 Fix bug in backquoting in external dictionaries DDL. Fixes #9619. #9734 (alesapin)
    • 🛠 Fixed data race in text_log. It does not correspond to any real bug. #9726 (alexey-milovidov)
    • 🛠 Fix bug in a replication that doesn't allow replication to work if the user has executed mutations on the previous version. This fixes #9645. #9652 (alesapin)
    • 🛠 Fixed incorrect internal function names for sumKahan and sumWithOverflow. It led to an exception while using these functions in remote queries. #9636 (Azat Khuzhin)
    • Add setting use_compact_format_in_distributed_parts_names which allows to write files for INSERT queries into Distributed table with more compact format. This fixes #9647. #9653 (alesapin)
    • 🛠 Fix RIGHT and FULL JOIN with LowCardinality in JOIN keys. #9610 (Artem Zuikov)
    • 🛠 Fix possible exceptions Size of filter doesn't match size of column and Invalid number of rows in Chunk in MergeTreeRangeReader. They could appear while executing PREWHERE in some cases. #9612 (Anton Popov)
    • 👍 Allow ALTER ON CLUSTER of Distributed tables with internal replication. This fixes #3268 #9617 (shinoi2)
    • 🛠 Fix issue when timezone was not preserved if you write a simple arithmetic expression like time + 1 (in contrast to an expression like time + INTERVAL 1 SECOND). This fixes #5743 #9323 (alexey-milovidov)
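
The settings-clamping entry above (#9852) describes shards adjusting the settings received from the initiator to their own constraints rather than rejecting the query. A minimal Python sketch of that idea follows. It is only an illustration: the function clamp_settings, the (minimum, maximum) representation of constraints, and the sample values are all hypothetical and do not reflect the ClickHouse internals.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: setting name -> (minimum, maximum), None = unbounded.
Bounds = Tuple[Optional[int], Optional[int]]


def clamp_settings(incoming: Dict[str, int],
                   constraints: Dict[str, Bounds]) -> Dict[str, int]:
    """Return a copy of `incoming` with each value forced into its allowed range."""
    clamped = {}
    for name, value in incoming.items():
        minimum, maximum = constraints.get(name, (None, None))
        if minimum is not None and value < minimum:
            value = minimum
        if maximum is not None and value > maximum:
            value = maximum
        clamped[name] = value
    return clamped


shard_constraints = {"max_memory_usage": (None, 10_000_000_000)}
from_initiator = {"max_memory_usage": 20_000_000_000, "max_threads": 8}
print(clamp_settings(from_initiator, shard_constraints))
```

Settings without a constraint on the shard pass through unchanged, while out-of-range values are replaced by the nearest allowed bound instead of raising an error.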

    👌 Improvement

    • 🛠 Use time zone when comparing DateTime with string literal. This fixes #5206. #10515 (alexey-milovidov)
    • 🖨 Print verbose diagnostic info if Decimal value cannot be parsed from text input format. #10205 (alexey-milovidov)
    • ➕ Add tasks/memory metrics for distributed/buffer schedule pools #10449 (Azat Khuzhin)
    • 🛠 Display result as soon as it's ready for SELECT DISTINCT queries in clickhouse-local and HTTP interface. This fixes #8951 #9559 (alexey-milovidov)
    • 👍 Allow to use SAMPLE OFFSET query instead of cityHash64(PRIMARY KEY) % N == n for splitting in clickhouse-copier. To use this feature, pass --experimental-use-sample-offset 1 as a command line argument. #10414 (Nikita Mikhaylov)
    • 👍 Allow to parse BOM in TSV if the first column cannot contain BOM in its value. This fixes #10301 #10424 (alexey-milovidov)
    • ➕ Add Avro nested fields insert support #10354 (Andrew Onyshchuk)
    • 👍 Allowed to alter column in non-modifying data mode when the same type is specified. #10382 (Vladimir Chebotarev)
    • Auto distributed_group_by_no_merge on GROUP BY sharding key (if optimize_skip_unused_shards is set) #10341 (Azat Khuzhin)
    • ⚡️ Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed with GROUP BY sharding_key #10373 (Azat Khuzhin)
    • Added a setting max_server_memory_usage to limit total memory usage of the server. The metric MemoryTracking is now calculated without a drift. The setting max_memory_usage_for_all_queries is now obsolete and does nothing. This closes #10293. #10362 (alexey-milovidov)
    • Add config option system_tables_lazy_load. If it's set to false, then system tables with logs are loaded at the server startup. Alexander Burmak, Svyatoslav Tkhon Il Pak, #9642 #10359 (alexey-milovidov)
    • ⏱ Use background thread pool (background_schedule_pool_size) for distributed sends #10263 (Azat Khuzhin)
    • 👉 Use background thread pool for background buffer flushes. #10315 (Azat Khuzhin)
    • 👌 Support for one special case of removing incompletely written parts. This fixes #9940. #10221 (alexey-milovidov)
    • 👉 Use isInjective() over manual list of such functions for GROUP BY optimization. #10342 (Azat Khuzhin)
    • 🖨 Avoid printing error message in log if client sends RST packet immediately on connect. It is typical behaviour of IPVS balancer with keepalived and VRRP. This fixes #1851 #10274 (alexey-milovidov)
    • 👍 Allow to parse +inf for floating point types. This closes #1839 #10272 (alexey-milovidov)
    • Implemented generateRandom table function for Nested types. This closes #9903 #10219 (alexey-milovidov)
    • 👍 Provide max_allowed_packet in MySQL compatibility interface that will help some clients to communicate with ClickHouse via MySQL protocol. #10199 (BohuTANG)
    • 👍 Allow literals for GLOBAL IN (i.e. SELECT * FROM remote('localhost', system.one) WHERE dummy global in (0)) #10196 (Azat Khuzhin)
    • 🛠 Fix various small issues in interactive mode of clickhouse-client #10194 (alexey-milovidov)
    • Avoid superfluous dictionaries load (system.tables, DROP/SHOW CREATE TABLE) #10164 (Azat Khuzhin)
    • ⚡️ Update to RWLock: timeout parameter for getLock() + implementation reworked to be phase fair #10073 (Alexander Kazakov)
    • ✨ Enhanced compatibility with native mysql-connector-java(JDBC) #10021 (BohuTANG)
    • The function toString is considered monotonic and can be used for index analysis even when applied in tautological cases with String or LowCardinality(String) argument. #10110 (Amos Bird)
    • ➕ Add ON CLUSTER clause support to commands {CREATE|DROP} USER/ROLE/ROW POLICY/SETTINGS PROFILE/QUOTA, GRANT. #9811 (Vitaly Baranov)
    • 💅 Virtual hosted-style support for S3 URI #9998 (Pavel Kovalenko)
    • 🛠 Now layout type for dictionaries with no arguments can be specified without round brackets in dictionaries DDL-queries. Fixes #10057. #10064 (alesapin)
    • ➕ Add ability to use number ranges with leading zeros in filepath #9989 (Olga Khvostikova)
    • 👍 Better memory usage in CROSS JOIN. #10029 (Artem Zuikov)
    • Try to connect to all shards in cluster when getting structure of remote table and skip_unavailable_shards is set. #7278 (nvartolomei)
    • Add total_rows/total_bytes into the system.tables table. #9919 (Azat Khuzhin)
    • 0️⃣ System log tables now use polymorphic parts by default. #9905 (Anton Popov)
    • Add type column into system.settings/merge_tree_settings #9909 (Azat Khuzhin)
    • Check for available CPU instructions at server startup as early as possible. #9888 (alexey-milovidov)
    • ✂ Remove ORDER BY stage from mutations because we read from a single ordered part in a single thread. Also add check that the rows in mutation are ordered by sorting key and this order is not violated. #9886 (alesapin)
    • 🛠 Implement operator LIKE for FixedString at left hand side. This is needed to better support TPC-DS queries. #9890 (alexey-milovidov)
    • ⚡️ Add force_optimize_skip_unused_shards_no_nested that will disable force_optimize_skip_unused_shards for nested Distributed table #9812 (Azat Khuzhin)
    • 🔀 Now columns size is calculated only once for MergeTree data parts. #9827 (alesapin)
    • Evaluate constant expressions for optimize_skip_unused_shards (i.e. SELECT * FROM foo_dist WHERE key=xxHash32(0)) #8846 (Azat Khuzhin)
    • 🚚 Check for using Date or DateTime column from TTL expressions was removed. #9967 (Vladimir Chebotarev)
    • DiskS3 hard links optimal implementation. #9760 (Pavel Kovalenko)
    • Setting multiple_joins_rewriter_version = 2 enables the second version of multiple JOIN rewrites, which keeps non-clashing column names as is. It supports multiple JOINs with USING and allows select * for JOINs with subqueries. #9739 (Artem Zuikov)
    • Implementation of "non-blocking" alter for StorageMergeTree #9606 (alesapin)
    • ➕ Add MergeTree full support for DiskS3 #9646 (Pavel Kovalenko)
    • 👍 Extend splitByString to support empty strings as separators. #9742 (hcz)
    • Add a timestamp_ns column to system.trace_log. It contains a high-definition timestamp of the trace event, and allows to build timelines of thread profiles ("flame charts"). #9696 (Alexander Kuzmenkov)
    • 🔊 When the setting send_logs_level is enabled, avoid intermixing of log messages and query progress. #9634 (Azat Khuzhin)
    • ➕ Added support of MATERIALIZE TTL IN PARTITION. #9581 (Vladimir Chebotarev)
    • 👌 Support complex types inside Avro nested fields #10502 (Andrew Onyshchuk)
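
To make the splitByString change above (#9742) concrete, here is a small Python sketch of splitting with an empty separator. It is an illustration only: the helper split_by_string is hypothetical, it works on Python characters rather than bytes, and treating an empty separator as "split into single characters" is an assumption about the intended behaviour.

```python
from typing import List


def split_by_string(separator: str, text: str) -> List[str]:
    """Split `text` by `separator`.

    Assumption: an empty separator splits the text into single characters.
    Python's own str.split raises ValueError for an empty separator,
    so that case is handled explicitly.
    """
    if separator == "":
        return list(text)
    return text.split(separator)


print(split_by_string("", "abcde"))
print(split_by_string(", ", "1, 2 3, 4,5, abcde"))
```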

    🐎 Performance Improvement

    • 👍 Better insert logic for right table for Partial MergeJoin. #10467 (Artem Zuikov)
    • 👌 Improved performance of row-oriented formats (more than 10% for CSV and more than 35% for Avro in case of narrow tables). #10503 (Andrew Onyshchuk)
    • 👌 Improved performance of queries with explicitly defined sets at right side of IN operator and tuples on the left side. #10385 (Anton Popov)
    • 👉 Use less memory for hash table in HashJoin. #10416 (Artem Zuikov)
    • Special HashJoin over StorageDictionary. Allow rewrite dictGet() functions with JOINs. It's not backward incompatible itself but could uncover #8400 on some installations. #10133 (Artem Zuikov)
    • 👍 Enable parallel insert of materialized view when its target table supports it. #10052 (vxider)
    • 👌 Improved performance of index analysis with monotonic functions. #9607 #10026 (Anton Popov)
    • Using SSE2 or SSE4.2 SIMD intrinsics to speed up tokenization in bloom filters. #9968 (Vasily Nemkov)
    • 👌 Improved performance of queries with explicitly defined sets at right side of IN operator. This fixes performance regression in version 20.3. #9740 (Anton Popov)
    • Now clickhouse-copier splits each partition into a number of pieces and copies them independently. #9075 (Nikita Mikhaylov)
    • ➕ Adding more aggregation methods. For example TPC-H query 1 will now pick FixedHashMap<UInt16, AggregateDataPtr> and get a 25% performance gain #9829 (Amos Bird)
    • 👉 Use single row counter for multiple streams in pre-limit transform. This helps to avoid uniting pipeline streams in queries with limit but without order by (like select f(x) from (select x from t limit 1000000000)) and use multiple threads for further processing. #9602 (Nikolai Kochetov)
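
The clickhouse-copier entry above splits each partition into pieces that can be copied independently, and an earlier note mentions the condition cityHash64(PRIMARY KEY) % N == n for this. The Python sketch below shows the general idea of hash-based splitting. It is not the copier's code: the function split_into_pieces is hypothetical, and zlib.crc32 is used only as a stand-in for cityHash64.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def split_into_pieces(keys: Iterable[object], number_of_pieces: int) -> Dict[int, List[object]]:
    """Assign each key to piece number hash(key) % number_of_pieces.

    Assumption: zlib.crc32 replaces cityHash64 purely for illustration.
    """
    pieces = defaultdict(list)
    for key in keys:
        digest = zlib.crc32(repr(key).encode("utf-8"))
        pieces[digest % number_of_pieces].append(key)
    return dict(pieces)


pieces = split_into_pieces(range(20), 4)
for piece_number in sorted(pieces):
    print(piece_number, pieces[piece_number])
```

Because the assignment depends only on the key, every row lands in exactly one piece, so the pieces can be processed separately without overlap.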

    🏗 Build/Testing/Packaging Improvement