Analysis: why a ClickHouse join on a distributed table does not use a local join
[Clickhouse developer]

Many people who use distributed tables in a join statement run into the same problem: even with distributed_product_mode='local' set, the table on the right side of the join is not rewritten to its corresponding local table to speed up the join. This article explains why.

https://chowdera.com/2021/05/20210513211721646A.html, 2021-05-13 21:18:05

The NonGlobalSubqueryVisitor class is in the same cpp file as InJoinSubqueriesPreprocessor. When it processes the TableExpression on the right side of the join, it only considers the subquery case, not the ordinary table case. The table_expression can be passed in directly for processing.

# ASTTablesInSelectQuery represents the following part:
# It has two ASTTablesInSelectQueryElement elements:
# In the first element, ASTTableExpression is not empty and represents all_t a; its ASTTableJoin attribute is empty.
ASTTableExpression is the subquery b part, and ASTTableJoin is the inner join on a.id=b.id part.
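The two-element layout of ASTTablesInSelectQuery described above can be pictured with a small model. This is only a sketch: the structs below are hypothetical stand-ins that borrow the node names from the text, not the ClickHouse definitions, and "(subquery)" is a placeholder because the subquery text is not given here.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the AST node types named in the text.
struct TableExpression {
    std::string table;     // ordinary table name, empty for a subquery
    std::string subquery;  // subquery text, empty for an ordinary table
    std::string alias;
};

struct TableJoin {
    std::string kind;  // e.g. "INNER"
    std::string on;    // e.g. "a.id = b.id"
};

struct TablesInSelectQueryElement {
    std::optional<TableExpression> table_expression;
    std::optional<TableJoin> table_join;
};

// Models: FROM all_t a INNER JOIN (subquery) b ON a.id = b.id
inline std::vector<TablesInSelectQueryElement> makeExampleTables() {
    std::vector<TablesInSelectQueryElement> tables;
    // First element: table expression only, no join attribute.
    tables.push_back(TablesInSelectQueryElement{
        TableExpression{"all_t", "", "a"}, std::nullopt});
    // Second element: both the table expression and the join are set.
    tables.push_back(TablesInSelectQueryElement{
        TableExpression{"", "(subquery)", "b"}, TableJoin{"INNER", "a.id = b.id"}});
    return tables;
}

// Returns the indices of the elements that carry a join.
inline std::vector<std::size_t> joinElements(
    const std::vector<TablesInSelectQueryElement>& tables) {
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < tables.size(); ++i)
        if (tables[i].table_join.has_value())
            result.push_back(i);
    return result;
}
```

Only the second element has a join attached, which is why the rewrite logic only ever looks at the right side of the join.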

The NonGlobalTableVisitor class. The matcher for this visitor is OneTypeMatcher, and it processes every ASTTableExpression node in the subquery above. If a TableExpression is an ordinary table, it calls the renameIfNeeded function to rewrite it. Like the other SQL-processing Visitor classes, this class uses the visitor pattern to parse or otherwise operate on the SQL.

The renameIfNeeded function. Only the distributed_product_mode=local case is analyzed here. Its code shows two things:
1. A distributed table in the subquery must have an alias.
2. A distributed table in the subquery is rewritten to the corresponding local table.
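The two points about renameIfNeeded can be sketched as a standalone function. This is a simplified model under stated assumptions, not the ClickHouse implementation: TableRef, LocalTarget and renameToLocal are hypothetical names, the distributed-to-local mapping is passed in as a plain map, and the table names in any usage are made up.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical, simplified table reference found inside a subquery.
struct TableRef {
    std::string database;
    std::string table;
    std::string alias;
};

// Hypothetical description of the local table behind a distributed table.
struct LocalTarget {
    std::string database;
    std::string table;
};

// Models the distributed_product_mode = 'local' case described above.
// Returns true if the reference was rewritten to its local table.
inline bool renameToLocal(TableRef& ref,
                          const std::map<std::string, LocalTarget>& distributed) {
    auto it = distributed.find(ref.database + "." + ref.table);
    if (it == distributed.end())
        return false;  // not a distributed table: leave it untouched

    // Point 1: the distributed table must carry an alias.
    if (ref.alias.empty())
        throw std::invalid_argument("distributed table needs an alias in local mode");

    // Point 2: rewrite the reference to the corresponding local table.
    ref.database = it->second.database;
    ref.table = it->second.table;
    return true;
}
```

The alias is kept unchanged in this model, so expressions that refer to the table through its alias continue to resolve after the rename.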

Rewriting a distributed table inside IN or JOIN to its local table is handled by the Interpreters/InJoinSubqueriesPreprocessor class, starting from the visit function. With the earlier understanding of the AST structure, the function's code is easier to follow.

It mainly handles the in/notin and join modules. This analysis focuses on the join module. The code shows that an ASTTablesInSelectQueryElement is processed only when both its table_join and table_expression are non-empty.

Clickhouse developer. Contribution: Sweet orange financial big data R & D kindred.

# In the other element, both ASTTableExpression and ASTTableJoin are non-empty.

The AST is traversed from top to bottom. When the right side of the join is a subquery, the NonGlobalTableVisitor class is used to process the subquery further.

# Insert about 1000 and perform the following query
Cost: 10.445 seconds. explain syntax shows that the right table has not been rewritten to the local table.
# Change the table on the right to an embedded subquery.

The fix is simple: add one more branch that handles an ordinary table. After modifying the code, compile it, start the service, and set distributed_product_mode = 'local'. A query with no subquery now has its distributed table rewritten normally, and execution takes about 3 seconds, almost the same as the local join tested earlier.
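The effect of the extra branch can be illustrated with a reduced model of the join handling. It is a sketch under assumptions, not the actual patch: Element, JoinInfo, TableExpr and visitJoinElement are hypothetical names, and the handle_plain_tables switch exists only to compare the behaviour before and after the change.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for the AST nodes involved.
struct TableExpr {
    std::optional<std::string> subquery;  // set when the right side is a subquery
    std::optional<std::string> table;     // set when the right side is an ordinary table
};

struct JoinInfo {
    std::string on;
};

struct Element {
    std::optional<TableExpr> table_expression;
    std::optional<JoinInfo> table_join;
};

enum class Action { Skipped, VisitedSubquery, VisitedTable };

// handle_plain_tables = false models the behaviour before the change;
// handle_plain_tables = true models the added ordinary-table branch.
inline Action visitJoinElement(const Element& node, bool handle_plain_tables) {
    // Only elements with both a join and a table expression are processed.
    if (!node.table_join || !node.table_expression)
        return Action::Skipped;

    // Subquery on the right: its tables would be handed to the table visitor.
    if (node.table_expression->subquery)
        return Action::VisitedSubquery;

    // Added branch: the table expression itself is handed to the table visitor.
    if (handle_plain_tables && node.table_expression->table)
        return Action::VisitedTable;

    return Action::Skipped;
}
```

With handle_plain_tables set to false, an ordinary table on the right side is skipped, which matches the slow, non-rewritten query measured above.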

Cost: 3.029 seconds. explain syntax shows that the table on the right side of the join is rewritten to the local table.

The analysis shows that the problem is in the visit function of the NonGlobalSubqueryVisitor class.

The code version in this case is 21.3.9.83-lts. The AST structure of the whole statement can be displayed with the explain AST command.

A few points to note:
1. If the from table is not an ordinary table, the statement is not rewritten.
2. If the from table is not a distributed table, or it has fewer than 2 shards, the statement is not rewritten.
3. The actual rewriting is done in the NonGlobalSubqueryVisitor class. That is, the from table itself is not processed; only the dataset on the right side of the join (table, table function, subquery) is, and a join explicitly specified as global join is not rewritten.
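The three points amount to a short list of preconditions, which the function below models. It is a rough sketch: QueryShape and rightSideMayBeRewritten are hypothetical names, and the flags stand in for checks that the real code performs on the AST and the storage.

```cpp
// Hypothetical summary of the facts the preconditions depend on.
struct QueryShape {
    bool from_is_plain_table;  // point 1: from must name an ordinary table
    bool from_is_distributed;  // point 2: that table must be distributed...
    int from_shard_count;      // ...and have at least 2 shards
    bool join_is_global;       // an explicit global join is never rewritten
};

// Returns true when the right side of the join is a candidate for rewriting.
inline bool rightSideMayBeRewritten(const QueryShape& q) {
    if (!q.from_is_plain_table)
        return false;
    if (!q.from_is_distributed || q.from_shard_count < 2)
        return false;
    if (q.join_is_global)
        return false;
    return true;
}
```

Passing all of these checks only makes the right side a candidate; whether a given table is then actually renamed depends on the visitor logic described in the rest of the article.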