Skip to content

[fix](profile) populate inverted index metrics for SEARCH() query path#62121

Open
airborne12 wants to merge 2 commits intoapache:masterfrom
airborne12:fix-search-inverted-index-metrics
Open

[fix](profile) populate inverted index metrics for SEARCH() query path#62121
airborne12 wants to merge 2 commits intoapache:masterfrom
airborne12:fix-search-inverted-index-metrics

Conversation

@airborne12
Copy link
Copy Markdown
Member

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

SEARCH() function uses the query_v2 engine, which bypasses InvertedIndexReader::query(). This causes all inverted index profile metrics to report zero, making it impossible to diagnose SEARCH() performance issues from query profiles.

Three independent problems were identified and fixed:

1. Query-level timers not populated (function_search.cpp)

The MATCH path populates inverted_index_query_timer, searcher_open_timer, searcher_search_timer, etc. via SCOPED_RAW_TIMER in inverted_index_reader.cpp. The SEARCH path bypasses this entirely — FieldReaderResolver::resolve() opens the index and evaluate_inverted_index_with_search_param() executes the query without any timer instrumentation.

Fix: Add SCOPED_RAW_TIMER at corresponding points in function_search.cpp:

  • inverted_index_query_timer around the entire evaluation
  • inverted_index_searcher_open_timer around searcher cache miss (directory open)
  • inverted_index_searcher_search_timer around query build + execution
  • inverted_index_searcher_search_init_timer around build_query_recursive()
  • inverted_index_searcher_search_exec_timer around collect_multi_segment_doc_set/top_k()
  • inverted_index_searcher_cache_hit/miss counters in resolve()

2. FileCache IO not attributed to inverted index (inverted_index_compound_reader)

CSIndexInput clone constructor did not copy _io_ctx, causing cloned inputs to read without file_cache_stats — so InvertedIndexBytesScannedFromRemote/Cache etc. stayed zero. Additionally, DorisCompoundReader::openInput() created new CSIndexInput without setting _io_ctx, so query_v2 scorer reads lacked proper IO context.

Fix:

  • Copy _io_ctx in CSIndexInput clone constructor
  • Store _io_ctx in DorisCompoundReader and propagate it to new CSIndexInput in openInput()

3. Profile counters pruned by level (olap_scan_operator.cpp)

All inverted index counters were registered with default level=2 via ADD_TIMER/ADD_COUNTER. The profile display system prunes level-2 leaf counters when profile_level < 2, making them invisible in query profiles. The FileCache counters used ADD_CHILD_TIMER_WITH_LEVEL(..., 1) and were visible.

Fix: Register inverted index counters with level=1 via ADD_TIMER_WITH_LEVEL/ADD_COUNTER_WITH_LEVEL.

Release note

None

Check List (For Author)

  • Test

    • Manual test (add detailed scripts or steps below)
      • Deployed on cloud_sim cluster with S3 storage
      • Cold query (file cache cleared): verified InvertedIndexBytesScannedFromRemote: 1.15 KB, InvertedIndexNumRemoteIOTotal: 1, InvertedIndexRemoteIOUseTimer: 46.919ms
      • Warm query (file cache populated): verified InvertedIndexBytesScannedFromCache: 1.15 KB, InvertedIndexNumLocalIOTotal: 1
      • Query-level timers: verified InvertedIndexQueryTime: 51.155ms, InvertedIndexSearcherOpenTime: 49.196ms, InvertedIndexSearcherSearchTime: 51.60ms, InvertedIndexSearcherCacheMiss: 1
  • Behavior changed:

    • Yes. Inverted index profile counters now visible at default profile level and populated for SEARCH() queries.
  • Does this need documentation?

    • No.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

SEARCH() uses query_v2 engine bypassing InvertedIndexReader::query(),
causing all inverted index profile metrics to be zero. This patch fixes
three independent issues:

1. Query-level timers: Add SCOPED_RAW_TIMER instrumentation in
   function_search.cpp for inverted_index_query_timer,
   searcher_open/search/init/exec timers, and cache hit/miss counters.

2. FileCache IO attribution: Fix IOContext propagation in compound reader
   so that file_cache_stats and is_inverted_index flag reach
   CachedRemoteFileReader. Store io_ctx in DorisCompoundReader and
   propagate to CSIndexInput on openInput(); copy _io_ctx in
   CSIndexInput clone constructor.

3. Profile counter visibility: Change inverted index counter registration
   from default level=2 to level=1 so they survive profile pruning.
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Copy link
Copy Markdown
Member Author

run buildall

@doris-robot
Copy link
Copy Markdown

TPC-H: Total hot run time: 28971 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3a0339649f21428258d44ea8f7f8c49ae5be9301, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17664	3704	3743	3704
q2	q3	10687	894	620	620
q4	4695	475	360	360
q5	7654	1338	1129	1129
q6	268	171	136	136
q7	934	950	783	783
q8	10410	1423	1308	1308
q9	5990	5431	5304	5304
q10	6331	2040	1795	1795
q11	484	275	270	270
q12	703	406	281	281
q13	18163	2795	2164	2164
q14	285	293	259	259
q15	q16	899	869	789	789
q17	1098	1083	716	716
q18	6470	5674	5625	5625
q19	1209	1205	1059	1059
q20	556	426	280	280
q21	4993	2383	2042	2042
q22	504	422	347	347
Total cold run time: 99997 ms
Total hot run time: 28971 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4559	4435	4334	4334
q2	q3	4637	4719	4181	4181
q4	2072	2112	1368	1368
q5	5003	5075	5222	5075
q6	198	164	131	131
q7	1986	1846	1629	1629
q8	3295	3075	3071	3071
q9	8257	8222	8359	8222
q10	4461	4434	4259	4259
q11	596	402	391	391
q12	679	727	520	520
q13	3191	3349	2380	2380
q14	307	313	287	287
q15	q16	781	770	702	702
q17	1297	1237	1260	1237
q18	7859	7044	7023	7023
q19	1163	1158	1063	1063
q20	2313	2269	1982	1982
q21	6059	5313	5199	5199
q22	596	526	464	464
Total cold run time: 59309 ms
Total hot run time: 53518 ms

@doris-robot
Copy link
Copy Markdown

TPC-DS: Total hot run time: 178632 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3a0339649f21428258d44ea8f7f8c49ae5be9301, data reload: false

query5	4337	658	529	529
query6	357	223	211	211
query7	4230	573	336	336
query8	339	264	238	238
query9	8785	3905	3872	3872
query10	470	364	304	304
query11	6719	5488	5144	5144
query12	193	134	135	134
query13	1306	617	457	457
query14	5671	5198	4771	4771
query14_1	4154	4149	4130	4130
query15	215	203	185	185
query16	1007	467	453	453
query17	1158	816	653	653
query18	2472	503	388	388
query19	238	210	171	171
query20	144	137	131	131
query21	225	140	121	121
query22	13956	14820	14488	14488
query23	18212	17306	16695	16695
query23_1	16935	16721	16833	16721
query24	8302	2030	1403	1403
query24_1	1375	1377	1385	1377
query25	617	531	470	470
query26	1262	332	181	181
query27	2635	677	383	383
query28	4483	1874	1912	1874
query29	1016	680	554	554
query30	298	229	195	195
query31	1087	1061	954	954
query32	99	72	70	70
query33	532	334	292	292
query34	1180	1171	685	685
query35	746	797	669	669
query36	1234	1226	1122	1122
query37	149	98	82	82
query38	3065	3027	2993	2993
query39	915	897	857	857
query39_1	840	841	844	841
query40	237	149	131	131
query41	63	59	58	58
query42	110	108	106	106
query43	312	316	273	273
query44	
query45	209	196	192	192
query46	1148	1211	803	803
query47	2342	2354	2214	2214
query48	405	420	325	325
query49	643	540	440	440
query50	752	289	216	216
query51	4284	4286	4215	4215
query52	108	109	104	104
query53	250	271	200	200
query54	327	286	259	259
query55	100	92	98	92
query56	312	302	313	302
query57	1651	1705	1689	1689
query58	301	276	276	276
query59	2864	2983	2726	2726
query60	334	337	325	325
query61	156	156	154	154
query62	707	626	567	567
query63	243	195	203	195
query64	5267	1354	982	982
query65	
query66	1520	460	394	394
query67	24350	24433	24087	24087
query68	
query69	448	345	306	306
query70	1007	1019	998	998
query71	321	291	281	281
query72	3035	2830	2474	2474
query73	777	752	459	459
query74	9842	9721	9504	9504
query75	2754	2626	2295	2295
query76	2300	1099	765	765
query77	407	412	328	328
query78	11370	11398	10809	10809
query79	1572	1065	820	820
query80	1090	604	514	514
query81	519	278	243	243
query82	1304	156	119	119
query83	341	291	256	256
query84	268	147	121	121
query85	980	514	459	459
query86	427	321	313	313
query87	3285	3170	3085	3085
query88	3552	2691	2668	2668
query89	444	394	353	353
query90	1883	178	179	178
query91	185	164	144	144
query92	80	76	70	70
query93	985	956	573	573
query94	633	343	286	286
query95	639	374	336	336
query96	1051	744	342	342
query97	2677	2659	2581	2581
query98	257	240	236	236
query99	1070	1075	947	947
Total cold run time: 258746 ms
Total hot run time: 178632 ms

@airborne12
Copy link
Copy Markdown
Member Author

run buildall

@doris-robot
Copy link
Copy Markdown

TPC-H: Total hot run time: 28691 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e3d0ec56ccb97b2320af9ca35f0f4f81938c3939, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17662	3737	3718	3718
q2	q3	10701	858	600	600
q4	4673	463	355	355
q5	7447	1334	1145	1145
q6	183	168	138	138
q7	904	957	760	760
q8	9314	1387	1318	1318
q9	5516	5273	5244	5244
q10	6299	2026	1774	1774
q11	484	276	289	276
q12	677	400	282	282
q13	18102	2769	2170	2170
q14	281	290	260	260
q15	q16	895	844	779	779
q17	989	1154	746	746
q18	6376	5669	5502	5502
q19	1179	1218	940	940
q20	531	431	301	301
q21	4762	2437	1994	1994
q22	494	425	389	389
Total cold run time: 97469 ms
Total hot run time: 28691 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4443	4337	4293	4293
q2	q3	4626	4725	4143	4143
q4	2107	2060	1287	1287
q5	4898	4939	5179	4939
q6	193	168	136	136
q7	2035	1743	1611	1611
q8	3242	3020	3078	3020
q9	8206	8283	8168	8168
q10	4394	4392	4269	4269
q11	634	431	399	399
q12	709	790	585	585
q13	2810	3076	2415	2415
q14	303	319	270	270
q15	q16	750	812	704	704
q17	1263	1266	1290	1266
q18	8114	6968	7038	6968
q19	1152	1166	1160	1160
q20	2192	2211	1982	1982
q21	5921	5266	4851	4851
q22	585	517	431	431
Total cold run time: 58577 ms
Total hot run time: 52897 ms

@doris-robot
Copy link
Copy Markdown

TPC-DS: Total hot run time: 178379 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e3d0ec56ccb97b2320af9ca35f0f4f81938c3939, data reload: false

query5	4334	682	537	537
query6	342	249	209	209
query7	4240	565	351	351
query8	340	251	220	220
query9	8712	3899	3930	3899
query10	470	363	318	318
query11	6765	5490	5137	5137
query12	194	143	130	130
query13	1282	653	455	455
query14	5760	5222	4859	4859
query14_1	4168	4165	4123	4123
query15	212	206	184	184
query16	1013	535	364	364
query17	1159	794	651	651
query18	2601	516	386	386
query19	236	216	177	177
query20	144	135	130	130
query21	220	148	122	122
query22	14089	14712	14449	14449
query23	18032	17171	16642	16642
query23_1	16697	16862	16671	16671
query24	7730	1881	1354	1354
query24_1	1371	1369	1351	1351
query25	651	492	440	440
query26	1248	328	171	171
query27	2670	636	364	364
query28	4438	1890	1905	1890
query29	975	702	559	559
query30	299	240	194	194
query31	1094	1038	933	933
query32	90	73	69	69
query33	528	363	288	288
query34	1194	1131	680	680
query35	762	782	671	671
query36	1216	1188	1077	1077
query37	163	104	84	84
query38	3108	3038	2997	2997
query39	914	902	858	858
query39_1	827	827	830	827
query40	241	153	133	133
query41	103	60	57	57
query42	113	107	107	107
query43	314	323	278	278
query44	
query45	203	196	195	195
query46	1155	1230	780	780
query47	2300	2301	2224	2224
query48	405	417	309	309
query49	630	536	427	427
query50	764	292	220	220
query51	4557	4276	4284	4276
query52	106	109	97	97
query53	256	273	202	202
query54	320	281	264	264
query55	100	93	87	87
query56	307	310	306	306
query57	1733	1699	1601	1601
query58	307	276	256	256
query59	2929	2998	2733	2733
query60	334	334	332	332
query61	161	158	163	158
query62	664	617	571	571
query63	244	202	197	197
query64	5297	1346	973	973
query65	
query66	1442	488	366	366
query67	24259	24394	24131	24131
query68	
query69	445	345	341	341
query70	1043	988	1034	988
query71	320	285	278	278
query72	2980	2736	2451	2451
query73	821	808	468	468
query74	9914	9702	9578	9578
query75	2783	2641	2289	2289
query76	2345	1180	751	751
query77	408	416	332	332
query78	11277	11318	10763	10763
query79	1669	1086	807	807
query80	1399	589	514	514
query81	529	283	238	238
query82	984	172	129	129
query83	367	305	263	263
query84	272	148	121	121
query85	942	499	457	457
query86	448	331	313	313
query87	3284	3197	3123	3123
query88	3617	2724	2714	2714
query89	459	389	351	351
query90	1998	188	181	181
query91	177	176	145	145
query92	86	75	72	72
query93	1249	997	563	563
query94	739	320	281	281
query95	665	465	339	339
query96	1021	764	325	325
query97	2660	2670	2545	2545
query98	237	227	222	222
query99	1051	1053	973	973
Total cold run time: 258951 ms
Total hot run time: 178379 ms

@doris-robot
Copy link
Copy Markdown

BE UT Coverage Report

Increment line coverage 45.07% (32/71) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.96% (20068/37892)
Line Coverage 36.54% (188483/515864)
Region Coverage 32.83% (146548/446329)
Branch Coverage 33.95% (64107/188806)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 91.55% (65/71) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.64% (27328/37108)
Line Coverage 57.21% (294203/514283)
Region Coverage 54.51% (245566/450463)
Branch Coverage 56.11% (106270/189388)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 91.55% (65/71) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.60% (27310/37108)
Line Coverage 57.18% (294074/514283)
Region Coverage 54.49% (245469/450463)
Branch Coverage 56.09% (106226/189388)

@airborne12
Copy link
Copy Markdown
Member Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings:

  1. be/src/exprs/function/function_search.cpp: inverted_index_searcher_cache_hit is incremented when FieldReaderResolver reuses its per-query _cache. That cache is distinct from InvertedIndexSearcherCache, so repeated clauses on the same field will report a cache hit even when the global searcher cache missed.
  2. be/src/exprs/function/function_search.cpp: the SEARCH DSL-cache fast path still returns before inverted_index_query_timer is started, and it never updates inverted_index_query_cache_hit/miss. Cached SEARCH queries will therefore keep showing InvertedIndexQueryTime: 0 and InvertedIndexQueryCacheHit/Miss: 0.

Critical checkpoints:

  • Goal of the task: Partially achieved. The PR fixes open/search timing visibility and IO-context propagation, but SEARCH query-cache metrics are still incomplete and one cache-hit counter is now overstated.
  • Modification size/focus: Yes, the patch is small and focused.
  • Concurrency: No new lock-order or thread-safety issue found in the touched code; the reported issue is metric semantics, not synchronization.
  • Lifecycle/static initialization: DorisCompoundReader IO-context propagation looks lifecycle-safe; no static-initialization risk found.
  • Configuration changes: None.
  • Compatibility changes: None.
  • Parallel code paths: Not fully aligned. MATCH counts only real InvertedIndexSearcherCache hits and times the whole query including cache handling, while SEARCH still diverges on cached paths.
  • Special conditional checks: No additional issue found.
  • Test coverage: Insufficient for this change. I did not find SEARCH-path tests that assert query timer/cache counters, which is why these regressions slipped through.
  • Observability: Improved, but still incomplete on cached SEARCH queries.
  • Transaction/persistence: Not applicable.
  • Data writes/modifications: Not applicable.
  • FE-BE variable passing: Not applicable.
  • Performance review: No additional hot-path concern beyond the incorrect metric accounting above.
  • Other issues: None beyond the findings listed above.

auto cache_it = _cache.find(binding_key);
if (cache_it != _cache.end()) {
*binding = cache_it->second;
if (_context->stats) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_cache here is the resolver's per-query binding cache, not InvertedIndexSearcherCache. Incrementing inverted_index_searcher_cache_hit on this fast path changes the metric's meaning and double-counts hits when the same field is referenced multiple times in one SEARCH() evaluation. On a cold segment, the first clause will record the real miss at the later cache-open path, and a second clause on the same field will record a synthetic hit here even though the global searcher cache was never consulted.

@@ -434,6 +447,11 @@ Status FunctionSearch::evaluate_inverted_index_with_search_param(
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This timer still starts after the SEARCH DSL-cache lookup/return path above. When dsl_cache->lookup() returns a bitmap, the function exits before reaching this scope, so InvertedIndexQueryTime remains 0 on query-cache hits. The same fast path also never updates inverted_index_query_cache_hit/miss, so cached SEARCH queries are still invisible in the profile even after this patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants