gh-146455: Fix O(N²) in add_const() after constant folding moved to CFG #146456
Open
zSirius wants to merge 2 commits into python:main
…d to CFG

The add_const() function in flowgraph.c uses a linear search over the consts list to find the index of a constant. After pythongh-126835 moved constant folding from the AST optimizer to the CFG optimizer, this function is now called N times for N inner tuple elements during fold_tuple_of_constants(), resulting in O(N²) total time.

Fix by maintaining an auxiliary _Py_hashtable_t that maps object pointers to their indices in the consts list, providing O(1) lookup.

For a file with 100,000 constant 2-tuples:
- Before: 10.38s (add_const occupies 83.76% of CPU time)
- After: 1.48s
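To make the complexity argument concrete, here is a minimal Python model of the two lookup strategies. It is only an illustration, not the C code in flowgraph.c: the names `add_const_linear` and `ConstTable` are hypothetical, and a `dict` keyed by `id()` stands in for the pointer → index `_Py_hashtable_t`.

```python
def add_const_linear(consts, obj):
    """Model of the old behavior: identity scan over the list, O(len(consts))."""
    for i, existing in enumerate(consts):
        if existing is obj:
            return i
    consts.append(obj)
    return len(consts) - 1


class ConstTable:
    """Model of the fix: the consts list plus an identity -> index side table."""

    def __init__(self, consts=None):
        self.consts = list(consts or [])
        self.index = {}
        for i, obj in enumerate(self.consts):
            # Keep the first index for an object, matching the linear scan.
            self.index.setdefault(id(obj), i)

    def add_const(self, obj):
        # id(obj) is stable here because self.consts keeps obj alive.
        key = id(obj)
        i = self.index.get(key)
        if i is None:
            i = len(self.consts)
            self.consts.append(obj)
            self.index[key] = i
        return i
```

Calling `add_const_linear` N times on N distinct objects scans 0 + 1 + ... + (N-1) entries in total, which is the quadratic term; `ConstTable.add_const` does one dictionary probe per call. The side table is only valid while indices are stable, so in this model it would have to be rebuilt or discarded if the list were reindexed.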
Fix O(N²) performance regression in `add_const()` introduced by moving constant folding from the AST optimizer to the CFG optimizer (gh-126835).

Problem
After gh-130769 moved tuple folding to the CFG, `fold_tuple_of_constants()` calls `add_const()` once per inner tuple element. `add_const()` does a linear scan over the `consts` list to find the index, so N calls × O(N) scan = O(N²).

The same issue affects unary/binary op folding moved in gh-129550 (`fold_const_unaryop`, `fold_const_binop`).

`perf` profiling shows `add_const` taking 83.76% of CPU time when compiling 100K nested constant tuples.

Fix
Maintain an auxiliary `_Py_hashtable_t` (pointer → index mapping) alongside the `consts` list, providing O(1) constant lookup. The hashtable:

- Uses `_Py_hashtable_hash_ptr` / `_Py_hashtable_compare_direct`: pure pointer ops, no Python object overhead
- Is created in `_PyCfg_OptimizeCodeUnit()` and destroyed after `optimize_cfg()`, before `remove_unused_consts()` reindexes the list
- Can key on pointers because `_PyCompile_ConstCacheMergeOne()` already guarantees identity uniqueness (equal-valued constants share the same pointer)

All modified functions are `static`, so there are no public API changes.

Performance (N=100K)
Test cases:

- `((f, f), ...)`
- `(-1, -2, ..., -N)`
- `(0+1, 0+2, ..., 0+N)`

All existing tests pass: `test_compile`, `test_peepholer`, `test_ast`, `test_dis`.
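The three test-case shapes above can be generated and timed with a short script along these lines. This is a sketch, not the benchmark used for the PR: the helper names are hypothetical, and since the description does not say what `f` stands for, distinct float literals are assumed here.

```python
import time


def nested_tuples(n):
    # Assumption: "f" is some constant literal; distinct floats are used here.
    items = ", ".join(f"({i}.5, {i}.5)" for i in range(1, n + 1))
    return f"x = ({items},)"


def unary_ops(n):
    # (-1, -2, ..., -N): each element is a unary minus applied to a constant.
    items = ", ".join(f"-{i}" for i in range(1, n + 1))
    return f"x = ({items},)"


def binary_ops(n):
    # (0+1, 0+2, ..., 0+N): each element is a binary op on two constants.
    items = ", ".join(f"0+{i}" for i in range(1, n + 1))
    return f"x = ({items},)"


def time_compile(source):
    """Return (elapsed seconds, code object) for compiling source."""
    start = time.perf_counter()
    code = compile(source, "<bench>", "exec")
    return time.perf_counter() - start, code
```

For example, `time_compile(nested_tuples(100_000))[0]` gives the compile time for the nested-tuple case at N=100K. Absolute numbers depend on the machine and build, so the comparison is only meaningful between an interpreter with the fix and one without it.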