gh-127750: Improve caching in singledispatchmethod#127751
gh-127750: Improve caching in singledispatchmethod#127751vodik wants to merge 1 commit intopython:mainfrom
Conversation
|
I'm not an expert at def _method(*args, **kwargs):
if not args:
raise TypeError(f'{funcname} requires at least '
'1 positional argument')
return dispatch(args[0].__class__).__get__(obj, cls)(*args, **kwargs)Wouldn't this hold a strong reference to |
2e08bd9 to
a81b83e
Compare
a81b83e to
fe804ef
Compare
|
Yes! Tracing through the logic manually, I can confirm that the So there's a bit of a space leak here (I assume Python's garbage collector might be able to figure it out though?) I just put up a different that tried to solve this problem as well, but deviates further from the original. If we settle on an approach I'll figure out how to write tests... |
| '1 positional argument') | ||
| # Make sure to use the weakref here to prevent storing | ||
| # a strong reference to obj in the cache | ||
| return dispatch(args[0].__class__).__get__(obj_ref(), cls)(*args, **kwargs) |
There was a problem hiding this comment.
Doesn't this introduce a data race though? If I assigned the method and then deleted the underlying obj? I'll have to try it.
Is a weakref even appropriate for this cache then? Or should I throw a ReferenceError?
- return dispatch(args[0].__class__).__get__(obj_ref(), cls)(*args, **kwargs)
+ obj = obj_ref()
+ if obj is None:
+ raise ReferenceError
+
+ return dispatch(args[0].__class__).__get__(obj, cls)(*args, **kwargs)Or is this whole approach maybe not correct enough?
sobolevn
left a comment
There was a problem hiding this comment.
Please, provide a regression test.
|
If the main issue is that different objects are colliding in the cache (due to equal |
|
Unless we can find a solution that is both robust and has less complexity than this patch, I'm inclined to agree with @eendebakpt — I think a clean revert would be better. It's sad, but it may be the best we can do :( |
|
I'm coming to the same conclusion. This patch now makes this code break: foo = Foo()
dispatcher = foo.mysingledismatchmethod
del foo
dispatcher(...) # failsWhat if we revert #107148 and I instead contribute a test case specifically for this situation? That'll help if someone wants to tackle this again. I personally wasn't worried about the performance of this feature. Either way I'll give it another attempt tonight. Maybe there's another approach that can work. |
|
Possibly one way to make this work would be to use the |
|
But yes, whether we think of a clever solution or just revert the feature, we should clearly add a regression test. |
|
@AlexWaygood Weakref aside, just to be clear, while working on this patch I found another problem with the caching - that the closure put in the cache also holds a reference to the same obj we key the cache with. So it's circular and won't cleanup. There's a memory leak here too. I can't think of a way to break that without breaking the semantics of the language. And I think that's the deeper problem that requires this to be reverted rather than fixed. I don't think there is a fix for this. |
|
One more try: what if we change the cache to a
It seems to work locally for me, but definitely needs a second look. I did not measure performance yet. |
ZeroIntensity
left a comment
There was a problem hiding this comment.
Before we discuss approaches, I'd like to get a test down to make sure things are working.
|
It seems reasonable to me too. |
|
#127839 is a better approach |
The problem with using a
WeakRefDictionaryis it's possible for distinct object instances to collide in the cache depending on their implementation of__eq__/__hash__. In particular, it prevents it from being used at all on Django models.