Improve import time of various stdlib modules #109653
Comments
Retitling this issue to reflect a change of scope: part of the reason for the slow import time of `typing.py` …
An idea: review the output of a linter for "unused" imports. Some may be false positives due to needing an import's side effects, but I expect there will be some that are no longer needed.

How do you feel about making an intentional effort to get rid of such reliance on side effects in the stdlib?
FWIW, I can only find one in the stdlib that I'm 100% confident with, using ruff/pycln: Line 20 in e8be0c9
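For a rough first pass without external tools, an `ast`-based check can flag imports whose names are never referenced. This is only a sketch: it misses `__all__` re-exports, side-effect imports, and names used in string annotations, which is exactly why the false-positive review mentioned above matters.

```python
# Homemade stand-in for ruff's F401 / pycln: collect imported names and
# report the ones that never appear as a Name node in the module body.
import ast

source = """
import os
import sys

print(sys.path)
"""

tree = ast.parse(source)
imported = {
    alias.asname or alias.name.split(".")[0]
    for node in ast.walk(tree)
    if isinstance(node, (ast.Import, ast.ImportFrom))
    for alias in node.names
}
used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
unused = imported - used
print(sorted(unused))  # prints ['os']
```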
I think this is because Victor went through them all fairly recently, in #105411. There are a fair few in …

This is also interesting:
@sobolevn, the situation with … So, should we just go for cleaner code in …
Looking at … A …
typing.py: improve import time by creating soft-deprecated members on demand (python#109651) Co-authored-by: Thomas Grainger <tagrain@gmail.com>
One thing that does mildly bug me about the …
I wrote about removing or delaying some of the imports dataclasses uses late last year, but didn't end up publishing or pursuing it, as #97800 made … To summarise, taken from the …
I had a look at `dataclasses`. Almost any improvements also require deferring the import of `inspect`. This does mean that classes with no docstring/with string annotations/with the future … The changes would look something like this:

Code changes to dataclasses:

diff --git a/Lib/dataclasses.py b/Lib/dataclasses.py
index 84f8d68ce0..5ce90192f5 100644
--- a/Lib/dataclasses.py
+++ b/Lib/dataclasses.py
@@ -1,13 +1,10 @@
-import re
import sys
import copy
import types
-import inspect
import keyword
-import functools
import itertools
import abc
-import _thread
+from reprlib import recursive_repr as _recursive_repr
from types import FunctionType, GenericAlias
@@ -220,7 +217,21 @@ def __repr__(self):
# String regex that string annotations for ClassVar or InitVar must match.
# Allows "identifier.identifier[" or "identifier[".
# https://bugs.python.org/issue33453 for details.
-_MODULE_IDENTIFIER_RE = re.compile(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')
+class _DelayedRegexMatcher:
+ def __init__(self, match_str):
+ self.match_str = match_str
+ self._compiled_regex = None
+
+ def __repr__(self):
+ return f"{self.__class__.__name__}(match_str={self.match_str!r})"
+
+ def match(self, ann):
+ if self._compiled_regex is None:
+ import re
+ self._compiled_regex = re.compile(self.match_str)
+ return self._compiled_regex.match(ann)
+
+_MODULE_IDENTIFIER_RE = _DelayedRegexMatcher(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')
# Atomic immutable types which don't require any recursive handling and for which deepcopy
# returns the same object. We can provide a fast-path for these types in asdict and astuple.
@@ -245,25 +256,31 @@ def __repr__(self):
property,
})
-# This function's logic is copied from "recursive_repr" function in
-# reprlib module to avoid dependency.
-def _recursive_repr(user_function):
- # Decorator to make a repr function return "..." for a recursive
- # call.
- repr_running = set()
-
- @functools.wraps(user_function)
- def wrapper(self):
- key = id(self), _thread.get_ident()
- if key in repr_running:
- return '...'
- repr_running.add(key)
- try:
- result = user_function(self)
- finally:
- repr_running.discard(key)
- return result
- return wrapper
+def _get_annotations(obj):
+ """
+ Compute the annotations dict for a dataclass.
+
+ Copied from inspect.get_annotations with unused code removed.
+ """
+ if isinstance(obj, type):
+ obj_dict = getattr(obj, "__dict__", None)
+ if obj_dict and hasattr(obj_dict, "get"):
+ ann = obj_dict.get("__annotations__", None)
+ if isinstance(ann, types.GetSetDescriptorType):
+ ann = None
+ else:
+ ann = None
+ else:
+ raise TypeError(f"{obj!r} is not a class.")
+
+ if not isinstance(ann, types.NoneType | dict):
+ raise ValueError(f"{obj!r}.__annotations__ is neither a dict nor None")
+
+ if not ann:
+ return {}
+
+ return dict(ann)
+
class InitVar:
__slots__ = ('type', )
@@ -322,7 +339,7 @@ def __init__(self, default, default_factory, init, repr, hash, compare,
self.kw_only = kw_only
self._field_type = None
- @_recursive_repr
+ @_recursive_repr()
def __repr__(self):
return ('Field('
f'name={self.name!r},'
@@ -632,7 +649,7 @@ def _repr_fn(fields, globals):
for f in fields]) +
')"'],
globals=globals)
- return _recursive_repr(fn)
+ return _recursive_repr()(fn)
def _frozen_get_del_attr(cls, fields, globals):
@@ -967,7 +984,7 @@ def _process_class(cls, init, repr, eq, order, unsafe_hash, frozen,
# actual default value. Pseudo-fields ClassVars and InitVars are
# included, despite the fact that they're not real fields. That's
# dealt with later.
- cls_annotations = inspect.get_annotations(cls)
+ cls_annotations = _get_annotations(cls)
# Now find fields in our class. While doing so, validate some
# things, and set the default values (as class attributes) where
@@ -1131,15 +1148,17 @@ def _process_class(cls, init, repr, eq, order, unsafe_hash, frozen,
# we're here the overwriting is unconditional.
cls.__hash__ = hash_action(cls, field_list, globals)
- if not getattr(cls, '__doc__'):
- # Create a class doc-string.
- try:
- # In some cases fetching a signature is not possible.
- # But, we surely should not fail in this case.
- text_sig = str(inspect.signature(cls)).replace(' -> None', '')
- except (TypeError, ValueError):
- text_sig = ''
- cls.__doc__ = (cls.__name__ + text_sig)
+ if sys.flags.optimize < 2: # Don't create a docstring in -OO mode
+ if not getattr(cls, '__doc__'):
+ # Create a class doc-string.
+ try:
+ # In some cases fetching a signature is not possible.
+ # But, we surely should not fail in this case.
+ import inspect
+ text_sig = str(inspect.signature(cls)).replace(' -> None', '')
+ except (TypeError, ValueError):
+ text_sig = ''
+ cls.__doc__ = (cls.__name__ + text_sig)
if match_args:
            # I could probably compute this once

With `./python -Ximporttime -c "import dataclasses"`:

Before: …
After: …
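Pulled out of the diff, the lazy-regex wrapper can be exercised on its own. This is a standalone copy of the `_DelayedRegexMatcher` idea above: the `re` import is deferred until the first `match()` call, so modules that never touch string annotations never pay for it.

```python
class _DelayedRegexMatcher:
    """Compile a regex lazily on first use, deferring the `re` import."""

    def __init__(self, match_str):
        self.match_str = match_str
        self._compiled_regex = None

    def match(self, ann):
        if self._compiled_regex is None:
            import re  # deferred: only paid if a match is actually needed
            self._compiled_regex = re.compile(self.match_str)
        return self._compiled_regex.match(ann)

# Same pattern as dataclasses' _MODULE_IDENTIFIER_RE.
matcher = _DelayedRegexMatcher(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')
m = matcher.match("typing.ClassVar[int]")
```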
If it seems worth it I could make a PR and investigate the effect of deferring …

Edit: I realised that you can use a non-data descriptor to delay the …

Additional changes for non-data descriptor:

diff --git a/Lib/dataclasses.py b/Lib/dataclasses.py
index 5ce90192f5..36fc17a2b5 100644
--- a/Lib/dataclasses.py
+++ b/Lib/dataclasses.py
@@ -233,6 +233,27 @@ def match(self, ann):
_MODULE_IDENTIFIER_RE = _DelayedRegexMatcher(r'^(?:\s*(\w+)\s*\.)?\s*(\w+)')
+# Descriptor used to defer `inspect` import until __doc__ is accessed
+class _DocstringMaker:
+ def __get__(self, obj, cls=None):
+ """
+ Create and return a class docstring, replacing this descriptor with the docstring.
+ """
+ import inspect
+
+ if cls is None:
+ cls = type(obj)
+ try:
+ # In some cases fetching a signature is not possible.
+ # But, we surely should not fail in this case.
+ text_sig = str(inspect.signature(cls)).replace(' -> None', '')
+ except (TypeError, ValueError):
+ text_sig = ''
+
+ new_docstring = cls.__name__ + text_sig
+ setattr(cls, "__doc__", new_docstring)
+ return new_docstring
+
# Atomic immutable types which don't require any recursive handling and for which deepcopy
# returns the same object. We can provide a fast-path for these types in asdict and astuple.
_ATOMIC_TYPES = frozenset({
@@ -1150,15 +1171,8 @@ def _process_class(cls, init, repr, eq, order, unsafe_hash, frozen,
if sys.flags.optimize < 2: # Don't create a docstring in -OO mode
if not getattr(cls, '__doc__'):
- # Create a class doc-string.
- try:
- # In some cases fetching a signature is not possible.
- # But, we surely should not fail in this case.
- import inspect
- text_sig = str(inspect.signature(cls)).replace(' -> None', '')
- except (TypeError, ValueError):
- text_sig = ''
- cls.__doc__ = (cls.__name__ + text_sig)
+ # Put a docstring maker in place
+ cls.__doc__ = _DocstringMaker()
if match_args:
            # I could probably compute this once
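For clarity, here is the descriptor trick as a self-contained sketch, using an illustrative `Point` class rather than dataclass-generated code. Because the first `__doc__` access writes the computed string back onto the class, the descriptor replaces itself: the signature (and the `inspect` import) is paid at most once, and only if someone actually reads the docstring.

```python
class _DocstringMaker:
    """Build a class docstring lazily, on first __doc__ access."""

    def __get__(self, obj, cls=None):
        import inspect  # deferred until a docstring is actually requested

        if cls is None:
            cls = type(obj)
        try:
            # In some cases fetching a signature is not possible,
            # but we should not fail in that case.
            text_sig = str(inspect.signature(cls)).replace(' -> None', '')
        except (TypeError, ValueError):
            text_sig = ''
        doc = cls.__name__ + text_sig
        cls.__doc__ = doc  # replace this descriptor with the plain string
        return doc

class Point:  # illustrative stand-in for a class processed by @dataclass
    __doc__ = _DocstringMaker()

    def __init__(self, x: int, y: int) -> None:
        self.x, self.y = x, y
```

Class-level `__doc__` access goes through `type`'s `__doc__` getter, which invokes a descriptor found in the class dict, so this works for `Point.__doc__` even though the descriptor lives on the class itself.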
I recently spent a sizeable amount of time trying to speed up the startup time of our CLI app, and was delighted to find this issue. Thanks so much @AlexWaygood for working on this! 👏 Another chunky stdlib module is `logging`:

diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py
index eb7e020d1e..7f7d44e5f5 100644
--- a/Lib/logging/__init__.py
+++ b/Lib/logging/__init__.py
@@ -23,7 +23,7 @@
To use, simply 'import logging' and log away!
"""
-import sys, os, time, io, re, traceback, warnings, weakref, collections.abc
+import sys, os, time, io, re, warnings, weakref, collections.abc
from types import GenericAlias
from string import Template
@@ -653,6 +653,7 @@ def formatException(self, ei):
This default implementation just uses
traceback.print_exception()
"""
+ import traceback
sio = io.StringIO()
tb = ei[2]
# See issues #9427, #1553375. Commented out for now.
@@ -1061,6 +1062,7 @@ def handleError(self, record):
The record which was being processed is passed in to this method.
"""
if raiseExceptions and sys.stderr: # see issue 13807
+ import traceback
exc = sys.exception()
try:
sys.stderr.write('--- Logging error ---\n')
@@ -1601,6 +1603,7 @@ def findCaller(self, stack_info=False, stacklevel=1):
co = f.f_code
sinfo = None
if stack_info:
+ import traceback
with io.StringIO() as sio:
sio.write("Stack (most recent call last):\n")
            traceback.print_stack(f, file=sio)
I would like to echo the elation of the previous commenter when I encountered this ongoing effort today. As a maintainer of various CLIs and also two personal apps that run on AWS Lambda/Google Cloud Functions, this will help me tremendously! Thank you very, very much 🙂
Sorry for the slow response @danielhollas! That indeed looks like a pretty good idea to me. Feel free to make a PR, and ping me on the PR for review :)
You're very welcome :)
Delayed import of traceback results in 2-5ms speedup. Issue python#109653
Delayed import of traceback results in ~16% speedup. Issue python#109653
Feature or enhancement
Proposal:
As noted in https://discuss.python.org/t/deferred-computation-evalution-for-toplevels-imports-and-dataclasses/34173, `typing` isn't the slowest stdlib module in terms of import time, but neither is it one of the quickest. We should speed it up, if possible.

Links to previous discussion of this feature:
https://discuss.python.org/t/deferred-computation-evalution-for-toplevels-imports-and-dataclasses/34173
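For anyone reproducing the measurements in this thread, `-X importtime` is the standard tool; a small harness like the following works from within Python (the `typing` target is just an example, any module works).

```python
# Run a fresh interpreter with -X importtime and capture its report.
# The report goes to stderr, one "import time:" line per module, with
# self and cumulative microseconds for each import.
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import typing"],
    capture_output=True,
    text=True,
)
for line in result.stderr.splitlines():
    if line.endswith(" typing"):
        print(line)  # cumulative cost of importing typing
```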
Linked PRs
- typing.py: improve import time by creating soft-deprecated members on demand #109651
- … `enum` import time by avoiding import of `functools` #109789
- … `Lib/` directory #109803
- … `types` in `functools` #109804
- … `recursive_repr` in `dataclasses` #109822
- … `email.utils` #109824
- … `importlib.metadata._adapters` #109829
- … `random` #110221
- … `random` (GH-110221) #110247
- … `warnings` in several modules #110286
- … `logging` by lazy loading `traceback` #112995