Merged
Changes from 1 commit
Commits
45 commits
19b28fc
CVE-2020-10735: Prevent DoS by very large int()
tiran May 5, 2020
0a96b20
Default to disable, improve tests and docs
tiran Jan 19, 2022
88f6d5d
fix typo
tiran Jan 19, 2022
70c195e
More docs (WIP)
tiran Jan 19, 2022
e17e93b
Basic documentation for sys functions
tiran Jan 19, 2022
fbd14b7
Use ValueError, ignore underscore, scale limit
tiran Jan 20, 2022
dd74d70
Fix CI
tiran Jan 20, 2022
0e01461
Address Greg's review
tiran Aug 1, 2022
0b21e5f
Fix sys.flags len and docs
tiran Aug 1, 2022
3b38abe
Keep the warning, but remove advice about limiting input length in th…
gpshead Aug 2, 2022
37193ed
Renamed the APIs & too many other refactorings.
gpshead Aug 5, 2022
c90b79f
Improve the configuring docs.
gpshead Aug 7, 2022
fea25ea
Stop tying to base10, just use string digits.
gpshead Aug 7, 2022
ac9f22f
Remove the added now-unneeded helper log tbl fn.
gpshead Aug 7, 2022
da72dd1
prevent intdostimeit from emitting errors in test_tools.
gpshead Aug 7, 2022
d7e4d7b
Remove a leftover base 10 reference. clarify.
gpshead Aug 7, 2022
5c7e6d5
versionadded/changed to 3.12
gpshead Aug 7, 2022
61a5bc9
Link to the CVE from the main doc.
gpshead Aug 7, 2022
c15adde
Add a What's New entry.
gpshead Aug 7, 2022
76ae1c2
Add a Misc/NEWS.d entry.
gpshead Aug 7, 2022
1ad88f5
Undo addition to PyConfig to ease backporting.
gpshead Aug 8, 2022
0c83111
Remove the Tools/scripts/ example and timing code.
gpshead Aug 8, 2022
5d39ab6
un-add the <math.h> include (not needed for PR anymore)
gpshead Aug 8, 2022
5b77b3e
Remove added unused imports.
gpshead Aug 8, 2022
de00cdc
Tabs -> Spaces
gpshead Aug 8, 2022
3cc8553
make html and make doctest in Doc pass.
gpshead Aug 8, 2022
da97e65
Raise the default limit and the threshold.
gpshead Aug 10, 2022
ef03a16
Remove xmlrpc.client changes, test-only.
gpshead Aug 12, 2022
e916845
Rearrange the new stdtypes docs, w/limits + caution.
gpshead Aug 13, 2022
101502e
Make a huge int a SyntaxError with lineno when parsing.
gpshead Aug 16, 2022
fa8a58a
Mention the chosen default in the NEWS entry.
gpshead Aug 16, 2022
313ab6d
Properly clear & free the prior exception.
gpshead Aug 16, 2022
614cd02
Add a note to the float.as_integer_ratio() docs.
gpshead Aug 17, 2022
16ad090
Clarify the documentation wording and error msg.
gpshead Aug 17, 2022
4eb72e6
Fix test_idle, it used a long int on a line.
gpshead Aug 17, 2022
da36550
Rename the test.support context manager and document it.
gpshead Aug 19, 2022
f4372cc
Documentation cleanup.
gpshead Aug 19, 2022
c421853
Update attribution in Misc/NEWS.d
gpshead Aug 25, 2022
9f2168a
Regen global strings
tiran Sep 1, 2022
3c8504b
Make the doctest actually run & fix it.
gpshead Sep 1, 2022
1586419
Fix the docs build.
gpshead Sep 2, 2022
94bd3ee
Rename the news file to appease the Bedevere bot.
gpshead Sep 2, 2022
0b91f65
Regen argument clinic after the rebase merge.
gpshead Sep 2, 2022
02776f9
Hexi hexa
tiran Sep 2, 2022
173fa4e
Hexi hexa 2
tiran Sep 2, 2022
Use ValueError, ignore underscore, scale limit
tiran authored and gpshead committed Sep 2, 2022
commit fbd14b753d2a27d51a89322c4f2aaf5ad24a0102
6 changes: 4 additions & 2 deletions Doc/library/functions.rst
@@ -911,8 +911,10 @@ are always available. They are listed here in alphabetical order.
The delegation to :meth:`__trunc__` is deprecated.

.. versionchanged:: 3.12
:class:`int` string inputs are now limited, see :ref:`int maximum
digits limitation <intmaxdigits>`.
:class:`int` string inputs and string representation can be limited.
A :exc:`ValueError` is raised when the input or string representation
exceeds the limit. See :ref:`int maximum
digits limitation <intmaxdigits>` for more information.

.. function:: isinstance(object, classinfo)

47 changes: 32 additions & 15 deletions Doc/library/stdtypes.rst
@@ -5466,16 +5466,31 @@ Integer maximum digits limitation
=================================

CPython has a global limit for converting between :class:`int` and :class:`str`
to mitigate denial of service attacks. The limit is necessary because there
exists no efficient algorithm, that can convert a string to an integer or
to mitigate denial of service attacks. The limit is necessary because Python's
integer type is an arbitrary-length number (also known as a bignum). There
exists no efficient algorithm that can convert a string to an integer or
an integer to a string in linear time, unless the base is a power of *2*. Even
the best known algorithms for base *10* have sub-quadratic complexity. A large
input like::

int('1' * 500_000)
input like ``int('1' * 500_000)`` takes about a second at 100% CPU load on
an X86_64 CPU from 2020 with 4.2 GHz max frequency.

The limit value uses base 10 as a reference point and scales with base.
That means :class:`int` accepts longer input strings for smaller bases and
fails earlier for larger bases. Underscores in input strings don't count
towards the limit.

When an operation exceeds the limit, a :exc:`ValueError` is raised::

>>> sys.setintmaxdigits(2048)
>>> i = 10 ** 2047
>>> len(str(i))
2048
>>> i = 10 ** 2048
>>> len(str(i))
Traceback (most recent call last):
...
ValueError: input exceeds maximum integer digit limit

takes about a second at 100% CPU load on an X86_64 CPU from 2020 with 4.2 GHz
max frequency.

Configure limitations
---------------------
@@ -5494,13 +5509,8 @@ Configure limitations
precedence. The flag defaults to *-1*.

* :func:`sys.getintmaxdigits` and :func:`sys.setintmaxdigits` are getter
and setter for interpreter-wide limit.

Recommended configuration::

import sys
if hasattr(sys.flags, "intmaxdigits") and sys.flags.intmaxdigits == -1:
sys.setintmaxdigits(4096)
and setter for the interpreter-wide limit. Subinterpreters have their own
limit.

Affected APIs
-------------
@@ -5521,10 +5531,17 @@ The limitations do not apply to functions with a linear algorithm:
* :func:`int.from_bytes` and :func:`int.to_bytes`
* :func:`hex`, :func:`oct`, :func:`bin` (the resulting string may consume
a substantial amount of memory)
* :ref:`formatspec` for hex, octet, and binary types
* :ref:`formatspec` for hex, octal, and binary numbers
* :class:`str` to :class:`float`
* :class:`str` to :class:`decimal.Decimal`

Recommended configuration
-------------------------

import sys
if hasattr(sys.flags, "intmaxdigits") and sys.flags.intmaxdigits == -1:
sys.setintmaxdigits(4096)


.. rubric:: Footnotes

24 changes: 20 additions & 4 deletions Lib/test/test_int.py
@@ -591,24 +591,40 @@ def _test_maxdigits(self, c):
i = c('1' * 100_000)
str(i)

# OverflowError
def check(i, base=None):
with self.assertRaises(OverflowError):
with self.assertRaises(ValueError):
if base is None:
c(i)
else:
c(i, base)

maxdigits = 1024
maxdigits = 2048
with support.setintmaxdigits(maxdigits):
assert maxdigits == sys.getintmaxdigits()
check('1' * (maxdigits + 1))
check('+' + '1' * (maxdigits + 1))
check('1' * (maxdigits + 1))

i = 10 ** maxdigits
with self.assertRaises(OverflowError):
with self.assertRaises(ValueError):
str(i)

# ignore power of two
for base in (2, 4, 8, 16, 32):
c('1' * (maxdigits + 1), base)
c('1' * 100_000, base)

# limit ignores underscores
s = '1111_' * ((maxdigits) // 4)
s = s[:-1]
int(s)
check(s + '1')

# limit is in equivalent of base 10 digits
s = '1' * 2147
assert len(str(int(s, 9))) == maxdigits
int(s + '1', 9)

def test_maxdigits(self):
self._test_maxdigits(int)
self._test_maxdigits(IntSubclass)
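
The two boundary cases added in this hunk (underscores and base 9) rest on arithmetic that can be checked without touching the limit at all. The sketch below is an illustration only and is not part of the PR. The variable names are made up here, and it uses `math.log10` instead of `str()` so the check does not depend on any interpreter-wide setting.

```python
import math

maxdigits = 2048  # value used by the updated test

# Underscore case: 512 groups of "1111_" with the trailing underscore removed.
s = ("1111_" * (maxdigits // 4))[:-1]
digit_count = len(s) - s.count("_")  # digits that count towards the limit

# Base-9 case: '1' * 2147 read in base 9 is the repunit (9**2147 - 1) // 8.
value = (9 ** 2147 - 1) // 8
# Number of decimal digits, computed without an int -> str conversion.
decimal_digits = math.floor(math.log10(value)) + 1

print(digit_count, decimal_digits)
```

Both numbers come out equal to `maxdigits`, which is consistent with the test's `assert len(str(int(s, 9))) == maxdigits` and with `int(s)` being accepted while `check(s + '1')` expects a `ValueError`.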
2 changes: 1 addition & 1 deletion Lib/test/test_json/test_decode.py
@@ -101,7 +101,7 @@ def test_limit_int(self):
maxdigits = 5000
with support.setintmaxdigits(maxdigits):
self.loads('1' * maxdigits)
with self.assertRaises(OverflowError):
with self.assertRaises(ValueError):
self.loads('1' * (maxdigits + 1))


2 changes: 1 addition & 1 deletion Lib/test/test_xmlrpc.py
@@ -293,7 +293,7 @@ def test_limit_int(self):
check = self.check_loads
with self.assertRaises(OverflowError):
check('<int>123456780123456789</int>', None)
with self.assertRaises(OverflowError):
with self.assertRaises(ValueError):
maxdigits = 5000
with support.setintmaxdigits(maxdigits):
s = '1' * (maxdigits + 1)
24 changes: 19 additions & 5 deletions Objects/longobject.c
@@ -1819,8 +1819,8 @@ long_to_decimal_string_internal(PyObject *aa,
PyInterpreterState *interp = _PyInterpreterState_GET();
if ((interp->intmaxdigits > 0) && (strlen > interp->intmaxdigits)) {
Py_DECREF(scratch);
PyErr_SetString(PyExc_OverflowError,
"too many digits in integer");
PyErr_SetString(PyExc_ValueError,
"input exceeds maximum integer digit limit");
return -1;
}
}
@@ -2434,6 +2434,7 @@ digit beyond the first.
twodigits c; /* current input character */
Py_ssize_t size_z;
Py_ssize_t digits = 0;
Py_ssize_t underscores = 0;
int i;
int convwidth;
twodigits convmultmax, convmult;
@@ -2470,6 +2471,7 @@ digit beyond the first.

while (_PyLong_DigitValue[Py_CHARMASK(*scan)] < base || *scan == '_') {
if (*scan == '_') {
++underscores;
if (prev == '_') {
/* Only one underscore allowed. */
str = lastdigit + 1;
@@ -2490,12 +2492,24 @@ digit beyond the first.
goto onError;
}

slen = scan - str;
/* intmaxdigits limit ignores underscores and uses base 10
* as reference point.
* For other bases slen is transformed into base 10 equivalents.
* Our string to integer conversion algorithm scales less than
* linear with base value, for example int('1' * 300_000, 30)
* is slightly more than five times slower than int(..., 5).
* The naive scaling "slen / 10 * base" is close enough to
* compensate.
*/
slen = scan - str - underscores;
if (base != 10) {
slen = (Py_ssize_t)(slen / 10 * base);
}
if (slen > _PY_LONG_MAX_DIGITS_THRESHOLD) {
PyInterpreterState *interp = _PyInterpreterState_GET();
if ((interp->intmaxdigits > 0 ) && (slen > interp->intmaxdigits)) {
PyErr_SetString(PyExc_OverflowError,
"too many digits in integer");
PyErr_SetString(PyExc_ValueError,
"input exceeds maximum integer digit limit");
return NULL;
}
}
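
The length check in this hunk can be modeled in a few lines of Python. This is a rough sketch for reasoning about the scaling, not the implementation: `scaled_digit_count` and `exceeds_limit` are hypothetical names, the input is assumed to contain only digits and underscores (no sign or prefix), the `_PY_LONG_MAX_DIGITS_THRESHOLD` pre-check is left out because its value is not shown here, and the power-of-two shortcut is inferred from the "ignore power of two" test rather than from this hunk.

```python
def scaled_digit_count(text: str, base: int = 10) -> int:
    """Digits in text, ignoring underscores, scaled to base-10 equivalents."""
    digits = len(text) - text.count("_")
    if base != 10:
        # Mirrors the C expression slen / 10 * base: the integer division
        # happens first, so the result is rounded down to a multiple of base.
        digits = digits // 10 * base
    return digits


def exceeds_limit(text: str, base: int = 10, max_digits: int = 2048) -> bool:
    """Return True if this model would reject the input string."""
    if max_digits <= 0:
        # A non-positive limit disables the check (intmaxdigits > 0 in the C code).
        return False
    if base & (base - 1) == 0:
        # Assumption: power-of-two bases take a separate, unlimited path.
        return False
    return scaled_digit_count(text, base) > max_digits


print(scaled_digit_count("1" * 2148, 9))  # 214 * 9
print(exceeds_limit("1" * 2049))
print(exceeds_limit("1" * 700, 30))
```

Under this model a base-9 string of 2148 digits scales to 1926 and passes a limit of 2048, which matches the test's `int(s + '1', 9)` succeeding, while a base-30 string of only 700 digits scales to 2100 and is rejected.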