http.server parses HTTP version numbers too permissively.
http.server accepts request lines whose HTTP version numbers contain '_', '+', and '-'.
Reproduction steps:
(Requires netcat)
python3 -m http.server --bind 127.0.0.1
printf 'GET / HTTP/-9_9_9.+9_9_9\r\n\r\n' | nc 127.0.0.1 8000
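To see why the request above is accepted, here is a simplified sketch of int-based version parsing. It is an illustration only, not the actual http.server source, and the function name `parse_version_loosely` is hypothetical. It assumes the version string is split on "/" and "." and that each part is passed straight to int().

```python
def parse_version_loosely(version: str) -> tuple[int, int]:
    """Parse 'HTTP/<major>.<minor>' by handing each part to int().

    Hypothetical helper for illustration; it mimics the permissive
    behavior described in this report, not the real implementation.
    """
    if not version.startswith("HTTP/"):
        raise ValueError(f"bad HTTP version: {version!r}")
    major, minor = version.split("/", 1)[1].split(".")
    # int() accepts a leading sign and underscores between digits,
    # so '-9_9_9' and '+9_9_9' both parse without error.
    return int(major), int(minor)


print(parse_version_loosely("HTTP/-9_9_9.+9_9_9"))  # (-999, 999)
```

Under these assumptions the malformed version from the reproduction parses to (-999, 999) instead of being rejected.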
Justification
Here are the HTTP-version definitions from each of the three HTTP/1.1 RFCs:
RFC 2616:
HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
RFC 7230:
HTTP-version = HTTP-name "/" DIGIT "." DIGIT
HTTP-name = %x48.54.54.50 ; "HTTP", case-sensitive
RFC 9112:
HTTP-version = HTTP-name "/" DIGIT "." DIGIT
HTTP-name = %s"HTTP"
I understand allowing multiple digits for backwards compatibility with RFC 2616, but I don't think it makes sense to let the parsing quirks of Python's int() leak out into the world. At a minimum, we should ensure that only digits are permitted in HTTP version numbers.
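One possible way to enforce this is sketched below. It is not a proposed patch: the names `_VERSION_RE` and `parse_http_version` are hypothetical, and the sketch assumes we keep the multi-digit form from RFC 2616 while restricting each component to ASCII digits.

```python
import re

# Case-sensitive "HTTP", then 1*DIGIT "." 1*DIGIT as in RFC 2616.
# [0-9] is used instead of \d so that non-ASCII digits are rejected too.
_VERSION_RE = re.compile(r"HTTP/([0-9]+)\.([0-9]+)")


def parse_http_version(version: str) -> tuple[int, int]:
    """Return (major, minor), or raise ValueError for a malformed version."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"malformed HTTP version: {version!r}")
    # Both groups are known to be plain ASCII digits here, so int()
    # cannot be reached with signs, underscores, or whitespace.
    return int(match.group(1)), int(match.group(2))
```

With this check, "HTTP/1.1" and "HTTP/10.12" still parse, while "HTTP/-9_9_9.+9_9_9" raises ValueError, which a server could translate into a 400 response.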
My environment
- CPython 3.12.0a6+
- Operating system and architecture: Arch Linux on x86_64