perf: add literal-pattern fast path to split() by mattn · Pull Request #19708 · vim/vim

mattn · 2026-03-16T04:04:45Z

split() with a literal separator (e.g. ",", ":", "abc") is an extremely common pattern in Vim script, yet it currently goes through the full regexp compile-and-match path every time. This patch adds a fast path that detects patterns containing no regexp metacharacters and uses strstr() to scan instead, skipping vim_regcomp() / vim_regexec() entirely. Multi-byte characters are handled safely via mb_ptr2len().

Regexp patterns and the default whitespace pattern are unaffected and still take the existing code path.
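The idea behind the fast path can be sketched outside of Vim as follows. This is only an illustration under stated assumptions, not the code from the patch: `split_literal` and the `piece_fn` callback are hypothetical names, plain `char *` stands in for `char_u *`, and the handling of empty items follows the documented split() behaviour (an empty first or last item is dropped unless `keepempty` is set, while empty items in the middle are kept).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical callback: receives each piece as (start, length).
typedef void (*piece_fn)(const char *start, size_t len, void *ctx);

// Split `str` on the literal separator `sep` with strstr(); no regexp is
// compiled or executed.  Returns the number of pieces emitted.
static size_t
split_literal(const char *str, const char *sep, int keepempty,
              piece_fn emit, void *ctx)
{
    const char *begin = str;
    size_t seplen;
    size_t count = 0;

    assert(str != NULL && sep != NULL && *sep != '\0');
    seplen = strlen(sep);

    for (;;)
    {
        const char *hit = strstr(str, sep);
        // strlen() only runs when no separator is left (the final piece).
        size_t len = hit != NULL ? (size_t)(hit - str) : strlen(str);
        int at_edge = (str == begin) || (hit == NULL);

        // Drop an empty piece only at the very start or very end.
        if (len > 0 || keepempty || !at_edge)
        {
            if (emit != NULL)
                emit(str, len, ctx);
            ++count;
        }
        if (hit == NULL)
            break;
        str = hit + seplen;     // continue after the separator
    }
    return count;
}
```

Passing `NULL` for `emit` just counts the pieces, which is convenient for checking the edge cases: `",a,"` yields one piece by default and three with `keepempty` set.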

Benchmark: 200,000 iterations per case

Pattern	Before	After	Speedup
`','` (literal 1-char)	11.284 s	2.966 s	3.8×
`'abc'` (literal multi-char)	5.919 s	3.350 s	1.8×
default (whitespace)	12.204 s	11.920 s	1.0×
`',\+'` (regexp)	9.675 s	9.633 s	1.0×

When the pattern passed to split() is a single plain byte (not a regexp metacharacter), bypass vim_regcomp/vim_regexec entirely and scan with vim_strchr() instead. This avoids regex compilation and matching overhead for the very common case of splitting on a literal character such as "," or ":".

Generalize the fast path from single-byte literals to any pattern that contains no regexp metacharacters. Use mb_ptr2len() to safely skip multi-byte characters when scanning for metacharacters, and strstr() for the actual splitting.

char101 · 2026-03-16T06:49:28Z

How about adding a condition that the previous char is not \ in

for (p = pat; *p != NUL; p += mb_ptr2len(p))
	if (*p < 0x80
		&& vim_strchr((char_u *)".^$~[]\\*?+|{}()", *p) != NULL)
	    return FALSE;

that will make \.\. literal.

EDIT: I guess that will require \.\. to be compiled first by the regex engine to make it literal so this can't work.

chrisbra · 2026-03-17T19:58:41Z

src/evalfunc.c

+	while (*str != NUL || keepempty)
+	{
+	    p = (char_u *)strstr((char *)str, (char *)pat);
+	    end = p == NULL ? str + STRLEN(str) : p;


Can we avoid the strlen() inside the loop?
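One way to address this, sketched under assumptions: compute the end of the input once before the loop and reuse it when strstr() finds nothing. In the quoted snippet STRLEN() is only evaluated when `p == NULL`, so the cost is likely small already, but hoisting makes that explicit. `piece_lengths` is a hypothetical helper, not part of the patch, and it records piece lengths only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Store the length of each piece of `str` split on the literal `pat`
// into `lens` (at most `max` entries).  Returns the number of entries.
static size_t
piece_lengths(const char *str, const char *pat, size_t *lens, size_t max)
{
    const char *str_end;
    size_t patlen;
    size_t n = 0;

    assert(str != NULL && pat != NULL && *pat != '\0' && lens != NULL);
    str_end = str + strlen(str);    // computed once, outside the loop
    patlen = strlen(pat);

    while (n < max)
    {
        const char *p = strstr(str, pat);
        const char *end = p == NULL ? str_end : p;  // no strlen() here

        lens[n++] = (size_t)(end - str);
        if (p == NULL)
            break;
        str = p + patlen;
    }
    return n;
}
```

The trade-off is one full pass over the input up front, in exchange for a loop body that never rescans the tail.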

chrisbra · 2026-03-17T20:01:07Z

src/evalfunc.c

+	patlen = (int)STRLEN(pat);
+	while (*str != NUL || keepempty)
+	{
+	    p = (char_u *)strstr((char *)str, (char *)pat);


Hm, does strstr() handle non utf-8 multibyte chars correctly?
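The concern can be illustrated with a double-byte encoding in the Shift_JIS family, where the trail byte of a two-byte character may fall in the ASCII range 0x40-0x7E. A byte-wise strstr() can then report a match that starts in the middle of a character. The sketch below is an assumption-laden illustration, not Vim code: `sjis_char_len` is a simplified stand-in for mb_ptr2len(), and `find_on_boundary` is a hypothetical helper that only accepts matches on character boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Simplified stand-in for mb_ptr2len() under a Shift_JIS-like encoding:
// lead bytes 0x81-0x9F and 0xE0-0xFC start a two-byte character.
static size_t
sjis_char_len(const unsigned char *p)
{
    if (((p[0] >= 0x81 && p[0] <= 0x9F) || (p[0] >= 0xE0 && p[0] <= 0xFC))
            && p[1] != '\0')
        return 2;
    return 1;
}

// Like strstr(), but only reports matches that begin on a character
// boundary, so a trail byte is never mistaken for the separator.
static const char *
find_on_boundary(const char *str, const char *pat)
{
    const unsigned char *p;
    size_t patlen;

    assert(str != NULL && pat != NULL && *pat != '\0');
    patlen = strlen(pat);
    for (p = (const unsigned char *)str; *p != '\0'; p += sjis_char_len(p))
        if (strncmp((const char *)p, pat, patlen) == 0)
            return (const char *)p;
    return NULL;
}
```

For the bytes `0x83 0x41 'A'` and the separator `"A"`, strstr() matches at offset 1 (inside the two-byte character), while the boundary-aware search matches at offset 2. UTF-8 does not have this problem because its trail bytes are always in the range 0x80-0xBF.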

Copilot

Pull request overview

This PR optimizes the Vimscript split() builtin by adding a fast path for purely-literal separator patterns, avoiding regex compilation/execution for common cases while leaving regexp and default-whitespace behavior on the existing code path.

Changes:

- Add is_literal_pat() helper to detect patterns with no regexp metacharacters (with multibyte-safe scanning).
- Implement a literal-separator split loop using strstr() and byte-length advancement instead of vim_regcomp()/vim_regexec().


src/evalfunc.c

+					&& *str != NUL && p != NULL
+					&& end < p + patlen))


src/evalfunc.c

+    static int
+is_literal_pat(char_u *pat)
+{
+    char_u  *p;
+
+    if (pat == NULL || *pat == NUL)
+	return FALSE;
+
+    // Check that no character in the pattern has regexp meaning.
+    // Use mb_ptr2len() to skip over multi-byte characters safely so that
+    // trail bytes are never mistaken for ASCII metacharacters.
+    for (p = pat; *p != NUL; p += mb_ptr2len(p))
+	if (*p < 0x80
+		&& vim_strchr((char_u *)".^$~[]\\*?+|{}()", *p) != NULL)
+	    return FALSE;
+
+    return TRUE;
+}
+
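To see which patterns the helper above would classify as literal, here is a standalone sketch of the same check. It is not the patch itself: `utf8_len` is a hypothetical stand-in for mb_ptr2len() that assumes UTF-8 input, strchr() replaces vim_strchr(), and `is_literal` is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for mb_ptr2len(), assuming UTF-8.  A truncated or
// invalid sequence advances by one byte so the scan never passes the NUL.
static size_t
utf8_len(const unsigned char *p)
{
    size_t n, i;

    assert(p != NULL);
    if (p[0] < 0x80)
        return 1;
    else if ((p[0] & 0xE0) == 0xC0)
        n = 2;
    else if ((p[0] & 0xF0) == 0xE0)
        n = 3;
    else if ((p[0] & 0xF8) == 0xF0)
        n = 4;
    else
        return 1;
    for (i = 1; i < n; ++i)
        if ((p[i] & 0xC0) != 0x80)
            return 1;
    return n;
}

// Return 1 when `pat` is non-empty and contains none of the ASCII
// metacharacters listed in the patch, 0 otherwise.
static int
is_literal(const char *pat)
{
    const unsigned char *p;

    if (pat == NULL || *pat == '\0')
        return 0;
    for (p = (const unsigned char *)pat; *p != '\0'; p += utf8_len(p))
        if (*p < 0x80 && strchr(".^$~[]\\*?+|{}()", *p) != NULL)
            return 0;
    return 1;
}
```

With this check, `","` and `"abc"` take the fast path, while `",\+"` and `"a.b"` fall back to the regexp engine; an empty pattern is also rejected.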


mattn added 3 commits March 16, 2026 13:03

fix: preserve empty split items

eb071f9

mattn changed the title ~~perf: fast path for split() with a single-byte literal separator~~ perf: add literal-pattern fast path to split() Mar 16, 2026

mattn changed the title ~~perf: add literal-pattern fast path to split()~~ perf/do_string_sub-literal-copy Mar 16, 2026

mattn changed the title ~~perf/do_string_sub-literal-copy~~ perf: add literal-pattern fast path to split() Mar 16, 2026

fix: avoid codestyle false positive

4f407db

chrisbra reviewed Mar 17, 2026


chrisbra requested a review from Copilot March 17, 2026 20:01

Copilot started reviewing on behalf of chrisbra March 17, 2026 20:02

Copilot AI reviewed Mar 17, 2026

mattn wants to merge 4 commits into vim:master from mattn:perf/split-literal-fastpath


4 participants
