Runtime switch between a ‘fast’ and a ‘generic’ ASCII print function #402

Manually merged
dnkl merged 3 commits from optimized-print-function into master 9 months ago
dnkl commented 9 months ago
Owner

This is an alternative PR to #401, taking things a step further.

term_print() is called whenever the client application “prints” something to the grid. It is called for both ASCII and UTF-8 characters, and needs to handle sixels, insert mode and ASCII vs. graphical charsets.

Since it’s on the hot path, this becomes unnecessarily slow.

This patch adds a “fast” version of term_print(), tailored for the common case: ASCII characters in non-insert mode, without any sixels and non-graphical charsets.

A new function, term_update_ascii_printer(), has been added, and must be called whenever:

  • The currently selected charset index changes
  • The currently selected charset changes (from ASCII to graphical, or vice verse)
  • Sixels are added to the grid
  • Sixels are removed from the grid
  • Insert mode is enabled/disabled
  • Switching to/from alt screen (since that may change how many sixels we have)

Benchmark results:

Before After
alt-random 0.438s ±0.004 0.396s ±0.012
alt-random-colors 0.429s ±0.008 0.398s ±0.006
scrolling 1.255s ±0.038 1.296s ±0.025
scrolling-filled-lines 0.568s ±0.036 0.594s ±0.011

That's a ~10% performance boost, which isn't too bad considering this code path is already fairly optimized. As expected, no difference in the scrolling benchmarks.

The downside is all the calls to term_update_ascii_printer() spread out over the code base. Not really fond of that.

This is an alternative PR to https://codeberg.org/dnkl/foot/pulls/401, taking things a step further. `term_print()` is called whenever the client application “prints” something to the grid. It is called for both ASCII and UTF-8 characters, and needs to handle sixels, insert mode and ASCII vs. graphical charsets. Since it’s on the hot path, this becomes unnecessarily slow. This patch adds a “fast” version of `term_print()`, tailored for the common case: ASCII characters in non-insert mode, without any sixels and non-graphical charsets. A new function, `term_update_ascii_printer()`, has been added, and must be called whenever: * The currently selected charset **index** changes * The currently selected charset changes (from ASCII to graphical, or vice verse) * Sixels are added to the grid * Sixels are removed from the grid * Insert mode is enabled/disabled * Switching to/from alt screen (since that may change how many sixels we have) Benchmark results: | | Before | After | | ---------------------- | ------------- | ------------- | | alt-random | 0.438s ±0.004 | 0.396s ±0.012 | | alt-random-colors | 0.429s ±0.008 | 0.398s ±0.006 | | scrolling | 1.255s ±0.038 | 1.296s ±0.025 | | scrolling-filled-lines | 0.568s ±0.036 | 0.594s ±0.011 | That's a ~10% performance boost, which isn't too bad considering this code path is already fairly optimized. As expected, no difference in the scrolling benchmarks. The downside is all the calls to `term_update_ascii_printer()` spread out over the code base. Not really fond of that.
dnkl added the
performance
label 9 months ago
dnkl force-pushed optimized-print-function from 13b59cd3bd to 3ebc96fd1b 9 months ago
Poster
Owner

I tested this on another system, and while I don't have the raw numbers here, the speed up was closer to 20% there.

With numbers in that range, I think I can accept having to call term_update_ascii_printer() in a couple of places.

I tested this on another system, and while I don't have the raw numbers here, the speed up was closer to 20% there. With numbers in that range, I think I can accept having to call `term_update_ascii_printer()` in a couple of places.
dnkl force-pushed optimized-print-function from 3ebc96fd1b to 60b3ccc641 9 months ago
dnkl added 1 commit 9 months ago
dnkl added 1 commit 9 months ago
d271119032
csi: update ASCII printer function pointer when switching to/from alt screen
dnkl merged commit 749ae49c48 into master manually 9 months ago
The pull request has been manually merged as 749ae49c48.
Sign in to join this conversation.
Loading…
There is no content yet.