Text reflow is too slow #504

Closed
opened 6 months ago by dnkl · 5 comments
dnkl commented 6 months ago
Owner

Run ls --hyperlink=always /usr/bin ls /use/bin a couple of times, then grab the corner of the terminal window and resize the window back and forth. It is slow. Or if initially fast, gets slower and slower and slower.

The reason is the inefficient remapping of OSC-8 start and endpoints when reflowing text.

Labling as a bug, because this is just too slow.

Run ~~`ls --hyperlink=always /usr/bin`~~ `ls /use/bin` a couple of times, then grab the corner of the terminal window and resize the window back and forth. It is _slow_. Or if initially fast, gets slower and slower and slower. ~~The reason is the inefficient remapping of OSC-8 start and endpoints when reflowing text.~~ Labling as a bug, because this is just too slow.
dnkl added the
bug
performance
labels 6 months ago
dnkl changed title from OSC-8 hyperlinks makes text reflow slooow to Text reflow is too slow 6 months ago
Poster
Owner

While OSC-8 doesn't make things faster, reflow is way too slow even without them.

This needs to be addressed.

While OSC-8 doesn't make things faster, reflow is way too slow even without them. This needs to be addressed.
Poster
Owner

wcwidth() is a major issue here.

We should be able to avoid
it completely and use our SPACERs instead.

`wcwidth()` is a major issue here. We should be able to avoid it completely and use our SPACERs instead.
Poster
Owner

I think there are several things that needs to be done here, but the first one should be getting rid of wcwidth().

The way to do this is by using CELL_MULT_COL_SPACER. Unfortunately, we cannot do that as of today. It is used in two ways: first, it's printed after a multi-column character, as a sort of padding (also makes it possible to detect multi-column characters given a random cell in the grid). But it is also used as padding before multi-column characters, as padding, to ensure a multi-column character isn't printed on/over the right margin.

This makes it very difficult to use it while reflowing the grid.

I think the best way to solve this is by redefining CELL_MULT_COL_SPACER, from a single value, to being a base value, to which we add the number of remaining spacers, including itself.

So if we have a character whose wcwidth is 4, we'd print <c> <4> <3> <2> <1>. The pre-pad variant could use the base value itself. So if the application prints a double-width character in the last column, what we'd actually end up printing is:

                <0>
<c> <1>

That way, we always know what a spacer means. And then we should be able to use this instead of having to call wcwidth() while reflowing the grid.

And please rename it to CELL_SPACER!

Next, the reflow logic currently "collects" empty cells, and only emits them under certain conditions. However, when we do emit them, they are treated as regular cells. I.e. we run lots of logic that inspects the Unicode point and does multi-column handling. This is completely unnecessary; we know all the properties of an empty cell. Make use of this.

After that, we can start looking at the handling of URIs, and tracking points in general.

I think there are several things that needs to be done here, but the first one should be getting rid of `wcwidth()`. The way to do this is by using `CELL_MULT_COL_SPACER`. Unfortunately, we cannot do that as of today. It is used in two ways: first, it's printed **after** a multi-column character, as a sort of padding (also makes it possible to detect multi-column characters given a random cell in the grid). But it is also used as padding **before** multi-column characters, as padding, to ensure a multi-column character isn't printed on/over the right margin. This makes it very difficult to use it while reflowing the grid. I think the best way to solve this is by redefining `CELL_MULT_COL_SPACER`, from a single value, to being a base value, to which we add the number of remaining spacers, including itself. So if we have a character whose wcwidth is 4, we'd print `<c> <4> <3> <2> <1>`. The pre-pad variant could use the base value itself. So if the application prints a double-width character in the last column, what we'd actually end up printing is: ``` <0> <c> <1> ``` That way, we always know what a spacer _means_. And then we should be able to use this instead of having to call `wcwidth()` while reflowing the grid. And please rename it to `CELL_SPACER`! Next, the reflow logic currently "collects" empty cells, and only emits them under certain conditions. However, when we _do_ emit them, they are treated as regular cells. I.e. we run lots of logic that inspects the Unicode point and does multi-column handling. This is completely unnecessary; we **know** all the properties of an empty cell. Make use of this. After that, we can start looking at the handling of URIs, and tracking points in general.
dnkl referenced this issue from a commit 5 months ago
dnkl added this to the 1.8.0 milestone 5 months ago
Poster
Owner

With all the PRs above applied (1, 2, 3, and the not-yet-merged 4), reflow times have gone from:

  • 16K scrollback, lots of OSC-8 URIs: ~40ms → 15ms
  • 16K scrollback, no OSC-8 URIs: ~22ms → 7ms

This is very unscientific. I'm just filling the scrollback with while true; do ls --hyperlink=always /usr/bin; done, and while true; do ls --hyperlink=none /usr/bin; done respectively, and then manually resizing window, in roughly the same way.

The times mentioned above are "visual" means. That is; I'm looking at a long list of reflow times and picking what appears to be the mean value.

With all the PRs above applied (1, 2, 3, and the not-yet-merged 4), reflow times have gone from: * 16K scrollback, **lots** of OSC-8 URIs: ~40ms → 15ms * 16K scrollback, **no** OSC-8 URIs: ~22ms → 7ms This is very unscientific. I'm just filling the scrollback with `while true; do ls --hyperlink=always /usr/bin; done`, and `while true; do ls --hyperlink=none /usr/bin; done` respectively, and then manually resizing window, in roughly the same way. The times mentioned above are "visual" means. That is; I'm looking at a long list of reflow times and picking what appears to be the mean value.
Poster
Owner

OSC-less output is now reflowed in what I'd say are very reasonable times. OSC-filled output is still sluggish on slower systems, but at least much better than before.

OSC-less output is now reflowed in what I'd say are very reasonable times. OSC-filled output is still sluggish on slower systems, but at least much better than before.
dnkl referenced this issue from a commit 5 months ago
dnkl closed this issue 5 months ago
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date

No due date set.

Dependencies

This issue currently doesn't have any dependencies.

Loading…
There is no content yet.