iTerm2/Kitty image display protocol support [Question] #481

Open
opened 6 months ago by Eragon · 14 comments
Eragon commented 6 months ago

Sixel is good for displaying pictures as thumbnails or for previewing them.
But it's not really good for displaying full pictures or videos.
It's a little bit slow, and the picture quality isn't as good as with the iTerm2 or kitty display protocols.

The iTerm2 protocol seems to be more widely implemented than the kitty protocol, and both provide good-quality graphics for images. They even support videos.

Are there any plans to support one of them?
If you don't want to implement it yourself, would it be possible for someone else to implement it and see it merged here?

Maybe the issue tracker is not the best place for that.

Owner

Much of the slowness actually comes from libsixel, which most sixel-capable applications use, either directly or indirectly. [notcurses](https://github.com/dankamongmen/notcurses) is pretty fast.

But you are right that sixel's picture quality is worse than that of the other protocols. This is due to it being palette-based, and also to it having less fine-grained colors: the range for a color component is 0-100, as opposed to the 0-255 we are used to.
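To make the granularity point concrete, here is a minimal Python sketch of what happens to one 8-bit color component when it is squeezed into the 0-100 range and expanded again. The helper names and the round-to-nearest scheme are assumptions for illustration; they are not taken from foot, libsixel, or notcurses.

```python
# Hypothetical helpers: round-to-nearest conversion between 0-255 and 0-100.

def to_sixel_percent(v: int) -> int:
    """Map an 8-bit component (0-255) to a sixel component (0-100)."""
    return (v * 100 + 127) // 255


def from_sixel_percent(p: int) -> int:
    """Map a sixel component (0-100) back to an 8-bit component (0-255)."""
    return (p * 255 + 50) // 100


# Distinct levels that survive the conversion, per component.
levels = {to_sixel_percent(v) for v in range(256)}

# Worst-case error after a round trip through the 0-100 range.
max_error = max(
    abs(v - from_sixel_percent(to_sixel_percent(v))) for v in range(256)
)

print(len(levels), "levels per component instead of 256")
print("colors addressable:", len(levels) ** 3, "vs", 256 ** 3)
print("max round-trip error per component:", max_error)
```

The per-component error is small, so in practice the palette (a limited number of simultaneously usable colors) tends to matter more for quality than the 0-100 range itself.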

There aren't any plans to support anything else. But that doesn't mean we won't accept contributions.

For something to get merged, however, I would expect it to be small and clean, hopefully piggybacking on the image infrastructure we already have in place for sixels. Also, I'd rather not add any more library dependencies for, say, image decoding (nor do I want to pull a complete decoder into the source tree). Remember that foot is trying to be lightweight, and having one image protocol is already borderline...

It's really hard to say "yes" or "no" beforehand; we really need to see an actual implementation. I'm saying this just to make it clear that while I'm happy to see contributions, and I'm not _against_ the iTerm2/kitty protocols, that doesn't mean a contribution **will** be accepted. I.e., anyone interested in doing this should be prepared for a rejection.

dnkl added the question label 6 months ago

FYI, "showing 24-bit images" is only a subset of both the Kitty and iTerm2 protocols. Other features include file transfer, shared memory stuff, a bunch of image format decoders, layering under text, and pan/stretch/scale. Also, the current best implementations of iTerm2 "images only" protocol support appear to be mintty and wezterm.
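For readers unfamiliar with the "images only" subset of the iTerm2 protocol, the following Python sketch builds the escape sequence for an inline image: an OSC 1337 `File=` sequence carrying the base64-encoded file. The function name is hypothetical, and only the `name`, `size`, and `inline` arguments are shown; how a given terminal handles the other arguments is not covered here.

```python
import base64


def iterm2_inline_image(data: bytes, name: str = "image") -> str:
    """Build an iTerm2 inline-image escape sequence (hypothetical helper).

    `data` is the raw image file (e.g. PNG bytes); the terminal decodes it.
    """
    args = [
        # The file name is itself base64-encoded in this protocol.
        "name=" + base64.b64encode(name.encode("utf-8")).decode("ascii"),
        f"size={len(data)}",
        "inline=1",  # display the image instead of downloading it
    ]
    payload = base64.b64encode(data).decode("ascii")
    # ESC ] 1337 ; File = <args> : <base64 payload> BEL
    return "\x1b]1337;File=" + ";".join(args) + ":" + payload + "\x07"


if __name__ == "__main__":
    # Placeholder bytes only; a real caller would pass the contents of an image file.
    seq = iterm2_inline_image(b"abc", name="x")
    print(repr(seq))
```

Note that the terminal has to decode the image file itself, which is exactly the kind of decoder dependency raised as a concern earlier in the thread.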

Owner

@lamonte I was aware this was the case with Kitty's protocol, and somewhere in the back of my mind I also knew the iTerm2 protocol had lots of "extras". But I must have suppressed that, because it had totally slipped my mind when I wrote the reply above.

I hear you did your own protocol?


@dnkl I did, [documented here](https://gitlab.com/klamonte/jexer/-/wikis/jexer-images), and it is intended to be a stop-gap until a 24-bit protocol really takes off (e.g. is adopted by xterm). This one is as simple as I know how to make it, and if one already supports any other protocol, adding it should be a matter of a few hours.

Owner

@lamonte I might take a stab at adding support for it, if given time. If nothing else, it should help with refactoring the existing image support to better support alternatives.


FTR, the iTerm2 image protocol sort of works in tmux and, IIRC, even remotely. There's a size limit capping whatever the protocol does when transferring the image, so this is not perfect, or should I say pretty far from it.
I tried to hack tmux to see if I could get this fixed, but my effort failed.
There's a bug report about this somewhere.

I was using Alacritty with the iTerm2 image patch, and the implementation had multiple issues when previewing images in the ranger file manager, so I suggest including this in the tests (if iTerm2 is implemented).

The Kitty image protocol, on the other hand, does not work at all in tmux, so maybe this fact should be taken into consideration.

sv commented 2 months ago

_To play the devil's advocate:_

Complexity breeds fragility. Having advanced video capabilities in a terminal emulator might be hip, cool and all of that... until someone decides to fuzz the input.


> Complexity breeds fragility. Having advanced video capabilities in terminal emulator might be hip, cool and all of that... Until someone decides to fuzz the input.

For the record, I have crashed or otherwise rendered unusable the following terminals via their sixel image support: xterm (but only one of the earliest builds with sixel support, not since 2019), mlterm, jexer (mine), mintty, wezterm, iTerm2, contour, DomTerm, and MacTerm.

The only three terminals that I have not personally seen a sixel crash from are RLogin, yaft, and foot.

With the iTerm2 protocol, I have crashed mintty, iTerm2, and wezterm.

After WezTerm gets Kitty support, I'll start poking there. Given how large the attack surface is, I expect quite a few uh-ohs.


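FWIW, Notcurses has documented their own [graphics protocol](https://nick-black.com/dankwiki/index.php?title=Spriteful_TErminal_GrAphics_Protocol). Notcurses also recently [removed](https://github.com/dankamongmen/notcurses/issues/2060) iTerm2 support.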

Crashes are not that concerning as long as they are addressed correctly and expediently, which dnkl does an excellent job of. I don't think it will be controversial to say that foot has the best code quality out of all terminal emulators.

Although, maybe this is one of those things where it's better to wait and see what plays out...

Poster

Is there a way to use 0-255 colors in sixel?
That could be a workaround to get good picture quality without changing the protocol.

Owner

> Is there a way to use 0-255 colors in sixel ?

With this, do you mean 24-bit RGB colors? Or are you referring to the standard 256 color palette?

Poster

You said that:

> But, you are right in that sixel's picture quality is worse than the other protocols. This is due to it being palette based, and also having less fine grained colors - the range for a color component is 0-100, as opposed to 0-255 which we are used to.

We can't touch the palette, because that's defined by the image-to-sixel tool.

But is it possible to have more fine-grained colors on the terminal side?

Owner

No, I don't see how that would be possible. The application programs the palette. While XTerm does have a "default" palette, foot doesn't, because all sixel applications I have seen so far do provide their own palette.

The 0-100 limitation is in the sixel protocol itself. When the application programs the palette, it specifies the color for each entry as an RGB triple, with each component being in the range 0-100. That means less granularity than the 0-255 range we're used to from "normal" RGB colors.
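As a rough illustration of how an application programs a palette entry, the Python sketch below formats a sixel color definition of the form `#<index>;2;<r>;<g>;<b>`, where `2` selects the RGB color space and each component is a percentage. The function name and the round-to-nearest conversion are assumptions for illustration and are not taken from any particular encoder.

```python
def sixel_palette_entry(index: int, r: int, g: int, b: int) -> str:
    """Format one sixel palette definition from 8-bit RGB (hypothetical helper)."""

    def pct(v: int) -> int:
        # Scale 0-255 to the protocol's 0-100 range, rounding to nearest.
        return (v * 100 + 127) // 255

    # "#Pc;Pu;Px;Py;Pz" with Pu=2 meaning RGB, components in percent.
    return f"#{index};2;{pct(r)};{pct(g)};{pct(b)}"


if __name__ == "__main__":
    print(sixel_palette_entry(0, 255, 128, 0))  # prints "#0;2;100;50;0"
```

Since the percentages are produced by the application before anything reaches the terminal, the terminal never sees the original 8-bit values and has no way to recover them.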

Poster

So I have two solutions👍:

  1. Use the Kitty, iTerm2, or another protocol to get beautiful pictures
  2. Create sixel 2.0 with a full RGB/HLS/other color space (because… why not?)