Alsa module hangs with device removal #86

Closed
opened 4 months ago by JorwLNKwpH · 15 comments

Yambar's alsa module hangs and fully uses one CPU core when an alsa device is removed.

Using pipewire as the default alsa device (i.e., routing alsa through pipewire) avoids this issue *until* the pipewire daemon is restarted, which has the same effect as removing an alsa device. Also, if the alsa device is in use (even with pipewire), yambar's alsa module will hang.

Should yambar just terminate the alsa module if it can't detect the soundcard?

Poster

I get this warning when changing the volume: `warn: modules/alsa.c:147: default,Master: channel volume mismatch, using value from Front Left`.

However, `amixer get Master` reports the same value for both channels.

Owner

I think it would make sense to enter a polling mode when the device disappears.

That would also solve the problem where the device isn't yet available when starting yambar.

Regarding the warning: yambar considers *all* playback channels when checking the volume, i.e., not just left and right. Since we only expose a single volume tag (and not per-channel volume), yambar warns when the volume differs on at least one channel.

It would probably be safe to simply remove the warning.

Owner

Related: #59 (https://codeberg.org/dnkl/yambar/issues/59)
Owner

Also related: #61 (https://codeberg.org/dnkl/yambar/issues/61#issuecomment-245546)
dnkl added the bug label 4 months ago
Owner

The first step is to find out why the alsa module starts spinning when the device disappears. My guess is that we're getting a `POLLHUP` notification from `select()` that we don't handle correctly, triggering an infinite `select()` loop. But this needs to be verified. We need to make sure we can terminate the alsa module correctly before implementing the poll mode (where we wait for the device to appear).
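To make the guess above concrete, here is a minimal sketch of a wait step that stops instead of spinning. It is not yambar's code: `any_fd_gone` and `wait_for_mixer` are hypothetical names, and it assumes `poll()` is used, since that is where `POLLHUP`, `POLLERR` and `POLLNVAL` are reported per descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from yambar: returns true if any
 * descriptor reports a condition that will not clear by itself. */
static bool
any_fd_gone(const struct pollfd *fds, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (fds[i].revents & (POLLHUP | POLLERR | POLLNVAL))
            return true;
    }
    return false;
}

/* Returns 1 if events are ready to be handled, 0 on timeout or EINTR,
 * and -1 if the caller should stop and tear the module down. */
static int
wait_for_mixer(struct pollfd *fds, size_t count, int timeout_ms)
{
    assert(fds != NULL || count == 0);

    int ret = poll(fds, (nfds_t)count, timeout_ms);
    if (ret < 0)
        return errno == EINTR ? 0 : -1;
    if (ret == 0)
        return 0;

    /* Check every descriptor, not only the first one. */
    return any_fd_gone(fds, count) ? -1 : 1;
}
```

If these flags are ignored, `poll()` keeps returning immediately with the same `revents`, which would match the 100% CPU symptom described in this issue.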

Owner

FWIW, I'm guessing this bug can be reproduced in plain alsa, with e.g. a USB soundcard.

Poster

I don't seem to hit [this](https://codeberg.org/dnkl/yambar/src/branch/master/modules/alsa.c#L268) error at all. How can I trigger this? Is there a way to get a reliable crash instead of being stuck in a loop? There might be more than one unhandled error too.

> implementing the poll mode (where we wait for the device to appear).

Can udev be used to avoid polling?

> FWIW, I'm guessing this bug can be reproduced in plain alsa, with e.g. a USB soundcard.

Yes, I tested the pipewire stuff after the fact. By default, pipewire doesn't touch alsa-related settings anyway, so one has to explicitly turn on the alsa integration.

Owner

> I don't seem to hit this error at all.

It's only checking the first FD from alsa. There may be more. I assume **all** get a `POLLHUP` if the device disappears.

> How can I trigger this? Is there a way to get a reliable crash instead of being stuck in a loop?

Not sure. I'm going to test unplugging a USB soundcard. Not sure if that'll "work" or not.

> Can udev be used to avoid polling?

Maybe. It's something to look into. Though, I would guess the right thing to use is inotify.
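As a rough sketch of the inotify idea, assuming the device nodes show up as new entries in a watched directory: the helper names `watch_for_new_nodes` and `count_created` are made up for illustration, and which directory and name prefix to watch are assumptions, not something this thread confirms.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Start watching `dir` for newly created entries. Returns a
 * non-blocking inotify descriptor, or -1 on error. */
static int
watch_for_new_nodes(const char *dir)
{
    assert(dir != NULL);

    int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (fd < 0) {
        perror("inotify_init1");
        return -1;
    }
    if (inotify_add_watch(fd, dir, IN_CREATE) < 0) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }
    return fd;
}

/* Drain all pending events and count created entries whose name
 * starts with `prefix`. Returns the count, or -1 on a read error. */
static int
count_created(int fd, const char *prefix)
{
    assert(prefix != NULL);

    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    size_t prefix_len = strlen(prefix);
    int matches = 0;

    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN)
                break; /* queue is empty */
            perror("read");
            return -1;
        }
        if (len == 0)
            break;

        /* One read() may return several variable-length events. */
        for (char *p = buf; p < buf + len;) {
            const struct inotify_event *ev =
                (const struct inotify_event *)p;
            if ((ev->mask & IN_CREATE) && ev->len > 0 &&
                strncmp(ev->name, prefix, prefix_len) == 0)
                matches++;
            p += sizeof(*ev) + ev->len;
        }
    }
    return matches;
}
```

A caller could, for example, use `watch_for_new_nodes("/dev/snd")` and retry the mixer connection whenever `count_created(fd, "controlC")` is positive. Several notifications may arrive for one plug event, so the retry has to tolerate failed attempts.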

dnkl closed this issue 4 months ago

FWIW, I can still reproduce this, either by unplugging/plugging a USB device and/or bluetooth headphones (using pipewire etc. so switches happen automatically).

I could fix the unplugging by adding a new `snd_mixer_set_callback` and calling it on each iteration and checking if `mask` was `SND_CTL_EVENT_MASK_REMOVE`. That let me get out of the infinite loop.

However, the bar never restarts/switches to online because it's obviously watching using inotify instead of the alsa callback API (so I guess it tries to connect before alsa/pipewire is ready?).

Anyway, this might need some additional work.

Owner

I don't use pipewire myself. But it would be interesting to see which `revents` bits are set on the alsa FDs.

Using `snd_mixer_set_callback()` could be our fallback method if we can't do anything with the FDs themselves.

> However the bar never restarts/switches to online because it's obviously watching using inotify instead of alsa callback API (so I guess I tries to connect before alsa/pipewire is ready?).

When I tested with alsa (and {un,}plugging a USB device), inotify triggered multiple times. We try to connect each of those times, and with plain alsa devices, it works (at least for me) on the last inotify notification.

But it is possible that pipewire works differently, and that our connect attempt also fails for the last inotify notification.


I looked, and the `revents` bits on the FDs were still `POLLIN` after unplugging the device. I suspect some other state needs to be checked first, or the data is not necessarily up to date. It could also be that pipewire is doing something weird (we are talking about basically opening a "virtual" device/mixer supplied by pipewire, which in theory should not "go away" when I unplug a device that's behind it?).


I have a very ugly workaround for my case in https://codeberg.org/sochotnicky/yambar/commit/5eb41c9a40aa9553bd04a69903d0e8c3e085a777

But obviously that's not a solution...
Owner

Alsa sets `POLLERR` and/or `POLLNVAL` when the device disappears. If pipewire doesn't, it's a bug in its emulation layer. However, if the underlying (emulated) device node hasn't disappeared, then perhaps the behavior is correct, from pipewire's perspective.

FWIW, we only log errors/warnings when we fail to update the state, but failure doesn't cause us to break out and revert to the "disconnected" state. Doing that is a possible avenue. Someone would have to check what kind of errors, if any, we're getting when we're in the infinite `POLLIN` state.
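A sketch of that avenue, purely illustrative: `handle_wakeups`, `update_fn` and the stub `update_until_gone` are hypothetical and stand in for whatever the module does to refresh its state. The only point is that a failed update ends the loop and reports "disconnected" instead of just being logged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum conn_state { STATE_CONNECTED, STATE_DISCONNECTED };

/* Stand-in for "refresh volume/mute state from the mixer". */
typedef bool (*update_fn)(void *user);

/* Handle up to `wakeups` POLLIN wake-ups. The first failed update
 * stops the loop and reports the disconnected state. */
static enum conn_state
handle_wakeups(update_fn update, void *user, size_t wakeups,
               size_t *handled)
{
    assert(update != NULL && handled != NULL);

    *handled = 0;
    for (size_t i = 0; i < wakeups; i++) {
        if (!update(user))
            return STATE_DISCONNECTED;
        (*handled)++;
    }
    return STATE_CONNECTED;
}

/* Demo stub: succeeds until the counter in `user` reaches zero. */
static bool
update_until_gone(void *user)
{
    int *remaining = user;
    if (*remaining == 0)
        return false;
    (*remaining)--;
    return true;
}
```

Whether the update calls actually fail in the pipewire case is exactly what still needs to be checked, so this only helps if they do.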


In my case, as you guessed, I used the virtual device (which does not go away), but it seems the mixer itself disappears (or at least blips). I can see if I can at least provide a somewhat better fix to get out of the loop...


I managed to reproduce it with pipewire but not with pulseaudio, so I extracted bits of the alsa module into a reproducer and filed a bug against pipewire, FYI:
https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/1627
