Segfault on (maybe) battery event #45

Closed
opened 8 months ago by st3r4g · 7 comments
st3r4g commented 8 months ago
yambar version 1.6.1

I've had this happen other times, but I didn't have core dumps enabled. It happens once in a while, and seems peculiar to this laptop.
Backtrace:

#0  0x0000000000000000 in ?? ()
#1  0x0000556a3f01fa67 in instantiate (particle=0x556a3fe232c0, tags=0x7fe7704269d0)
    at ../particles/ramp.c:123
#2  0x0000556a3f01f5dd in instantiate (particle=0x556a3fe239c0, tags=0x7fe7704269d0)
    at ../particles/list.c:120
#3  0x0000556a3f0200ab in instantiate (particle=0x556a3fe23270, tags=0x7fe7704269d0)
    at ../particles/map.c:114
#4  0x0000556a3f01d01a in content (mod=0x556a3fe23f20) at ../modules/battery.c:131
#5  0x0000556a3f022575 in module_begin_expose (mod=<optimized out>) at ../module.c:25
#6  0x0000556a3f0215c6 in expose (_bar=0x556a3fe246b0) at ../bar/bar.c:100
#7  0x0000556a3f01e36b in loop (_bar=0x556a3fe246b0, expose=0x556a3f0213f0 <expose>, 
    on_mouse=<optimized out>) at ../bar/wayland.c:1056
#8  0x0000556a3f01f165 in run (_bar=0x556a3fe246b0) at ../bar/bar.c:259
#9  0x00007fe771150f6a in start_thread (arg=0x7fe770427640) at pthread_create.c:460
#10 0x00007fe77107e2ff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I see ../modules/battery.c in the backtrace, does this mean it concerns the battery module and was triggered by a battery event?
pp looks like pointing to a wrong location, because other members too are invalid (besides instantiate which led to the segfault). Unfortunately idx is <optimized out>, but it has to be outside of the valid range, because:

(gdb) p pp
$22 = (struct particle *) 0x556a3fcea210
(gdb) p *p->particles@p->count
$23 = {0x556a3fe23310, 0x556a3fe233a0, 0x556a3fe23430, 0x556a3fe234c0, 0x556a3fe23550, 
  0x556a3fe235e0, 0x556a3fe23670, 0x556a3fe23700, 0x556a3fe23790, 0x556a3fe23820}

I can try to run a debug build and see which assertion it hits. Let me know if you have other suggestions to debug this.

``` yambar version 1.6.1 ``` I've had this happen other times, but I didn't have core dumps enabled. It happens once in a while, and seems peculiar to this laptop. Backtrace: ``` #0 0x0000000000000000 in ?? () #1 0x0000556a3f01fa67 in instantiate (particle=0x556a3fe232c0, tags=0x7fe7704269d0) at ../particles/ramp.c:123 #2 0x0000556a3f01f5dd in instantiate (particle=0x556a3fe239c0, tags=0x7fe7704269d0) at ../particles/list.c:120 #3 0x0000556a3f0200ab in instantiate (particle=0x556a3fe23270, tags=0x7fe7704269d0) at ../particles/map.c:114 #4 0x0000556a3f01d01a in content (mod=0x556a3fe23f20) at ../modules/battery.c:131 #5 0x0000556a3f022575 in module_begin_expose (mod=<optimized out>) at ../module.c:25 #6 0x0000556a3f0215c6 in expose (_bar=0x556a3fe246b0) at ../bar/bar.c:100 #7 0x0000556a3f01e36b in loop (_bar=0x556a3fe246b0, expose=0x556a3f0213f0 <expose>, on_mouse=<optimized out>) at ../bar/wayland.c:1056 #8 0x0000556a3f01f165 in run (_bar=0x556a3fe246b0) at ../bar/bar.c:259 #9 0x00007fe771150f6a in start_thread (arg=0x7fe770427640) at pthread_create.c:460 #10 0x00007fe77107e2ff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 ``` I see `../modules/battery.c` in the backtrace, does this mean it concerns the battery module and was triggered by a battery event? `pp` looks like pointing to a wrong location, because other members too are invalid (besides `instantiate` which led to the segfault). Unfortunately `idx` is `<optimized out>`, but it has to be outside of the valid range, because: ``` (gdb) p pp $22 = (struct particle *) 0x556a3fcea210 (gdb) p *p->particles@p->count $23 = {0x556a3fe23310, 0x556a3fe233a0, 0x556a3fe23430, 0x556a3fe234c0, 0x556a3fe23550, 0x556a3fe235e0, 0x556a3fe23670, 0x556a3fe23700, 0x556a3fe23790, 0x556a3fe23820} ``` I can try to run a debug build and see which assertion it hits. Let me know if you have other suggestions to debug this.
Owner

does this mean it concerns the battery module and was triggered by a battery event?

Yep, looks like it. The bar iterates all modules and asks them for their "current" content. Content in this case means instantiated particles (where instantiated means a particle with a specific set of tag values applied).

A debug build is probably the best way forward. There are likely a couple of bugs involved here; an issue in the battery module (or your battery driver), causing it to instantiate a tag with an unexpected value (e.g -1, or something like that), and most likely a bug in the ramp particle where the calculated index isn't properly bounded.

> does this mean it concerns the battery module and was triggered by a battery event? Yep, looks like it. The bar iterates all modules and asks them for their "current" content. Content in this case means instantiated particles (where instantiated means a particle with a specific set of tag values applied). A debug build is probably the best way forward. There are likely a couple of bugs involved here; an issue in the battery module (or your battery driver), causing it to instantiate a tag with an unexpected value (e.g `-1`, or something like that), and most likely a bug in the `ramp` particle where the calculated index isn't properly bounded.
Owner

One potential problem in ramp is if the tag's min value is greater than its max value.

There's an assertion for this, but the code doesn't actually handle it.

One potential problem in `ramp` is if the tag's `min` value is greater than its `max` value. There's an assertion for this, but the code doesn't actually handle it.
Owner

One potential problem in ramp is if the tag's min value is greater than its max value.

But I don't think that's the problem here; the battery module only instantiates a single range tag, capacity, with hard coded min=0 and max=100 values. Of course, if ramp is applied to something other than the capacity tag, anything could happen.

> One potential problem in ramp is if the tag's min value is greater than its max value. But I don't think that's the problem here; the battery module only instantiates a single range tag, `capacity`, with hard coded `min=0` and `max=100` values. Of course, if `ramp` is applied to something _other_ than the `capacity` tag, anything could happen.
Owner

The same issue exists with value; there's an assertion for min <= value <= max, but nothing is done to handle values out of range.

This is probably what's happening here.

Battery drivers have been seen to intermittently report very weird values...

Suggested patch:

diff --git a/particles/ramp.c b/particles/ramp.c
index 45cc277..8b23855 100644
--- a/particles/ramp.c
+++ b/particles/ramp.c
@@ -4,6 +4,9 @@
 
 #include <stdio.h>
 
+#define LOG_MODULE "ramp"
+#define LOG_ENABLE_DBG 0
+#include "../log.h"
 #include "../config.h"
 #include "../config-verify.h"
 #include "../particle.h"
@@ -102,6 +105,26 @@ instantiate(const struct particle *particle, const struct tag_set *tags)
     long min = tag != NULL ? tag->min(tag) : 0;
     long max = tag != NULL ? tag->max(tag) : 0;
 
+    if (min > max) {
+        LOG_WARN(
+            "tag's minimum value is greater than its maximum: "
+            "tag=\"%s\", min=%ld, max=%ld", p->tag, min, max);
+        min = max;
+    }
+
+    if (value < min) {
+        LOG_WARN(
+            "tag's value is less than its minimum value: "
+            "tag=\"%s\", min=%ld, value=%ld", p->tag, min, value);
+        value = min;
+    }
+    if (value > max) {
+        LOG_WARN(
+            "tag's value is greater than its maximum value: "
+            "tag=\"%s\", min=%ld, value=%ld", p->tag, max, value);
+        value = max;
+    }
+
     assert(value >= min && value <= max);
     assert(max >= min);
 

edit: updated patch to also add logging.

The same issue exists with `value`; there's an assertion for `min <= value <= max`, but nothing is done to handle values out of range. This is probably what's happening here. Battery drivers have been seen to intermittently report very weird values... Suggested patch: ```diff diff --git a/particles/ramp.c b/particles/ramp.c index 45cc277..8b23855 100644 --- a/particles/ramp.c +++ b/particles/ramp.c @@ -4,6 +4,9 @@ #include <stdio.h> +#define LOG_MODULE "ramp" +#define LOG_ENABLE_DBG 0 +#include "../log.h" #include "../config.h" #include "../config-verify.h" #include "../particle.h" @@ -102,6 +105,26 @@ instantiate(const struct particle *particle, const struct tag_set *tags) long min = tag != NULL ? tag->min(tag) : 0; long max = tag != NULL ? tag->max(tag) : 0; + if (min > max) { + LOG_WARN( + "tag's minimum value is greater than its maximum: " + "tag=\"%s\", min=%ld, max=%ld", p->tag, min, max); + min = max; + } + + if (value < min) { + LOG_WARN( + "tag's value is less than its minimum value: " + "tag=\"%s\", min=%ld, value=%ld", p->tag, min, value); + value = min; + } + if (value > max) { + LOG_WARN( + "tag's value is greater than its maximum value: " + "tag=\"%s\", min=%ld, value=%ld", p->tag, max, value); + value = max; + } + assert(value >= min && value <= max); assert(max >= min); ``` **edit**: updated patch to also add logging.
dnkl added the
bug
label 8 months ago
Poster

Thank you, I'm now running a debug build of master with the patch applied, so I should see one of the warnings when it happens again, and hopefully no more segfaults. I have not figured out how to trigger it, so I'll just wait for it to happen spontaneously.

Thank you, I'm now running a debug build of master with the patch applied, so I should see one of the warnings when it happens again, and hopefully no more segfaults. I have not figured out how to trigger it, so I'll just wait for it to happen spontaneously.
Poster

Here it is:

2021-04-21 15:35:02.492359747 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:02.782189971 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:04.000748973 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:09.014495162 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:14.028085240 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:19.041911274 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:24.055509704 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307
2021-04-21 15:35:29.069102574 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307

and no segfault this time. I guess the patch is good to go.
EDIT: oh wait, the text is wrong in that warning, min=%ld should be max=%ld

Here it is: ``` 2021-04-21 15:35:02.492359747 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:02.782189971 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:04.000748973 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:09.014495162 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:14.028085240 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:19.041911274 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:24.055509704 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 2021-04-21 15:35:29.069102574 warn: particles/ramp.c:122: tag's value is greater than its maximum value: tag="capacity", min=100, value=307 ``` and no segfault this time. I guess the patch is good to go. **EDIT**: oh wait, the text is wrong in that warning, `min=%ld` should be `max=%ld`
dnkl referenced this issue from a commit 8 months ago
dnkl closed this issue 8 months ago
Owner

Thanks! Patch (with typo fixed) has been pushed to master.

Thanks! Patch (with typo fixed) has been pushed to master.
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date

No due date set.

Dependencies

This issue currently doesn't have any dependencies.

Loading…
There is no content yet.