Persistent "500 (Internal Server Error)" when creating PRs through API #1317
Reference: Codeberg/Community#1317
We've configured renovate through WP for the `woodpecker-plugins` org. However, when running WP it always stops after a few repos with 500 errors from the API during PR creation.
Example: https://ci.codeberg.org/repos/12705/pipeline/6/3
PRs were created in some repos, so the workflow works in general and the token scopes are correct. Is some rate limiting going on, or a proxy/load balancer that complains?
The error persisted across three runs and always appeared at the same repo (here https://codeberg.org/woodpecker-plugins/trivy).
Here's an excerpt of the error from the build above:
This is likely due to rate limiting. The software upstream has no proper rate limiting and no way to pass a custom response to the user from the code that triggers the rate limiting.
The code can be found here and suggestions for improvements are welcome:
6b44939deb
The rate limit is 3 issues per 5 minutes (pull requests are also counted as issues here). It only applies if the issue body contains hyperlinks, but I suppose it does in your case.
We didn't have automation in mind when hotfixing this to fight our spam problem. Most issues created by users do not share hyperlinks, and those that do usually aren't many consecutive issues containing links.
Yeah, OK, I see. I understand the reasoning, though I'd argue that it needs a different/modified solution, as this is a bummer for any kind of semi-automated development.
Meanwhile, renovate succeeded here: https://ci.codeberg.org/repos/12705/pipeline/8/3, so it might have been only a one-time issue, i.e. when renovate is initially run on an org and hence creates a bunch of PRs in a short time. Yet it can easily happen again if e.g. a widely used image dependency is bumped and renovate tries to open PRs in more than 3 repos in a run.

Maybe you could whitelist some URLs? Sure, manual work would be needed, but I guess it's maybe the only solution?
For renovate specifically, the issues usually contain links to github.com and renovate, but then also to common package/registry sources. But maybe starting with these for exclusion might help?
Sure, this could be exploited by motivated bots, but will they really ever find out about a whitelist and then first create ad-like repos on GitHub?
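The allowlist idea could look roughly like this. A hedged sketch only: the host list and function names are made up for illustration, and a real implementation would load the list from configuration:

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// allowedLinkHosts is a hypothetical allowlist of hosts whose links
// would not count towards the link-based rate limit.
var allowedLinkHosts = map[string]bool{
	"github.com":           true,
	"docs.renovatebot.com": true,
}

// urlPattern finds http(s) URLs in an issue body.
var urlPattern = regexp.MustCompile(`https?://[^\s)>\]]+`)

// onlyAllowedLinks reports whether every hyperlink in body points at an
// allowlisted host; such issues could be exempted from the rate limit.
func onlyAllowedLinks(body string) bool {
	for _, raw := range urlPattern.FindAllString(body, -1) {
		u, err := url.Parse(raw)
		if err != nil || !allowedLinkHosts[u.Hostname()] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(onlyAllowedLinks("bump via https://github.com/renovatebot/renovate")) // true
	fmt.Println(onlyAllowedLinks("buy stuff at https://spam.example"))                // false
}
```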
Maybe Codeberg should send a 429 instead of a 500 when the rate limit is hit
We should, but we don't know how to do this. There is no precedent for ratelimiting in the code base AFAICT, and we didn't figure out a convenient way to send the custom code at the point where the rate limiting is triggered. I suppose we should probably move the code somewhere else, but I have no clue where best to do this.
We also apply some rate limiting on the reverse proxy. We send a proper status code there (and I think also a nice page explaining why the rate limiting is necessary, etc.).
So after a few days of observing this and how renovate acts on it: it's quite limiting from a dev perspective. The `woodpecker-plugins` org operates on only ~10 repos, and with a daily renovate schedule it takes me multiple weeks to get around the rate limit (to create all issues and PRs as needed). Yes, I can adapt the schedule, but in the end this also means a lot of CI runs that consume resources and could have been prevented.

Maybe an option would be to filter on the `user-agent`? I.e. renovate sends `"user-agent": "RenovateBot/37.22.0 (https://github.com/renovatebot/renovate)"`, and a regex match against this could be a good whitelist approach?

You create a new error type and check in the API router if the error has this type. Here is an example.