Spelling and Grammar Checker
An automated linter to check spelling and grammar on incoming PRs would be nice.
I'm not sure how we can automate and integrate it. Maybe use webhooks to trigger a shell script, which then comments the TeXtidote result to the PR?
I created an aspell-based spellchecker with Woodpecker for the internal monthly letter drafts. It simply checks the files modified in a commit, prints the unknown words, and exits with an error if there are more than 5 errors per file.
It's plain stupid, and IMHO a waste of energy. Most of the time is spent on installing Git and aspell in the CI; the check itself then takes about one second.
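For illustration, the threshold part of such a check could look roughly like the sketch below. This is not the actual Woodpecker script; report_unknown_words is a made-up helper name, and it only assumes that the unknown words arrive on stdin, one per line, which is what aspell's list command prints.

```shell
#!/bin/bash
# Hypothetical helper (not the original CI script): reads unknown words
# from stdin, one per line, prints each with the file name, and fails
# when the file has more than MAX unknown words (default 5).
report_unknown_words() {
  ruw_file=$1
  ruw_max=${2:-5}
  ruw_count=0
  while IFS= read -r ruw_word; do
    [ -n "$ruw_word" ] || continue        # ignore empty lines
    printf '%s: %s\n' "$ruw_file" "$ruw_word"
    ruw_count=$((ruw_count + 1))
  done
  [ "$ruw_count" -le "$ruw_max" ]         # non-zero status above the limit
}
```

It would be fed per file, e.g. aspell --lang=en list < "$f" | sort -u | report_unknown_words "$f" 5, where the language is an assumption.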
But in theory this would be possible, also with your mentioned projects which should serve higher quality (aspell was merely a proof of concept).
We could compare the commits against the main branch, detect all changed files in a PR, and then spellcheck them. I might start working on it as soon as I find a reasonable way to do this without the whole overhead, e.g. with a prepared container that doesn't download all the tools each time. Or maybe start with a local tool and put it in CI later.
I made this quick and dirty script to get all Markdown files modified relative to the main branch (except for deleted ones) and run TeXtidote on them.
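#!/bin/bash
# Files changed relative to origin/master, skipping deleted ones (--diff-filter=d).
# The dot is escaped and anchored so only names ending in ".md" match that branch.
FILES=$(git diff origin/master...HEAD --name-only --diff-filter=d | grep 'content\|\.md$' | tr '\n' ' ')
# $FILES is left unquoted on purpose so it splits into one argument per file.
textidote --output html $FILES > check.html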
I opened #219 as a draft to implement spell checking. I do not think it is useable in its current form.
I tried TeXtidote and Spellchecker-CLI but I was not satisfied with the results. Many of my example mistakes (like simple typos) were not detected at all.
I looked around for alternatives and found hunspell which is used by a bunch of huge projects.
Unfortunately the output is text only. I guess one would have to parse the more detailed output one gets when calling hunspell with -a and render a nice view to make it somewhat useful.
The result is a pipeline which works in theory and detected all my sample mistakes (not included in pull request).
I had to add some foo for the pipeline to fail if one or more mistakes were detected.
As a result of the check, the line containing the mistake is simply written into the output file. As I said, one would have to somehow interpret the output of -a to make it more useful.
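To give an idea of what interpreting that output could mean, here is a rough sketch (not what is in the pull request). It assumes the ispell-style pipe format that hunspell -a prints: a version banner first, then one result line per word ("&" with suggestions, "#" without), and a blank line after each input line. summarize_hunspell is a hypothetical name.

```shell
#!/bin/bash
# Sketch: summarize `hunspell -a` output read from stdin.
# summarize_hunspell is a hypothetical helper, not part of hunspell.
# Exits with status 1 if at least one mistake was reported.
summarize_hunspell() {
  awk '
    BEGIN { line = 1 }
    NR == 1 && /^@/ { next }            # skip the version banner
    /^$/ { line++; next }               # a blank line closes one input line
    /^&/ {                              # "& word count offset: s1, s2"
      print "line " line ": " $2 " -> " substr($0, index($0, ": ") + 2)
      found = 1
    }
    /^#/ {                              # "# word offset" (no suggestions)
      print "line " line ": " $2 " (no suggestions)"
      found = 1
    }
    END { exit (found ? 1 : 0) }
  '
}
```

A call could look like sed 's/^/^/' page.md | hunspell -a -d en_US | summarize_hunspell, where page.md and the en_US dictionary are placeholders. The sed step prefixes every line with ^ so that Markdown lines starting with # or * are spell-checked instead of being read as pipe-mode commands. The non-zero exit status would also make the pipeline fail when a mistake is found.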
Additionally the container needs to install git and hunspell every time. I agree that this would be a waste of resources. However I guess it would be simple to just create a container image in a registry of choice to reduce the effort to only launching a container and the actual tests.
As I said I do not think it to be usable as of now. The text output is too simple. I do not think that every user wanting to improve the documentation could handle the output of the pipeline correctly. I guess it would lead to frustration. Simply looking over the documentation from time to time seems to be the better approach.
I use LanguageTool as a browser add-on and it's nice, but as mentioned in comment 1, TeXtidote is built on top of that.
Hm, I didn't have that in my mind anymore when I added the comment. Thanks for the hint.
However, my editor uses LanguageTool and it provides a lot of good tips and hints.
My experiments yesterday did not lead to these. Strange. It is definitely worth another look.