How to render jupyter notebook on Codeberg properly? #671
Open
opened 2 weeks ago by penguinsfly
·
9 comments
How would `jupyter` notebooks be rendered properly instead of as raw JSON?

I was recommended to look at this. From my naive understanding, it is a configuration on the backend of `gitea` (the `app.ini` file) instead of a user/repo-specific configuration, is that correct?

Thanks!
Hi @penguinsfly!
You're indeed right that this needs to be configured within the app.ini and cannot be achieved by a repo configuration.
Looking at the blog post, it is a bit hacky and requires installing jupyter (and its dependencies) on the server, which is quite a lot. Unfortunately, there doesn't seem to be other software that can do `.ipynb` -> `html`.

Thank you for your reply. I agree with you that `jupyter` can sometimes be quite a lot.

Just for reference, I created a `venv` environment (~20MB initially) and just ran `pip install nbconvert`, and it grew to ~50MB, so I'm guessing it's around 30MB.

There's also `nbviewer` (I think `nbconvert` is used as the backend), which renders a notebook from the raw link (e.g., see here, I just grabbed a random link on codeberg). So I wonder whether it is possible to fetch the `html` within the body from that. Or just to somehow automatically attach a link pointing towards the `nbviewer` site with the raw link appended.

Another solution is `pandoc`, which might be lighter. I usually use it for `md` conversion, but I believe it can also convert notebooks, though one might need a few tweaks/configurations to get it to look right.

Do you think any of the above suggestions might be possible? Or do you anticipate some future solutions any time soon?
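For what it's worth, the "attach a link to nbviewer with the raw link appended" idea could be sketched roughly like this. It assumes nbviewer's public URL scheme (`/urls/` for https sources, `/url/` for plain http); the function name and the example raw link are made up for illustration.

```python
from urllib.parse import urlsplit


def nbviewer_link(raw_url: str) -> str:
    """Build an nbviewer URL for a notebook served at raw_url."""
    parts = urlsplit(raw_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    # nbviewer uses /urls/ for https sources and /url/ for plain http.
    prefix = "urls" if parts.scheme == "https" else "url"
    target = parts.netloc + parts.path
    if parts.query:
        target += "?" + parts.query
    return f"https://nbviewer.org/{prefix}/{target}"


# Hypothetical raw link; owner, repo, branch and file name are placeholders.
raw = "https://codeberg.org/some-user/some-repo/raw/branch/main/demo.ipynb"
print(nbviewer_link(raw))
```

This only builds the link; it does not fetch or embed the rendered HTML, so the notebook would still be rendered by a third-party site.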
I bookmarked that blog post quite a while ago. I didn't want to fiddle with the setup yet.

What would be optimal in my opinion: have (docker?) containers where everything is installed, and instead of calling `pandoc` or whatever, we call `containertool run in-container-xy pandoc` or something like that. Anyone interested in working on that with us?

Happy to take a stab at it (add it to the backlog). Using something like a pandoc container seems useful in the long term, as it also provides a lot of other transformations.
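To make the `containertool run in-container-xy pandoc` idea a bit more concrete, here is a minimal sketch of a wrapper that pipes a file through a converter in a throwaway container. It assumes `docker` as the container tool and the `pandoc/minimal` image discussed in this thread; the function names are hypothetical, and nothing here reflects how Codeberg actually sets this up.

```python
import subprocess


def container_argv(image, tool_args, runtime="docker"):
    """Build the argv for running a converter inside a throwaway container."""
    return [
        runtime, "run",
        "--rm",               # remove the container once the conversion is done
        "-i",                 # keep stdin open so the notebook can be piped in
        "--network", "none",  # the converter should not need network access
        image,
        *tool_args,
    ]


def render(data: bytes, argv) -> bytes:
    """Pipe data through the containerised converter and return its stdout."""
    done = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
    return done.stdout


# Passing only pandoc's arguments assumes the image's entrypoint is pandoc.
argv = container_argv("pandoc/minimal", ["--from", "ipynb", "--to", "html"])
print(" ".join(argv))
```

The resulting command line is what a render command would call instead of a bare `pandoc`; `render()` is only needed when calling it from Python.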
I noticed that Codeberg already seems to be using pandoc for `.rst` files.

77a0e2828f/etc/gitea/conf/app.ini (L183)

So it seems like only a small configuration change is needed to add support for this.
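As a rough idea of what such a configuration part might look like, here is a sketch that parses a candidate section with Python's `configparser` to check that it is well-formed. The key names follow Gitea's external renderer settings; the section suffix and the pandoc command line are assumptions, not Codeberg's actual config.

```python
import configparser

# Hypothetical section; the key names follow Gitea's external renderer
# settings, the section suffix and the command line are assumptions.
SECTION = """
[markup.jupyter]
ENABLED = true
FILE_EXTENSIONS = .ipynb
RENDER_COMMAND = "pandoc --from ipynb --to html"
IS_INPUT_FILE = false
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep the upper-case key names as written
parser.read_string(SECTION)

jupyter = parser["markup.jupyter"]
print(jupyter["FILE_EXTENSIONS"], "->", jupyter["RENDER_COMMAND"])
```

With `IS_INPUT_FILE = false`, the file content is passed to the command on stdin rather than as a file path, which is why the command has no file argument.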
Yes, but we actually don't want to use pandoc directly on the server, but use containers for that instead.
Hmm okay. Either way, I did some fiddling with how pandoc converts it into HTML. It seems that it doesn't add any syntax highlighting classes/language when it's rendered "directly"; they are only added when you specify that the output should be self-contained (`--self-contained`). The same goes for images: the conversion to `data:...` URIs only happens when self-containment is specified. So I looked into just using an `<iframe>` via Gitea, and it became clear that unless you fully trust the output to not contain malicious scripts, it's not possible to use the `<iframe>`.

So I played around with the docker images for a bit and found that `nbconvert`, though not generalizable to other formats, might not be so bad (relatively). This is my first time playing around with docker, so excuse my naive approach.

For `pandoc`, `pandoc/minimal` seems to be the smallest one available (?) and it's 79.3 MB. Which `pandoc` container source is currently used for `gitea/codeberg`?

Anyway, like @Gusted said, it needs additional configuration and more tinkering to get things to look right. And I haven't found a source for a template yet.
With `nbconvert` (i.e. `dev:nbconvert-alpine`), I just used `python:3.8-alpine`, then installed `nbconvert`, and it was around 83.9MB (initially 46.8MB), so not that far from `pandoc/minimal`. I tested with this demo file on nbviewer, which, even without additionally installing `pandoc`, somehow renders the math parts quite nicely (unsure how).

Here's what I used:

Here's the `docker images` output

As for timing, with the same file, `pandoc/minimal` took a bit less than a second while `nbconvert` took around 3 seconds. Of course, one would probably have to set a file-size and time limit for conversion to prevent `nbconvert` from taking too long or using too much space.

FWIW, we will need this PR. Otherwise we couldn't use an iframe to render the HTML.
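The file-size and time limits suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name and limit values are made up, and a small Python one-liner stands in for the real converter so the example runs without docker, pandoc or nbconvert.

```python
import subprocess
import sys


def convert_with_limits(argv, data: bytes, max_bytes: int, timeout_s: float) -> bytes:
    """Run a converter on data, refusing large inputs and slow conversions."""
    if len(data) > max_bytes:
        raise ValueError(f"input is {len(data)} bytes, limit is {max_bytes}")
    # subprocess.run kills the child and raises TimeoutExpired on timeout.
    done = subprocess.run(
        argv, input=data, stdout=subprocess.PIPE, timeout=timeout_s, check=True
    )
    return done.stdout


# Stand-in "converter" that upper-cases stdin, so the sketch runs without
# docker, pandoc or nbconvert installed.
fake_converter = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
print(convert_with_limits(fake_converter, b"notebook", max_bytes=1024, timeout_s=5))
```

One caveat: when the converter runs in a container, killing the client process may not stop the container itself, so a real setup would likely also want limits enforced by the container runtime.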