Moving stored content from JSON to HTML #46

Open
opened 4 months ago by Alamantus · 17 comments
Owner

I've been thinking a lot about changing the way Feather Wiki stores its content so that it can be viewed more easily without JavaScript. What I had in mind was storing the data as basic HTML instead of as JSON. Now I know that parsing HTML is not ideal (https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454), but I think I know a good way to do it.

The problem is that no matter how I slice it, the result will be no less than a full extra kilobyte. Plus add however many extra bytes for backwards compatibility. I'm inclined to say that this is ok for accessibility's sake, but it goes against the "small as possible" tenet. It's a good idea, but it does work fine without it, only excluding devices and people who disable JavaScript. But then that's part of the problem—I don't want my tool to exclude anyone if I can avoid it.

So what do you think? Is having your data in fully-readable HTML even without JavaScript worthy enough to merit a non-negligible jump in size? Is 65KB a reasonable upper limit to aim for when 63KB is the current size? Writing it out like this makes it obvious that yes, of course 65KB is fine, but what do you all think?

Alamantus added the discussion label 4 months ago

I don't have a strong opinion about it, but I would note that TiddlyWiki recently (in 5.2.0) switched in the other direction, i.e. from html/dom storage to json inside a script tag.

https://github.com/Jermolene/TiddlyWiki5/pull/5708 has all the details and related discussion.

So the motivations for making such a big change in TiddlyWiki might be arguments in favor of sticking with json for Feather Wiki.
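
For anyone unfamiliar with the "JSON inside a script tag" approach mentioned above, here is a minimal sketch of the general idea. The element id `wiki-data`, the helper names `embedJson` and `extractJson`, and the data shape are all assumptions made up for illustration; none of them are taken from TiddlyWiki or Feather Wiki.

```javascript
// Hypothetical layout: wiki data lives in <script id="wiki-data" type="application/json">.
function embedJson(data) {
  // Escape "<" so a "</script>" inside page content cannot close the tag early.
  const json = JSON.stringify(data).replace(/</g, '\\u003c');
  return `<script id="wiki-data" type="application/json">${json}</script>`;
}

function extractJson(html) {
  // Works only because the wrapper is written by embedJson and is fully predictable.
  const match = html.match(
    /<script id="wiki-data" type="application\/json">([\s\S]*?)<\/script>/
  );
  return match ? JSON.parse(match[1]) : null;
}

// Made-up sample data, including content that would break a naive embed.
const wikiData = {
  name: 'Demo',
  pages: [{ slug: 'home', name: 'Home', content: '<p>Hi</p><script>x</script>' }],
};
const savedFile = `<!DOCTYPE html><html><body>${embedJson(wikiData)}</body></html>`;
const roundTrip = extractJson(savedFile);
console.log(roundTrip.pages[0].content);
```

The data is easy for code to read back, but a browser with JavaScript disabled shows nothing from it, which is the trade-off this issue is about.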


Well... I guess the idea is to keep the biggest version below the psychological threshold of 64K? Because otherwise it seems silly to save a kilobyte at the expense of huge accessibility improvements. That wouldn't just make FeatherWiki accessible, and indexable by search engines, it would also make documents future-proof, since it will be trivial to extract the text from them with a text editor in case of need.

A bigger question is, what about pages stored as Markdown?

Poster
Owner

@sbaird That's interesting that they're moving toward JSON, but it does seem like a good idea for them. I cracked open a TiddlyWiki file I had from 2 years ago and their format is weird as hell.

@nosycat Yeah, it really is just the psychological threshold. There's no real upper bound beyond being generally resistant to adding to the file size. But you're absolutely right, making this change provides a ton of great benefits (aside from making it slightly harder to code with because you'd have to parse the HTML every time you want to grab the data).

Markdown is, however, the eternal question because I have no idea what the correct approach would be. I was considering two options: 1) convert the Markdown to HTML and then find a way to reverse it back into Markdown on load (which has a lot of obvious problems, plus the HTML-to-Markdown converter would just keep adding to the file size) or 2) just display the Markdown as plain text. Since the whole point of this thought process is to allow non-JavaScript platforms to read the data, I can't store it as Markdown and display it as HTML; it's all or nothing. Any opinions on that or other ideas on handling Markdown in this situation would be appreciated!


Meh, come to think of it Markdown is plain text and can be stored as such. But if this storage scheme complicates the programming so much, I don't know. Maybe a separate exporter is the better idea after all. If anything, FeatherWiki lends itself to that much better than its inspiration.
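
To picture the "Markdown is plain text and can be stored as such" option, here is a minimal sketch, assuming each Markdown page is written into a `<pre>` element with its special characters escaped. The function names and the `md` class are hypothetical and not part of Feather Wiki.

```javascript
// Escape the characters that would otherwise be read as HTML markup.
function escapeForHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Reverse the escaping; "&amp;" must be handled last to avoid double-unescaping.
function unescapeFromHtml(html) {
  return html.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&');
}

// Store Markdown so a browser without JavaScript shows it as readable plain text.
function storeMarkdown(markdown) {
  return `<pre class="md">${escapeForHtml(markdown)}</pre>`;
}

// Recover the exact Markdown source when the app loads with JavaScript.
function loadMarkdown(stored) {
  const match = stored.match(/^<pre class="md">([\s\S]*)<\/pre>$/);
  return match ? unescapeFromHtml(match[1]) : null;
}

const mdSource = '# Notes\n\nUse `a < b && b > c` and <b>raw HTML</b> freely.';
const mdStored = storeMarkdown(mdSource);
console.log(mdStored);
console.log(loadMarkdown(mdStored) === mdSource);
```

No Markdown-to-HTML or HTML-to-Markdown converter is needed for this, at the cost of non-JavaScript readers seeing the raw Markdown syntax.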

Poster
Owner

Yeah, storing the data as HTML does complicate programming a bit, but it's not that bad as long as the stored data is consistent and easy to identify. That being said, the current JSON storage is much easier to work with from a code perspective.

I was thinking of other options for displaying wiki data without JavaScript as well, namely outputting a simple index of the page titles. This is what TiddlyWiki used to do in their noscript tag, and it is a bit problematic to only display a message about requiring JavaScript. But it again reduces non-JavaScript usefulness to nothing but what they could see if only they could use JavaScript.

Maybe that's not too terrible because the Feather Wiki app itself already demands a browser with ECMAScript 2015 support? I don't know, but maybe that's better than nothing.


Seeing the table of contents is a lot better than "just enable JS". A link to a more accessible version, that the author could set, would be even more useful.

Poster
Owner

A custom link to a more accessible version is a great idea! That coupled with the table of contents should definitely be sufficient.

If anyone sees this discussion after this point, know that I'm currently leaning toward this option, but I am still open to hearing other ideas and opinions!
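
A minimal sketch of what that option could look like: a `noscript` block listing the page titles, plus an optional author-supplied link to a more accessible version. The function names, the `pages` shape, and the example URL are assumptions for illustration only, not Feather Wiki's actual code.

```javascript
// Escape text before placing it in HTML content or attribute values.
function escapeHtmlText(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the fallback shown only when JavaScript is unavailable.
// accessibleUrl is optional; pass nothing to omit the link.
function buildNoscriptFallback(pages, accessibleUrl) {
  const items = pages
    .map((page) => `<li>${escapeHtmlText(page.name)}</li>`)
    .join('');
  const link = accessibleUrl
    ? `<p><a href="${escapeHtmlText(accessibleUrl)}">View an accessible version</a></p>`
    : '';
  return `<noscript><p>This wiki needs JavaScript to show its pages.</p>${link}<ul>${items}</ul></noscript>`;
}

const fallback = buildNoscriptFallback(
  [{ name: 'Home' }, { name: 'Tom & Jerry' }],
  'https://example.org/static-copy'
);
console.log(fallback);
```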


I'm thinking: what if you left it as JSON, leaving FeatherWiki light and (more importantly) making your efforts as easy-going as possible programming/maintenance-wise, but had some kind of featherwiki "viewer" to translate that json into plain html?

So some kind of single-file HTML thingy we can download, that we can view in a web browser, and upload a Feather Wiki instance to in order to view the JSON as HTML?

Not quite sure I'm wording that right. Might not be thinking it right? Hellish to do? Meh, a thought for the giggles of it.
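
A rough sketch of what the core of such a viewer might do, assuming the wiki's JSON has already been read out of the uploaded file and looks like `{ name, pages: [{ slug, name, content }] }` with `content` already in HTML. That data shape and every name below are guesses for illustration, not Feather Wiki's documented format.

```javascript
// Escape text before placing it in HTML content or attribute values.
function escapeViewerText(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn the (assumed) wiki JSON into one plain HTML document with a table of contents.
function renderStaticWiki(wiki) {
  const toc = wiki.pages
    .map((p) => `<li><a href="#${escapeViewerText(p.slug)}">${escapeViewerText(p.name)}</a></li>`)
    .join('\n');
  const articles = wiki.pages
    .map((p) => `<article id="${escapeViewerText(p.slug)}">\n<h2>${escapeViewerText(p.name)}</h2>\n${p.content}\n</article>`)
    .join('\n');
  return [
    '<!DOCTYPE html>',
    '<html>',
    `<head><meta charset="utf-8"><title>${escapeViewerText(wiki.name)}</title></head>`,
    '<body>',
    `<h1>${escapeViewerText(wiki.name)}</h1>`,
    `<ul>\n${toc}\n</ul>`,
    articles,
    '</body>',
    '</html>',
  ].join('\n');
}

const exampleWiki = {
  name: 'My Notes',
  pages: [
    { slug: 'getting-started', name: 'Getting Started', content: '<p>Welcome.</p>' },
    { slug: 'faq', name: 'FAQ', content: '<p>Questions & answers.</p>' },
  ],
};
const staticPage = renderStaticWiki(exampleWiki);
console.log(staticPage);
```

Since this lives outside the wiki file, it costs the app itself no bytes.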


My opinion:

  • I think staying with JSON in a script tag is the most efficient way. I agree with @sbaird.

  • Storing the data in HTML tags raises the question of how it would be displayed on non-JS browsers, and whether the added CSS would make the file heavier for all users.

  • I find that compatibility with non-JS browsers is important even if 99% of people don't use them, because it's very handy in certain situations and for power users. I think those who use FeatherWiki and TiddlyWiki are a breed of power users.

  • The JSON save format needs to be well documented so that it's easier to write parsers.

  • @nosycat A separate accessible version is a good idea... But @Alamantus, having different versions feels like fragmentation, as we have a few variants already.

I have a solution that could be a workaround:

As a content / wiki creator, I could decide if the output should be backwards compatible with browsers that don't support JS.

  • For this, there could be a checkbox that states "Add Static Version" beside the save wiki button.

  • That way, when the JS code saves the wiki, it should also call a function that renders the wiki to HTML and just appends it to the bottom of the file.

(Now that could be put in a noscript tag too, but it would not work on crippled browsers with JS enabled, so just appending static HTML works best.)

  • As a result, the saved wiki would be both JS-compatible and compatible with browsers that have no JS.

  • The downside is that it comes with a double size cost when it is enabled.

But I feel that is okay because 99% of people would not need the compat save mode. And those who need it (let's assume university folks who read some manuals in FeatherWiki in a terminal or an ebook reader) could always check the option during save.

I have added a FeatherWiki file with static HTML appended. Try normal view and with JS disabled.

Things to note: The increase in file size on the wire can be offset if the reverse proxy or server has gzip or Brotli compression enabled, which is usually the case.
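
As a rough illustration of the proposal above, here is a sketch of a save function with an optional static copy, followed by a quick gzip comparison using Node's built-in `zlib` module. The `saveWiki` signature, the `wiki-data` and `static` ids, and the sample pages are all hypothetical; the real save code may look nothing like this.

```javascript
const zlib = require('zlib');

// Render every page as plain HTML (content is assumed to already be HTML).
function renderPages(pages) {
  return pages
    .map((p) => `<article id="${p.slug}"><h1>${p.name}</h1>${p.content}</article>`)
    .join('');
}

// Build the saved file; addStatic mirrors the proposed "Add Static Version" checkbox.
function saveWiki(appHtml, wiki, addStatic) {
  const data = JSON.stringify(wiki).replace(/</g, '\\u003c');
  let output = `${appHtml}<script id="wiki-data" type="application/json">${data}</script>`;
  if (addStatic) {
    // Appended at the bottom of the file rather than inside a noscript tag.
    output += `<div id="static">${renderPages(wiki.pages)}</div>`;
  }
  return output;
}

const sampleWiki = {
  name: 'Field Manual',
  pages: [
    { slug: 'setup', name: 'Setup', content: '<p>Unpack the device, connect the serial cable, and power it from any USB port before opening the wiki in a browser of your choice.</p>' },
    { slug: 'usage', name: 'Usage', content: '<p>Every page can be edited in place, and saving writes the whole wiki back out as a single self-contained file that you can copy anywhere.</p>' },
  ],
};

const gzipSize = (text) => zlib.gzipSync(Buffer.from(text, 'utf8')).length;
const plainSave = saveWiki('<!DOCTYPE html><title>Wiki</title>', sampleWiki, false);
const staticSave = saveWiki('<!DOCTYPE html><title>Wiki</title>', sampleWiki, true);

console.log('raw bytes: ', Buffer.byteLength(plainSave), '->', Buffer.byteLength(staticSave));
console.log('gzip bytes:', gzipSize(plainSave), '->', gzipSize(staticSave));
```

Because the static copy repeats text that already appears in the JSON, a compressor can encode much of it as references to the earlier copy, so the compressed growth should be smaller than the raw growth. Actual numbers depend on the content and on the server's compression settings.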


@trholding I simply meant a link the wiki author can enter in the settings, that presumably points at a more accessible version of the page, if one exists. Not yet another thing for FeatherWiki to handle.

Poster
Owner

@CharlieJV That's a good instinct because I do have plans/hopes to make a Feather Wiki tool that will process the content in a variety of ways, including a static HTML version and a plain text version. Ideally I (or someone else) will be able to make a number of different external tools that can make Feather Wiki do a lot of things that won't fit into the app itself.

@trholding I appreciate your input as always! I do intend to write up better documentation for hacking Feather Wiki by the end of the month, and the JSON structure is definitely one of the things that'll need to be included there.

I liked the idea to add an option, so I've gone ahead and added a checkbox to the settings page that allows you to include a static copy of the site's content in the HTML, and I've made the default behavior be a listing of the page titles and their links. After all the changes (and simplifying the HTML of the settings page itself), it's only 1 KB for this much-needed feature, which is acceptable to me.


@Alamantus you are the man! Much 💯 ! You just implemented that? The best opensource dev ever! Like people suggest something, you make it happen in less than a day! WOW!

Now I'll grab a copy and study the changes.

Poster
Owner

@trholding It's in the dev branch. I'm working on updating the website for the release (hopefully next week, but I have a lot to write), but you can build the new version from the dev branch if you want to see it in action! I'm sure I'll have a couple more changes and bug fixes to do before I release, too, so don't go too wild making content based on dev!

EDIT: Also, please don't come to expect such a quick turnaround. I have a habit of working on my side project in fits and starts and have been known to leave things not done for quite some time 😅 Community attention helps keep my attention really well, though! I'm glad you like the project and stick with the conversations 💖


@Alamantus I know that feeling of leaving things unfinished and working in fits. First, there is always the need to make one's bread and butter, which is a priority. I make hardware/embedded, and these are not the best times. Second, for me, I had Asperger's syndrome in childhood, which spilled into adulthood, so I don't have a choice; it's automatic.

I really like FeatherWiki, it's like therapy! It's like the beauty of 80's systems/tech: no bloat, pure function and pleasant surprise, like magic, a feeling of being ahead in time.

I hope to contribute to FeatherWiki with code. I am more of a systems-side programmer and I'm not too confident with JS, except for C -> JS things...

Maybe this is not the place to discuss but... FYI:

I like FeatherWiki so much that I started building an affordable (very cheap) personal Mesh Info/Wiki hardware MVP despite already working on a single board computer.

I was so thrilled that I made this announcement: https://twitter.com/AMICABoard/status/1535641510032510976

I'll make sure that if it gains commercial success, small but fair royalties would be paid out to you per each device sold. I'm sure that my chill co-founder is of the same opinion.

Also you get to name the Hardware. Choose from: Peacock, Phoenix, (Mythical Bird),(Your Bird)

Poster
Owner

My particular reason for working that way is ADHD. I can work wonders when I'm hyperfixated on a project, but then do absolutely nothing when it wears off. A blessing and a curse.

That sounds really neat! I'm absolutely hopeless when it comes to hardware, so a lot of that goes straight over my head, but I know about mesh networks, and I think they're super cool! If your idea does have commercial success, tiny royalties would be awesome! I clearly didn't make Feather Wiki for money, but I always appreciate being appreciated 😄

As for the name, I'd need to get a better understanding of what the hardware actually is and would do before I could choose the best bird for its name :)


> My particular reason for working that way is ADHD. I can work wonders when I'm hyperfixated on a project, but then do absolutely nothing when it wears off. A blessing and a curse.

Ditto, plus some other cognitive disorder cognitive psychology could not figure out.

Everything, and I do mean absolutely everything, being interesting to me and grabbing my attention, my game is to constantly somehow make a priority always the most interesting thing. Usually by making it a race to get it done as quick as I can before any distraction hammers me. SQUIRREL!

To be able to bounce around between a handful of equally high-priority interesting things as any one reaches a milestone marker, that helps.

Rock'n roll !


> As for the name, I'd need to get a better understanding of what the hardware actually is and would do before I could choose the best bird for its name :)

@Alamantus

If you want to see how small and useful FeatherWiki is...

Being small is very useful, those who made fun of FeatherWiki and its small size will probably understand its importance now!

Just a demo for the time being:

https://www.youtube-nocookie.com/embed/S8tMfcs4zcY

I did the experiments on a cheap generic microcontroller board.

I did this to prove that it is feasible to put FeatherWiki on a chip. Thanks to open source and a bit of hacking, it was possible!

The code + FeatherWiki as UI is only 85KB, with 35KB free space left!

Using Parser : Raw BINARY
Size : 84680
Interface serial_posix: 57600 8E1
Version : 0x22
Option 1 : 0x00
Option 2 : 0x00
Device ID : 0x0410 (STM32F10xxx Medium-density)

  • RAM : Up to 20KiB (512b reserved by bootloader)
  • Flash : Up to 128KiB (size first sector: 4x1024)
  • Option RAM : 16b
  • System RAM : 2KiB

Next step is to design the hardware - could re-use stuff from some of my designs.

TLDR:

This generic board: 32bit, 120KB Flash, 20KB RAM, 75MHz
Apollo Guidance Computer: 16bit, 72KB ROM, 4KB RAM, 2MHz

Feather could fit into Apollo Guidance class computers of the 60's :)
https://en.wikipedia.org/wiki/Apollo_Guidance_Computer

FeatherWiki is the ONLY wiki that fits inside the most constrained chips!(TM)

Reference: Alamantus/FeatherWiki#46