A Microformats 2 parser in Haskell https://unrelenting.technology/mf2/
Val Packett eb55df250d release 1.0.2.2 2 months ago
.github/workflows Only build executables when flags are enabled 2 months ago
executable Add dark mode to webapp 2 months ago
library/Data/Microformats2 Add aeson 2.x support 2 months ago
test-suite Project stuff 2 months ago
.ghci safe haskell and stuff 6 years ago
.gitignore Add stack metadata 8 years ago
CODE_OF_CONDUCT.md release 1.0.1.7 6 years ago
README.md Project stuff 2 months ago
Setup.hs release 1.0.1.4 7 years ago
UNLICENSE unlicense, coc 7 years ago
microformats2-parser.cabal release 1.0.2.2 2 months ago
stack.yaml Add aeson 2.x support 2 months ago
stack.yaml.lock Add aeson 2.x support 2 months ago

README.md


microformats2-parser

Microformats 2 parser for Haskell! #IndieWeb

  • parses items, rels, rel-urls
  • resolves relative URLs (with support for the <base> tag), including URLs inside the HTML of e-* properties
  • parses the value-class-pattern, including date and time normalization
  • handles malformed HTML (the actual HTML parser is tagstream-conduit)
  • can also convert to JF2
  • high performance
  • extensively tested
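
To make the relative URL bullet concrete, here is a minimal sketch of the standard resolution order, written with network-uri only. It is not the parser's internal code. The helper name resolve is hypothetical, and the fallback to the base on an unparseable reference is an assumption made for this sketch.

```haskell
import Network.URI (URI, parseURI, parseURIReference, relativeTo)

-- Hypothetical helper: resolve a possibly relative reference against a base URI.
-- An unparseable reference falls back to the base unchanged (a choice made for this sketch).
resolve :: URI -> String -> URI
resolve base ref = maybe base (`relativeTo` base) (parseURIReference ref)

main :: IO ()
main = case parseURI "https://where.i.got/that/page/from/" of
  Nothing -> putStrLn "invalid page URI"
  Just pageUri -> do
    -- <base href="base/"> is itself resolved against the page address
    let effectiveBase = resolve pageUri "base/"
    -- <link rel=micropub href='micropub'> is then resolved against the effective base
    print (resolve effectiveBase "micropub")
    -- prints https://where.i.got/that/page/from/base/micropub
```

The two-step order matters: a relative <base> is resolved against the page address first, and only then are the links resolved against the result.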

Also check out http-link-header because you often need to read links from the Link header!

DEMO PAGE

Usage

See the API docs on Hackage for more info. Here's a quick overview:

{-# LANGUAGE OverloadedStrings #-}

import Data.Microformats2.Parser
import Data.Default
import Network.URI

-- Parse with the default settings (no base URI).
parseMf2 def $ documentRoot $ parseLBS "<body><p class=h-entry><h1 class=p-name>Yay!</h1></p></body>"

-- Parse with a base URI so that relative addresses (the <base> tag, the micropub link) can be resolved.
parseMf2 (def { baseUri = parseURI "https://where.i.got/that/page/from/" }) $ documentRoot $ parseLBS "<body><base href=\"base/\"><link rel=micropub href='micropub'><p class=h-entry><h1 class=p-name>Yay!</h1></p></body>"

def (from Data.Default) is the default configuration.

The configuration includes:

  • htmlMode, an HTML parsing mode (Unsafe | Escape | Sanitize)
  • baseUri, the Maybe URI of the address you retrieved the HTML from, used for resolving relative addresses. You should set it.
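
As a rough sketch of setting both fields at once, the program below builds a settings value and prints the parsed JSON. It assumes the settings record type is named Mf2ParserSettings and that parseLBS and documentRoot are re-exported by Data.Microformats2.Parser, as the import list in the overview suggests. The h-card markup is made up for illustration. It needs the aeson, bytestring, data-default and network-uri packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (encode)
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Default (def)
import Data.Microformats2.Parser
import Network.URI (parseURI)

-- Settings for a page fetched from a known address.
-- The type name Mf2ParserSettings is an assumption; check the Hackage docs.
settings :: Mf2ParserSettings
settings = def
  { htmlMode = Sanitize -- one of Unsafe | Escape | Sanitize
  , baseUri  = parseURI "https://where.i.got/that/page/from/" -- a Maybe URI
  }

main :: IO ()
main =
  -- Encode the resulting Aeson Value and print it as JSON.
  BL.putStrLn . encode . parseMf2 settings . documentRoot $
    parseLBS "<body><a class=\"h-card\" href=\"/about\">Me</a></body>"
```

Because baseUri is a Maybe URI, the result of parseURI can be passed in directly. An address that fails to parse simply leaves the base unset.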

parseMf2 returns an Aeson Value structured like canonical microformats2 JSON. lens-aeson is a good way to navigate it.
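
For instance, a minimal sketch of pulling every name property out of the result with lens-aeson might look like this. The function itemNames is a hypothetical helper, not part of the library. It assumes the usual canonical layout of an "items" array whose entries hold a "properties" object of arrays, and needs the lens and lens-aeson packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((^..))
import Data.Aeson (Value)
import Data.Aeson.Lens (key, values, _String)
import Data.Default (def)
import Data.Microformats2.Parser
import Data.Text (Text)

-- Hypothetical helper: collect the string values of the "name" property
-- of every top-level item in a canonical microformats2 JSON value.
itemNames :: Value -> [Text]
itemNames v =
  v ^.. key "items" . values . key "properties" . key "name" . values . _String

main :: IO ()
main =
  print . itemNames . parseMf2 def . documentRoot $
    parseLBS "<body><div class=\"h-entry\"><h1 class=\"p-name\">Yay!</h1></div></body>"
```

The traversal yields an empty list instead of failing when a key is missing, which suits microformats data where most properties are optional.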

Development

Use stack to build.
Use ghci to run tests quickly with :test (see the .ghci file).

$ stack build

$ stack test

$ stack ghci

License

This is free and unencumbered software released into the public domain.
For more information, please refer to the UNLICENSE file or unlicense.org.