Hugo-to-Gemini Markdown converter

Overview

This repo holds a converter from Hugo Markdown posts to text/gemini (also called Gemtext in this README). The converter is meant to make entering Project Gemini, the alternate web, somewhat simpler for people who use Hugo.

The renderer is somewhat hasty and is NOT meant to convert every possible Markdown construct to Gemtext (that is impossible, since Gemtext is much simpler than Markdown). Instead, it handles a selected subset, enough for conveying your mind in Markdown.

The renderer uses the gomarkdown library for parsing Markdown. gomarkdown has a few quirks at this time, the most notable being that it cannot parse links/images nested inside other links.

gmnhg

This program converts Hugo Markdown content files from content/ in accordance with templates found in gmnhg/ to the output dir. It also copies static files from static/ to the output dir.

For more details about the rendering process, see the doc attached to the program.

Usage of gmnhg:
  -output string
        output directory (will be created if missing) (default "output/")
  -working string
        working directory (defaults to current directory)

md2gmn

This program reads Markdown input from either a text file (if -f filename is given) or stdin. The resulting Gemtext goes to stdout.

Usage of md2gmn:
  -f string
        input file

md2gmn is mainly made to facilitate testing the Gemtext renderer but can be used as a standalone program as well.

License

This program is redistributed under the terms and conditions of the GNU General Public License, more specifically version 3 of the License. For details, see COPYING.

Comments
  • Implement a generic links extractor

    Before this, links were only scraped from paragraphs and rendered as a block after the parent paragraph. This change replaces that logic with a generic links extractor that recursively collects every link from any parent node, including footnotes, blockquotes, and lists.

    Fixes #17 and #23.
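
    The recursive collection described above could look roughly like the sketch below. It is only an illustration: the Node type, its fields, and collectLinks are hypothetical stand-ins, not gomarkdown's real AST or the renderer's actual code.

```go
package main

import "fmt"

// Node is a hypothetical, simplified AST node used only for this sketch;
// gomarkdown's real node types are different.
type Node struct {
	Kind     string // e.g. "paragraph", "blockquote", "list", "link", "text"
	Dest     string // link destination, set only when Kind == "link"
	Text     string // literal text, set only when Kind == "text"
	Children []*Node
}

// textOf concatenates the literal text found under n.
func textOf(n *Node) string {
	s := n.Text
	for _, c := range n.Children {
		s += textOf(c)
	}
	return s
}

// collectLinks walks the subtree rooted at n depth-first and returns one
// Gemtext link line per link node, in document order.
func collectLinks(n *Node) []string {
	var out []string
	if n.Kind == "link" {
		out = append(out, fmt.Sprintf("=> %s %s", n.Dest, textOf(n)))
	}
	for _, c := range n.Children {
		out = append(out, collectLinks(c)...)
	}
	return out
}

func main() {
	quote := &Node{Kind: "blockquote", Children: []*Node{
		{Kind: "paragraph", Children: []*Node{
			{Kind: "text", Text: "See "},
			{Kind: "link", Dest: "gemini://example.org/", Children: []*Node{
				{Kind: "text", Text: "the capsule"},
			}},
		}},
	}}
	for _, line := range collectLinks(quote) {
		fmt.Println(line)
	}
}
```

    Because the walk does not care what kind of node the parent is, the same function works for paragraphs, blockquotes, lists, and footnotes alike.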

    enhancement renderer 
    opened by tdemin 10
  • Image issues

    Images on my site don't seem to work in any clients, either on my live site or when testing locally. (I've just started dithering the images on my site, but this was also an issue before that.) I'm starting to wonder if gmnhg could be doing something to the images during the copy process that is messing with the ability of clients to display them? The thing is, the images display perfectly fine if I open them from the filesystem.

    If you want to see an example, check out gemini://mntn.xyz/test-markdown-syntax and look under "Images." Or for a direct link: gemini://mntn.xyz/test-markdown-syntax/PIXNIO-2545657-5760x3840.gif

    Server returns code 20 (ok) when the image is loaded, so it's not that. If I download the image using a client like gmni and view it externally, it displays just fine. If I try to view it in Lagrange or Ariane, it doesn't work. Extremely weird behavior.

    Edit: only thing I can think of is that maybe gmnhg is chopping off a byte or something like that. Something that more forgiving image viewers might be able to handle but which the clients cannot.

    question gmnhg 
    opened by mntn-xyz 7
  • Add RSS support for #26

    This relies on the site config to provide a custom GeminiBaseURL setting, as well as the site title, copyright, and language. Hopefully this could be useful elsewhere. I have only tested it with a TOML config so far but the others should work. I have also only tested with the default RSS template, I haven't tried to override it yet... so there may be bugs in that feature.

    I decided that adding a "GeminiBaseURL" config option is probably the only solution besides a command line switch or extra config file. I've noticed that lots of people run their Gemini site on a separate subdomain from their main site.

    Note: I made a change to topLevelPosts, and added an empty string key that holds everything. This gives both the RSS feed and the main index a way to easily access all posts. I had to update the main index template to avoid a change in behavior, but this was helpful for keeping the RSS generation code relatively clean.

    opened by mntn-xyz 7
  • Task list formatting

    These should probably be formatted as regular lists; right now they are collapsed into a single line.

    Markdown:

    Example task list
    - [x] Theme website
    - [x] Write formatting test page
    - [ ] Fix the bugs
    

    Output:

    Example task list - [x] Theme website - [x] Write formatting test page - [ ] Fix the bugs
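
    One possible way to get the list formatting asked for above is sketched below. The Task type and renderTasks are hypothetical names for illustration, and keeping the checkbox as plain text is an assumption, since Gemtext has no checkbox syntax of its own.

```go
package main

import (
	"fmt"
	"strings"
)

// Task is a hypothetical representation of one parsed task list item.
type Task struct {
	Done bool
	Text string
}

// renderTasks emits one Gemtext list line per task and keeps the checkbox
// as plain text inside the list item.
func renderTasks(tasks []Task) string {
	var b strings.Builder
	for _, t := range tasks {
		box := "[ ]"
		if t.Done {
			box = "[x]"
		}
		fmt.Fprintf(&b, "* %s %s\n", box, t.Text)
	}
	return b.String()
}

func main() {
	fmt.Print(renderTasks([]Task{
		{Done: true, Text: "Theme website"},
		{Done: true, Text: "Write formatting test page"},
		{Done: false, Text: "Fix the bugs"},
	}))
}
```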
    
    opened by mntn-xyz 7
  • Automatic processing (dithering) of images/preview images

    Perhaps this is better for a plugin, or another program entirely, but I got this idea after seeing the Gemini "imageboard" someone just made (iich.space/img) and I couldn't help but share it.

    What if there were an option to either generate dithered "preview" images (or to replace the images entirely) via this library: https://github.com/makeworld-the-better-one/dither

    I'm more partial to generating preview images, because some things like technical diagrams wouldn't work so well with dithering. But I can imagine that people running their Gemini sites on low bandwidth/low power servers might want the option to shrink all their images.

    question 
    opened by mntn-xyz 4
  • Blockquotes

    Just wanted to note a couple of issues:

    • Multiline blockquotes don't render properly; renderer should be able to handle line breaks and blank quoted lines
    • HTML tags in blockquotes are not stripped (the typical Hugo markdown example page uses <br> and <cite> which are present in the output)
    bug question 
    opened by mntn-xyz 4
  • HTML tags in blockquotes are not stripped

    Initially discovered in #5.

    Despite (Renderer).paragraph() utilizing (mostly) the same logic as (Renderer).blockquote(), raw HTML is stripped from text paragraphs, but not from blockquotes. Appears to be a gomarkdown issue.

    Blockquote with an HTML line break

    bug gomarkdown 
    opened by tdemin 3
  • Tables do not render

    Tables don't currently render. My suggestion is to print them inside a preformatted text block:

    | Syntax      | Description |
    | ----------- | ----------- |
    | Header      | Title       |
    | Paragraph   | Text        |
    

    This should be legible in all clients and should translate more or less directly from Markdown. Using preformatted text will ensure that extra-long tables don't wrap.

    One approach would be something like this:

    • Iterate over the table, counting characters in each cell to determine the maximum width of each column.
    • Render each row, padding each cell to the maximum width of the column.
    • While rendering rows, add in the header separator, borders, and border spacing as needed. The number of dashes in the header separator should be the same as the maximum width of the column.
    • To handle cell alignment, either do nothing (left aligned), move the cell padding to the left side (right aligned), or distribute cell padding evenly on left and right (center aligned).
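
    The steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the renderer's code: Align, pad, renderRow, and renderTable are hypothetical names, the table arrives as plain string rows rather than gomarkdown nodes, and width is counted in runes, so double-width characters would still misalign.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// Align is a hypothetical per-column alignment, not a gomarkdown type.
type Align int

const (
	Left Align = iota
	Right
	Center
)

// pad fits s into width columns, placing the padding according to a.
func pad(s string, width int, a Align) string {
	gap := width - utf8.RuneCountInString(s)
	if gap <= 0 {
		return s
	}
	switch a {
	case Right:
		return strings.Repeat(" ", gap) + s
	case Center:
		left := gap / 2
		return strings.Repeat(" ", left) + s + strings.Repeat(" ", gap-left)
	default:
		return s + strings.Repeat(" ", gap)
	}
}

// renderRow pads every cell to its column width and adds the borders.
func renderRow(cells []string, widths []int, aligns []Align) string {
	parts := make([]string, len(widths))
	for i, w := range widths {
		cell, a := "", Left
		if i < len(cells) {
			cell = cells[i]
		}
		if i < len(aligns) {
			a = aligns[i]
		}
		parts[i] = pad(cell, w, a)
	}
	return "| " + strings.Join(parts, " | ") + " |\n"
}

// renderTable treats rows[0] as the header and wraps the result in a
// Gemtext preformatted block. Cells beyond the header's column count
// are ignored.
func renderTable(rows [][]string, aligns []Align) string {
	if len(rows) == 0 {
		return ""
	}
	// Pass 1: the widest cell in each column sets that column's width.
	widths := make([]int, len(rows[0]))
	for _, row := range rows {
		for i, cell := range row {
			if n := utf8.RuneCountInString(cell); i < len(widths) && n > widths[i] {
				widths[i] = n
			}
		}
	}
	fence := strings.Repeat("`", 3) + "\n" // preformatted toggle line
	var b strings.Builder
	b.WriteString(fence)
	// Pass 2: render rows, adding the dashed separator after the header.
	for r, row := range rows {
		b.WriteString(renderRow(row, widths, aligns))
		if r == 0 {
			sep := make([]string, len(widths))
			for i, w := range widths {
				sep[i] = strings.Repeat("-", w)
			}
			b.WriteString(renderRow(sep, widths, nil))
		}
	}
	b.WriteString(fence)
	return b.String()
}

func main() {
	rows := [][]string{
		{"Syntax", "Description"},
		{"Header", "Title"},
		{"Paragraph", "Text"},
	}
	fmt.Print(renderTable(rows, []Align{Left, Right}))
}
```

    Missing alignment entries fall back to left alignment, which matches the "do nothing" case in the list above.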

    Technically gomarkdown supports colspan > 1, but I haven't seen this in Hugo markdown and I'm not sure it's even supported. An initial implementation could probably ignore this for simplicity. Hypothetical tables with colspan > 1 would still render, they would just be misaligned.

    enhancement 
    opened by mntn-xyz 3
  • H4-H6 should be handled per Gemini spec

    First of all, thanks for making this! I plan on using it in a project.

    The Gemini spec only supports three levels of headings: "Headings are limited to a single line and start with either one, two or three # symbols followed by one mandatory space character"

    The extra #s generated by gmnhg seem to be inconsistently handled by some clients I've tried. It seems that the generated headings should be limited to three levels, although it would be nice to include some additional markup to help people distinguish H3 from H4-H6. It could be as simple as this:

    ### # Heading 4
    ### ## Heading 5
    ### ### Heading 6
    

    Or maybe another character would be better?

    ### + Heading 4
    ### ++ Heading 5
    ### +++ Heading 6
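
    Either variant comes down to clamping the level at three and moving the remainder into the heading text. A minimal sketch of the first variant follows; the function name heading is hypothetical, and swapping the "#" marker for "+" would give the second variant.

```go
package main

import (
	"fmt"
	"strings"
)

// heading renders a Markdown heading of the given level (1-6) as a single
// Gemtext heading line. Levels 4-6 become "###" followed by a marker made
// of one, two, or three extra "#" characters.
func heading(level int, text string) string {
	if level < 1 {
		level = 1
	}
	if level > 6 {
		level = 6
	}
	if level <= 3 {
		return strings.Repeat("#", level) + " " + text
	}
	return "### " + strings.Repeat("#", level-3) + " " + text
}

func main() {
	for level := 1; level <= 6; level++ {
		fmt.Println(heading(level, fmt.Sprintf("Heading %d", level)))
	}
}
```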
    
    bug 
    opened by mntn-xyz 3
  • Make front matter data accessible to index pages

    gmnhg currently assumes users will type index page title and other metadata unrelated to content right in the Markdown file, essentially controlling rendering by themselves. This makes a user unable to use _index.md as the single source of index content for both the Gemini and the Web site.

    This is partially why _index.gmi.md was a thing at all: without a way to render metadata that would usually be controlled by the template, the user would need an additional copy of the index page with the metadata tossed in.

    enhancement gmnhg 
    opened by tdemin 2
  • Config parsing

    Based on #31 patch but I made this separate.

    This now uses the default baseUrl and title from the Hugo configuration, unless a gmnhg section is defined and an override baseUrl and/or title is supplied there.

    If you use a protocol-relative base URL like //mysite.com then you don't need to define an override. But some people host their Gemini sites on different domains or subdomains, so this supports that without adding extra junk to the main Hugo configuration namespace.

    I didn't see a reason to add overrides for copyright or language code, but that's trivial to add if anyone ever needs it.
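
    The override rule described above amounts to "use the gmnhg value when present, otherwise the Hugo one". A minimal sketch, assuming the config has already been parsed into structs; SiteConfig, Overrides, and effective are hypothetical names, not the types used in the patch.

```go
package main

import "fmt"

// Overrides is a hypothetical view of the optional gmnhg config section.
type Overrides struct {
	BaseURL string
	Title   string
}

// SiteConfig is a hypothetical, trimmed-down view of a parsed Hugo config.
type SiteConfig struct {
	BaseURL string
	Title   string
	Gmnhg   *Overrides // nil when no gmnhg section is defined
}

// effective returns the base URL and title to use for the Gemini site:
// the Hugo defaults, unless a non-empty override is supplied.
func effective(c SiteConfig) (baseURL, title string) {
	baseURL, title = c.BaseURL, c.Title
	if c.Gmnhg != nil {
		if c.Gmnhg.BaseURL != "" {
			baseURL = c.Gmnhg.BaseURL
		}
		if c.Gmnhg.Title != "" {
			title = c.Gmnhg.Title
		}
	}
	return baseURL, title
}

func main() {
	cfg := SiteConfig{
		BaseURL: "https://mysite.com/",
		Title:   "My site",
		Gmnhg:   &Overrides{BaseURL: "gemini://gemini.mysite.com/"},
	}
	baseURL, title := effective(cfg)
	fmt.Println(baseURL, title)
}
```

    Each field is overridden independently, so a site can change only the base URL and keep the Hugo title.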

    Closes #29.

    opened by mntn-xyz 2
  • WIP: Allow using any keys in site configuration and post metadata

    gmnhg currently only handles a handful of the most useful metadata fields, meaning users who want to store additional metadata for use in post templates (tags? taxonomy?) are out of luck.

    This makes gmnhg handle any reasonable property stored in site configuration (with support for gmnhg-specific overrides in the [gmnhg] section) as well as any post metadata property.

    enhancement gmnhg 
    opened by tdemin 0
  • Allow using _index.md on no _index.gmi.md

    This makes gmnhg use _index.md as directory / top-level index page content source in case _index.gmi.md is not present.
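
    The fallback order can be illustrated with the sketch below. It is an assumption-laden illustration rather than the code in this change: pickIndex and the "content" directory name in main are hypothetical, and the existence check is passed in as a function so the selection logic stays independent of the filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pickIndex returns the first index file name for which exists reports
// true, preferring the Gemini-specific file over the shared one.
func pickIndex(exists func(name string) bool) (string, bool) {
	for _, name := range []string{"_index.gmi.md", "_index.md"} {
		if exists(name) {
			return name, true
		}
	}
	return "", false
}

func main() {
	dir := "content" // assumed content directory for this example
	onDisk := func(name string) bool {
		info, err := os.Stat(filepath.Join(dir, name))
		return err == nil && !info.IsDir()
	}
	if name, ok := pickIndex(onDisk); ok {
		fmt.Println("index source:", filepath.Join(dir, name))
	} else {
		fmt.Println("no index page found in", dir)
	}
}
```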

    As the renderer has matured a lot (namely lists of links are now handled a lot better), this should allow writing almost zero content specific to gmnhg.

    The docs are updated accordingly.

    gmnhg rfc 
    opened by tdemin 1
  • Optionally generate pages for taxonomies (tags, categories, etc)

    https://gohugo.io/content-management/taxonomies/

    I'd suggest duplicating the Hugo defaults, and using the defaults (tags/categories) OR the list from [taxonomies] if present, plus any exclusions from [disableKinds]. This could probably be handled in the same way as baseUrl and title, where overrides can be established in the [gmnhg] section.

    Taxonomy index templates would be passed a list of pages with metadata, just like an index page.

    I'll probably work on a PR for this when I have some free time.

    enhancement gmnhg 
    opened by mntn-xyz 1
  • Problem with bullets in blockquotes

    I was quoting a post where someone used bullets and it concatenated all the bulleted text onto a single line.

    This is an example of a broken blockquote

    • Item 1
    • Item 2
    • Item 3

    Becomes

    This is an example of a broken blockquote

    Item 1Item 2Item 3

    I'll look at it when I have some free time, I'm sure it's something about the handling of lists inside of blockquotes.

    bug renderer 
    opened by mntn-xyz 1
  • add option to render links as they are on markdown

    Basically, when it creates the links at the bottom, it breaks the view and format. I have a long list of software I use, like:

    • Desktop:
    • Web:
    • rss feeds:

    Putting the links at the bottom of the page is super annoying for someone who just wants to go to a section and click on the software link there, as I have it in Markdown.

    As an example, this is the result from parsing this: https://git.sr.ht/~rek2/dotfiles/tree/main/item/README.md to this: gemini://rek2.hispagatos.org/software.gmi

    In the meantime I found this other tool that does exactly what I need: https://github.com/makeworld-the-better-one/md2gemini

    I would rather use a compiled tool written in Go, Rust, C, etc., so if you ever do add this feature I will switch. Thanks.

    enhancement renderer 
    opened by r3k2 1
  • Print full post content in RSS description

    As gmnhg right now doesn't generate post summary, using post content as RSS post description source seems like our best bet. Right now gmnhg will generate RSS with empty descriptions.

    gmnhg rfc 
    opened by tdemin 0
Releases(v0.4.2)
Owner
Timur Demin
Student at Ufa State Petroleum Technological University