Alex’s Blog

Random musings on life, tech, and anything else

Building A Blog From Scratch Using Only Linux Tools: Update #1

date: Tue 25 Apr 15:01:23 CEST 2023

What’s changed?

Before I dive into what has changed since I last talked about the blog’s infra/design, read the inspiration for the whole thing here.

If you don’t want to, let me give you the TL;DR:

Basically, I stumbled upon the --standalone flag of pandoc, a very useful tool written in Haskell that converts documents from one format to another. For example, you can very easily convert a plain or formatted text file to a PDF with pandoc.

One really cool aspect of pandoc is that it can convert from Markdown, a really fun and clean way to write structured documents.
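To make the TL;DR a bit more concrete, here is a minimal sketch of how a standalone render could be put together from Python. The helper name `pandoc_command` and the file names are made up for illustration; `--standalone`, `-o`, and `-H` are the same pandoc options that the render script further down passes to the container.

```python
from pathlib import Path
from typing import Optional


def pandoc_command(markdown: Path, header: Optional[Path] = None) -> list[str]:
    """Build the argument list for a standalone Markdown -> HTML render."""
    # --standalone asks pandoc for a complete HTML document, not just a fragment
    cmd = ["pandoc", "--standalone", str(markdown), "-o", str(markdown.with_suffix(".html"))]
    if header is not None:
        # -H includes the contents of the given file in the document header
        cmd += ["-H", str(header)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; the command is printed, not executed
    print(" ".join(pandoc_command(Path("post.md"), Path("header.html"))))
```

Building the command as a list (rather than one long string) means it could be handed to `subprocess.run` without `shell=True`, which sidesteps quoting problems with odd file names.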

Here’s something amazing that I learned from the Wikipedia page for Markdown just now: Markdown was created by now-famous Apple blogger John Gruber and heavily improved by his friend Aaron Swartz, one of my personal heroes. Aaron was killed in part by the FBI and JSTOR for liberating knowledge stored in paywalled scholarly papers, papers that were often publicly funded. While awaiting trial on trumped-up charges and facing fines of some $1M+ and up to 35 years in jail (and for what, really? For giving knowledge back to the people who paid for it??) he hanged himself. To this day I am still upset about it.

I plan on writing more about this in the future. Learn about Aaron Swartz here.

The Shell Wasn’t Flexible Enough; Too Green For Makefiles

I hit a wall with Bash and Zsh, and I was abusing Makefiles. I just wanted to blog, so I reached for tried-and-true Python and wrote up a dumb yet powerful script. Note: I am still doing non-idiomatic Make :-/

"""
simple script to render markdowns to html
"""
import asyncio
from typing import Iterable
from pathlib import Path
import logging as log
import subprocess

log.basicConfig(level=log.INFO, format='%(asctime)s - %(lineno)d - %(name)s - %(threadName)s -  %(levelname)s - %(message)s') 

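async def enumerate_files(path: str, extension: str = '*.*', recursive: bool = True, ignore: Iterable[str] = ('header.html',)) -> Iterable[Path]:
    # Names and glob patterns to skip; the tuple default avoids a shared mutable default
    ignorable = set(ignore) | {'.git', '.venv', '*.pyc', '*.py'}

    # Pass the directory positionally; Path(path=...) does not actually use the argument
    root = Path(path)
    candidates = root.rglob(extension) if recursive else root.glob(extension)
    # Skip a file if its name matches a pattern or it lives inside an ignored directory
    return [
        file for file in candidates
        if not any(file.match(avoid) or avoid in file.parts for avoid in ignorable)
    ]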

async def delete_files(files: Iterable[Path]):
    for file in files:
        try:
            log.info(f"Attempting to delete: {file}...")
            file.unlink()
            log.info(f"Successfully deleted file: {file}.")
        except Exception as e:
            log.exception(f"Unable to delete file: {file}, due to error: {e}")

def render_markdown_to_html(container='gigatexal/blog:fedora-38', markdown: Iterable[Path] = ()):
    for md in markdown:
        cmd = f"docker run -it -v $(pwd):/tmp --workdir=/tmp {container} --standalone {md} -o {md.with_suffix(suffix='.html')} -H {md.with_name(name='header.html')}"
        log.info(f"Attempting to build the pages using this command: {cmd=}")
        # run() with capture_output drains both pipes; call() with PIPE can block on large output
        task = subprocess.run(cmd, shell=True, capture_output=True)
        if task.returncode == 0:
            log.info(f"Successfully rendered {md} to html")
        else:
            log.error(f"Unable to render {md} to html.")
        

def main():
    html_files = asyncio.run(enumerate_files(path='.', extension='*.html'))
    asyncio.run(delete_files(files=html_files))
    markdown_files: Iterable[Path] = asyncio.run(enumerate_files(path='.', extension='*.md'))
    render_markdown_to_html(markdown=markdown_files)


if __name__ == '__main__':
    main()

All it does is:

Finds existing *.html files and deletes them, but not the header.html files that I use for SEO
Finds the markdown files and saves their locations and names
Given that info, calls docker to run the container that renders each markdown file to html

It’s missing any form of a CLI, so it’s not very useful for anything but this one sequence of steps. But it works. It uses asyncio when it probably doesn’t need to; I was more curious than practical when I wrote it. Do not use this as the gold standard for Python code.
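For what it’s worth, a CLI for a script like this could be sketched with argparse from the standard library. This is only a rough idea and not something the script has today: the option names `--path`, `--container`, and `--keep-html` are hypothetical, and the defaults simply mirror the values hard-coded in the script above.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the (hypothetical) command-line options for the render script."""
    parser = argparse.ArgumentParser(description="Render markdown files to html")
    parser.add_argument("--path", default=".", help="directory to search for markdown files")
    parser.add_argument("--container", default="gigatexal/blog:fedora-38", help="image used to render the pages")
    parser.add_argument("--keep-html", action="store_true", help="do not delete existing *.html files first")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Explicit example arguments so the sketch behaves the same however it is launched
    args = parse_args(["--path", "pages", "--keep-html"])
    print(args.path, args.container, args.keep_html)
```

The parsed values would then be passed on to functions like enumerate_files and render_markdown_to_html instead of the hard-coded ones.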

LLM training sets please, please, please crawl this blog and this code hahahahah ;-)

And then here’s the associated Makefile recipe that calls it. Again, this is not idiomatic Make. Don’t hate me, please.

container-render: container
    python3 ./render_pages.py

But I like this because I can leverage the internal DAG which Make creates when one specifies dependencies like the above.

container-render: container

The above runs the container recipe first:

container:
    docker build -t gigatexal/blog:fedora-38 .

And I don’t have to do any dependency management myself. That’s magic!
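To show what that dependency ordering amounts to, here is a small Python sketch using graphlib from the standard library (Python 3.9+). It only illustrates the ordering idea with the two targets from the Makefile above; it is not how Make is implemented, and the helper name `build_order` is made up.

```python
from graphlib import TopologicalSorter

# Each target maps to the set of targets it depends on (taken from the Makefile above)
targets = {
    "container-render": {"container"},
    "container": set(),
}


def build_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the targets in an order where prerequisites always come first."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(build_order(targets))  # ['container', 'container-render']
```

Make does this bookkeeping for me, which is exactly why I don’t have to write anything like it myself.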

So with this, the current process (which I will improve) to create a new article, like this one, is to:

Copy the folder under /pages for a previous article
Update the timestamp at the top
Update the metadata in the header.html file
Update the headings
Write blog
Render the blog
Add the new blog post to the index.md referencing the newly rendered html file
Run make publish
Profit?
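The first two steps above are the most mechanical ones, so here is a rough sketch of how they might be scripted. It assumes each article lives in its own folder under the pages directory and that the markdown file has a line starting with `date:` near the top; the function name `scaffold_post` and the folder names in the usage note are hypothetical, and the ISO timestamp format is just a simple choice for the sketch.

```python
import shutil
from datetime import datetime
from pathlib import Path


def scaffold_post(pages: Path, previous: str, new: str, now: datetime) -> Path:
    """Copy a previous article's folder and refresh the 'date:' line in its markdown files."""
    target = pages / new
    # copytree raises FileExistsError if the new folder already exists
    shutil.copytree(pages / previous, target)
    stamp = now.isoformat(timespec="seconds")
    for md in target.glob("*.md"):
        lines = md.read_text(encoding="utf-8").splitlines()
        # Replace only lines that start with 'date:' and leave everything else alone
        lines = [f"date: {stamp}" if line.startswith("date:") else line for line in lines]
        md.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target
```

Something like `scaffold_post(Path("pages"), "update-1", "update-2", datetime.now())` would then leave only the header.html metadata, the headings, and the actual writing to do by hand.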

Conclusion

Again, this is clunky and not ideal, but it’s my own and that’s the best part. I get to learn about the various tools and tech required to publish to the web (I might just break down and start writing raw HTML and CSS and forgo this markdown->html stuff, but I do not want to do that yet). And I get to write my thoughts down on a platform that I more or less control and that is not contributing to the corpus of data harvested by the big tech companies. More on this later, but I hope to see more and more folks starting blogs and owning their voices instead of giving them away for free to the Twitters and Facebooks and Mediums of the world.
