Hacking on Org Mode


2021-02-08T12:18:42+01:00
org mode emacs emacs lisp Zettelkasten

Last year seems to me like the year of the Zettelkästen. The seeds Sönke Ahrens planted a few years ago have reached their viral phase, and we’re seeing paid and free note-taking solutions crop up all over the landscape.

As a long-time Org Mode user, I was kind of pre-committed to Org-roam for my general notes.[1] So that’s what I’ve gravitated towards these past months.

But this isn’t a Zettelkasten post. It isn’t even an Org-roam post. This post is about hacking Org Mode.

For I have itches.

I’m not taking notes all the time. Sometimes I’d just like to read them. Easy, I’ll just publish as HTML and point my web browser to them, right?

There are quite a few ways to convert Org to HTML. The natural one being, of course, Org Export. It tastes raw out of the box, but it certainly does the job: I get an HTML transcription of my input; external and media links are converted and work fine. With the whole Kasten defined as an Org Publish project, internal links work as well.

But I have Org Attachments. In Org Mode, attachments are files that are virtually copied inside an outline. Physically they’re copied to a UUID-derived subdirectory; a system property coupled with the attachment: link scheme maintains the abstraction.
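To make the UUID-derived layout concrete, here is a minimal sketch of how an ID can map to a subdirectory. It mirrors what I understand Org's default `org-attach-id-uuid-folder-format` to do (first two characters, then the rest). The function name `my-sketch-attach-subdir` and the sample ID are made up for illustration.

```elisp
(defun my-sketch-attach-subdir (id)
  "Return the subdirectory an attachment with ID would live in.
Assumption: the same two-character split as Org's default UUID
folder format."
  (concat (substring id 0 2) "/" (substring id 2) "/"))

;; Hypothetical ID, as it would appear in a headline's ID property.
(my-sketch-attach-subdir "8f1a2b3c-0000-4000-8000-123456789abc")
;; ⇒ "8f/1a2b3c-0000-4000-8000-123456789abc/"
```

Two headlines with different IDs therefore get different directories, whatever their attachments are called.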

The “default” way to handle them is to set up publishing on the attachment container directory with the org-publish-attachment publishing function.[2] This would locate every file in the attachment subdirectories and copy it, untransformed, to the publication directory. The transcoder for attachment links is in the same mindset, so it all kind of just works.

There’s a catch, of course. What happens when I have multiple attachments, in separate outlines, with the same name?

As far as the Org Mode abstraction goes, that’s not a problem: the UUID directory they’re stored in will differ, and we can indeed access them separately from their respective contexts.

But when published with org-publish-attachment, they end up in the same directory. The latest one published clobbers all homonyms. Not too satisfactory. Especially when Org Mode had a solution in place to avoid those conflicts all along.
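The clobbering is easy to reproduce on paper. The sketch below assumes that flat publishing keeps only each attachment's file name, and reports the names that would collide. `my-sketch-flat-collisions` and the sample paths are hypothetical.

```elisp
(defun my-sketch-flat-collisions (paths)
  "Return file names occurring more than once in PATHS.
Directories are dropped first, as a flat copy would do."
  (let (seen dupes)
    (dolist (path paths (nreverse dupes))
      (let ((name (file-name-nondirectory path)))
        (if (member name seen)
            (unless (member name dupes) (push name dupes))
          (push name seen))))))

(my-sketch-flat-collisions
 '("data/8f/1a2b/diagram.png"
   "data/c4/9d0e/diagram.png"
   "data/c4/9d0e/notes.pdf"))
;; ⇒ ("diagram.png")
```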

So let’s port it.

I’d like them copied into the same kind of directory structure as they come from. I suppose I could just copy the entire attachment container directory, by Emacs or non-Emacs means. But that feels dirty: what if I only wanted to publish a single outline? What if I had removed an entire Org file for, say, privacy reasons? I wouldn’t want its attachments to still be published. And I’m going to need the attachment links to follow anyway. So, really, this means I can only reasonably handle them while I’m taking care of the Org files they’re declared in, since those are the ones that know their container’s UUID.
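As a rough picture of the target layout, this sketch computes where an attachment would land if its path relative to the notes directory were mirrored under the publishing directory. It only does path arithmetic; `my-sketch-published-dir` and all paths are placeholders, not part of Org.

```elisp
(defun my-sketch-published-dir (attachment base-dir pub-dir)
  "Return the directory under PUB-DIR that mirrors ATTACHMENT's
location relative to BASE-DIR."
  (let ((relative (file-relative-name attachment base-dir)))
    (concat (file-name-as-directory pub-dir)
            (or (file-name-directory relative) ""))))

(my-sketch-published-dir "/notes/data/8f/1a2b/diagram.png"
                         "/notes/"
                         "/var/www/notes")
;; ⇒ "/var/www/notes/data/8f/1a2b/"
```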

So here’s how I’m currently doing it.

(defun my-drop-prefix (prefix string)
  "Remove PREFIX from the beginning of STRING.

If the prefix doesn't match, `error' is called."
  ;; `string-prefix-p' also copes with a STRING shorter than PREFIX,
  ;; where a bare `substring' would signal args-out-of-range.
  (if (string-prefix-p prefix string)
      (substring string (length prefix))
    (error "%S is not a prefix of %S" prefix string)))

(defun my-org-publish-attachment-relative (plist filepath pub-dir)
  "Like `org-publish-attachment', but keep the attachment's relative path."
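  (when (file-name-absolute-p filepath)
    ;; Absolute path: make it relative to the current directory again.
    ;; Both sides are expanded so a "~/" abbreviation can't break the match.
    (setq filepath
          (my-drop-prefix (expand-file-name default-directory)
                          (expand-file-name filepath))))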
  (let* ((path (file-name-directory filepath))
         (pub-dir-deep (concat (file-name-as-directory pub-dir) path)))
    (org-publish-attachment plist filepath pub-dir-deep)))

(defun my-org-attach-file-dir-of (element)
  "Helper: return the attachment directory of a provided Org Element."
  (let ((pos (org-element-property :begin element)))
    (file-name-as-directory
     (save-excursion (goto-char pos) (org-attach-dir)))))

(defun my-org-publish-attachment-filter (tree backend plist)
  "Tree filter to scan for attachments and publish them in the
same relative directory they come from.  Returns the unchanged
tree.

To be used as a :filter-parse-tree in the
`org-publish-project-alist'."
  (let ((pub-dir (plist-get plist :publishing-directory)))
    (org-element-map tree 'headline
      (lambda (hl)
        (and (member "ATTACH" (org-element-property :tags hl))
             (let* ((dir (my-org-attach-file-dir-of hl))
                    (files (org-attach-file-list dir)))
               (dolist (file files)
                 (my-org-publish-attachment-relative
                  plist (concat dir file) pub-dir))))))
    (org-element-map tree 'link
      (lambda (link)
        (when (string= "attachment" (org-element-property :type link))
          (let* ((dir (my-org-attach-file-dir-of link))
                 (file (org-element-property :path link)))
            (org-element-put-property link :type "file")
            (org-element-put-property link :path (concat dir file)))))))
  tree)
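For reference, wiring the filter into a project could look like the fragment below. The project name and both directories are placeholders, and I'm assuming the export machinery picks up `:filter-parse-tree` from the project plist as a list of functions, as the docstring above suggests.

```elisp
;; Hypothetical project: the name and paths are placeholders.
;; Note that `setq' replaces any projects already defined.
(setq org-publish-project-alist
      '(("kasten"
         :base-directory "~/kasten/"
         :base-extension "org"
         :publishing-directory "~/public_html/kasten/"
         :publishing-function org-html-publish-to-html
         ;; Called on each file's parse tree before transcoding.
         :filter-parse-tree (my-org-publish-attachment-filter))))
```

With this in place, publishing the project with `org-publish` should copy each attachment while its declaring Org file is being exported.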

A few things to note:

One double-sized itch scratched.

Next up: I don’t always have inner links to attachments. Within Emacs, they can always be accessed through the attachment dispatcher, with the o class of end-functions for files, or f for the entire folder. How can I port that to HTML?

Headlines with attachments always carry an ATTACH tag; maybe I could turn it into an HTML link?

Say, linking directly to the attachment if there’s a single one, and to the entire directory if there are several. Onus on the webserver to serve an index for it.
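The rule is small enough to sketch on its own. `my-sketch-attach-target` is a hypothetical helper, and returning nil for an empty directory is an assumption of this sketch rather than something decided above.

```elisp
(defun my-sketch-attach-target (dir files)
  "Return what an ATTACH tag should link to, given DIR and its FILES.
One file: link to it.  Several: link to DIR.  None: nil (an
assumption of this sketch)."
  (cond ((null files) nil)
        ((cdr files) dir)
        (t (concat dir (car files)))))

(my-sketch-attach-target "data/8f/1a2b/" '("diagram.png"))
;; ⇒ "data/8f/1a2b/diagram.png"
(my-sketch-attach-target "data/8f/1a2b/" '("diagram.png" "notes.pdf"))
;; ⇒ "data/8f/1a2b/"
```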

I’m not confident enough to touch the exporter’s info plist, so I’ll use a dynamic variable instead.
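In isolation, the dynamic-variable trick looks like this: an outer function binds a special variable with `let`, and an inner function that was never passed the value can still read it for the duration of the call. All the `my-sketch-` names are hypothetical stand-ins, and the HTML is deliberately naive.

```elisp
(defvar my-sketch-link nil
  "Link target for the headline being rendered, or nil.")

(defun my-sketch-render-tag (tag)
  "Render TAG, as a link when `my-sketch-link' holds a target."
  (if my-sketch-link
      (format "<a href=\"%s\">%s</a>" my-sketch-link tag)
    tag))

(defun my-sketch-render-headline (link tag)
  "Render TAG with `my-sketch-link' bound to LINK during the call."
  ;; `defvar' made the variable special, so this `let' binds it
  ;; dynamically: callees see the value, and it is restored on exit.
  (let ((my-sketch-link link))
    (my-sketch-render-tag tag)))

(my-sketch-render-headline "data/8f/1a2b/" "ATTACH")
;; ⇒ "<a href=\"data/8f/1a2b/\">ATTACH</a>"
(my-sketch-render-tag "ATTACH")
;; ⇒ "ATTACH"
```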

(defvar my-attach-link nil
  "The attachment link target currently in scope, nil when none.")

(defun my-ox-html-attach-headline (old-func headline &rest args)
  "Set `my-attach-link' to the attachment target location for the
scope of `org-html-headline'.

Intended as advice on `org-html-headline'."
  (if (member "ATTACH" (org-element-property :tags headline))
      (let* ((dir (my-org-attach-file-dir-of headline))
             (files (org-attach-file-list dir))
             (my-attach-link (cond ((cdr files) dir)
                                   ((concat dir (car files))))))
        (apply old-func headline args))
    (apply old-func headline args)))
(advice-add 'org-html-headline :around #'my-ox-html-attach-headline)

(defun my-make-link (dest text)
  "Make an Org link element pointing to DEST, with TEXT as description."
  (with-temp-buffer
    (save-excursion (insert (org-link-make-string dest text)))
    (org-element-link-parser)))

(defun my-ox-html-attach-tag (arglist)
  "Replace an ATTACH tag string with an HTML link.

Intended as advice on `org-html--tags'.

The link is to be set up in dynamic variable `my-attach-link' by
the `my-ox-html-attach-headline' advice."
  (let ((tags (car arglist))
        (info (cadr arglist)))
    (list
     (mapcar (lambda (tag)
               (if (string= tag "ATTACH")
                   (org-html-link (my-make-link my-attach-link tag)
                                  "ATTACH"
                                  ;; Prepend the override as a key/value pair;
                                  ;; `plist-get' returns the first match.
                                  (append '(:html-inline-image-rules nil) info))
                 tag))
             tags)
     info)))
(advice-add 'org-html--tags :filter-args #'my-ox-html-attach-tag)
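Since `:filter-args` advice is less common than `:around`, here is a toy example of its contract: the advice receives the whole argument list as a single list and must return the list to call the original with. The functions are hypothetical stand-ins for a tag formatter; `advice-remove` undoes the change.

```elisp
(require 'nadvice)

(defun my-sketch-join-tags (tags info)
  "Hypothetical stand-in for a tag formatter: join TAGS with colons."
  (ignore info)
  (mapconcat #'identity tags ":"))

(defun my-sketch-upcase-attach (arglist)
  "Take ARGLIST as one list and return the replacement argument list."
  (let ((tags (car arglist))
        (info (cadr arglist)))
    (list (mapcar (lambda (tag)
                    (if (string= tag "attach") "ATTACH" tag))
                  tags)
          info)))

(advice-add 'my-sketch-join-tags :filter-args #'my-sketch-upcase-attach)
(my-sketch-join-tags '("attach" "work") nil)
;; ⇒ "ATTACH:work"

(advice-remove 'my-sketch-join-tags #'my-sketch-upcase-attach)
(my-sketch-join-tags '("attach" "work") nil)
;; ⇒ "attach:work"
```

The same `advice-remove` call, pointed at the real functions, is how the two advices above can be switched off again.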

Noteworthy:

Second itch scratched.

What’s blocking me from Nirvana?

On the Org core front: I’m going to want inline audio pretty soon. I could tackle it from both ends, so I’ll need a bit more reflecting before I start.

On the Org-roam front: I’m obviously going to need the backlinks. So that’s still missing, but I don’t think I’ll need to personally hack as much: a lot has already been done out there and I’ll likely be able to pick one. Neil Mather’s implementation is particularly nifty; I’ll probably take a lot of inspiration from it.

In the meantime, the code is available as a gist. Feel free to help improve my rusty Elisp!


  1. I do use at least two other systems for specialized and/or shared notes: Neuron and the venerable MediaWiki.

  2. Given its flaws, there’s really no obligation to do it that way. There’s just no support (that I found) for anything else.

  3. You may wonder, as I did, why it can’t be done directly at the tag transcoding point. The reason is that the tag transcoding point is not provided with enough information to recover the attachment properties: all it has is the tags string and the global export plist; no buffer positioning or parse tree.