Last year seems to me like the year of the Zettelkästen. The seeds Sönke Ahrens planted a few years ago have reached their viral phase, and we’re seeing paid and free note-taking solutions crop up all over the landscape.
But this isn’t a Zettelkasten post. It isn’t even an Org-roam post. This post is about hacking Org Mode.
For I have itches.
I’m not taking notes all the time. Sometimes I’d just like to read them. Easy, I’ll just publish as HTML and point my web browser to them, right?
There are quite a few ways to convert Org to HTML. The natural one being, of course, Org Export. It tastes raw out of the box, but it certainly does the job: I get an HTML transcription of my input; external and media links are converted and work fine. With the whole Kasten defined as an Org Publish project, internal links work as well.
But I have Org Attachments. In Org Mode, attachments are files that are virtually copied inside an outline. Physically they’re copied to a UUID-derived subdirectory; a system property coupled with the `attachment:` link scheme maintains the abstraction.
The “default” way to handle them is to set up publishing on the attachment container directory with the `org-publish-attachment` publishing function.[2] This would locate any file in the attachment subdirectories and copy it with no transformation to the publication directory. The transcoder for attachment links is in the same mindset, so it all kind of just works.
There’s a catch, of course. What happens when I have multiple attachments, in separate outlines, with the same name?
As far as the Org Mode abstraction goes, that’s not a problem: the UUID directory they’re stored in will differ, and we can indeed access them separately from their respective contexts.
But when published with `org-publish-attachment`, they end up in the same directory. The latest one published clobbers all homonyms. Not too satisfactory. Especially when Org Mode had a solution in place to avoid those conflicts all along.
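The clash comes down to path arithmetic, which this minimal sketch illustrates. It assumes, as described above, that the destination is the publishing directory plus the file's base name; the `my-demo-*` helpers, the directory names and `/tmp/pub/` are all made up for illustration.

```elisp
;; Hypothetical helpers, for illustration only.
(defun my-demo-flat-destination (file pub-dir)
  "Destination of FILE when only its base name is kept under PUB-DIR."
  (expand-file-name (file-name-nondirectory file) pub-dir))

(defun my-demo-relative-destination (file pub-dir)
  "Destination of FILE when its relative directory is kept under PUB-DIR."
  (expand-file-name file pub-dir))

(let ((pub-dir "/tmp/pub/")
      ;; Two homonyms stored in made-up UUID-derived directories.
      (a "data/1a/2b3c/report.pdf")
      (b "data/4d/5e6f/report.pdf"))
  (list
   ;; Same destination: the second copy overwrites the first.
   (string= (my-demo-flat-destination a pub-dir)
            (my-demo-flat-destination b pub-dir))
   ;; Distinct destinations once the relative directory is kept.
   (string= (my-demo-relative-destination a pub-dir)
            (my-demo-relative-destination b pub-dir))))
;; => (t nil)
```

Keeping the relative directory is enough to tell the two files apart on the published side as well.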
So let’s port it.
I’d like them copied in the same kind of directory structure as they come from. I suppose I could just copy the entire attachment container directory, by Emacs or non-Emacs means. But that feels dirty: what if I only wanted to publish a single outline? What if I had removed an entire Org file for, say, privacy reasons? I wouldn’t want its attachments to still be published. And I’m going to need the attachment links to follow anyway. So, really, this means I can only reasonably handle them while I’m taking care of the Org files they’re declared in, since they’re the ones who have the knowledge of their container’s UUID.
So here’s how I’m currently doing it.
```elisp
(defun my-org-publish-attachment-relative (plist filepath pub-dir)
  "Like `org-publish-attachment', but keep the attachment's relative path."
  (when (file-name-absolute-p filepath)
    ;; is absolute path, make relative again
    (setq filepath (org-publish-file-relative-name filepath plist)))
  (let* ((path (file-name-directory filepath))
         (pub-dir-deep (concat (file-name-as-directory pub-dir) path)))
    (org-publish-attachment plist filepath pub-dir-deep)))

(defun my-org-attach-file-dir-of (element)
  "Helper: return the attachment directory of a provided Org Element."
  (let ((pos (org-element-property :begin element)))
    (file-name-as-directory
     (save-excursion
       (goto-char pos)
       (org-attach-dir)))))

(defun my-org-publish-attachment-filter (tree backend plist)
  "Tree filter to scan for attachments and publish them in the
same relative directory they come from.

Returns the unchanged tree.

To be used as a :filter-parse-tree in the `org-publish-project-alist'."
  (let ((pub-dir (plist-get plist :publishing-directory)))
    (org-element-map tree 'headline
      (lambda (hl)
        (and (member "ATTACH" (org-element-property :tags hl))
             (let* ((dir (my-org-attach-file-dir-of hl))
                    (files (org-attach-file-list dir)))
               (dolist (file files)
                 (my-org-publish-attachment-relative
                  plist
                  (expand-file-name file dir)
                  pub-dir))))))
    (org-element-map tree 'link
      (lambda (link)
        (when (string= "attachment" (org-element-property :type link))
          (let* ((dir (my-org-attach-file-dir-of link))
                 (file (org-element-property :path link)))
            (org-element-put-property link :type "file")
            (org-element-put-property link :path (concat dir file)))))))
  tree)
```
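For reference, wiring the filter into a project could look like the following. This is a minimal sketch rather than my actual configuration: the project name and both directories are placeholders, and it assumes the functions above are already loaded.

```elisp
(require 'ox-publish)

;; Hypothetical project: the name "kasten" and both paths are placeholders.
(add-to-list
 'org-publish-project-alist
 '("kasten"
   :base-directory "~/kasten/"
   :base-extension "org"
   :publishing-directory "~/public_html/kasten/"
   :publishing-function org-html-publish-to-html
   ;; Publish attachments and rewrite attachment links while each
   ;; Org file is being exported.
   :filter-parse-tree (my-org-publish-attachment-filter)))
```

With such an entry in place, `(org-publish "kasten")` should export the Org files and copy their attachments in the same pass, so no separate attachment project is needed.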
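A few things to note:
- I was originally defining a `drop-prefix` function to ensure relative links. From all the Elisp tracing I’ve been doing since then, I was pretty sure it duplicated some `org-publish` functionality, and planned to revisit it someday. Indeed, it duplicated `org-publish` functionality: the code above no longer defines it and calls `org-publish-file-relative-name` to get relative paths.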
- I’m handling attachment links by converting them to file links. This was not my initial plan. I wanted to refine the `org-html-link` transcoder to recognize attachment links and add the path prefix accordingly. After many failed attempts, I abandoned this path: the way the HTML exporter’s code is now, attachment links are detected by the transcoder as a custom protocol in its first switch, and their conversion is done in the attachment module by peeking at the export backend symbol, and having a case to convert HTML attachment exports to `<a>` tags. Seems reasonable? Well, it works. Until I need one of them to convert to `<img>` instead, that is. The image conversion is the next switch in the transcoder, so it never sees any attachment links. Smells like attachments were added to Org core after the HTML export, and that part never really got fused. Anyhow, doing it right would need real patches in Org, so hacking the substitution to file links will do for now.
One double-sized itch scratched.
Next up: I don’t always have inner links to attachments. Within Emacs, they can always be accessed through the attachment dispatcher, with the `o` class of end-functions for files, or `f` for the entire folder. How can I port that to HTML?
There’s always an `ATTACH` tag on headlines that have them; maybe I could enhance it to an HTML link?
Say, linking directly to the attachment if there’s a single one, and to the entire directory if there are multiple. Onus on the webserver to serve an index for it.
I’m not confident enough to touch the exporter’s info plist, so I’ll use a dynamic variable instead.
- The proper place to define it is where we have access to the UUID. We’re not guaranteed to have any more contents in the node than its headline, so that’s the place.[3] It’s the `org-html-headline` function.
- The place to use it is where we’re transcoding the tag name. It’s the `org-html--tags` function.
```elisp
(defvar my-attach-link nil
  "The attachment link target currently in scope, nil when none.")

(defun my-ox-html-attach-headline (old-func headline contents info &rest args)
  "Set `my-attach-link' to the attachment target location for the
scope of `org-html-headline'.

Intended as advice on `org-html-headline'."
  (if (member "ATTACH" (org-element-property :tags headline))
      (let* ((dir (my-org-attach-file-dir-of headline))
             (files (org-attach-file-list dir))
             (my-attach-link
              (concat "file:"
                      (org-publish-file-relative-name
                       (cond ((cdr files) dir)
                             ((concat dir (car files))))
                       info))))
        (apply old-func headline contents info args))
    (apply old-func headline contents info args)))

(advice-add 'org-html-headline :around #'my-ox-html-attach-headline)

(defun my-make-link (dest text)
  "Make an Org link element from a destination"
  (with-temp-buffer
    (save-excursion (insert (org-link-make-string dest text)))
    (org-element-link-parser)))

(defun my-ox-html-attach-tag (arglist)
  "Replace an ATTACH tag string with an HTML link.

Intended as advice on `org-html--tags'.

The link is to be setup in dynamic variable `my-attach-link' by
the `my-ox-html-attach-headline' advice."
  (let ((tags (car arglist))
        (info (cadr arglist)))
    (list (mapcar (lambda (tag)
                    (if (string= tag "ATTACH")
                        (org-html-link (my-make-link my-attach-link tag)
                                       "ATTACH"
                                       (cons '(:html-inline-image-rules nil) info))
                      tag))
                  tags)
          info)))

(advice-add 'org-html--tags :filter-args #'my-ox-html-attach-tag)
```
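The dynamic-variable handoff used above can be seen in isolation in this minimal sketch. The `my-demo-*` names are hypothetical: they stand in for the headline advice (which binds the variable) and the tag advice (which reads it), and the link target is made up.

```elisp
(defvar my-demo-target nil
  "Hypothetical stand-in for `my-attach-link'.")

(defun my-demo-reader ()
  "Stand-in for the tag advice: read whatever is currently bound."
  my-demo-target)

(defun my-demo-binder (value)
  "Stand-in for the headline advice: bind VALUE, then call deeper code."
  ;; `defvar' makes the variable special, so this `let' is a dynamic
  ;; binding, visible to every function called within its extent.
  (let ((my-demo-target value))
    (my-demo-reader)))

(my-demo-binder "file:data/1a/2b3c/") ; => "file:data/1a/2b3c/"
(my-demo-reader)                      ; => nil, the binding is gone
```

The binding lives exactly as long as the call it wraps, which is why the value never leaks from one headline to the next.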
- I’m forcing `org-html-link` to generate the link. If I didn’t, Org Export would detect the link as “descriptionless” and convert it to an inline image, which would look real weird. My links aren’t descriptionless, though: I’m going out of my way to ensure they’re tagged “ATTACH” at every level. I suspect that’s a bug in `org-export-inline-image-p`, which advertises only applying to links without a description, but checks `org-element-contents` on them. AFAICT, that’s `nil` on link elements.
- There’s probably a more Emacsy way of updating the information in-place. Most of my coding has been in pure functions as of late, I need a bit of an adjustment when I come back to Elisp.
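One way to poke at the `org-element-contents` suspicion is to build a link that clearly has a description and look at what the parser hands back. This is only a sketch: `my-demo-link-contents` is a hypothetical helper, the link target is made up, and it assumes an Org recent enough to provide `org-link-make-string`.

```elisp
(require 'org)
(require 'org-element)

(defun my-demo-link-contents (dest text)
  "Hypothetical helper: parse a link to DEST described by TEXT.
Return the `org-element-contents' of the resulting link element."
  (with-temp-buffer
    (org-mode)
    ;; Insert the link, leaving point at its start for the parser.
    (save-excursion (insert (org-link-make-string dest text)))
    (org-element-contents (org-element-link-parser))))

;; Evaluate and inspect the result by hand.
(my-demo-link-contents "file:photo.png" "ATTACH")
```

If this comes back `nil` even though the link carries a description, that would match the behaviour described above.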
Second itch scratched.
What’s blocking me from Nirvana?
On the Org core front: I’m going to want inline audio pretty soon. I could tackle it from both ends, so I’ll need a bit more reflecting before I start.
On the Org-roam front: I’m obviously going to need the backlinks. So that’s still missing, but I don’t think I’ll need to personally hack as much: a lot has already been done out there and I’ll likely be able to pick one. Neil Mather’s implementation is particularly nifty, I’ll probably take a lot of inspiration there.
In the meantime, the code is available as a gist. Feel free to help improve my rusty Elisp!
Update 2021-03-23: It broke with an org-mode update, a good opportunity to sanitize the difference between relative paths and link destinations.
[2] Despite its flaws, there’s really no obligation to do it that way. There’s just no support (that I found) for anything else.
[3] You may wonder, as I did, why it can’t be done directly at the tag transcoding point. The reason is that the tag transcoding point is not provided with enough information to recover the attachment properties: all it has is the tags string and the global export plist; no buffer positioning or parse tree.