The Problem
In the past year I've accumulated 500 or so URL shortcuts on my Mac OS X desktop (more precisely, files with the extension .webloc that contain URLs in their resource forks). I created them by dragging URLs from Safari or Firefox to the desktop, ostensibly for future consideration and collation.
So, I have a collection of 500 files, each of which contains a URL, and I want each of them to become an individual Web Archive entry in DEVONthink, rather than one of the other record types that could be appropriate (PDF, URL, and others).
A Solution
DEVONthink has no mass import feature that solves this problem exactly. Fortunately, and this is one of its outstanding features, DEVONthink is scriptable through AppleScript. In fact, there is a context-sensitive script, "Add web document to DEVONthink", which appears in the menu bar's Scripts menu when Safari or Firefox is in the foreground. The contents of this script (~/Library/Scripts/Applications/Firefox/Add web document to DEVONthink.scpt) reveal how to send a URL, title and referring URL to DEVONthink so that it creates a new record of the Web Archive type.
I then googled how to get the URL stored in a .webloc file as a string. I'm a Python programmer and wanted to see how to do it in Python rather than AppleScript, so I ended up at http://toxicsoftware.com/webloc-to-pinboard/. The function infoForWebloc() does the trick.
I hadn't used Automator before, so instead of tying together the AppleScript and Python directly, I decided to make this my first foray into Automator. I created a Folder Action on a new folder (~/Desktop/url2devonthink/) with two actions: Run Shell Script and Run AppleScript.
The Shell Script portion executes a Python program to extract the URL and retrieve a suitable Title.
Shell: /bin/bash
Pass input: as arguments
The contents of ~/bin/webloc2url.py are as follows. I've removed some optimizations for clarity and brevity:
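Here is a minimal sketch of what such a script can look like. It is an illustration, not the original listing, and it rests on several assumptions: that each .webloc is a property-list file with a "URL" key (older files that keep the URL only in the resource fork would need a different reader), that the file name is a good enough title, and that the helper names url_from_webloc and record_for_webloc are hypothetical. It prints one "url|||title" line per file for the next Automator action to pick up.

```python
import plistlib
import sys
from pathlib import Path

DELIMITER = "|||"  # the "triple-pipe" separator between URL and title


def url_from_webloc(path):
    """Return the URL stored under the "URL" key of a plist-style .webloc.

    Assumption: the file is an XML or binary property list. Files that
    store the URL only in the resource fork are not handled here.
    """
    with open(path, "rb") as handle:
        return plistlib.load(handle)["URL"]


def record_for_webloc(path):
    """Build one "url|||title" line, using the file name as the title."""
    title = Path(path).stem  # assumption: file name minus ".webloc"
    return url_from_webloc(path) + DELIMITER + title


if __name__ == "__main__":
    # With "Pass input: as arguments", the files arrive in sys.argv.
    for webloc_path in sys.argv[1:]:
        print(record_for_webloc(webloc_path))
```

In the Run Shell Script action this would be called with something like python ~/bin/webloc2url.py "$@", so that each file added to the folder becomes one output line.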
The AppleScript action passes that output on to DEVONthink for Web Archive creation. Note that I copied and pasted most of this from ~/Library/Scripts/Applications/Firefox/Add web document to DEVONthink.scpt.
Conclusion
Some caveats apply:
- The above was almost certainly not the best way of accomplishing the goal.
- The triple-pipe Cheap Hack (tm) was my lame workaround for not wanting to work out how multiple arguments are passed between stages in Automator's pipeline.
- The actual process was iterative and error-prone. Some URLs had become bad, some I didn't want to include, and so on.
- Since this was a one-time task, I have no need to create a more structured and reliable program than the above.
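For the curious, the triple-pipe hack boils down to joining two values into one line and splitting them again on the other side. The sketch below shows the round trip in Python. The receiving side in my workflow was AppleScript, so this is only an illustration of the idea, and join_record and split_record are hypothetical names.

```python
DELIMITER = "|||"


def join_record(url, title):
    """Pack a URL and a title into a single line of text."""
    return url + DELIMITER + title


def split_record(line):
    """Unpack a line produced by join_record.

    Splitting on the first delimiter only keeps the title intact even if
    the title itself happens to contain "|||".
    """
    url, _, title = line.partition(DELIMITER)
    return url, title
```

Putting the URL first matters here: a URL is unlikely to contain a literal "|||", while a page title might, so splitting on the first occurrence is the safer choice.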
