Tuesday, December 29, 2009

webloc shortcuts into DEVONthink Web Archives - python + automator + applescript

Recently I've been moving forward with my year-or-so-long quest to organize my thoughts, readings and writings, along with their associated notes, in a searchable format. I want that format to help me decide the order in which to attack my ideas, and to present me with relevant entries as I work through executing them. To store and retrieve my documents I've decided to give DEVONthink a shot. A lot of searching, and a little bit of evaluating, went into this decision. I'm not 100% sold that it will meet my needs, but it seems to fit them pretty nicely. More on that another time, if I get around to it.

The Problem

In the past year I've accumulated 500 or so URL shortcuts on my Mac OS X desktop (more precisely, files with the extension .webloc that contain URLs in their resource forks). I created these by dragging bookmark-like strings from Safari or Firefox to the desktop, ostensibly for future consideration and collation.


So, I have a collection of 500 files, each of which contains a URL, and I want each of them to become an individual Web Archive entry in DEVONthink, rather than one of the other record types that could also be appropriate (PDF, URL and others).

A Solution

DEVONthink has no mass import feature that solves this problem exactly. Fortunately, and this is one of its outstanding features, DEVONthink is scriptable through AppleScript. In fact, there is a context-sensitive script, "Add web document to DEVONthink", which appears in the menu bar's Scripts menu when Safari or Firefox is in the foreground. The contents of this script (~/Library/Scripts/Applications/Firefox/Add web document to DEVONthink.scpt) reveal how to send a URL, title and referring URL to DEVONthink so that it creates a new record of the Web Archive type.

I then googled how to get the URL stored in a .webloc file as a string. I'm a Python programmer, and wanted to see how to do it in Python rather than AppleScript, so I ended up at http://toxicsoftware.com/webloc-to-pinboard/. The function infoForWebloc() does the trick.
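To make the idea concrete, here is a minimal Python 3 sketch of reading the URL out of a .webloc file. It is not the infoForWebloc() function from the linked page. It assumes the file's data fork holds a property list with a top-level "URL" key, which will not be true of files that keep the URL only in the resource fork. The name url_from_webloc is made up for illustration.

```python
import plistlib
from pathlib import Path


def url_from_webloc(path):
    """Return the URL stored in a .webloc file's property list.

    Assumption: the URL lives in the file's data fork as a property list
    (XML or binary) with a top-level "URL" key. Files that keep the URL
    only in the resource fork are not handled here.
    """
    with Path(path).open("rb") as handle:
        data = plistlib.load(handle)  # detects XML vs. binary automatically
    return data["URL"]


if __name__ == "__main__":
    import sys

    # Print one URL per .webloc path given on the command line.
    for webloc_path in sys.argv[1:]:
        print(url_from_webloc(webloc_path))
```

Run as a script, it takes .webloc paths as arguments, which matches the "Pass input: as arguments" style of an Automator shell action.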

I hadn't used Automator before, so instead of tying together the AppleScript and Python directly I decided to make this my first foray into Automator. I created a Folder Action on a new folder (~/Desktop/url2devonthink/) with two actions: Run Shell Script and Run AppleScript.

The Shell Script portion executes a Python program to extract the URL and retrieve a suitable title.

Shell: /bin/bash
Pass input: as arguments



The contents of ~/bin/webloc2url.py are as follows. I've removed some optimizations for clarity and brevity:
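As a minimal sketch only (this is not the original listing), the title-lookup half of such a script might look like the following in Python 3. It takes URLs that have already been extracted, fetches each page, pulls out the text of its title element, and prints the URL and title joined by a delimiter. The "|||" delimiter is an assumption about what the "triple-pipe" hack looked like, and every function and class name here is hypothetical.

```python
import sys
import urllib.request
from html.parser import HTMLParser

DELIMITER = "|||"  # assumed form of the "triple-pipe" separator


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html, fallback=""):
    """Return the page title with whitespace collapsed, or the fallback."""
    parser = TitleParser()
    parser.feed(html)
    title = " ".join("".join(parser.parts).split())
    return title or fallback


def fetch_title(url):
    """Download a page and return its title, falling back to the URL."""
    with urllib.request.urlopen(url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html, fallback=url)


def format_record(url, title):
    """Join URL and title into the single string handed to the next stage."""
    return url + DELIMITER + title


def split_record(record):
    """Undo format_record; splits at the first delimiter only."""
    url, _, title = record.partition(DELIMITER)
    return url, title


if __name__ == "__main__":
    for url in sys.argv[1:]:
        print(format_record(url, fetch_title(url)))
```

Splitting at the first delimiter means a title that happens to contain "|||" still comes through intact, as long as the URL itself does not contain it.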


The AppleScript passes that on to DEVONthink for Web Archive creation. Note that I cut-and-pasted most of this from ~/Library/Scripts/Applications/Firefox/Add web document to DEVONthink.scpt.


Conclusion

Some caveats apply:
  • The above was almost certainly not the best way of accomplishing the goal
  • The triple-pipe Cheap Hack (tm) was my lame workaround for not wanting to figure out how multiple arguments are passed between stages in Automator's pipes
  • The actual process was iterative and error-prone. Some URLs had become bad, some I didn't want to include, and so on.
  • Since this was a one-time task I have no need to create a more structured and reliable program than the above.
Now all I have to do is catalog all those new entries!

Monday, December 14, 2009

Why write essays? Why write essays concisely? Why write well-reasoned essays concisely?

I've been thinking a lot about writing recently. Specifically, I've been considering:
  • What are my intrinsic motivators for wanting to write more? How about extrinsic?
  • What's the extrinsic ROI? Intrinsic?
  • Do I have the willpower to ignore something with a poor extrinsic ROI and a good intrinsic one? Should I ignore it, even if I do have the willpower?
  • What about short and long term implications of the near-term decision I must make regarding spending more time on writing?

Of secondary concern to me are the topics of my writing, the format, the audiences and my writing's relevance. These are by no means unimportant. Indeed, they inform the answers to the questions I mention above, but they have not been the primary variables under consideration.

This morning I spent about 15 minutes writing a thousand or so words in a stream-of-consciousness format, on topics related to my position at Cornell. I've recently found that I have a specific kind of communication problem with my staff. That is, with some members, it has become obvious to me that I'm assuming too much preexisting knowledge, or that I've been assuming that I communicated that knowledge in the past. It is clear that I haven't communicated it, or didn't communicate it well.

Therefore, I'd like to start building up a library of essays on various topics related to what I do professionally. I'd like to use that exercise both to understand how to communicate my desires more effectively and to provide a set of documents for future reference.

This got me wondering: is a collection of short essays more valuable in this context than a collection of detailed compositions with reasoned arguments and citations? Academia has historically eschewed the blog entry as a medium for developing and delivering reasoned arguments, in favor of the well-cited scholarly article. The writing I'm considering is not scholarly, but perhaps the motivations behind requiring arguments to be well considered and cited still apply? Is there room for scholarly-style writing in the realm of short essays on such mundane topics as "Why We Check In Our Code Periodically"?

I have several thoughts, but in the spirit of short essays without well-reasoned, substantiated arguments and conclusions, I figure I should publish this entry before I add any of them!