Scanning with the Digital Archivists

Noisebridge. I’ve only been there twice now and it’s already become one of my favorite places to hang out in San Francisco. Noisebridge is a hackerspace, and not only is there amazing hacking going down, but I’ve also found myself once again doing things like trash talking Crimethinc and comparing dumpster diving stories. Ah, it feels good (and smells bad!).

Depending on the kind of “hacker” you are, you will either love or hate this place. Are you (A) the kind of hacker who is interested in working together? Or (B) the kind of hacker that would do questionable things in the back room of a VC’s office to secure funding for your Snapchat for cats app? In this case B stands for don’t Bother.

The Digital Archivists meet every Thursday in the space and hack away at scanning books. Part of that work is to (break some copyright law and) convert images of pages into actual text.

Tesseract, an open source OCR engine, is some amazing stuff. In fact, the software is so simple (at least by default) and effective that converting a .tiff of a page to a text file takes just one command:

$ tesseract page0001.tiff page0001

(Tesseract tacks .txt onto the output name itself, so this writes page0001.txt.)

Considering Tesseract is doing all the hard work, all I had to do was write a simple shell script to wrap it and convert entire directories of images to text.
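A wrapper along those lines might look something like the sketch below. It is not the actual script: the function name ocr_dir is made up for illustration, and it assumes the scans are .tiff files sitting in a single directory and that tesseract is on the PATH.

```shell
#!/bin/sh
# Sketch only: OCR every .tiff in a directory with Tesseract.
# ocr_dir is a hypothetical helper name, not the original script.
ocr_dir() {
    dir="${1:-.}"    # directory of page images; defaults to the current one
    for img in "$dir"/*.tiff; do
        # If nothing matched, the glob stays literal, so skip it.
        [ -e "$img" ] || continue
        # Strip the extension; Tesseract appends .txt to this output base.
        base="${img%.tiff}"
        tesseract "$img" "$base" || echo "failed: $img" >&2
    done
}

# Example (the directory name is an assumption):
# ocr_dir ./scans
```

Each page000N.tiff ends up with a page000N.txt next to it, and a page that fails is reported on stderr without stopping the rest of the run.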

Dorky as it is, the script is still missing some features. Pretty dorky, actually. Goodnight.