Scanning with the Digital Archivists

Noisebridge. I’ve only been there twice now and it’s already become one of my favorite places to hang out in San Francisco. Noisebridge is a very well put together hackerspace. Not only is there amazing hacking going down, but I’ve also found myself once again doing things like trash talking Crimethinc and comparing dumpster diving stories. Ah, it feels good (and smells bad!).

Depending on the kind of “hacker” you are, you will either love or hate this place. Are you (A) the kind of hacker who is interested in working together? Or (B) the kind of hacker that would do questionable things in the back room of a VC’s office to secure funding for your Snapchat for cats app? In this case B stands for don’t Bother.

The Digital Archivists meet every Thursday in the space and hack away at scanning books. I got to talking with them about how to break some copyright law and convert images of pages into actual text.

Tesseract is an open source OCR (optical character recognition) engine. In fact the software is so simple (at least by default) and effective that converting an actual .tiff of a page to a text file is as simple as the command below. Tesseract appends the .txt extension to the output name itself, so this writes page0001.txt:

$ tesseract page0001.tiff page0001

Considering Tesseract is doing all the hard work, all I had to do was write a simple shell script to wrap it and convert entire directories of images to text.
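
The script itself isn’t reproduced here, but a wrapper along those lines could be sketched roughly like this. Treat it as an illustration under assumptions, not the actual script: the function name ocr_dir and the OCR_CMD variable are made-up placeholders, and it assumes every page is a .tiff sitting in a single directory.

```shell
#!/bin/sh
# Rough sketch of a directory wrapper around Tesseract.
# ocr_dir and OCR_CMD are hypothetical names used only for illustration.

# Command used for OCR; defaults to tesseract but can be overridden.
OCR_CMD="${OCR_CMD:-tesseract}"

ocr_dir() {
    dir="$1"
    for img in "$dir"/*.tiff; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$img" ] || continue
        # Tesseract appends .txt to the output base, so drop the .tiff suffix.
        base="${img%.tiff}"
        "$OCR_CMD" "$img" "$base" || echo "failed: $img" >&2
    done
}
```

After loading the function, something like `ocr_dir ./scans` would leave a page0001.txt next to each page0001.tiff. A failed page prints a warning and the loop moves on, which seems like the friendlier choice when chewing through a whole book.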

As dorky as all this sounds? Pretty dorky, actually. Goodnight.