I have ripped my first book: In The Age Of The Smart Machine: The Future Of Work And Power, by Shoshana Zuboff.
Why In The Age Of The Smart Machine, in particular? Well, first of all, it’s not really my book, or at least I didn’t buy it: it was bought by Robin Peek, my colleague, mentor, and friend from Simmons College (and my ripped version retains her name written in it on the flyleaf), and I inherited it when she cleaned out her office probably over a decade ago. So I didn’t have the emotional attachment to it that I might if I’d bought it with my own money. Plus, while I didn’t (and didn’t feel I had to) run it by Robin to get her permission, I think she’d approve. Also, it seemed appropriate: what could possibly make for an age of smarter machines than large collections of digitized texts?
I’ve wanted to rip my books since I first read this post, in which Alex Halavais describes ripping his own books (to an audience at an AAUP conference, brave man!), in the interest of saving space at home and having his materials available on the road. Genius, I thought… why didn’t I think of that? And then I was reminded of my grand ambitions when I saw Jason Griffey’s recent post, Ripping your books.
My early forays into the paperless office were not books, however, but papers in my file cabinet. That task is now mostly completed, thanks to my (no doubt very bored) GAs over the past two academic years. I’ve also wanted to build the DIY book scanner for a long time, and I even got as far as some semi-serious discussions on the topic with Cristóbal Palmer… but we ran up against the fairly basic problem of simply not having anywhere to build the thing.
And so it’s taken me this long to get around to ripping my books, using more or less the Alex Halavais method. And here’s where I talk about my technique.
First I literally ripped the book. The copy of In The Age Of The Smart Machine that I have is a paperback, with just glue holding the spine together. I broke the spine and tore the book, which is about 470 pages long, into chunks of about 20 pages. Why 20 pages? Because that’s about as thick as the paper cutter in the SILS library will accommodate. I then sliced the inside edge off each chunk to remove the glue of the binding. That took less than a quarter-inch off the width of the page. Then I reassembled the book in page order and left it on my desk for a few weeks.
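For anyone planning their own session at the paper cutter, the arithmetic is simple enough to sketch. This is only a rough illustration: the function name `count_chunks` is made up for this post, and the figures of roughly 470 pages and roughly 20 pages per chunk are the approximate ones given above.

```python
import math


def count_chunks(total_pages: int, pages_per_chunk: int) -> int:
    """Return how many chunks a book must be torn into.

    Hypothetical helper: the last chunk may be smaller than the rest,
    which is why the division is rounded up.
    """
    if total_pages < 0 or pages_per_chunk <= 0:
        raise ValueError("page counts must be positive")
    return math.ceil(total_pages / pages_per_chunk)


# Approximate figures from this post: ~470 pages, ~20 pages per chunk.
print(count_chunks(470, 20))  # 24
```

So a book this size means a couple dozen trips through the cutter.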
Eventually I got around to the actual format-shifting. Here’s how that worked. I took a stack of pages and put it on our office photocopier. Each stack was around 100 pages, because that’s as thick a stack as the copier could handle (the paper is fairly thick, thicker than printer paper). Oh and BTW, I had to turn each page individually before putting the stack on the copier, because there was some residual glue connecting some pages. Those pages pulled apart easily… but it’s easier to do that before scanning than to go back and re-scan later. Because the cut edge of the book was a little rough, I had to put the stack on the copier upside down, so it fed the outside edge of the book in first. That prevented the copier from jamming on the rough edge.
Our office copier will scan to PDF, and email the PDF. I love that feature. So I scanned the whole book in batches, and received a bunch of PDF attachments to emails from the copier. I discovered that, slightly annoyingly, the maximum length of a PDF that the copier will send is 50 pages, so for most batches I received 2 or 3 files. I downloaded the attachments and opened them in Acrobat.
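Why 2 or 3 files per batch? Here is a minimal sketch of the splitting, assuming the copier simply fills each PDF up to its 50-page cap and puts whatever is left over in a final file. That behavior is my guess, not something the copier documents, and `split_stack` is a made-up name for illustration.

```python
def split_stack(stack_pages: int, max_pages_per_pdf: int = 50) -> list[int]:
    """Return the page count of each PDF produced for one scanned stack.

    Assumption: the copier fills each file to the cap, then sends the
    remainder (if any) as one last, shorter file.
    """
    if stack_pages < 0 or max_pages_per_pdf <= 0:
        raise ValueError("page counts must be positive")
    full_files, leftover = divmod(stack_pages, max_pages_per_pdf)
    sizes = [max_pages_per_pdf] * full_files
    if leftover:
        sizes.append(leftover)
    return sizes


print(split_stack(100))  # [50, 50]
print(split_stack(104))  # [50, 50, 4]
```

A stack of exactly 100 pages comes back as 2 files, and anything slightly over 100 spills into a third, which matches what landed in my inbox.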
Because I scanned the book upside down, I had to rotate the pages 180°. Easy enough. I combined the (many) PDF files from the copier using Insert Page. When the whole book was one big PDF file, I ran Acrobat’s built-in OCR on it. Unexpectedly, the file size shrank after OCRing, from around 16K to around 13.5K.
So there you have it, folks. I now have a fully searchable PDF version of a book. And it was a big book too, so I think it makes for a good proof of concept. I wish it were in a more flexible format than PDF, but I’ll take what I can get. And proof of concept it is: I’m now faced with the decision of which of my books (that I paid for with my own money this time) get the axe first.
I now have a stack of paper that used to be a bound book sitting on my desk. I’ve considered just dropping it in the recycle bin, but I just can’t bring myself to do that. I think I’ll put it on the table in the SILS lobby. Surely some student will want it. If not, then the bin. And that should tell us something about the value of the print format.