Google's book search--something I've ignored, out of sheer indifference--is going through some amusing growing pains:
Start with publication dates. To take Google's word for it, 1899 was a literary annus mirabilis, which saw the publication of Raymond Chandler's Killer in the Rain, The Portable Dorothy Parker, André Malraux's La Condition Humaine, Stephen King's Christine, The Complete Shorter Fiction of Virginia Woolf, Raymond Williams's Culture and Society 1780-1950, and Robert Shelton's biography of Bob Dylan, to name just a few. And while there may be particular reasons why 1899 comes up so often, such misdatings are spread out across the centuries. A book on Peter F. Drucker is dated 1905, four years before the management consultant was even born; a book of Virginia Woolf's letters is dated 1900, when she would have been 18 years old. Tom Wolfe's Bonfire of the Vanities is dated 1888, and an edition of Henry James's What Maisie Knew is dated 1848.
[ ... ]
Google acknowledges the incorrect dates but says they came from the providers. It's true that Google has received some groups of books that are systematically misdated, like a collection of Portuguese-language works all dated 1899. But a very large proportion of the errors are clearly Google's own doing. A lot of them arise from uneven efforts to automatically extract a publication date from a scanned text. A 1901 history of bookplates from the Harvard University Library is correctly dated in the library's catalog. Google's incorrect date of 1574 for the volume is drawn from an Elizabethan armorial bookplate displayed on the frontispiece. An 1890 guidebook called London of To-Day is correctly dated in the Harvard catalog, but Google assigns it a date of 1774, which is taken from a front-matter advertisement for a shirt-and-hosiery manufacturer that boasts it was established in that year.
Then there are the classification errors, which taken together can make for a kind of absurdist poetry. H.L. Mencken's The American Language is classified as Family & Relationships. A French edition of Hamlet and a Japanese edition of Madame Bovary are both classified as Antiques and Collectibles (a 1930 English edition of Flaubert's novel is classified under Physicians, which I suppose makes a bit more sense). An edition of Moby Dick is labeled Computers; The Cat Lover's Book of Fascinating Facts falls under Technology & Engineering. And a catalog of copyright entries from the Library of Congress is listed under Drama (for a moment I wondered if maybe that one was just Google's little joke).
I wouldn't call it a "little" joke; it looks like Google's entire book-scanning enterprise is a joke, even if an unintentional one.


