Categories
book

Amazon Buys Goodreads (and the people go crazy)

So I’m sure you’ve seen the news that Amazon is buying/has bought Goodreads. I caught up with it on the various book-related sites I read – the main one being Goodreads itself. I was surprised at the degree of upset/anger/panic this created – or maybe that’s just because I read the feedback threads on GR. You really would think the world was going to end.

So I get that people feel betrayed. It’s less than a year since GR was forced to remove a lot of Amazon-sourced metadata from their site. Whilst this was reasonable in my opinion – they weren’t prepared to abide by Amazon’s API TOS, which would have meant not linking to other book vendors – it did create a lot of work for the site’s librarians, who are unpaid volunteers. So aside from the usual web 2.0 stuff about user content, they had quite literally worked to make the site what it was.

Unfortunately for them they never noticed that Goodreads was always intended to be a commercial venture. Never noticed or didn’t care because they trusted Goodreads and it “felt like” a community?

For me Tim Spalding, founder of LibraryThing, a similarly purposed but different in tone and fiercely independent site, put it well:

Unless I’m quite mistaken, Goodreads was not hugely profitable as itself. With 30 employees, many of their engineers and in Los Angeles, he was probably burning upwards of $3 million/year on salary and benefits alone. When you do the advertising math, there’s no way he was making lots of money–not the sort of money that justifies a $150m valuation. (I don’t for a second believe the $1b number.) My guess is that he wasn’t even cash-positive. A number of people in the industry share my assessment. Unless the company itself is very profitable and very, very large, there’s no chance of going public, hence no way for the investors to cash out.

So he had to cash out. And he pretty much had to sell to Amazon.

So I think it was always inevitable they would cash out. I was one of those thinking they’d start selling through the site – which no doubt would have elicited some equally vociferous howls.

Personally I’m not fearful and I’m not jumping ship yet. I’ll wait and see. I am slightly sad that we don’t have a source of reviews and data about books that’s as big as Goodreads but also independent.

Categories
book reading

How to Waste The Rest of Your Life (not) Reading

or

Are ebook Samples really Useful?

Why Did I Do This?

One of the biggest problems with books these days – and I guess I really mean ebooks – is there’s just too much freaking choice. The rise of self-publishing is undoubtedly a good thing: it means that anyone and everyone can get their words online and into a form you can conveniently download onto your phone, tablet or ereader device. But not everyone and anyone can write, or has something interesting to say, or can use a spell-checker apparently. And that’s before we get into issues of taste and preference.

One of the tools that sites like Amazon use to counter this problem – along with ratings and reviews – is the availability of free samples. Basically every ebook available from Amazon also has a sample – usually the first chapter or so – that you can download for free. A try-before-you-buy option with no commitment. Good idea huh?

Yes. Well, I mean I think so in principle but I seem to almost never use them in practice. This post will be partly about why that is. Maybe.

However the thing that really inspired this post was the way samples are used in the recurring arguments over the relative quality of indie versus trad-published books. This is a sub-section of an argument about quality, and it basically says that even if there is a lot of unreadable junk out there, it’s possible to find the “gems” by using, amongst other things, samples.

Let’s just say I’m sceptical about this – surely it simply takes too much time to read samples to use them as anything other than a final filter? But that’s a gut reaction. So I thought I’d test it. Sort of.

What did I do?

I decided to throw a few numbers together and see what came out.

On the 16th August 2012 I went to amazon.co.uk and looked at the available fiction ebooks (I almost never read non-fiction). I read mostly from the following genres (Amazon’s categories): SciFi, Fantasy, Crime & Thrillers and Action & Adventure. I looked for a “comedy” category, but although I found “humour” as a category for paper books, I didn’t find one for the Kindle store. Also, that paper-book category included non-fiction humour – books of essays and memoirs and so on – which I’m less inclined to read.

Anyway here’s a list of how many titles there were:

Genre                 Total
Action & Adventure   38,375
Crime & Thrillers    74,605
Fantasy              38,790
SciFi                33,904
All four            185,674
SciFi/Fantasy        72,694
All Fiction         561,178

Clearly, even without further analysis that’s too many books. Fortunately Amazon gives me lots of ways to filter these. I can look at just the ones with a 4star or higher review average (I want to read the good ones right?), or the ones which came out in the last 30days (let’s assume I check regularly) or I could look at what’s about to come out. Or combine two or more of these.

Genre                 Total   4star  30days  Coming Soon  4star+30
Action & Adventure   38,375   4,508   1,435           70        70
Crime & Thrillers    74,605  12,987   3,035          509       250
Fantasy              38,790   6,383   1,813          178       136
SciFi                33,904   4,102   1,427          102        10
All four            185,674  27,980   7,710          859       466
SciFi/Fantasy        72,694  10,485   3,240          280       146
All Fiction         561,178  67,690  22,813        3,253     1,409

Now some of those numbers look less scary but what do they mean in terms of reading samples?

What did I assume?

I needed to make a mathematical model (i.e. a spreadsheet) and for that I needed some generalisations or assumptions.

First let’s assume that it takes me on average 5 minutes to read a sample. Sample sizes vary and I am a slow reader, so I think this is on the low end, but that will favour the proposition that samples are a good way to filter.

So let’s plug that into our model and here’s the time taken to read all those samples:

Genre               Total         4star       30days      Coming Soon  4star+30
Action & Adventure  133d 5h55m    15d 15h40m  4d 23h35m   5h50m        5h50m
Crime & Thrillers   259d 1h05m    45d 2h15m   10d 12h55m  1d 18h25m    20h50m
Fantasy             134d 16h30m   22d 3h55m   6d 7h05m    14h50m       11h20m
SciFi               117d 17h20m   14d 5h50m   4d 22h55m   8h30m        50m
All four            644d 16h50m   97d 3h40m   26d 18h30m  2d 23h35m    1d 14h50m
SciFi/Fantasy       252d 9h50m    36d 9h45m   11d 6h00m   23h20m       12h10m
All Fiction         1948d 12h50m  235d 0h50m  79d 5h05m   11d 7h05m    4d 21h25m

Whoops! The power of multiplication has turned what had seemed reasonable book numbers into unreasonable lengths of time. I’m clearly not going to spend days (or months, years!) reading samples to decide my next “full” book read. About the only thing that seems reasonable is 4star SciFi from the last 30 days.
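
For anyone who would rather check that arithmetic in code than in a spreadsheet, here is a minimal Python sketch of the same sum. It assumes the flat 5 minutes per sample from above. The function names are made up for illustration, and the days/hours/minutes layout (whole days, then whatever is left over) is an assumption about how best to display the result, not a copy of the original spreadsheet.

```python
def sample_reading_seconds(titles, minutes_per_sample=5):
    """Time to read one sample for every title in a list (assumed flat rate)."""
    return titles * minutes_per_sample * 60


def format_duration(total_seconds):
    """Format seconds as e.g. '14d 5h50m', dropping any part-minute."""
    minutes = int(total_seconds // 60)
    days, remainder = divmod(minutes, 24 * 60)   # whole days first
    hours, mins = divmod(remainder, 60)          # then hours and minutes
    if days:
        return f"{days}d {hours}h{mins:02d}m"
    if hours:
        return f"{hours}h{mins:02d}m"
    return f"{mins}m"


# A few of the title counts from the tables above
print(format_duration(sample_reading_seconds(10)))      # 50m
print(format_duration(sample_reading_seconds(4102)))    # 14d 5h50m
print(format_duration(sample_reading_seconds(38375)))   # 133d 5h55m
```

Changing minutes_per_sample is the quickest way to see how much the whole exercise hangs on that one guess.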

How did I refine the model? (assumptions #2)

OK so I’ve got some numbers now but are they at all useful? Would any sane person really try to read all the samples from a particular category? Probably not. We can refine the model with a couple of additional assumptions. Let’s say I go to Amazon and look at the list for my particular category – it shows me them in pages of 12 where I get the book covers, titles and authors. Probably what I would do is page through this list, click on a few likely looking ones and read the blurb, and if that didn’t immediately disqualify the book I’d then download the sample.

So let’s assume it takes 5 seconds to scan each page of 12 book titles and covers.

Let’s assume that for any list 10% are worth reading the blurb and that it takes 15 seconds to skim-read the blurb.

Remember this is based on testing the idea that samples are actually the way to go so the blurb-reading is really to confirm that the cover/title has given the correct impression as regards genre and probable content.

Finally let’s assume that we commit to read the samples of half the ones where we read the blurb i.e. 5% of the list overall.

Plugging those numbers into our new model, the overall time taken per list is:

Genre               Total         4star       30days      Coming Soon  4star+30
Action & Adventure  7d 12h19m     21h11m      6h44m       19m          19m
Crime & Thrillers   14d 14h34m    2d 13h01m   14h15m      2h23m        1h10m
Fantasy             7d 14h16m     1d 5h59m    8h31m       50m          38m
SciFi               6d 15h19m     19h16m      6h42m       28m          2m
All four            36d 8h29m     5d 11h28m   1d 12h13m   4h02m        2h11m
SciFi/Fantasy       14d 5h35m     2d 1h16m    15h13m      1h18m        41m
All Fiction         109d 21h01m   13d 6h04m   4d 11h12m   15h17m       6h37m

Still a lot of large numbers there. I’m automatically rejecting anything over a day. However an hour and a half to check out upcoming SciFi/Fantasy seems doable, as does a couple of hours to review the 4star+ books in my favourite genres from the past 30 days.

So, whilst the numbers overall confirm my gut instinct, limit the scope a little and it may actually be a viable method.
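
The refined model is really just three costs added together, which can be sketched in a few lines of Python. This is an illustration under the assumptions listed above (5 seconds per page of 12, blurbs for 10% at 15 seconds each, samples for 5% at 5 minutes each); the function and parameter names are hypothetical, not taken from the spreadsheet.

```python
def funnel_seconds(titles, page_size=12, secs_per_page=5,
                   blurb_share=0.10, secs_per_blurb=15,
                   sample_share=0.05, secs_per_sample=5 * 60):
    """Return (scanning, blurbs, samples) time in seconds for one list."""
    scanning = titles / page_size * secs_per_page     # paging through covers
    blurbs = titles * blurb_share * secs_per_blurb    # skim-reading blurbs
    samples = titles * sample_share * secs_per_sample # reading the samples
    return scanning, blurbs, samples


def hours_minutes(seconds):
    """Format seconds as e.g. '1h18m', dropping any part-minute."""
    minutes = int(seconds // 60)
    return f"{minutes // 60}h{minutes % 60:02d}m"


for label, titles in [("Coming soon, SciFi/Fantasy", 280),
                      ("4star+30, all four genres", 466)]:
    parts = funnel_seconds(titles)
    total = sum(parts)
    print(f"{label}: {hours_minutes(total)}, "
          f"{parts[2] / total:.0%} of it spent on samples")
# Coming soon, SciFi/Fantasy: 1h18m, 89% of it spent on samples
# 4star+30, all four genres: 2h11m, 89% of it spent on samples
```

With these assumptions each title on a list costs about 17 seconds on average, and nearly all of that is sample reading, so the scanning and blurb figures could be quite a way off without changing the picture much.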

Hold on a second your model is wrong because…

I can think of two main reasons someone may object to the way I’ve set this up:

  1. The numbers in your assumptions are wrong. Obviously it’s true that if we vary these numbers we can come out with different answers. All I can say is I think the assumptions are roughly true for me and I’ve tried to err on the side that would lessen time taken so that I’m giving sampling as a method a fair chance.
  2. In reality, no-one would do it that way. Clearly when you have a nice simple equation you can plug whatever numbers you like in and get the answer. A human being however would react differently given 10 books to sample rather than 10,000. In other words the assumptions don’t scale. I think this is true. I think that the larger the number of books you have, the more you would want to use other filters first OR the more likely you are to simply bail out early, i.e. read the first 25 samples, say, and pick the best of those. However I think the numbers are still useful because they show the difficulty of getting your book read, based on sampling alone, if it’s lower down that list. Which I think just confirms what indie authors already know: the importance of getting as many good reviews and ratings, and getting as high up those popularity lists, as possible.
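
One way to look at both objections at once is to flip the sum round: fix a time budget and ask how far down a list it gets you. The sketch below does that in Python using the same assumed figures as the refined model. The names are hypothetical, and the alternative values passed in at the end are only there to show how sensitive the answer is, not because anyone measured them.

```python
def seconds_per_title(page_size=12, secs_per_page=5,
                      blurb_share=0.10, secs_per_blurb=15,
                      sample_share=0.05, mins_per_sample=5):
    """Average time cost of one title on a list, under the stated assumptions."""
    return (secs_per_page / page_size
            + blurb_share * secs_per_blurb
            + sample_share * mins_per_sample * 60)


def titles_reached(budget_minutes, **assumptions):
    """How many titles down a list a reader gets before the budget runs out."""
    return int(budget_minutes * 60 // seconds_per_title(**assumptions))


print(titles_reached(60))                      # 212
print(titles_reached(120))                     # 425
print(titles_reached(60, mins_per_sample=10))  # 112
print(titles_reached(60, sample_share=0.02))   # 454
```

On the original assumptions an hour gets a reader through a little over 200 titles, so a book sitting at position 1,000 on a list is unlikely to be sampled by anyone working this way.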

Have I learnt anything?

I think so. I had assumed that if I wanted to find something new to read I should follow the usual routes – reviews from trusted sources and recommendations from family/friends – methods which haven’t changed since I started reading (well before the advent of ebooks). I hadn’t expected sampling would help because I hadn’t expected that the numbers would ever dip to low enough levels to be reasonable. Turns out that may not be true and scanning the latest 4star books in my chosen genres once a month for samples might be a worthwhile investment.

Or not. Because while intellectually I can see the merit, psychologically an hour reading samples when I could be reading my next book seems like an hour wasted.