
How to Waste The Rest of Your Life (not) Reading

A pseudo-scientific (i.e. not at all scientific) analysis of whether ebook samples work as a way of filtering what to read next.

or

Are ebook Samples really Useful?

Why Did I Do This?

One of the biggest problems with books these days – and I guess I really mean ebooks – is there’s just too much freaking choice. The rise of self-publishing is undoubtedly a good thing: it means that anyone and everyone can get their words online and into a form you can conveniently download onto your phone, tablet or ereader device. But not everyone and anyone can write, or has something interesting to say, or can use a spell-checker, apparently. And that’s before we get into issues of taste and preference.

One of the tools that sites like Amazon use to counter this problem – along with ratings and reviews – is the availability of free samples. Basically every ebook available from Amazon also has a sample – usually the first chapter or so – that you can download for free. A try-before-you-buy option with no commitment. Good idea, huh?

Yes. Well, I mean I think so in principle but I seem to almost never use them in practice. This post will be partly about why that is. Maybe.

However, the thing that really inspired this post was the way samples are used in the recurring arguments over the relative quality of indie versus trad-published books. The claim, part of a wider argument about quality, is basically that even if there is a lot of unreadable junk out there, it’s possible to find the “gems” by using, amongst other things, samples.

Let’s just say I’m sceptical about this – surely it simply takes too much time to read samples to use them as anything other than a final filter? But that’s a gut reaction. So I thought I’d test it. Sort of.

What did I do?

I decided to throw a few numbers together and see what came out.

On the 16th August 2012 I went to amazon.co.uk and looked at the available fiction ebooks (I almost never read non-fiction). I read mostly from the following genres (Amazon’s categories): SciFi, Fantasy, Crime & Thrillers and Action & Adventure. I looked for a “comedy” category, but although I found “humour” as a category for paper books, I didn’t find one for the Kindle store. That category also included non-fiction humour – books of essays and memoirs and so on – which I’m less inclined to read.

Anyway here’s a list of how many titles there were:

| Genre | Total |
| --- | --- |
| Action & Adventure | 38,375 |
| Crime & Thrillers | 74,605 |
| Fantasy | 38,790 |
| SciFi | 33,904 |
| All four | 185,674 |
| SciFi/Fantasy | 72,694 |
| All Fiction | 561,178 |

Clearly, even without further analysis, that’s too many books. Fortunately Amazon gives me lots of ways to filter these. I can look at just the ones with a review average of 4 stars or higher (I want to read the good ones, right?), or the ones which came out in the last 30 days (let’s assume I check regularly), or I could look at what’s about to come out. Or combine two or more of these.

| Genre | Total | 4star | 30days | Coming Soon | 4star+30 |
| --- | --- | --- | --- | --- | --- |
| Action & Adventure | 38,375 | 4,508 | 1,435 | 70 | 70 |
| Crime & Thrillers | 74,605 | 12,987 | 3,035 | 509 | 250 |
| Fantasy | 38,790 | 6,383 | 1,813 | 178 | 136 |
| SciFi | 33,904 | 4,102 | 1,427 | 102 | 10 |
| All four | 185,674 | 27,980 | 7,710 | 859 | 466 |
| SciFi/Fantasy | 72,694 | 10,485 | 3,240 | 280 | 146 |
| All Fiction | 561,178 | 67,690 | 22,813 | 3,253 | 1,409 |

Now some of those numbers look less scary but what do they mean in terms of reading samples?

What did I assume?

I needed to make a mathematical model (i.e. a spreadsheet) and for that I needed some generalisations or assumptions.

First, let’s assume that it takes me on average 5 minutes to read a sample. Sample sizes vary and I am a slow reader, so I think this is on the low end, but that will favour the proposition that samples are a good way to filter.

So let’s plug that into our model and here’s the time taken to read all those samples:

| Genre | Total | 4star | 30days | Coming Soon | 4star+30 |
| --- | --- | --- | --- | --- | --- |
| Action & Adventure | 133d 5h55m | 15d 15h40m | 4d 23h35m | 5h50m | 5h50m |
| Crime & Thrillers | 259d 1h05m | 45d 2h15m | 10d 12h55m | 1d 18h25m | 20h50m |
| Fantasy | 134d 16h30m | 22d 3h55m | 6d 7h05m | 14h50m | 11h20m |
| SciFi | 117d 17h20m | 14d 5h50m | 4d 22h55m | 8h30m | 50m |
| All four | 644d 16h50m | 97d 3h40m | 26d 18h30m | 2d 23h35m | 1d 14h50m |
| SciFi/Fantasy | 252d 9h50m | 36d 9h45m | 11d 6h00m | 23h20m | 12h10m |
| All Fiction | 1948d 12h50m | 235d 0h50m | 79d 5h05m | 11d 7h05m | 4d 21h25m |
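The first model is plain multiplication, so it is easy to reproduce outside a spreadsheet. Here is a minimal Python sketch, assuming the title counts from the table above and the 5 minutes per sample figure. The function names are my own for illustration, not anything from the original spreadsheet, and days are counted as whole (rounded down) days with the remainder shown as hours and minutes.

```python
MINUTES_PER_SAMPLE = 5  # assumption from the post: 5 minutes per sample


def sample_reading_minutes(titles, minutes_per_sample=MINUTES_PER_SAMPLE):
    """Total minutes needed to read one sample for every title."""
    return titles * minutes_per_sample


def format_minutes(total_minutes):
    """Format a number of minutes as 'Xd YhZZm', using whole days."""
    days, remainder = divmod(int(total_minutes), 24 * 60)
    hours, minutes = divmod(remainder, 60)
    if days:
        return f"{days}d {hours}h{minutes:02d}m"
    if hours:
        return f"{hours}h{minutes:02d}m"
    return f"{minutes}m"


# A few of the counts from the table above (genre, filter) -> titles
counts = {
    ("Crime & Thrillers", "Total"): 74_605,
    ("SciFi/Fantasy", "30days"): 3_240,
    ("SciFi", "4star+30"): 10,
}

for (genre, column), titles in counts.items():
    print(genre, column, format_minutes(sample_reading_minutes(titles)))
```

Changing `MINUTES_PER_SAMPLE` is the quickest way to see how sensitive the totals are to that one assumption.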

Whoops! The power of multiplication has turned what had seemed reasonable book numbers into unreasonable lengths of time. I’m clearly not going to spend days (or months, years!) reading samples to decide my next “full” book read. About the only thing that seems reasonable is 4star SciFi from the last 30 days.

How did I refine the model? (assumptions #2)

OK, so I’ve got some numbers now, but are they at all useful? Would any sane person really try to read all the samples from a particular category? Probably not. We can refine the model with a couple of additional assumptions. Let’s say I go to Amazon and look at the list for my particular category. It shows me the books in pages of 12, where I get the covers, titles and authors. Probably what I would do is page through this list, click on a few likely-looking ones and read the blurb, and if a book didn’t immediately disqualify itself I’d then download the sample.

So let’s assume it takes 5 seconds to scan each page of 12 book titles and covers.

Let’s assume that for any list 10% are worth reading the blurb and that it takes 15 seconds to skim-read the blurb.

Remember this is based on testing the idea that samples are actually the way to go so the blurb-reading is really to confirm that the cover/title has given the correct impression as regards genre and probable content.

Finally let’s assume that we commit to read the samples of half the ones where we read the blurb i.e. 5% of the list overall.

Plugging those numbers into our new model, the overall time taken per list is:

| Genre | Total | 4star | 30days | Coming Soon | 4star+30 |
| --- | --- | --- | --- | --- | --- |
| Action & Adventure | 7d 12h19m | 21h11m | 6h44m | 19m | 19m |
| Crime & Thrillers | 14d 14h34m | 2d 13h01m | 14h15m | 2h23m | 1h10m |
| Fantasy | 7d 14h16m | 1d 5h59m | 8h31m | 50m | 38m |
| SciFi | 6d 15h19m | 19h16m | 6h42m | 28m | 2m |
| All four | 36d 8h29m | 5d 11h28m | 1d 12h13m | 4h02m | 2h11m |
| SciFi/Fantasy | 14d 5h35m | 2d 1h16m | 15h13m | 1h18m | 41m |
| All Fiction | 109d 21h01m | 13d 6h04m | 4d 11h12m | 15h17m | 6h37m |
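For anyone who wants to check the refined model, here is a rough Python sketch of it under the assumptions stated above: pages of 12 titles at 5 seconds a page, blurbs read for 10% of titles at 15 seconds each, and samples read for 5% of titles at 5 minutes each. The function and parameter names are hypothetical, and I’m assuming fractional pages are allowed (as a simple spreadsheet formula would) and that minutes are rounded down.

```python
def browsing_seconds(
    titles,
    page_size=12,            # titles shown per results page
    seconds_per_page=5,      # time to scan one page of covers and titles
    blurb_fraction=0.10,     # share of titles whose blurb gets read
    seconds_per_blurb=15,
    sample_fraction=0.05,    # share of titles whose sample gets read
    seconds_per_sample=5 * 60,
):
    """Estimated seconds to work through a list of `titles` books."""
    scanning = titles / page_size * seconds_per_page
    blurbs = titles * blurb_fraction * seconds_per_blurb
    samples = titles * sample_fraction * seconds_per_sample
    return scanning + blurbs + samples


def format_seconds(total_seconds):
    """Format seconds as 'Xd YhZZm', with whole days and whole minutes."""
    total_minutes = int(total_seconds // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    if days:
        return f"{days}d {hours}h{minutes:02d}m"
    if hours:
        return f"{hours}h{minutes:02d}m"
    return f"{minutes}m"


# Upcoming SciFi/Fantasy (280 titles) and 4star+30 across all four genres (466)
print(format_seconds(browsing_seconds(280)))
print(format_seconds(browsing_seconds(466)))
```

Under these assumptions each title on a list costs just under 17 seconds on average, and almost all of that comes from the 5% of titles whose samples actually get read.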

Still a lot of large numbers there. I’m automatically rejecting anything over a day. However an hour and a half to check out upcoming SciFi/Fantasy seems doable, as does a couple of hours to review the 4star+ books in my favourite genres from the past 30 days.

So, whilst the numbers overall confirm my gut instinct, limit the scope a little and it may actually be a viable method.

Hold on a second, your model is wrong because…

I can think of two main reasons someone may object to the way I’ve set this up:

  1. The numbers in your assumptions are wrong. Obviously it’s true that if we vary these numbers we can come out with different answers. All I can say is I think the assumptions are roughly true for me and I’ve tried to err on the side that would lessen time taken so that I’m giving sampling as a method a fair chance.
  2. In reality, no-one would do it that way. Clearly, when you have a nice simple equation you can plug in whatever numbers you like and get an answer. A human being, however, would react differently given 10 books to sample rather than 10,000. In other words, the assumptions don’t scale. I think this is true. I think that the larger the number of books you have, the more you would want to use other filters first OR the more likely you are to simply bail out early, i.e. read the first 25 samples, say, and pick the best of those. However, I think the numbers are still useful because they show the difficulty of getting your book read, based on sampling alone, if it’s lower down that list. That just confirms what indie authors already know: the importance of getting as many good reviews and ratings as possible, and of getting as high up those popularity lists as possible.
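Both objections can be explored by turning the model around: instead of asking how long a list takes, ask how far down a list a fixed amount of time gets you. This is only a sketch under the same assumptions as the refined model. The time budgets below (two hours, one day) and the alternative 2 minutes per sample are my own illustrative values, and the function names are hypothetical.

```python
def seconds_per_title(
    page_size=12,
    seconds_per_page=5,
    blurb_fraction=0.10,
    seconds_per_blurb=15,
    sample_fraction=0.05,
    seconds_per_sample=5 * 60,
):
    """Average seconds spent per listed title under the refined model."""
    return (
        seconds_per_page / page_size
        + blurb_fraction * seconds_per_blurb
        + sample_fraction * seconds_per_sample
    )


def titles_within_budget(budget_seconds, **assumptions):
    """How many listed titles can be worked through in the given time."""
    return int(budget_seconds // seconds_per_title(**assumptions))


two_hours = 2 * 60 * 60
one_day = 24 * 60 * 60

print(titles_within_budget(two_hours))   # titles reached in two hours
print(titles_within_budget(one_day))     # titles reached in a full day
# Objection 1: vary an assumption, e.g. a faster reader at 2 minutes a sample
print(titles_within_budget(two_hours, seconds_per_sample=2 * 60))
```

With the post’s numbers, a two-hour session reaches roughly the first 425 titles of a list, so a book sitting a few thousand places down is unlikely to be sampled at all by someone browsing this way.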

Have I learnt anything?

I think so. I had assumed that if I wanted to find something new to read I should follow the usual routes – reviews from trusted sources and recommendations from family/friends – methods which haven’t changed since I started reading (well before the advent of ebooks). I hadn’t expected sampling would help because I hadn’t expected that the numbers would ever dip to low enough levels to be reasonable. Turns out that may not be true and scanning the latest 4star books in my chosen genres once a month for samples might be a worthwhile investment.

Or not. Intellectually I can see the merit, but psychologically an hour spent reading samples when I could be reading my next book seems like an hour wasted.

One reply on “How to Waste The Rest of Your Life (not) Reading”

I don’t read samples either. I research my reads and afford myself many hours of trawling through lists, prizes, acquaintance recommendations and other methods involving dead chickens and lunar events. I only read about 70 books a year, which is probably due to too much time spent on research – see the article above. I’ve got a potential list of around 70 books. The probability of me stumbling upon some unknown author through a sample is roughly 0. I wish the independent authors much luck because I think we need them, but that doesn’t change the fact that in the US alone there were more than 200,000 books published last year. Just based on numbers, the odds of me reading any particular one of those books in the next ten years are about 0.3%. That’s if I only read books published in the United States in 2012 for the next ten years. It could happen, I do some weird things. Reading samples isn’t one of them though.
