machine learning [book review]

I have to admit the rather embarrassing fact that Machine Learning: A Probabilistic Perspective by Kevin P. Murphy is the first machine learning book I have really read in detail…! It is a massive book of close to 1,100 pages, so I hesitated to carry it around until I packed it in my bag for Warwick. (And for the train to Argentan.) It is also massive in its contents, as it covers most (all?) of what I call statistics (but which evidently corresponds to machine learning as well!), with a Bayesian bent most of the time (which is the secret meaning of probabilistic in the title).

“…we define machine learning as a set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty (such as planning how to collect more data!).” (p.1)

Apart from the Introduction (which I find rather confusing for not dwelling on the nature of errors and randomness, or on the reason for using probabilistic models, since they are all wrong, and charming for including a picture of the author’s family as an illustration of face recognition algorithms), I cannot say I found the book more lacking in foundations or in the breadth of methods and concepts it covers than a “standard” statistics book. In short, this is a perfectly acceptable statistics book! Furthermore, it has a very relevant and comprehensive selection of references (sometimes favouring “machine learning” references over “statistics” references!). Even the vocabulary seems pretty standard to me. All this makes me wonder why we distinguish between the two domains at all, following Larry Wasserman’s view (for once!) that the difference is mostly in the eye of the beholder, i.e., in which department one teaches… This was already my perspective before I read the book, and reading it confirmed me in that view even further. And the author agrees as well (“The probabilistic approach to machine learning is closely related to the field of statistics, but differs slightly in terms of its emphasis and terminology”, p.1). Let us all unite!

[…part 2 of the book review to appear tomorrow…]

2 Responses to “machine learning [book review]”

  1. Carl Korkpoe Says:

    It would have been a better book had the examples and demonstrations been done using an open platform like R instead of Matlab. Who in this day and age uses Matlab for serious statistics?

    • Dan Simpson Says:

      Machine learners and a decent number of other people. It’s a cleaner language with many better features. It’s also widely taught, and not everyone comes from a pure stats background.

      R is mostly used by applied stats people who make great use of the packages. But if you are developing new stuff, more structured, less “idiosyncratic” languages (such as Matlab and Python) are often easier to use.
