How to Read a Model
We’ve now created several models by piecing together the assumptions needed to solve a particular problem. But there is a vast ecosystem of models and algorithms already out there that other people have built. It would be great to make use of this work to avoid reinventing the wheel. Can model-based machine learning help us to pick up someone else’s model and understand what assumptions it is making, so that we can make good use of it?
So far in the book, we’ve designed a variety of models to solve a range of real-world problems. But this isn’t the only way to apply the skills of model-based machine learning. Beyond designing models, model-based machine learning also enables us to understand and interpret models created by other people. This understanding is useful for:
- seeing whether it makes sense to apply someone else’s model to a particular task;
- explaining the behaviour of such a model – for example, on a particular data item;
- diagnosing problems that arise when applying an existing model to a new problem;
- exploring how best to extend a model to encode new assumptions or to modify existing ones.
In this chapter, we will demonstrate how to interpret models by example – we will take a number of popular models and show how to understand the assumptions represented in each model. We will explore what these assumptions mean in terms of what data and tasks each model is, and is not, suitable for. We will also identify assumptions that limit where the model can be applied. By relaxing these assumptions, we will create extended models which can be applied more broadly than the original.
In some cases, we will start with an algorithm rather than a model. Here, we will first have to translate the algorithm into a corresponding model before we can begin the process of analysis. This translation is a useful skill in its own right: besides making the resulting model available for analysis, it allows the full range of inference algorithms to be applied, often unlocking new capabilities that the original algorithm lacked.
The models and algorithms that we will explore are:
- Latent Dirichlet Allocation – a model of the topics mentioned in a set of documents;
- Decision Tree – a classification algorithm based on assumptions very different from those of the classifier we developed in Chapter 4;
- Principal Component Analysis – an algorithm for transforming a set of observations of correlated variables into a set of values of uncorrelated variables, known as principal components;
- k-means clustering – a popular algorithm for discovering clusters of related data points.
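As a small illustration of what "transforming correlated variables into uncorrelated ones" means in the Principal Component Analysis entry above, here is a minimal Python sketch using NumPy. The synthetic data, the variable names and the choice to eigendecompose the covariance matrix are assumptions made for illustration; they are not taken from the book, and this is the textbook algorithm rather than the model-based treatment developed later in the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two strongly correlated variables (illustrative only).
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(scale=0.5, size=500)
data = np.column_stack([x, y])

# Centre the data, then eigendecompose its sample covariance matrix.
centred = data - data.mean(axis=0)
covariance = np.cov(centred, rowvar=False)
# eigh returns eigenvalues in ascending order, so the last column of
# `eigenvectors` is the direction of greatest variance.
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# Project the centred data onto the eigenvectors: these are the
# principal components.
components = centred @ eigenvectors

original_corr = np.corrcoef(data, rowvar=False)[0, 1]
component_corr = np.corrcoef(components, rowvar=False)[0, 1]
print(f"correlation between original variables:    {original_corr:.3f}")
print(f"correlation between principal components:  {component_corr:.3f}")
```

The original variables are highly correlated, while the correlation between the projected components is zero up to floating-point error.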
To explore our first model, Latent Dirichlet Allocation, read on…