John Winn

Chapter 2

Assessing People’s Skills

Throughout our lives, we are constantly assessing the skills and abilities of those around us. Who should I hire? Who should play on the team? Who can I ask for help? How can I best teach this person? Taking all that we know about someone and working out what they can and cannot do comes naturally to most of us. But how can we use model-based machine learning to do this automatically?

In this chapter, we will develop our first model of some real-world data. We will address the problem of assessing candidates for a job that requires certain skills. The idea is that candidates will take a multiple-choice test and we will use model-based machine learning to determine which skills each candidate has (and with what probability) given their answers in the test. We can then use this for tasks such as selecting a shortlist of candidates very likely to have a set of essential skills.

Each question in a test requires certain skills to answer. For a software development job, these skills might be knowledge of the programming language C# or the database query language SQL. Some of the questions might require multiple skills in order to be answered correctly. Figure 2.1 gives some example questions which have been marked with the skills required to answer them. Because our model could be used for many different types of job, it must work with different tests and different skills, as long as these skill annotations are provided. It is important that the system should only use these annotations when presented with a new test – it must not require any additional information, such as sample answers from people with known skills.

Figure 2.1 Part of a certification test used to assess software development skills. The questions have been annotated with the skills needed to answer them.

In order to assess which skills a candidate has, we will need to analyse their answers to the test. Since we know the skills needed for each question, this may appear straightforward: we just need to check whether they are getting all the SQL questions right or all the C# questions wrong. But the real world is more complicated than this – even if someone knows C# they may make a mistake or misread a question; even if they do not know SQL they may guess the right answer by pure luck. In some cases, the test questions may be badly written or even outright wrong.
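To see why a single answer cannot settle the matter, it helps to work through the arithmetic. The following is a minimal sketch using Bayes' rule, not the model developed in this chapter. The numbers are assumptions chosen purely for illustration: a prior probability of 0.5 that the candidate has the skill, a 0.9 chance of answering correctly with the skill (allowing for mistakes), and a 0.2 chance of answering correctly without it (a lucky guess). The function name is hypothetical.

```python
def posterior_has_skill(correct, prior=0.5,
                        p_correct_if_skilled=0.9,
                        p_correct_if_guessing=0.2):
    """Probability that a candidate has a skill after one answer.

    All default values are illustrative assumptions, not measured data.
    """
    # Likelihood of the observed answer under each hypothesis.
    if correct:
        like_skilled = p_correct_if_skilled
        like_unskilled = p_correct_if_guessing
    else:
        like_skilled = 1.0 - p_correct_if_skilled
        like_unskilled = 1.0 - p_correct_if_guessing

    # Bayes' rule: weight each hypothesis by its prior and normalise.
    evidence = prior * like_skilled + (1.0 - prior) * like_unskilled
    return prior * like_skilled / evidence


after_right = posterior_has_skill(correct=True)
after_wrong = posterior_has_skill(correct=False)
print(f"P(skill | right answer) = {after_right:.3f}")
print(f"P(skill | wrong answer) = {after_wrong:.3f}")
```

Under these assumed numbers, a right answer raises the probability of having the skill well above the prior but leaves it short of certainty, and a wrong answer lowers it without ruling the skill out.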

The situation is even more complicated for questions that need two (or more) skills. If someone correctly answers a question that needs two skills, it suggests that they are likely to have both skills. If they get it wrong, there are several possibilities: they could have one skill or the other (but probably not both) or they could have neither. Assessing which of these is the case requires looking at their answers to other questions and trying to find a consistent set of skills that is likely to give rise to all of the answers considered together. To do this kind of complex reasoning automatically, we need to design a model of how a person with particular skills answers a set of questions.
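The reasoning above can be sketched by brute force: list every possible combination of skills, score how well each one explains all of the answers together, and normalise. This is only an illustration under stated assumptions, not the model built in this chapter. We assume a candidate answers correctly with probability 0.9 if they have every skill a question needs and 0.2 otherwise, with a prior of 0.5 on each skill; these values and all names in the code are hypothetical.

```python
from itertools import product

# Illustrative assumptions, not values taken from real test data.
P_CORRECT_IF_ALL_SKILLS = 0.9   # has every skill the question needs
P_CORRECT_IF_GUESSING = 0.2     # is missing at least one needed skill
PRIOR_HAS_SKILL = 0.5


def skill_posterior(questions, answers, n_skills):
    """Posterior over every combination of skills, by enumeration.

    questions: list of tuples of skill indices each question needs
    answers:   list of booleans, True if the answer was correct
    Returns a dict mapping a tuple of booleans (one per skill)
    to its posterior probability.
    """
    weights = {}
    for skills in product([False, True], repeat=n_skills):
        # Prior probability of this combination of skills.
        weight = 1.0
        for has_skill in skills:
            weight *= PRIOR_HAS_SKILL if has_skill else 1.0 - PRIOR_HAS_SKILL
        # Multiply in the probability of each observed answer.
        for needed, correct in zip(questions, answers):
            has_all = all(skills[i] for i in needed)
            p = P_CORRECT_IF_ALL_SKILLS if has_all else P_CORRECT_IF_GUESSING
            weight *= p if correct else 1.0 - p
        weights[skills] = weight
    total = sum(weights.values())
    return {skills: w / total for skills, w in weights.items()}


# Skill 0 = C#, skill 1 = SQL (hypothetical labelling).
# Question 1 needs C# only and was answered correctly;
# question 2 needs both skills and was answered incorrectly.
posterior = skill_posterior(questions=[(0,), (0, 1)],
                            answers=[True, False],
                            n_skills=2)
for skills, prob in sorted(posterior.items()):
    print(skills, round(prob, 3))
```

With these assumed numbers, the most probable explanation is that the candidate has the first skill but not the second. The wrong answer to the two-skill question is ambiguous on its own, and only becomes informative once it is combined with the answer to the single-skill question. Enumeration doubles in cost with every extra skill, which is one reason to want a more structured model.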

You can recreate all results in this chapter using the companion source code [Diethe et al., 2019].

References

[Diethe et al., 2019] Diethe, T., Guiver, J., Zaykov, Y., Kats, D., Novikov, A., and Winn, J. (2019). Model-Based Machine Learning book, accompanying source code. https://github.com/dotnet/mbmlbook.