Month: August 2012

Can you make a sandwich from classifiers?

One day, just a few years ago, a client came to Aiaioo Labs with a very interesting problem.

He wanted to know if and how AI tools could save him some money.

It turned out that he had a fairly large team performing the task of manually categorizing documents.

He wanted to know if we could supply him an AI tool that could automate the work.  The only problem was, he was going to need very high quality.

And no single automated classifier was going to be able to deliver that sort of quality by itself.

That’s when we hit upon the idea of a classifier sandwich.

The sandwich is prepared by arranging two classifiers as follows:

1.  Top layer – high precision classifier – when it picks a category, it is very likely to be right (the precision of the selected category is very high).

2.  Bottom layer – high recall classifier – when it rejects a category, it is very likely to be right about the rejection (the precision of the rejected category is very high).

Whatever the top layer does not pick and the bottom layer does not reject – that is, the middle of the sandwich – is then handed off to be processed manually by the team of editors that the client had in place.

So, that was a lovely little offering, one that any consultant could put together.  And it is incredibly easy to put together an ROI proposition for such an offering.

How do you calculate the ROI of a classifier sandwich?

Simple!

Let’s say the high-precision top layer has a recall of 30%.

Let’s say the high-recall bottom layer has a recall of 80%.

Then about 50% of the documents that pass through the system will end up being automatically sorted out.

The work effort and therefore the size of the team needed to do it, would therefore be halved.

Note that to make the sandwich, we need two high-precision classifiers (the first one selects a category with high precision while the second one rejects the other category with high precision).

Both categories need to have a precision greater than or equal to the quality guarantee demanded by the client.

That precision limit determines the amount of effort left over for humans to do.

How can we tune classifiers for high precision?

For maxent classifiers, thresholds can be set on the confidence scores they return for the various categories.

For naive bayesian classifiers, the best approach to creating high-precision classifiers is a training process known as expectation maximization.

For more information, please refer the work of Kamal Nigam et al:  http://www.kamalnigam.com/papers/emcat-mlj99.pdf

Another secret to boosting precision is using the right features in your classifier, but more about that later.