
Question

Multi-label classification is a machine-learning problem where each sample can have zero or more labels from a closed set of possible labels. This task has applications in several fields. For example, in dialog systems, each sentence that the human says may have several intents, and the classifier should detect all of them. For example, the sentence "I want a cake and a drink" contains the two intents "WANTCAKE" and "WANTDRINK".

Theoretically, I expect a classifier to classify multi-label samples, even if the training data contained only single-label samples. For example, consider the following training set (where each word is considered a feature):

"I want a cake" -> WANTCAKE
"I want a drink" -> WANTDRINK
"I want a solution" -> WANTSOLUTION

I would expect a classifier to realize that the words "I want a" are not relevant for classification, that the words cake/drink/solution are indicative of the classes WANTCAKE/WANTDRINK/WANTSOLUTION respectively, and to classify the sentence "I want a cake and a drink" correctly as {WANTCAKE, WANTDRINK}.

This seems trivial to humans. Therefore, I was very surprised to find that many state-of-the-art multi-label classifiers fail miserably on this simple task!

For example, consider a multi-label classifier built with the "Binary Relevance" method. In this method, there is a separate binary classifier for each label. For example, there is a binary classifier for the "WANTCAKE" label, trained with "I want a cake" as a positive sample and the other two sentences as negative samples. When this classifier sees the sentence "I want a cake and a drink and a solution", it sees a single feature, "cake", that is a positive signal of WANTCAKE, and two features, "drink" and "solution", that are negative signals of WANTCAKE, because they appeared in the training set in sentences that did not have the WANTCAKE label. Therefore, this classifier returns 'negative'. The same happens for the other two binary classifiers, and thus the multi-label classifier returns an empty set!

I also tried other approaches to multi-label classification, such as RF-PCT (Random Forest of Predictive Clustering Trees), with a slightly larger example (7 labels instead of 3), and got similar results.

I sent this problem to machine-learning experts, and they told me that I need more training data. They said that a classifier cannot tag multi-label instances if the training data contains only single-label instances. In practice, they are right: adding more training data usually improves the accuracy of the classifier.

But I am still bothered by the theoretical issue: how can it be that no state-of-the-art classifier can solve this trivial, 3-instance problem?

I am looking for a classifier that provably solves such problems, i.e., a classifier for which there is a proof that, given correct single-label training samples, it correctly classifies multi-label cases.

Is there such a classifier?

Explanation / Answer

You subsequently clarified that you are looking for a way to do multi-label classification in general, and the example in the question about wanting a cake was just an example.

OK, here is one standard way to do multi-label classification. For each candidate label, you build a boolean classifier that outputs true or false: true means that the label applies, false means it doesn't.

In your example, you'd have three classifiers: a "cake classifier" that outputs true if the sentence should be labelled "wants-cake" and false otherwise; a "drink classifier" that outputs true if the sentence should be labelled "wants-drink" and false otherwise; and a "solution classifier" that outputs true if the sentence should be labelled "wants-solution" and false otherwise. You then train each one separately. Given a sentence, you run all three classifiers on it and use that to select which labels should or should not be associated with the sentence. For instance, if the "cake classifier" outputs true, the "drink classifier" outputs false, and the "solution classifier" outputs true, then you label the sentence as "wants-cake + wants-solution".

This allows you to use any boolean classifier as your underlying building block. For instance, you can use SVMs, decision trees, random forests, and many other schemes.
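The per-label scheme can be written as a generic wrapper that accepts any boolean classifier as the building block. Here is a sketch (the wrapper and the `exclusive_word_classifier` base learner are illustrative toys invented for this answer, not standard algorithms): the toy base learner flags a label as true whenever the sentence contains a word that appeared only in that label's positive training sentences, i.e. it ignores the shared, non-discriminative words, which is enough to handle the multi-label sentence from the question.

```python
def binary_relevance(train_data, labels, make_classifier):
    """Train one boolean classifier per label; return a multi-label predictor."""
    classifiers = {
        label: make_classifier([(s, gold == label) for s, gold in train_data])
        for label in labels
    }
    def predict(sentence):
        # A label applies whenever its boolean classifier says true.
        return {label for label, clf in classifiers.items() if clf(sentence)}
    return predict

def exclusive_word_classifier(binary_data):
    """True iff the sentence contains a word seen only in positive samples."""
    pos_words, neg_words = set(), set()
    for sentence, is_pos in binary_data:
        (pos_words if is_pos else neg_words).update(sentence.split())
    indicative = pos_words - neg_words  # e.g. {"cake"} for the cake classifier
    return lambda sentence: bool(indicative & set(sentence.split()))

TRAIN = [
    ("I want a cake", "WANTCAKE"),
    ("I want a drink", "WANTDRINK"),
    ("I want a solution", "WANTSOLUTION"),
]
predict = binary_relevance(TRAIN, [l for _, l in TRAIN], exclusive_word_classifier)
print(predict("I want a cake and a drink"))  # {'WANTCAKE', 'WANTDRINK'}
```

Swapping `exclusive_word_classifier` for an SVM, decision tree, or random forest wrapped in the same true/false interface gives the standard Binary Relevance variants; whether a given base learner handles the multi-label sentence depends on how it weighs the shared words, as the question illustrates.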
