To improve your classifier, start with the Confusion Matrix. The fields marked in shades of orange show the categories for which a high number of your examples were not classified correctly. These are the best places to start improving data quality.
Clicking such a field opens the examples it contains below the Confusion Matrix, where you can edit them directly.
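To make the idea behind the matrix concrete, here is a minimal sketch of how the fields could be counted from assigned labels and classifier predictions. The function names and the sample categories ("billing", "shipping", "returns") are hypothetical and only serve as illustration; this is not how the tool itself is implemented.

```python
from collections import Counter

def confusion_counts(labels, predictions):
    """Count how often each (assigned label, predicted category) pair occurs."""
    return Counter(zip(labels, predictions))

def most_confused(counts, top=3):
    """Return the fields where label and prediction differ, largest count first."""
    off_diagonal = {cell: n for cell, n in counts.items() if cell[0] != cell[1]}
    return sorted(off_diagonal.items(), key=lambda item: item[1], reverse=True)[:top]

# Hypothetical sample data: one assigned label and one prediction per example.
labels      = ["billing", "billing", "shipping", "shipping", "shipping", "returns"]
predictions = ["billing", "shipping", "shipping", "returns", "returns", "returns"]

counts = confusion_counts(labels, predictions)
worst = most_confused(counts)
print(worst)
```

The fields returned first correspond to the most strongly shaded fields in the matrix, so they are the ones worth reviewing first.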
When you review an example, one of four cases applies:
- The classifier is wrong: If the classifier is wrong and the label and the text example go together correctly, you do not have to change anything in your examples.
- The label is wrong: An example may have been assigned the wrong label by mistake. Read the text example again and correct the label if necessary.
- The label is correct, the classifier is also correct: This happens when a text example contains content that belongs to both categories. Split the text into two examples and assign each to the appropriate category.
- The label is wrong, the classifier is also wrong: If neither the label nor the classifier's prediction fits the text, correct the label so that your classifier learns the correct assignment during the next training.
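The four cases above can be sketched as a small decision helper. This is only an illustration under stated assumptions: `triage` is a hypothetical function, and `correct` stands for the set of categories a human reviewer decides the text really belongs to. That judgment cannot be automated.

```python
def triage(label, predicted, correct):
    """Suggest an action for one example taken from a confusion field.

    label:     category currently assigned to the example
    predicted: category the classifier chose
    correct:   set of categories a reviewer judges the text to belong to
    """
    label_ok = label in correct
    classifier_ok = predicted in correct
    if label_ok and classifier_ok:
        # The text covers both categories.
        return "split the text into one example per category"
    if label_ok:
        # Only the classifier is wrong.
        return "keep the example unchanged"
    if classifier_ok:
        # Only the label is wrong.
        return f"relabel to {predicted}"
    # Both are wrong.
    return "relabel to " + " or ".join(sorted(correct))

# Hypothetical example: labeled "billing", predicted "shipping".
print(triage("billing", "shipping", {"billing"}))
print(triage("billing", "shipping", {"shipping"}))
print(triage("billing", "shipping", {"billing", "shipping"}))
print(triage("billing", "shipping", {"returns"}))
```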
If there is a lot of confusion between two thematically similar categories, consider merging them. After you have made your adjustments, start a new training session.
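The effect of merging two categories on the training data can be sketched as follows. The helper `merge_categories` and the category names are hypothetical; in the tool itself the merge is done through the interface, not through code.

```python
def merge_categories(labels, old_a, old_b, merged):
    """Replace two similar categories with a single merged category."""
    return [merged if label in (old_a, old_b) else label for label in labels]

# Hypothetical labels in which "refund" and "return" are often confused.
labels = ["refund", "return", "billing", "return"]
merged_labels = merge_categories(labels, "refund", "return", "returns_and_refunds")
print(merged_labels)
```

After the merge, examples that were previously confused between the two categories share one label, so they no longer count as errors in the next training.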