This article solves the following challenge:
Imbalanced dataset classification performs poorly
If you are not able to gain more instances of the minority classes to balance to dataset, Azure ML offers a strategy for creating new instances of the minority classes by synthetic oversampling.
This method is called SMOTE (Synthetic Minority Oversampling Technique), which is a statistical method to increase the number of instances of smaller classes.
SMOTE expects as input a dataset with 2 classes.
If you have more than 2 classes, you have to split the dataset into junks of the majority class and one of the minority classes.
\"Class" ^(Majority|Minority1)
This expression splits the original dataset into a smaller one consisting only of instances of the Majority class and the Minority1 class.
You can repeat this step for the other minority classes.
For example:
Majority class has 1000 instances
Minority class has 250 insances
Choose a SMOTE percentage of 300 percent to add about 750 instances to the minority class.
Attention: If you had more than one minority class, you have to merge the SMOTE results but only merging once the majority class.
Now that the dataset is balanced, you can train your model again and see if the performance improved.
Evaluate complexity of present statement:
Select ratingCancelGuessingPassing knowledgeKnowledgeableExpert
Your rating: 4 Average: 4 (3 votes)