For more information on variable importance, please refer to the seminar below.
Variable importance is calculated as follows:
- Effectively exclude one predictor variable (strictly speaking, its values are shuffled rather than removed) and calculate how much worse the prediction accuracy becomes compared with when the variable is left intact.
- Repeat this process for all variables.
- Evaluate the relative importance of each variable based on the score indicating how much the prediction accuracy deteriorates.
Conceptual example of the calculation:
- Prediction accuracy when variable A is included: 90
- Prediction accuracy when variable A is not included: 50
- Importance of variable A = 90 - 50 = 40
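The shuffling procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the mmpf implementation: the helper `permutation_importance`, its parameters (`n_repeats`, `seed`), and the toy data are all assumptions made for this example. It also measures importance as the increase in an error score (here, mean squared error) rather than the drop in an accuracy score, which is the same idea with the sign reversed.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def permutation_importance(predict, X, y, loss, n_repeats=10, seed=0):
    """Hypothetical helper: mean increase in loss when each column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = loss(y, predict(X))  # error with all variables intact
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):  # repeat the process for every variable
        increases = []
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            # Shuffling breaks the link between column j and the target
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            increases.append(loss(y, predict(X_shuffled)) - baseline)
        importances[j] = np.mean(increases)
    return importances


# Toy data (assumed): y depends strongly on column 0, weakly on column 1,
# and not at all on column 2.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Fit a plain least-squares model once; it is not refitted during shuffling.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict(data):
    return data @ coef


scores = permutation_importance(predict, X, y, mean_squared_error)
print(scores)
```

In this sketch the model is fitted only once, and each variable's score is the average deterioration over several shuffles, so the scores can be compared with one another to rank the variables.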
Variable importance is calculated using the permutationImportance function from the mmpf package.
The score used to measure prediction accuracy depends on the type of the target variable. If the model predicts a numeric value, as in linear regression, the “mean squared error” is used. If the model predicts a logical (TRUE/FALSE) value, as in logistic regression, the “misclassification rate” is used.
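As a rough illustration of these two scores, the following Python sketch computes each one on a tiny hand-made example. The function names and the sample values are assumptions for this example only and are not taken from the mmpf package.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """For numeric targets: average of the squared prediction errors."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def misclassification_rate(y_true, y_pred):
    """For logical targets: fraction of predictions that differ from the actual value."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))


# Numeric example: squared errors are 1, 0 and 4, so the mean is 5 / 3.
mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])

# Logical example: 2 of the 4 predictions are wrong, so the rate is 0.5.
err = misclassification_rate([True, False, True, True],
                             [True, True, True, False])

print(mse, err)
```

For both scores a larger value means a worse prediction, so a variable is considered more important the more the score rises after that variable is shuffled.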
If you are interested in the specific formula for calculating variable importance, the following page may be helpful: