ranking reliability of the permutation importance values of features obtained by a single training run is low. Therefore, we propose an improved feature selection method, called the adaptive feature selection method, which has two stages. The implementation steps are as follows.

First stage:
a) The original feature set is used as the input to train a random forest model. The PI value of each feature in the feature set is calculated with the trained model, and the features are sorted by PI value from small to large, giving the sorted feature set F = {f1, f2, ..., fn}.
b) Repeat step a) n times to obtain n sorted feature sets.
c) Count the number of times k that each feature ranks in the top R × 100% of the total ranking, where R is calculated by the following formula:
R = 0.15e^(−0.015 × nTimewindow) + 0.2e^(−0.35 × nTimewindow) − 0.001 × nTimewindow    (8)
where nTimewindow is the number of time windows.
d) Traverse the total feature set F; the features that satisfy the condition k > K form a new feature set F', which is much smaller than F (a sketch of this stage is given below).
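The first stage can be sketched as follows, assuming scikit-learn's RandomForestClassifier and permutation_importance are used to obtain the PI values. The function name stage_one, the repetition count n_repeats, and the way R and K are passed in as arguments are illustrative choices, not part of the original method; R would be obtained from Eq. (8) for the chosen number of time windows, and K is the count threshold of step d).

# A minimal sketch of the first stage (assumptions: scikit-learn is used for
# the random forest and the PI values; stage_one, n_repeats, and the
# arguments R and K are illustrative, not from the original paper).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

def stage_one(X, y, feature_names, R, K, n_repeats=10):
    """Return the reduced feature set F' of features that appear in the
    top R*100% of the PI ranking more than K times over n_repeats runs."""
    n_features = len(feature_names)
    top_counts = np.zeros(n_features, dtype=int)
    top_size = max(1, int(round(R * n_features)))   # size of the top R*100% slice
    for _ in range(n_repeats):
        model = RandomForestClassifier().fit(X, y)
        pi = permutation_importance(model, X, y, n_repeats=5).importances_mean
        top_idx = np.argsort(pi)[-top_size:]        # features with the largest PI in this run
        top_counts[top_idx] += 1
    # Features whose top-ranking count k exceeds the threshold K form F'
    return [feature_names[i] for i in range(n_features) if top_counts[i] > K]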
Second stage:
a) The new feature set F' is used as the input to train a random forest model, and the PI value of each feature in the set is calculated with the trained model (the calculation is repeated several times and averaged). The features are sorted by PI value from small to large to obtain the sorted feature set F'_s.
b) Remove the feature with the lowest PI value from F'_s to obtain a new feature set F'_n, and replace F' with F'_n.
c) Repeat steps a) and b) until the feature set is empty, obtaining the set T of feature subsets and the accuracy set S corresponding to those subsets (see the sketch below).
d) The feature subset corresponding to the highest accuracy in set S is the best feature subset. End of feature selection.
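The second stage is essentially a backward elimination driven by the averaged PI values. The sketch below again assumes scikit-learn; in particular, the accuracy of each subset is estimated here with 5-fold cross-validation, which is an illustrative choice since the text above does not specify how the accuracies in S are measured, and stage_two and n_avg are hypothetical names.

# A minimal sketch of the second stage (same assumptions as above; stage_two,
# n_avg, and the cross-validated accuracy estimate are illustrative).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score

def stage_two(X, y, feature_names, n_avg=5):
    """Return (T, S, best): the evaluated feature subsets, their accuracies,
    and the subset with the highest accuracy."""
    current = list(feature_names)
    T, S = [], []
    while current:
        cols = [feature_names.index(f) for f in current]
        X_sub = X[:, cols]
        model = RandomForestClassifier().fit(X_sub, y)
        # Average the permutation importance over n_avg repetitions
        pi = permutation_importance(model, X_sub, y, n_repeats=n_avg).importances_mean
        T.append(list(current))
        S.append(cross_val_score(RandomForestClassifier(), X_sub, y, cv=5).mean())
        current.pop(int(np.argmin(pi)))             # drop the feature with the lowest PI
    best = T[int(np.argmax(S))]                     # subset with the highest accuracy in S
    return T, S, best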