To address multi-dimensional feature redundancy in remote sensing detection of wheat stripe rust using reflectance spectra and solar-induced chlorophyll fluorescence (SIF), this study proposes a feature selection and disease index (DI) monitoring model that combines the max-relevance and min-redundancy (mRMR) algorithm with extreme gradient boosting (XGBoost). First, characteristic wavelengths selected by the successive projections algorithm (SPA) were combined with vegetation indices, trilateral parameters, and canopy SIF parameters to form the initial feature set. Then, mRMR and correlation coefficient (CC) analysis were each used to reduce the dimensionality of the initial feature set. The features selected by mRMR and by CC were input as independent variables into XGBoost regression and gradient boosting regression tree (GBRT) models to monitor the severity of stripe rust. The experimental results show that, compared with CC analysis, the features selected by mRMR increased monitoring accuracy by an average of 12% in the XGBoost model and 17% in the GBRT model. The mRMR-XGBoost model achieved the best monitoring accuracy (R² = 0.8894, RMSE = 0.1135). Its R² between measured DI and predicted DI was higher by an average of 5%, 12%, and 22% than that of the mRMR-GBRT, CC-XGBoost, and CC-GBRT models, respectively. These results suggest that XGBoost is better suited than GBRT to remote sensing monitoring of wheat stripe rust, and that mRMR has advantages over the commonly used CC analysis for feature selection. Validation against field survey data also confirms that the mRMR-XGBoost model has good monitoring applicability and scalability. The proposed model could provide a reference for data dimensionality reduction and crop disease index monitoring based on hyperspectral data.
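The contrast between CC ranking and mRMR selection can be illustrated with a minimal sketch. It is not the study's implementation: it assumes the absolute Pearson correlation as a stand-in for the mutual information that mRMR normally uses, it uses the difference form of the mRMR criterion, and the feature names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cc_select(features, target, k):
    """CC-style selection: keep the k features with the largest |r| to the target."""
    ranked = sorted(features, key=lambda f: abs(pearson(features[f], target)), reverse=True)
    return ranked[:k]


def mrmr_select(features, target, k):
    """Greedy mRMR (difference form): at each step pick the feature that maximizes
    relevance to the target minus mean redundancy with the features already chosen.
    Assumption: |Pearson r| replaces mutual information for both terms."""
    relevance = {f: abs(pearson(col, target)) for f, col in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(abs(pearson(features[f], features[s])) for s in selected)
            return relevance[f] - redundancy / len(selected)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: three mutually orthogonal base signals over 8 samples.
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x3 = [1, -1, 1, -1, 1, -1, 1, -1]

features = {
    "feat_a": x1,                                         # strongly related to the target
    "feat_a_dup": [a + 0.1 * c for a, c in zip(x1, x3)],  # near-duplicate of feat_a
    "feat_b": x2,                                         # weaker, but independent of feat_a
    "feat_noise": x3,                                     # unrelated to the target
}
target = [2 * a + b for a, b in zip(x1, x2)]  # stand-in for a disease index

cc_choice = cc_select(features, target, 2)
mrmr_choice = mrmr_select(features, target, 2)
print("CC:  ", cc_choice)
print("mRMR:", mrmr_choice)
```

On this toy data, ranking by correlation alone keeps both the strong feature and its near-duplicate, while the redundancy penalty in mRMR swaps the duplicate for the weaker but complementary feature. The selected subset would then be passed to a regressor such as XGBoost or GBRT.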