Estimation of soil organic matter content based on machine learning
BAI Ting1,2,3, DING Jianli1,2,3*, WANG Jingzhe1,2,3
1. College of Resources & Environmental Science, Xinjiang University, Urumqi, Xinjiang 830046, China; 2. Key Laboratory of Oasis Ecology, Xinjiang University, Urumqi, Xinjiang 830046, China; 3. Key Laboratory of Smart City and Environment Modelling of Higher Education Institute, Xinjiang University, Urumqi, Xinjiang 830046, China
Abstract To estimate soil organic matter (SOM) quickly and efficiently, a combined estimation model based on competitive adaptive reweighted sampling (CARS) and random forests (RF) was developed. The Ebinur Lake Basin was selected as the study area, and soil hyperspectral reflectance and SOM content were measured. After data pre-processing, the visible-near infrared spectra of four spectral variables, namely the original spectrum (R), the first derivative (R′), absorbance (log(1/R)), and the first derivative of absorbance ([log(1/R)]′), were screened with the CARS method, and full-spectrum RF and CARS-RF models were then built with the RF algorithm. The results indicated that CARS screening retained 35, 26, 34, and 121 variables for the four spectral variables, respectively; that among the four spectral variables, R′ and [log(1/R)]′ gave higher accuracy in estimating SOM, with the model based on [log(1/R)]′ being the most accurate; and that the CARS-RF model was more accurate than the full-spectrum RF model, with a validation-set coefficient of determination (R²), root mean square error (RMSE), and relative analysis error (RPD) of 0.881, 6.438 g/kg, and 2.177, respectively. It can be concluded that, following data pre-processing, variable selection with CARS provides a more suitable and efficient method for estimating SOM in arid and semi-arid lands while using fewer variables.
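The four spectral variables and the three validation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not cover CARS or RF. The function names are hypothetical, the logarithm is assumed to be base 10, the derivative is a simple forward difference, and RPD is assumed to be the sample standard deviation of the observed values divided by RMSE, since the abstract does not state these details.

```python
import math
from statistics import mean, stdev


def spectral_variables(reflectance, step=1.0):
    """Build R, R', log(1/R) and [log(1/R)]' for one spectrum.

    `reflectance` is a list of values in (0, 1]; `step` is the band spacing.
    """
    def first_derivative(values):
        # Forward difference between adjacent bands (an assumption; the
        # abstract does not say how the derivative was computed).
        return [(b - a) / step for a, b in zip(values, values[1:])]

    # Absorbance as log10(1/R); base 10 is assumed here.
    absorbance = [math.log10(1.0 / r) for r in reflectance]
    return {
        "R": list(reflectance),
        "R'": first_derivative(reflectance),
        "log(1/R)": absorbance,
        "[log(1/R)]'": first_derivative(absorbance),
    }


def validation_metrics(observed, predicted):
    """Return R2, RMSE and RPD for paired observed/predicted SOM values."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in residuals)
    obs_mean = mean(observed)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / len(observed))
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        # RPD assumed as SD(observed) / RMSE.
        "RPD": stdev(observed) / rmse,
    }
```

Under these assumptions, `validation_metrics(measured_som, predicted_som)` returns RMSE in the same units as the inputs (g/kg in the study), while R² and RPD are unitless.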
Received: 09 August 2018