LI Wei, HE Qi, LIU Zhe, LI Xiuzhen, CUI Jialin, ZHANG Xiaoming, ZHANG Li. Machine learning-based binary classification of insecticides for resistance risk modeling[J]. Chinese Journal of Pesticide Science, 2024, 26(4): 724-734. DOI: 10.16801/j.issn.1008-7303.2024.0049

    Machine learning-based binary classification of insecticides for resistance risk modeling

    Abstract: The extensive use of insecticides has made pest resistance an increasingly serious problem. Laboratory resistance assays can measure the resistance ratio of an insecticide, but they take a long time and the test material is difficult to obtain. Machine learning models offer a fast and reasonable way to assess the potential resistance risk of insecticides. Using the Arthropod Pesticide Resistance Database (APRD), the British Crop Production Council (BCPC) database, and the SPECS database, this study selected samples with low structural similarity to form a training set. Six machine learning algorithms were each used to build a binary classification resistance risk model for insecticides: linear discriminant analysis (LDA), support vector machine (SVM), artificial neural network (ANN), decision tree (DT), random forest (RF), and self-organizing map (SOM) clustering. Model parameters were optimized on the test set, and the optimal models were then used to predict the external validation set. Among the single models, DT reached a prediction accuracy of 84.62% on the external validation set. A voting mechanism integrating the six models achieved a prediction accuracy of 78.95% for positive samples and 65% for negative samples on the external validation set. The resulting models can provide a theoretical assessment of the potential resistance risk of new insecticides and help guide their scientific use in the field to delay the development of resistance.
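    The voting step and the per-class accuracies reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how a 3-3 tie among the six models is resolved, so the `tie_label` parameter (defaulting to the positive, higher-risk class) is an assumption, and the votes and labels below are hypothetical values chosen only to show the arithmetic.

```python
from typing import Sequence, Tuple


def majority_vote(votes: Sequence[int], tie_label: int = 1) -> int:
    """Combine binary votes (1 = resistance risk, 0 = no risk) from several models.

    tie_label is an assumption: the source does not state how ties are broken.
    """
    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        return tie_label
    return 1 if positives > negatives else 0


def per_class_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy on positive samples, accuracy on negative samples).

    Assumes y_true contains at least one sample of each class.
    """
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    pos_acc = sum(1 for p in pos if p == 1) / len(pos)
    neg_acc = sum(1 for p in neg if p == 0) / len(neg)
    return pos_acc, neg_acc


# Hypothetical votes from six models (LDA, SVM, ANN, DT, RF, SOM) for four compounds.
model_votes = [
    [1, 1, 1, 0, 1, 0],  # four of six models vote positive
    [0, 0, 1, 0, 0, 0],  # five of six models vote negative
    [1, 0, 1, 0, 1, 0],  # 3-3 tie, resolved by tie_label
    [0, 1, 0, 0, 1, 0],  # four of six models vote negative
]
y_true = [1, 0, 0, 1]  # hypothetical ground-truth labels

y_pred = [majority_vote(v) for v in model_votes]
pos_acc, neg_acc = per_class_accuracy(y_true, y_pred)
print(y_pred, pos_acc, neg_acc)
```

    Reporting accuracy separately for positive and negative samples, as the abstract does, shows whether an ensemble leans toward flagging compounds as high risk or toward clearing them, which a single overall accuracy would hide.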

       
