
A comparative evaluation of convolutional neural networks, training image sizes, and deep learning optimizers for weed detection in alfalfa

Published online by Cambridge University Press:  15 June 2022

Jie Yang
Affiliation:
Ph.D. Candidate, College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing, China; also Visiting Student, Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong, China
Muthukumar Bagavathiannan
Affiliation:
Associate Professor, Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, USA
Yundi Wang
Affiliation:
Graduate Student, Department of Computer Science, Stevens Institute of Technology, Hoboken, New Jersey, USA
Yong Chen*
Affiliation:
Professor, College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing, China
Jialin Yu*
Affiliation:
Research Professor, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences at Weifang, Weifang, Shandong, China
*
Authors for correspondence: Yong Chen, College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing, 210037, China. Email: chenyongjsnj@163.com. Jialin Yu, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences at Weifang, Weifang, Shandong, 261325, China. Email: jialin.yu@pku-iaas.edu.cn

Abstract

In this research, the deep-learning optimizers Adagrad, AdaDelta, Adaptive Moment Estimation (Adam), and Stochastic Gradient Descent (SGD) were applied to the deep convolutional neural networks AlexNet, GoogLeNet, VGGNet, and ResNet, which were trained to recognize weeds among alfalfa using photographic images taken at 200×200, 400×400, 600×600, and 800×800 pixels. An increase in image size reduced the classification accuracy of all neural networks; the networks trained with images of 200×200 pixels achieved better classification accuracy than those trained with the larger image sizes investigated here. AlexNet and GoogLeNet trained with AdaDelta and SGD outperformed the same networks trained with Adagrad and Adam; VGGNet trained with AdaDelta outperformed VGGNet trained with Adagrad, Adam, and SGD; and ResNet trained with AdaDelta and Adagrad outperformed ResNet trained with Adam and SGD. When the neural networks were trained with the best-performing input image size (200×200 pixels) and the best-performing deep-learning optimizer, VGGNet was the most effective neural network, with high precision and recall values (≥0.99) on the validation and testing datasets. In contrast, ResNet was the least effective neural network at classifying images containing weeds. However, each neural network detected broadleaf and grass weeds with similar accuracy. The neural networks discussed herein may be used for scouting weed infestations in alfalfa and may be further integrated into the machine vision subsystem of smart sprayers for site-specific weed control.

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of the Weed Science Society of America

Introduction

Alfalfa, an important leguminous forage crop, is widely cultivated worldwide (Bai et al. Reference Bai, Ma, Ma, Velthof, Wei, Petr, Oene, Lee and Zhang2018; Li et al. Reference Li, Gu, Liu, Wei, Hu, Wang, McNeill and Ban2021; Mielmann Reference Mielmann2013; Radovic et al. Reference Radovic, Sokolovic and Markovic2009). Alfalfa contains a high amount of crude protein and is rich in minerals, particularly calcium, iron, and manganese (Richter et al. Reference Richter, Siddhuraju and Becker2003), and is thus considered to be a healthy feed for livestock (Salzano et al. Reference Salzano, Neglia, D’Onofrio, Balestrieri, Limone, Cotticelli, Marrone, Anastasio, D’Occhio and Campanile2021). Weeds are a significant challenge in alfalfa production because they compete with alfalfa for nutrients, space, sunlight, and water, and reduce forage yield and nutritive value. Moreover, certain weed species such as perilla mint (Perilla frutescens L.) are toxic to livestock (Kerr et al. Reference Kerr, Johnson and Burrows1986). A variety of postemergence (POST) herbicides are used for weed control in alfalfa fields. For instance, clethodim and 2,4-DB control a wide range of grasses and broadleaf weeds, respectively, in conventional alfalfa crops (Cudney and Adams Reference Cudney and Adams1993; Idris et al. Reference Idris, Dongola, Elamin and Babiker2019), while glyphosate provides nonselective control of weeds in glyphosate-tolerant alfalfa crops (Wilson and Burgener Reference Wilson and Burgener2009). These POST-applied herbicides are typically broadcast-applied in alfalfa fields, including where weeds do not occur.

Site-specific weed management, particularly precision herbicide application, can considerably reduce herbicide input and weed control costs (Franco et al. Reference Franco, Pedersen, Papaharalampos and Orum2017; Sabzi et al. Reference Sabzi, Abbaspour-Gilandeh and Garcia-Mateos2018; Yu et al. Reference Yu, Schumann, Sharpe, Li and Boyd2020; Zaman et al. Reference Zaman, Esau, Schumann, Percival, Chang, Read and Farooque2011). A major obstacle for autonomous precision herbicide application is accurately and reliably detecting weeds in real time (Sabzi et al. Reference Sabzi, Abbaspour-Gilandeh and Garcia-Mateos2018). Traditional machine vision techniques depend on the ability to recognize and differentiate plant leaf color, spectral information, and feature fusion (Sabzi et al. Reference Sabzi, Abbaspour-Gilandeh and Garcia-Mateos2018, Reference Sabzi, Abbaspour-Gilandeh and Arribas2020); morphological features (Bakhshipour and Jafari Reference Bakhshipour and Jafari2018; Hamuda et al. Reference Hamuda, Mc Ginley, Glavin and Jones2017; Pulido et al. Reference Pulido, Solaque and Velasco2017); and spatial information (Farooq et al. Reference Farooq, Hu and Jia2019). However, these traditional approaches cannot reliably detect weeds intermingled with crops, especially in complex environments with high crop and weed densities (Ahmad et al. Reference Ahmad, Jan, Farman, Ahmad and Ullah2020; Akbarzadeh et al. Reference Akbarzadeh, Paap, Ahderom, Apopei and Alameh2018; Sujaritha et al. Reference Sujaritha, Annadurai, Satheeshkumar, Sharan and Mahesh2017; Yu et al. Reference Yu, Sharpe, Schumann and Boyd2019a).

Machine learning techniques have advanced significantly in recent years (Jordan and Mitchell Reference Jordan and Mitchell2015). Deep convolutional neural networks (DCNNs) have been used successfully in various field applications (LeCun et al. Reference LeCun, Bengio and Hinton2015; Ni et al. Reference Ni, Wang, Vinson, Holmes and Tao2019). For example, recent studies have shown that deep learning can be used to diagnose coronavirus disease (Saood and Hatem Reference Saood and Hatem2021), to help predict seizure recurrence (Geng et al. Reference Geng, Alkhachroum, Bicchi, Jagid, Cajigas and Chen2021), for high-accuracy three-dimensional optical measurement (Yao et al. Reference Yao, Gai, Chen, Chen and Da2021), to predict the activity of potential drug molecules (Ma et al. Reference Ma, Sheridan, Liaw, Dahl and Svetnik2015), to analyze particle accelerator data (Azhari et al. Reference Azhari, Abarda, Ettaki, Zerouaoui and Dakkon2020; Ciodaro et al. Reference Ciodaro, Deva, De Seixas and Damazio2012), to detect industrial defects in wood veneer finishes (Shi et al. Reference Shi, Li, Zhu, Wang and Ni2020), to efficiently classify green plum defects (Zhou et al. Reference Zhou, Zhuang, Liu and Zhang2020), and to reconstruct brain circuits (Helmstaedter et al. Reference Helmstaedter, Briggman, Turaga, Jain, Seung and Denk2013). In addition, DCNNs have demonstrated an exceptional ability to detect objects and classify digital images (Tompson et al. Reference Tompson, Jain, LeCun and Bregler2014; Wang et al. Reference Wang, Gao and Yuan2018). Thus, DCNNs show great promise for weed detection and classification (Wang et al. Reference Wang, Zhang and Wei2019).

Researchers have recently explored the feasibility of using DCNNs to detect weeds in various cropping systems (Ferentinos Reference Ferentinos2018; Ghosal et al. Reference Ghosal, Blystone, Singh, Ganapathysubramanian, Singh and Sarkar2018; Sharpe et al. Reference Sharpe, Schumann, Yu and Boyd2019b; Singh et al. Reference Singh, Ganapathysubramanian, Sarkar and Singh2018; Yu et al. Reference Yu, Schumann, Sharpe, Li and Boyd2020). Sharpe et al. (Reference Sharpe, Schumann, Yu and Boyd2019b) showed that You Only Look Once, version 3 (YOLOv3), a unified, real-time object detection system, can be used as an object detector to discriminate broadleaves, grasses, and sedges in the middle rows of plastic-mulched vegetable crops. Yu et al. (Reference Yu, Sharpe, Schumann and Boyd2019a, Reference Yu, Schumann, Sharpe, Li and Boyd2020) reported the feasibility of using DCNNs to detect multiple broadleaf and grass weeds among actively growing or dormant bermudagrass [Cynodon dactylon (L.) Pers.] plants. Hennessy et al. (Reference Hennessy, Esau, Farooque, Schumann, Zaman and Corscadden2021) reported the feasibility of using YOLOv3-tiny to detect hairy fescue (Festuca filiformis Pourr.) and sheep sorrel (Rumex acetosella L.) among wild blueberry (Vaccinium spp. L.) plants. Hussain et al. (Reference Hussain, Farooque, Schumann, Abbas, Acharya, McKenzie-Gopsill, Barrett, Afzaal, Zaman and Cheema2021) investigated the feasibility of using DCNNs to detect common lambsquarters (Chenopodium album L.) in potato (Solanum tuberosum L.) fields. However, the feasibility and effectiveness of using DCNNs for weed detection in alfalfa have never been investigated.

Unlike most other crops, alfalfa hay is typically harvested multiple times per growing season. Alfalfa regrows after each harvest, rapidly regenerating new stems and leaves, so weed detection across alfalfa stands of varying heights might be a significant challenge.

Image classification with DCNNs can be used in the machine vision subsystem of smart sprayers for weed detection and real-time precision treatment (Sharpe et al. Reference Sharpe, Schumann, Yu and Boyd2019b; Wang et al. Reference Wang, Zhang and Wei2019; Yu et al. Reference Yu, Sharpe, Schumann and Boyd2019a, Reference Yu, Schumann, Sharpe, Li and Boyd2020). He et al. (Reference He, Zhang, Ren and Sun2015) noted that arbitrary use of fixed-size input images for training a neural network might reduce the classification accuracy. However, a careful review of the literature suggests that almost all previously reported studies that evaluated the feasibility of using DCNNs for weed detection and classification arbitrarily used a particular size of input images (Ferentinos Reference Ferentinos2018; Sharpe et al. Reference Sharpe, Schumann, Yu and Boyd2019b; Yu et al. Reference Yu, Sharpe, Schumann and Boyd2019a, Reference Yu, Sharpe, Schumann and Boyd2019b, Reference Yu, Schumann, Sharpe, Li and Boyd2020). Limited research has been carried out to investigate the impact of training image sizes on the performance of DCNNs for weed detection and classification through a comparative study.

When a deep learning model is trained, the algorithm gradually adjusts the model weights through an optimizer over a large number of samples (Krizhevsky et al. Reference Krizhevsky, Sutskever and Hinton2012; Simonyan and Zisserman Reference Simonyan and Zisserman2014; Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke and Rabinovich2015). Thus, selecting an appropriate optimizer is critical in the training pipeline for deep learning models (Choi et al. Reference Choi, Shallue, Nado, Lee, Maddison and Dahl2019). The majority of previous studies have focused on comparing state-of-the-art deep learning architectures for weed detection (Sharpe et al. Reference Sharpe, Schumann and Boyd2019a, 2019b; Wang et al. Reference Wang, Zhang and Wei2019; Yu et al. Reference Yu, Sharpe, Schumann and Boyd2019a, Reference Yu, Sharpe, Schumann and Boyd2019b). However, none of them attempted to improve weed detection accuracy by comparing deep-learning optimizers. Therefore, the objectives of this research were to 1) explore the effects of using various image sizes for training on the ability of DCNNs to detect and classify weeds; 2) compare several DCNNs trained with different deep learning optimizers for weed detection; and 3) determine the feasibility of using DCNNs to detect multiple broadleaf and grass weeds growing in alfalfa.

Materials and Methods

Overview

In this research, the image classification DCNN architectures AlexNet (Krizhevsky et al. Reference Krizhevsky, Sutskever and Hinton2012), GoogLeNet (Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke and Rabinovich2015), VGGNet (Simonyan and Zisserman Reference Simonyan and Zisserman2014), and ResNet (He et al. Reference He, Zhang, Ren and Sun2016) were evaluated. These neural networks were trained with four different sizes of input images (200×200, 400×400, 600×600, and 800×800 pixels) and with four commonly employed deep-learning optimizers: Adagrad (Duchi et al. Reference Duchi, Hazan and Singer2011), AdaDelta (Zeiler Reference Zeiler2012), Adam (Kingma and Ba Reference Kingma and Ba2014), and Stochastic Gradient Descent (SGD; Darken et al. Reference Darken, Chang and Moody1992). AlexNet consists of eight layers: five convolutional layers and three fully connected layers (Krizhevsky et al. Reference Krizhevsky, Sutskever and Hinton2012). GoogLeNet is 22 layers deep and is designed around small convolutions to reduce the number of neurons and parameters (Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke and Rabinovich2015). The VGGNet used in this research consists of 19 weight layers and likewise employs small convolutional kernels to limit neuron numbers and parameters (Simonyan and Zisserman Reference Simonyan and Zisserman2014). The ResNet used here consists of 50 layers; its plain design draws on VGG-style networks but adds residual units through shortcut connections (He et al. Reference He, Zhang, Ren and Sun2016). Residual learning solves the degradation problem of deep networks and allows much deeper networks to be trained (He et al. Reference He, Zhang, Ren and Sun2016). All neural networks were pretrained on the ImageNet database (Deng et al. Reference Deng, Dong, Socher, Li, Li and Li2009) with an input tensor size of 224×224 pixels, except AlexNet, which was pretrained with 227×227-pixel inputs (He et al. Reference He, Zhang, Ren and Sun2016; Krizhevsky et al. Reference Krizhevsky, Sutskever and Hinton2012; Simonyan and Zisserman Reference Simonyan and Zisserman2014; Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke and Rabinovich2015).
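The study itself trained these networks through the Caffe/DIGITS toolchain described under "Training and Testing." Purely as a hedged illustration of this starting point, the sketch below loads ImageNet-pretrained versions of the four architectures with PyTorch/torchvision (an assumed substitute, not the authors' toolchain) and replaces each classifier head with a two-class weed/no-weed output.

```python
# Illustrative sketch (PyTorch/torchvision, not the Caffe/DIGITS pipeline used
# in this study): load the four ImageNet-pretrained architectures and replace
# each classifier head with a two-class (weed vs. no-weed) output layer.
import torch.nn as nn
import torchvision.models as models

NUM_CLASSES = 2  # positive (with weeds) vs. negative (without weeds)

alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
alexnet.classifier[6] = nn.Linear(4096, NUM_CLASSES)

googlenet = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
googlenet.fc = nn.Linear(1024, NUM_CLASSES)

vgg19 = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
vgg19.classifier[6] = nn.Linear(4096, NUM_CLASSES)

resnet50 = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet50.fc = nn.Linear(2048, NUM_CLASSES)
```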

Image Acquisition

Images of various weeds growing in alfalfa fields were acquired multiple times during September and October 2020 using a digital camera (Panasonic® DMC-ZS110; Xiamen, Fujian, China) at a resolution of 4,160×3,120 pixels. The images taken in alfalfa fields in Bengbu, Anhui, China (32.89°N, 117.88°E) were used for the training dataset, validation dataset (VD), and testing dataset (TD). Additional testing images were taken in separate alfalfa fields in Bengbu, Anhui, China (additional testing dataset 1, TD 1) and at the Yangzhou University Pratacultural Science Experiment Station (32.20°N, 119.23°E) in Yangzhou, Jiangsu, China (additional testing dataset 2, TD 2). The additional testing datasets were used to examine the robustness of the models. The images containing alfalfa (8 to 52 cm in height) and various broadleaf and grass weed species were captured from a height of approximately 1.5 m above the ground (0.05 cm pixel−1). Our research team designed a smart sprayer with a camera installed 1.5 m above the ground (data not shown); thus, all images were captured at 1.5 m above the ground to mimic the height of the smart sprayer's camera. Images were acquired under various outdoor lighting conditions, including clear/bright, cloudy, and partially cloudy skies. In the present study, weed images were captured in the fall season, and only mature weeds were used for training and testing. An additional investigation is needed to evaluate the feasibility of using neural networks to identify weed growth stages, because herbicide rates could then be varied according to growth stage; for example, low and high rates could be sprayed to control seedling and mature annual weeds, respectively, while maintaining adequate weed control.

Impact of Training Using Various Image Sizes

During training, all collected images were cropped into sub-image datasets with resolutions of 200×200, 400×400, 600×600, or 800×800 pixels using IrfanView (v. 5.50; Irfan Skiljan, Jajce, Bosnia and Herzegovina; Figure 1A). The DCNN architectures were trained with these image sizes. For each image size, the training dataset contained 3,000 positive images (with weeds) and 3,000 negative images (without weeds; Figure 1B). The VD contained 600 positive and 600 negative images. The TD contained 300 positive and 300 negative images that were randomly selected from the sites where the training images were taken but were not used for training. TD 1 and TD 2 each contained 700 positive and 700 negative images. The training and testing images contained a variety of broadleaf and grass weed species occurring in mixture. The dominant broadleaf weed species (Figure 1C) included annual fleabane [Erigeron annuus (L.) Pers.], common sage (Salvia plebeia R. Br.), Canada thistle [Cirsium arvense (L.) Scop.], and hemistepta [Hemistepta lyrata (Bunge) Bunge], whereas the major grass weeds (Figure 1C) included crabgrass (Digitaria spp.), goosegrass [Eleusine indica (L.) Gaertn.], barnyardgrass [Echinochloa crus-galli (L.) Beauv.], and green foxtail [Setaria viridis (L.) Beauv.].
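The authors performed this cropping step with IrfanView; as a hedged, programmatic equivalent, the sketch below tiles a photograph into non-overlapping squares at the four resolutions compared here. The input file name is a hypothetical placeholder.

```python
# Hedged sketch: split a full-resolution field photograph into non-overlapping
# square tiles at the four resolutions compared in this study. The source file
# name is hypothetical; the study used IrfanView for cropping.
from pathlib import Path
from PIL import Image

def crop_to_tiles(image_path: str, tile_px: int, out_dir: str) -> None:
    img = Image.open(image_path)
    w, h = img.size
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(image_path).stem
    for top in range(0, h - tile_px + 1, tile_px):
        for left in range(0, w - tile_px + 1, tile_px):
            tile = img.crop((left, top, left + tile_px, top + tile_px))
            tile.save(out / f"{stem}_{top}_{left}.jpg")

for size in (200, 400, 600, 800):
    crop_to_tiles("alfalfa_field.jpg", size, f"tiles_{size}x{size}")
```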

Figure 1. Image classification using deep neural networks in training and testing images. (A) Images were cropped into four different sizes of input images, including 200×200, 400×400, 600×600, and 800×800 pixels; (B) input images were classified into true positive images (including the target weeds) and true negative images (excluding the target weeds); (C) for true positive images, the major broadleaf weeds were annual fleabane, common sage, Canada thistle, and hemistepta, while the major grass weeds were crabgrass, goosegrass, barnyardgrass, and green foxtail.

Effect of Optimizers

Next, we investigated the performance of the DCNNs when they received additional training with four common deep-learning optimizers: Adagrad (Duchi et al. Reference Duchi, Hazan and Singer2011), AdaDelta (Zeiler Reference Zeiler2012), Adam (Kingma and Ba Reference Kingma and Ba2014), and SGD (Darken et al. Reference Darken, Chang and Moody1992). The characteristics of these deep-learning optimizers are described below.

Adagrad uses a different learning rate for every parameter in the network (Duchi et al. Reference Duchi, Hazan and Singer2011). It adapts the learning rate η according to how frequently each parameter is updated. The performance of Adagrad relies on manually setting a global learning rate. The optimizer AdaDelta is an extension of Adagrad (Zeiler Reference Zeiler2012). AdaDelta accumulates previous gradients over a moving window and employs a Hessian approximation to ensure that updates follow the negative gradient direction with correctly scaled units. Adam combines the advantages of Adagrad and Root Mean Square Propagation (RMSProp; Kingma and Ba Reference Kingma and Ba2014). The method calculates an adaptive learning rate for each parameter by estimating the first and second moments of the gradients. It has the following advantages: 1) simple implementation, efficient computation, and low memory demand; 2) parameter updates that are invariant to rescaling of the gradient; 3) suitability for large-scale data and parameter scenarios and for non-stationary objective functions; and 4) suitability for problems with sparse or very noisy gradients. Although Adam is currently the mainstream optimization algorithm, the best results in many fields (e.g., object recognition in computer vision) are still obtained with SGD (Wilson et al. Reference Wilson, Roelofs, Stern, Srebro and Recht2017). SGD here refers to mini-batch gradient descent (Qian et al. Reference Qian, Jin, Yi, Zhang and Zhu2015), one of the simplest deep-learning optimizers, which computes the gradient on a mini-batch at every iteration. Although SGD is one of the most commonly used optimizers, its disadvantages are obvious: it can easily converge to a local optimum and become trapped at a saddle point.
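As a minimal illustration of how these four optimizers are typically instantiated, the PyTorch sketch below uses placeholder settings (an assumption of this illustration; the study configured the optimizers through Caffe solver files, and the values here are not the hyperparameters in Table 1).

```python
# Hedged sketch: instantiating the four optimizers compared in this study.
# Learning rates and momenta are illustrative placeholders only.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for any of the DCNNs described above

optimizers = {
    "Adagrad": torch.optim.Adagrad(model.parameters(), lr=0.01),    # per-parameter rates
    "AdaDelta": torch.optim.Adadelta(model.parameters(), rho=0.9),  # windowed gradient accumulation
    "Adam": torch.optim.Adam(model.parameters(), lr=0.001,
                             betas=(0.9, 0.999)),                   # 1st/2nd gradient moments
    "SGD": torch.optim.SGD(model.parameters(), lr=0.01,
                           momentum=0.9),                           # mini-batch gradient descent
}
```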

Detection of Broadleaf and Grass Weeds

The deep learning architectures AlexNet, GoogLeNet, VGGNet, and ResNet were trained using input images of 200×200 pixels to detect broadleaf and grass weeds growing among alfalfa plants. The neural networks were trained with a total of 3,000 positive images (containing broadleaf weeds) and 3,000 negative images (containing grass weeds). The images in the VD, TD, TD 1, and TD 2 contained broadleaf or grass weeds. The VD contained 600 positive and 600 negative images, the TD contained 300 positive and 300 negative images, and TD 1 and TD 2 each contained 700 positive and 700 negative images.

Training and Testing

The training and testing datasets were imported into the NVIDIA Deep Learning GPU Training System (DIGITS v. 6.0.0; NVIDIA Corporation, Santa Clara, CA, USA). Training and testing were performed on a computer with a GeForce RTX 2080 Ti graphics card and 64 GB of memory using the Convolutional Architecture for Fast Feature Embedding (CAFFE; Jia et al. Reference Jia, Shelhamer, Donahue, Karayev, Long, Girshick, Guadarrama and Darrell2014). The hyperparameters used for training the neural networks are presented in Table 1. The actual training was carried out using the initial hyperparameters proposed by the original authors (Darken et al. Reference Darken, Chang and Moody1992; Duchi et al. Reference Duchi, Hazan and Singer2011; Kingma and Ba Reference Kingma and Ba2014; Zeiler Reference Zeiler2012).
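For readers without access to DIGITS, the overall procedure can be approximated as follows. This is a hedged PyTorch stand-in for the DIGITS/Caffe pipeline; the data directory, batch size, and optimizer settings are placeholders, not the hyperparameters reported in Table 1.

```python
# Hedged stand-in for the DIGITS/Caffe training pipeline (PyTorch sketch).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),  # match the pretraining tensor size
    transforms.ToTensor(),
])
# Expects a hypothetical folder with "positive" and "negative" subfolders
# holding the cropped 200x200-pixel tiles.
train_ds = datasets.ImageFolder("train_200x200", transform=tfm)
loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

net = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
net.classifier[6] = nn.Linear(4096, 2)
opt = torch.optim.Adadelta(net.parameters())  # AdaDelta performed best for VGGNet here
loss_fn = nn.CrossEntropyLoss()

net.train()
for epoch in range(30):  # the study reports 30 training steps
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
```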

Table 1. Hyperparameters used for training the neural networks. a

a The deep convolutional neural networks AlexNet, GoogLeNet, VGGNet, and ResNet were evaluated using various image sizes and various deep-learning optimizers for training purposes. All four neural networks were trained with input images of 200×200 pixels. The images in the validation and testing datasets contained images of both broadleaf and grass weeds.

The testing and validation results of the neural networks were arranged in a confusion matrix with four possible conditions: true positive (tp), false positive (fp), false negative (fn), and true negative (tn). Precision, recall, and F1 scores were computed based on the results of confusion matrices.

Precision measures the accuracy of the neural network at positive detection and was calculated using Equation 1 (Hoiem et al. Reference Hoiem, Chodpathumwan and Dai2012; Sokolova and Lapalme Reference Sokolova and Lapalme2009; Tao et al. Reference Tao, Barker and Sarathy2016):

(1) $${\rm{Precision}} = \;{tp \over {tp + fp}}$$

Recall measures the effectiveness of the neural network in identifying the target and was determined using Equation 2 (Hoiem et al. Reference Hoiem, Chodpathumwan and Dai2012; Sokolova and Lapalme Reference Sokolova and Lapalme2009; Tao et al. Reference Tao, Barker and Sarathy2016):

(2) $${\rm{Recall}} = {{tp} \over {tp + fn}}$$

F1 score is the harmonic mean of precision and recall. The F1 score is used for comprehensive evaluation of precision and recall and was calculated using Equation 3 (Tao et al. Reference Tao, Barker and Sarathy2016):

(3) $${{\rm{F}}_1}\;{\rm{score}} = {{2 \times {\rm{precision}} \times {\rm{recall}}} \over {{\rm{precision}} + {\rm{recall}}}}$$
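As a minimal sketch, Equations 1 to 3 can be computed directly from the confusion-matrix counts; the counts in the usage example below are hypothetical, not values from this study.

```python
# Precision, recall, and F1 score from confusion-matrix counts (Equations 1-3).
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

def f1_score(tp: int, fp: int, fn: int) -> float:
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical example: 295 true positives, 5 false positives, 3 false negatives
print(round(f1_score(tp=295, fp=5, fn=3), 2))  # 0.99
```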

Results and Discussion

Effect of Training Using Various Image Sizes

The input image size significantly affected the ability of the DCNNs to detect weeds (Table 2). The neural networks that were trained with the small input images (200×200 pixels) performed better than they did with any other image size (400×400, 600×600, and 800×800 pixels), as evidenced by higher precision, recall, and F1 score values. For all neural networks, the F1 scores were ≥0.94 for the VD and TD when the networks were trained with the small (200×200 pixels) images; however, the F1 scores were ≤0.95, ≤0.87, and ≤0.82 when the neural networks were trained with the relatively larger input image sizes of 400×400, 600×600, and 800×800 pixels, respectively. Interestingly, an increase in image size resulted in lower F1 scores for GoogLeNet and VGGNet than for AlexNet and ResNet. When the neural networks were trained with large input images of 800×800 pixels, the F1 scores for GoogLeNet and VGGNet were ≤0.59 and ≤0.44, respectively, for VD, TD, TD 1, and TD 2, whereas the F1 scores of AlexNet and ResNet were ≥0.79 and ≥0.81, respectively.

Table 2. Image classification using deep convolutional neural network architectures under different image sizes in validation and testing datasets for detection of weeds in alfalfa crops. a, b

a Abbreviations: VD, validation dataset; TD, testing dataset; TD 1, testing dataset 1; TD 2, testing dataset 2.

b The models were trained to detect all types of weeds. The training datasets contained 3,000 positive and 3,000 negative images; the validation dataset contained 600 positive and 600 negative images; the testing dataset contained 300 positive and 300 negative images; and TD 1 and TD 2 each contained 700 positive and 700 negative images.

A significant difference was observed among the neural networks in their ability to detect weeds. When the neural networks were trained with input images of 200×200 pixels, AlexNet, GoogLeNet, and VGGNet were highly effective and achieved high F1 scores (≥0.98), with high recall values (≥0.99), for the VD and TD; however, the F1 scores of ResNet were ≤0.96 for the VD and TD, primarily due to low precision (≤0.94). The F1 scores of AlexNet, GoogLeNet, and VGGNet were ≥0.99 for TD 2 but ≤0.98 for TD 1. The lower recall on TD 1 compared with TD 2 cannot be adequately explained, but it might be related to the presence of a greater diversity of weed species and a wider range of alfalfa heights. ResNet demonstrated greater image classification accuracy than AlexNet, GoogLeNet, and VGGNet when the neural networks were trained with large input images; it also had the highest F1 scores across all validation and testing datasets when the neural networks were trained with images of 800×800 pixels.

Among the four image sizes evaluated, the models trained with the 200×200-pixel datasets demonstrated the greatest ability to recognize weeds. The loss curves of the four networks trained with the four image sizes are shown in the schematic diagram on the left of Figure 2. With training images of 200×200 pixels, the models iterated for a total of 30 steps, started to converge within 5 steps, and then stabilized. This size outperformed the other cropping sizes, achieving stable convergence in less time, the lowest converged loss value, and the highest accuracy.

Figure 2. Loss curve of convolutional neural network training.

Transfer learning is the process of recycling previously trained neural networks by updating a small part of the original weights with new data (Bengio et al. Reference Bengio, Guyon, Dror and Lemaire2012). The use of transfer learning can reduce the amount of data required for training DCNNs (Espejo-Garcia et al. Reference Espejo-Garcia, Mylonas, Athanasakos, Fountas and Vasilakoglou2020) and is therefore widely adopted for training deep-learning models (Geng et al. Reference Geng, Alkhachroum, Bicchi, Jagid, Cajigas and Chen2021; Mohanty et al. Reference Mohanty, Hughes and Salathé2016; Singh et al. Reference Singh, Ganapathysubramanian, Sarkar and Singh2018; Yu et al. Reference Yu, Sharpe, Schumann and Boyd2019a, Reference Yu, Sharpe, Schumann and Boyd2019b, Reference Yu, Schumann, Sharpe, Li and Boyd2020). In addition, He et al. (Reference He, Zhang, Ren and Sun2015) noted that the use of fixed-size input images might significantly reduce the recognition accuracy of images or sub-images of arbitrary size. Mishkin et al. (Reference Mishkin, Sergievskiy and Matas2017) reported a similar finding: the size of training images can significantly affect the recognition accuracy of DCNNs. As the input image size increases, the number of pixels in the images also increases. Excessively large images may dilute the abstraction of the extracted features and increase the computational burden, thereby reducing recognition accuracy; conversely, the critical information needed for feature extraction may not be well preserved when excessively small images are used. In addition, the small image size (200×200 pixels) likely performed best because it is close to the initial spatial tensor image sizes used for pretraining the neural networks: AlexNet was pretrained with a spatial tensor image size of 227×227 pixels, while GoogLeNet, VGGNet, and ResNet were pretrained with images of 224×224 pixels (He et al. Reference He, Zhang, Ren and Sun2016; Krizhevsky et al. Reference Krizhevsky, Sutskever and Hinton2012; Simonyan and Zisserman Reference Simonyan and Zisserman2014; Szegedy et al. Reference Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke and Rabinovich2015). Therefore, further reducing the training image size (i.e., below 200×200 pixels) may not improve weed detection accuracy. Further study is needed to verify this assumption.
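The transfer-learning recipe described above, which updates only a small part of the original weights, can be sketched as follows. This is a hedged PyTorch illustration, not the authors' exact Caffe configuration.

```python
# Hedged transfer-learning sketch: keep the ImageNet-pretrained convolutional
# weights fixed and fine-tune only a new two-class head on the weed images.
import torch.nn as nn
import torchvision.models as models

net = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
for p in net.features.parameters():
    p.requires_grad = False             # freeze the pretrained feature extractor
net.classifier[6] = nn.Linear(4096, 2)  # new weed/no-weed head, trained from scratch
```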

In previous research, neural networks exhibited excellent weed detection accuracy, but they required very large numbers of training images (Ferreira et al. Reference dos Santos Ferreira, Freitas, da Silva, Pistori and Folhes2017; Yu et al. Reference Yu, Schumann, Sharpe, Li and Boyd2020). For example, Yu et al. (Reference Yu, Schumann, Sharpe, Li and Boyd2020) used a dataset of 8,000 positive and 9,000 negative images to train neural networks to detect and classify multiple grass weed species growing among bermudagrass plants; the authors reported that VGGNet outperformed AlexNet and GoogLeNet in this task. Based on the present study's findings, we suggest that using the most appropriate training image size can substantially enhance weed detection performance and thereby reduce the number of training images required. Furthermore, using an appropriate image size may also minimize the differences among neural networks in their ability to detect weeds, although this assumption needs to be further verified.

Effect of Optimizers

The AdaDelta and SGD optimizers generally outperformed Adagrad and Adam. The F1 scores of AlexNet trained with AdaDelta and SGD were ≥0.96 for VD, TD, TD 1, and TD 2, whereas the F1 scores were ≤0.92 when AlexNet was trained with Adagrad and ≤0.98 when it was trained with Adam (Table 3). The F1 scores of GoogLeNet did not differ significantly between the optimizers when the VD and TD were used. However, the F1 scores for GoogLeNet on TD 1 and TD 2 were ≥0.97 when it was trained with AdaDelta and SGD, but ≤0.95 when it was trained with Adagrad and ≤0.93 when it was trained with Adam (Table 3). The F1 scores for VGGNet were ≥0.98 across VD, TD, TD 1, and TD 2 when it was trained with AdaDelta, but ≤0.98 when it was trained with Adagrad, Adam, or SGD (Table 3). ResNet trained with SGD and Adam exhibited significantly lower F1 scores on TD 1 and TD 2 than when it was trained with Adagrad and AdaDelta. These characteristics are evident in the loss curves on the right side of Figure 2. For AlexNet, Adagrad and Adam were clearly unsuitable compared with SGD and AdaDelta; SGD converged faster than AdaDelta, and both curves eventually stabilized. For GoogLeNet, the curves of the four optimizers differed little and leveled off in the end. For VGGNet, Adam performed worse than the other three optimizers; among those three, AdaDelta reached convergence fastest, and all three curves eventually stabilized. For ResNet, the four curves fluctuated greatly and had not stabilized after 30 iterations. These findings indicate that the classification accuracy of weed detection can be improved when the neural networks are trained with appropriate optimizers.

Table 3. Image classification using deep convolutional neural network architectures under different deep learning optimizers in validation and testing datasets for detection of weeds in alfalfa. a, b

a Abbreviations: VD, validation dataset; TD, testing dataset; TD 1, testing dataset 1; TD 2, testing dataset 2.

b The models were trained to detect all types of weeds in images of 200×200 pixels, and the training dataset contained 3,000 positive and 3,000 negative images. The VD contained 600 positive and 600 negative images; the TD contained 300 positive and 300 negative images; and TD 1 and TD 2 each contained 700 positive and 700 negative images.

To date, hundreds of deep-learning optimizers have been developed (Schmidt et al. Reference Schmidt, Schneider and Hennig2021). However, the research community commonly relies on benchmarking or even personal and anecdotal experience to choose an optimizer (Geng et al. Reference Geng, Alkhachroum, Bicchi, Jagid, Cajigas and Chen2021; Nagaraju and Chawla Reference Nagaraju and Chawla2020). During training, an optimization algorithm is required to reduce the loss by updating the weight parameters (Choi et al. Reference Choi, Shallue, Nado, Lee, Maddison and Dahl2019; Schmidt et al. Reference Schmidt, Schneider and Hennig2021). The optimization algorithm can significantly affect the training speed and determine the final performance of the neural network being trained (Choi et al. Reference Choi, Shallue, Nado, Lee, Maddison and Dahl2019). Our results confirmed the importance of selecting an appropriate deep-learning optimizer when training neural network models for weed detection: the neural networks evaluated here needed different optimizers to achieve their best weed detection performance.

To the best of our knowledge, this is the first report to investigate the effect of deep-learning optimizers on neural networks for weed detection. For the detection of fruits or plant diseases, recent empirical comparisons have revealed differences between neural networks trained with different optimizers (Postalcolu Reference Postalcolu2020; Schmidt et al. Reference Schmidt, Schneider and Hennig2021). In one study, Adam, SGD, and RMSProp were used to train DCNNs for fruit detection, and Adam and RMSProp outperformed SGD (Postalcolu Reference Postalcolu2020). In another study, Xception trained with the optimizer Adam achieved higher F1 scores for classifying plant disease images than the same network trained with other optimizers, including Adagrad, Adamax, SGD, and RMSProp (Saleem et al. Reference Saleem, Potgieter and Arif2020). Wilson et al. (Reference Wilson, Roelofs, Stern, Srebro and Recht2017) reported that adaptive learning-rate methods (e.g., Adagrad, AdaDelta, RMSProp, Adam) generally performed worse than SGD in object recognition, character-level language modeling, and constituency parsing.

Detection of Broadleaf vs. Grass Weeds

Based on the results presented in the two sections above, AlexNet, GoogLeNet, VGGNet, and ResNet were trained with 200×200-pixel images and their most effective deep-learning optimizers to detect multiple broadleaf and grass weeds. For each neural network, no obvious difference was observed between its ability to detect broadleaf weeds and its ability to detect grasses, as evidenced by the precision, recall, and F1 score values (Table 4). Among the neural networks we evaluated, VGGNet consistently produced the highest F1 scores when VD, TD, TD 1, and TD 2 were used (classification results are shown in Figure 3), whereas ResNet consistently produced the lowest F1 scores for detecting broadleaf and grass weeds. VGGNet achieved high F1 scores (≥0.99) with high recall (≥0.98) when VD, TD, and TD 2 were used, whereas the F1 scores of ResNet never exceeded 0.73. GoogLeNet outperformed AlexNet in detecting broadleaf and grass weeds; the F1 scores of GoogLeNet were consistently higher than those of AlexNet across VD, TD, TD 1, and TD 2.

Table 4. Image classification using deep convolutional neural network architectures in validation and testing datasets to detect broadleaf vs. grass weeds in alfalfa. a, b

a Abbreviations: VD, validation dataset; TD, testing dataset; TD 1, testing data set 1; TD 2, testing data set 2.

b The models were trained to detect all types of weeds with training images of 200×200 pixels, and the training dataset contained 3,000 positive and 3,000 negative images. The validation dataset contained 600 positive and 600 negative images; the testing dataset contained 300 positive and 300 negative images; and TD 1 and TD 2 each contained 700 positive and 700 negative images.

Figure 3. Classification results of the VGGNet in the testing dataset.

Although the neural networks achieved excellent weed detection performance when trained with the best-performing image size and optimizer, various factors may still affect their performance. In the present study, except for ResNet, the precision and recall values were lower when TD 1 was used than when TD 2 was used. This result might be because the TD 1 photographs were acquired primarily on cloudy days and thus were darker than the TD 2 images. Additional studies are needed to train and test the neural networks with images spanning a wide range of geographic locations, weed species, weed densities, weed and crop growth stages, and light intensities, and to evaluate their ability to adapt to more complex situations.

DCNNs detect weeds based on plant morphological features, including leaf pattern and texture (Kamilaris and Prenafeta-Boldu Reference Kamilaris and Prenafeta-Boldu2018). For this reason, detecting broadleaf weeds growing in alfalfa fields is hypothetically more difficult than detecting grasses. However, for all the neural networks tested here, the present study clearly showed no obvious difference between their ability to detect broadleaf weeds and their ability to detect grasses. We note that a high image processing speed is vital for real-time weed recognition and precision herbicide application (Yang et al. Reference Yang, Prasher, Landry and DiTommaso2000). Using the NVIDIA GeForce RTX 2080 Ti graphics processing unit, the neural networks in this study processed images quickly: 23, 35, 64, and 68 ms image−1 for AlexNet, GoogLeNet, VGGNet, and ResNet, respectively.

Conclusion

This research demonstrated the feasibility of using DCNNs for weed detection in alfalfa crops. AlexNet, GoogLeNet, VGGNet, and ResNet trained with small input images of 200×200 pixels performed better than when larger images of 400×400, 600×600, or 800×800 pixels were used. Furthermore, the choice of deep-learning optimizer can significantly affect the performance of the neural networks: AdaDelta and SGD outperformed Adagrad and Adam when used with AlexNet and GoogLeNet; AdaDelta outperformed Adagrad, Adam, and SGD when used with VGGNet; and Adagrad and AdaDelta outperformed Adam and SGD when used with ResNet. Each neural network detected broadleaf and grass weeds with similar accuracy. When the neural networks were trained with the best-performing input image size and optimizer, they ranked as follows, from highest to lowest classification accuracy: VGGNet > GoogLeNet > AlexNet > ResNet. Future research will integrate these neural networks into the machine vision subsystem of smart sprayers.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 32072498), the Key Research and Development Program of Jiangsu Province (Grant No. BE2021016), and Jiangsu Agricultural Science and Technology Innovation Fund (Grant No. CX(21)3184). The authors declare no conflicts of interest.

Footnotes

Associate Editor: Vipan Kumar, Kansas State University

References

Ahmad, J, Jan, B, Farman, H, Ahmad, W, Ullah, A (2020) Disease detection in plum using convolutional neural network under true field conditions. Sensors 20:5569
Akbarzadeh, S, Paap, A, Ahderom, S, Apopei, B, Alameh, K (2018) Plant discrimination by Support Vector Machine classifier based on spectral reflectance. Comput Electron Agric 148:250–258
Azhari, M, Abarda, A, Ettaki, B, Zerouaoui, J, Dakkon, M (2020) Higgs boson discovery using machine learning methods with PySpark. Procedia Comput Sci 170:1141–1146
Bai, Z, Ma, W, Ma, L, Velthof, GL, Wei, Z, Petr, H, Oene, O, Lee, M, Zhang, F (2018) China's livestock transition: driving forces, impacts, and consequences. Sci Adv 4:eaar8534
Bakhshipour, A, Jafari, A (2018) Evaluation of support vector machine and artificial neural networks in weed detection using shape features. Comput Electron Agric 145:153–160
Bengio, Y, Guyon, G, Dror, V, Lemaire, G, Silver, D (2012) Deep learning of representations for unsupervised and transfer learning. Pages 17–36 in Proceedings of the ICML Workshop on Unsupervised and Transfer Learning. JMLR Workshop and Conference Proceedings, Bellevue, Washington, July 2, 2011
Choi, D, Shallue, CJ, Nado, Z, Lee, J, Maddison, CJ, Dahl, GE (2019) On empirical comparisons of optimizers for deep learning. https://arxiv.org/abs/1910.05446. Accessed: June 16, 2020
Ciodaro, T, Deva, D, De Seixas, J, Damazio, D (2012) Online particle detection with neural networks based on topological calorimetry information. Volume 368 in Proceedings of the 14th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT). London: STFC
Cudney, DW, Adams, O (1993) Improving weed control with 2,4-DB amine in seedling alfalfa (Medicago sativa). Weed Technol 7:465–470
Darken, C, Chang, J, Moody, J (1992) Learning rate schedules for faster stochastic gradient search. Volume 2 in Neural Networks for Signal Processing. Helsingør, Denmark: Citeseer
Deng, J, Dong, W, Socher, R, Li, LJ, Li, K, Li, FF (2009) ImageNet: a large-scale hierarchical image database. Pages 248–255 in 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, Florida, June 20–25, 2009
Duchi, J, Hazan, E, Singer, Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 12:2121–2159
Espejo-Garcia, B, Mylonas, N, Athanasakos, L, Fountas, S, Vasilakoglou, I (2020) Towards weeds identification assistance through transfer learning. Comput Electron Agric 171:105306
Farooq, A, Hu, JK, Jia, XP (2019) Analysis of spectral bands and spatial resolutions for weed classification via deep convolutional neural network. IEEE Geosci Remote Sens Lett 16:183–187
Ferentinos, KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318
dos Santos Ferreira, A, Freitas, DM, da Silva, GG, Pistori, H, Folhes, MT (2017) Weed detection in soybean crops using ConvNets. Comput Electron Agric 143:314–324
Franco, C, Pedersen, SM, Papaharalampos, H, Orum, JE (2017) The value of precision for image-based decision support in weed management. Precis Agric 18:366–382
Geng, DV, Alkhachroum, A, Bicchi, MA, Jagid, JR, Cajigas, I, Chen, ZS (2021) Deep learning for robust detection of interictal epileptiform discharges. J Neural Eng 18:1–9
Ghosal, S, Blystone, D, Singh, AK, Ganapathysubramanian, B, Singh, A, Sarkar, S (2018) An explainable deep machine vision framework for plant stress phenotyping. Proc Natl Acad Sci U S A 115:4613–4618
Hamuda, E, Mc Ginley, B, Glavin, M, Jones, E (2017) Automatic crop detection under field conditions using the HSV colour space and morphological operations. Comput Electron Agric 133:97–107
He, K, Zhang, X, Ren, S, Sun, J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE T Pattern Anal 37:1904–1916
He, K, Zhang, X, Ren, S, Sun, J (2016) Deep residual learning for image recognition. Pages 770–778 in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Nevada, June 27–30, 2016
Helmstaedter, M, Briggman, KL, Turaga, SC, Jain, V, Seung, HS, Denk, W (2013) Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature 500:168–174
Hennessy, PJ, Esau, TJ, Farooque, AA, Schumann, AW, Zaman, QU, Corscadden, KW (2021) Hair fescue and sheep sorrel identification using deep learning in wild blueberry production. Remote Sens 13:943
Hoiem, DC, Chodpathumwan, Y, Dai, Q (2012) Diagnosing error in object detectors. Pages 340–353 in European Conference on Computer Vision. Berlin: Springer
Hussain, N, Farooque, AA, Schumann, AW, Abbas, F, Acharya, B, McKenzie-Gopsill, A, Barrett, R, Afzaal, H, Zaman, QU, Cheema, MJ (2021) Application of deep learning to detect Lamb's quarters (Chenopodium album L.) in potato fields of Atlantic Canada. Comput Electron Agric 182:106040
Idris, KI, Dongola, GM, Elamin, SE, Babiker, MM (2019) Evaluation of clethodim for weed control in alfalfa (Medicago sativa L.). U of K J Agric Sci 22:126–135
Jia, Y, Shelhamer, E, Donahue, J, Karayev, S, Long, J, Girshick, R, Guadarrama, S, Darrell, T (2014) Caffe: convolutional architecture for fast feature embedding. Pages 675–678 in Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, Florida, November 3–7, 2014
Jordan, MI, Mitchell, TM (2015) Machine learning: trends, perspectives, and prospects. Science 349:255–260
Kamilaris, A, Prenafeta-Boldu, FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90
Kerr, LA, Johnson, BJ, Burrows, GE (1986) Intoxication of cattle by Perilla frutescens (purple mint). Vet Human Toxicol 28:412–416
Kingma, DP, Ba, J (2014) Adam: a method for stochastic optimization. https://arxiv.org/abs/1412.6980. Accessed: January 30, 2017
Krizhevsky, A, Sutskever, I, Hinton, G (2012) ImageNet classification with deep convolutional neural networks. Pages 1097–1105 in Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, Nevada, December 3–6, 2012
LeCun, Y, Bengio, Y, Hinton, G (2015) Deep learning. Nature 521:436–444
Li, J, Gu, H, Liu, Y, Wei, S, Hu, G, Wang, X, McNeill, MR, Ban, L (2021) RNA-seq reveals plant virus composition and diversity in alfalfa, thrips, and aphids in Beijing, China. Arch Virol 166:1711–1722
Ma, JS, Sheridan, RP, Liaw, A, Dahl, GE, Svetnik, V (2015) Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55:263–274
Mielmann, A (2013) The utilisation of lucerne (Medicago sativa): a review. Br Food J 115:590–600
Mishkin, D, Sergievskiy, N, Matas, J (2017) Systematic evaluation of convolution neural network advances on the ImageNet. Comput Vis Image Underst 161:11–19
Mohanty, SP, Hughes, DP, Salathé, M (2016) Using deep learning for image-based plant disease detection. Front Plant Sci 7:1419
Nagaraju, M, Chawla, P (2020) Systematic review of deep learning techniques in plant disease detection. Int J Syst Assur Eng 11:547–560
Ni, C, Wang, D, Vinson, R, Holmes, M, Tao, Y (2019) Automatic inspection machine for maize kernels based on deep convolutional neural networks. Biosyst Eng 178:131–144
Postalcolu, S (2020) Performance analysis of different optimizers for deep learning-based image recognition. Int J Pattern Recognit Artif Intell 34:16
Pulido, C, Solaque, L, Velasco, N (2017) Weed recognition by SVM texture feature classification in outdoor vegetable crop images. Ing Invest 37:68–74
Qian, Q, Jin, R, Yi, J, Zhang, L, Zhu, S (2015) Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD). Mach Learn 99:353–372
Radovic, J, Sokolovic, D, Markovic, J (2009) Alfalfa-most important perennial forage legume in animal husbandry. Biotechnol Anim Husb 25:465–475
Richter, N, Siddhuraju, P, Becker, K (2003) Evaluation of nutritional quality of moringa (Moringa oleifera Lam.) leaves as an alternative protein source for Nile tilapia (Oreochromis niloticus L.). Aquaculture 217:599–611
Sabzi, S, Abbaspour-Gilandeh, Y, Arribas, JI (2020) An automatic visible-range video weed detection, segmentation and classification prototype in potato field. Heliyon 6:e03685
Sabzi, S, Abbaspour-Gilandeh, Y, Garcia-Mateos, G (2018) A fast and accurate expert system for weed identification in potato crops using metaheuristic algorithms. Comput Ind 98:80–89
Saleem, MH, Potgieter, J, Arif, KM (2020) Plant disease classification: a comparative evaluation of convolutional neural networks and deep learning optimizers. Plants 9:1319
Salzano, A, Neglia, G, D'Onofrio, N, Balestrieri, ML, Limone, A, Cotticelli, A, Marrone, R, Anastasio, A, D'Occhio, MJ, Campanile, G (2021) Green feed increases antioxidant and antineoplastic activity of buffalo milk: a globally significant livestock. Food Chem 344:128669
Saood, A, Hatem, I (2021) COVID-19 lung CT image segmentation using deep learning methods: U-Net versus SegNet. BMC Med Imag 21:1–10
Schmidt, RM, Schneider, F, Hennig, P (2021) Descending through a crowded valley—benchmarking deep learning optimizers. Pages 9367–9376 in Proceedings of the 38th International Conference on Machine Learning, Virtual, July 18–24, 2021
Sharpe, SM, Schumann, AW, Boyd, NS (2019a) Detection of Carolina geranium (Geranium carolinianum) growing in competition with strawberry using convolutional neural networks. Weed Sci 67:239–245
Sharpe, SM, Schumann, AW, Yu, J, Boyd, NS (2019b) Vegetation detection and discrimination within vegetable plasticulture row-middles using a convolutional neural network. Precis Agric 21:264–277
Shi, J, Li, Z, Zhu, T, Wang, D, Ni, C (2020) Defect detection of industry wood veneer based on NAS and multi-channel mask R-CNN. Sensors 20:4398
Simonyan, K, Zisserman, A (2014) Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR). https://arxiv.org/abs/1409.1556. Accessed: May 28, 2019
Singh, AK, Ganapathysubramanian, B, Sarkar, S, Singh, A (2018) Deep learning for plant stress phenotyping: trends and future perspectives. Trends Plant Sci 23:883–898
Sokolova, M, Lapalme, G (2009) A systematic analysis of performance measures for classification tasks. Inf Process Manage 45:427–437
Sujaritha, M, Annadurai, S, Satheeshkumar, J, Sharan, SK, Mahesh, L (2017) Weed detecting robot in sugarcane fields using fuzzy real time classifier. Comput Electron Agric 134:160–171
Szegedy, C, Liu, W, Jia, Y, Sermanet, P, Reed, S, Anguelov, D, Erhan, D, Vanhoucke, V, Rabinovich, A (2015) Going deeper with convolutions. Pages 1–9 in IEEE Conference on Computer Vision and Pattern Recognition, Boston, Massachusetts, June 7–12, 2015
Tao, A, Barker, J, Sarathy, S (2016) DetectNet: deep neural network for object detection in DIGITS. https://devblogs.nvidia.com/detectnet-deep-neural-network-object-detection-digits. Accessed: May 11, 2018
Tompson, J, Jain, A, LeCun, Y, Bregler, C (2014) Joint training of a convolutional network and a graphical model for human pose estimation. Pages 1799–1807 in Proceedings of the 28th Conference on Neural Information Processing Systems (NIPS), Montreal, Quebec, December 7–12, 2014
Wang, AC, Zhang, W, Wei, XH (2019) A review on weed detection using ground-based machine vision and image processing techniques. Comput Electron Agric 158:226–240
Wang, Q, Gao, JY, Yuan, Y (2018) A joint convolutional neural networks and context transfer for street scenes labeling. IEEE Trans Intell Transp Syst 19:1457–1470
Wilson, AC, Roelofs, R, Stern, M, Srebro, N, Recht, B (2017) The marginal value of adaptive gradient methods in machine learning. Pages 4154–4161 in Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, California, December 4–9, 2017
Wilson, RG, Burgener, PA (2009) Evaluation of glyphosate-tolerant and conventional alfalfa weed control systems during the first year of establishment. Weed Technol 23:257–263
Yang, C, Prasher, SO, Landry, J, DiTommaso, A (2000) Application of artificial neural networks in image recognition and classification of crop and weeds. Can Agric Eng 42:147–152
Yao, P, Gai, S, Chen, Y, Chen, W, Da, F (2021) A multi-code 3D measurement technique based on deep learning. Opt Lasers Eng 143:106623
Yu, J, Schumann, AW, Sharpe, SM, Li, X, Boyd, NS (2020) Detection of grassy weeds in bermudagrass with deep convolutional neural networks. Weed Sci 68:545–552
Yu, J, Sharpe, SM, Schumann, AW, Boyd, NS (2019a) Deep learning for image-based weed detection in turfgrass. Eur J Agron 104:78–84
Yu, J, Sharpe, SM, Schumann, AW, Boyd, NS (2019b) Detection of broadleaf weeds growing in turfgrass with convolutional neural networks. Pest Manag Sci 75:2211–2218
Zaman, QU, Esau, TJ, Schumann, AW, Percival, DC, Chang, YK, Read, SM, Farooque, AA (2011) Development of prototype automated variable rate sprayer for real-time spot-application of agrochemicals in wild blueberry fields. Comput Electron Agric 76:175–182
Zeiler, M (2012) ADADELTA: an adaptive learning rate method. https://arxiv.org/abs/1212.5701. Accessed: May 14, 2018
Zhou, H, Zhuang, Z, Liu, Y, Zhang, X (2020) Defect classification of green plums based on deep learning. Sensors 20:6993