Plant-Disease Relation Model through BERT-BiLSTM-CRF Approach

Slamet Riyanto, Imas Sukaesih Sitanggang, Taufik Djatna, Tika Dewi Atikah

Abstract


Plant-Disease Relation (PDR) extraction is an Information Extraction (IE) subtask that identifies the relationship between plant and disease entities that appear together in a sentence. Previous research proposed a Shortest Dependency Path-Convolutional Neural Network (SDP-CNN) method to predict these relationships. However, that method has limitations when faced with long and complex sentences. To overcome these limitations, this study proposes the BERT-BiLSTM-CRF method to improve model performance in detecting PDR. First, the tokenized data is passed to the BERT Encoder layer. After the BERT Encoder computes the hidden representations, they are passed to the linear layer to obtain word embeddings. The outputs of the bilinear layer are forwarded to the softmax layer to predict the relationship of each pair. The softmax outputs are then sent to the BiLSTM layer. Finally, the CRF layer refines the predictions. The model was built with an 80:20 split of training and testing data, using the same parameter values over ten runs. GridSearch hyperparameter tuning was also applied to improve model performance. Experimental results show that the proposed architecture achieves an F1 score of 0.790, higher than the micro F1 score of 0.764 obtained by SDP-CNN. The BERT-BiLSTM-CRF method therefore addresses the problem of predicting PDR.
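The comparison with SDP-CNN is stated in terms of the micro F1 score. As a rough illustration of how such a score can be computed for relation labels, the sketch below pools true positives, false positives, and false negatives over all classes before computing precision and recall. The function name `micro_f1`, the `ignore` parameter, and the label names are hypothetical and chosen for illustration only; the abstract does not say whether a negative class was excluded from the score.

```python
def micro_f1(gold, pred, ignore=()):
    """Micro-averaged F1 over paired gold/predicted relation labels.

    Labels listed in `ignore` (for example a "no relation" class) are not
    counted as positives. This exclusion is an assumption of this sketch.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == p:
            if g not in ignore:
                tp += 1          # correct prediction of a counted class
        else:
            if p not in ignore:
                fp += 1          # predicted a counted class wrongly
            if g not in ignore:
                fn += 1          # missed a counted gold class
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, for illustration only.
gold = ["Cause", "Treat", "Negative", "Cause", "Treat"]
pred = ["Cause", "Cause", "Negative", "Cause", "Negative"]

score_all = micro_f1(gold, pred)
score_no_negative = micro_f1(gold, pred, ignore=("Negative",))
```

When every class is counted and each pair receives exactly one label, micro F1 coincides with accuracy, so excluding a negative class can change the reported value noticeably.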

Keywords


BERT; Bidirectional Long Short-Term Memory; Conditional Random Field; Plant-Disease Relation; Relation Extraction


Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
ISSN 2089-3272

This work is licensed under a Creative Commons Attribution 4.0 International License.
