LLM-Pep: Targeted Modeling for Anti-Parasitic Peptide Detection Using Large Language Models

Authors

  • Aqsa Amjad Department of Computer Science, University of Management and Technology, Lahore, Punjab, Pakistan.
  • Faria Nazir Department of Software Engineering, University of Management and Technology, Lahore, Punjab, Pakistan.
  • Tayyaba Anees Department of Software Engineering, University of Management and Technology, Lahore, Punjab, Pakistan.
  • Nosheen Qamar Department of Software Engineering, University of Management and Technology, Lahore, Punjab, Pakistan.
  • Wajeeha Khalil Department of Computer Science, University of Engineering and Technology, Peshawar, KPK, Pakistan.

DOI:

https://doi.org/10.56979/1101/2026/1361

Keywords:

computational intelligence, bioinformatics, anti-parasitic peptides, pre-trained language models, multi-layer perceptron

Abstract

Parasites pose serious threats to host organisms, and anti-parasitic peptides (APPs) have shown potential in inhibiting parasite growth and reproduction. However, traditional biological screening methods such as nanomedicine-based assays and organism-based approaches are costly and time-consuming, highlighting the need for efficient computational prediction methods. In this study, we introduce a two-stage machine learning framework for accurate APP identification. First, to handle the class imbalance in the training data, we apply a random undersampling strategy to construct a balanced training set. Next, peptide sequences are encoded using embeddings from a pre-trained large language model and classified using a multi-layer perceptron (MLP). Unlike existing approaches, which suffer from limited feature representation and poor generalization on independent datasets, our method combines deep contextual sequence embeddings with data balancing to improve robustness. Experimental results demonstrate that our model achieves an accuracy of 91.7% and an AUC of 0.939 on independent test sets, surpassing existing approaches in APP prediction.
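
As a rough illustration of the first stage described in the abstract, the sketch below shows one way random undersampling could balance a training set. The function name `random_undersample`, the seed, and the toy peptide strings are assumptions made for illustration; they are not the authors' code or data, and the embedding and MLP stages are not shown.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly keep only as many samples per class as the smallest class has.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        # rng.sample draws without replacement, so no sample is duplicated.
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    # Shuffle so the classes are interleaved in the returned set.
    rng.shuffle(balanced)
    out_samples = [sample for sample, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_samples, out_labels


# Toy, made-up sequences: 2 positives (label 1) and 4 negatives (label 0).
peptides = ["GLFDIVKKV", "KWKLFKKI", "AAAAGGG", "LLSSTTP", "PPGGAAS", "TTSSLLA"]
labels = [1, 1, 0, 0, 0, 0]

X_bal, y_bal = random_undersample(peptides, labels, seed=42)
```

Fixing the seed makes the balanced subset reproducible. Because the minority class is already the smallest, all of its samples are kept and only majority-class samples are discarded.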

Published

2026-05-25

How to Cite

Aqsa Amjad, Faria Nazir, Tayyaba Anees, Nosheen Qamar, & Wajeeha Khalil. (2026). LLM-Pep: Targeted Modeling for Anti-Parasitic Peptide Detection Using Large Language Models. Journal of Computing & Biomedical Informatics, 11(01). https://doi.org/10.56979/1101/2026/1361

Section

Articles