LLM-Pep: Targeted Modeling for Anti-Parasitic Peptide Detection Using Large Language Models
DOI:
https://doi.org/10.56979/1101/2026/1361
Keywords:
Computational Intelligence, bioinformatics, anti-parasitic peptides, pre-trained language models, multi-layer perceptron
Abstract
Parasites pose serious threats to host organisms, and anti-parasitic peptides (APPs) have shown potential in inhibiting parasite growth and reproduction. However, traditional biological screening methods, such as nanomedicine-based assays and organism-based approaches, are costly and time-consuming, highlighting the need for efficient computational prediction methods. In this study, we introduce a two-stage machine learning framework for accurate APP identification. First, to handle the class imbalance in the training data, we apply a random undersampling strategy to construct a balanced training set. Next, peptide sequences are encoded using embeddings from a pre-trained large language model and classified using a multi-layer perceptron (MLP). Unlike existing approaches that suffer from limited feature representation and poor generalization on independent datasets, our method combines deep contextual sequence embeddings with data balancing to improve robustness. Experimental results demonstrate that our model achieves an accuracy of 91.7% and an AUC of 0.939 on independent test sets, surpassing existing approaches in APP prediction.
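The data-balancing stage described in the abstract can be illustrated with a minimal sketch of random undersampling. This is not the authors' implementation: the function name `random_undersample`, the seed, and the toy peptide strings and labels below are all hypothetical, chosen only to show the idea of discarding majority-class samples at random until all classes are the same size.

```python
import random
from collections import defaultdict


def random_undersample(sequences, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    if not sequences:
        return [], []

    rng = random.Random(seed)

    # Group sequences by class label.
    by_class = defaultdict(list)
    for seq, label in zip(sequences, labels):
        by_class[label].append(seq)

    # The smallest class sets the target size for every class.
    n_min = min(len(seqs) for seqs in by_class.values())

    # Sample n_min sequences without replacement from each class.
    balanced = []
    for label, seqs in by_class.items():
        for seq in rng.sample(seqs, n_min):
            balanced.append((seq, label))

    # Shuffle so the classes are interleaved in the output.
    rng.shuffle(balanced)
    out_sequences = [seq for seq, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_sequences, out_labels


# Toy, made-up data: 2 positives (label 1) and 6 negatives (label 0).
peptides = ["KWKLFKKI", "GIGAVLKV", "FLPLIGRV", "AAAAGGGG",
            "LLEEKKLL", "PPGGSSTT", "DDEEDDEE", "NNQQNNQQ"]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

balanced_peptides, balanced_labels = random_undersample(peptides, labels, seed=42)
```

In this sketch the balanced set keeps both positive peptides and a random pair of negatives. The balanced sequences would then be passed to the embedding and MLP stage, which is not shown here because the abstract does not name the language model or the classifier settings.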
License
This is an open access article published by the Research Center of Computing & Biomedical Informatics (RCBI), Lahore, Pakistan, under the CC BY 4.0 International License.




