Article Abstract
International Journal of Advance Research in Multidisciplinary, 2025;3(2):357-361
Resume parser using natural language processing and machine learning
Authors: Samsun A and Dr. A Angel Cerli
Abstract
Recruitment processes in modern organizations are increasingly reliant on automation to manage the high volume of job applications efficiently. Resume parsing is a key technology in this domain, enabling automatic extraction of relevant candidate information from resumes to support faster and more informed decision-making. This project presents a Python-based resume parsing system that utilizes Natural Language Processing (NLP) techniques to analyze, extract, and structure data from resumes in PDF format.
The system uses the pdfminer library to extract raw text from PDF documents, giving every later stage a plain-text input to work from. It then applies pre-trained models from the spaCy NLP library to detect candidate names, and regular expressions to extract contact details such as phone numbers and email addresses.
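The contact-extraction step can be illustrated with a minimal sketch. It assumes the resume text has already been extracted from the PDF (for example with `extract_text` from `pdfminer.high_level`) and leaves out name detection, which the system delegates to a spaCy model. The patterns `EMAIL_RE` and `PHONE_RE` and the helper `extract_contacts` are hypothetical names chosen for illustration; they are not the expressions used in the project.

```python
import re

# Illustrative patterns only (assumptions, not the project's own expressions).
# The email pattern is deliberately simple and is not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional leading "+", then digits mixed with spaces, dots, dashes or
# parentheses, starting and ending on a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_contacts(text):
    """Return the first email address and phone number found in text, or None."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


# Hypothetical resume text standing in for the output of PDF text extraction.
sample = "Jane Doe\nEmail: jane.doe@example.com\nPhone: +1 (555) 123-4567\n"
contacts = extract_contacts(sample)
```

Patterns this loose will also match some non-phone digit runs such as long ID numbers, so a production parser would likely need stricter, locale-aware rules.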
Beyond these identifying details, the system extracts further resume information, including skills, education, and work experience. This is achieved through a combination of rule-based methods and customizable keyword-matching logic. The extraction rules are kept in YAML configuration files, so they can be adapted without changing the core code, which keeps the system flexible and maintainable.
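A configurable keyword matcher of the kind described here might look like the following sketch. The `CONFIG` dictionary stands in for what a YAML rules file could load to (for instance via PyYAML's `yaml.safe_load`); its keys, the keyword list, and the function `match_keywords` are all assumptions made for illustration.

```python
import re

# Hypothetical rule set, shaped like the result of loading a YAML file.
# The actual configuration schema of the project is not given in the abstract.
CONFIG = {
    "skills": ["Python", "SQL", "Machine Learning", "Java"],
}


def match_keywords(text, keywords):
    """Return the configured keywords that occur in text as whole words or phrases."""
    found = []
    for keyword in keywords:
        # Lookarounds prevent partial hits such as "Java" inside "JavaScript".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


resume_text = "Experienced in python, JavaScript and machine learning pipelines."
skills = match_keywords(resume_text, CONFIG["skills"])
```

Because the keyword list lives in data rather than in the matching function, adding a new skill only requires editing the configuration, which is the adaptability the abstract attributes to the YAML files.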
Once processed, the extracted data is organized into a structured CSV format, allowing recruiters to quickly assess and compare candidates. This project demonstrates the effectiveness of integrating Python tools and NLP techniques to streamline resume processing, ultimately reducing manual effort and enhancing recruitment efficiency across various organizational contexts.
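The final export step can be sketched with Python's standard `csv` module. The column names in `FIELDS`, the helper `write_candidates`, and the sample record are hypothetical; the abstract does not specify the actual CSV layout.

```python
import csv
import io

# Assumed column layout; the project's real schema may differ.
FIELDS = ["name", "email", "phone", "skills"]


def write_candidates(rows, handle):
    """Write one CSV row per candidate, flattening the skills list into one cell."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        record = dict(row)
        record["skills"] = "; ".join(row.get("skills", []))
        writer.writerow(record)


# Hypothetical parsed candidate, standing in for the parser's output.
candidates = [
    {
        "name": "Jane Doe",
        "email": "jane.doe@example.com",
        "phone": "+1 555 123 4567",
        "skills": ["Python", "SQL"],
    },
]

# An in-memory buffer keeps the sketch self-contained; a real run would
# pass a file opened with newline="" instead.
buffer = io.StringIO()
write_candidates(candidates, buffer)
csv_text = buffer.getvalue()
```

Joining skills with a semicolon keeps each candidate on a single row, so the file can be sorted and filtered directly in a spreadsheet.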
Keywords
Resume parsing, Applicant Tracking System (ATS), Automated recruitment, Candidate evaluation, Structured data extraction, Resume analysis