Naranjo-Zolotov, Mijail JuanovichRamadan, Medhat Hassan Mohamed Mohamed2025-11-142025-11-142025-10-30http://hdl.handle.net/10362/190738Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Business IntelligenceMedical coding is a foundational element in today’s healthcare and accurate medical coding is vital to healthcare management; however, manual coding remains a significant bottleneck due to its labor-intensive nature and the human error factor, while deep-learning methods showed promising solutions for this challenge, two major factors have been widely challenging; the structure and complexity of free-text clinical notes and the sparsity of the ICD-10 coding system. These challenges are shown more clearly when real medical data is used, such as MIMIC‑IV, an extensive deidentified real patients’ medical data mart with hundreds of thousands of clinical records. In this thesis project, a deep‑learning architecture was implemented a on pretrained medical language model BioGPT to address these challenges. The design includes four innovative components: Deep Cross‑Attention mechanism between clinical documents and the textual definitions of ICD‑10 codes. Furthermore, an attention pooling layer identifies the most relevant sequences to ICD-10 codes, and a hierarchical layer focuses on the parent-child relationships in ICD‑10’s hierarchy and helps the model make use of the coding hierarchy, moreover, a document level aggregation to handle lengthy texts ensures the use of the full clinical information. The proposed model achieved a notable score when compared with similar studies on MIMIC‑III, an F1‑micro score of 74.27% with 3.59% improvement and F1‑macro score of 67.91% with 1.04% improvement.engICD-10 ClassificationClinical NLPBioGPTLarge Language ModelsMIMIC-IVHierarchical AttentionAI in HealthcareTransformer ModelsHealth InformaticsSDG 3 - Good health and well-beingSDG 9 - Industry, innovation and infrastructureSDG 17 - Partnerships for the goalsEnhanced BioGPT for Automated ICD-10 Medical Coding: Harnessing Domain-Tuned Large Language Models for Smarter Medical Codingmaster thesis204073278