| Name | Description | Size | Format |
|---|---|---|---|
| | | 819.49 KB | Adobe PDF |
Authors
Advisor(s)
Abstract
This study investigates the automatic classification of exam questions according to Bloom’s
Taxonomy, a hierarchical framework used to evaluate cognitive complexity in educational
assessment. Addressing key challenges such as class imbalance and linguistic ambiguity, the
research evaluates 26 model–feature combinations across classical machine learning, gradient
boosting, and deep learning methods. A curated dataset of 1,800 questions, comprising both
expert-labelled and AI-generated items, was used to ensure semantic diversity and balanced
class representation across all six Bloom levels. The best-performing model, a Convolutional
Neural Network (CNN) using fastText token-level embeddings, achieved a macro F1-score of 0.831
and a Cohen’s Kappa of 0.798, outperforming traditional models and other deep learning
baselines. Analysis showed consistently high performance across all cognitive categories,
eliminating the need for an ensemble. The results demonstrate that CNNs, when paired with
subword-aware embeddings, offer an efficient and interpretable solution for cognitive-level
classification, with potential for real-world integration into educational platforms and assessment
design workflows.
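
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a macro F1-score and Cohen's Kappa can be computed from true and predicted labels. It is not the dissertation's evaluation code: the function names `macro_f1` and `cohen_kappa` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter

# The six Bloom levels; the toy labels below are illustrative, not thesis data.
BLOOM_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted items contributes an F1 of 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal label frequencies per class.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class on both sides
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    y_true = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]
    y_pred = ["Remember", "Understand", "Apply", "Evaluate", "Evaluate", "Create"]
    print("macro F1:", round(macro_f1(y_true, y_pred, BLOOM_LEVELS), 3))
    print("Cohen's kappa:", round(cohen_kappa(y_true, y_pred), 3))
```

Macro averaging gives each Bloom level equal weight regardless of how many questions it contains, which is why it is a common choice when class imbalance is a concern.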
Description
Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Keywords
Bloom’s Taxonomy; Deep Learning; Educational NLP; Synthetic Data; Cognitive Level Classification; SDG 4 - Quality education
