Identification of sub-phenotypes and prediction of response to treatment in Multiple Myeloma patients

Morgado, Sofia Videira Begonha Sequeira

http://hdl.handle.net/10362/176178

Use this identifier to reference this record.

Name:	Description:	Size:	Format:
Morgado_2023.pdf		4.92 MB	Adobe PDF

Authors

Morgado, Sofia Videira Begonha Sequeira

Advisor(s)

Soares, Cláudia

Krippahl, Ludwig

Abstract(s)

Multiple myeloma is a heterogeneous hematological cancer that affects plasma cells in the bone marrow. Despite recent advances in treatment, patient outcomes remain variable, so a better understanding of the disease is essential. Identifying sub-phenotypes in multiple myeloma patients and predicting their response to treatment are crucial for improved individualized patient care. The primary objective of this project was to analyze a dataset provided by Janssen, The Pharmaceutical Companies of Johnson and Johnson Belgium. The aim was to assess the feasibility of addressing the challenges proposed by the company: identifying sub-phenotypes of patients and predicting their response to treatment, while ensuring that the resulting models were interpretable. The analysis showed that the dataset posed several challenges, notably the absence of crucial information, owing mainly to the large number of missing values in the provided features and to the omission of other crucial features. This dissertation addresses these stringent restrictions and contains valuable information and discussion on procedures that can be applied to other data sets with comparable limitations. Using a novel autoencoder network architecture, a procedure is defined for dealing with a large number of missing values without resorting to imputation. This new representation of the data may be used for clustering in an attempt to identify patient sub-phenotypes. Because the classifiers are required to be interpretable, they cannot be trained on these encoded features. Therefore, a missing value pattern aggregation strategy is recommended. In addition, the importance and implementation of methods that ensure model interpretability are discussed.

Multiple myeloma is a hematological cancer that affects plasma cells in the bone marrow. Despite recent advances in therapy, the clinical progression of these patients is still highly variable. For that reason, it is essential to better understand these patients in order to promote a better clinical approach. The objectives of this project included analyzing the dataset provided by Janssen, The Pharmaceutical Companies of Johnson and Johnson Belgium, to determine whether it would be possible to address the challenges proposed by this pharmaceutical company: investigating the presence of patient sub-phenotypes and predicting response to therapy, while guaranteeing the interpretability of the resulting models. This analysis made it evident that the dataset presented significant challenges, including a lack of essential information due to the large number of missing values and the absence of potentially relevant variables. This dissertation therefore includes a discussion of procedures that can be applied to data similar to ours, in view of the important objectives we defined. An autoencoder was used to create a representation of the data that has no missing values and does not depend on imputation. This new representation was then used to create clusters in an attempt to identify patient sub-phenotypes. Since we intend to create interpretable classifiers, it was not possible to use these encoded variables. We therefore defined a methodology based on the patients' missing-value patterns. Finally, methods for obtaining interpretable models were discussed.
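
The abstract does not give the details of the missing-value pattern strategy, so the following is only a minimal sketch of one plausible first step: grouping patients by which features they are missing. It is an illustration under stated assumptions, not the procedure from the dissertation. The function names (`missing_pattern`, `group_by_missing_pattern`) and the toy data are hypothetical, and `None` is assumed to mark a missing value.

```python
from collections import defaultdict
from math import isnan


def missing_pattern(row):
    """Return a tuple of booleans: True where the feature value is missing.

    Assumption: missing values are encoded as None or as float NaN.
    """
    return tuple(
        v is None or (isinstance(v, float) and isnan(v)) for v in row
    )


def group_by_missing_pattern(rows):
    """Map each missing-value pattern to the indices of the rows sharing it."""
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        groups[missing_pattern(row)].append(idx)
    return dict(groups)


# Hypothetical toy data: 4 patients, 3 features; None marks a missing value.
patients = [
    [1.2, None, 0.5],
    [0.7, None, 0.9],
    [1.1, 3.4, None],
    [0.3, 2.2, 0.8],
]

groups = group_by_missing_pattern(patients)
for pattern, members in groups.items():
    print(pattern, members)
```

Every patient in a group is observed on the same features, so an interpretable model could in principle be fitted per group on the observed columns only, without imputation. How patterns are aggregated in the dissertation itself is not specified in this record.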

Keywords

Multiple Myeloma; Clustering; Classification; Missing values; Model explainability

Collections

FCT: DI - Master's Dissertations
