| Name | Description | Size | Format |
|---|---|---|---|
| | | 7.07 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Colorectal cancer (CRC) is one of the cancers with the highest incidence and mortality rates, and the number of cases in young patients has increased considerably in recent years, making it a serious public health challenge. Although understanding of this disease has advanced significantly, the road to fighting it effectively is still long. Early detection and correct diagnosis are the most important success factors in its treatment. Today we can extract data from tumors that give us a better and more detailed understanding of the disease. It is therefore relevant to consider the association between germline mutations and an increased risk of developing this disease, since inherited mutations may prove to be potential biomarkers, enabling better follow-up of patients and their relatives.
This strategy is in line with the concept of precision medicine. By taking into account the genetic variability, environment, and lifestyle of each individual, it is possible to guide clinicians in choosing better clinical management options for each patient. The goal is thus a more efficient and personalized approach to treating complex diseases such as CRC: maximizing treatment efficacy, minimizing the risk of undesirable side effects and, ultimately, reducing costs in the healthcare system. The combination of next-generation sequencing (NGS) technologies with advances in machine learning (ML) techniques has enabled the integration of large amounts of data, increasing the ability to identify relevant genetic variants in diseases such as cancer. In this context, the information produced by these models can be used in genetic diagnosis and clinical practice, offering a more complete view of disease mechanisms and supporting informed medical decisions.
To take advantage of the advances in these two major areas, NGS and ML, and apply them to the study of CRC, regularized logistic regression and clustering models were trained on data from patients of the Instituto Português de Oncologia (IPO) de Lisboa, in order to select variants relevant to the disease and identify new patient groups. Four main steps were followed: data integration, filtering, and variable construction; a preliminary study of homozygous regions; application of ML models; and interpretation and validation of the results.
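As an illustration of how regularized logistic regression can act as a variant selector, the following is a minimal sketch, not the thesis pipeline. It assumes an L1 (lasso) penalty fitted by proximal gradient descent, an additive 0/1/2 allele-dosage encoding, and fully synthetic data; the function names, the penalty strength `lam`, and the step size `lr` are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w towards zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.1, n_iter=800):
    """Fit logistic regression with an L1 penalty by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data (assumption): 6 variants, only variant 0 affects case status.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    dosage = [rng.randint(0, 2) for _ in range(6)]
    p_case = sigmoid(1.5 * dosage[0] - 1.5)
    X.append(dosage)
    y.append(1 if rng.random() < p_case else 0)

weights, intercept = fit_l1_logistic(X, y)
# Variants whose coefficient survived the L1 shrinkage are the "selected" ones.
selected = [j for j, wj in enumerate(weights) if abs(wj) > 1e-8]
```

The L1 penalty drives the coefficients of uninformative variants towards exactly zero, so the indices left in `selected` form a candidate variant list. When no variant carries signal, few or none survive, which is one way low predictive power shows up in practice.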
The detection of homozygous regions gave new insight into the gene locations where this phenomenon occurred, paving the way for a more specialized study on the topic. With the logistic regression models, the heterogeneity within the clinical groups became apparent as a consequence of the models' low predictive power. Clustering techniques identified outliers and patient groups different from those already established by IPO, which may be of clinical interest in the future.
In conclusion, this study highlighted the low predictive power of the available genetic information for separating clinical groups, opening the way to unsupervised ML techniques, which showed potential for discovering new patient groups and outliers. Continued research in the context of this work, combined with the inclusion of more patients, may lead to a more detailed understanding of CRC and more effective management of each clinical case, reinforcing the importance of precision medicine.
Colorectal cancer (CRC) is one of the cancers with the highest incidence and mortality rates, and the number of cases in young patients has increased considerably in recent years, posing a serious public health challenge. Although there have been significant advances in the understanding of this disease, the road to fighting it effectively is still long. Early detection and correct diagnosis are the most important success factors in its treatment. It is therefore relevant to consider the association between germline mutations and an increased risk of developing this disease, since inherited mutations may prove to be biomarkers, allowing better follow-up of patients and their families. This strategy is in line with the concept of precision medicine. By considering the genetic variability, environment, and lifestyle of each individual, it is possible to guide clinicians in choosing the best treatment options for each patient. The objective is to follow a more efficient and personalized approach in the treatment of complex diseases such as CRC: maximizing treatment efficacy, minimizing the risk of undesirable side effects, and reducing costs in the healthcare system. The combination of next-generation sequencing (NGS) technologies with advances in machine learning techniques has allowed the integration of large amounts of data, increasing the ability to identify relevant genetic variants in diseases such as cancer. In this context, the information that results from these models can be used in genetic diagnostics and clinical practice, offering a more complete view of disease mechanisms and enabling informed medical decisions. To apply these advances to CRC, regularized logistic regression and clustering algorithms were applied to CRC patient data in order to select variants and identify new clusters.
To this end, four main steps were followed: data integration, filtering, and variable construction; a preliminary study of homozygous regions; application of machine learning models; and interpretation and validation of the results obtained. The detection of homozygous regions gave new insight into the gene locations where this phenomenon occurred, paving the way for a more focused study in this area. With the regularized logistic regression models, the heterogeneity within the clinical groups became apparent as a consequence of the poor classification results. Using clustering, it was possible to identify outliers and patient groups different from those already established by the Instituto Português de Oncologia (IPO), which may be of clinical interest in the future. In conclusion, this study highlighted the low predictive power of the available genetic information for separating clinical groups, paving the way towards unsupervised machine learning techniques, which have shown potential in discovering new clusters and outliers. Further research in the context of this work, coupled with the integration of more patients, may culminate in a more detailed understanding of CRC and more effective management of each clinical case, reinforcing the importance of precision medicine.
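The preliminary study of homozygous regions can be pictured with a small sketch. The abstract does not state how such regions were detected, so the following is only an assumed, simplified approach: scan position-ordered genotype calls and report maximal runs of consecutive homozygous calls. The function name, the `min_length` threshold, and the toy calls are hypothetical; practical tools also account for physical distance and tolerate occasional heterozygous calls.

```python
def homozygous_runs(genotypes, min_length=3):
    """Return (start, end) index pairs of runs of homozygous calls.

    genotypes: list of (allele_a, allele_b) tuples ordered by position.
    A run is a maximal stretch of consecutive calls with allele_a == allele_b;
    it is reported only if it spans at least min_length variants.
    """
    runs, start = [], None
    for i, (a, b) in enumerate(genotypes):
        if a == b:
            if start is None:
                start = i  # a new homozygous stretch begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None  # a heterozygous call ends the current stretch
    # Close a stretch that extends to the last variant.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes) - 1))
    return runs


# Toy calls for one patient (assumption, not real data).
calls = [("A", "G"), ("C", "C"), ("T", "T"), ("G", "G"),
         ("A", "T"), ("C", "C"), ("G", "G")]
runs = homozygous_runs(calls, min_length=3)
```

With these toy calls, only the stretch at indices 1 to 3 is long enough to be reported; lowering `min_length` to 2 also reports the final two calls.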
Description
Keywords
Colorectal cancer; Machine learning; Germline mutations; Precision medicine; Loss of heterozygosity
