Clustering of protein structures

Arguelles, Pedro Miguel Franco

http://hdl.handle.net/10362/160387

Use this identifier to reference this record.

Name:	Description:	Size:	Format:
Arguelles_2021.pdf		2.06 MB	Adobe PDF

Authors

Arguelles, Pedro Miguel Franco

Advisor(s)

Krippahl, Ludwig

Abstract(s)

Proteins are very complex and important molecules that carry out a wide range of functions essential to life. The role of any given protein within a cell is heavily determined by its structure, which is recognized as a valuable source of information when studying proteins. In applications such as protein docking or phylogenetics, there is a need to compare such structures in order to obtain relevant information about the proteins being considered. As such, there is also a need to specify some kind of measure that is able to indicate whether two protein structures are similar or not. Currently, there are a few different measures that can be used for this task; however, due to the inherent complexity of protein structures, it is very hard for a measure to take into account the numerous possible variations and perfectly quantify the dissimilarity among them. Considering this issue, in this work we aim to use clustering algorithms and experiment with different structure similarity measures, in an attempt to find effective ways of grouping protein structures, with the goal of obtaining useful information that can be used in the previously mentioned applications.

Proteins are highly important and complex molecules that perform a wide variety of functions essential to life. The role of a given protein within a cell is strongly influenced by its structure, which is recognized as a valuable source of information in the study of proteins. In applications such as protein docking or phylogenetic analysis, there is a need to compare these structures in order to obtain relevant information about the proteins under consideration. As such, there is also a need to specify a measure capable of indicating whether or not two protein structures are similar. Currently, there are different measures that can be used for this task; however, due to the inherent complexity of protein structures, it is very difficult for a measure to account for the numerous possible variations and perfectly quantify the differences between them. With these problems in mind, in this work we use clustering algorithms and experiment with different similarity measures, in an attempt to find effective ways of grouping protein structures, with the aim of obtaining useful information that can be used in the applications mentioned above.

Keywords

Proteins; Machine learning; Unsupervised learning; Clustering; Protein structures; Structural similarity measures

Collections

FCT: DI - Master's Dissertations
