Towards an algorithmic skeleton framework for programming the Intel® Xeon Phi™ processor

Marques, Hélder de Almeida

http://hdl.handle.net/10362/14394

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
Marques_2014.pdf		2.18 MB	Adobe PDF

Authors

Marques, Hélder de Almeida

Advisor(s)

Paulino, Hervé

Abstract(s)

The Intel® Xeon Phi™ is the first processor based on Intel’s MIC (Many Integrated Cores) architecture. It is a co-processor specially tailored for data-parallel computations, whose basic architectural design is similar to that of GPUs (Graphics Processing Units): it relies on many integrated cores of low individual computational power to perform parallel computations. The main novelty of the MIC architecture, relative to GPUs, is its compatibility with the Intel x86 architecture. This enables the use of many of the tools commonly available for the parallel programming of x86-based architectures, which may lead to a gentler learning curve. However, programming the Xeon Phi still entails aspects intrinsic to accelerator-based computing in general, and to the MIC architecture in particular. In this thesis we advocate the use of algorithmic skeletons for programming the Xeon Phi. Algorithmic skeletons abstract the complexity inherent to parallel programming, hiding details such as resource management, parallel decomposition, and communication between execution flows, thus removing these concerns from the programmer’s mind. In this context, the goal of the thesis is to lay the foundations for the development of a simple but powerful and efficient skeleton framework for programming the Xeon Phi processor. For this purpose we build upon Marrow, an existing framework for the orchestration of OpenCL™ computations in multi-GPU and CPU environments. We extend Marrow to execute both OpenCL and C++ parallel computations on the Xeon Phi. To evaluate the newly developed framework, we use several well-known benchmarks, such as Saxpy and N-Body, not only to compare its performance with that of the existing framework when executing on the co-processor, but also to assess the performance on the Xeon Phi versus a multi-GPU environment.
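
As a rough illustration of what the abstract means by a skeleton hiding resource management and parallel decomposition, the sketch below expresses Saxpy through a generic "map" skeleton in standard C++. It is not Marrow's API: `map_skeleton`, its `workers` parameter, and the `saxpy` wrapper are hypothetical names chosen for this example, and plain `std::thread` on the host stands in for whatever execution platform a real framework would target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical "map" skeleton (an assumption for illustration, not Marrow's
// API): applies kernel(i) to every index in [0, n). Thread creation and the
// partitioning of the index range are hidden from the caller.
template <typename Kernel>
void map_skeleton(std::size_t n, Kernel kernel, unsigned workers = 4) {
    workers = std::max(1u, workers);
    const std::size_t chunk = (n + workers - 1) / workers;  // ceil(n / workers)
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no work left for the remaining workers
        pool.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i) kernel(i);
        });
    }
    for (auto& t : pool) t.join();  // wait for every partition to finish
}

// Saxpy (y = a * x + y): the programmer writes only the per-element kernel.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xs = x.data();
    float* ys = y.data();
    map_skeleton(y.size(), [=](std::size_t i) { ys[i] = a * xs[i] + ys[i]; });
}
```

The caller of `saxpy` never touches threads or index ranges; in a skeleton framework that separation is what would allow the same kernel to be retargeted to a different device without changing user code.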

Keywords

Many integrated cores architectures; Intel® Xeon Phi™; Parallel programming; Algorithmic skeletons

Collections

FCT: DI - Master's Dissertations
