Research project
Untitled
Funder
Authors
Publications
A computational approach to the art of visual storytelling
Publication . Marcelino, Gonçalo Barreto Ferreira; Magalhães, João
For millennia, humanity has been using images to tell stories. In modern society, these
visual narratives take center stage in many different contexts, from illustrated children's
books to news media and comic books. They leverage the power of compounding
various images in sequence to present compelling and informative narratives in an immediate
and impactful manner. To create them, many criteria are taken into account,
from the quality of the individual images to how they synergize with one another.
With the rise of the Internet, visual content with which to create these visual storylines
is now abundant. In areas such as news media, where visual storylines are regularly
used to depict news stories, this has both advantages and disadvantages. Although content
may be available online to create a visual storyline, filtering the massive amounts
of existing images for high-quality, relevant ones is a hard and time-consuming task. Furthermore,
combining these images into visually and semantically cohesive narratives is a
highly skillful process and one that takes time.
As a first step towards solving this problem, this thesis brings state-of-the-art computational
methodologies to the age-old tradition of creating visual storylines. Leveraging
these methodologies, we define a three-part architecture to help with the creation of visual
storylines in the context of news media, using social media content. To ensure the
quality of the storylines from a human-perception point of view, we deploy methods for
filtering and ranking images according to news quality standards, we resort to multimedia
retrieval techniques to find relevant content, and we propose a machine-learning-based
approach to organize visual content into cohesive and appealing visual narratives.
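The three-part architecture described above (quality filtering and ranking, relevance-based retrieval, narrative organization) can be sketched as a simple pipeline. This is a minimal illustration, not the thesis's actual system: the `quality` and `relevance` scores and the greedy final ordering are assumptions standing in for the learned models the abstract refers to.

```python
from dataclasses import dataclass

@dataclass
class Image:
    url: str
    quality: float      # hypothetical news-quality score in [0, 1]
    relevance: float    # hypothetical relevance to the news story in [0, 1]

def build_storyline(candidates, quality_threshold=0.5, length=5):
    """Sketch of the three-part architecture: filter, retrieve, organise."""
    # 1. Filter images according to news quality standards.
    filtered = [img for img in candidates if img.quality >= quality_threshold]
    # 2. Retrieve the most relevant content for the story.
    relevant = sorted(filtered, key=lambda img: img.relevance, reverse=True)[:length]
    # 3. Organise the selection into a sequence (here: a naive greedy order
    #    by combined score; the thesis proposes a learned model instead).
    return sorted(relevant, key=lambda img: img.quality + img.relevance, reverse=True)
```

In a real system each stage would be backed by trained models rather than precomputed scores, but the staged structure stays the same.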
Augmenting Translation Lexica by Learning Generalised Translation Patterns
Publication . Mahesh, Kavitha Karimbi; Lopes, José
Bilingual lexicons improve the quality of parallel corpora alignment, of newly extracted
translation pairs, of machine translation, and of cross-language information retrieval, among
other applications. In this regard, the first problem addressed in this thesis pertains to
the classification of translations automatically extracted from parallel corpora, that is, collections
of sentence pairs that are translations of each other. The second problem concerns
machine learning of bilingual morphology, with applications in the solution of the first
problem and in the generation of out-of-vocabulary translations.
With respect to the problem of translation classification, two separate classifiers for
handling multi-word and word-to-word translations are trained, using previously extracted
translation pairs manually classified as correct or incorrect. Several insights
are useful for distinguishing adequate multi-word candidates from inadequate ones,
such as the lack or presence of parallelism, spurious terms at translation ends
(such as determiners and coordinating conjunctions), properties such as orthographic similarity
between translations, and the occurrence and co-occurrence frequencies of the translation
pairs. Morphological coverage, reflecting stem and suffix agreement, is explored as a key
feature in classifying word-to-word translations. Given that the evaluation of extracted
translation equivalents depends heavily on the human evaluator, incorporating an
automated filter for appropriate and inappropriate translation pairs prior to human evaluation
reduces this work tremendously, thereby saving the time involved
and progressively improving alignment and extraction quality. It can also be applied
to filter the translation tables used for training machine translation engines and to
detect bad translation choices made by translation engines, thus enabling significant
productivity gains in the post-editing of machine-made translations.
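The classification features listed above can be made concrete with a small feature-extraction sketch. The function, the feature names, and the tiny determiner list are illustrative assumptions, not the thesis's actual feature set; orthographic similarity is approximated here with Python's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Illustrative determiner list (English and Portuguese); a real system
# would use proper per-language stopword resources.
DETERMINERS = {"the", "a", "an", "o", "os", "as"}

def pair_features(src, tgt, cooccurrence=0):
    """Hypothetical feature vector for one candidate translation pair."""
    src_toks, tgt_toks = src.lower().split(), tgt.lower().split()
    return {
        # orthographic similarity between the two translations
        "ortho_sim": SequenceMatcher(None, src.lower(), tgt.lower()).ratio(),
        # parallelism proxy: do both sides have the same number of tokens?
        "same_length": len(src_toks) == len(tgt_toks),
        # spurious determiner dangling at a translation end
        "edge_determiner": src_toks[0] in DETERMINERS or tgt_toks[-1] in DETERMINERS,
        # co-occurrence frequency of the pair in the corpus
        "cooccurrence": cooccurrence,
    }
```

A classifier for multi-word candidates would then be trained on vectors like these, with the manually classified pairs supplying the correct/incorrect labels.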
An important attribute of a translation lexicon is the coverage it provides. Learning
suffixes and suffixation operations from the lexicon or corpus of a language is an extensively
researched approach to tackling out-of-vocabulary terms. However, beyond mere words
or word forms are the translations and their variants, a powerful source of information
for automatic structural analysis, which is explored here from the perspective of improving
word-to-word translation coverage and constitutes the second part of this thesis. In this
context, as a phase prior to the suggestion of out-of-vocabulary bilingual lexicon entries,
an approach is proposed to automatically induce segmentation and learn bilingual morph-like
units by identifying and pairing word stems and suffixes, using a bilingual
corpus of translations automatically extracted from aligned parallel corpora and either manually
validated or automatically classified. A minimally supervised technique is proposed to enable
bilingual morphology learning for language pairs whose bilingual lexicons are highly
defective in word-to-word translations representing inflection diversity.
Apart from the above-mentioned applications in the classification of machine-extracted
translations and in the generation of out-of-vocabulary translations, the learned bilingual
morph-units may also have a great impact on establishing correspondences between
sub-word constituents in word-to-multi-word and multi-word-to-multi-word
translations, and in compression, full-text indexing, and retrieval applications.
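The stem/suffix induction step can be illustrated with a deliberately simple heuristic: take the longest common prefix of a cluster of related word forms as the stem and the remainders as suffixes, then pair the induced splits across the two languages. This is only a sketch of the idea; the thesis's actual segmentation and pairing method is more sophisticated than a longest-common-prefix rule.

```python
import os

def segment(words):
    """Induce a stem/suffix split from a set of related word forms by
    taking their longest common prefix as the stem (illustrative heuristic)."""
    forms = list(words)
    stem = os.path.commonprefix(forms)
    return stem, sorted(w[len(stem):] for w in forms)

def pair_morphs(src_forms, tgt_forms):
    """Pair the induced (stem, suffixes) of the two sides of a cluster of
    word-to-word translations, yielding bilingual morph-like units."""
    return segment(src_forms), segment(tgt_forms)
```

For example, pairing English `{"sing", "singing"}` with Portuguese `{"canta", "cantando"}` would align the stems `sing`/`cant` and their respective suffix sets, which is the kind of unit that can then generate translations for unseen inflected forms.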
Towards more Secure and Efficient Password Databases
Publication . Madeira, Miguel Afonso; Ferreira, Bernardo; Leitão, João
Password databases form one of the backbones of today's web applications.
Every web application needs to store its users' credentials (email and password)
efficiently, and in popular applications (Google, Facebook, Twitter, etc.) these
databases can grow to store millions of user credentials simultaneously. However,
despite their critical nature and susceptibility to targeted attacks, the techniques
used for securing password databases are still very rudimentary, opening the way to
devastating attacks. In 2016 alone, and as far as publicly disclosed, more than
500 million passwords were stolen in Internet hacking attacks.
To address this problem we study several schemes, such as property-preserving
encryption (e.g. deterministic encryption), encrypted data structures that
support operations (e.g. searchable encryption), partially homomorphic encryption
schemes, and commodity trusted hardware (e.g. TPM and Intel SGX).
In this thesis we propose to survey the most efficient and secure techniques
for password database management systems that exist today and to recreate
them behind a new, simple, universal API.
We also propose SSPM (Simple Secure Password Management), a new password
database scheme that simultaneously improves the efficiency and security of current
solutions in the literature. SSPM is based on Searchable Symmetric Encryption
techniques, more specifically ciphered data structures, that allow efficient queries
with minimal leakage of access patterns. SSPM adapts these structures to the
operations required by password database schemes while preserving their security
guarantees.
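The ciphered data structures mentioned above can be illustrated with a toy encrypted index in the spirit of searchable symmetric encryption: the server-side table stores only PRF-derived labels and opaque ciphertexts, so a lookup reveals neither the email nor the password record. This sketch is not SSPM itself (in particular, it does not hide access patterns, which SSPM additionally addresses); the class and method names are invented for illustration.

```python
import hmac
import hashlib
import secrets

def prf(key, msg):
    """Keyed pseudorandom function (HMAC-SHA-256) used to derive index labels."""
    return hmac.new(key, msg.encode(), hashlib.sha256).hexdigest()

class EncryptedIndex:
    """Toy SSE-style index: the table maps PRF labels to ciphertext records,
    so the storage server never sees plaintext emails or passwords."""

    def __init__(self):
        self.key = secrets.token_bytes(32)  # client-held secret key
        self.table = {}                     # label -> encrypted credential record

    def insert(self, email, encrypted_record):
        # the record is assumed to be already encrypted by the client
        self.table[prf(self.key, email)] = encrypted_record

    def lookup(self, email):
        # one keyed lookup; the server observes only the opaque label
        return self.table.get(prf(self.key, email))
```

A login check would decrypt the returned record client-side (or inside trusted hardware) and verify the password there, keeping the server out of the trust boundary.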
Furthermore, SSPM explores the use of trusted hardware to minimize the revelation
of access patterns during the execution of operations and to protect the storage
of cryptographic keys. Experimental results with real password databases show
that SSPM has performance similar to the solutions used in industry today,
while simultaneously increasing the security guarantees offered.
Deriving architectural models from requirements specifications: A systematic mapping study
Publication . Souza, Eric; Moreira, Ana; Goulão, Miguel; NOVALincs; DI - Departamento de Informática; Elsevier Science Publisher B.V.
Context: Software architecture design creates and documents the high-level structure of a software system. Such structure, expressed in architectural models, comprises software elements, relations among them, and properties of those elements and relations. Existing software architecture methods offer ways to derive architectural models from requirements specifications. These models must balance different forces that should be analyzed during the derivation process, such as those imposed by different application domains and quality attributes. Such balance is difficult to achieve, requiring skilled and experienced architects.
Objective: The purpose of this paper is to provide a comprehensive overview of the existing methods for deriving architectural models from requirements specifications and to offer a research roadmap that challenges the community to address the identified limitations and open issues requiring further investigation.
Method: To achieve this goal, we performed a systematic mapping study following good practices from the Evidence-Based Software Engineering field.
Results: This study resulted in 39 primary studies selected for analysis and data extraction, out of the 2575 initially retrieved.
Conclusion: The major findings indicate that current architectural derivation methods rely heavily on architects' tacit knowledge (experience and intuition), do not offer sufficient support for inexperienced architects, and lack explicit evaluation mechanisms. These and other findings are synthesized in a research roadmap whose results would benefit researchers and practitioners.
Secure Abstractions for Trusted Cloud Computation
Publication . Tavares, Joana da Silva; Ferreira, Bernardo; Preguiça, Nuno
Cloud computing is adopted by most organizations due to its characteristics, namely
offering on-demand resources and services that can be provisioned quickly, with minimal
management effort and maintenance expenses for its users. However, it still suffers from
security incidents, which have led to many data security concerns and reluctance towards
further adoption. In the wake of these incidents, cryptographic technologies such
as homomorphic and searchable encryption schemes were leveraged to provide solutions
that mitigate data security concerns.
The goal of this thesis is to provide a set of secure abstractions to serve as a tool for
programmers to develop their own distributed applications. Furthermore, these abstractions
can also be used to support trusted cloud computations in the context of NoSQL
data stores. For this purpose we leveraged conflict-free replicated data types (CRDTs),
as they provide a mechanism that ensures data consistency under replication without
the need for synchronization, which aligns well with the distributed and replicated nature of the
cloud, together with the aforementioned cryptographic technologies to comply with the security
requirements. The main challenge of this thesis consisted in combining the cryptographic
technologies with the CRDTs in such a way that all of the data
structures' functionalities could be supported over ciphertext while striving to attain the best
possible security and performance.
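The combination of CRDTs with encryption can be illustrated with one of the simplest CRDTs, a last-writer-wins register, holding an opaque ciphertext payload: the merge rule compares only (timestamp, writer id), so replicas converge without ever decrypting the value. This is a generic sketch of the idea, not one of the thesis's actual abstractions, and the class and method names are invented for illustration.

```python
class SecureLWWRegister:
    """Last-writer-wins register CRDT over an opaque ciphertext value.
    Convergence relies only on metadata, never on the plaintext."""

    def __init__(self):
        # (timestamp, writer_id, ciphertext); the writer id breaks timestamp ties
        self.state = (0, "", None)

    def assign(self, ciphertext, timestamp, writer_id):
        candidate = (timestamp, writer_id, ciphertext)
        # keep the write only if it is newer than the current state
        if candidate[:2] > self.state[:2]:
            self.state = candidate

    def merge(self, other):
        # commutative, associative, idempotent: take the larger metadata pair
        if other.state[:2] > self.state[:2]:
            self.state = other.state

    def value(self):
        return self.state[2]  # still ciphertext; decryption happens client-side
```

Richer structures (sets, maps, counters) need operations over the payload itself, which is where homomorphic and searchable encryption come in and where the security/performance trade-off discussed above arises.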
To evaluate our abstractions we conducted an experiment comparing the performance
of each secure abstraction with that of its non-secure counterpart. Additionally, we also
analysed the security level provided by each of the structures in light of the cryptographic
scheme used to support it. The results of our experiment show that our abstractions
provide the intended data security with an acceptable performance overhead, showing
that they have the potential to be used to build solutions for trusted cloud computation.
Organizational units
Description
Keywords
Contributors
Funders
Funding entity
Fundação para a Ciência e a Tecnologia
Funding programme
5876
Grant number
UID/CEC/04516/2013
