Publications
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Journals and Conferences

8 references, last updated Mon Nov 11 14:27:04 2013

Santos-MPBO14.pdf Nuno Santos, Paulo Flores, and José Monteiro.
"MPBO - A Distributed PBO Solver". Engineering Applications of Artificial Intelligence, The International Journal of Intelligent Real-Time Automation, 2014. (em fase de submissão).
Abstract
Parallel computing has been the subject of intensive research in the last decade and is increasingly being chosen as a solution for developing applications that require high computational power, such as the Boolean Satisfiability (SAT) and Pseudo-Boolean Optimization (PBO) problems. Research in SAT solvers has obtained relevant results in the last years, achieving significant reductions in execution times. Unfortunately, hard instances of SAT require large computational power and even efficient SAT solvers take huge execution times to obtain their solution. Therefore, SAT solvers adaptation to parallel computing systems began to be the subject of considerable research and there already exist several distributed versions of popular SAT solvers. However, the absence of distributed solvers for the PBO problem is notorious. This work intends to contribute and encourage the research into distributed solu- tions to solve the PBO problem. The goal of this work is to propose a distributed Pseudo-Boolean Optimization Solver, named MPBO solver, based on MPI (Message Passing Interface ) and focused on an efficient search space partition, more specifically the partition of the problem optimization search space. The proposed solver achieved significant reductions in the time to solve hard PBO instances, when compared to the MiniSat+ and pwbo solvers.

Keywords: Pseudo-Boolean Optimization (PBO), Boolean Satisfiability (SAT), Message Passing Interface (MPI), Distributed Systems (GRID)

Aksoy-ECCTD13.pdf Levent Aksoy, Eduardo Costa, Paulo Flores, and José Monteiro.
"Exploration of Tradeoffs in the Design of Integer Cosine Transforms for Image Compression". In Proceedings of European Conference on Circuit Theory and Design (ECCTD), September 8-12, 2013.
Abstract
In image data compression, integer cosine transforms (ICTs) have been preferred to discrete cosine transforms (DCTs) due to their similar transform efficiency and lower implementation cost. However, there exist many alternative ICTs with different performance measures and implementation costs. In this work, we explore all possible ICTs, compute their performance measures, and find their implementation costs in terms of the number of adders/subtractors, where a state-of-art technique is used to realize ICTs under a shift-adds architecture. We also investigate the tradeoff between performance and implementation cost, present the pareto-optimal points of this tradeoff, and introduce promising ICTs that were not considered before.

Marques-flairs13.pdf Ricardo Marques, Luis Guerra e Silva, Paulo Flores, and L. Miguel Silveira.
"Improving SAT Solver Efficiency using a Multi-core Approach". In The 26th International Florida Artificial Intelligence Research Society Conference (FLAIRS), pages 94-99, May 2013.
Abstract
Many practical problems in multiple fields can be converted to a SAT problem, or a sequence of SAT problems, such that their solution immediately implies a solution to the original problem. Despite the enormous progress achieved over the last decade in the development of SAT solvers, there is strong demand for higher algorithm efficiency to solve harder and larger problems. The widespread availability of multi-core, shared memory parallel environments provides an opportunity for such improvements. In this paper we present our results on improving the effectiveness of standard SAT solvers on such architectures, through a portfolio approach. Multiple instances of the same basic solver using different heuristic strategies, for search-space exploration and problem analysis, share information and cooperate towards the solution of a given problem. Results from the application of our methodology to known problems from SAT competitions show relevant improvements over the state of the art and yield the promise of further advances.

Martins-rcra12.pdf Ruben Martins, Vasco Manquinho, and Inês Lynce.
"Clause Sharing in Deterministic Parallel Maximum Satisfiability". In Proceedings of the 19th RCRA workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion (RCRA 2012)., June 2012.
Abstract
Multicore processors are becoming the dominant platform in modern days. As a result, parallel Maximum Satisfiability (MaxSAT) solvers have been developed to exploit this new architecture. Sharing learned clauses in parallel MaxSAT is expected to help to further prune the search space and boost the performance of a parallel solver. Yet, so far it has not been made clear which learned clauses should be shared among the different threads. This paper studies the impact of clause sharing heuristics. Evaluating these heuristics can be a hard task in parallel MaxSAT because existing solvers suffer from non-determinism behavior, i.e. several runs of the same solver can lead to different solutions. This is a clear downside for applications that require solving the same problem instance more than once, such as the evaluation of heuristics. For a fair evaluation, this paper presents the first deterministic parallel MaxSAT solver that ensures reproducibility of results. By using a deterministic solver one can independently evaluate the gains coming from the use of different heuristics rather than the non-determinism of the solver. Experimental results show that sharing learned clauses improves the overall performance of parallel MaxSAT solvers. Moreover, the performance of the new deterministic solver is comparable to the corresponding nondeterministic version.

Villena-TCAD12.pdf J.F. Villena and L. Silveira.
"Exploiting Parallelism for Improved Automation of Multidimensional Model Order Reduction". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 31(1):37-49, January 2012. Special Issue PAR-CAD: Parallel CAD Algorithms and CAD for Parallel Architectures/Systems. (doi:10.1109/TCAD.2011.2167510)
Abstract
This paper addresses the issue of automatically generating reduced order models of very large multidimensional systems. To tackle this problem we introduce an efficient parallel projection based model order reduction framework for parameterized linear systems. The underlying methodology is based on an automated multidimensional sample selection procedure that maximizes effectiveness in the generation of the projection basis. The parallel nature of the algorithm is efficiently exploited using both shared and distributed memory architectures. This leads to a highly scalable, automatic, and reliable parallel reduction scheme, able to handle very large systems depending on multiple parameters. In addition, the framework is general enough to provide a good approximation regardless of the model's representation or underlying nature, as will be demonstrated on a variety of benchmark examples. The method provides the potential to tackle, in an automatic fashion, extremely challenging models that would be otherwise difficult to address with existing sequential approaches.

Keywords: linear systems;memory architecture;multidimensional systems;reduced order systems;sampling methods;automated multidimensional sample selection procedure;distributed memory architectures;multidimensional model order reduction;multidimensional sampling method;parallel projection based model order reduction framework;parallel reduction scheme;parameterized linear systems;sequential approaches;very large multidimensional systems;Approximation methods;Automation;Computational modeling;Computer architecture;Sparse matrices;Vectors;Multidimensional sampling;parallel model order reduction;parameterized systems

Martins-AIC12.pdf Ruben Martins, Vasco Manquinho, and Inês Lynce.
"Parallel Search for Maximum Satisfiability". Journal of Artificial Intelligence Communications, 25(2):75-95, January 2012. (doi:10.3233/AIC-2012-0517)
Abstract
The predominance of multicore processors has increased the interest in developing parallel Boolean Satisfiability (SAT) solvers. As a result, more parallel SAT solvers are emerging. Even though parallel approaches are known to boost performance, parallel approaches developed for Boolean optimization are scarce. This paper proposes parallel search algorithms for Maximum Satisfiability (MaxSAT) and introduces PWBO, a new parallel solver for partial MaxSAT. Using two threads, an unsatisfiability-based algorithm is used to search on the lower bound of the optimal solution, while at the same time a linear search is performed on the upper bound of the optimal solution. Moreover, learned clauses are shared between threads during the search. For a larger number of threads two different strategies are proposed. The first strategy performs a portfolio approach by searching on the lower and upper bound values of the optimal solution using different encodings of cardinality constraints for each thread. The second strategy splits the search space considering different upper bound values of the optimal solution for each thread. Experimental results show that the techniques proposed in the paper enable PWBO to improve when increasing the number of threads

Keywords: Parallel search, maximum satisfiability, cardinality constraints

Villena-iccad10.pdf J.F. Villena and L.M. Silveira.
"3POR - Parallel projection based parameterized order reduction for multi-dimensional linear models". In Proceedings of IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 536-542, November 7-11, 2010. (doi:10.1109/ICCAD.2010.5653851)
Abstract
This paper introduces a distributed and shared memory parallel projection based model order reduction framework for parameterized linear systems. The proposed methodology is based on a sampling scheme followed by a projection to build the reduced model. It exploits the parallel nature of the sampling methods to improve the efficiency of the basis generation. The sample selection scheme uses the residue as a proxy for the model error in order to improve automation and maximize the effectiveness of the sampling step. This yields an automatic and reliable methodology, able to handle large systems depending on the frequency and multiple parameters. The framework can be used in shared and distributed memory architectures separately or in conjunction. It is able to deal with different system representations and models of different characteristics, is highly scalable and the parallelization is very effective, as will be demonstrated on a variety of industrial benchmarks, with super linear speed-ups in certain cases. The methodology provides the potential to tackle large and complex models, depending on multiple parameters in an automatic fashion.

Keywords: reduced order systems;sampling methods;3POr;distributed parallel projection;multidimensional linear models;parallel projection based parameterized order reduction;sample selection scheme;sampling methods;shared memory parallel projection;Arrays;Indexes;Memory management;Sparse matrices;Transfer functions;Vectors

Martins-ictai10.pdf R. Martins, V. Manquinho, and I. Lynce.
"Improving Search Space Splitting for Parallel SAT Solving". In Proceedings of the IEEE International Conference on Tools with Artificial Intelligence (ICTAI), volume 1, pages 336-343, October 27-29, 2010. (Realizado no ambito do projecto, mas financiado por outros projectos, artigo não contabilizado). (doi:10.1109/ICTAI.2010.56)
Abstract
The last two decades progresses have led Propositional Satisfiability (SAT) to be a competitive practical approach to solve a wide range of industrial and academic problems. Thanks to these advances, the size and difficulty of the SAT instances have grown significantly. The demand for more computational power led to the creation of new computer architectures and paradigms composed by multiple machines connected by a network to act as one machine, like clusters and grids. However, extra computing power is not coming anymore from higher processor frequencies, but rather from a growing number of computing cores and processors. It becomes clear that exploiting this new architecture is essential for the evolution of SAT solvers. Search space splitting is probably the most commonly used strategy to explore the parallelism provided by the search space. However, it is not clear how to find the relevant set of variables to divide the search space. This paper extends a method based on the VSIDS heuristic to find the initial set of partition variables. A drawback of search space splitting is load balancing. To overcome this problem, we propose the use of a hybrid approach between search space splitting and portfolio. Preliminary results show that both these techniques improve the performance of the solver and reveal that combining search space splitting and portfolio approaches can lead to better results.

Keywords: computability;search problems;VSIDS heuristic;extra computing power;parallel SAT solving;partition variable;portfolio;propositional satisfiability;search space splitting;Heuristic algorithms;Instruction sets;Load management;Master-slave;Multicore processing;Portfolios;Switches

Thesis

6 references, last updated Mon Nov 25 16:08:06 2013

Costa-MSc14.pdf Carlos Filipe da Silva Portinha e Costa.
"Parallelization of SAT Algorithms on GPUs". Master's thesis, Instituto Superior Técnico, University of Lisbon (ULisboa), 2014. (em revisão final).
Marques-MSc13.pdf Ricardo de Sousa Marques.
"Parallel SAT Solver". Master's thesis, Instituto Superior Técnico, University of Lisbon (ULisboa), December 2013.
Abstract
Many practical problems in multiple fields can be converted to a Boolean Satisfiability (SAT) problem, or a sequence of SAT problems, such that their solution immediately implies a solution to the original problem. Despite the enormous progress achieved over the last decade in the development of SAT solvers, there is strong demand for higher algorithm efficiency to solve harder and larger problems. However, algorithmic developments has slowed down somewhat, which is at odds with such a trend. Fortunately, the widespread availability of multi-core, shared memory parallel environments provides a renewed opportunity for such improvements. In this dissertation we present our results on improving the effectiveness of standard SAT solvers on such architectures, through two different parallel approaches. In the first approach, multiple instances of the same basic solver using different heuristic strategies, compete to find a solution, each performing a different search-space exploration and problem analysis, but still sharing information and cooperating towards the solution of a given problem. In the second approach, we partition the problem in several sub-problems, using clustering techniques. Then, multiple instances cooperatively solve these smaller subsets of the original problem to find a solution for the whole problem. Results from the application of our methodology to known problems from SAT competitions show improvements over the state of the art and yield the promise of further advances.

Keywords: Boolean Satisfiability (SAT), Parallel Search, Multi-core, Clause Sharing, Portfolio Solver, Clustering Algorithm

Santos-MSc13.pdf Nuno Miguel Coelho Santos.
"MPBO: A Distributed Pseudo-Boolean Optimization Solver". Master's thesis, Instituto Superior Técnico, Technical University of Lisbon (UTL), May 2013. (PDF)
Abstract
Parallel computing has been the subject of intensive research in the last decade and is increasingly being chosen as a solution for developing applications that require high computational power, such as the Boolean Satisfiability (SAT) and Pseudo-Boolean Optimization (PBO) problems. Research in SAT solvers has obtained relevant results in the last years, achieving significant reductions in execution times. Unfortunately, hard instances of boolean satisfiability require large computational power and even efficient SAT solvers take huge execution times to obtain their solution. Therefore, SAT solvers adaptation to parallel computing systems began to be the subject of considerable research and there already exist several distributed versions of popular SAT solvers. However, the absence of distributed solvers for the Pseudo-Boolean Optimization problem is notorious. This work intends to contribute and encourage the research into distributed solutions to solve the PBO problem. The goal of this work is to propose a distributed Pseudo-Boolean Optimization Solver, named MPBO solver, based on MPI (Message Passing Interface) and focused on an efficient search space partition, more specifically the partition of the problem optimization search space. The proposed solver achieved significant reductions in the time to solve hard PBO instances, when compared to the MiniSat+ and pwbo solvers.

Keywords: Pseudo-Boolean Optimization (PBO), Boolean Satisfiability (SAT), Message Passing Interface (MPI), Distributed Systems (GRID)

Dias-MSc12.pdf Alexandre Nuno Vicente Dias.
"Detecting Computer Viruses using GPUs". Master's thesis, Instituto Superior Técnico, Technical University of Lisbon (UTL), November 2012. (PDF)
Abstract
Anti-virus software is the main defense mechanism against malware, which is becoming more common and advanced. A significant part of the virus scanning process is dedicated to scanning a given file against a set of virus signatures. As it is important that the overall scanning process be as fast as possible, efforts must be done to minimize the time spent in signature matching. Recently, graphics processing units have increased in popularity in high performance computation, due to their inherently parallel architecture. One of their possible applications is performing matching of multiple signatures in parallel. In this work, we present details on the implemented multiple string searching algorithm based on deterministic finite automata which runs on a graphics processing unit. Due to space concerns inherent to DFAs our algorithm only scans for a substring of each signature, thereby serving as a high-speed pre-filtering mechanism. Multiple optimizations were implemented in order to increase its performance. In our experiments with sets of test files, the implemented solution was found to have a speedup of around 28 when compared to the pattern matching portion of ClamAV, an open-source anti-virus engine. On other sets of test files with different characteristics the solution does not have such a good performance, but future work is described to improve it in these situations.

Keywords: Network security, Virus, String matching, GPGPU Computing, CUDA

Barata-MSc11.pdf Fábio André dos Santos Barata.
"LPGAS - Large Power Grid Analysis and Simulation". Master's thesis, Instituto Superior Técnico, Technical University of Lisbon (UTL), November 2011. (PDF)
Abstract
Análise de power grids tem vindo a emergir ao longo do tempo como um problema de circuitos que atinge os limites dos recursos computacionais, processamento e memória. Os VLSI já excedem cem milhões de nós e qualquer análise e verificação de power grids tende a ficar enorme. Os recursos disponíveis são já muito curtos para guardar os muitos gigabytes de dados envolvidos no processo. Então, estratégias paralelizáveis e eficientes, compatíveis com distribuição de dados, são necessárias para atacar este problema. Neste estudo, soluções de análise de power grids serão examinadas, desenvolvidas e aplicadas tanto a circuitos benchmark como a power grids criadas artificialmente. Algumas soluções falham em atingir bons resultados em algumas características importantes como paralelismo, flexibilidade na distribuição de dados, baixas taxas de comunicação e escalabilidade no resultado, enquanto que outras atingem até ao momento. Porém, uma solução paralela requer particionamento das power grids. Dado que uma power grid pode ser vista como um grafo, a sua divisão é obtida através de técnicas de particionamento. As técnicas mais populares serão aplicadas, sendo algumas dependentes do acesso a informação geométrica e outras não. O algoritmo de Gradiente Conjugado com pré-condicionamento de Bloco-Jacobi, uma solução nunca antes aplicada a power grids, mostra os melhores resultados entre as várias estratégias analisadas, demorando 5 horas a analisar uma power grid de 7.9 milhões de nós com 300 fontes de corrente num aglomerado de 10 octocores. A memória projetada está de acordo com o que existe hoje mesmo, o que é um bom sinal.

Keywords: VLSI, Electromigração, Análise de power grids, Particionamento, Sistemas de equações lineares, Computação paralela e distribuida

Pereira-MSc11.pdf Vando Miguel Bonifácio Pereira.
"Simulador de Eventos Discretos sobre GPUs". Master's thesis, Instituto Superior Técnico, Technical University of Lisbon (UTL), May 2011. (PDF)
Abstract
Numa era em que os circuitos integrados são constituídos por muitos milhões de transístores cuidadosamente projectados de forma a permitir a execução de tarefas bastante complexas, o funcionamento e desempenho destes sistemas precisa de ser arduamente testado e simulado de forma a evitar problemas futuros. A verificação e validação feita através de simulação lógica torna-se assim crucial. No entanto, a dimensão deste problema leva hoje a que estas simulações sejam bastante demoradas e esse cenário tende a piorar à medida que mais transístores vão sendo colocados no mesmo chip. Com este cenário, é hoje crucial estudar novas formas e plataformas para executar estas simulações. Por outro lado e movido pela procura incessante por gráficos de alta-definição para o mercado dos jogos de PC e consolas, surgiram plataformas de computação massivamente paralela, denominadas GPUs (Graphics Processing Units). Para a realização deste trabalho, foi desenvolvido uma plataforma de simulação orientada a eventos em todas as suas fases. De forma a tirar partido da arquitectura massivamente paralela do GPU, foi necessário implementar um pré-processador que particiona o modelo e o adequa à natureza do GPU. No final, os testes efectuados ao trabalho realizado, mostram resultados promissores em termos de desempenho, quando comparado com o mesmo teste efectuado num simulador sequencial típico no CPU.

Keywords: Simulação de Eventos Discretos, Simulação ao Nível das portas lógicas, Simulação de alto desempenho, Unidade de processamento gráfico

Technical Reports and Others Contributions

6 references, last updated Fri Nov 22 18:38:37 2013

Marques-SATcompetiton13.pdf Ricardo Marques, Luis Guerra e Silva, Paulo Flores, and L. Miguel Silveira.
"pmcSAT". Participation in the SAT Competition 2013. Bronze Medal on the competition track: Core Solvers, Parallel, Hard-combinatorial SAT+UNSAT, July 2013.
Costa-TR2013Feb.pdf Carlos Costa.
"Parallelization of SAT Algorithms on GPUs". Technical report, INESC-ID / Instituto Superior Técnico, Technical University of Lisbon (UTL), February 2013.
Marques-TR2013Jan.pdf Ricardo de Sousa Marques.
"SAT Solver Paralelo". Technical report, INESC-ID / Instituto Superior Técnico, Technical University of Lisbon (UTL), January 2013.
Marques-INESCidTR2012-28.pdf Ricardo Sousa Marques, Luis Guerra e Silva, Paulo Flores, and L. Miguel Silveira.
"Improving SAT Solver Efficiency using a Cooperative Multicore Approach". Technical report, INESC-ID Tec. Rep. 28/2012, October 2012.
Marques-INESCidTR2012-12.pdf Ricardo Sousa Marques, Luis Guerra e Silva, Paulo Flores, and L. Miguel Silveira.
"cmcSAT - A Cooperative MultiCore SAT Solver". Technical report, INESC-ID Tec. Rep. 20/2012, July 2012.
Santos-TR2012Jan.pdf Nuno Santos.
"A Distributed Pseudo-Boolean Optimization Solver". Technical report, INESC-ID / Instituto Superior Técnico, Technical University of Lisbon (UTL), January 2012.