Arquitectura y Computación Paralela

Juan Carlos Fernandez Fernández

Publications in collaboration with Juan Carlos Fernandez Fernández (25)

2015

Efficient hardware-supported synchronization mechanisms for manycores
Handbook on Data Centers (Springer New York), pp. 753-803

2014

Selective dynamic serialization for reducing energy consumption in hardware transactional memory systems
Journal of Supercomputing, Vol. 68, No. 2, pp. 914-934

2013

Deploying hardware locks to improve performance and energy efficiency of hardware transactional memory
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Design of an efficient communication infrastructure for highly contended locks in many-core CMPs
Journal of Parallel and Distributed Computing, Vol. 73, No. 7, pp. 972-985
ECONO: Express coherence notifications for efficient cache coherency in many-core CMPs
Proceedings - 2013 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2013
Efficient Dir0B cache coherency for many-core CMPs
Procedia Computer Science
On the design of energy-efficient hardware transactional memory systems
Concurrency and Computation: Practice and Experience

2012

CUDA and OpenCL implementations of 3D Fast Wavelet Transform
2012 IEEE 3rd Latin American Symposium on Circuits and Systems, LASCAS 2012 - Conference Proceedings
Design of a collective communication infrastructure for barrier synchronization in cluster-based nanoscale MPSoCs
Proceedings - Design, Automation and Test in Europe, DATE
Dynamic Serialization: Improving energy consumption in eager-eager hardware transactional memory systems
Proceedings - 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2012
Efficient hardware barrier synchronization in many-core CMPs
IEEE Transactions on Parallel and Distributed Systems, Vol. 23, No. 8, pp. 1453-1466
Stencil computations on heterogeneous platforms for the Jacobi method: GPUs versus Cell BE
Journal of Supercomputing, Vol. 62, No. 2, pp. 787-803
The 2D wavelet transform on emerging architectures: GPUs and multicores
Journal of Real-Time Image Processing, Vol. 7, No. 3, pp. 145-152

2011

GLocks: Efficient support for highly-contended locks in many-core CMPs
Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011

2010

A G-line-based network for fast and efficient barrier synchronization in many-core CMPs
Proceedings of the International Conference on Parallel Processing
Characterizing energy consumption in hardware transactional memory systems
Proceedings - 22nd International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2010
Characterizing the basic synchronization and communication operations in Dual Cell-based Blades through CellStats
Journal of Supercomputing, Vol. 53, No. 2, pp. 247-268
Efficient and scalable barrier synchronization for many-core CMPs
CF 2010 - Proceedings of the 2010 Computing Frontiers Conference
Parallel 3D fast wavelet transform on manycore GPUs and multicore CPUs
Procedia Computer Science

2009

A parallel implementation of the 2D wavelet transform using CUDA
Proceedings of the 17th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2009