Architecture and Parallel Computing

AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators
CGO 2024 - Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization
NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator
Proceedings - International Symposium on Computer Architecture
Scalability Limitations of Processing-in-Memory using Real System Evaluations
Performance Evaluation Review, Vol. 52, No. 1, pp. 63-64
Scalability Limitations of Processing-in-Memory using Real System Evaluations
SIGMETRICS/PERFORMANCE 2024 - Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems
Scalability limitations of processing-in-memory using real system evaluations
Proceedings of the ACM on Measurement and Analysis of Computing Systems, Vol. 8, No. 1

Accelerating Polynomial Multiplication for Homomorphic Encryption on GPUs
Proceedings - 2022 IEEE International Symposium on Secure and Private Execution Environment Design, SEED 2022
NaviSim: A Highly Accurate GPU Simulator for AMD RDNA GPUs
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

GNNMark: A Benchmark Suite to Characterize Graph Neural Network Training on GPUs
Proceedings - 2021 IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2021
Spartan: A Sparsity-Adaptive Framework to Accelerate Deep Neural Network Training on GPUs
IEEE Transactions on Parallel and Distributed Systems, Vol. 32, No. 10, pp. 2448-2463

Design space exploration of accelerators and end-to-end DNN evaluation with TFLITE-SOC
Proceedings - Symposium on Computer Architecture and High Performance Computing
Griffin: Hardware-software support for efficient page migration in multi-GPU systems
Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
Valkyrie: Leveraging inter-TLB locality to enhance GPU performance
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

Profiling DNN Workloads on a Volta-based DGX-1 System
2018 IEEE International Symposium on Workload Characterization, IISWC 2018

Asymmetric NoC architectures for GPU systems
Proceedings - 2015 9th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2015
Leveraging silicon-photonic NoC for designing scalable GPUs
Proceedings of the International Conference on Supercomputing

Northeastern University