Josue Feliu Perez
Publications (29)
2024
-
SYNPA: SMT Performance Analysis and Allocation of Threads to Cores in ARM Processors
Proceedings - 2024 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024
2023
-
CELLO: Compiler-Assisted Efficient Load-Load Ordering in Data-Race-Free Regions
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
-
Cloud White: Detecting and Estimating QoS Degradation of Latency-Critical Workloads in the Public Cloud
Future Generation Computer Systems, Vol. 138, pp. 13-25
-
Rebasing Microarchitectural Research with Industry Traces
Proceedings - 2023 IEEE International Symposium on Workload Characterization, IISWC 2023
-
Speculative inter-thread store-to-load forwarding in SMT architectures
Journal of Parallel and Distributed Computing, Vol. 173, pp. 94-106
-
Thread-to-Core Allocation in ARM Processors Building Synergistic Pairs
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2022
-
A Neural Network to Estimate Isolated Performance from Multi-Program Execution
Proceedings - 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
-
DeepP: Deep Learning Multi-Program Prefetch Configuration for the IBM POWER 8
IEEE Transactions on Computers, Vol. 71, No. 10, pp. 2646-2658
-
Effect of Hyper-Threading in Latency-Critical Multithreaded Cloud Applications and Utilization Analysis of the Major System Resources
Future Generation Computer Systems, Vol. 131, pp. 194-208
-
The Forward Slice Core: A High-Performance, Yet Low-Complexity Microarchitecture
ACM Transactions on Architecture and Code Optimization, Vol. 19, No. 2
-
VMT: Virtualized Multi-Threading for Accelerating Graph Workloads on Commodity Processors
IEEE Transactions on Computers, Vol. 71, No. 6, pp. 1386-1398
2021
-
ITSLF: Inter-thread store-to-load forwarding in simultaneous multithreading
Proceedings of the Annual International Symposium on Microarchitecture, MICRO
2020
-
Bandwidth-aware dynamic prefetch configuration for IBM POWER8
IEEE Transactions on Parallel and Distributed Systems, Vol. 31, No. 8, pp. 1970-1982
-
Precise runahead execution
Proceedings - 2020 IEEE International Symposium on High Performance Computer Architecture, HPCA 2020
-
The forward slice core microarchitecture
Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
-
Thread Isolation to Improve Symbiotic Scheduling on SMT Multicore Processors
IEEE Transactions on Parallel and Distributed Systems, Vol. 31, No. 2, pp. 359-373
2019
-
Precise runahead execution
IEEE Computer Architecture Letters, Vol. 18, No. 1, pp. 71-74
2018
-
A workload generator for evaluating SMT real-time systems
Proceedings - 2018 International Conference on High Performance Computing and Simulation, HPCS 2018
-
Designing lab sessions focusing on real processors for computer architecture courses: A practical perspective
Journal of Parallel and Distributed Computing, Vol. 118, pp. 128-139
2017
-
Improving IBM POWER8 Performance Through Symbiotic Job Scheduling
IEEE Transactions on Parallel and Distributed Systems, Vol. 28, No. 10, pp. 2838-2851