Logo Logo
Hilfe
Kontakt
Switch language to English
Task scheduling on FPGA-based accelerators with limited partial reconfiguration
Task scheduling on FPGA-based accelerators with limited partial reconfiguration
Field Programmable Gate Arrays (FPGAs) are integrated circuits that can be reconfigured dynamically. Accelerators for offloading computation based on FPGAs demonstrate potential as energy-efficient and flexible alternatives to conventional accelerators in High-Performance and Cloud Computing. Scheduling tasks on FPGAs is comparable to the allocation of resources on a chip: each offloaded task occupies an area of the chip during its execution. Task scheduling on FPGAs is typically done using Partial Reconfiguration (PR), where a subset of the FPGA is configured to execute a task while the remaining circuits remain unchanged, inspired years of research for scheduling strategies based on PR. However, PR's reliance on low-level features limits its portability and necessitates expert knowledge, hindering the application of existing research for task scheduling and the overall adoption of FPGA-based accelerators. In this work, we aim to support software developers in integrating accelerators based on FPGAs by asking how to optimize task scheduling on FPGA-based accelerators without relying on PR? To address our research question, we present three key contributions: first, we introduce an abstraction-agnostic methodology for describing FPGAs and scheduling strategies, centered around deducing scheduling constraints from a machine model representing a target FPGA. Given a task graph, optimal schedules can be generated with constraint programming and analyzed systematically afterwards. We apply our methodology exemplarily in a case study for two machine models -- one supporting PR and one without PR -- and compare the resulting schedules generated for task graphs of existing applications, demonstrating that avoiding PR is feasible. Second, we introduce an algorithm that uses heuristics to find schedules in polynomial time, since optimal scheduling is an NP-complete problem. The algorithm supports the scheduling of tasks on FPGAs without a dependency on PR. It is evaluated and the derived schedules are compared against optimal schedules. Third, we apply a genetic algorithm to perform Design Space Exploration (DSE) for the machine model. Its goal is to recommend minimal or automated changes to the input program, which in turn affect the quality of possible schedules. We show that effective recommendations can be generated. Our results can help vendors provide significantly more streamlined workflows for programming FPGAs and thus make the platform more appealing for users. The insights can also be used to extend established programming models, such as OpenCL, to accommodate the unique characteristics of FPGAs. Based on our research, architectural optimizations for an approach with a vastly simplified PR implementation both in hardware and software can be considered. Lastly, automating the generation of constraints for an accelerator would support the selection of the most suitable scheduling strategy without human interaction.
Not available
Jungblut, Pascal
2024
Englisch
Universitätsbibliothek der Ludwig-Maximilians-Universität München
Jungblut, Pascal (2024): Task scheduling on FPGA-based accelerators with limited partial reconfiguration. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik
[thumbnail of Jungblut_Pascal.pdf]
Vorschau
Lizenz: Creative Commons: Namensnennung 4.0 (CC-BY)
PDF
Jungblut_Pascal.pdf

1MB

Abstract

Field Programmable Gate Arrays (FPGAs) are integrated circuits that can be reconfigured dynamically. Accelerators for offloading computation based on FPGAs demonstrate potential as energy-efficient and flexible alternatives to conventional accelerators in High-Performance and Cloud Computing. Scheduling tasks on FPGAs is comparable to the allocation of resources on a chip: each offloaded task occupies an area of the chip during its execution. Task scheduling on FPGAs is typically done using Partial Reconfiguration (PR), where a subset of the FPGA is configured to execute a task while the remaining circuits remain unchanged, inspired years of research for scheduling strategies based on PR. However, PR's reliance on low-level features limits its portability and necessitates expert knowledge, hindering the application of existing research for task scheduling and the overall adoption of FPGA-based accelerators. In this work, we aim to support software developers in integrating accelerators based on FPGAs by asking how to optimize task scheduling on FPGA-based accelerators without relying on PR? To address our research question, we present three key contributions: first, we introduce an abstraction-agnostic methodology for describing FPGAs and scheduling strategies, centered around deducing scheduling constraints from a machine model representing a target FPGA. Given a task graph, optimal schedules can be generated with constraint programming and analyzed systematically afterwards. We apply our methodology exemplarily in a case study for two machine models -- one supporting PR and one without PR -- and compare the resulting schedules generated for task graphs of existing applications, demonstrating that avoiding PR is feasible. Second, we introduce an algorithm that uses heuristics to find schedules in polynomial time, since optimal scheduling is an NP-complete problem. The algorithm supports the scheduling of tasks on FPGAs without a dependency on PR. It is evaluated and the derived schedules are compared against optimal schedules. Third, we apply a genetic algorithm to perform Design Space Exploration (DSE) for the machine model. Its goal is to recommend minimal or automated changes to the input program, which in turn affect the quality of possible schedules. We show that effective recommendations can be generated. Our results can help vendors provide significantly more streamlined workflows for programming FPGAs and thus make the platform more appealing for users. The insights can also be used to extend established programming models, such as OpenCL, to accommodate the unique characteristics of FPGAs. Based on our research, architectural optimizations for an approach with a vastly simplified PR implementation both in hardware and software can be considered. Lastly, automating the generation of constraints for an accelerator would support the selection of the most suitable scheduling strategy without human interaction.