Type of Document: Dissertation

Author: Wang, Zhong

URN: etd-04202007-063456

Title: SOFTWARE PARTITIONING AND SCHEDULING FOR IMPROVING PERFORMANCE AND ENERGY CONSUMPTION

Degree: Doctor of Philosophy

Department: Computer Science and Engineering

Advisory Committee (Advisor Name, Title)
- Ken Sauer, Committee Chair
- Christian Poellabauer, Committee Member
- Danny Z. Chen, Committee Member
- Surendar Chandra, Committee Member
- Xiaobo Sharon Hu, Committee Member

Keywords
- Scheduling
- loop partitioning
- memory hierarchy
- instruction level parallelism
- access latency
- energy consumption
- multiple cluster
- multi-bank
Date of Defense: 2005-12-05

Availability: unrestricted

Abstract

With advances in contemporary computer technology, complexity has grown significantly in both hardware architectures and software applications. To meet the performance requirements of target applications, more and more emphasis is placed on compiler techniques that exploit both hardware and software parallelism. The scheduler, an important compiler component that allocates operations to hardware resources, is crucial to the success of a computing system.
In this thesis, several novel scheduling optimization techniques are presented to address the challenges faced by existing computing architectures and applications.
The first target architecture is a system with a memory hierarchy and a processor comprising multiple processing and memory units. A loop partition scheduling technique is proposed to take advantage of the memory hierarchy and effectively hide memory access latency for loop-intensive applications. The concept of a balanced partition schedule is presented to achieve the best memory access latency tolerance and hardware resource utilization. Various extensions of the base problem are studied in depth. Solutions are presented for system models with a multiple-level memory hierarchy and a memory size constraint, and for loop models with initial data and multiple nested loops.
Multiple-cluster architectures are becoming more and more popular due to their superiority over centralized architectures. Inter-cluster communication, achieved by explicit register-to-register moves, is compiler-controlled and invisible to the programmer. The thesis proposes an efficient scheduling algorithm that takes into account instruction-level parallelism (ILP), register file size, and inter-cluster communication constraints. Furthermore, the solution is completed by considering the effect of distributed caches. Data spilling, cache conflicts, and cache communication are all integrated into the algorithm.
Another target architecture is the multi-bank memory architecture, which introduces scheduling complexity and the difficulty of variable partitioning. The approach in the thesis not only improves on existing techniques for exploiting parallelism, but also considers serialism in order to take advantage of the multiple operating modes of the memory banks. By identifying the best tradeoff between parallelism and serialism, the goals of both performance and energy saving can be achieved.
A novel memory access graph model, which captures both parallelism and serialism information, forms the basis for this scheduling approach.
Files
Approximate download times are given as Hours:Minutes:Seconds.

| Filename | Size | 28.8 Modem | 56K Modem | ISDN (64 Kb) | ISDN (128 Kb) | Higher-speed Access |
|---|---|---|---|---|---|---|
| WangZ042007.pdf | 1.43 Mb | 00:06:36 | 00:03:23 | 00:02:58 | 00:01:29 | 00:00:07 |