
Optimizing Overall Loop Schedules using Prefetching and Partitioning

Because CPU speeds have increased far faster than memory speeds, memory latency limits overall system performance. We propose Partition Scheduling with Prefetching (PSP), a method that combines loop pipelining with data prefetching. In PSP, the iteration space is first divided into regular partitions. A two-part schedule, consisting of an ALU part and a memory part, is then produced and balanced to achieve high throughput. The two parts execute simultaneously, so remote memory latencies are overlapped with computation. We study the optimal partition shape and size so that a well-balanced overall schedule can be obtained. Experiments on DSP benchmarks show that the proposed methodology consistently produces optimal or near-optimal solutions. Experiments also show that the average schedule length obtained by PSP is $26.7\%$ of that derived using list scheduling.
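The balancing idea can be illustrated with a toy cost model. The sketch below is not the PSP algorithm itself; it only assumes a rectangular partition, an ALU part proportional to the number of iterations in the partition, and a memory part proportional to the number of operands crossing two partition edges. All function names, parameters, and cycle counts are hypothetical and chosen for illustration.

```python
from math import ceil


def partition_schedule(width, height, alu_cycles=2, num_alus=1,
                       fetch_cycles=10, num_mem_units=1):
    """Toy cost model for one rectangular partition (hypothetical parameters)."""
    iterations = width * height
    # ALU part: every iteration inside the partition is computed locally.
    alu_part = ceil(iterations * alu_cycles / num_alus)
    # Memory part (assumption): one remote operand per cell along two edges
    # of the partition must be prefetched for the next partition.
    mem_part = ceil((width + height) * fetch_cycles / num_mem_units)
    return alu_part, mem_part


def schedule_lengths(width, height, **kw):
    """Return (overlapped, serialized) schedule lengths for one partition."""
    alu_part, mem_part = partition_schedule(width, height, **kw)
    overlapped = max(alu_part, mem_part)  # the two parts run simultaneously
    serialized = alu_part + mem_part      # no prefetching: fetch, then compute
    return overlapped, serialized


def smallest_balanced_square(max_side=64, **kw):
    """Smallest square partition whose memory part is hidden by the ALU part."""
    for side in range(1, max_side + 1):
        alu_part, mem_part = partition_schedule(side, side, **kw)
        if mem_part <= alu_part:
            return side
    return None  # no balanced partition within the search range


if __name__ == "__main__":
    side = smallest_balanced_square()
    print(side, schedule_lengths(side, side))
```

Under this model, a partition that is too small leaves the ALU idle while it waits for prefetched data, whereas a sufficiently large partition hides the memory part entirely behind computation. The real method also has to choose the partition shape, which this sketch fixes to a square.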

Two journal papers and five conference papers have been published or submitted in this category.



Hsingmean Sha 2010-03-24