Abstract
We present a deterministic sorting algorithm, Sample, Partition, and Merge Sort (SPMS), that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(nlog n) time cacheobliviously with an optimal number of cache misses. The parallel complexity (or critical path length) of the algorithm is O(log nlog log n), which improves on previous bounds for deterministic sample sort. The algorithm also has low false sharing costs. When scheduled by a work-stealing scheduler in a multicore computing environment with a global shared memory and p cores, each having a cache of size M organized in blocks of size B, the costs of the additional cache misses and false sharing misses due to this parallel execution are bounded by the cost of O(S M/B) and O(S B) cache misses, respectively, where S is the number of steals performed during the execution. Finally, SPMS is resource oblivious in that the dependence on machine parameters appear only in the analysis of its performance and not within the algorithm itself.
Original language | English (US) |
---|---|
Article number | a23 |
Journal | ACM Transactions on Parallel Computing |
Volume | 3 |
Issue number | 4 |
DOIs | |
State | Published - Mar 2017 |
Keywords
- Cache oblivious
- Sample sort
- Sorting
- merge sort
ASJC Scopus subject areas
- Software
- Modeling and Simulation
- Hardware and Architecture
- Computer Science Applications
- Computational Theory and Mathematics