TY - GEN
T1 - Using one-sided RDMA reads to build a fast, CPU-efficient key-value store
AU - Mitchell, Christopher
AU - Geng, Yifeng
AU - Li, Jinyang
N1 - Publisher Copyright: © USENIX Annual Technical Conference, USENIX ATC 2013. All rights reserved.
PY - 2013
Y1 - 2013
N2 - Recent technological trends indicate that future datacenter networks will incorporate High Performance Computing network features, such as ultra-low latency and CPU bypassing. How can these features be exploited in datacenter-scale systems infrastructure? In this paper, we explore the design of a distributed in-memory key-value store called Pilaf that takes advantage of Remote Direct Memory Access to achieve high performance with low CPU overhead. In Pilaf, clients directly read from the server's memory via RDMA to perform gets, which commonly dominate key-value store workloads. By contrast, put operations are serviced by the server to simplify the task of synchronizing memory accesses. To detect inconsistent RDMA reads with concurrent CPU memory modifications, we introduce the notion of self-verifying data structures that can detect read-write races without client-server coordination. Our experiments show that Pilaf achieves low latency and high throughput while consuming few CPU resources. Specifically, Pilaf can surpass 1.3 million ops/sec (90% gets) using a single CPU core compared with 55K for Memcached and 59K for Redis.
UR - http://www.scopus.com/inward/record.url?scp=85077206568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077206568&partnerID=8YFLogxK
M3 - Conference contribution
T3 - Proceedings of the 2013 USENIX Annual Technical Conference, USENIX ATC 2013
SP - 103
EP - 114
BT - Proceedings of the 2013 USENIX Annual Technical Conference, USENIX ATC 2013
PB - USENIX Association
T2 - 2013 USENIX Annual Technical Conference, USENIX ATC 2013
Y2 - 26 June 2013 through 28 June 2013
ER -