TY - GEN
T1 - How to copy files
AU - Zhan, Yang
AU - Conway, Alex
AU - Jiao, Yizheng
AU - Mukherjee, Nirjhar
AU - Groombridge, Ian
AU - Bender, Michael A.
AU - Farach-Colton, Martín
AU - Jannen, William
AU - Johnson, Rob
AU - Porter, Donald E.
AU - Yuan, Jun
N1 - Publisher Copyright: Copyright © Proc. of the 18th USENIX Conference on File and Storage Tech., FAST 2020. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This paper describes nimble clones in BetrFS, an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, as used in BetrFS or an LSM-tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders-of-magnitude and outperforms file systems that have specialized LXC backends by 3-4×.
AB - Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This paper describes nimble clones in BetrFS, an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, as used in BetrFS or an LSM-tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders-of-magnitude and outperforms file systems that have specialized LXC backends by 3-4×.
UR - http://www.scopus.com/inward/record.url?scp=85091855690&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091855690&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85091855690
T3 - Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
SP - 75
EP - 89
BT - Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
PB - USENIX Association
T2 - 18th USENIX Conference on File and Storage Technologies, FAST 2020
Y2 - 25 February 2020 through 27 February 2020
ER -