How to copy files

Yang Zhan, Alex Conway, Yizheng Jiao, Nirjhar Mukherjee, Ian Groombridge, Michael A. Bender, Martín Farach-Colton, William Jannen, Rob Johnson, Donald E. Porter, Jun Yuan

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Making logical copies, or clones, of files and directories is critical to many real-world applications and workflows, including backups, virtual machines, and containers. An ideal clone implementation meets the following performance goals: (1) creating the clone has low latency; (2) reads are fast in all versions (i.e., spatial locality is always maintained, even after modifications); (3) writes are fast in all versions; (4) the overall system is space efficient. Implementing a clone operation that realizes all four properties, which we call a nimble clone, is a long-standing open problem. This paper describes nimble clones in BetrFS, an open-source, full-path-indexed, and write-optimized file system. The key observation behind our work is that standard copy-on-write heuristics can be too coarse to be space efficient, or too fine-grained to preserve locality. On the other hand, a write-optimized key-value store, as used in BetrFS or an LSM-tree, can decouple the logical application of updates from the granularity at which data is physically copied. In our write-optimized clone implementation, data sharing among clones is only broken when a clone has changed enough to warrant making a copy, a policy we call copy-on-abundant-write. We demonstrate that the algorithmic work needed to batch and amortize the cost of BetrFS clone operations does not erode the performance advantages of baseline BetrFS; BetrFS performance even improves in a few cases. BetrFS cloning is efficient; for example, when using the clone operation for container creation, BetrFS outperforms a simple recursive copy by up to two orders of magnitude and outperforms file systems that have specialized LXC backends by 3-4×.
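
The copy-on-abundant-write idea from the abstract can be illustrated with a toy in-memory model. This is a minimal sketch and not the BetrFS implementation: the `Clone` class, its `threshold` parameter, and the block-list representation are all hypothetical names chosen for illustration. It only shows the policy's shape: a clone is created without copying, its writes are buffered as small deltas, and sharing with the base is broken only once the clone has changed "enough".

```python
class Clone:
    """Toy clone that shares blocks with a base until enough writes accumulate.

    Hypothetical illustration only; BetrFS itself works on a write-optimized
    key-value store, not on Python lists.
    """

    def __init__(self, base_blocks, threshold=0.5):
        self._shared = base_blocks   # shared with the base, never mutated here
        self._pending = {}           # block index -> buffered new contents
        self._private = None         # private copy, made only when warranted
        self._threshold = threshold  # assumed fraction of blocks that counts as "abundant"

    def is_sharing(self):
        """True while the clone still shares its data with the base."""
        return self._private is None

    def read(self, i):
        if self._private is not None:
            return self._private[i]
        # Buffered writes take precedence over the shared base block.
        return self._pending.get(i, self._shared[i])

    def write(self, i, data):
        if not 0 <= i < len(self._shared):
            raise IndexError(i)
        if self._private is not None:
            self._private[i] = data
            return
        self._pending[i] = data
        # Break sharing only after the clone has diverged enough.
        if len(self._pending) > self._threshold * len(self._shared):
            self._break_sharing()

    def _break_sharing(self):
        # Copy the base once and apply all buffered writes in a batch.
        private = list(self._shared)
        for i, data in self._pending.items():
            private[i] = data
        self._private = private
        self._pending = {}


base = ["a", "b", "c", "d"]
clone = Clone(base, threshold=0.5)  # creating the clone copies nothing
clone.write(0, "x")                 # buffered; base is untouched
```

In this sketch a single small write costs only a dictionary entry, while a heavily modified clone ends up with its own contiguous copy. The threshold value is an arbitrary assumption; the abstract does not state how BetrFS decides that a clone has changed enough.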

    Original language: English (US)
    Title of host publication: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020
    Publisher: USENIX Association
    Pages: 75-89
    Number of pages: 15
    ISBN (Electronic): 9781939133120
    State: Published - 2020
    Event: 18th USENIX Conference on File and Storage Technologies, FAST 2020 - Santa Clara, United States
    Duration: Feb 25 2020 - Feb 27 2020

    Publication series

    Name: Proceedings of the 18th USENIX Conference on File and Storage Technologies, FAST 2020

    Conference

    Conference: 18th USENIX Conference on File and Storage Technologies, FAST 2020
    Country/Territory: United States
    City: Santa Clara
    Period: 2/25/20 - 2/27/20

    ASJC Scopus subject areas

    • Hardware and Architecture
    • Software
    • Computer Networks and Communications
