daemon: Do not deduplicate files smaller than 8 KiB....Files smaller than 8 KiB typically represent ~70% of the entries in
/gnu/store/.links but only contribute to ~4% of the space savings
afforded by deduplication.
Not considering these files for deduplication speeds up file insertion
in the store and, more importantly, leaves 'removeUnusedLinks' with
fewer entries to traverse, thereby speeding it up proportionally.
Partly fixes <https://issues.guix.gnu.org/24937>.
* config-daemon.ac: Remove symlink hard link check and CAN_LINK_SYMLINK
definition.
* guix/store/deduplication.scm (%deduplication-minimum-size): New
variable.
(deduplicate)[loop]: Do not recurse when FILE's size is below
%DEDUPLICATION-MINIMUM-SIZE.
(dump-port): New procedure.
(dump-file/deduplicate)[hash]: Turn into...
[dump-and-compute-hash]: ... this thunk.
Call 'deduplicate' only when SIZE is greater than
%DEDUPLICATION-MINIMUM-SIZE; otherwise call 'dump-port'.
* nix/libstore/gc.cc (LocalStore::removeUnusedLinks): Drop files where
st.st_size < deduplicationMinSize.
* nix/libstore/local-store.hh (deduplicationMinSize): New declaration.
* nix/libstore/optimise-store.cc (deduplicationMinSize): New variable.
(LocalStore::optimisePath_): Return when PATH is a symlink or smaller
than 'deduplicationMinSize'.
* tests/derivations.scm ("identical files are deduplicated"): Produce
files bigger than %DEDUPLICATION-MINIMUM-SIZE.
* tests/nar.scm ("restore-file-set with directories (signed, valid)"):
Likewise.
* tests/store-deduplication.scm ("deduplicate, below %deduplication-minimum-size"):
New test.
("deduplicate", "deduplicate, ENOSPC"): Produce files bigger than
%DEDUPLICATION-MINIMUM-SIZE.
* tests/store.scm ("substitute, deduplication"): Likewise.
Ludovic Courtès
2020-12-15
nar: Deduplicate files right as they are restored....This avoids having to traverse and re-read the files that we have just
restored, thereby reducing I/O.
* guix/serialization.scm (dump-file): New procedure.
(restore-file): Add #:dump-file parameter and honor it.
* guix/store/deduplication.scm (tee, dump-file/deduplicate): New
procedures.
* guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'.
(finalize-store-file): Pass #:deduplicate? #f to 'register-items'.
* tests/nar.scm <top level>: Call 'setenv' to set "NIX_STORE".
Ludovic Courtès
2020-12-15
serialization: 'restore-file' sets canonical timestamp and permissions....* guix/serialization.scm (restore-file): Set the permissions and mtime
of FILE.
* guix/nar.scm (finalize-store-file): Pass #:reset-timestamps? #f to
'register-items'.
* tests/nar.scm (rm-rf): Add 'chmod' calls to ensure files are writable.
("write-file + restore-file with symlinks"): Ensure every file in OUTPUT
passes 'canonical-file?'.
* tests/guix-archive.sh: Run "chmod -R +w" before "rm -rf".
Ludovic Courtès
2020-12-15
serialization: 'fold-archive' notifies about directory processing completion....* guix/serialization.scm (fold-archive): Call PROC with a
'directory-complete tag when done with a directory.
(restore-file): Handle it.
* guix/scripts/archive.scm (list-contents): Likewise.
* guix/scripts/challenge.scm (archive-contents): Likewise.
* tests/nar.scm ("write-file-tree + fold-archive"): Adjust accordingly.