Files smaller than 8 KiB typically represent ~70% of the entries in
/gnu/store/.links but only contribute to ~4% of the space savings
afforded by deduplication.
Not considering these files for deduplication speeds up file insertion
in the store and, more importantly, leaves 'removeUnusedLinks' with
fewer entries to traverse, thereby speeding it up proportionally.
Partly fixes <https://issues.guix.gnu.org/24937>.
* config-daemon.ac: Remove symlink hard link check and CAN_LINK_SYMLINK
definition.
* guix/store/deduplication.scm (%deduplication-minimum-size): New
variable.
(deduplicate)[loop]: Do not recurse when FILE's size is below
%DEDUPLICATION-MINIMUM-SIZE.
(dump-port): New procedure.
(dump-file/deduplicate)[hash]: Turn into...
[dump-and-compute-hash]: ... this thunk.
Call 'deduplicate' only when SIZE is greater than
%DEDUPLICATION-MINIMUM-SIZE; otherwise call 'dump-port'.
* nix/libstore/gc.cc (LocalStore::removeUnusedLinks): Drop files where
st.st_size < deduplicationMinSize.
* nix/libstore/local-store.hh (deduplicationMinSize): New declaration.
* nix/libstore/optimise-store.cc (deduplicationMinSize): New variable.
(LocalStore::optimisePath_): Return when PATH is a symlink or smaller
than 'deduplicationMinSize'.
* tests/derivations.scm ("identical files are deduplicated"): Produce
files bigger than %DEDUPLICATION-MINIMUM-SIZE.
* tests/nar.scm ("restore-file-set with directories (signed, valid)"):
Likewise.
* tests/store-deduplication.scm ("deduplicate, below %deduplication-minimum-size"):
New test.
("deduplicate", "deduplicate, ENOSPC"): Produce files bigger than
%DEDUPLICATION-MINIMUM-SIZE.
* tests/store.scm ("substitute, deduplication"): Likewise.
This avoids having to traverse and re-read the files that we have just
restored, thereby reducing I/O.
* guix/serialization.scm (dump-file): New procedure.
(restore-file): Add #:dump-file parameter and honor it.
* guix/store/deduplication.scm (tee, dump-file/deduplicate): New
procedures.
* guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'.
(finalize-store-file): Pass #:deduplicate? #f to 'register-items'.
* tests/nar.scm <top level>: Call 'setenv' to set "NIX_STORE".
* guix/serialization.scm (restore-file): Set the permissions and mtime
of FILE.
* guix/nar.scm (finalize-store-file): Pass #:reset-timestamps? #f to
'register-items'.
* tests/nar.scm (rm-rf): Add 'chmod' calls to ensure files are writable.
("write-file + restore-file with symlinks"): Ensure every file in OUTPUT
passes 'canonical-file?'.
* tests/guix-archive.sh: Run "chmod -R +w" before "rm -rf".
* guix/serialization.scm (fold-archive): Call PROC with a
'directory-complete tag when done with a directory.
(restore-file): Handle it.
* guix/scripts/archive.scm (list-contents): Likewise.
* guix/scripts/challenge.scm (archive-contents): Likewise.
* tests/nar.scm ("write-file-tree + fold-archive"): Adjust accordingly.