Re-using Gentoo’s distfile cache for Emacs source directory

posted on 2023-05-13

When I build GNU Emacs on my Gentoo Linux system using the ebuild that pulls sources directly from the Emacs git repository, the cloned contents land in /var/cache/distfiles/git3-src/emacs.git. To get easy access to the C source of Emacs functions when reading the built-in documentation, one needs to point Emacs to a source directory by setting the source-directory variable. It would be quite natural to reuse this cached git repository instead of cloning the source code once again.

Hard luck: the distfiles cache keeps only a bare git repository, with no checked-out files. But we can still reuse at least some of the disk space, with some caveats.

We can use the --shared option of git clone to create a clone with checked-out files that borrows its objects from the original repository instead of copying them (assuming ~/Src/opensource/emacs-git as the destination folder):

~/Src/opensource % git clone --shared /var/cache/distfiles/git3-src/emacs.git/ emacs-git
Cloning into 'emacs-git'...
done.
Updating files: 100% (4980/4980), done.

~/Src/opensource % du -hs emacs-git/
188M    emacs-git/

~/Src/opensource % du -hs /var/cache/distfiles/git3-src/emacs.git/
673M    /var/cache/distfiles/git3-src/emacs.git/

The above shows that the new repository takes 188 MB on disk, on top of the 673 MB the bare repository already occupies. An ordinary full clone would copy roughly those 673 MB in addition to the 188 MB of checked-out files, so that’s a significant saving of disk space.
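The saving comes from git's alternates mechanism: a --shared clone records the path of the source's object directory in .git/objects/info/alternates instead of copying the objects. Below is a minimal sketch that demonstrates this on a throwaway repository, so it can be run anywhere; the temporary directory and the origin, cache.git and checkout names are made up for illustration. For the real clone, the file to look at would be emacs-git/.git/objects/info/alternates.

```shell
#!/bin/sh
set -eu

# Scratch area; every name below is illustrative only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A tiny bare repository standing in for the distfiles cache.
git init --quiet "$work/origin"
git -C "$work/origin" -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "initial commit"
git clone --quiet --bare "$work/origin" "$work/cache.git"

# The shared clone borrows objects from cache.git instead of copying them.
git clone --quiet --shared "$work/cache.git" "$work/checkout"

# This file holds the path of the borrowed object store.
alternates=$(cat "$work/checkout/.git/objects/info/alternates")
echo "$alternates"
```

The printed path should point at the objects directory inside cache.git, which is why the shared clone itself stays small.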

What’s left is to point Emacs to this repository, by putting the following into our init.el:

(setq source-directory (expand-file-name "~/Src/opensource/emacs-git/"))

Afterwards describe-function shows the C sources as well:

2023-05-13-gentoo-emacs-source-directory-c-in-describe-function.png

Updating Emacs with emerge requires one additional manual step: pulling the changes into the --shared clone afterwards.
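That manual step could be wrapped in a small shell function. This is only a sketch: the function name update_shared_clone is hypothetical, and the default path is the destination folder assumed earlier in this post.

```shell
#!/bin/sh
# update_shared_clone [DIR]: fast-forward the checkout in DIR to its origin.
# Hypothetical helper; DIR defaults to the clone location used in this post.
update_shared_clone() {
    dir=${1:-"$HOME/Src/opensource/emacs-git"}
    # --ff-only refuses to create a merge commit if the clone has diverged,
    # which should not happen for a clone that is only used for reading.
    git -C "$dir" pull --quiet --ff-only
}
```

Running update_shared_clone after emerge has rebuilt Emacs brings the checked-out sources back in line with the cached repository.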

One thing we need to remember when using such a --shared clone of a read-only repository (it is owned by root and not writable by normal users) is that we can’t really change anything in the source repository itself. There are also some dire warnings about the --shared option in the git clone man page that are good to note (they mostly don’t apply to this situation because of the read-only nature of our source repository):

NOTE: this is a possibly dangerous operation; do not use it unless you understand what it does. If you clone your repository using this option and then delete branches (or use any other Git command that makes any existing commit unreferenced) in the source repository, some objects may become unreferenced (or dangling). These objects may be removed by normal Git operations (such as git commit) which automatically call git maintenance run --auto. (See git-maintenance(1).) If these objects are removed and were referenced by the cloned repository, then the cloned repository will become corrupt.

Note that running git repack without the --local option in a repository cloned with --shared will copy objects from the source repository into a pack in the cloned repository, removing the disk space savings of clone --shared. It is safe, however, to run git gc, which uses the --local option by default.
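To check whether the clone is still borrowing its objects (for example after an accidental repack), git count-objects -v can help. A small sketch, again with a hypothetical function name and the clone path assumed in this post as the default:

```shell
#!/bin/sh
# report_clone_storage [DIR]: show what DIR stores itself and what it borrows.
# Hypothetical helper; DIR defaults to the clone location used in this post.
report_clone_storage() {
    dir=${1:-"$HOME/Src/opensource/emacs-git"}
    # size-pack: size in KiB of packs stored in the clone itself
    # alternate: object stores the clone borrows from (set up by --shared)
    git -C "$dir" count-objects -v | grep -E '^(size-pack|alternate):'
}
```

A size-pack value that is small compared to the source repository, together with an alternate line pointing into the distfiles cache, suggests the disk space saving is still intact.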

Happy hacking!