I am still in the process of transitioning my blog from WordPress to Hugo. The content has been moved, but I see that WordPress has created quite a mess of files. Here’s a simple way to identify and remove unused images using a few command-line tools.
1. Find All Used Images in Your Content
First, extract all image references from your Markdown files. This command searches for Markdown image links and outputs a sorted, unique list:
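grep -hroP '!\[[^]]*\]\(\K[^")]*' content/ | sed 's/[[:space:]]*$//' | sort -u > used_images.txt
grep -hroP '!\[[^]]*\]\(\K[^")]*' content/ finds every image path in your content. Matching the alt text with [^]]* rather than .* keeps two images on the same line from being merged into a single match. The -P option requires a grep built with PCRE support, such as GNU grep.
sed 's/[[:space:]]*$//' strips the trailing space that is left behind when a link carries a title, as in ![alt](/img/a.png "Title").
sort -u sorts the list and removes duplicates.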
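To see what the extraction step produces, it can help to run it on a throwaway file first. The sketch below builds a scratch content/ folder with made-up file names and image paths, then runs one tolerant variant of the pipeline: the alt text is matched with [^]]* so that two images on one line are reported separately, and trailing whitespace is stripped so a link with a title still yields a clean path. It assumes GNU grep with -P support.

```shell
# Scratch content tree; the file name and image paths are made up for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/content/posts"
cat > "$demo/content/posts/hello.md" <<'EOF'
![Logo](/images/logo.png) and ![Chart](/images/chart.svg "Sales chart")
Plain text with a [normal link](/about/) that should be ignored.
![Logo again](/images/logo.png)
EOF

# Extract image paths, strip trailing whitespace, then sort and de-duplicate.
used=$(cd "$demo" &&
  grep -hroP '!\[[^]]*\]\(\K[^")]*' content/ |
  sed 's/[[:space:]]*$//' |
  sort -u)

printf '%s\n' "$used"
```

The ordinary link is skipped because it has no leading !, and the logo appears once even though it is referenced twice.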
2. List All Image Files in Your Static Folder
Next, list every image file in your static directory:
find ./static -type f \( -name "*.jpg" -o -name "*.jpeg" -o -name "*.png" -o -name "*.gif" -o -name "*.svg" \) -printf "/%P\n" | sort -u > all_images.txt
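This command finds files with common image extensions (JPEG, PNG, GIF, SVG) and prints each path relative to the static folder with a leading slash, so the entries line up with the site-root paths used in your content. Note that -printf is a GNU find extension, and that -name matches case-sensitively, so a file named photo.JPG would be skipped.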
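The listing step can be tried on a scratch folder as well. This sketch uses made-up file names and swaps -name for -iname, which is an assumption about what you want: it also picks up upper-case extensions such as .JPG. It relies on GNU find for -printf, and sorts under LC_ALL=C only to make the demo output predictable.

```shell
# Scratch static tree; all file names are made up for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/static/images/2020" "$demo/static/css"
touch "$demo/static/images/logo.png" \
      "$demo/static/images/2020/photo.JPG" \
      "$demo/static/images/2020/banner.jpeg" \
      "$demo/static/css/site.css"

# %P prints the path with the starting point (./static) removed;
# the leading "/" turns it into a site-root path.
all=$(cd "$demo" &&
  find ./static -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \
       -o -iname '*.gif' -o -iname '*.svg' \) -printf '/%P\n' |
  LC_ALL=C sort -u)

printf '%s\n' "$all"
```

The stylesheet is left out, and photo.JPG is included only because of -iname.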
3. Compare Used and Unused Images
Now, compare the two lists to find images that are present in your static folder but not referenced in your content:
comm -23 all_images.txt used_images.txt > unused_images.txt
comm -23 outputs the lines that appear only in all_images.txt (i.e., unused images).
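A small worked example makes the column logic of comm easier to follow. The two lists below are made up; both must already be sorted the same way, which is why the real lists are produced with sort -u. The second call, comm -13, is an optional extra check that reverses the question and shows references pointing at files that do not exist.

```shell
# Two tiny, already-sorted lists with made-up paths.
demo=$(mktemp -d)
printf '%s\n' /images/banner.jpeg /images/logo.png /images/old-header.png \
  > "$demo/all_images.txt"
printf '%s\n' /images/logo.png /images/missing.gif \
  > "$demo/used_images.txt"

# -23 hides column 2 (only in used) and column 3 (in both),
# leaving files that exist on disk but are never referenced.
unused=$(comm -23 "$demo/all_images.txt" "$demo/used_images.txt")

# -13 hides column 1 (only on disk) and column 3 (in both),
# leaving references to files that are not on disk.
missing=$(comm -13 "$demo/all_images.txt" "$demo/used_images.txt")

printf 'unused:\n%s\n' "$unused"
printf 'missing:\n%s\n' "$missing"
```

Here the unused list contains banner.jpeg and old-header.png, while missing.gif is reported as a broken reference.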
4. Review and Clean Up
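Open unused_images.txt and review it before deleting anything. The list only reflects Markdown image links found in content/, so images referenced from templates, shortcodes, front matter, or CSS will show up as unused even though the site still needs them.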
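Once the list looks right, one cautious way to clean up is to move the files aside instead of deleting them, so a mistake is easy to undo. This is a sketch under assumptions: the backup folder name unused_backup is hypothetical, and the scratch files exist only to make the example runnable. On a real site you would run just the loop from the Hugo project root.

```shell
# Scratch site with one used and one unused image (made-up names).
demo=$(mktemp -d)
mkdir -p "$demo/static/images"
touch "$demo/static/images/logo.png" "$demo/static/images/old header.png"
printf '%s\n' "/images/old header.png" > "$demo/unused_images.txt"

(
  cd "$demo" || exit 1
  backup=unused_backup   # hypothetical folder name, pick your own
  # Read one path per line; IFS= and -r keep spaces and backslashes intact.
  while IFS= read -r path; do
    [ -n "$path" ] || continue          # skip blank lines
    src="static$path"
    [ -f "$src" ] || continue           # skip entries that no longer exist
    mkdir -p "$(dirname "$backup$path")"
    mv -- "$src" "$backup$path"         # move aside rather than delete
  done < unused_images.txt
)

ls -R "$demo/unused_backup"
```

After rebuilding the site and checking that nothing is broken, the backup folder can be removed.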