Organizing photos by date with Go
Keeping Things Tidy
In this issue, Mike conjures up a Go program to copy photos from a cell phone or SD card into a date-based file structure on a Linux box. To avoid wasting time, a cache using UUIDs ensures that only new photos are transferred.
I regularly import photos from my phone or the SD card of my brand new mirrorless camera (a Sony A7) to my home computer in order to archive the best shots. On the computer, a homegrown program sorts them into a folder structure that creates a separate directory for each year, month, and day. After importing, the images usually remain on the card or phone. Of course, I don't want the importer to recopy previously imported images the next time it is called, but instead pick up where it left off the last time. If several SD cards are used, it is important to keep track of them because they sometimes use conflicting file names.
The photos on the SD card are files with a name format of DSC<number>.JPG. On the phone, they have a different file name, say, IMG_<number>.JPG. Cameras and photo apps increment the consecutive number of newly taken photos by one for each shot. This process is described in the Design rule for Camera File system (DCF) [1] specification. The DCF specification defines the format of the file names along with their counters and specifies what happens if a counter overflows or the camera detects that the user has used other SD cards with separate counters in the meantime.
Figure 1 shows the typical, DCF-compliant file layout on the card. On a freshly formatted card, the camera saves the first images as DSC00001.JPG, DSC00002.JPG, and so on in the 100MSDCF/ subdirectory; this, in turn, is located in the DCIM folder. Now, it's unlikely for anyone to store 99,999 pictures on a card, but if a crazy photographer actually shot that many photos, the camera would create a new directory named 101MSDCF/ and, after the next shot, would simply start again at DSC00001.JPG.
Interesting things happen if a photographer changes SD cards without reformatting the freshly inserted card: The camera's internal counter jumps from the previously monotonically increasing value to the value of the image with the highest counter on the SD card. Imagine that, after taking DSC02001.JPG, the photographer switches to an SD card that already contains a photo named DSC09541.JPG. In this case, the camera would continue with DSC09542.JPG even if DSC02002.JPG still happened to be available. Depending on the camera model and software version, there can be some deviations.
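The counter jump described above can be sketched in a few lines of Go. This is only an illustration, not camera firmware: the nextName() and trailingNumber() helpers are hypothetical, the five-digit DSC%05d.JPG pattern simply mirrors the file names used in this article, and the sketch assumes the camera always continues after the highest counter it knows about.

```go
package main

import (
	"fmt"
	"path"
	"strconv"
	"strings"
)

// trailingNumber returns the run of digits at the end of a photo's base
// name (without extension), e.g. "DSC09541.JPG" -> 9541.
func trailingNumber(name string) (int, bool) {
	base := path.Base(name)
	stem := strings.TrimSuffix(base, path.Ext(base))
	i := len(stem)
	for i > 0 && stem[i-1] >= '0' && stem[i-1] <= '9' {
		i--
	}
	if i == len(stem) {
		return 0, false // no digits at the end of the name
	}
	n, err := strconv.Atoi(stem[i:])
	if err != nil {
		return 0, false
	}
	return n, true
}

// nextName models the behavior described in the text (an assumption, real
// cameras may deviate): continue after the highest counter known, whether
// that is the camera's own last shot or a file already on the inserted card.
func nextName(lastShot int, onCard []string) string {
	highest := lastShot
	for _, name := range onCard {
		if n, ok := trailingNumber(name); ok && n > highest {
			highest = n
		}
	}
	return fmt.Sprintf("DSC%05d.JPG", highest+1)
}

func main() {
	// Card already holds DSC09541.JPG; the camera's last shot was number 2001.
	fmt.Println(nextName(2001, []string{"DSC09541.JPG"})) // DSC09542.JPG
	// Empty card: the camera just keeps counting.
	fmt.Println(nextName(2001, nil)) // DSC02002.JPG
}
```

Parsing the trailing digits rather than a fixed-width field keeps the helper usable for phone-style names such as IMG_<number>.JPG as well.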
Loose Standard
As an experiment, I manipulated an SD card serving in my Sony A7. Its directory 100MSDCF/ was filled with images ranging from DSC00205.JPG to DSC00952.JPG. When I manually inserted a new photo named DSC99999.JPG into the card and reinserted the card into the camera, the camera software actually created the new directory 101MSDCF/ (as a peer to 100MSDCF/) on the card and saved newly captured images there as DSC00953.JPG, DSC00954.JPG, and so on (see Figure 2)!
In other words, the camera remembers, even after it has been turned off and on again, the last image it took and the folder where it stored the shot. When I deleted the fake image DSC99999.JPG from 100MSDCF/ again, the camera still continued with DSC00954.JPG in the 101MSDCF/ directory.
However, if you routinely swap SD cards, you will often find new files on them with names that photos in your external storage archive already use. If my algorithm were to rely only on the original file name as a key when importing photos, it would either overwrite existing files in the computer archive or conclude that some files had already been imported previously and should therefore be ignored during the current import. It would be wrong on both counts. Instead, the importer has to store any photos that are not already in the archive, regardless of their original names.
Check and Save
How can an import application determine if a file on the SD card is actually new, even if there is already an image with the same name in the archive? The Go program presented here resorts to a cache file that makes use of the parent directories and a UUID of the respective SD card for imported photos.
Figure 3 shows the importer in action. Called up with the name of the photo directory (normally that of the SD card inserted), the importer works its way through the individual images, plumbing the depths of the card structure. It checks if the particular photo has been copied previously according to the cache data. If not, it archives it in a date-based file structure (Figure 4).
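To make the date-based layout concrete, here is a minimal sketch of how a target directory could be derived from a photo's timestamp. The dateDir() helper, the archive root name, and the sample date are hypothetical; the sketch also leaves open where the timestamp comes from (file modification time, Exif data, or something else).

```go
package main

import (
	"fmt"
	"path"
	"time"
)

// sample is a made-up capture time used only for this illustration.
var sample = time.Date(2021, time.March, 14, 9, 30, 0, 0, time.UTC)

// dateDir returns the archive directory for a photo taken at time t,
// with one directory level each for year, month, and day.
func dateDir(archiveRoot string, t time.Time) string {
	return path.Join(archiveRoot, t.Format("2006"), t.Format("01"), t.Format("02"))
}

func main() {
	fmt.Println(dateDir("archive", sample)) // archive/2021/03/14
}
```

Zero-padded month and day components keep the directories in chronological order when they are listed alphabetically.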
Knotted Handkerchief
Listing 1 implements the cache that helps the importer remember which photos it has already copied. It relies on file names and file sizes to do this. The cache is a Go map of the type map[string]bool; it assigns a value of true to each photo path (as a string) if the respective photo has already been copied. The photo path not only includes the name of the photo file, but also the name of the directory in which it is located on the card (e.g., 100MSDCF/ in Figure 5).
Listing 1
cacher.go
01 package main
02
03 import (
04   "bufio"
05   "fmt"
06   "github.com/google/uuid"
07   "io/ioutil"
08   "os"
09   "path"
10   "strings"
11 )
12
13 const uuidFile = ".uuid"
14 const cacheFile = ".idb-import-cache"
15
16 type Cache struct {
17   uuid      string
18   iPath     string
19   uuidPath  string
20   cachePath string
21   cache     map[string]bool
22 }
23
24 func NewCache(ipath string) *Cache {
25   return &Cache{
26     uuid:      "",
27     uuidPath:  path.Join(ipath, uuidFile),
28     iPath:     ipath,
29     cachePath: "",
30     cache:     map[string]bool{},
31   }
32 }
33
34 func (cache *Cache) Init() {
35   buf, err := ioutil.ReadFile(cache.uuidPath)
36   if err == nil {
37     cache.uuid = strings.TrimSpace(string(buf))
38   } else {
39     if os.IsNotExist(err) {
40       uuid := uuid.New().String()
41       err := ioutil.WriteFile(cache.uuidPath, []byte(uuid), 0644)
42       panicOnErr(err)
43       cache.uuid = uuid
44     } else {
45       panicOnErr(err)
46     }
47   }
48
49   homedir, err := os.UserHomeDir()
50   panicOnErr(err)
51   cache.cachePath = path.Join(homedir, cacheFile)
52 }
53
54 func (cache *Cache) Read() {
55   f, err := os.Open(cache.cachePath)
56   if os.IsNotExist(err) {
57     return
58   }
59   panicOnErr(err)
60   defer f.Close()
61
62   scanner := bufio.NewScanner(f)
63   for scanner.Scan() {
64     line := scanner.Text()
65     cache.cache[line] = true
66   }
67
68   return
69 }
70
71 func (cache Cache) Write() {
72   f, err := os.OpenFile(cache.cachePath, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0644)
73   panicOnErr(err)
74   defer f.Close()
75
76   for k, _ := range cache.cache {
77     fmt.Fprintf(f, "%s\n", k)
78   }
79   return
80 }
81
82 func (cache Cache) Exists(key string) bool {
83   _, ok := cache.cache[cache.uuid+":"+key]
84   return ok
85 }
86
87 func (cache Cache) Set(key string) {
88   cache.cache[cache.uuid+":"+key] = true
89 }
The program uses a 36-character UUID (32 hexadecimal digits plus four hyphens) to identify the SD card. During the first import of photos on a never-before-used card, it creates the UUID in the .uuid file at the root level of the card's filesystem and rereads it from there for subsequent import attempts. As you can see in Figure 5, the card's UUID is also part of the key of already imported photos in the cache. This way, the importer knows exactly which card a specific image came from.
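The effect of putting the card ID into the key can be shown in isolation. The following sketch uses the same key format as Exists() and Set() in Listing 1 (UUID, colon, photo path); the cacheKey() helper and the short card IDs are hypothetical stand-ins chosen for readability.

```go
package main

import "fmt"

// cacheKey mirrors the key format used by Exists() and Set() in Listing 1:
// the card's UUID, a colon, and the photo path on the card.
func cacheKey(cardUUID, photoPath string) string {
	return cardUUID + ":" + photoPath
}

func main() {
	seen := map[string]bool{}

	// Hypothetical card IDs; the real program stores a full UUID string.
	cardA, cardB := "card-a", "card-b"

	// A photo from card A has been imported.
	seen[cacheKey(cardA, "100MSDCF/DSC00001.JPG")] = true

	// The same path on card A counts as already imported ...
	fmt.Println(seen[cacheKey(cardA, "100MSDCF/DSC00001.JPG")]) // true
	// ... but an identically named file on card B does not.
	fmt.Println(seen[cacheKey(cardB, "100MSDCF/DSC00001.JPG")]) // false
}
```

Because a missing key in a map[string]bool yields false, a plain lookup is enough to tell new photos from known ones.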
In Listing 1, the structure Cache starting in line 16 defines the data of a cache instance for the card currently being processed. The NewCache() constructor starting in line 24 returns the pre-initialized structure as a pointer to the caller. The caller stores the pointer in a variable such as cache. If the programmer then types cache.Function(), Go hands the structure to the function as its receiver, which is how object orientation works in Go.
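Listing 1 declares Init() and Read() with pointer receivers, but Write(), Exists(), and Set() with value receivers. The latter still work because a copied struct shares the same underlying map. The sketch below, using a hypothetical Tally type that is not part of the importer, shows the difference between the two receiver kinds.

```go
package main

import "fmt"

// Tally is a made-up type for illustration only.
type Tally struct {
	seen  map[string]bool
	count int
}

// MarkByValue has a value receiver: it works on a copy of the struct.
// The map write is visible to the caller because the copy refers to the
// same underlying map, but the count update is lost.
func (t Tally) MarkByValue(key string) {
	t.seen[key] = true
	t.count++
}

// MarkByPointer has a pointer receiver: both updates reach the caller.
func (t *Tally) MarkByPointer(key string) {
	t.seen[key] = true
	t.count++
}

func main() {
	tally := &Tally{seen: map[string]bool{}}
	tally.MarkByValue("a")
	tally.MarkByPointer("b")
	fmt.Println(len(tally.seen), tally.count) // 2 1
}
```

Methods that need to change plain fields, such as the uuid and cachePath strings set in Init(), therefore require a pointer receiver.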