Bulk renaming in a single pass with Go
Closed Case
The secret is known as closure and is a feature supported not only by Go but also by many other scripting and programming languages. Listing 4 illustrates the procedure with a simple example.
Listing 4
closure.go
package main

import "fmt"

func main() {
	mycounter := mkmycounter()

	mycounter()
	mycounter()
	mycounter()
}

func mkmycounter() func() {
	count := 1

	return func() {
		fmt.Printf("%d\n", count)
		count++
	}
}
Before a function-creating function like mkmycounter() returns a newly constructed subroutine to the caller, it is allowed to define local variables, which are then wrapped into the returned function's context. Across repeated calls of the returned function, those variables behave like global (or rather static) variables: If one call modifies such a variable, the next call finds the modified value. The enclosed variables therefore belong to the function, much like instance variables belong to an object in object-oriented programming.
As expected, the call of the binary compiled from Listing 4 shows successive calls of the generated function outputting growing counter values (Listing 5).
Listing 5
Calling the Binary
$ go build closure.go
$ ./closure
1
2
3
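The comparison with instance variables can be taken one step further. The following is a minimal sketch, not part of the original listings: the mkcounter() name and the int return value are assumptions made for illustration. It creates two counters from the same factory to show that each returned function carries its own private copy of the enclosed variable.

```go
package main

import "fmt"

// mkcounter is a hypothetical variant of the counter factory. It returns
// a closure that yields 1, 2, 3, ... on successive calls.
func mkcounter() func() int {
	count := 0 // enclosed variable, private to the returned function

	return func() int {
		count++
		return count
	}
}

func main() {
	a := mkcounter()
	b := mkcounter()

	fmt.Println(a(), a(), a()) // prints "1 2 3"
	fmt.Println(b())           // prints "1": b has its own count
}
```

Every call to the factory creates a fresh count variable, so the two counters never interfere with each other, just like two objects of the same class.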
Characters, Bytes, and Runes
The call to the regexp function ReplaceAllString() in line 31 of Listing 3 also needs some explanation. It replaces every part of the org string that matches the regular expression rex with the contents of the repl string. The ReplaceAll() function (without the String suffix), on the other hand, which you may find first in a cursory study of the package documentation, expects slices of the type []byte instead of strings. Attentive readers may wonder what the difference is, considering that you can easily convert a string into a byte slice with []byte(string).
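To put the two variants side by side, here is a small sketch. The pattern, the file name, and the two helper functions are made up for illustration and do not come from Listing 3; only ReplaceAllString() and ReplaceAll() are the actual regexp methods.

```go
package main

import (
	"fmt"
	"regexp"
)

// rex is an example pattern chosen for this sketch.
var rex = regexp.MustCompile(`\.log$`)

// replaceString stays in string land from start to finish.
func replaceString(org, repl string) string {
	return rex.ReplaceAllString(org, repl)
}

// replaceBytes uses the []byte variant, so a caller holding strings
// has to convert on the way in and again on the way out.
func replaceBytes(org, repl string) string {
	return string(rex.ReplaceAll([]byte(org), []byte(repl)))
}

func main() {
	fmt.Println(replaceString("foo.log", ".txt")) // foo.txt
	fmt.Println(replaceBytes("foo.log", ".txt"))  // foo.txt
}
```

Both calls deliver the same result here, but the second one needs three conversions to get there.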
To explain this, it is worth digressing into Go's implementation of strings [2]. Astonished Go students will discover that strings and byte slices ([]byte) are fundamentally different data types in Go. Strings are immutable, so you are not allowed to modify an existing string, whereas you are free to mess around with the contents of a byte slice. In addition, strings distinguish between characters and bytes. Because Go source code is UTF-8 encoded, the "Piñata" string in the program text of Listings 6 and 7 takes up seven bytes: UTF-8 represents the accented ñ character as the two bytes c3 b1 hex.
As the meaning of the word "character" has historically often been confused with "byte," the Unicode standard speaks of code points instead. The ñ character occupies position U+00F1, which UTF-8 encodes as c3 b1. To make things worse, there is also an alternative rendering of it in the form of two Unicode code points: an n followed by a squiggly tilde floating above it, but we'll not be going into that today. The only important thing is that Go refers to code points in the Unicode standard as "runes."
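For the curious, the byte and rune counts are easy to check. The sketch below is an illustration only, with the counts() helper being a made-up name; it relies on the standard len() built-in and utf8.RuneCountInString(). The escape \u00f1 spells out the single code point, while n followed by \u0303 (the combining tilde) is the two-code-point rendering.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the number of bytes and the number of runes in s.
func counts(s string) (nbytes, nrunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	precomposed := "Pi\u00f1ata" // ñ as the single code point U+00F1
	decomposed := "Pin\u0303ata" // n plus combining tilde U+0303

	fmt.Println(counts(precomposed)) // 7 6
	fmt.Println(counts(decomposed))  // 8 7
}
```

Both strings look the same on screen, yet they differ in both byte and rune counts.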
While the range operator in Listing 6 iterates over the runes (Figure 1), the for loop in Listing 7 indexes the individual bytes and returns the accented character in the form of two illegible bytes. As you can see, it makes sense to check very carefully whether a function processes strings or byte slices. Converting between the two data types looks easy, but because strings are immutable, the conversion generally has to copy the data, which costs compute cycles at runtime.
Listing 6
range.go
package main

import "fmt"

func main() {
	str := "Piñata"

	for i, c := range str {
		fmt.Printf("str[%d]='%c'\n", i, c)
	}
}
Listing 7
forloop.go
package main

import "fmt"

func main() {
	str := "Piñata"

	for i := 0; i < len(str); i++ {
		fmt.Printf("str[%d]='%c'\n", i, str[i])
	}
}
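The round trip between the two types can be sketched as follows. The patchFirst() helper is a made-up example, not code from the renamer; it shows that changing a string means converting it into a byte slice, modifying the slice, and converting it back again.

```go
package main

import "fmt"

// patchFirst is a hypothetical helper that returns s with its first
// byte replaced by 'X'. Strings are immutable, so the change has to
// happen on a byte slice.
func patchFirst(s string) string {
	if s == "" {
		return s
	}
	b := []byte(s)   // in general, allocates and copies the bytes
	b[0] = 'X'       // fine on a slice; s[0] = 'X' would not compile
	return string(b) // converts back to a new string
}

func main() {
	org := "foo.log"

	fmt.Println(patchFirst(org)) // Xoo.log
	fmt.Println(org)             // foo.log, the original is untouched
}
```

For a handful of file names, the extra copies hardly matter, but in a tight loop over large amounts of data, it pays to stay with one type throughout.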
Off We Go
Let's get back to Listing 3. Because of the closure implemented there, the function increments the value of the seq variable by one for each call and replaces the {seq} placeholder in the file template with the integer value padded out to four digits with leading zeros. foo-{seq}.log first becomes foo-0001.log, then foo-0002.log, and so on.
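Reduced to its essentials, the mechanism could look something like the following sketch. It is not the article's modifier code: the mkseq() name, its signature, and the use of strings.ReplaceAll() are assumptions made to keep the example short and self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// mkseq is a hypothetical stand-in for the modifier factory. The
// returned closure fills the {seq} placeholder in the template with a
// counter that grows by one on every call.
func mkseq(template string) func() string {
	seq := 0 // enclosed by the closure, survives between calls

	return func() string {
		seq++
		// %04d pads the number to four digits with leading zeros.
		return strings.ReplaceAll(template, "{seq}", fmt.Sprintf("%04d", seq))
	}
}

func main() {
	next := mkseq("foo-{seq}.log")

	fmt.Println(next()) // foo-0001.log
	fmt.Println(next()) // foo-0002.log
}
```

The caller never sees the counter; it simply asks for the next name and gets it.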
The call to
go build renamer.go mkmodifier.go
compiles both listings and links the result together into a binary called renamer. Figure 2 shows some usage examples.
By the way, the os.Rename() function also accepts identical source and target files, in which case it just does nothing. But if the target file already exists and is not a directory, it replaces it with the source file without any warning. If you don't want that, you can add a test and maybe a new --force option, which tells the program to bulldoze whatever it finds in the way.
To avoid unintentional renaming of critical files, it is always a good idea to do a dry run first with -d. Is everything okay? Then go again, and do it live this time.
Infos
[1] Renamer: https://github.com/adriangoransson/renamer
[2] "Strings, bytes, runes, and characters in Go": https://blog.golang.org/strings