Bulk renaming in a single pass with Go
Closed Case
The secret is known as a closure, a feature supported not only by Go but also by many other scripting and programming languages. Listing 4 illustrates the procedure with a simple example.
Listing 4
closure.go
package main

import "fmt"

func main() {
	mycounter := mkmycounter()

	mycounter()
	mycounter()
	mycounter()
}

func mkmycounter() func() {
	count := 1

	return func() {
		fmt.Printf("%d\n", count)
		count++
	}
}
Before a function-creating function like mkmycounter() returns a newly constructed subroutine to the caller, it is allowed to define local variables, which are then wrapped into the returned function's context. When the returned function is called multiple times, those variables behave like static variables: They persist from one call to the next. If a call to the generated function modifies one of these variables, the next call will find the previously modified value. The enclosed variables therefore belong to the function, much like instance variables belong to an object in object-oriented programming.
As expected, the call of the binary compiled from Listing 4 shows successive calls of the generated function outputting growing counter values (Listing 5).
Listing 5
Calling the Binary
$ go build closure.go
$ ./closure
1
2
3
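The comparison with instance variables can be illustrated with a small variation on the counter. The following is only a sketch: mkCounter() and its start parameter are names made up for this example and do not appear in the article's listings. Each call to mkCounter() creates a fresh count variable, so the two generated functions keep separate state.

```go
package main

import "fmt"

// mkCounter is a hypothetical variant of mkmycounter(): it returns a
// closure that owns its private count variable and hands back the
// current value on every call.
func mkCounter(start int) func() int {
	count := start

	return func() int {
		current := count
		count++
		return current
	}
}

func main() {
	a := mkCounter(1)
	b := mkCounter(100)

	fmt.Println(a()) // 1
	fmt.Println(a()) // 2
	fmt.Println(b()) // 100: b has its own count
	fmt.Println(a()) // 3: a is unaffected by calls to b
}
```

Returning the value instead of printing it inside the closure is a design choice for this sketch; it makes the generated function easier to reuse and to test.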
Characters, Bytes, and Runes
The call to the regexp function ReplaceAllString() in line 31 of Listing 3 also needs some explanation. It replaces all the characters in the org string matched by the regular expression rex with the characters in the repl string. The ReplaceAll() function (without the String suffix), on the other hand, which you may find first in a cursory look at the package documentation, expects slices of the type []byte instead of strings. Attentive readers may wonder what the difference is, given that you can easily convert a string into a byte slice with []byte(string).
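To make the difference between the two functions concrete, here is a minimal sketch that calls both on the same input. The pattern, the replacement text, and the helper names renameString() and renameBytes() are assumptions chosen for illustration; they are not taken from Listing 3.

```go
package main

import (
	"fmt"
	"regexp"
)

// rex matches a trailing ".log" suffix (an example pattern, not the
// one used in the article's renamer).
var rex = regexp.MustCompile(`\.log$`)

// renameString uses the string flavor of the regexp API.
func renameString(org string) string {
	return rex.ReplaceAllString(org, ".txt")
}

// renameBytes uses the byte slice flavor; both the input and the
// replacement have to be of type []byte.
func renameBytes(org []byte) []byte {
	return rex.ReplaceAll(org, []byte(".txt"))
}

func main() {
	fmt.Println(renameString("foo.log"))
	fmt.Println(string(renameBytes([]byte("foo.log"))))
}
```

Both calls produce foo.txt; the only difference is the data type that goes in and comes out.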
To explain this, it is worthwhile digressing into Go's implementation of strings [2]. Astonished Go students will discover that strings and byte slices ([]byte) are fundamentally different data types in Go. You are not allowed to modify existing strings: Strings are immutable, but you are allowed to mess around with byte slices. In addition, strings distinguish between characters and bytes. Because strings are UTF-8 encoded in Go code, the "Piñata" string in the program text of Listings 6 and 7 takes up seven bytes: The accented ñ character is represented in UTF-8 as the two bytes c3 b1 hex.
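The byte and character counts mentioned above are easy to check. The following sketch assumes a helper named byteAndRuneCount(), which is invented for this example; len() and utf8.RuneCountInString() are standard Go. The string is written with a \u00f1 escape so that the result does not depend on the editor's encoding.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount returns the length of s in bytes and the number of
// Unicode code points it contains.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	str := "Pi\u00f1ata"

	nbytes, ncodepoints := byteAndRuneCount(str)
	fmt.Printf("%d bytes, %d code points\n", nbytes, ncodepoints)

	// Hex dump of the underlying bytes, separated by spaces.
	fmt.Printf("% x\n", str)

	// str[0] = 'p' would be rejected by the compiler, because strings
	// are immutable.
}
```

The program reports seven bytes and six code points, and the hex dump shows the c3 b1 pair in the third and fourth position.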
As the meaning of the word "character" has historically often been confused with "byte," the Unicode standard refers to them as code points. The ñ character occupies position U+00F1, which UTF-8 encodes as c3 b1. To make things worse, there is also an alternative rendering of it in the form of two Unicode code points. This has a squiggly tilde floating above an n, but we'll not be going into that today. The only important thing is that Go refers to code points in the Unicode standard as "runes."
While the range operator in Listing 6 parses the runes (Figure 1), the for loop in Listing 7 indexes the individual bytes and returns the accented character in the form of two illegible bytes. You see: It makes sense to check very carefully whether a function processes strings or byte slices. Converting between the two data types looks easy, but it is not free: The conversion generally copies the data, which costs compute cycles at runtime.
Listing 6
range.go
package main

import "fmt"

func main() {
	str := "Piñata"

	for i, c := range str {
		fmt.Printf("str[%d]='%c'\n", i, c)
	}
}
Listing 7
forloop.go
package main

import "fmt"

func main() {
	str := "Piñata"

	for i := 0; i < len(str); i++ {
		fmt.Printf("str[%d]='%c'\n", i, str[i])
	}
}
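Why the conversion between strings and byte slices costs something becomes clearer once you modify the converted data. The sketch below uses a made-up helper, upperFirst(), that is not part of the article's code. Conceptually, []byte(s) creates an independent, mutable copy of the string's bytes, and string(b) creates another copy on the way back (the compiler may avoid a copy in some special cases).

```go
package main

import "fmt"

// upperFirst is a hypothetical helper that returns s with its first
// byte converted to upper case if it is an ASCII lowercase letter.
func upperFirst(s string) string {
	b := []byte(s) // independent, mutable copy of the string's bytes

	if len(b) > 0 && b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}

	return string(b) // converted back into a new immutable string
}

func main() {
	str := "pi\u00f1ata"

	fmt.Println(upperFirst(str)) // modified copy
	fmt.Println(str)             // the original string is unchanged
}
```

The original string stays untouched because the modification happens on the copy. A function that natively works on the data type you already have saves you this round trip.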
Off We Go
Let's get back to Listing 4. Because of the closure implemented there, the function increments the value of the seq variable by one for each call and replaces the {seq} placeholder in the file template with the integer value padded out to four digits with leading zeros. foo-{seq}.log first becomes foo-0001.log, then foo-0002.log, and so on.
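The placeholder mechanism can be sketched in a few lines. Note that this is not the article's modifier code: The function name mkSeqModifier() and its signature are assumptions made for this example, which only reproduces the behavior described above with strings.ReplaceAll() and a %04d format.

```go
package main

import (
	"fmt"
	"strings"
)

// mkSeqModifier is a hypothetical stand-in for the article's modifier
// generator. It returns a closure that fills the {seq} placeholder of
// the template with a counter padded to four digits.
func mkSeqModifier(template string) func() string {
	seq := 1

	return func() string {
		name := strings.ReplaceAll(template, "{seq}", fmt.Sprintf("%04d", seq))
		seq++
		return name
	}
}

func main() {
	next := mkSeqModifier("foo-{seq}.log")

	fmt.Println(next()) // foo-0001.log
	fmt.Println(next()) // foo-0002.log
}
```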
The call to go build renamer.go mkmodifier.go compiles both listings and links the result together into a binary called renamer. Figure 2 shows some usage examples.
By the way, the os.Rename() function also accepts identical source and target files, in which case it just does nothing. But if the target file already exists, it overwrites it with the source file without any warning. If you don't want that, you can add a test and maybe a new --force option, which tells the program to bulldoze whatever it finds in the way.
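Such a test could look like the following sketch. The safeRename() wrapper, its force parameter, and the demo() function are assumptions for illustration and not part of the article's renamer; os.Lstat(), os.Rename(), and errors.Is() with fs.ErrNotExist are standard library calls (os.MkdirTemp() and os.WriteFile() need Go 1.16 or later).

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// safeRename is a hypothetical wrapper around os.Rename() that refuses
// to overwrite an existing target unless force is set.
func safeRename(src, dst string, force bool) error {
	if src == dst {
		return nil // nothing to do
	}

	if !force {
		_, err := os.Lstat(dst)
		if err == nil {
			return fmt.Errorf("%s already exists, use force to overwrite", dst)
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return err // e.g., a permission problem while checking the target
		}
	}

	return os.Rename(src, dst)
}

// demo creates two files in a temporary directory and tries to rename
// one onto the other, first without and then with force.
func demo() (withoutForce, withForce error) {
	dir, err := os.MkdirTemp("", "renamer-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "a.log")
	dst := filepath.Join(dir, "b.log")
	for _, name := range []string{src, dst} {
		if err := os.WriteFile(name, []byte(name), 0o644); err != nil {
			panic(err)
		}
	}

	withoutForce = safeRename(src, dst, false)
	withForce = safeRename(src, dst, true)
	return withoutForce, withForce
}

func main() {
	withoutForce, withForce := demo()
	fmt.Println("without force:", withoutForce)
	fmt.Println("with force:   ", withForce)
}
```

Keep in mind that the check and the rename are two separate steps, so another process could still create the target file in between. For an interactive command-line tool, that is usually an acceptable trade-off.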
To avoid unintentional renaming of critical files, it is always a good idea to do a dry run first with -d. Is everything okay? Then go again, and do it live this time.
Infos
[1] Renamer: https://github.com/adriangoransson/renamer
[2] "Strings, bytes, runes, and characters in Go": https://blog.golang.org/strings