Screen scraping with Colly in Go
Programming Snapshot – Colly

Lead Image © Hannu Viitanen, 123RF.com
The Colly scraper helps developers who work with the Go programming language to collect data off the web. Mike Schilli illustrates the capabilities of this powerful tool with a few practical examples.
As long as there are websites to view for the masses of browser customers on the web, there will also be individuals on the consumer side who want the data in a different format and write scraper scripts to automatically extract the data to fit their needs.
Many sites do not like the idea of users scraping their data. Check the website's terms of service for more information, and be aware of the copyright laws for your jurisdiction. In general, as long as the scrapers do not republish or commercially exploit the data, or bombard the website too overtly with their requests, nobody is likely to get too upset about it.
Different languages offer different tools for this. Perl aficionados will probably appreciate the qualities of WWW::Mechanize
as a scraping tool, while Python fans might prefer the selenium
package [1]. In Go, there are several projects dedicated to scraping that attempt to woo developers.
[...]
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.

News
-
AUR Repository Still Under DDoS Attack
Arch User Repository continues to be under a DDoS attack that has been going on for two weeks.
-
RingReaper Malware Poses Danger to Linux Systems
A new kind of malware exploits modern Linux kernels for I/O operations.
-
Happy Birthday, Linux
On August 25, Linux officially turns 34.
-
VirtualBox 7.2 Has Arrived
With early support for Linux kernel 6.17 and other new additions, VirtualBox 7.2 is a must-update for users.
-
Linux Mint 22.2 Beta Available for Testing
Some interesting new additions and improvements are coming to Linux Mint. Check out the Linux Mint 22.2 Beta to give it a test run.
-
Debian 13.0 Officially Released
After two years of development, the latest iteration of Debian is now available with plenty of under-the-hood improvements.
-
Upcoming Changes for MXLinux
MXLinux 25 has plenty in store to please all types of users.
-
A New Linux AI Assistant in Town
Newelle, a Linux AI assistant, works with different LLMs and includes document parsing and profiles.
-
Linux Kernel 6.16 Released with Minor Fixes
The latest Linux kernel doesn't really include any big-ticket features, just a lot of lines of code.
-
EU Sovereign Tech Fund Gains Traction
OpenForum Europe recently released a report regarding a sovereign tech fund with backing from several significant entities.