Selenium in Go: A Beautiful Nightmare

Max Finn
6 min read · Mar 4, 2020


This treacherous journey started out innocently enough. For the final project in my Golang class, I was asked to build something that would make our daily lives easier. Thinking about how much I hated typing the same things over and over outside of Go, I started with the idea of a SQL injection tester that would use Go’s native concurrency features to test multiple websites, possibly scraped from a starting site along with everything that site links to. This idea has been done before, but with my newfound knowledge of the potency of goroutines and benchmarking, I thought it would be cool to see how Go would stack up. However, I had a profound realization based on experience and found an even higher calling, one that sent me down a deep, dark path.

My epiphany came from years of arguments over text with the most opinionated people in chats everywhere. I wanted to make settling those arguments even better, and also make it impossible to post fake or photoshopped answers. By having a bot do the fact check, I could make the result undeniable, with the added bonus that it could backfire on the user should they be mistaken. I had just done a project on Slack chat bots in Go, which gave me the platform I needed to take in user queries and return answers. Finally, I just needed something to input my query into a fact-checking site like snopes.com. Unbeknownst to me, I was standing at the gate of something I still don’t know if I truly wanted to open.

Selenium immediately stood out as the best option, primarily because it was the only option. No other package could do what I needed, and it didn’t look so bad! Its main purpose is to mimic human input for the sake of testing sites automatically, but we could use it instead to make requests without ever needing to click anything ourselves. There were even code examples for running simple requests in Golang. What could go (ha) wrong? Little did I know, I had just taken my first steps down a winding and twisted road, full of peril.

Requirements

So how do you even set up Selenium in Go? I had a couple of different options, but went with Tebeka’s Selenium framework for Go (found at https://github.com/tebeka/selenium), as it had the most extensive documentation and tests. It was looking great, and the setup seemed easy enough. I soon learned this was not the case. Selenium requires a few things to work, which I will lay out in the most approachable way I can, for future attempts.

First, like any Go package, the first step is to “go get” it using the correct import path (github.com/tebeka/selenium). Then the X virtual framebuffer, or Xvfb for short, was needed to help create headless instances, which I will go over later. As it turns out, X11 (and Xvfb with it) used to ship with Macs but was removed in later versions of OS X. XQuartz is a package for Mac that includes Xvfb, so no trouble there. Just make sure it’s included in your $PATH and you’re set. Now, this is where the fun begins.

Next we require the web driver for the browser we want to run in, which also means that browser has to be installed. This immediately stood out as a problem, since different browsers require different drivers and I didn’t want to flood my code with cases. So I just went with the standard Chrome driver (ChromeDriver) and set on my way. The trick was finding where this driver had to go. I had never worked with an executable in my code, and in my setup simply pointing at it somewhere else on my machine did not work. I ended up putting the driver into the actual repository so Selenium could access it to run the headless instance. Finally, we needed the Java Development Kit (JDK) and the Selenium server in the form of a .jar file. Establishing the paths for these requires a lot of patience with finding the correct file locations, and can get very confusing. I recommend going one at a time and checking each off the list, making sure each works as you go.
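One way to do that "check each off the list" step is a small preflight program that uses only the Go standard library. This is a minimal sketch, not part of my original project: the file locations `./chromedriver` and `./selenium-server.jar` are hypothetical placeholders, so swap in wherever your own copies live.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// dependency describes one prerequisite from the list above.
// If onPath is true, target is a command name looked up on $PATH;
// otherwise target is a file path checked on disk.
type dependency struct {
	name   string
	target string
	onPath bool
}

// check returns nil when the dependency is present and an error describing
// what is missing otherwise.
func check(d dependency) error {
	if d.onPath {
		if _, err := exec.LookPath(d.target); err != nil {
			return fmt.Errorf("%s: %q not found on $PATH: %w", d.name, d.target, err)
		}
		return nil
	}
	info, err := os.Stat(d.target)
	if err != nil {
		return fmt.Errorf("%s: %w", d.name, err)
	}
	if info.IsDir() {
		return fmt.Errorf("%s: %q is a directory, expected a file", d.name, d.target)
	}
	return nil
}

func main() {
	deps := []dependency{
		{name: "Xvfb", target: "Xvfb", onPath: true},
		{name: "Java", target: "java", onPath: true},
		// Hypothetical locations inside the repository; adjust to your setup.
		{name: "ChromeDriver", target: "./chromedriver"},
		{name: "Selenium server", target: "./selenium-server.jar"},
	}
	for _, d := range deps {
		if err := check(d); err != nil {
			fmt.Println("MISSING:", err)
			continue
		}
		fmt.Println("OK:     ", d.name)
	}
}
```

Running it before touching any Selenium code tells you which of the four pieces is the one breaking, instead of having to guess from a driver error.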

Selenium Requests

So, you’re finally ready to start your Selenium request. Start the frame buffer using the command provided in the docs. Then define the browser and start it. This is where choosing Chrome or another browser comes into play; everything must be connected through the correct paths to work, and it’s very finicky. Connect to the driver locally on the port you chose for it. Then use the headless instance to make a request with the driver on a real website.
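The "connect locally on a port" part is less mysterious than it sounds: the driver is just an HTTP server on your own machine. Below is a small standard-library sketch of picking a port and building the address a client would dial. The helper names `freePort` and `remoteURL` are my own, and the `/wd/hub` suffix is the URL shape used in the tebeka/selenium example when talking to a Selenium server; other setups may use a different prefix.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the operating system for a TCP port that is currently unused.
// Note: the port is released before returning, so another process could in
// theory grab it before the driver does. Good enough for local experiments.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "localhost:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

// remoteURL builds the local address of a Selenium server listening on port.
func remoteURL(port int) string {
	return fmt.Sprintf("http://localhost:%d/wd/hub", port)
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not find a free port:", err)
		return
	}
	fmt.Println("start the server on port", port)
	fmt.Println("then connect to", remoteURL(port))
}
```

The same port number has to be given to both the server you start and the client that connects to it, which is one more place where the setup can quietly go wrong.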

The basic logic for how Selenium operates goes like this:

  • Navigate to the site
  • Reference the element you want to interact with using CSS selectors
  • Remove any boilerplate (e.g. the greyed-out “Search” you see before you type something in a website’s search box)
  • Input the query into the text box
  • Click the run button
  • Wait for the response and collect it
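The six steps above can be sketched as one Go function. To keep the example runnable without a browser, it is written against two small hypothetical interfaces, `Page` and `Element`, which only resemble what a WebDriver client offers and are not the API of any particular library. The `fakePage` type is an in-memory stand-in, and the selectors `#q`, `#go` and `#result` are made up.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Element and Page are hypothetical interfaces for this sketch only.
type Element interface {
	Clear() error
	SendKeys(text string) error
	Click() error
	Text() (string, error)
}

type Page interface {
	Get(url string) error
	Find(cssSelector string) (Element, error)
}

// waitFor polls for an element until it appears or the timeout expires.
func waitFor(p Page, sel string, timeout, interval time.Duration) (Element, error) {
	deadline := time.Now().Add(timeout)
	for {
		el, err := p.Find(sel)
		if err == nil {
			return el, nil
		}
		if time.Now().After(deadline) {
			return nil, fmt.Errorf("timed out waiting for %q: %w", sel, err)
		}
		time.Sleep(interval)
	}
}

// runSearch follows the six steps from the list, in order.
func runSearch(p Page, url, inputSel, buttonSel, resultSel, query string) (string, error) {
	if err := p.Get(url); err != nil { // 1. navigate to the site
		return "", err
	}
	input, err := p.Find(inputSel) // 2. reference the element by CSS selector
	if err != nil {
		return "", err
	}
	if err := input.Clear(); err != nil { // 3. remove any boilerplate text
		return "", err
	}
	if err := input.SendKeys(query); err != nil { // 4. input the query
		return "", err
	}
	button, err := p.Find(buttonSel)
	if err != nil {
		return "", err
	}
	if err := button.Click(); err != nil { // 5. click the run button
		return "", err
	}
	// 6. wait for the response and collect it
	result, err := waitFor(p, resultSel, 2*time.Second, 50*time.Millisecond)
	if err != nil {
		return "", err
	}
	return result.Text()
}

// fakeElement and fakePage simulate a page so the sketch runs anywhere.
type fakeElement struct {
	page *fakePage
	text string
}

func (e *fakeElement) Clear() error            { e.text = ""; return nil }
func (e *fakeElement) SendKeys(s string) error { e.text += s; return nil }
func (e *fakeElement) Text() (string, error)   { return e.text, nil }
func (e *fakeElement) Click() error {
	// Clicking "renders" a result element based on the search box contents.
	e.page.elems["#result"] = &fakeElement{page: e.page, text: "results for " + e.page.elems["#q"].text}
	return nil
}

type fakePage struct{ elems map[string]*fakeElement }

func (p *fakePage) Get(string) error { return nil }
func (p *fakePage) Find(sel string) (Element, error) {
	if el, ok := p.elems[sel]; ok {
		return el, nil
	}
	return nil, errors.New("no such element: " + sel)
}

func newFakePage() *fakePage {
	p := &fakePage{elems: map[string]*fakeElement{}}
	p.elems["#q"] = &fakeElement{page: p, text: "Search"} // boilerplate text
	p.elems["#go"] = &fakeElement{page: p}
	return p
}

func main() {
	out, err := runSearch(newFakePage(), "https://example.com", "#q", "#go", "#result", "moon landing")
	fmt.Println(out, err)
}
```

With a real driver, each method on the fake would be replaced by the matching call in whichever WebDriver client you use. The polling loop in `waitFor` is the part worth keeping either way, since results rarely appear the instant you click.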

Almost all of these steps can prove to be very fickle, so testing more often along the way can help save time. Each site will be different, and using CSS selectors makes the code more susceptible to breakage when websites change their frontend. You’re also bound to encounter different weird environment errors along the way. Something to experiment with would be filling multiple fields, which is what purchase bots do when they drain a product’s stock within seconds of release. Fun fact: many websites hide the true CSS selector of the search bar to prevent things like this from happening. There are a few test sites available to practice on, or you can just try to open the browser without doing anything. I have since found that Selenium is more intended for testing your own sites, and websites don’t like us using bots to poke around on them.

Headless

So, as I mentioned earlier, what are headless instances and why does Selenium use them? A headless browser instance is like running your browser, but without being able to see it. It has no visible window or graphical interface (the “head”), as it doesn’t need one, hence the name. Everything the program does happens under the hood within this browser instance, and all we needed to do was write out the guide for it to make requests, click buttons, and more!
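Asking for a headless browser comes down to passing a flag when the session is created. As an illustration, here is a standard-library sketch that builds the JSON body a W3C WebDriver "new session" request would carry for headless Chrome. It assumes a ChromeDriver that speaks the W3C protocol; the function name `headlessChromeRequest` and the struct names are my own, and a client library would normally build this payload for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chromeOptions holds the Chrome-specific settings sent to ChromeDriver.
type chromeOptions struct {
	Args []string `json:"args"`
}

// alwaysMatch lists the capabilities the new session must have.
type alwaysMatch struct {
	BrowserName   string        `json:"browserName"`
	ChromeOptions chromeOptions `json:"goog:chromeOptions"`
}

// newSessionRequest mirrors the shape of a W3C WebDriver new-session body.
type newSessionRequest struct {
	Capabilities struct {
		AlwaysMatch alwaysMatch `json:"alwaysMatch"`
	} `json:"capabilities"`
}

// headlessChromeRequest returns the JSON body asking for a Chrome session
// started with --headless plus any extra command-line flags.
func headlessChromeRequest(extraArgs ...string) ([]byte, error) {
	var req newSessionRequest
	req.Capabilities.AlwaysMatch = alwaysMatch{
		BrowserName:   "chrome",
		ChromeOptions: chromeOptions{Args: append([]string{"--headless"}, extraArgs...)},
	}
	return json.Marshal(req)
}

func main() {
	body, err := headlessChromeRequest("--window-size=1280,800")
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(string(body))
}
```

Dropping `--headless` from the arguments is a handy debugging trick: the same script then opens a real window, so you can watch what the bot is doing.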

What Else Can Selenium Be Used For?

Well, with automation like this comes both good and bad. The good: Selenium can take screenshots, push buttons and run automated tests on JS libraries. The bad includes DDoS attacks and credential stuffing. Especially considering the power Go has to automate and run things concurrently, Selenium has the potential to be used in malicious ways. No wonder sites make it so hard to use!

Overall Thoughts and Conclusion

Selenium is difficult. I walked into it with 6 days to produce a fully functional bot incorporating this new, complicated and headache-inducing technology. I liked the capabilities of Selenium, but it seems to be better supported in Python than in Go. It felt like I fell into something I didn’t mean to, similar to Frodo from Lord of the Rings. But as the wise Boromir said, “One does not simply walk into Mordor.” I feel like the equivalent occurred.

In the future, I would actually come back to Selenium. It is powerful, even if frustrating. Another option, possibly much easier, would be to implement the Selenium part in Python and have a Go concurrency script call it in Docker, but that remains to be seen. Selenium seemed strangely optimized for Linux and Windows, which caught me off guard on a Mac. However, going through this process helped me steel my resolve. I pushed myself through many extremely bizarre environment errors. Most of the work isn’t programming, just moving files, installing things and setting paths. It wasn’t enjoyable and took a lot of trial and error, but I know this kind of problem will keep coming up in the future, and this is a project that rewards patience with some cool results.

To quote my incredibly credible teacher:

“We just kinda use Selenium, because it’s the only thing we have. There’s nothing else right now.”

However, she recommended the Chrome web driver for my next attempt at this idea.


Written by Max Finn

I'm a passionate backend engineer writing about my code projects so that I can make it a little easier on myself (and hopefully you) later.
