Running Headless Selenium with Chrome

Scaling website automation for testing or scraping can be a challenge when the site is entirely driven by JavaScript or behaves differently in specific browsers.

Running a headless Selenium machine with Google’s Chrome installed provides a scalable way to automate your tests on one of the most popular browsers in use.

Here are step-by-step instructions for installing a headless Selenium server with Chrome and Vagrant.

Selenium with Chrome

Side note: Why use Selenium instead of PhantomJS?


Spike Goals

  • Get up and running quickly
  • Run a sample script that demos it works
  • Use JavaScript only (via NodeJS)

Prerequisites

The code you write locally should work when deployed at scale in production. These tools help us do that by creating identical environments for development and production.

Both are free downloads. Install with the default settings.

I also assume you can use a command line and have some idea of what virtual machines and Vagrant are.


#1. Create a “Vagrantfile”

This file tells Vagrant how to configure the testing environment. It applies to both development and production.

Create a project directory and create a file named Vagrantfile:

# encoding: utf-8
# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "precise64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"
  config.ssh.forward_agent = true

  config.vm.provider :aws do |aws, override|
    aws.access_key_id = 'XXXX'      # Replace this
    aws.secret_access_key = 'XXXX'  # Replace this
    aws.keypair_name = 'XXXX'       # Replace this
    aws.ami = 'ami-7747d01e'        # ubuntu 12.04
    override.ssh.username = 'ubuntu'
    override.ssh.private_key_path = '~/.ssh/amazon-ubuntu.pem'
  end

  config.vm.provision :shell, :path => "setup.sh"
  config.vm.network :forwarded_port, guest:4444, host:4444

end

#2. Create “setup.sh”

The setup.sh file executes when Vagrant creates a virtual machine for you. In the same folder where you created your Vagrantfile, create a setup.sh file:

#!/bin/sh
set -e

if [ -e /.installed ]; then
  echo 'Already installed.'

else
  echo ''
  echo 'INSTALLING'
  echo '----------'

  # Add Google public key to apt
  wget -q -O - "https://dl-ssl.google.com/linux/linux_signing_key.pub" | sudo apt-key add -

  # Add Google to the apt-get source list
  echo 'deb http://dl.google.com/linux/chrome/deb/ stable main' >> /etc/apt/sources.list

  # Update apt-get
  apt-get update

  # Install Java, Chrome, Xvfb, and unzip
  apt-get -y install openjdk-7-jre google-chrome-stable xvfb unzip

  # Download ChromeDriver and the Selenium server, then move both to /usr/local/bin
  cd /tmp
  wget "https://chromedriver.googlecode.com/files/chromedriver_linux64_2.2.zip"
  wget "https://selenium.googlecode.com/files/selenium-server-standalone-2.35.0.jar"
  unzip chromedriver_linux64_2.2.zip
  mv chromedriver /usr/local/bin
  mv selenium-server-standalone-2.35.0.jar /usr/local/bin

  # So that running `vagrant provision` doesn't redownload everything
  touch /.installed
fi

# Start Xvfb, Chrome, and Selenium in the background
export DISPLAY=:10
cd /vagrant

echo "Starting Xvfb ..."
Xvfb :10 -screen 0 1366x768x24 -ac &

echo "Starting Google Chrome ..."
google-chrome --remote-debugging-port=9222 &

echo "Starting Selenium ..."
cd /usr/local/bin
nohup java -jar ./selenium-server-standalone-2.35.0.jar &

#3. Run “vagrant up”

On your command line, in the directory where you created the Vagrantfile, run the following command:

vagrant up

This will kick off downloading and installing all the necessary pieces. It should look like this:

vagrant up


#4. Make sure it’s running

You can check to see if everything is working by going to http://localhost:4444/wd/hub.

The Vagrantfile forwards port 4444 from the virtual machine to your localhost, which gives you access to Selenium’s web UI. This page shows all the sessions running in your virtual machine. If you see this page, everything is OK.
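
The same check can be scripted from NodeJS. This is only a sketch: it assumes the Selenium 2 standalone server answers the JSON wire protocol status request at /wd/hub/status, and the helper names hubStatusUrl and checkHub are made up for this example.

```javascript
var http = require('http');

// Build the status URL for a Selenium server. Host and port default to the
// values forwarded by the Vagrantfile in this article.
function hubStatusUrl(host, port) {
  return 'http://' + (host || 'localhost') + ':' + (port || 4444) + '/wd/hub/status';
}

// Request the status page and report the result through a Node-style callback:
// callback(err) on a network failure, callback(null, statusCode, body) otherwise.
function checkHub(url, callback) {
  http.get(url, function (res) {
    var body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () { callback(null, res.statusCode, body); });
  }).on('error', callback);
}

// Example usage (uncomment while the virtual machine is running):
// checkHub(hubStatusUrl(), function (err, code, body) {
//   if (err) { return console.error('Selenium is not reachable:', err.message); }
//   console.log('HTTP ' + code, body);
// });
```

A connection error here usually means the virtual machine is not up yet or Selenium has not finished starting.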

WebDriver UI


#5. Install the selenium-webdriver

In order to write NodeJS scripts that talk to Chrome, you will need the selenium-webdriver module for NodeJS.

On your command line, install selenium-webdriver with the following command. This will install the modules needed for interacting with Selenium.

npm install selenium-webdriver
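
npm normally places the module in a node_modules folder, and Node looks for it by searching node_modules folders from your script’s directory upward. If you are unsure whether your script will find it, a small sketch like this can tell you. It uses Node’s require.resolve; the helper name isModuleInstalled is only for illustration.

```javascript
// Return true when Node can resolve the named module from this script's location.
function isModuleInstalled(name) {
  try {
    require.resolve(name);
    return true;
  } catch (err) {
    return false;
  }
}

console.log('selenium-webdriver installed:', isModuleInstalled('selenium-webdriver'));
```

If this prints false, run the npm install command from the same directory as your script.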

#6. Write your first Selenium script

This first script will go to Google’s homepage, type in a query, then print out the HTML.

var webdriver = require('selenium-webdriver');

var keyword = "chris le twitter";

var driver = new webdriver.Builder().
   usingServer('http://localhost:4444/wd/hub').
   withCapabilities(webdriver.Capabilities.chrome()).
   build();

driver.get('http://www.google.com');
driver.findElement(webdriver.By.name('q')).sendKeys(keyword);
driver.findElement(webdriver.By.name('btnG')).click();
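// Wait until the page source has been printed. Each nested promise must be
// returned so that the wait condition resolves to true.
driver.wait(function() {
  return driver.getTitle().then(function(title) {
    return driver.getPageSource().then(function(html) {
      console.log(html);
      return true;
    });
  });
}, 1000);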

driver.quit();
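
One detail about the wait callback is worth spelling out: driver.wait keeps polling until the callback’s result is truthy, and a promise resolves to whatever its then callback returns. The sketch below shows that rule with plain JavaScript promises instead of WebDriver ones, an assumption made so it runs without a browser. fakePageSource is a stand-in for driver.getPageSource().

```javascript
// Stand-in for driver.getPageSource(): resolves with some HTML.
function fakePageSource() {
  return Promise.resolve('<html></html>');
}

// The inner promise is started but not returned, so the outer promise
// resolves to undefined and a polling wait would never see a truthy value.
function withoutReturn() {
  return Promise.resolve('title').then(function (title) {
    fakePageSource().then(function (html) { return true; });
  });
}

// Returning the inner promise passes its result (true) up the chain.
function withReturn() {
  return Promise.resolve('title').then(function (title) {
    return fakePageSource().then(function (html) { return true; });
  });
}

withoutReturn().then(function (value) { console.log('without return:', value); });
withReturn().then(function (value) { console.log('with return:', value); });
```

The first call logs undefined and the second logs true, which is why a missing return makes the wait time out.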

#7. Run your test

Run your test with node (for example, node test.js if you saved the script as test.js). You should see the HTML that was rendered by the Chrome browser.

HTML from NodeJs


Use Cases

So now that you have this up and running what can you use it for?

Running your automated test suites: This is great for integration testing against Chrome, and most likely for testing responsive websites too.

Testing your Chrome Extensions: Debugging Chrome Extensions can be a bit of a pain. This could be your aspirin.

Taking many screenshots: Useful if you want to capture screenshots of many pages at once.

Scraping stubborn websites: I wasn’t able to scrape a website using PhantomJS because it fired JSONP requests long after the onLoad() event fired. Simply waiting for the event loop to empty itself wasn’t enough. With a combination of debugging in a real browser and Selenium, I was more successful at getting the DOM after the scripts had run.
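
The last two use cases can be sketched in code. This is a hedged illustration, not the code from the original scrape: driver.takeScreenshot() and driver.findElements() are selenium-webdriver calls, while saveScreenshot, screenshotToBuffer, pollUntil, the output file name, and the '.results' selector are hypothetical names chosen for the example. pollUntil is written with plain promises so it does not depend on a particular selenium-webdriver version.

```javascript
var fs = require('fs');

// Decode the base64-encoded PNG that driver.takeScreenshot() resolves with.
function screenshotToBuffer(base64Png) {
  return Buffer.from(base64Png, 'base64');
}

// Save a screenshot of the page the driver is currently on to filePath.
// Returns a promise that resolves with filePath once the file is written.
function saveScreenshot(driver, filePath) {
  return driver.takeScreenshot().then(function (base64Png) {
    fs.writeFileSync(filePath, screenshotToBuffer(base64Png));
    return filePath;
  });
}

// Call check() every intervalMs until it yields a truthy value or timeoutMs
// has elapsed. Resolves with that value and rejects on timeout or error.
function pollUntil(check, timeoutMs, intervalMs) {
  var deadline = Date.now() + timeoutMs;
  return new Promise(function (resolve, reject) {
    function attempt() {
      Promise.resolve().then(check).then(function (result) {
        if (result) { return resolve(result); }
        if (Date.now() >= deadline) {
          return reject(new Error('Timed out after ' + timeoutMs + ' ms'));
        }
        setTimeout(attempt, intervalMs);
      }, reject);
    }
    attempt();
  });
}

// Example usage, assuming `driver` and `webdriver` from step 6:
// pollUntil(function () {
//   return driver.findElements(webdriver.By.css('.results')).then(function (els) {
//     return els.length > 0;
//   });
// }, 10000, 250).then(function () {
//   return saveScreenshot(driver, 'results.png');
// });
```

Polling for a specific element, rather than waiting a fixed time, is one way to cope with content that arrives long after the load event.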

30 thoughts on “Running Headless Selenium with Chrome”

  1. Hi, I’m following along and I believe you forgot to include something like

    wget https://selenium.googlecode.com/files/selenium-server-standalone-2.35.0.jar

    In the `setup.sh`. Otherwise, nice article! Next step is trying to automate some WebRTC testing…

  2. In the vagrant config: Is the box_url necessary when using the aws provider?

    • It’s there so that if you’ve just installed Vagrant for the first time it will be able to find and download it.

  3. I have gone through the listed steps and this is what I am getting. When I try to connect with Chrome I get “No Data Received” on the page. Please advise to what I might be doing wrong.

    For help on any individual command run `vagrant COMMAND -h`
    GS020054:Auto_Test matthew.warner$ vagrant halt
    [default] Attempting graceful shutdown of VM…
    GS020054:Auto_Test matthew.warner$ vagrant up
    Bringing machine ‘default’ up with ‘virtualbox’ provider…
    [default] Setting the name of the VM…
    [default] Clearing any previously set forwarded ports…
    [default] Creating shared folders metadata…
    [default] Clearing any previously set network interfaces…
    [default] Preparing network interfaces based on configuration…
    [default] Forwarding ports…
    [default] — 22 => 2222 (adapter 1)
    [default] — 4444 => 4444 (adapter 1)
    [default] Booting VM…
    [default] Waiting for VM to boot. This can take a few minutes.
    [default] VM booted and ready for use!
    [default] Mounting shared folders…
    [default] — /vagrant
    [default] Running provisioner: shell…
    [default] Running: /var/folders/hq/rkfzl4yj2nj6z1xt2dkc89d42z29vy/T/vagrant-shell20130822-11923-b80kxs
    stdin: is not a tty
    Already installed.
    Starting Xvfb …
    Starting Google Chrome …
    Starting Selenium …

    • You’re running into the same problem as other people above. Try adding

      config.vm.network :forwarded_port, host: 9222, guest: 9222

      to your Vagrantfile. This will forward the chrome remote debugger port.

  4. Thought I’d leave a note after having some trouble – Chrome will refuse to start as root, which means that the selenium server process must not be started by root.

  5. Thanks for writing this, Chris. I’ve been looking for a good way to implement testing for the Chrome extension I’m building.

    I’m running into an issue. When I run `node test.js` (which is what I called my test file, just what you did in step 6, same code), I get an output of HTML and CSS and then I get:

    /Users/zack/node_modules/selenium-webdriver/lib/webdriver/promise.js:1542
    throw error;

    Did you run into this at all? I did some Googling but nothing helpful came up. Any idea how to resolve this one?

    • This is happening because the function passed to the driver.wait call isn’t returning ‘true’ as expected, you can have the return value resolve correctly by editing the code like this:

      driver.wait(function() {
        return driver.getTitle().then(function(title) {
          return driver.getPageSource().then(function(html) {
            console.log(html);
            return true;
          });
        });
      }, 1000);

      Notice the return driver.getTitle()…

      • I’m also getting something similar:
        …./node_modules/selenium-webdriver/lib/webdriver/promise.js:1702
        throw error;
        ^
        UnknownError: unknown error: unable to discover open pages
        (Driver info: chromedriver=2.2,platform=Linux 3.2.0-23-generic x86_64) (WARNING: The server did not provide any stacktrace information)
        Command duration or timeout: 20.80 seconds
        Build info: version: ‘2.35.0’, revision: ‘c916b9d’, time: ‘2013-08-12 15:42:01’
        System info: os.name: ‘Linux’, os.arch: ‘amd64’, os.version: ‘3.2.0-23-generic’, java.version: ‘1.7.0_51’
        Driver info: org.openqa.selenium.chrome.ChromeDriver
        at new bot.Error (/Users/tomhowe/node_modules/selenium-webdriver/lib/atoms/error.js:109:18)

        using 2.40.0 of selenium-webdriver

        I tried adding ‘return’ in front of driver.getPageSource() as per your suggestion but no improvement.

        • Try forwarding the port to the Chrome debugger in the Vagrantfile

          This was the same error I ran into, you should be able to clear it up by editing the Vagrantfile and adding this line

          config.vm.network :forwarded_port, host: 9222, guest: 9222

          • Sorry I should have said I added that line already. I put it at the bottom of the vagrant file like so..

            Vagrant.configure("2") do |config|
              config.vm.box = "precise64"
              config.vm.box_url = "http://files.vagrantup.com/precise64.box"
              config.ssh.forward_agent = true

              config.vm.provider :aws do |aws, override|
                aws.access_key_id = 'XXXX'      # Replace this
                aws.secret_access_key = 'XXXX'  # Replace this
                aws.keypair_name = 'XXXX'       # Replace this
                aws.ami = 'ami-7747d01e'        # ubuntu 12.04
                override.ssh.username = 'ubuntu'
                override.ssh.private_key_path = '~/.ssh/amazon-ubuntu.pem'
              end

              config.vm.provision :shell, :path => "setup.sh"
              config.vm.network :forwarded_port, guest:4444, host:4444
              config.vm.network :forwarded_port, host: 9222, guest: 9222
            end

            Is that the correct placement?

            I haven’t set any of the aws.keys – are these relevant?

            Thanks, Tom

            ~

          • I didn’t set up anything for AWS I’m running my VM locally. I also updated the download links to be up to date. Maybe using this setup will help you.

            _____VAGRANTFILE______
            # -*- mode: ruby -*-
            # vi: set ft=ruby :
            # Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
            VAGRANTFILE_API_VERSION = "2"
            Vagrant.configure("2") do |config|
              config.vm.box = "precise64"
              config.vm.box_url = "http://files.vagrantup.com/precise64.box"
              config.ssh.forward_agent = true
              config.vm.provision :shell, :path => "bootstrap.sh"
              config.vm.network :forwarded_port, host: 4444, guest: 4444
              config.vm.network :forwarded_port, host: 9222, guest: 9222
            end

            _____BOOTSTRAP.SH_____
            #!/bin/sh
            set -e
            if [ -e /.installed ]; then
              echo 'Already installed.'
            else
              echo ''
              echo 'INSTALLING'
              echo '----------'
              # Add Google public key to apt
              wget -q -O - "https://dl-ssl.google.com/linux/linux_signing_key.pub" | sudo apt-key add -
              # Add Google to the apt-get source list
              echo 'deb http://dl.google.com/linux/chrome/deb/ stable main' >> /etc/apt/sources.list
              # Update apt-get
              apt-get update
              # Install Java, Chrome, Xvfb, and unzip
              apt-get -y install openjdk-7-jre google-chrome-stable xvfb unzip
              # Download and copy the ChromeDriver to /usr/local/bin
              cd /tmp
              wget "http://chromedriver.storage.googleapis.com/2.9/chromedriver_linux64.zip"
              wget "https://selenium.googlecode.com/files/selenium-server-standalone-2.39.0.jar"
              unzip chromedriver_linux64.zip
              mv chromedriver /usr/local/bin
              mv selenium-server-standalone-2.39.0.jar /usr/local/bin
              # So that running `vagrant provision` doesn't redownload everything
              touch /.installed
            fi
            # Start Xvfb, Chrome, and Selenium in the background
            echo "Starting Xvfb ..."
            export DISPLAY=:10
            cd /vagrant
            Xvfb :10 -screen 0 1366x768x24 -ac &
            #xvfb-run --server-args="-screen 0, 1366x768x24" -ac &
            echo "Starting Google Chrome ..."
            google-chrome --remote-debugging-port=9222 &
            # --user-data-dir=remote-profile &
            # echo "Starting Xvfb ..."
            echo "Starting Selenium ..."
            cd /usr/local/bin
            # nohup java -jar ./selenium-server-standalone-2.39.0.jar &
            java -jar ./selenium-server-standalone-2.39.0.jar &

            ____vagrantTest.js_____
            var webdriver = require('selenium-webdriver');
            var keyword = "Diego Mejia";

            var driver = new webdriver.Builder().
              usingServer('http://localhost:4444/wd/hub/').
              withCapabilities(webdriver.Capabilities.chrome()).
              build();
            driver.get('http://www.google.com');
            driver.findElement(webdriver.By.name('q')).sendKeys(keyword);
            driver.findElement(webdriver.By.name('btnG')).click();
            driver.wait(function() {
              return driver.getTitle().then(function(title) {
                return driver.getPageSource().then(function(html) {
                  console.log(html);
                  return true;
                });
              });
            }, 1000);

            driver.quit();

            _____________
            Then you can just run the command 'vagrant up --provision' then 'node vagrantTest.js' and the test should run fine.

            It looks like Disqus cuts off links(“google.com/…”) when posted in a comment, you’re going to have to manually replace the text with the full link.

          • That works now! Thank you very much for taking the time to help. For anyone else using this, just be aware that the download links need to match the unzip commands for chromedriver_linux64 and selenium-server-standalone-2.39.0.jar.

          • I’m glad I could help!

  6. […] How to run Selenium with Chrome without a UI: Very interesting for testing. […]

  7. Sorry for asking but… what color scheme are you using?

  8. Was so stoked to find this. You just saved me a couple hours of research. Thanks for taking the time to share!

  9. Hi there Thanks for putting this together! I had the same problem as Matt below. Any idea what might be causing that?

  10. Hi Chris

    I am getting following error when I try to run my selenium code to launch chrome driver and test signin functionality in EC2 Box.

    Exception in thread “main” org.openqa.selenium.WebDriverException: unknown error: an X display is required for keycode conversions, consider using Xvfb

    (Session info: chrome=31.0.1650.63)

    (Driver info: chromedriver=2.8.240825,platform=Linux 3.2.0-36-virtual x86_64) (WARNING: The server did not provide any stacktrace information)

    Command duration or timeout: 83 milliseconds

    Build info: version: ‘2.33.0’, revision: ‘4ecaf82108b2a6cc6f006aae81961236eba93358’, time: ‘2013-05-22 12:00:17’

    System info: os.name: ‘Linux’, os.arch: ‘amd64’, os.version: ‘3.2.0-36-virtual’, java.version: ‘1.7.0_25’

    Session ID: 9e7150582926e064d5c93c97a87b9008

    I run Xvfb server like this, in another terminal:-

    Xvfb :1 -screen 0 1366x768x24 -ac

    Any help is appreciated.

    Thanks!

  11. I got this error while running the test suite

    WebDriverException: Message: u’unknown error: Chrome failed to start: exited abnormallyn (Driver info: chromedriver=2.9.248304,platform=Linux 3.2.0-23-generic x86_64)

    • This was the same error I ran into, you should be able to clear it up by editing the Vagrantfile and adding this line

      config.vm.network :forwarded_port, host: 9222, guest: 9222

  12. It looks good but I believe you are forwarding port 9222 for remote debugging the chrome instance. I’ve included an extra line in Vagrantfile

    config.vm.network :forwarded_port, host: 9222, guest: 9222

    This resolved the issues I had communicating with Chrome remotely.

  13. I am trying to run a test case where the user changes their profile picture. Clicking the Browse button opens another window where I need to pick a picture. Xvfb is not able to handle this window, so the test case fails. How can I assign an Xvfb display to this file browse window?

  14. Where exactly do your scripts go? The script example that does a google search, where is that file placed? Inside your project directory that you created? Because selenium webdriver did not install to that directory, it is elsewhere on my mac. And am I correct that the scripts will be .js files?

  15. What’s the difference between running as shown above (in a virtual machine) and running all on one ubuntu native host (chrome, client, server on the same machine)

  16. Awesome tutorial. I used it to make Django Selenium tests work, I would add a line in the setup.sh giving permissions to chromedriver:

    chmod +rx /usr/local/bin/chromedriver

    This will make Django test server able to access the driver. If not it will fail.

  17. Will the vagrant file work with Java?
