Skip to content

Selenium + AT-SPI = GUI Testing

Wednesday, 14 December 2022


At KDE we have multiple levels of quality assurance ranging from various degrees of a humans testing features to fully automated testing. Indeed automated testing is incredibly important for the continued quality of our software. A big corner stone of our testing strategy are so called unit tests, they test a specific piece of our software for its behavior in isolation. But for many aspects of our software we need a much higher level view, testing pieces of Plasma’s application launcher in isolation is all good and well but that won’t tell us if the entire UI can be easily navigated using the keyboard. For this type of test we require a different testing approach altogether. A couple months ago I’ve set set out to create a testing framework for this use case and I’m glad to say that it has matured enough to be used for writing tests. I’d like to walk you through the technical building blocks and a simple example.

Let us start of by looking at the architecture at large. So… there’s Selenium which is an incredibly popular, albeit web-oriented, testing framework. Its main advantages for us are its popularity and that it sports a server-client split. This means we can leverage the existing client tooling available for Selenium without having to write anything ourselves, we only need to grow a server. The server component, called a WebDriver, implements the actual interaction with UI elements and is generic enough to also apply to desktop applications. Indeed so thought others as well: there already exists Appium - it extends Selenium with more app-specific features and behaviors. Something for us to build upon. The clients meanwhile are completely separate and talk to the WebDriver over a well defined JSON REST protocol, meaning we can reuse the existing clients without having to write anything ourselves. They are available in a multitude of programming languages, and who knows maybe we’ll eventually get one for writing Selenium tests in QML ;)

That of course doesn’t explain how GUI testing can work with this on Linux. Enter: AT-SPI. AT-SPI is an accessibility API and pretty much the standard accessibility system for use on Linux. Obviously its primary use is assistive technologies, like the screen reader Orca, but to do its job it essentially offers a toolkit-independent way of introspecting and interacting with GUI applications. This then gives us a way to implement a WebDriver without caring about the toolkit or app specifics. As long as the app supports AT-SPI, which all Qt apps do implicitly, we can test it.

Since all the client tooling is independent of the server all we needed to get GUI testing going was a WebDriver that talks to AT-SPI.

That is what I set out to write and I’m happy to announce that we now have an AT-SPI based WebDriver, and the first tests are popping into existence already. There is also lovely documentation to hold onto.

So, without further ado. Let us write a simple test. Since the documentation already writes one in Python I’ll use Ruby this time around so we have some examples of different languages. A simple candidate is KInfoCenter. We can test its search functionality with a couple of lines of code.

First we need to install selenium-webdriver-at-spi, clone it, cmake build it, and cmake install it. You’ll also need to install the relevant client libraries. For ruby that’s simply running gem install appium_lib.

Then we can start with writing our test. We will need some boilerplate setup logic. This is more or less the same for every test. For more details on the driver setup you may also check the wiki page.

  def setup
    @appium\_driver = Appium::Driver.new(
      {
        'caps' => { app: 'org.kde.kinfocenter.desktop' },
        'appium\_lib' => {
          server\_url: 'http://127.0.0.1:4723',
          wait\_timeout: 10,
          wait\_interval: 0.5
        }
      }, true
    )
    @driver = @appium\_driver.start\_driver
  end

The driver will take care of starting the correct application and make sure that it is actually running correctly. Next we’ll write the actual test. Let’s test the search. The first order of business is using a tool called Accerciser to inspect the AT-SPI presentation of the application. For more information on how to use this tool please refer to the wiki. Using Accerciser I’ve located the search field and learned that it is called ‘Search’. So, let’s locate it and activate it, search for the CPU module:

  def test\_search
    search = driver.find\_element(:name, 'Search')
    search.click
    search.send\_keys('cpu')

Next let us find the CPU list item and activate it:

    cpu = driver.find\_element(:class\_name, '\[list item | CPU\]')
    assert(cpu.displayed?)
    cpu.click

And finally let’s assert that the page was actually activated:

    cpu\_tab = driver.find\_element(:class\_name, '\[page tab | CPU\]')
    assert(cpu\_tab.displayed?)

To run the complete test we can use the run wrapper: selenium-webdriver-at-spi-run ./kinfocentertest.rb (mind that it needs to be +x). If all has gone well we should get a successful test.

Finished in 1.345276s, 0.7433 runs/s, 1.4867 assertions/s.

1 runs, 2 assertions, 0 failures, 0 errors, 0 skips
I, \[2022-12-14T13:13:53.508516 #154338\]  INFO -- : tests done
I, \[2022-12-14T13:13:53.508583 #154338\]  INFO -- : run.rb exiting true

This should get you started with writing a test for your application! I’ll gladly help and review your forthcoming tests. For more detailed documentation check out the writing-tests wiki page as well as the appium command reference.

Of course the work is not done. selenium-webdriver-at-spi is very much still a work in progress and I’d be glad for others to help add features as they become needed. The gitlab project is the place for that. <3

The complete code of the example above:

#!/usr/bin/env ruby
# frozen\_string\_literal: true

# SPDX-License-Identifier: GPL-2.0-only OR GPL-3.0-only OR LicenseRef-KDE-Accepted-GPL
# SPDX-FileCopyrightText: 2022 Harald Sitter <sitter@kde.org>

require 'appium\_lib'
require 'minitest/autorun'

class TestKInfoCenter < Minitest::Test
  attr\_reader :driver

  def setup
    @appium\_driver = Appium::Driver.new(
      {
        'caps' => { app: 'org.kde.kinfocenter.desktop' },
        'appium\_lib' => {
          server\_url: 'http://127.0.0.1:4723',
          wait\_timeout: 10,
          wait\_interval: 0.5
        }
      }, true
    )
    @driver = @appium\_driver.start\_driver
  end

  def teardown
    driver.quit
  end

  def test\_search
    search = driver.find\_element(:name, 'Search')
    search.click
    search.send\_keys('cpu')

    cpu = driver.find\_element(:class\_name, '\[list item | CPU\]')
    assert(cpu.displayed?)
    cpu.click

    cpu\_tab = driver.find\_element(:class\_name, '\[page tab | CPU\]')
    assert(cpu\_tab.displayed?)
  end
end