propensive/tarantula

Tarantula

Drive a web browser using the WebDriver protocol

Tarantula makes it possible to interact with a web browser through a programmatic interface. It provides an immutable API for controlling the web browser from Scala, through the WebDriver protocol.

Features

  • simulate keypresses and mouse clicks in a web browser
  • launch Chrome or Firefox programmatically
  • uses the standard WebDriver protocol
  • intuitive but typesafe syntax

Availability

Getting Started

Browser Sessions

Tarantula makes it possible to control a web browser programmatically from Scala. Currently Firefox and Chrome are supported.

All browser operations take place in a session, which may be started by calling the session method, specifying a port number, on a Browser object, either Chrome or Firefox. For example:

Firefox.session(8120):
  // Browser actions are carried out in this scope

Simple navigation

Within the session body, the browser object may be accessed and used to control the newly launched browser.
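One way a scoped API of this shape can be expressed in Scala 3 is with a context function, which passes the session to the body implicitly. The following is a toy model under that assumption, not Tarantula's source: ToySession, ToyBrowser and this browser accessor are hypothetical names, and nothing here launches a real browser.

```scala
// Toy model only: ToySession, ToyBrowser and this `browser` accessor are
// hypothetical stand-ins, not Tarantula's own definitions.
final class ToySession(val port: Int):
  private var closed = false
  def isOpen: Boolean = !closed
  def close(): Unit =
    closed = true

object ToyBrowser:
  // `body` is a context function, so the session is supplied to it implicitly
  // and is closed again when the body finishes, even if it throws.
  def session[A](port: Int)(body: ToySession ?=> A): A =
    val current = ToySession(port)
    try body(using current) finally current.close()

// Returns whichever session is in scope; it does not compile outside one.
def browser(using current: ToySession): ToySession = current

@main def sessionDemo(): Unit =
  val port = ToyBrowser.session(8120):
    browser.port
  println(port) // prints 8120
```

In this model, referring to browser outside a session body is a compile error, because no ToySession is available there.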

The browser object is an instance of WebDriver#Session, and includes several navigational methods:

  • navigateTo(url) - to send the browser to a particular URL
  • refresh() - to refresh the page
  • back() - to go back to the previous page
  • forward() - to go forward (assuming we have already gone back at least once)

The title() method returns the page title and url() returns the current URL, both as Text instances.
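For orientation, these methods line up with commands in the W3C WebDriver specification, each of which is an HTTP request against a session. The sketch below builds, but does not send, such requests using the JDK's HTTP client types. That Tarantula issues exactly these requests, and that the driver listens on localhost at the session's port, are assumptions; Command and request are hypothetical names that are not part of Tarantula.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.net.http.HttpRequest.BodyPublishers

// Hypothetical model of the WebDriver commands behind the navigational methods.
enum Command(val method: String, val path: String):
  case NavigateTo(target: String) extends Command("POST", "url")
  case Refresh extends Command("POST", "refresh")
  case Back extends Command("POST", "back")
  case Forward extends Command("POST", "forward")
  case Title extends Command("GET", "title")
  case CurrentUrl extends Command("GET", "url")

// Builds the request for a command, assuming a driver on localhost at `port`.
def request(port: Int, sessionId: String, command: Command): HttpRequest =
  val uri = URI.create(s"http://localhost:$port/session/$sessionId/${command.path}")
  val builder = HttpRequest.newBuilder(uri)
  command match
    case Command.NavigateTo(target) =>
      // Minimal JSON string escaping, enough for ordinary URLs.
      val escaped = target.replace("\\", "\\\\").replace("\"", "\\\"")
      builder.header("Content-Type", "application/json")
        .POST(BodyPublishers.ofString(s"""{"url": "$escaped"}""")).build()
    case other if other.method == "POST" =>
      // Commands without parameters still send an empty JSON object.
      builder.header("Content-Type", "application/json")
        .POST(BodyPublishers.ofString("{}")).build()
    case _ =>
      builder.GET().build()

@main def requestDemo(): Unit =
  val req = request(8120, "abc123", Command.NavigateTo("https://example.com/"))
  // prints: POST http://localhost:8120/session/abc123/url
  println(s"${req.method()} ${req.uri()}")
```

The session identifier abc123 is a placeholder; a real one is returned by the driver when a session is created.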

Accessing Elements

Within a particular page, it's possible to access an element with the element method, which takes, as a parameter, a way of locating that element, of which several different types are valid:

  • Text: finds an element by its link text
  • Selector: a type from Cataclysm, which finds an element by CSS selector
  • TagType, DomId, Cls: types from Honeycomb, which find an element by HTML tag, DOM ID or CSS class

For example, the link containing the text "here" could be selected with browser.element(t"here"), and the element which is an instance of an <img> HTML tag could be found with browser.element(Img), where the Img value is defined in Honeycomb. In both cases an Element instance will be returned, or an exception will be thrown if no matching element exists on the page.

HTML has a tree-based structure, so it's possible to select one element within another with repeated applications of the Element#element method, for example,

val link = browser.element(Nav).element(id"menu").element(t"About")

would find the link containing the text About in the element with ID menu which is inside a <nav> HTML element.

Accessing multiple elements

Often it's useful to find all matching elements on a page. This is served by the / method, which is defined on browser, on Element and as an extension on List[Element]. Although the method is defined on three types, it always returns a List[Element]; since this is one of the types defining /, a selection can be narrowed progressively with repeated applications. The infix syntax is particularly intuitive. For example,

for elem <- browser / id"menu" / Li / cls"checkbox" do elem.click()

would simulate a click on every element with the checkbox CSS class inside an <li> tag in the element with ID menu.
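The reason chaining works can be seen in a small self-contained model: when selecting from a single element and selecting from a list both return a list, each result is a valid starting point for the next selection. This is a toy sketch over an in-memory tree, not Tarantula's implementation; Node, byId, byTag and byClass are hypothetical stand-ins for Element and for locators such as id"menu", Li and cls"checkbox".

```scala
// Toy model only: Node and the by* helpers are hypothetical, not Tarantula's API.
// Node is a plain class, so equality is by reference and `distinct` only
// removes the same node found twice.
final class Node(
  val tag: String,
  val id: String = "",
  val classes: Set[String] = Set.empty,
  val children: List[Node] = Nil
):
  // Every node beneath this one, in document order.
  def descendants: List[Node] = children.flatMap(child => child :: child.descendants)

  // Selecting from a single node yields a list...
  def /(matches: Node => Boolean): List[Node] = descendants.filter(matches)

// ...and selecting from a list yields a list too, so selections chain.
extension (nodes: List[Node])
  def /(matches: Node => Boolean): List[Node] = nodes.flatMap(_ / matches).distinct

// Stand-ins for locators such as id"menu", Li and cls"checkbox".
def byId(id: String): Node => Boolean = _.id == id
def byTag(tag: String): Node => Boolean = _.tag == tag
def byClass(cls: String): Node => Boolean = _.classes.contains(cls)

@main def selectionDemo(): Unit =
  val first = Node("input", classes = Set("checkbox"))
  val second = Node("input", classes = Set("checkbox"))
  val stray = Node("input", classes = Set("checkbox")) // outside the menu
  val items = List(Node("li", children = List(first)), Node("li", children = List(second)))
  val page = Node("body", children = List(Node("ul", id = "menu", children = items), stray))
  val selected = page / byId("menu") / byTag("li") / byClass("checkbox")
  println(selected.length) // prints 2
```

The stray checkbox is not selected, because it is not a descendant of the element with ID menu.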

Status

Tarantula is classified as fledgling. For reference, Soundness projects are categorized into one of the following five stability levels:

  • embryonic: for experimental or demonstrative purposes only, without any guarantees of longevity
  • fledgling: of proven utility, seeking contributions, but liable to significant redesigns
  • maturescent: major design decisions broadly settled, seeking probatory adoption and refinement
  • dependable: production-ready, subject to controlled ongoing maintenance and enhancement; tagged as version 1.0.0 or later
  • adamantine: proven, reliable and production-ready, with no further breaking changes ever anticipated

Projects at any stability level, even embryonic projects, can still be used, as long as caution is taken to avoid a mismatch between the project's stability level and the required stability and maintainability of your own project.

Tarantula is designed to be small. Its entire source code currently consists of 141 lines of code.

Building

Tarantula will ultimately be built by Fury, when it is published. In the meantime, two possibilities are offered; however, they are acknowledged to be fragile, inadequately tested, and unsuitable for anything more than experimentation. They are provided only to give some answer to the question, "how can I try Tarantula?".

  1. Copy the sources into your own project

    Read the fury file in the repository root to understand Tarantula's build structure, dependencies and source location; the file format should be short and quite intuitive. Copy the sources into a source directory in your own project, then repeat (recursively) for each of the dependencies.

    The sources are compiled against the latest nightly release of Scala 3. There should be no problem compiling the project together with all of its dependencies in a single compilation.

  2. Build with Wrath

    Wrath is a bootstrapping script for building Tarantula and other projects in the absence of a fully-featured build tool. It is designed to read the fury file in the project directory, and produce a collection of JAR files which can be added to a classpath, by compiling the project and all of its dependencies, including the Scala compiler itself.

    Download the latest version of wrath, make it executable, and add it to your path, for example by copying it to /usr/local/bin/.

    Clone this repository inside an empty directory, so that the build can safely make clones of repositories it depends on as peers of tarantula. Run wrath -F in the repository root. This will download and compile the latest version of Scala, as well as all of Tarantula's dependencies.

    If the build was successful, the compiled JAR files can be found in the .wrath/dist directory.

Contributing

Contributors to Tarantula are welcome and encouraged. New contributors may like to look for issues marked beginner.

We suggest that all contributors read the Contributing Guide to make the process of contributing to Tarantula easier.

Please do not contact project maintainers privately with questions unless there is a good reason to keep them private. While it can be tempting to respond to such questions, private answers cannot be shared with a wider audience, and this can result in duplication of effort.

Author

Tarantula was designed and developed by Jon Pretty, and commercial support and training on all aspects of Scala 3 is available from Propensive OÜ.

Name

Tarantulas are spiders, known for making webs, and Tarantula is a library for the WebDriver protocol.

In general, Soundness project names are always chosen with some rationale, though it is usually frivolous. Each name is chosen more for its uniqueness and intrigue than its concision or catchiness, and there is no bias towards names with positive or "nice" meanings, since many of the libraries perform some quite unpleasant tasks.

Names should be English words, though many are obscure or archaic, and it should be noted how willingly English adopts foreign words. Names are generally of Greek or Latin origin, and have often arrived in English via a Romance language.

Logo

The logo represents the eight legs of a tarantula.

License

Tarantula is copyright © 2024 Jon Pretty & Propensive OÜ, and is made available under the Apache 2.0 License.
