What is StepWright?

StepWright is a powerful Python library built on top of Microsoft Playwright that abstracts away the complexity of raw browser automation scripts.

Instead of writing imperative page.locator(...).click() commands buried deep in loops, StepWright allows you to define declarative scraping workflows using dictionaries or Dataclasses.

Why StepWright?

The Problem

Traditional web scraping scripts quickly become a monolithic mess of try/except blocks, time.sleep(), and deeply nested loops when dealing with pagination, infinite scrolls, or robust error handling.

The Solution

StepWright separates what you want to extract from how to extract it. By defining your flow as a series of abstract BaseStep instructions, the underlying execution engine handles:

Automatic Waiting & Retries: Never write a manual wait_for_selector again.
Robust Fallbacks: Provide arrays of selectors (ID, Class, XPath) and let the engine find the element.
Complex Navigations: Built-in handlers for IFrames, Virtual Scrolling, and multi-tab workflows.
Performance: Natively supports parallel execution of scraping scenarios.

Core Concepts

`TabTemplate`

A sequence of actions executed sequentially in a single Browser Tab. Think of this as one specific "Scraping Job".

`BaseStep`

The fundamental unit of instruction. A step can represent an action (click, input, navigate), a logic controller (foreach), or an extraction command (data).

`Collector`

An internal dictionary automatically managed by StepWright. As BaseSteps extract data from the page, the data is pushed into the active collector and aggregated hierarchically.

What is StepWright? ​

Why StepWright? ​

The Problem ​

The Solution ​

Core Concepts ​

TabTemplate ​

BaseStep ​

Collector ​