software-engineering · test-automation · bdd · iot

The Framework Borrows From BDD. Here Is Where It Deliberately Departs.

Both Cucumber and the framework I am building descend from a readable spec into code. The departures are in how the document binds to that code, how much mechanism it holds on its own, and what it was built to test.

By Yoda · June 10, 2026 · 5 min read

I Borrowed From BDD.

Describe a system where readable documents execute as tests and a certain kind of engineer answers immediately: that is Cucumber. The reflex is fair, and I am not going to wave it away. My test automation framework borrows BDD's central idea, an executable specification a human can read, and I owe anyone asking a precise account of where it departs and why.

The similarities

A Cucumber test descends through layers. The feature file holds the sentence, a step-definition function holds the mechanism, and that function leans on page objects or helpers underneath. The framework I am building descends too. A step in the document can point through a code_ref to a spec driver, and that driver leans on shared test fixtures underneath. Both systems separate the readable specification from the code that executes it. When you go looking for the deepest mechanism, both put you in source code.

So the difference is not "no code." If that were the only axis, there would be no reason for the framework to exist, and I would have used Cucumber. The difference is in three seams that matter once the thing under test is physical hardware rather than a web application.

One: how the document binds to the code

This is the technical departure, and it is the one I care about most.

Cucumber binds a sentence to a function by pattern matching. You write "When the user logs in", and somewhere a step definition declares a regular expression or a Cucumber Expression that claims to match that sentence. The binding is inferred at runtime. When two patterns match the same sentence you get ambiguity. When none match you get an undefined step. The relationship between the words and the code is a guess that the framework resolves by searching.
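To make the two failure modes concrete, here is a toy model of pattern-based binding. It is not Cucumber's code: the registry, the `step` decorator, `resolve`, and both exception classes are names I made up for illustration, and it uses plain regular expressions rather than Cucumber Expressions.

```python
import re

class UndefinedStep(Exception):
    """No registered pattern matches the sentence."""

class AmbiguousStep(Exception):
    """More than one registered pattern matches the sentence."""

# Hypothetical registry of (compiled pattern, function) pairs.
STEP_DEFINITIONS = []

def step(pattern):
    """Register the decorated function as the handler for `pattern`."""
    def decorator(func):
        STEP_DEFINITIONS.append((re.compile(pattern), func))
        return func
    return decorator

def resolve(sentence):
    """Search every pattern at runtime; return (function, captured args)."""
    matches = []
    for pattern, func in STEP_DEFINITIONS:
        match = pattern.fullmatch(sentence)
        if match:
            matches.append((func, match.groups()))
    if not matches:
        raise UndefinedStep(sentence)
    if len(matches) > 1:
        names = ", ".join(func.__name__ for func, _ in matches)
        raise AmbiguousStep(f"{sentence!r} matches {names}")
    return matches[0]

@step(r"the user logs in")
def log_in():
    return "logged in"

@step(r"the user (.+)")
def user_does(action):
    return f"user {action}"
```

With these two definitions, `resolve("the user logs in")` raises `AmbiguousStep` because both patterns claim the sentence, `resolve("the device reboots")` raises `UndefinedStep`, and only `resolve("the user logs out")` finds exactly one function. Nothing in the sentence itself tells you which of those outcomes you will get.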

I have designed the framework to bind with an explicit named pointer. A step that needs procedural logic says code_ref: spec_drivers/zwave.py::onboard_lock. There is no pattern and no matching layer. The document names the exact function it runs, and the validator confirms every reference resolves before the run starts, rather than leaving the binding to be matched at runtime, where it can collide or come up undefined. The seam between intent and mechanism is a direct citation checked ahead of time, not a natural-language match resolved while the test executes.
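A minimal sketch of what checking references ahead of time could look like, assuming the `path::function` form shown above. The function names `parse_code_ref` and `unresolved_refs` are hypothetical, and so is the choice to inspect the driver with Python's `ast` module rather than import it; the real validator may work differently.

```python
import ast
from pathlib import Path

def parse_code_ref(code_ref):
    """Split 'path/to/file.py::function' into (path, function name)."""
    path, sep, name = code_ref.partition("::")
    if not sep or not path or not name:
        raise ValueError(f"malformed code_ref: {code_ref!r}")
    return path, name

def unresolved_refs(code_refs, root):
    """Return (code_ref, reason) for every reference that does not resolve.

    A reference resolves when the file exists under `root` and defines a
    top-level function with the given name. The file is parsed, not run.
    """
    problems = []
    for ref in code_refs:
        try:
            path, name = parse_code_ref(ref)
        except ValueError as exc:
            problems.append((ref, str(exc)))
            continue
        source_file = Path(root) / path
        if not source_file.is_file():
            problems.append((ref, "file not found"))
            continue
        tree = ast.parse(source_file.read_text(encoding="utf-8"))
        defined = {
            node.name
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        }
        if name not in defined:
            problems.append((ref, f"no function named {name!r}"))
    return problems
```

A runner built this way would refuse to start when `unresolved_refs` returns anything, so a broken binding surfaces before the first step touches a device rather than partway through a run.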

For a behaviour specification shared with business stakeholders, pattern matching is the right call: it keeps the sentence clean. For a test procedure an auditor has to trust a year later, an explicit reference to the exact code that ran is worth far more than a clean sentence.

Two: how much mechanism the document holds on its own

In Gherkin every step is an opaque sentence. The actual call always lives elsewhere, because by design the feature file carries no vocabulary for "GET this endpoint and assert a 200." Mechanism is meant to live outside the sentence, and for a behaviour spec shared with non-technical stakeholders that is the correct choice.

In the framework I am building, steps are typed, and many of them carry their full mechanism in the document. An api step states its method, its URL, the validation it applies, and the outputs it captures, all in the step itself. A shell step carries its command. A manual step carries the action and what the operator must confirm. The code_ref descent into a spec driver is the escape hatch for steps that genuinely need procedural logic, not the only road available. The result is that a large share of the test spec document is self-contained: you can read what most steps actually do without leaving the page, and you only descend into a driver for the steps that truly warrant it.
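One way to picture typed steps is as small records, one type per kind of step. This is a sketch under assumptions: the class names, the field names (`expect_status`, `capture`, `confirm`), and the example URL and command are all invented here, and the framework's actual step schema lives in the document format, not in Python classes.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class ApiStep:
    method: str
    url: str
    expect_status: int                                       # validation applied
    capture: Dict[str, str] = field(default_factory=dict)    # output name -> response field

@dataclass
class ShellStep:
    command: str

@dataclass
class ManualStep:
    action: str    # what the operator does
    confirm: str   # what the operator must confirm

@dataclass
class CodeRefStep:
    code_ref: str  # escape hatch into a spec driver

def is_self_contained(step):
    """True when the step's mechanism is fully stated in the document."""
    return not isinstance(step, CodeRefStep)

def summarize(step):
    """Describe what a step does using only what the document holds."""
    if isinstance(step, ApiStep):
        return f"{step.method} {step.url}, expect HTTP {step.expect_status}"
    if isinstance(step, ShellStep):
        return f"run: {step.command}"
    if isinstance(step, ManualStep):
        return f"operator: {step.action}; confirm: {step.confirm}"
    return f"see driver: {step.code_ref}"

# Hypothetical procedure mixing all four kinds of step.
procedure = [
    ManualStep(action="Insert the battery", confirm="Status LED blinks"),
    ApiStep("GET", "http://hub.example/api/locks", 200, {"lock_id": "id"}),
    ShellStep("ping -c 1 hub.example"),
    CodeRefStep("spec_drivers/zwave.py::onboard_lock"),
]
```

In this example three of the four steps can be summarized from the page alone, and only the last one sends the reader into a driver.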

Three: what kind of document it is

Gherkin is a dedicated language with a fixed grammar. A feature file is a list of scenarios in Given/When/Then form, and that constraint is deliberate, because a constrained grammar is easier for a mixed audience to agree on.

In my framework, a spec is plain Markdown carrying the full structure of a test procedure: objective, scope, preconditions, test data, the procedure itself, expected results, postconditions, references. Around the executable steps it holds ordinary prose, tables, and links, the context an engineer actually needs. It renders in any Markdown viewer and reads as a complete test document rather than a behaviour script. On top of that structure sit constructs that go beyond what a feature file expresses: parameter matrices richer than a simple Examples table, loops for endurance runs, and collectors that capture device logs and traffic as the run proceeds.
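To give a rough sense of how a parameter matrix and an endurance loop combine, here is a sketch that expands named axes into combinations and repeats them for a number of cycles. Everything in it is an assumption for illustration: the axis names, the `exclude` rules, and the functions `expand_matrix` and `endurance_iterations` are not the framework's actual syntax or API.

```python
from itertools import product

def expand_matrix(axes, exclude=()):
    """Cross every axis with every other, dropping excluded combinations.

    `axes` maps an axis name to its list of values. Each rule in `exclude`
    is a partial combination; any combo containing all of its pairs is dropped.
    """
    names = list(axes)
    combos = []
    for values in product(*(axes[name] for name in names)):
        combo = dict(zip(names, values))
        if any(all(combo.get(k) == v for k, v in rule.items()) for rule in exclude):
            continue
        combos.append(combo)
    return combos

def endurance_iterations(combos, cycles):
    """Yield (cycle number, combo) for every combo in every cycle."""
    for cycle in range(1, cycles + 1):
        for combo in combos:
            yield cycle, combo

# Hypothetical axes for a lock test.
axes = {"firmware": ["1.2", "1.3"], "battery": ["alkaline", "lithium"]}
exclude = [{"firmware": "1.2", "battery": "lithium"}]
combos = expand_matrix(axes, exclude)
```

Here the four raw combinations shrink to three after the exclusion, and running them for 340 cycles yields 1020 iterations, which is the kind of run that a flat table of examples does not express on its own.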

A product manager reads the objective, the scope, and the step-level actions and understands exactly what the hardware team verifies and why. They skip the code_ref lines, the same way they would skip a step definition, and lose nothing they came for. The document serves the engineer first and the reviewer well, because progressive structure lets each read to the depth they need.

Why this shape, for this domain

The departures are not stylistic. Each one earns its place against a property of hardware testing that web behaviour does not have.

Hardware tests must record what physically happened, so the framework captures evidence as the run proceeds and renders the executed document with each step's status, timestamps, and links to the logs and traces it gathered. Hardware tests include steps no code can perform, such as inserting a battery or watching a deadbolt throw, so manual steps are first class and their operator confirmations land in the same evidence trail as everything else. Hardware tests run for days to catch the failure that only appears on cycle three hundred and forty, so the runner has a long-lived service mode that streams progress. Cucumber gives you a test run with a start and a finish and no long-lived service mode.
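The idea of one evidence trail for automated and manual steps alike can be sketched as a single record type. The field names, the status strings, and the `record_step` helper below are assumptions made for this example, not the framework's real data model; the operator's confirmation is passed in as a callable so the sketch runs without a person at the keyboard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class StepEvidence:
    step_id: str
    kind: str                      # e.g. "api", "shell", "manual"
    status: str                    # "passed", "failed", or "error"
    started_at: datetime
    finished_at: datetime
    artifacts: List[str] = field(default_factory=list)  # links to logs and traces
    confirmed_by: Optional[str] = None                   # operator, for manual steps

def record_step(trail, step_id, kind, action, artifacts=(), confirmed_by=None):
    """Run `action`, then append what happened to the shared evidence trail.

    `action` returns a truthy value on success. For a manual step it stands
    in for the operator's confirmation.
    """
    started = datetime.now(timezone.utc)
    try:
        status = "passed" if action() else "failed"
    except Exception:
        status = "error"
    entry = StepEvidence(step_id, kind, status, started,
                         datetime.now(timezone.utc), list(artifacts), confirmed_by)
    trail.append(entry)
    return entry

# Hypothetical run: a manual confirmation and a failing API check share one trail.
trail = []
record_step(trail, "1", "manual", lambda: True, confirmed_by="operator-1")
record_step(trail, "2", "api", lambda: False, artifacts=["logs/step-2.txt"])
```

Because the manual step and the API step produce the same kind of record, a renderer can walk one list to annotate every step in the executed document with its status, timestamps, and links.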

The honest cost

What the departure costs me is the ecosystem. Cucumber has been around for well over fifteen years, has implementations for most major languages, and comes with IDE support; an answer to almost every integration question is a search away. I built a dialect shaped for one domain, and that precision buys no community in any other.

So this is not a claim that Cucumber is wrong. For describing application behaviour and aligning business with engineering on what to build, Cucumber wins, and it is not close. The framework I am building takes BDD's best idea, a specification that executes, and reshapes it for a domain that benefits from explicit named binding instead of pattern matching, mechanism held in the document instead of always behind it, sensible abstractions, plain Markdown instead of a fixed grammar, and a runtime that treats physical devices, manual steps, and week-long runs as first-class facts rather than edge cases.
