

TL;DR
- Traditional functional testing breaks at scale due to brittle, structure-based scripts and high maintenance overhead
- QA teams spend a large share of effort maintaining tests instead of improving coverage or finding real defects
- Codeless and AI-assisted tools reduce effort but still rely on human-defined test logic
- Agentic AI enables autonomous test creation, execution, and adaptation based on behavioral goals
- This shifts testing from script maintenance to outcome-driven validation
The introduction of AI coding tools has made developers more productive, causing management to expect new features, products, and services to be delivered faster than ever. But can QA teams keep up with this accelerated pace of development?
Traditional QA testing involves a script-writing approach, which includes gathering requirements from multiple sources (such as user stories, product requirement documents, and manual test cases), writing the test scripts, verifying and validating the tests (e.g., debugging the locator, fixing test data, adjusting wait times), and, finally, executing the tests.
But legacy QA testing is time-consuming; the process of creating tests can be painfully slow, particularly when writing across fragmented frameworks.
For example, QA might attempt to automate a single business process using Playwright for desktop web, Appium for mobile web, XCUITest for iOS native, and Appium again for Android native.
This essentially means automating the same thing four different times.
These scripts also break often and require resource-intensive maintenance.
The downstream impact is a suite full of false failures: teams can spend nearly half of their test-analysis time scanning through them just to find the genuine failures that developers need to fix.
All these issues simply won’t scale with this new software development reality, where time is limited, Agile teams are releasing features on a daily and weekly basis, and the test depth keeps growing.
The actual problem with functional testing at scale
Today’s functional testing is implemented as a contract, where a tester describes the expected behavior of the software in code, which is then enforced by the test suite. For example, an application with only 20 screens and mostly stable/static UIs will require only a simple contract.
However, for an enterprise application consisting of hundreds of flows, having various dependencies, with A/B tests all running in parallel, this contract can become a liability.
Issues quickly arise when a developer makes changes that cause tests to fail, not because the application’s expected behavior has changed, but because the tests were written to match the UI’s structure rather than its meaning.
This leads to extra maintenance work by the QA team, triaging false failures instead of finding real ones.
According to Capgemini’s World Quality Report, QA teams spend 60-80% of their total test effort on maintenance alone.
And this does not include writing new tests, expanding test coverage, or catching bugs. Rather, all this work is spent just keeping existing scripts from falling apart.
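To make the structure-versus-meaning distinction concrete, here is a minimal Python sketch. The element names and DOM shape are invented for illustration; no specific tool works exactly this way:

```python
# Toy sketch: a structural "contract" asserts on an element id, so a
# harmless rename breaks the test even though behavior is unchanged.
def find_by_id(dom, element_id):
    return next((el for el in dom if el["id"] == element_id), None)

dom_v1 = [{"id": "submit-btn", "label": "Place order"}]
dom_v2 = [{"id": "order-submit", "label": "Place order"}]  # id renamed in a refactor

# Structural assertion: passes on build v1, reports a false failure on v2.
assert find_by_id(dom_v1, "submit-btn") is not None
assert find_by_id(dom_v2, "submit-btn") is None

# A behavior-level check still passes: the "Place order" action exists in both builds.
assert any(el["label"] == "Place order" for el in dom_v2)
```

The second assertion is exactly the kind of false failure a QA team then has to triage by hand.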
Where traditional automation stops and agentic AI begins
Scriptless/codeless tools helped to lower the barrier to entry for test automation. This enabled manual testers, business analysts, and non-developers to create and maintain tests using visual interfaces and record-and-playback approaches—all without needing to write code.
The advent of gen-AI-assisted testing attempts to solve this challenge further by introducing self-healing locators, auto-generated test suggestions, and smart element identification.
But this still follows the same model: a human defines the test logic while the AI handles execution and maintenance.
In this setup, the human remains the bottleneck in test creation, and coverage extends only as far as what the engineer thought to test. Even so, these tools have already changed the way testers work, and agentic AI now takes things a step further.
Agentic AI refers to a set of systems with a high degree of autonomy that seek to achieve a specific goal by making decisions, taking actions, and adapting to circumstances independently.
Agentic AI transforms traditional QA by deploying AI-powered agents that can independently analyze application behavior at runtime and autonomously detect anomalies, crashes, and performance bottlenecks.
AI can now be embedded into all stages of the testing lifecycle, from test design to test automation to testing execution and, finally, reporting.
For functional testing, this means scripts, hand-coded locators, and step-by-step human-authored test cases are not required before the agent touches the UI. The agent can also simulate real-world scenarios and generate intelligent test cases.
Script-based vs. scriptless (codeless) vs. AI-assisted vs. agentic testing: A comparison
The table below compares the four approaches.
| Dimension | Script-based | Scriptless/codeless (record & playback) | AI-assisted testing | Agentic testing |
| --- | --- | --- | --- | --- |
| Test authoring | Manual step-by-step scripts by QA engineers/testers, mostly in Python, Java, or JavaScript | Visual UI recording; clicks and inputs are captured automatically | AI generates or suggests scripts from prompts/natural language | The goal is defined in natural language; agents autonomously discover, generate, and prioritize test cases from requirements, code changes, or goals |
| Locator strategy | Hardcoded CSS/XPath selectors | Selectors are captured at record time | Self-healing selectors | Multi-modal element understanding |
| Failure handling | Stops on unexpected state | Playback breaks on any UI deviation from the recorded state | Retries with healed locators | Reasons, adapts, and continues |
| Maintenance trigger | Every structural UI change: a renamed class, a moved element, a refactored component, or an updated selector | Any visual or structural UI change that deviates from the recorded state, including minor layout changes | Broken locators that self-healing cannot fix, or logic changes that require human re-authoring | Only genuine behavioral changes, i.e., when the application’s expected outcomes diverge from the defined goals, not when the UI structure merely shifts |
| Test coverage | Limited to predefined paths; misses edge cases | Slightly better than manual testing; covers recorded flows plus some variations | Improved via AI suggestions and hints | Broad and dynamic |
How agentic functional testing works: Under the hood
The reason agentic testing can survive rapid UI changes, unexpected interruptions, and shifting application states comes down to a few core mechanisms.
1. Multi-modal application perception
Scripts interact with the DOM by looking for selectors like `#submit-btn`, `.form-input`, or `[data-testid="checkout"]`. When these attributes change, the test breaks.
Agentic testing, on the other hand, works by building a comprehensive model of your application by simultaneously combining multiple factors, such as the DOM structure, rendered visual layout, ARIA accessibility attributes, the role of semantic elements, and behavioral patterns observed during previous executions.
This enables the agent to correctly identify a “Confirm Order” button even if its class, ID, or position changes, because the AI understands the element’s context, not just its selector.
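A minimal sketch of this idea in Python, resolving an element by its semantic role and visible label instead of its selector (the element model and matching rule are simplified assumptions, not any vendor's actual perception pipeline):

```python
from dataclasses import dataclass

@dataclass
class Element:
    selector: str  # structural handle (id/class); may change between builds
    role: str      # semantic/ARIA role
    text: str      # visible label
    region: str    # rough visual position on the page

def resolve(elements, *, role, text_hint):
    """Find an element by meaning (role + label) instead of by selector."""
    for el in elements:
        if el.role == role and text_hint.lower() in el.text.lower():
            return el
    return None

# Build 1: the button has id #confirm-btn.
page_v1 = [Element("#confirm-btn", "button", "Confirm Order", "footer")]
# Build 2: id renamed and element moved -- a selector-based test would break.
page_v2 = [Element("#order-submit", "button", "Confirm order", "sidebar")]

for page in (page_v1, page_v2):
    el = resolve(page, role="button", text_hint="confirm order")
    assert el is not None  # found in both builds despite the structural changes
```

A real agent combines far more signals (visual layout, behavioral history), but the principle is the same: identity comes from context, not from a brittle handle.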
2. Goal-directed test planning
Rather than acting on a prewritten series of steps, an agentic system receives a goal/objective and then goes on to deduce its own plan and roadmap to achieve the task.
In most cases, this plan is not static. The agent adapts it as it progresses, responding to whatever roadblocks it encounters.
If a modal pops up mid-flow or a step takes a long time to complete, the agent will proceed to reevaluate the current situation and adjust.
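The re-plan-on-interruption loop can be sketched in a few lines of Python. This is a toy simulation under invented state names (`modal_open`, `ready`), not a production planner:

```python
def run_agent(goal_steps, app):
    """Goal-directed loop: hold a plan, observe before each step,
    and insert recovery actions when the state is unexpected."""
    plan = list(goal_steps)
    log = []
    while plan:
        state = app.observe()
        if state == "modal_open":        # unexpected interruption mid-flow
            app.dismiss_modal()          # insert a recovery action...
            log.append("dismiss_modal")  # ...then continue with the same plan
            continue
        app.perform(plan[0])
        log.append(plan.pop(0))
    return log

class FakeApp:
    """Simulates a UI where a promo modal appears after the first action."""
    def __init__(self):
        self.actions = 0
        self.modal = False
    def observe(self):
        return "modal_open" if self.modal else "ready"
    def perform(self, step):
        self.actions += 1
        if self.actions == 1:
            self.modal = True
    def dismiss_modal(self):
        self.modal = False

log = run_agent(["open_cart", "enter_address", "pay"], FakeApp())
# The agent dismissed the modal and still completed every planned step.
```

A scripted test hitting the same modal would simply fail at the second step; the agent treats it as a detour, not a defect.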
3. Behavioral outcome verification
While most functional tests assert on structural elements, an agentic system validates at the behavioral level, assessing whether the outcome of the workflow matches the defined goal. It forms this judgment by combining multiple signals.
To determine that a checkout process was successful, an agentic system can look for an order confirmation toast or page, the reset of the cart state, and similar signals, rather than relying on a single assertion on an element ID, which might pass even though the actual transaction quietly failed.
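A toy version of that multi-signal judgment in Python (the state keys are illustrative assumptions):

```python
def checkout_succeeded(app_state):
    """Judge the outcome from several independent signals,
    not from a single element assertion."""
    signals = [
        app_state.get("confirmation_shown", False),  # confirmation toast/page
        app_state.get("cart_items") == [],           # cart was reset
        app_state.get("order_recorded", False),      # backend registered the order
    ]
    return all(signals)  # require the signals to agree

# A lone "confirmation element present" assertion would pass on this state,
# even though the transaction quietly failed server-side.
flaky = {"confirmation_shown": True, "cart_items": ["sku-1"], "order_recorded": False}
good  = {"confirmation_shown": True, "cart_items": [], "order_recorded": True}

assert not checkout_succeeded(flaky)
assert checkout_succeeded(good)
```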
Best practices for agentic functional testing
Most teams that are not fully productive with agentic testing don’t have a tooling problem; they have a setup and adoption problem, and the fixes are straightforward.
1. Write precise goal definitions
This is one of the best things to do to improve your agentic test quality. Definitions like “Test the admin dashboard” are not going to be helpful.
“Verify that a user without admin permissions cannot access or modify organization settings, and confirm they receive an appropriate error response” is an appropriate goal.
The precision of the input largely determines the depth of the agent’s test coverage. With this in mind, goal definition should be treated as a first-class activity and be subject to review by product and engineering teams.
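One way to make goal precision reviewable is to capture the actor, action, and expected outcomes as structured data. The field names below are a hypothetical convention, not any tool's schema:

```python
# Hypothetical goal definition: the same intent, vague vs. precise.
vague_goal = "Test the admin dashboard"

precise_goal = {
    "actor": "a user without admin permissions",
    "action": "attempt to open and modify organization settings",
    "expected": [
        "access is denied",
        "settings remain unchanged",
        "an appropriate error response is shown",
    ],
}

# A precise goal is reviewable: product and engineering can sign off
# on each expected outcome individually.
assert len(precise_goal["expected"]) == 3
```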
2. Audit your false positive rate
Before choosing to move forward with any agentic solution, pull the failure data from the last set of releases, separating locator and script failures from the genuine behavioral failures.
If structural failures account for more than 20% of your total failures, agentic testing addresses that problem directly.
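The audit itself is simple arithmetic. A sketch in Python, assuming you have tagged each past failure with a root cause (the log format here is invented):

```python
from collections import Counter

# Hypothetical failure log from recent releases. "locator" and "script"
# failures are structural; "behavioral" failures are genuine bugs.
failures = [
    {"test": "checkout_smoke",  "cause": "locator"},
    {"test": "login_flow",      "cause": "behavioral"},
    {"test": "search_filter",   "cause": "locator"},
    {"test": "profile_update",  "cause": "script"},
    {"test": "cart_totals",     "cause": "behavioral"},
]

counts = Counter(f["cause"] for f in failures)
structural = counts["locator"] + counts["script"]
structural_share = structural / len(failures)

print(f"Structural failure share: {structural_share:.0%}")
if structural_share > 0.20:
    print("Agentic testing is likely to pay off here.")
```

In this made-up sample, 3 of 5 failures are structural (60%), well past the 20% threshold.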
3. Start with your highest-churn, highest-value flows
Do not attempt to migrate your full suite all at once. Identify the flows that carry the most risk and break most frequently, apply agentic testing to those first, and run it in parallel with your existing suite for a few releases. Then measure the difference in false positive rate before expanding the scope.
4. Connect to CI/CD from day one
An agentic system that only runs on demand doesn’t deliver its full value. Integrate it into your pipeline and trigger it on every build.
How Tricentis supports agentic functional testing
In 2025, Tricentis introduced agentic test automation, which is embedded into Tricentis Tosca.
This autonomous agent can create complete, end-to-end functional tests across enterprise technologies from a natural-language description of the business processes, user journeys, or functional requirements.
Users describe what they want, and agentic test automation will intelligently generate the automated test cases. These tests verify core application behavior against expected business outcomes.
It can then analyze past tests, make intelligent decisions based on them, and even adapt to your tech stack using Tricentis’ proprietary Vision AI technology.
Alongside it, Tricentis introduced remote Model Context Protocol (MCP) servers, giving customers more flexibility to incorporate their own AI models in a fast and easy way.
In summary
Agentic functional testing gives QA teams under pressure a way to move faster without sacrificing coverage, freeing them from the tedious maintenance cycle and refocusing their effort on the failures that actually matter.
For teams ready to adopt it, achieving success comes down to a few key practices: writing precise goal definitions, auditing your false positive rate, starting with your highest-value flows, and connecting to CI/CD from day one.
Tricentis Tosca makes all of this possible at enterprise scale today.
This post was written by Wisdom Ekpotu. Wisdom is a software and technical writer based in Nigeria who is passionate about web/mobile technologies, open source, and building communities. He also helps companies improve the quality of their technical documentation.