How to Write Tests That Survive a UI Redesign

A redesign ships. The product looks great. And then someone opens the CI dashboard and finds 60% of the E2E suite failing — not because any behavior changed, but because a button moved, a class name got renamed, or a component got swapped out.

This is the most common reason teams lose confidence in their test suite. The tests weren't wrong; they were just brittle. Here's how to write tests that don't fall apart when the UI changes.

The Root Cause: Testing Structure Instead of Behavior

Brittle tests share a common pattern: they're written from the perspective of the implementation, not the user.

// Brittle — tied to DOM structure and CSS classes
await page.locator('.hero-section > div:nth-child(2) > button.btn-primary').click()
 
// Also brittle — tied to an ID attribute a developer can rename or remove
await page.locator('#submit-btn').click()

When the redesign moves that button, wraps it in a new container, or renames the class, the test fails — even though the user experience is unchanged.

The fix isn't a better selector. It's selecting from the user's perspective in the first place.

Use Semantic Selectors

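Playwright's built-in locators (getByRole, getByLabel, getByText) find elements the way a user or assistive technology would: by ARIA role and accessible name, by the label associated with a form control, or by visible text, rather than by position in the DOM. A button is a button regardless of where it lives in the HTML.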

// Resilient — works as long as the button exists and is labeled correctly
await page.getByRole('button', { name: 'Submit' }).click()
await page.getByLabel('Email address').fill('user@example.com')
await page.getByRole('link', { name: 'Get started' }).click()

These selectors survive:

Component restructuring
CSS class renames
Framework migrations (React → Vue, etc.)
Layout changes

They break when the label changes — which is the right time for a test to break. If you rename "Submit" to "Send message," that's a UX change that should be reviewed.

Use `data-testid` Strategically

Role-based selectors cover most cases, but sometimes you have a component with no meaningful semantic role — a custom drag handle, a canvas-based chart, a card in a Kanban board. For these, data-testid is the right tool.

// In your component
<div data-testid="pricing-tier-pro">...</div>

// In your test
await page.getByTestId('pricing-tier-pro').click()

The key is treating data-testid as a stable contract. Once it's there, it shouldn't change unless the feature itself changes. Agree with your team that these attributes are owned by QE, not subject to routine cleanup.

Tip

Add a lint rule or PR checklist item that flags removal of data-testid attributes. It prevents the "I cleaned up some unused attributes" commit from silently breaking your suite.
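
One way to automate that check is a small script that reads a unified diff and reports test IDs that appear on removed lines but not on added ones. The sketch below is a minimal illustration, not a drop-in lint rule: the function name findRemovedTestIds is hypothetical, it assumes the attribute is written as a plain string literal (data-testid="..."), and it only inspects the diff text you pass in.

```typescript
// Hypothetical helper: reports test IDs that a diff removes
// without re-adding them anywhere else in the same diff.
function findRemovedTestIds(diff: string, attribute = 'data-testid'): string[] {
  // Matches attribute="value" or attribute='value'
  const pattern = new RegExp(`${attribute}=["']([^"']+)["']`, 'g')
  const removed = new Set<string>()
  const added = new Set<string>()

  for (const line of diff.split('\n')) {
    // Heuristic: skip file headers such as "--- a/file.tsx" and "+++ b/file.tsx"
    if (line.startsWith('---') || line.startsWith('+++')) continue

    const target = line.startsWith('-') ? removed : line.startsWith('+') ? added : null
    if (!target) continue

    for (const match of line.matchAll(pattern)) {
      target.add(match[1])
    }
  }

  // An ID that only moved (removed in one place, added in another) is not flagged
  return Array.from(removed).filter((id) => !added.has(id)).sort()
}
```

In CI, you could feed it the output of git diff against the target branch and fail the job, or post a review comment, when the returned list is not empty.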

If your team already has a convention for test attributes — or you'd rather not ship data-testid on production elements — Playwright lets you configure a custom attribute name in playwright.config.ts:

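// playwright.config.ts
import { defineConfig } from '@playwright/test'

export default defineConfig({
  use: {
    // getByTestId() will look for this attribute instead of data-testid
    testIdAttribute: 'data-pw',
  },
})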

// In your component — using your team's convention
<button data-pw="submit-order">Place order</button>

// In your test — getByTestId still works, just looks for data-pw now
await page.getByTestId('submit-order').click()

The attribute name is a team decision, but whatever you pick, keep it consistent across the codebase. Mixed conventions (data-testid in some places, data-cy in others) are harder to audit and easier to accidentally break.
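
To check whether a codebase mixes conventions, you can count how often each candidate attribute appears in component source. The sketch below is one hypothetical way to do that. The function names countTestAttributes and hasMixedConventions and the default list of attribute names are assumptions for illustration, so adjust them to whatever your team actually uses.

```typescript
// Hypothetical helper: counts occurrences of each test-attribute convention
// in a piece of component source, so mixed usage is easy to spot.
function countTestAttributes(
  source: string,
  attributes: string[] = ['data-testid', 'data-pw', 'data-cy', 'data-test'],
): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const attribute of attributes) {
    // Require "=" right after the name so "data-test" does not also match "data-testid"
    const pattern = new RegExp(`${attribute}=`, 'g')
    counts[attribute] = (source.match(pattern) ?? []).length
  }
  return counts
}

// More than one convention in use suggests the codebase needs a cleanup pass
function hasMixedConventions(counts: Record<string, number>): boolean {
  return Object.values(counts).filter((n) => n > 0).length > 1
}
```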

Test at the Right Level of Abstraction

Most brittle tests are written at too low a level. They click through every step of a flow when they only need to verify the outcome.

import { test, expect } from '@playwright/test'

// Too granular: every step is a failure point
test('user can check out', async ({ page }) => {
  await page.goto('/products')
  await page.getByRole('button', { name: 'Add to cart' }).first().click()
  await page.getByRole('link', { name: 'Cart' }).click()
  await page.getByRole('button', { name: 'Proceed to checkout' }).click()
  await page.getByLabel('Card number').fill('4242424242424242')
  await page.getByLabel('Expiry').fill('12/28')
  await page.getByLabel('CVC').fill('123')
  await page.getByRole('button', { name: 'Pay now' }).click()
  await expect(page.getByText('Order confirmed')).toBeVisible()
})

If the payment UI gets redesigned, this test breaks at step 5, the first payment field. But the behavior you actually care about, that a user can complete a purchase, is only verified at step 9.

A better approach: handle setup in fixtures or API calls, and let each E2E test drive through the UI only the part of the journey it exists to verify.

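import { test, expect } from '@playwright/test'

test('user sees order confirmation after checkout', async ({ page }) => {
  // Set up cart state via API: fast and stable.
  // page.request shares cookies with the page, so a session-based cart is
  // visible to the browser; the standalone `request` fixture is isolated.
  await page.request.post('/api/cart', { data: { productId: 'prod_123' } })
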
  await page.goto('/checkout')
  await page.getByLabel('Card number').fill('4242424242424242')
  await page.getByLabel('Expiry').fill('12/28')
  await page.getByLabel('CVC').fill('123')
  await page.getByRole('button', { name: 'Pay now' }).click()
 
  await expect(page.getByText('Order confirmed')).toBeVisible()
})

Now the cart and product UI can change freely without affecting this test.

Encapsulate Selectors in Page Objects

Even with good selectors, duplicating them across 30 tests creates a maintenance problem. When a label changes, you're hunting through your entire test suite.

Page Objects solve this by centralizing your selectors in one place:

// pages/CheckoutPage.ts
import type { Page } from '@playwright/test'

export class CheckoutPage {
  constructor(private readonly page: Page) {}

  get cardNumberInput() { return this.page.getByLabel('Card number') }
  get expiryInput() { return this.page.getByLabel('Expiry') }
  get cvcInput() { return this.page.getByLabel('CVC') }
  get submitButton() { return this.page.getByRole('button', { name: 'Pay now' }) }
  get confirmationMessage() { return this.page.getByText('Order confirmed') }
 
  async completePayment(card: { number: string; expiry: string; cvc: string }) {
    await this.cardNumberInput.fill(card.number)
    await this.expiryInput.fill(card.expiry)
    await this.cvcInput.fill(card.cvc)
    await this.submitButton.click()
  }
}

When "Pay now" becomes "Complete order," you change it in one place.

Page Objects are a powerful pattern, but they come with their own set of design decisions — how to structure them, when to split them, and how to avoid turning them into maintenance problems of their own. If you're building a larger suite, Building a Page Object Model That Doesn't Become a Maintenance Nightmare covers those trade-offs in detail.

Assert on User-Visible Outcomes

Tests that verify DOM state rather than user-visible state break whenever the implementation changes, even when the user sees nothing different:

// Brittle — asserts on internal state
await expect(page.locator('.modal')).toHaveClass(/is-open/)
 
// Resilient — asserts on what the user sees
await expect(page.getByRole('dialog', { name: 'Confirm deletion' })).toBeVisible()

The CSS class is an implementation detail. The dialog being visible is the behavior. If a redesign replaces the is-open class with an aria-hidden toggle, the first assertion breaks; the second doesn't.

When Tests Should Break

Not all test failures after a redesign are false positives. If you changed the copy on a CTA, renamed a form label, or removed a feature, the tests should catch that. The goal isn't to write tests that never fail; it's to write tests that only fail when behavior actually changes.

A good test suite acts as a behavioral contract: if you're changing something user-facing intentionally, you update the test. If the tests are failing because of structural churn that users can't perceive, that's a signal your selectors are at the wrong layer.

The Quick Audit

If you're not sure whether your suite is brittle, run this check on your test files:

How many selectors use :nth-child, positional indices, or deeply nested chains?
How many selectors depend on CSS class names or non-semantic attribute selectors?
How many tests would break if a developer extracted a component into a separate file?

If the answer to any of these is "a lot," start with the highest-value flows — auth, checkout, critical paths — and refactor those first. You don't need to fix everything at once.
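
The first two questions can be roughly approximated with a script. The sketch below counts a few brittle patterns in a test file's source text. The function name auditSelectors, the AuditReport shape, and the specific regular expressions are illustrative assumptions: they catch common cases, not every brittle selector, and the third question (component extraction) still needs human judgment.

```typescript
// Hypothetical audit helper: counts selector patterns that tend to be brittle.
// The patterns are rough heuristics, not a complete classifier.
type AuditReport = {
  positional: number // :nth-child(), .nth(), .first(), .last()
  cssClass: number // locator('...') strings that contain a CSS class
  deepChain: number // locator('...') strings with two or more ">" combinators
  semantic: number // getByRole / getByLabel / getByText / getByTestId
}

function auditSelectors(source: string): AuditReport {
  const count = (pattern: RegExp): number => (source.match(pattern) ?? []).length

  return {
    positional: count(/:nth-child\(|\.nth\(|\.first\(\)|\.last\(\)/g),
    cssClass: count(/locator\(\s*['"`][^'"`]*\.[A-Za-z_-][\w-]*/g),
    deepChain: count(/locator\(\s*['"`][^'"`]*>[^'"`]*>[^'"`]*['"`]/g),
    semantic: count(/getBy(Role|Label|Text|TestId)\(/g),
  }
}
```

Running this over each test file (for example, on the string returned by Node's fs.readFileSync) and sorting by the positional and cssClass counts gives a rough list of where to start refactoring.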

The goal is a suite you trust. That means tests that break when behavior breaks, and hold up when the UI gets a fresh coat of paint.