Mobile-First or Mobile-Last?
The responsive design test
I gave eight AIs the same web development task: create a simple one-page website using only HTML, CSS, and JavaScript. Each of them received the same prompt, stipulating they had to meet certain constraints, including dynamic screen size adaptation, international accessibility standards, and human-readable code. They also had to include certain content: photos, testimonials, the date and time of an event, and a registration form. The scenario was fictional but based on real projects I’ve done.
This article looks at how well the AI-generated code handled different screen sizes. Somehow there are still websites that don’t work well on mobile devices, or on larger-than-average desktop monitors, or on anything that isn’t “whatever screen size the developer happened to use.”
Responsive design, the ability of a web page to resize and reshape gracefully across devices, should be standard. Given how common that requirement is, you’d think AI would handle it automatically.
You’d be wrong.
The Test Setup
I learned long ago from human developers that you can’t assume anything. My prompt explicitly stated: “Fully responsive: works on small mobile screens to large desktops.” I didn’t specify breakpoints or design patterns, but the implication was clear—it should look decent and function well on everything from older smartphones to multi-monitor desktop setups, with tablets and laptops in between.
For a volunteer event like my fictional fabric swap meet, the audience spans the full spectrum of devices and technical capabilities. If your registration page only works well on the latest iPhone, you’ve already lost half your potential attendees.
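For reference, the behaviour the prompt asks for doesn’t require much code. Here’s a minimal, mobile-first sketch of the kind of pattern I’d expect, using CSS grid and a single breakpoint; the .testimonials class name is my own illustration, not taken from any AI’s output.

```css
/* Mobile-first: a single column by default, widening at a breakpoint */
.testimonials {
  display: grid;
  grid-template-columns: 1fr; /* phones: stack the cards */
  gap: 1rem;
}

@media (min-width: 48em) {
  /* tablets and up: three columns side by side */
  .testimonials {
    grid-template-columns: repeat(3, 1fr);
  }
}
```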
The Failures
This was one of the weakest areas across the board, which surprised me, since responsive layout is a problem that can be solved with hard logic rather than judgement calls.
ChatGPT produced a responsive page, but not a useful one. Some columns got very narrow before the layout finally changed to suit the screen size, and some elements that should have reshaped for a smaller screen just stayed the same and flowed off the right edge of the page. That included some of the text blocks.
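I didn’t dissect every stylesheet, but horizontal overflow like this almost always traces back to fixed pixel widths. Here’s a sketch of the failure mode and the conventional fix, using a hypothetical .photo-card selector:

```css
/* The failure mode: a fixed width wider than a small phone screen */
.photo-card {
  width: 480px; /* spills past the right edge of a 360px viewport */
}

/* The usual fix: a fluid width with an upper bound */
.photo-card {
  width: 100%;
  max-width: 480px; /* never wider than its container */
}
```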
Meta and Copilot both landed in “meh” territory. They worked, but without polish. Users would manage, but they wouldn’t enjoy the experience.
What Went Well
Claude stood out with excellent responsive design. Gemini, Poe, and Replit also performed well.
Interestingly, most AIs did well on touch-friendly interactive elements: buttons and links with adequate spacing. This was good to see, since WCAG accessibility standards require sufficient space between interactive elements regardless of screen size, and fixing that late can mean major layout changes. It’s one area where accessibility and responsive design reinforce each other.
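Adequate touch targets are cheap to specify in CSS. A hedged sketch, assuming a nav element and using the 44-pixel target size that both Apple’s design guidelines and WCAG’s enhanced target-size criterion suggest:

```css
/* Generous touch targets: minimum size plus spacing between neighbours */
nav a,
button {
  display: inline-block;
  min-width: 44px;
  min-height: 44px; /* 44px matches WCAG's enhanced target-size guidance */
  padding: 0.5rem 1rem;
}

nav a + a {
  margin-left: 0.5rem; /* breathing room so fingers don't hit the wrong link */
}
```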
Replit was also easier to evaluate here, since responsive behaviour is visible in the browser without inspecting the code: you can load the page on different devices and see whether it works.
The Critical Detail
Here’s what separated acceptable from excellent: text sizing.
Most AIs used fixed text sizes or ugly breakpoints. Headlines might wrap awkwardly on tablets. Paragraphs might have uncomfortably long line lengths on desktops or cramped spacing on phones.
Claude got this right. Text sizes scaled smoothly with screen width, avoiding common pitfalls like strange word breaks in headings or awkward line lengths in body text. It’s a small detail that makes a massive difference in readability.
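I can’t see from the outside which technique Claude used, but the standard way to get smooth text scaling today is the CSS clamp() function, which bounds a fluid value between a minimum and a maximum. A sketch, with a hypothetical .prose wrapper for body text:

```css
/* Fluid type: grows with the viewport, bounded at both ends */
h1 {
  font-size: clamp(1.75rem, 4vw + 1rem, 3rem);
}

.prose {
  font-size: clamp(1rem, 0.5vw + 0.875rem, 1.125rem);
  max-width: 65ch; /* caps line length so desktop paragraphs stay readable */
}
```

The max-width cap on body text tackles the long-line-length problem on large desktops directly, without any breakpoints at all.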
This is exactly the kind of nuance a human developer gets right, but only if they really consider readability across devices. Many don’t. The AIs that handled text scaling well were effectively doing better than many human developers.
What This Reveals
Responsive design was the criterion that separated passable results from truly excellent ones. In previous criteria, most AIs that followed the prompt reasonably well scored similarly. Responsive design changed that. The gap between “it works” and “it works well” became obvious.
This matters for the Make vs Buy decision. There’s no point in upgrading to a paid plan with an AI platform if you’re constantly having to rewrite everything it generates. An AI that accounts for the user experience will produce something people actually want to use.
Have you run into AI-generated code that technically works on mobile but feels clunky? Or found responsive designs that genuinely impressed you? Let me know in the comments.
Next up: We look at the most complex requirement—conditional form logic. Can AI handle the kind of interactive behavior that makes or breaks user experience? Find out in “The Devil’s in the Details: Form Logic and User Experience.”
This is part of the Make vs Buy series, exploring whether it’s better to build things yourself or pay for ready-made solutions.

