A consumer-facing UK price-comparison product is only as good as its freshness, breadth, and accuracy. Competing with established players requires three things at once: full catalog coverage across the major UK grocers, near-real-time price and availability updates, and enriched product detail (ingredients, nutrition, allergens, dietary flags, reviews). None of that is available through a public API, and the six target retailers are spread across four different platform types.
The platforms are not interchangeable. Two retailers run on the same Open Source Platform (OSP) e-commerce stack and expose a workable category-tree API. Two others front their catalog with Algolia, which is permissive but ships search hits, not full product records. Tesco runs a private GraphQL API gated behind an API key, session cookies, and UK IP enforcement. Sainsbury's uses a custom SSR stack protected by Akamai Bot Manager, which fingerprints clients down to the operating system. Building a separate codebase per retailer was a non-starter: it would have multiplied maintenance cost without giving the client any leverage when new retailers were added later.
The anti-bot surface is uneven. Algolia tolerates high request rates if you stagger them across facets. The OSP API tolerates aggressive parallelism but emits false-negative product-count hints that have to be ignored. Tesco's GraphQL responses include polymorphic fields where the same path returns either a string or a list, which breaks naive parsers. Sainsbury's is the hardest case: Akamai blocks Linux Chrome at the TLS fingerprint level regardless of header spoofing, so the gate cannot be cleared from any Linux container. Controlled testing showed that the gate keys primarily on IP geolocation and operating system fingerprint: a UK-residential or UK-datacenter Windows host passes, a non-UK Windows host fails after roughly three requests, and a Linux host never gets a token at all.
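The polymorphic-field problem mentioned above is the easiest of these to show in code. The following is a minimal sketch, not the project's actual parser: the helper name `as_list` and the example payloads are hypothetical, and the real field paths are not given in this write-up. The idea is to coerce every such field to one shape at the parsing boundary so downstream code never has to branch on type.

```python
from typing import Any, List


def as_list(value: Any) -> List[str]:
    """Coerce a field that may be None, a string, or a (nested) list
    into a flat list of strings. Hypothetical helper for illustration."""
    if value is None:
        return []
    if isinstance(value, str):
        # Treat empty or whitespace-only strings as "no value".
        return [value] if value.strip() else []
    if isinstance(value, (list, tuple)):
        flat: List[str] = []
        for item in value:
            flat.extend(as_list(item))  # recurse to handle nesting and None items
        return flat
    # Fallback for unexpected scalars such as numbers.
    return [str(value)]


# Hypothetical payloads: the same path carries a string in one response
# and a list in another.
response_a = {"product": {"ingredients": "Water, Sugar"}}
response_b = {"product": {"ingredients": ["Water", "Sugar"]}}

for response in (response_a, response_b):
    print(as_list(response["product"]["ingredients"]))
```

A naive parser that calls a string method on the second payload, or iterates over the first, either raises or silently yields single characters; normalizing once avoids both failure modes.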
The data layer is uneven too. Catalog endpoints return identifiers, names, prices, and category paths. Enrichment endpoints, where they exist, return ingredients, nutrition, dietary flags, packaging, and manufacturer details. Reviews live on yet another endpoint, sometimes paginated, sometimes capped. The stable identifier lives in one field on some retailers and in a different field on others. Joining all of that into a single product table that the application can query in milliseconds is its own problem.
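To make the join problem concrete, here is a minimal in-memory sketch. Everything named in it is an assumption for illustration: the retailer keys, the `ID_FIELD` mapping, the field names `sku` and `productId`, and the `merge_sources` function do not come from the project. It shows one way to key every record on a `(retailer, product_id)` pair so that catalog, enrichment, and review records land on the same row regardless of which field a retailer uses as its identifier.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical mapping: which field holds the stable identifier per retailer.
ID_FIELD = {"retailer_a": "sku", "retailer_b": "productId"}

Key = Tuple[str, str]


@dataclass
class Product:
    retailer: str
    product_id: str
    catalog: Dict[str, Any] = field(default_factory=dict)
    enrichment: Dict[str, Any] = field(default_factory=dict)
    reviews: List[Dict[str, Any]] = field(default_factory=list)


def product_key(retailer: str, record: Dict[str, Any]) -> Key:
    """Build a uniform key; str() guards against int vs. string identifiers."""
    return retailer, str(record[ID_FIELD[retailer]])


def merge_sources(
    retailer: str,
    catalog: Iterable[Dict[str, Any]],
    enrichment: Iterable[Dict[str, Any]],
    reviews: Iterable[Dict[str, Any]],
) -> Dict[Key, Product]:
    """Join the three sources; the catalog decides which products exist."""
    table: Dict[Key, Product] = {}
    for rec in catalog:
        key = product_key(retailer, rec)
        table[key] = Product(retailer, key[1], catalog=rec)
    for rec in enrichment:
        key = product_key(retailer, rec)
        if key in table:  # drop enrichment for products not in the catalog
            table[key].enrichment = rec
    for rec in reviews:
        key = product_key(retailer, rec)
        if key in table:  # drop orphaned reviews the same way
            table[key].reviews.append(rec)
    return table


table = merge_sources(
    "retailer_a",
    catalog=[{"sku": 1, "name": "Milk", "price": 1.25}],
    enrichment=[{"sku": "1", "ingredients": "Milk"}],
    reviews=[{"sku": 1, "rating": 5}, {"sku": 2, "rating": 1}],
)
print(table[("retailer_a", "1")])
```

In a real deployment the same composite key would more likely be a primary key in a database table than a Python dictionary; the sketch only illustrates the normalization step that has to happen before any such table can be built.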