"Web-Scraping" Blog Posts

Ecommerce · 10 min read

Schema design patterns for e-commerce extraction

Battle-tested schema patterns for product pages, category pages, reviews, and inventory. Edge cases, type choices, and the fields people forget.

June 11, 2026

SEO · 9 min read

Scraping Google SERP results in 2026: what works and what doesn't

Direct Google scraping is a losing battle in 2026. Here's the realistic landscape, the alternatives that work, and how to extract structured data from search results.

May 30, 2026

Lead-Generation · 10 min read

Lead generation from public web data: a builder's guide

How to extract qualified leads from company websites, public directories, and structured registries without violating terms of service or privacy law.

May 22, 2026

Legal · 9 min read

Is web scraping legal in 2026? A practical guide for builders

What courts, regulators, and contracts actually say about scraping public web data, with the case law that shaped the current landscape and a working playbook.

May 14, 2026

Engineering · 8 min read

Scraping JavaScript-heavy SPAs: Next.js, Nuxt, and React in 2026

Why plain HTTP fetching returns empty pages on modern frontends, what render targets work, and how to recover server-shipped data without a headless browser.

May 10, 2026

Guide · 9 min read

The complete guide to web scraping APIs in 2026

What a modern web scraping API actually does, how to evaluate one, and where each category (proxies, browsers, extractors) fits into a real pipeline.

May 9, 2026

Comparison · 8 min read

Firecrawl vs Apify vs Runo: which scraping API to pick in 2026

An honest, side-by-side look at three popular scraping APIs. What each is built for, where each shines, and where each costs you time and money.

May 8, 2026

Engineering · 7 min read

How to scrape Cloudflare-protected sites without getting blocked

A practical, layered approach to defeating Cloudflare's bot challenges in 2026. TLS fingerprints, hardened headless, cookie persistence, and when to escalate.

May 7, 2026

Engineering · 7 min read

LLM extraction vs CSS selectors: why selector-based scraping is dead at scale

Selectors break when sites redesign. LLMs extract by semantic meaning. Here's why the tradeoff has flipped, with cost numbers from real workloads.

May 5, 2026

Guide · 8 min read

Extracting structured JSON from any HTML: a developer's guide

How to turn arbitrary web pages into typed JSON shaped to your schema. Covers schema design, type coercion, null handling, and edge cases.

May 2, 2026

AI-Agents · 8 min read

Web scraping for AI agents: building the data layer for LLM apps

How to architect the scraping stack behind autonomous agents. Schema-typed data, low-latency tool calls, cost control, and error semantics that don't break the loop.

April 28, 2026

Guide · 9 min read

How to monitor competitor prices with a scraping API

A practical guide to building a competitor price monitoring pipeline. Schema design, change detection, alerting, and the legal and operational pitfalls.

April 22, 2026