
How ethPandaOps keeps two continents of testnet traffic flowing with eRPC

ethPandaOps is a boutique DevOps collective that runs public Ethereum infrastructure for researchers, client teams, hackathon projects, and anyone who needs a free, fast endpoint. From Berlin (EU-1) and Boulder, Colorado (NA-1), they serve testnet traffic at scale.


Challenges

  1. Load‑balancer roulette

    Before eRPC, ethPandaOps cycled through several commercial and open-source load balancers. These balancers intermittently dropped connections or stalled under heavy traffic, especially during sync spikes or testnet forks.

  2. Latency across the pond

    Even when the balancers stayed up, cross‑Atlantic requests introduced 100–150 ms of extra round‑trip time (RTT), adding noticeable lag to dApp front‑ends.

  3. Too many moving parts

    Each execution‑layer (EL) client—Geth, Nethermind, Reth, Erigon—needed its own deployment and TLS certificate, which added complexity and maintenance drag.

  4. Bare‑bones monitoring

    The team treats the free RPC service as a goodwill perk, so they never wired up full production-grade observability. If an endpoint dies, the goal is to fail over gracefully, not to page an on-call engineer.

Setup

  1. Edge

    Uses Cloudflare DNS to load-balance by region. Runs health checks (/healthcheck) against each site and geo-routes traffic to the nearest healthy cluster (see the healthcheck sketch after this list).

  2. Regional entry‑point

    Region‑specific endpoints (rpc.<network>.<region>.ethpandaops.io) front all EL clients in that office. Examples: rpc.sepolia.eu1.ethpandaops.io, rpc.sepolia.na1.ethpandaops.io.

  3. Clusters

    Runs two k3s clusters built from 10–15 Intel NUCs each. Namespaces isolate staging vs. prod, and each eRPC pod pins to its associated EL client.

  4. Archive tier

    Runs separate archive.<network>.<region>.ethpandaops.io endpoints that only route to Reth/Erigon nodes for deep‑history queries.

  5. In‑memory cache, max = 1M entries

    No external Redis hop to slow things down; repeated calls like eth_chainId are served instantly from memory (see the cache sketch after this list).

  6. k3s + MetalLB

    Keeps internal L4 balancing simple and avoids extra hops.
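
To make item 1 concrete, here is a minimal Go sketch of the kind of /healthcheck endpoint Cloudflare's probes could hit. The local RPC URL, port, and probe method are illustrative assumptions, not ethPandaOps' actual configuration.

```go
// Minimal sketch of a /healthcheck endpoint for Cloudflare's probes.
// The local RPC URL and ports are assumptions for illustration only.
package main

import (
	"bytes"
	"net/http"
	"time"
)

const localRPC = "http://localhost:8545" // hypothetical local eRPC/EL endpoint

func healthcheck(w http.ResponseWriter, r *http.Request) {
	// Probe the local stack with a cheap JSON-RPC call (eth_chainId).
	payload := []byte(`{"jsonrpc":"2.0","id":1,"method":"eth_chainId","params":[]}`)
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Post(localRPC, "application/json", bytes.NewReader(payload))
	if err != nil {
		// 503 tells Cloudflare to geo-route traffic to the other region.
		http.Error(w, "unhealthy", http.StatusServiceUnavailable)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		http.Error(w, "unhealthy", http.StatusServiceUnavailable)
		return
	}
	// 200 keeps this cluster in the DNS rotation.
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/healthcheck", healthcheck)
	http.ListenAndServe(":8080", nil)
}
```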
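Item 5's cache can be pictured as a capped, thread-safe map sitting in front of the upstreams. The LRU eviction policy in this sketch is an assumption for illustration; it shows the shape of the idea (bounded entries, no external Redis hop), not eRPC's internals.

```go
// Illustrative bounded in-memory RPC cache (simple LRU). The eviction
// policy is an assumption for this sketch, not necessarily eRPC's.
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type entry struct{ key, value string }

type LRUCache struct {
	mu    sync.Mutex
	cap   int
	ll    *list.List               // front = most recently used
	items map[string]*list.Element // cache key -> list element
}

func NewLRUCache(capacity int) *LRUCache {
	return &LRUCache{cap: capacity, ll: list.New(), items: make(map[string]*list.Element)}
}

func (c *LRUCache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.ll.MoveToFront(el) // refresh recency on every hit
		return el.Value.(*entry).value, true
	}
	return "", false
}

func (c *LRUCache) Put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.ll.MoveToFront(el)
		return
	}
	c.items[key] = c.ll.PushFront(&entry{key, value})
	if c.ll.Len() > c.cap { // enforce the max-entries cap
		oldest := c.ll.Back()
		c.ll.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	cache := NewLRUCache(1_000_000)           // max = 1M entries, as in the setup above
	cache.Put("eth_chainId:[]", `"0xaa36a7"`) // Sepolia's chain ID
	if v, ok := cache.Get("eth_chainId:[]"); ok {
		fmt.Println(v) // instant hit, no network round-trip
	}
}
```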

Results

  1. Zero downtime

    eRPC’s back-pressure logic and in-memory caching removed the connection churn that toppled previous balancers during client-sync storms. Even without synthetic monitoring, Cloudflare’s health probes now almost never mark an endpoint as down.

  2. 49% lower user latency

    Geo load‑balancing plus local caching cut median request time for dApp developers in the U.S. from ~350 ms to ~180 ms during peak usage.

  3. Lower ops overhead

    Running k3s on inexpensive NUC hardware keeps CapEx minimal, and the team hasn’t touched the cluster autoscaler in months.

  4. 5× fewer deployments

    When eRPC adds first-class path-based routing, ethPandaOps expects to collapse ~20 separate eRPC pods down to 4. This will cut TLS and CI/CD pipeline complexity, making infrastructure updates much faster (a routing sketch follows below).
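
As a rough sketch of what that consolidation could look like, one gateway could route each network's path prefix to its own upstream pool. The service names, ports, and path scheme below are hypothetical placeholders, not ethPandaOps' deployment.

```go
// Illustrative path-based routing: one gateway, many networks.
// Service names, ports, and the path scheme are hypothetical.
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

// proxyFor returns a reverse proxy that forwards JSON-RPC traffic
// to the given network's upstream pool.
func proxyFor(upstream string) http.Handler {
	target, err := url.Parse(upstream)
	if err != nil {
		panic(err)
	}
	return httputil.NewSingleHostReverseProxy(target)
}

func main() {
	mux := http.NewServeMux()
	// One deployment, one TLS cert (terminated in front), one CI/CD
	// pipeline -- instead of one of each per network.
	networks := map[string]string{
		"/sepolia/": "http://erpc-sepolia.rpc.svc:4000", // hypothetical cluster services
		"/holesky/": "http://erpc-holesky.rpc.svc:4000",
	}
	for prefix, upstream := range networks {
		// Strip the /<network> prefix before forwarding upstream.
		mux.Handle(prefix, http.StripPrefix(prefix[:len(prefix)-1], proxyFor(upstream)))
	}
	http.ListenAndServe(":8080", mux)
}
```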

[Chart: ethPandaOps results]