Building an Offline-First Architecture for a Retail POS
April 2026 · 15 min read
A Point-of-Sale terminal going offline in the middle of a busy shopping shift is not a minor inconvenience — it is a queue of frustrated customers, blocked cashiers, and potentially lost revenue. When I was building StoreOS at Fynd, this was the constraint that shaped every architectural decision from day one.
Retail stores in India present a specific challenge: internet connectivity is unpredictable. It can vanish mid-transaction, throttle to near-zero during peak hours, or simply not exist in remote store locations. The system had to keep working regardless — processing orders, searching the catalog, accepting payments — and then silently reconcile all of that data once the network came back.
This post is a full breakdown of the offline system I engineered for StoreOS: the layered architecture, every technology involved, and the reasoning behind each decision.
The Layered Architecture
The offline system isn't a single feature — it's a stack of four cooperating layers, each with a distinct responsibility.
- Service Worker — intercepts all network requests; serves assets and critical API responses from cache when offline
- IndexedDB (via Dexie) — the local database; stores catalog, carts, orders, users, and analytics events
- Offline API Client — a set of domain clients that mirror the online API surface but route all calls to IndexedDB
- Web Workers — background threads that sync offline orders and download the product catalog without blocking the UI
The backend (BullMQ queues with exponential backoff) closes the loop by making the server-side resilient to the same connectivity problems. Together, these layers mean the app handles everything from a brief network blip to a multi-hour outage with the same code path.
A Three-State Machine at the Center
The most important file in the entire offline system is IDataSource.js. It manages a singleton state machine with three explicit operational modes:
```
READY ──(network lost / manual toggle)──→ IN_OFFLINE_MODE
                                                 │
                                                 ↓
                                          IN_ONLINE_MODE
                               (transitioning back, sync running)
                                                 │
                                                 ↓
                                               READY
```

A boolean isOffline flag would have been simpler, but it can only represent two states. The three-mode machine lets the UI represent the transitioning state — syncing a large order backlog can take minutes, and the cashier needs to know the system is working, not stuck.
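The three-mode machine can be sketched as a small transition table. Only the three mode names come from the real `IDataSource.js`; the class, method names, and listener mechanism below are illustrative assumptions.

```js
// Hypothetical sketch of the three-mode state machine; only the mode
// names are taken from the real system, the rest is illustrative.
const MODES = {
  READY: 'READY',
  IN_OFFLINE_MODE: 'IN_OFFLINE_MODE',
  IN_ONLINE_MODE: 'IN_ONLINE_MODE',
};

// Legal transitions: READY → IN_OFFLINE_MODE → IN_ONLINE_MODE → READY
const TRANSITIONS = {
  [MODES.READY]: [MODES.IN_OFFLINE_MODE],
  [MODES.IN_OFFLINE_MODE]: [MODES.IN_ONLINE_MODE],
  [MODES.IN_ONLINE_MODE]: [MODES.READY],
};

class DataSourceMode {
  constructor() {
    this.mode = MODES.READY;
    this.listeners = [];
  }
  transition(next) {
    if (!TRANSITIONS[this.mode].includes(next)) {
      throw new Error(`Illegal transition ${this.mode} -> ${next}`);
    }
    this.mode = next;
    this.listeners.forEach((fn) => fn(next)); // notify UI / native bridge
  }
  onChange(fn) {
    this.listeners.push(fn);
  }
}
```

Encoding the legal transitions explicitly means a bug can never jump the machine straight from offline back to READY and skip the sync window.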
Going Offline
When the mode switches to IN_OFFLINE_MODE, the system does five things:
- Requests the File System Access API — gives the app a directory handle for exporting end-of-day reports without internet
- Generates a unique offline session hash (`OFFLINE_DB_HASH`) written to localStorage — this becomes the correlation key for every order and event in this offline window
- Removes the active cart — avoids carrying half-state into offline mode
- Clears extensions — only core POS functionality is available offline
- Calls `createOfflineSession()` on the backend, if still reachable
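The sequence above (minus the File System Access prompt, which needs a browser) can be sketched with injected dependencies. The function and storage-method names here are assumptions for illustration; only `OFFLINE_DB_HASH` and `createOfflineSession()` come from the source.

```js
// Illustrative go-offline sequence; helper names other than
// OFFLINE_DB_HASH and createOfflineSession are assumptions.
function generateOfflineHash(now = Date.now(), rand = Math.random()) {
  // Time component + random component keeps hashes unique per session.
  return `${now.toString(36)}-${Math.floor(rand * 1e9).toString(36)}`;
}

async function enterOfflineMode({ storage, db, backend }) {
  const hash = generateOfflineHash();
  storage.setItem('OFFLINE_DB_HASH', hash); // correlation key for this window
  await db.clearActiveCart();               // drop half-state carts
  await db.disableExtensions();             // core POS only while offline
  try {
    await backend.createOfflineSession(hash); // best effort only
  } catch (_) {
    // network already gone; the session is reported after reconnect
  }
  return hash;
}
```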
Coming Back Online
When connectivity returns, the system transitions through IN_ONLINE_MODE before reaching READY. During this window, the order sync worker runs and the analytics queue is flushed. The native Android bridge is also notified on every transition, so hardware integrations (barcode scanners, receipt printers, payment terminals) can adjust accordingly.
Service Worker Caching Strategies
The service worker (sw.js, using Workbox 7.3.0) uses different strategies for different resource types. One size does not fit all.
| Resource | Strategy | Why |
|---|---|---|
| HTML documents | NetworkFirst | Always prefer fresh markup |
| JS / CSS bundles | NetworkFirst, 3s timeout | Fresh on fast networks; cached on slow ones |
| Fonts, CDN assets | CacheFirst (30 days) | Immutable; no reason to hit the network |
| Critical API responses | Custom cache-first | App must boot even without network |
The 3-second timeout on JS/CSS bundles is the key heuristic for low-bandwidth environments. A pure NetworkFirst strategy would stall indefinitely on a throttled connection. A pure CacheFirst strategy would leave users running stale code for days after a deployment. Three seconds is the negotiated middle ground — responsive enough, fresh enough.
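The timeout heuristic is essentially a race between the network and a timer, falling back to cache when the timer wins. This is the behavior Workbox's `networkTimeoutSeconds` option provides; the generic sketch below, with injectable `fetchFn`/`cacheFn`, is my own illustration rather than Workbox source.

```js
// Generic sketch of NetworkFirst-with-timeout. In a real service worker,
// fetchFn would wrap fetch() and cacheFn would read from the Cache API;
// they are injected here for illustration.
function networkFirstWithTimeout(fetchFn, cacheFn, timeoutMs = 3000) {
  const timeout = new Promise((resolve) =>
    setTimeout(() => resolve(null), timeoutMs)
  );
  return Promise.race([fetchFn().catch(() => null), timeout]).then((resp) =>
    resp !== null ? resp : cacheFn() // slow or failed network: serve cache
  );
}
```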
The following API endpoints are explicitly cached so the app can initialize and authenticate without a network round-trip:
```
service/panel/authentication/v1.0/profile
service/platform/company-profile/v1.0/company
service/platform/stos-config/v1.0/company
staff/current/access
service/application/configuration/v1.0/feature
storeos/asia-south1/api-storeos/internal/appVersion
```
On service worker activation, 41 icon assets, fonts, and 10 CDN images are proactively cached — front-loading the cache so users never hit a cold-cache miss on first offline use.
IndexedDB via Dexie — The Offline Database
While the service worker handles the network layer, IndexedDB is the local persistence layer. I used Dexie as the wrapper — its transaction API and versioned schema migrations make it significantly more ergonomic than the raw IndexedDB API.
```js
db.version(N).stores({
  items:                'id, type, name, slug, brand, price, variants',
  cart:                 'id, user_id, items, total_quantity, status',
  orders:               'id, user_id, is_sync_online, fynd_order_id',
  users:                'id, username, emails, phone_numbers',
  addresses:            '_id, app_id, user_id, is_default_address',
  analytics_events:     'id, event_name, status, retries',
  offline_session_info: 'id, offline_hash_id, is_sync_online',
});
```

The Two Most Important Fields
orders.is_sync_online and orders.fynd_order_id are the backbone of the entire sync mechanism. Every order created offline starts with is_sync_online = false. When the web worker successfully posts the order to the Fynd platform, it writes the returned fynd_order_id and flips the flag to true. The worker checks this flag before processing any order — idempotency built into the data model.
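The flag semantics can be sketched as a pure pass over the orders table. This is a minimal sketch of the idempotency contract, assuming an in-memory stand-in for the database and a hypothetical `postToPlatform` helper; it is not the real worker code.

```js
// Sketch of the idempotent sync pass. Orders already carrying a
// fynd_order_id are never re-posted; successful posts write the
// platform id back and flip is_sync_online.
async function syncPendingOrders(db, postToPlatform) {
  const pending = db.orders.filter((o) => !o.is_sync_online);
  for (const order of pending) {
    if (order.fynd_order_id) {     // already synced in a previous run
      order.is_sync_online = true; // repair a half-written record
      continue;
    }
    const { fynd_order_id } = await postToPlatform(order);
    order.fynd_order_id = fynd_order_id;
    order.is_sync_online = true;
  }
  return pending.length;
}
```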
Resilience on Android WebViews
Android WebViews occasionally throw transient errors on IndexedDB writes. Every write in DatabaseManager.js is wrapped in a retry loop with exponential backoff:
```js
async function withRetry(operation, maxAttempts = 3) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err;
      await sleep(200 * Math.pow(2, attempt)); // 200ms → 400ms → 800ms
    }
  }
}
```

Three other patterns keep the database layer solid: a singleton instance prevents multiple Dexie connections racing on the same database; atomic db.put() upserts eliminate the check-then-act race condition; and cursor-based pagination returns `{ has_next, has_previous, next_id }` — intentionally matching the online API response shape, so the UI layer never needs to know which mode it's in.
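The cursor-pagination envelope can be sketched over a sorted in-memory collection. The function below is an illustrative assumption; only the `{ has_next, has_previous, next_id }` shape comes from the source.

```js
// Illustrative cursor pagination returning the same envelope shape
// as the online API. Assumes `items` is already sorted by id.
function paginate(items, { afterId = null, limit = 20 } = {}) {
  const start = afterId === null
    ? 0
    : items.findIndex((it) => it.id === afterId) + 1;
  const page = items.slice(start, start + limit);
  const hasNext = start + limit < items.length;
  return {
    items: page,
    page: {
      has_next: hasNext,
      has_previous: start > 0,
      next_id: hasNext ? page[page.length - 1].id : null,
    },
  };
}
```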
Offline Catalog Search
When offline, the product search is powered by Fuse.js running against the local IndexedDB catalog:
```js
const fuse = new Fuse(localItems, {
  keys: ['name', 'slug', 'brand'],
  threshold: 0.4, // tolerant of minor typos
});
```

Threshold 0.4 was tuned to tolerate the kind of quick-entry typos cashiers make without returning irrelevant results. Too tight (0.1) and it misses obvious matches. Too loose (0.6) and every search returns noise.
The Offline API Client Layer
The offline API client is the piece that makes the UI completely unaware of which mode it's running in. It's a set of domain-specific clients — CartPlatformClient, CatalogPlatformClient, OrderPlatformClient, and others — that mirror the online API surface exactly but route all calls to IndexedDB.
When the mode switches, the app swaps the active client. The UI components call the same methods with the same signatures and get the same response shapes. The abstraction is clean enough that adding a new offline-capable feature is mostly a matter of implementing the right method in the offline client.
One non-trivial piece is cart pricing. The CartPlatformClient recomputes the full price breakup locally — marked price, effective price, GST, discount amounts — using the same calculation logic as the server. Pricing is consistent whether or not the network is present.
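A minimal sketch of such a local price breakup, assuming tax-inclusive pricing (GST extracted from the effective price) and a flat rate parameter. This is not the actual StoreOS calculation; field names and the rounding rule are illustrative assumptions.

```js
// Illustrative price-breakup calculation, NOT the real StoreOS logic.
// Assumes tax-inclusive pricing: GST is extracted from the effective price.
function priceBreakup({ markedPrice, discount = 0, gstRate = 0.18 }) {
  const effective = markedPrice - discount;
  const gst = effective - effective / (1 + gstRate); // tax portion
  const round = (n) => Math.round(n * 100) / 100;    // 2-decimal currency
  return {
    marked_price: round(markedPrice),
    discount: round(discount),
    effective_price: round(effective),
    gst_amount: round(gst),
  };
}
```

The point is that this arithmetic runs identically on both sides, so a receipt printed offline never disagrees with the platform's records after sync.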
Web Workers for Background Sync
Two dedicated web workers handle synchronization on background threads, keeping the main thread — and the cashier's UI — completely unblocked.
Order Sync Worker
syncOrdersOnline.js (532 lines) processes all unsynced orders from IndexedDB and posts them to the Fynd platform. For each order, it:
- Checks `fynd_order_id` — if it exists, the order was already synced; skip it
- Looks up the customer on the online platform; falls back to the local DB if that fails
- Transforms the local order format to the Fynd platform payload format — item charges, GST, billing addresses, seller identifiers
- Posts the order, writes back `fynd_order_id`, sets `is_sync_online = true`
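The steps above can be condensed into a loop sketch. `post` stands in for the worker's postMessage, and `syncOne` for the lookup/transform/post pipeline; both names are assumptions for illustration.

```js
// Sketch of the order-sync worker loop with per-order fault isolation.
// `post` stands in for postMessage; `syncOne` for the real pipeline.
async function runOrderSync(orders, syncOne, post) {
  const pending = orders.filter((o) => !o.fynd_order_id);
  post({ type: 'start', total: pending.length });
  const started = Date.now();
  for (const order of pending) {
    try {
      await syncOne(order);
      post({ type: 'update_order_status', orderId: order.id, status: 'synced' });
    } catch (error) {
      // One bad order must not abort the rest of the backlog.
      post({ type: 'error', orderId: order.id, error: String(error) });
    }
  }
  post({ type: 'file_sync_complete', elapsed: Date.now() - started });
}
```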
The worker emits structured progress messages back to the main thread, driving the real-time progress UI in the Settings panel:
```js
postMessage({ type: 'start', total: N });
postMessage({ type: 'update_order_status', orderId, status });
postMessage({ type: 'file_sync_complete', elapsed });
postMessage({ type: 'error', orderId, error });
```

Catalog Sync Worker
syncProductsOffline.js (389 lines) downloads the full product catalog into IndexedDB. For each page of products, variant details, size details, and price details are fetched concurrently:
```js
// For each product page, fetch all supplementary data in parallel
const [variants, sizes, prices] = await Promise.all([
  fetchVariants(pageItems),
  fetchSizes(pageItems),
  fetchPrices(pageItems),
]);
```
This parallelism significantly reduces total catalog download time compared to sequential fetching. Per-item errors are caught and logged but don't abort the sync — the catalog always completes even if a handful of items fail.
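The catch-and-continue behavior can be sketched with `Promise.allSettled`, which is one way to get per-item fault tolerance; the real worker's mechanism may differ, and the helper names here are assumptions.

```js
// Sketch of fault-tolerant supplementary fetches: failures are logged
// and skipped instead of failing the page.
async function fetchPageDetails(pageItems, fetchDetail, log = console.warn) {
  const results = await Promise.allSettled(pageItems.map(fetchDetail));
  const ok = [];
  results.forEach((res, i) => {
    if (res.status === 'fulfilled') ok.push(res.value);
    else log(`item ${pageItems[i].id} failed:`, res.reason); // logged, not fatal
  });
  return ok; // catalog sync continues with whatever succeeded
}
```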
I deliberately chose explicit web workers over the Background Sync API (available in service workers) for two reasons: Background Sync has inconsistent support on older Android WebViews used in POS hardware, and explicit workers let the UI show real-time sync progress — something Background Sync's browser-managed timing cannot provide.
Network Detection & UI
useOfflinePopup.jsx listens to the browser's `offline` and `online` events, but the `online` event alone isn't trusted. It triggers a verification loop:
```js
// 3 attempts, 2-second delay between checks.
// Only declares "back online" after all navigator.onLine checks pass.
// Prevents false positives from brief connectivity blips.
```
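The verification loop can be sketched with an injectable probe so the 3-attempt / 2-second policy is testable; in the browser the probe would be `() => navigator.onLine`. The function shape below is an assumption, not the hook's actual code.

```js
// Sketch of the reconnect verification loop. Any failed probe aborts;
// "back online" is declared only after every probe passes.
async function verifyBackOnline(check, { attempts = 3, delayMs = 2000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    if (!check()) return false; // still flapping: stay offline
    if (i < attempts - 1) {
      await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  return true; // connection held across all probes
}
```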
One subtle detail: the offline modal's illustration image is converted to base64 and stored in localStorage. This ensures the modal renders correctly even when the service worker hasn't cached that particular image. The offline UI must work reliably offline — it would be ironic for it not to.
Proactive Token Renewal
Every API call goes through a withTokenCheck() wrapper that renews the auth token 120 seconds before it expires:
```js
async function withTokenCheck(requestFn) {
  const expiresIn = getTokenExpiresIn(); // seconds until expiry
  if (expiresIn < 120) {
    await renewAccessToken();
  }
  return requestFn();
}
```

120 seconds was chosen to be safely above the 95th-percentile API latency in tested retail environments. A token expiring mid-request and causing a 401 in the middle of a checkout flow is a terrible experience. The buffer makes it structurally impossible.
Backend Resilience — BullMQ Queues
The offline resilience pattern continues on the server side. All async jobs (invoice generation, payment processing) run through BullMQ (Redis-backed) with exponential backoff:
```js
defaultJobOptions: {
  attempts: 3,
  backoff: {
    type: 'exponential',
    delay: 1000, // 1s → 2s → 4s
  },
},
limiter: {
  max: 1000,
  duration: 5000, // 1000 jobs / 5 seconds
},
```

The rate limiter is as important as the retry logic. When connectivity is restored after a long outage, every store that was offline will attempt to sync simultaneously — a thundering herd. The 1000 jobs/5s limit keeps that from overwhelming downstream services.
All job consumers are idempotent. Before processing, they check the correlationId against a ProcessedEvent MongoDB collection. BullMQ can redeliver a job more than once under failure scenarios — without idempotency checks, a payment could be registered twice. This is not a hypothetical concern in a system that handles thousands of transactions daily.
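The consumer-side check can be sketched as a wrapper around any job handler. Here an in-memory Set stands in for the ProcessedEvent MongoDB collection; the wrapper and its return values are illustrative assumptions.

```js
// Sketch of an idempotent job consumer. The Set stands in for the
// ProcessedEvent collection queried in the real system.
function makeIdempotentConsumer(handler, processed = new Set()) {
  return async function consume(job) {
    if (processed.has(job.correlationId)) {
      return 'duplicate'; // redelivered job: side effects already applied
    }
    await handler(job);               // e.g. register the payment
    processed.add(job.correlationId); // record AFTER success so retries work
    return 'processed';
  };
}
```

Recording the correlationId only after the handler succeeds matters: if the order were reversed, a crash mid-handler would mark the job done without its side effects ever happening.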
The Core Trade-offs
Several decisions in this system were genuinely non-obvious. Here are the ones worth examining:
Optimistic local writes, pessimistic remote sync
Orders are written to IndexedDB immediately and confirmed to the cashier right away. The platform sync is a background concern. This matches the retail expectation: a POS should never make a customer wait for a network round-trip to confirm a sale. The risk is that a sync failure leaves an order unrecorded on the platform — mitigated by the sync worker's retry logic and the is_sync_online flag that makes every unsynced order visible.
Session hash as correlation key
The OFFLINE_DB_HASH generated at the start of each offline session is attached to every order, event, and session record created in that window. This makes debugging practical: you can query the backend for all data from a specific offline session in one shot. Without this, correlating offline data to a specific time window and store would require reconstructing it from timestamps — unreliable if device clocks drift.
Extensions get reload-on-reconnect, not full offline
The supplementary extensions (AR try-on, QSR ordering, scan-and-go) simply reload on reconnect rather than implementing offline state management. The complexity of offline mode for features that are already optional wasn't justified. This keeps the core POS offline system focused and the extension code simple.
What Works Offline vs What Doesn't
| Feature | Offline? |
|---|---|
| App loads and authenticates | Yes — Service Worker cache |
| Browse product catalog | Yes — IndexedDB + Fuse.js |
| Create cart, process order | Yes — IndexedDB clients |
| Accept cash payment | Yes — local order, synced later |
| Export session data | Yes — File System Access API |
| Online payment (card / UPI) | No — requires payment gateway |
| Real-time inventory updates | No — requires Fynd platform API |
| Extension features (AR, QSR) | No — blocked; reload on reconnect |
| Analytics | Queued — synced on reconnect |
Closing Thoughts
The most important insight from building this system is that offline-first is not a feature you add at the end — it's an architectural constraint that shapes every layer of the stack. The data model (sync flags, correlation hashes), the API layer (identical interface for online and offline clients), the state machine (three modes, not two), the backend (idempotent consumers, rate-limited retry) — all of it exists because the network is assumed to be unreliable from the start.
If you are building for environments where connectivity is a given, most of this complexity is unnecessary overhead. But if your software runs in a retail store in a tier-3 city in India, the network going down is not an edge case. It is Tuesday afternoon.
Written by Suraj Singh — SDE at Fynd, working on StoreOS offline infrastructure.