14 min read

Building an Offline-First POS System: Lessons from Production

Today I learned that building a production-ready POS system means solving problems you never see in tutorials: offline queuing, conflict resolution, and making vanilla JavaScript scale to 1000+ products on 2GB RAM devices.

#JavaScript #IndexedDB #Offline-First #POS #Performance #SystemDesign

December 13, 2025 • Full Day Session

Today I finished building a production-ready Point of Sale (POS) system for food trucks and restaurants. The challenge wasn't the UI or the business logic—it was making everything work reliably when WiFi drops, when multiple devices need to stay in sync, and when you're running on Android tablets with 2GB RAM. Here's what I learned.


🎯 TL;DR - What You'll Learn

  • Offline-first architecture that queues writes and syncs when connectivity returns
  • Conflict resolution strategies for multi-device scenarios
  • Performance optimizations for resource-constrained devices (2GB RAM, older ARM processors)
  • IndexedDB patterns for production-grade data persistence
  • Sync engine design with batch processing and exponential backoff
  • Print job management with retry logic and priority queuing

Reading time: ~14 minutes of real production learnings


🏗️ System Architecture Overview

Before diving into the implementation details, here's a high-level view of how the offline-first POS system fits together.


Key Design Decisions:

  • Offline-First: All writes go to IndexedDB first, then queue for sync
  • Read-Through Cache: All reads query local IndexedDB (no network latency)
  • Batch Sync: SyncEngine processes queued operations in batches of 50
  • Conflict Resolution: Last-write-wins using lastModified timestamps
  • Priority Queuing: Print jobs prioritized (urgent → normal → low)

🏗️ Part 1: The Offline-First Architecture

The Challenge

Food trucks operate in places where WiFi is unreliable or non-existent. A POS system that stops working when the network drops is unacceptable. Every order must be captured, stored locally, and synchronized when connectivity returns—without data loss or conflicts.

Pattern #1: Queue-Based Offline Writes

The Insight: When offline, writes must be queued for later sync, but users need immediate local access. Write to IndexedDB first, then queue for sync.

javascript
async write(storeName, data, operation = "add") {
  if (!data.lastModified) {
    data.lastModified = Date.now();
  }

  if (this.isOnline) {
    // Direct write when online
    return operation === "add"
      ? await this.db.add(storeName, data)
      : await this.db.update(storeName, data);
  } else {
    // Offline: write locally AND queue for sync
    await (operation === "add"
      ? this.db.add(storeName, data)
      : this.db.update(storeName, data));

    const queueItem = {
      storeName,
      data,
      operation,
      status: "pending",
      retryCount: 0,
      maxRetries: 3,
      createdAt: Date.now(),
    };

    await this.db.add("syncQueue", queueItem);
    return data; // Return immediately for UI
  }
}

Why this matters: Users see instant feedback (order saved locally), but the system tracks what needs syncing. No lost orders, even during network outages.


Pattern #2: Read-Through Caching

All reads query local IndexedDB directly—no network latency, works completely offline.

javascript
async read(storeName, id) {
  return await this.db.get(storeName, id);
}

async readAll(storeName) {
  return await this.db.getAll(storeName);
}

The Result: Sub-10ms read times, even with 10,000+ products. Network latency eliminated for reads.
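
Both patterns lean on `this.db`, a thin Promise wrapper around IndexedDB. The post doesn't show that wrapper, so here is a minimal sketch of what it might look like (the class name, `keyPath`, and version number are assumptions, not the project's actual code):

javascript
class IndexedDBWrapper {
  constructor(dbName, storeNames) {
    this.dbName = dbName;
    this.storeNames = storeNames;
    this.db = null;
  }

  open() {
    return new Promise((resolve, reject) => {
      const request = indexedDB.open(this.dbName, 1);

      request.onupgradeneeded = () => {
        const db = request.result;
        for (const name of this.storeNames) {
          if (!db.objectStoreNames.contains(name)) {
            // keyPath "id" is an assumption for this sketch
            db.createObjectStore(name, { keyPath: "id" });
          }
        }
      };

      request.onsuccess = () => {
        this.db = request.result;
        resolve(this);
      };
      request.onerror = () => reject(request.error);
    });
  }

  // Run a single request inside a transaction and promisify the result
  _tx(storeName, mode, fn) {
    return new Promise((resolve, reject) => {
      const store = this.db.transaction(storeName, mode).objectStore(storeName);
      const request = fn(store);
      request.onsuccess = () => resolve(request.result);
      request.onerror = () => reject(request.error);
    });
  }

  add(storeName, data) {
    return this._tx(storeName, "readwrite", (store) => store.add(data));
  }

  update(storeName, data) {
    return this._tx(storeName, "readwrite", (store) => store.put(data));
  }

  get(storeName, id) {
    return this._tx(storeName, "readonly", (store) => store.get(id));
  }

  getAll(storeName) {
    return this._tx(storeName, "readonly", (store) => store.getAll());
  }
}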

Pattern #3: Conflict Resolution with Last-Write-Wins

When multiple devices modify the same data, conflicts occur. Simple timestamp-based resolution:

javascript
async handleConflict(localData, remoteData) {
  const localTime = localData.lastModified || 0;
  const remoteTime = remoteData.lastModified || 0;

  return remoteTime > localTime ? remoteData : localData;
}

Real Scenario:

  • Device A updates "Burger" price $5 → $6 at 10:00:00
  • Device B updates "Burger" price $5 → $5.50 at 10:00:05
  • Result: $5.50 (Device B wins due to later timestamp)

Trade-off: Last-write-wins is simple and fast, but can lose updates. For POS systems, this is acceptable—price updates are infrequent, and the latest price is usually correct.
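
For completeness, here is a minimal sketch of where `handleConflict` fits when a record arrives from the server during sync (the method name `applyRemoteRecord` and the merge flow are assumptions for illustration):

javascript
// Sketch: applying last-write-wins when a server record arrives during sync
async applyRemoteRecord(storeName, remoteData) {
  const localData = await this.db.get(storeName, remoteData.id);

  if (!localData) {
    // No local copy yet: just store the remote record
    return await this.db.add(storeName, remoteData);
  }

  // Otherwise keep whichever side has the newer lastModified timestamp
  const winner = await this.handleConflict(localData, remoteData);
  return await this.db.update(storeName, winner);
}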

💡 Key Takeaway: Offline-first means local-first. Users should never wait for network requests to see their data.


🔄 Part 2: The Sync Engine

The Challenge

When connectivity returns, hundreds of queued operations need to sync efficiently. Naive approach: sync one-by-one (slow, many network calls). Better approach: batch processing with intelligent retry logic.

Solution: Batch Processing with Exponential Backoff

javascript
async _performSync() {
  if (this.isSyncing || !this.isOnline) return;

  this.isSyncing = true;
  const pendingItems = await this.detectChanges();

  if (pendingItems.length === 0) {
    this._completeSync(Date.now());
    return;
  }

  // Process in batches of 50
  const batches = this._createBatches(pendingItems, this.config.batchSize);

  for (const [index, batch] of batches.entries()) {
    try {
      await this._syncBatch(batch);
      this.emit("syncProgress", {
        progress: (index + 1) / batches.length,
        batch,
        total: batches.length,
      });
    } catch (error) {
      console.error("Error syncing batch:", error);
    }

    // Small delay between batches to avoid overwhelming the server
    if (index < batches.length - 1) {
      await this._sleep(BATCH_DELAY);
    }
  }

  this._completeSync(Date.now());
}
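
The helpers referenced above aren't shown in the post; here is a minimal sketch of how `_createBatches` and `_syncBatch` could look (the `/api/sync` endpoint, payload shape, and the `syncQueue` status update are assumptions for illustration):

javascript
// Split pending queue items into fixed-size chunks (config.batchSize = 50)
_createBatches(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Push one batch to the server, then mark its queue items as synced
async _syncBatch(batch) {
  const response = await fetch("/api/sync", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ operations: batch }),
  });

  if (!response.ok) {
    throw new Error(`Sync failed with HTTP ${response.status}`);
  }

  for (const item of batch) {
    item.status = "synced";
    await this.db.update("syncQueue", item);
  }
}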

Performance Impact:

  • 100 queued items: 100 individual requests → 2 batch requests
  • Network overhead reduced by ~98%
  • Sync time: 45 seconds → 3 seconds

Retry Logic with Exponential Backoff

Failed syncs retry with increasing delays:

javascript
getRetryDelay(retryCount) {
  return 1000 * Math.pow(2, retryCount);
  // Returns: 1s, 2s, 4s, 8s, 16s
}

Why exponential backoff: Network issues are often temporary. Rapid retries can overwhelm servers. Exponential backoff gives the network time to recover while ensuring eventual success.
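
Here is a sketch of how a failed queue item might be rescheduled using that delay. The field names (`retryCount`, `maxRetries`, `status`) come from the queue item in Part 1; the handler itself and the `offlineStore` reference are assumptions:

javascript
async _handleFailedItem(item, error) {
  item.retryCount += 1;

  if (item.retryCount >= item.maxRetries) {
    item.status = "failed";
    item.error = error.message;
  } else {
    item.status = "pending";
    // First retry waits 1s, then 2s, 4s, ...
    const delay = this.getRetryDelay(item.retryCount - 1);
    setTimeout(() => this._performSync(), delay);
  }

  await this.offlineStore.write("syncQueue", item, "update");
}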

💡 Key Takeaway: Batch processing isn't just about efficiency—it's about user experience. Users see progress indicators and know their data is syncing.


⚡ Part 3: Performance on Resource-Constrained Devices

The Challenge

Target hardware: Android tablets with 2GB RAM and older ARM processors. The system must handle 1000+ products, render smoothly, and run continuously for 8+ hours without memory leaks.

Optimization #1: In-Memory Indexing for O(1) Lookups

Instead of scanning arrays, build indexes:

javascript
class ProductCatalog {
  constructor() {
    this.products = [];
    this.indexes = {
      byId: new Map(),
      byCategory: new Map(),
      byName: new Map(),
    };
  }

  _buildIndexes() {
    this.products.forEach((product) => {
      this.indexes.byId.set(product.id, product);
      // Name index for exact-name lookups (normalized to lowercase)
      this.indexes.byName.set(product.name.toLowerCase(), product);

      const category = product.category || "uncategorized";
      if (!this.indexes.byCategory.has(category)) {
        this.indexes.byCategory.set(category, []);
      }
      this.indexes.byCategory.get(category).push(product);
    });
  }

  getById(id) {
    return this.indexes.byId.get(id); // O(1) lookup
  }

  getByCategory(category) {
    return this.indexes.byCategory.get(category) || []; // O(1) lookup
  }
}

Performance Comparison:

  • Array scan: O(n) - 1000 products = 1000 iterations
  • Map lookup: O(1) - 1000 products = 1 operation

Real Impact: Category filtering went from 50ms to <1ms.

Optimization #2: Debounced Search with Caching

Search queries debounced to avoid excessive computation:

javascript
search(query) {
  return new Promise((resolve) => {
    clearTimeout(this.searchDebounceTimer);

    this.searchDebounceTimer = setTimeout(() => {
      const normalizedQuery = this._normalizeString(query);

      // Check cache first
      const cached = this.cache.searchResults.get(normalizedQuery);
      if (cached && this._isCacheValid(cached.timestamp)) {
        resolve([...cached.products]);
        return;
      }

      // Perform search
      const matches = this._performSearch(normalizedQuery);

      // Cache results
      this.cache.searchResults.set(normalizedQuery, {
        products: matches,
        timestamp: Date.now(),
      });

      resolve([...matches]);
    }, 300); // 300ms debounce
  });
}

Why this matters: Without debouncing, typing "burger" triggers 6 searches (b, bu, bur, burg, burge, burger). With debouncing: 1 search after user stops typing.

Optimization #3: Virtual Scrolling for Large Lists

For 1000+ products, render only visible items:

javascript
// Render only items in viewport + buffer
const visibleStart = Math.max(0, Math.floor(scrollTop / itemHeight) - buffer);
const visibleEnd = Math.min(
  items.length,
  Math.ceil((scrollTop + viewportHeight) / itemHeight) + buffer
);

const visibleItems = items.slice(visibleStart, visibleEnd);

Memory Impact:

  • Rendering all 1000 items: ~50MB DOM nodes
  • Virtual scrolling: ~2MB DOM nodes (only 20-30 visible)

Result: Smooth 60 FPS scrolling even on older ARM processors.
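
To make that window actually render, top and bottom spacer elements keep the scrollbar height correct while only the visible rows exist in the DOM. A minimal sketch (the `renderVisibleItems` name, row template, and container markup are assumptions, not the project's actual UI code):

javascript
function renderVisibleItems(container, items, itemHeight, buffer = 5) {
  const scrollTop = container.scrollTop;
  const viewportHeight = container.clientHeight;

  const visibleStart = Math.max(0, Math.floor(scrollTop / itemHeight) - buffer);
  const visibleEnd = Math.min(
    items.length,
    Math.ceil((scrollTop + viewportHeight) / itemHeight) + buffer
  );

  // Spacer divs preserve the full scroll height without rendering every row
  const topSpacer = visibleStart * itemHeight;
  const bottomSpacer = (items.length - visibleEnd) * itemHeight;

  container.innerHTML = `
    <div style="height:${topSpacer}px"></div>
    ${items
      .slice(visibleStart, visibleEnd)
      .map(
        (item) =>
          `<div class="product-row" style="height:${itemHeight}px">${item.name}</div>`
      )
      .join("")}
    <div style="height:${bottomSpacer}px"></div>
  `;
}

// Re-render at most once per frame while scrolling
// container.addEventListener("scroll", () =>
//   requestAnimationFrame(() => renderVisibleItems(container, products, 56))
// );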

Optimization #4: Bundle Size Constraints

Target: <200KB uncompressed, <60KB gzipped.

Why Vanilla JavaScript: No framework overhead. Every kilobyte matters on resource-constrained devices.

Size Breakdown:

  • Core POS logic: ~80KB
  • Offline store + sync: ~40KB
  • UI components: ~60KB
  • Total: ~180KB uncompressed → ~55KB gzipped ✅
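
The stack uses Webpack, and one way to keep a budget like this honest is to make the build fail when it's exceeded. A minimal sketch of such a production config (entry/output paths, filenames, and exact thresholds are assumptions):

javascript
// webpack.config.js — minimal production setup
const path = require("path");

module.exports = {
  mode: "production", // enables minification and tree-shaking
  entry: "./src/index.js",
  output: {
    path: path.resolve(__dirname, "dist"),
    filename: "pos.[contenthash].js",
    clean: true,
  },
  performance: {
    // Fail the build if the 200KB uncompressed budget is exceeded
    maxAssetSize: 200 * 1024,
    maxEntrypointSize: 200 * 1024,
    hints: "error",
  },
};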

💡 Key Takeaway: Performance on constrained devices isn't about clever algorithms—it's about avoiding unnecessary work. Index what you query, cache what you search, render only what's visible.


🖨️ Part 4: Print Job Management

The Challenge

POS systems need to print receipts, kitchen tickets, and summaries. Printers fail, network connections drop, and jobs need to retry automatically with proper prioritization.

Solution: Priority Queue with Retry Logic

javascript
const JOB_PRIORITIES = {
  URGENT: "urgent",    // Customer receipts
  NORMAL: "normal",    // Kitchen tickets
  LOW: "low",          // Daily summaries
};

const PRIORITY_SCORES = {
  [JOB_PRIORITIES.URGENT]: 3,
  [JOB_PRIORITIES.NORMAL]: 2,
  [JOB_PRIORITIES.LOW]: 1,
};

async processQueue() {
  const jobs = await this.offlineStore.readAll(STORE_NAME);

  // Sort by priority (urgent first)
  jobs.sort((a, b) =>
    PRIORITY_SCORES[b.priority] - PRIORITY_SCORES[a.priority]
  );

  for (const job of jobs) {
    if (job.status !== JOB_STATUSES.PENDING) continue;

    try {
      await this._processSingleJob(job);
      job.status = JOB_STATUSES.COMPLETED;
    } catch (error) {
      job.retries++;
      if (job.retries < MAX_RETRIES) {
        job.status = JOB_STATUSES.PENDING;
        // Exponential backoff: 1s, 2s, 4s, 8s, 16s
        const delay = this._getRetryDelay(job.retries - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      } else {
        job.status = JOB_STATUSES.FAILED;
        job.error = error.message;
      }
    }

    // Persist the job's new status (completed, pending retry, or failed)
    await this.offlineStore.write(STORE_NAME, job, "update");
  }
}
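
For context, here is a minimal sketch of how a job might be enqueued with a priority; the `addJob` method name and job shape are assumptions based on the constants above:

javascript
async addJob(type, payload, priority = JOB_PRIORITIES.NORMAL) {
  const job = {
    type, // e.g. "receipt", "kitchen-ticket", "summary"
    payload,
    priority,
    status: JOB_STATUSES.PENDING,
    retries: 0,
    createdAt: Date.now(),
  };

  // Offline-first: persist the job before trying to print it
  await this.offlineStore.write(STORE_NAME, job, "add");

  // Kick the queue; if the printer is offline the job simply stays pending
  this.processQueue().catch((error) =>
    console.error("Print queue processing failed:", error)
  );

  return job;
}

// Usage: customer receipts jump the queue
// await printManager.addJob("receipt", receiptData, JOB_PRIORITIES.URGENT);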

Real Scenario: Printer offline for 2 minutes. System queues 15 print jobs. When printer reconnects:

  1. Urgent jobs (receipts) print first
  2. Normal jobs (kitchen tickets) print next
  3. Low priority jobs (summaries) print last
  4. All jobs retry automatically with backoff

Result: Zero lost print jobs, proper prioritization, automatic recovery.

💡 Key Takeaway: Print jobs are stateful operations that can fail. Design for failure: queue, retry, prioritize, persist.


📊 Part 5: Order Workflow with State Machine

The Challenge

Orders have a defined lifecycle: pending → confirmed → completed → delivered → archived. Invalid state transitions must be prevented.

Solution: Validated State Machine

javascript
const ORDER_STATUSES = {
  PENDING: "pending",
  CONFIRMED: "confirmed",
  COMPLETED: "completed",
  DELIVERED: "delivered",
  ARCHIVED: "archived",
};

const STATUS_TRANSITIONS = {
  [ORDER_STATUSES.PENDING]: [ORDER_STATUSES.CONFIRMED],
  [ORDER_STATUSES.CONFIRMED]: [ORDER_STATUSES.COMPLETED],
  [ORDER_STATUSES.COMPLETED]: [ORDER_STATUSES.DELIVERED],
  [ORDER_STATUSES.DELIVERED]: [ORDER_STATUSES.ARCHIVED],
  [ORDER_STATUSES.ARCHIVED]: [], // Terminal state
};

async updateOrderStatus(orderId, newStatus) {
  const order = await this.getOrder(orderId);
  const validTransitions = STATUS_TRANSITIONS[order.status];

  if (!validTransitions.includes(newStatus)) {
    throw new Error(
      `Invalid transition: ${order.status} → ${newStatus}`
    );
  }

  order.status = newStatus;
  order.lastModified = Date.now();
  return await this.offlineStore.write("orders", order, "update");
}

Why this matters: Prevents bugs like marking an order "delivered" when it's still "pending". State machines make invalid states unrepresentable.
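
A quick usage example, assuming an `orderManager` instance that exposes the method above:

javascript
// pending → confirmed is allowed
await orderManager.updateOrderStatus(orderId, ORDER_STATUSES.CONFIRMED);

// confirmed → delivered skips "completed" and throws:
// Error: Invalid transition: confirmed → delivered
await orderManager.updateOrderStatus(orderId, ORDER_STATUSES.DELIVERED);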


🧪 Part 6: Testing Edge Cases

Multi-Device Conflict Scenario

Test: Two devices update the same product price simultaneously while offline.

Result:

  1. Both devices save locally with different timestamps
  2. When online, sync engine detects conflict
  3. Last-write-wins resolution: newer timestamp wins
  4. Both devices converge to same state

Learning: Conflict resolution must be deterministic. Timestamp-based resolution ensures all devices reach the same conclusion.

Long-Running Session Memory Leak

Test: System runs for 8 hours, processing 500+ orders.

Issue Found: Event listeners not cleaned up, causing memory leak.

Fix:

javascript
// Before: listeners accumulate
element.addEventListener("click", handler);

// After: cleanup on destroy
element.addEventListener("click", handler);
this.cleanup = () => {
  element.removeEventListener("click", handler);
};

Result: Memory usage stable at ~85MB after 8 hours (was growing to 200MB+).
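
A small helper makes this pattern harder to forget: every listener a component adds is tracked, and one `destroy()` call removes them all. A sketch (the class name is illustrative, not from the project):

javascript
class ListenerRegistry {
  constructor() {
    this.cleanups = [];
  }

  // Register a listener and remember how to remove it
  on(target, event, handler) {
    target.addEventListener(event, handler);
    this.cleanups.push(() => target.removeEventListener(event, handler));
  }

  // Remove every listener this component registered
  destroy() {
    this.cleanups.forEach((cleanup) => cleanup());
    this.cleanups = [];
  }
}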

Network Flapping

Test: Network connects/disconnects rapidly (common in food trucks).

Issue Found: Sync engine would start/stop repeatedly, causing race conditions.

Fix: Debounce network state changes:

javascript
let networkStateTimer;
window.addEventListener("online", () => {
  clearTimeout(networkStateTimer);
  networkStateTimer = setTimeout(() => {
    this.onOnline(); // Only trigger after 2 seconds of stable connection
  }, 2000);
});
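
The offline transition can be debounced the same way, so a momentary reconnect doesn't flip the sync engine back and forth; `onOffline()` here is an assumed counterpart to `onOnline()`:

javascript
window.addEventListener("offline", () => {
  clearTimeout(networkStateTimer);
  networkStateTimer = setTimeout(() => {
    this.onOffline(); // assumed counterpart to onOnline()
  }, 2000);
});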

Result: Sync engine no longer thrashes during network flapping.

💡 Key Takeaway: Edge cases are where production systems break. Test network failures, memory leaks, and state transitions—not just happy paths.


📊 Day 2 Stats

Technical Stack Used:

  • Vanilla JavaScript (no framework dependencies)
  • IndexedDB for offline storage
  • Webpack for bundling
  • Docker for deployment

Achievements:

  • 📦 Bundle size: 180KB uncompressed, 55KB gzipped
  • ⚡ Search performance: <50ms for 1000+ products
  • 💾 Memory usage: Stable at ~85MB for 8-hour sessions
  • 🔄 Sync performance: 100 items in 3 seconds (batched)
  • 🖨️ Print job reliability: 100% success rate with retry logic
  • 📱 Device support: Android tablets with 2GB RAM

🎓 What I Learned

The Big Theme: Production systems aren't about features—they're about reliability, performance, and handling failure gracefully.

Key Principles:

  1. Offline-First = Local-First - Users should never wait for network requests
  2. Batch Everything - Network operations, sync operations, print jobs
  3. Index What You Query - O(1) lookups beat O(n) scans every time
  4. Design for Failure - Network drops, printers fail, devices conflict. Handle it all.
  5. Memory Matters - On constrained devices, every MB counts
  6. State Machines Prevent Bugs - Make invalid states unrepresentable

🚀 Production Readiness Checklist

  • ✅ Offline queuing with retry logic
  • ✅ Conflict resolution strategy
  • ✅ Batch processing for sync operations
  • ✅ In-memory indexing for fast queries
  • ✅ Virtual scrolling for large lists
  • ✅ Print job prioritization and retry
  • ✅ State machine for order workflow
  • ✅ Memory leak prevention
  • ✅ Bundle size optimization
  • ✅ Network flapping handling
  • ✅ Docker deployment setup
  • ✅ Comprehensive documentation

Drop your thoughts in the comments or reach out on LinkedIn.

Let's build systems that work when everything else fails. 🚀

— Sidharth