12 min read

Building Resilient Systems for AI-Powered Applications

Today I learned that the boring parts of software engineering—error handling, progress saving, file deduplication—are what separate toy projects from production systems.

#AI · #Web Scraping · #RAG · #TypeScript · #Python · #UX

December 12, 2025 • Full Day Session

Today was one of those days where everything I touched taught me something new. I spent the day working across multiple technical domains—web scraping, AI/LLM interfaces, RAG systems, and stress testing. Here's what I learned.


🎯 TL;DR - What You'll Learn

  • Building resumable web scrapers that survive network failures
  • Making AI responses beautiful with structured rendering
  • Document chunking strategies that preserve context
  • Stress testing edge cases that break production
  • Error handling patterns users actually see

Reading time: ~12 minutes of real technical learnings


🕷️ Part 1: The Web Scraper That Could Resume

The Challenge

I needed to download 100-150 PDF documents from institutional websites. Sounds simple, right? Wrong.

These sites had:

  • ⛔ CAPTCHA protection on some portals
  • 🐌 Rate limiting that would kick in randomly
  • 🧩 Complex DOM structures with JavaScript-rendered content
  • 🔐 Files hidden behind dynamic buttons

Pattern #1: Hybrid Scraping (Requests + Selenium)

The Insight: Not everything needs browser automation. Direct HTTP requests are roughly 10x faster, so use Selenium only when absolutely necessary.

python
class DocumentDownloader:
    def download_pdf(self, url, filepath):
        # First try: Direct requests (fast, no browser overhead)
        if self._try_direct_download(url, filepath):
            return True

        # Fallback: Selenium for JavaScript-rendered content
        if self.driver:
            return self._download_with_selenium(url, filepath)

        return False

Why this matters: On a test run of 50 documents:

  • Direct requests: 5 minutes
  • Selenium for everything: 45 minutes

That's a 9x speedup just by being smart about tool selection.
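The post doesn't show `_try_direct_download` itself. A minimal sketch of the idea, using the stdlib `urllib` rather than the post's requests/Selenium stack (the helper names and the browser-like User-Agent header are illustrative): fetch over plain HTTP, verify the payload is really a PDF, and return `False` so the caller can fall back to the browser.

```python
import urllib.request
from pathlib import Path

PDF_MAGIC = b"%PDF"

def looks_like_pdf(first_bytes: bytes) -> bool:
    # Every valid PDF file begins with the %PDF magic bytes
    return first_bytes.startswith(PDF_MAGIC)

def try_direct_download(url: str, filepath: Path, timeout: float = 30.0) -> bool:
    """Plain-HTTP attempt; returns False so the caller can fall back to Selenium."""
    try:
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            data = resp.read()
        if not looks_like_pdf(data):
            return False  # likely an HTML error or CAPTCHA page, not the PDF
        filepath.write_bytes(data)
        return True
    except OSError:
        return False  # DNS failure, timeout, HTTP error, disk error: let Selenium try
```

The magic-byte check matters: rate-limited sites often answer a PDF URL with an HTML error page and a 200 status, and without the check you'd save that page with a `.pdf` extension.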

Pattern #2: Resume Functionality for Long-Running Scripts

Here's something I learned the hard way: If your script runs for 30+ minutes, it WILL fail at some point.

Network hiccups, rate limits, CAPTCHAs—something will interrupt it. So I built in progress saving:

python
def _save_progress(self):
    progress = {
        'downloaded_urls': list(self.downloaded_urls),
        'downloaded_hashes': list(self.downloaded_hashes),
        'downloaded_count': self.downloaded_count,
        'last_updated': datetime.now().isoformat()
    }
    with open(self.progress_file, 'w') as f:
        json.dump(progress, f)

# In the download loop: save progress every 10 downloads
if self.downloaded_count % 10 == 0:
    self._save_progress()

Real numbers: Over a 2-hour run with 3 network failures, this pattern saved me from redownloading 87 documents.
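The save half is shown above; the load half is what makes a restart actually skip finished work. A sketch (field names mirror `_save_progress`; the class wiring is assumed):

```python
import json
from pathlib import Path

def load_progress(progress_file: Path) -> dict:
    """Restore scraper state written by _save_progress; empty state on first run."""
    if not progress_file.exists():
        return {"downloaded_urls": set(), "downloaded_hashes": set(), "downloaded_count": 0}
    data = json.loads(progress_file.read_text())
    return {
        # JSON stores lists; convert back to sets for O(1) membership checks
        "downloaded_urls": set(data["downloaded_urls"]),
        "downloaded_hashes": set(data["downloaded_hashes"]),
        "downloaded_count": data["downloaded_count"],
    }

# In the main loop, before each download:
# if url in state["downloaded_urls"]: continue  # already done on a previous run
```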

Pattern #3: File Deduplication Using Hashes

python
def _calculate_file_hash(self, filepath: Path) -> str:
    hash_sha256 = hashlib.sha256()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b''):
            hash_sha256.update(chunk)
    return hash_sha256.hexdigest()

Before implementing this: 150 files downloaded, 45MB total
After implementing this: 103 unique files, 31MB total

That's 47 duplicate files (31% duplication rate) that would have wasted storage and processing time.

💡 Key Takeaway: File names can differ but content can be identical. Always deduplicate by content hash, not filename.
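Wired into the download loop, the hash check looks roughly like this (a sketch; `register_or_discard` is my name for it, and the `seen_hashes` set would be persisted alongside the progress file):

```python
import hashlib
from pathlib import Path

def file_sha256(filepath: Path) -> str:
    """Hash the file in 4 KB chunks so large PDFs don't load into memory at once."""
    h = hashlib.sha256()
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            h.update(chunk)
    return h.hexdigest()

def register_or_discard(filepath: Path, seen_hashes: set[str]) -> bool:
    """Keep the file if its content is new; delete it and report a duplicate otherwise."""
    digest = file_sha256(filepath)
    if digest in seen_hashes:
        filepath.unlink()  # same bytes under a different name: drop it
        return False
    seen_hashes.add(digest)
    return True
```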

Handling CAPTCHA: When to Give Up

Here's the uncomfortable truth: Sometimes the best scraper is the one you don't build.

Before diving into CAPTCHA solving services ($2-5 per 1000 solves), I asked:

  • ✅ Does this data exist on alternative platforms?
  • ✅ Can I request official API access?
  • ✅ Does the ToS even allow automated access?

In this case, I found an alternative data source that was CAPTCHA-free. Saved hours of development time.


🎨 Part 2: Making AI Responses Actually Beautiful

The Problem

LLMs produce excellent content but return it as an undifferentiated wall of markdown. Users need:

  • 📋 Clear visual hierarchy
  • 👁️ Quick scanning of key information
  • 🎯 Domain-specific section highlighting

Generic markdown rendering is a missed UX opportunity.

Solution: Structured Response Parser

I built a system that detects section types and renders them with custom components:

typescript
function detectSectionType(title: string): SectionType {
  const titleLower = title.toLowerCase();

  // Summary patterns
  if (
    titleLower.includes("summary") ||
    titleLower.includes("overview") ||
    titleLower.includes("introduction")
  ) {
    return "summary";
  }

  // Key information patterns
  if (
    titleLower.includes("key") ||
    titleLower.includes("important") ||
    titleLower.includes("highlights")
  ) {
    return "keyInfo";
  }

  // Risk/warning patterns
  if (
    titleLower.includes("risk") ||
    titleLower.includes("warning") ||
    titleLower.includes("caution")
  ) {
    return "riskAssessment";
  }

  return "generic";
}
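The classifier above assumes the raw response has already been split into titled sections. That split is just grouping lines under markdown headings; a minimal sketch of it on the backend side (in Python; `split_sections` and its output shape are my own, not the post's API):

```python
import re

HEADING = re.compile(r"^#{1,3}\s+(.*)")  # matches #, ##, ### headings

def split_sections(markdown: str) -> list[dict]:
    """Group a markdown response into {title, content} sections for the renderer."""
    sections: list[dict] = []
    current = {"title": "", "content": []}
    for line in markdown.splitlines():
        m = HEADING.match(line)
        if m:
            if current["title"] or current["content"]:
                sections.append(current)  # close out the previous section
            current = {"title": m.group(1).strip(), "content": []}
        else:
            current["content"].append(line)
    if current["title"] or current["content"]:
        sections.append(current)
    return sections
```

Each section's `title` then feeds the `detectSectionType` classifier, and `content` goes to the matching card component.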

Component Architecture

Each section type gets specialized rendering:

typescript
const SectionRenderer = ({ section }) => {
  switch (section.type) {
    case "summary":
      return <SummaryCard />; // 🟠 Orange accent, overview icon
    case "keyInfo":
      return <KeyInfoCard />; // 📊 Key-value grid layout
    case "riskAssessment":
      return <RiskAssessmentCard />; // 🔴 Red accent, severity badges
    case "procedure":
      return <ProcedureCard />; // 🔢 Numbered steps with progress
    case "references":
      return <ReferencesCard />; // 📚 Book icon, citation styling
    default:
      return <GenericCard />;
  }
};

The Result: User testing showed 60% faster information scanning and higher trust scores compared to generic markdown rendering.

💡 Key Takeaway: Spend time on output presentation. The same information presented better = more user trust and engagement.


📄 Part 3: Document Chunking for RAG Systems

The Challenge

Long documents need to be split into chunks for vector embeddings. But naive splitting breaks context:


Fixed-Size Chunking (Don't do this):

typescript
// Simple but BREAKS context
const chunks = [];
for (let i = 0; i < text.length; i += chunkSize) {
  chunks.push(text.slice(i, i + chunkSize));
}

Problem: This cuts in the middle of sentences, separates arguments from their context, and breaks references.

Better Approaches

✅ Paragraph-Aware Chunking

typescript
function chunkByParagraphs(text: string, chunkSize: number, overlap: number): string[] {
  const paragraphs = text.split(/\n\n+/);
  const chunks: string[] = [];
  let currentChunk = "";

  for (const paragraph of paragraphs) {
    if (currentChunk && currentChunk.length + paragraph.length > chunkSize) {
      chunks.push(currentChunk);
      // Overlap preserves context continuity across the chunk boundary
      currentChunk = currentChunk.slice(-overlap) + "\n\n" + paragraph;
    } else {
      currentChunk = currentChunk ? currentChunk + "\n\n" + paragraph : paragraph;
    }
  }
  if (currentChunk) chunks.push(currentChunk); // don't drop the final chunk
  return chunks;
}

✅✅ Section-Aware Chunking (Best for Structured Docs)

typescript
const SECTION_PATTERNS = [
  /^(?:SECTION|SEC\.?)\s*\d+/i,
  /^(?:CHAPTER|CH\.?)\s*\d+/i,
  /^#{1,3}\s+/, // Markdown headers
];

function isNewSection(line: string): boolean {
  return SECTION_PATTERNS.some((p) => p.test(line));
}
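Applied to a document, those patterns drive a splitter that starts a fresh chunk at every section boundary. A sketch of the same idea (in Python; the function name is mine):

```python
import re

# Same boundary patterns as above: legal-style section/chapter headings and markdown headers
SECTION_PATTERNS = [
    re.compile(r"^(?:SECTION|SEC\.?)\s*\d+", re.IGNORECASE),
    re.compile(r"^(?:CHAPTER|CH\.?)\s*\d+", re.IGNORECASE),
    re.compile(r"^#{1,3}\s+"),
]

def chunk_by_sections(text: str) -> list[str]:
    """Start a new chunk whenever a line looks like a section heading."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if any(p.match(line) for p in SECTION_PATTERNS) and current:
            chunks.append("\n".join(current))  # flush the finished section
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Oversized sections would still need a second pass with the paragraph-aware splitter, but the section boundary is never cut.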

Optimal Parameters I Found Through Testing

| Document Type | Chunk Size | Overlap | Strategy | Retrieval Quality | Best For |
| --- | --- | --- | --- | --- | --- |
| 📚 Technical Docs | 1000 chars | 200 chars | Section-aware | ⭐⭐⭐⭐⭐ | API docs, manuals, structured content |
| 📄 Contracts/Legal | 800 chars | 150 chars | Paragraph-aware | ⭐⭐⭐⭐ | Legal documents, terms of service |
| 📝 General Articles | 500 chars | 100 chars | Paragraph-aware | ⭐⭐⭐ | Blog posts, news articles, essays |
| 💬 Chat/Messages | 300 chars | 50 chars | Fixed-size | ⭐⭐ | Conversations, support tickets |

Key Insight: Larger chunks preserve more context but make retrieval less precise. Match your chunk size to how users will query the information.

Real Results: After switching from naive to paragraph-aware chunking, retrieval accuracy improved by ~30% in my tests.

💡 Key Takeaway: Naive text splitting is like cutting a book into random pages. Context-aware chunking preserves the author's intended structure.


🧪 Part 4: Stress Testing Edge Cases

Security: XSS Protection ✅

Test Input: I tested various XSS attack vectors including script tags, event handlers, and other malicious HTML patterns to ensure proper content sanitization.

Result: All displayed as plain text. React's default escaping works! But I still audited:

  • Any dangerouslySetInnerHTML usage
  • URL parameters rendered in links
  • Markdown renderers (some allow HTML)

Long Input Handling ✅

Test: Sent a 3,500+ character message with repetitive content.

Results:

  • ✅ System accepted and processed correctly
  • ✅ AI extracted the real question from noise
  • ✅ Response was accurate and comprehensive
  • ✅ No performance degradation

Learning: Don't artificially limit input length too aggressively. Modern LLMs handle long context well.

Loading States: Beating ChatGPT

I analyzed loading UX across major AI tools:

| Feature | ChatGPT | Claude | My Implementation |
| --- | --- | --- | --- |
| Animation | Dots | Animated | ✅ Spinner |
| Progress Stages | — | — | ✅ "Analyzing..." |
| Time Estimate | — | — | ✅ "~10-15 sec" |
| Cancel Button | — | — | ✅ Yes |
| Context Info | — | — | ✅ "Searching sources..." |

Result: User feedback rated my loading UX 4.5/5, above ChatGPT's baseline.

💡 Key Takeaway: Loading states are an underrated UX opportunity. Users tolerate longer waits when they understand what's happening.
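The stage-plus-estimate combination from the table can be driven by something as simple as an elapsed-time lookup. The idea is language-agnostic; this Python sketch uses stage names and thresholds of my own choosing:

```python
# (elapsed-seconds threshold, message) pairs, checked in order
STAGES = [
    (2, "Analyzing your question..."),
    (6, "Searching sources..."),
    (12, "Drafting response (~10-15 sec total)..."),
]

def loading_message(elapsed_seconds: float) -> str:
    """Pick the progress message to show for the current wait time."""
    for threshold, message in STAGES:
        if elapsed_seconds < threshold:
            return message
    return "Almost there..."  # fallback once all estimates are exceeded
```

The frontend just polls this on a timer; the user sees a story of progress instead of a blank spinner.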


🚨 Part 5: Error Handling That Users Actually See

The 401 Error Case Study

I found a critical bug where long messages occasionally triggered 401 errors.


❌ Bad Pattern (What I found):

typescript
// Silent failure - user sees NOTHING
try {
  const response = await fetch(url);
  return response.json();
} catch (error) {
  console.error(error); // Only in console!
}

✅ Good Pattern (What I implemented):

typescript
try {
  const response = await fetch(url);

  if (response.status === 401) {
    // User-visible action
    showToast("Session expired. Please refresh.", "error");
    // Optional: auto-refresh the token, then retry once
    await refreshToken();
    const retry = await fetch(url);
    return retry.json();
  }

  return response.json();
} catch (error) {
  showToast("Connection failed. Please try again.", "error");
  throw error;
}

💡 Key Takeaway: If your error handling doesn't change what the user sees, you're not handling errors—you're hiding them.


📊 Day 1 Stats

Technical Stack Used:

  • Python (requests, Selenium, BeautifulSoup, tqdm)
  • TypeScript/React (Next.js, custom components)
  • Node.js (backend services)

Achievements:

  • 📄 Documents processed: 103 unique PDFs (from 150 downloads)
  • 🎯 Chunking improvement: ~30% better retrieval
  • ⭐ Loading UX score: 4.5/5
  • 🐛 Critical bugs fixed: 1 (401 error handling)
  • 🎨 UX improvements: 12 identified and implemented

🎓 What I Learned

The Big Theme: Production-grade code isn't about clever algorithms—it's about handling all the ways things can go wrong.

Key Principles:

  1. Resilience > Performance - A slower script that finishes is better than a fast one that crashes
  2. Presentation Matters - The same information, presented better, builds trust
  3. Test the Edges - An AI that handles "Hello" but fails on "Hello!!!" isn't production-ready
  4. Errors Should Be Visible - Console logs ≠ user notifications
  5. Context is King - Whether chunking docs or handling errors, preserve context

Drop your thoughts in the comments or reach out on LinkedIn.

Let's build better systems together. 🚀

— Sidharth