Building Resilient Systems for AI-Powered Applications
Today I learned that the boring parts of software engineering—error handling, progress saving, file deduplication—are what separate toy projects from production systems.
December 12, 2025 • Full Day Session
Today was one of those days where everything I touched taught me something new. I spent the day working across multiple technical domains—web scraping, AI/LLM interfaces, RAG systems, and stress testing. Here's what I learned.
🎯 TL;DR - What You'll Learn
- Building resumable web scrapers that survive network failures
- Making AI responses beautiful with structured rendering
- Document chunking strategies that preserve context
- Stress testing edge cases that break production
- Error handling patterns users actually see
Reading time: 8 minutes of real technical learnings
🕷️ Part 1: The Web Scraper That Could Resume
The Challenge
I needed to download 100-150 PDF documents from institutional websites. Sounds simple, right? Wrong.
These sites had:
- ⛔ CAPTCHA protection on some portals
- 🐌 Rate limiting that would kick in randomly
- 🧩 Complex DOM structures with JavaScript-rendered content
- 🔐 Files hidden behind dynamic buttons
Pattern #1: Hybrid Scraping (Requests + Selenium)
The Insight: Not everything needs browser automation. Direct HTTP requests are 10x faster. Use Selenium only when absolutely necessary.
```python
class DocumentDownloader:
    def download_pdf(self, url, filepath):
        # First try: direct requests (fast, no browser overhead)
        if self._try_direct_download(url, filepath):
            return True
        # Fallback: Selenium for JavaScript-rendered content
        if self.driver:
            return self._download_with_selenium(url, filepath)
        return False
```

Why this matters: On a test run of 50 documents:
- Direct requests: 5 minutes
- Selenium for everything: 45 minutes
That's a 9x speedup just by being smart about tool selection.
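The class above only shows the dispatch, not the fast path itself. For reference, here is a minimal sketch of what the direct download could look like; the timeout, Content-Type check, and streaming write are my assumptions, not the original implementation.

```python
import requests

class DocumentDownloader:
    # ... rest of the class as above

    def _try_direct_download(self, url, filepath) -> bool:
        """Plain HTTP download; returns False so the caller can fall back to Selenium."""
        try:
            response = requests.get(url, timeout=30, stream=True)
            response.raise_for_status()
            # Assumption: the server labels PDFs correctly; skip anything else
            if "pdf" not in response.headers.get("Content-Type", "").lower():
                return False
            with open(filepath, "wb") as f:
                for chunk in response.iter_content(chunk_size=8192):
                    f.write(chunk)
            return True
        except requests.RequestException:
            return False
```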
Pattern #2: Resume Functionality for Long-Running Scripts
Here's something I learned the hard way: If your script runs for 30+ minutes, it WILL fail at some point.
Network hiccups, rate limits, CAPTCHAs—something will interrupt it. So I built in progress saving:
```python
# requires: import json / from datetime import datetime
def _save_progress(self):
    progress = {
        'downloaded_urls': list(self.downloaded_urls),
        'downloaded_hashes': list(self.downloaded_hashes),
        'downloaded_count': self.downloaded_count,
        'last_updated': datetime.now().isoformat()
    }
    with open(self.progress_file, 'w') as f:
        json.dump(progress, f)

# Inside the download loop: save progress every 10 downloads
if download_count % 10 == 0:
    self._save_progress()
```

Real numbers: Over a 2-hour run with 3 network failures, this pattern saved me from redownloading 87 documents.
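The other half of resume is reading that file back on startup. A minimal sketch, assuming the same attribute names as the save code; the error handling is mine:

```python
import json
from pathlib import Path

def _load_progress(self):
    """Restore state from a previous run so finished downloads are skipped on restart."""
    if not Path(self.progress_file).exists():
        return
    try:
        with open(self.progress_file) as f:
            progress = json.load(f)
        self.downloaded_urls = set(progress.get('downloaded_urls', []))
        self.downloaded_hashes = set(progress.get('downloaded_hashes', []))
        self.downloaded_count = progress.get('downloaded_count', 0)
    except (OSError, json.JSONDecodeError):
        # A corrupt progress file just means starting fresh
        pass
```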
Pattern #3: File Deduplication Using Hashes
```python
# requires: import hashlib / from pathlib import Path
def _calculate_file_hash(self, filepath: Path) -> str:
    hash_sha256 = hashlib.sha256()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b''):
            hash_sha256.update(chunk)
    return hash_sha256.hexdigest()
```

Before implementing this: 150 files downloaded, 45MB total
After implementing this: 103 unique files, 31MB total
That's 47 duplicate files (31% duplication rate) that would have wasted storage and processing time.
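The hash only pays off if the download loop actually checks it. Here is a minimal sketch of that guard, reusing the names from the snippets above; the delete-on-duplicate behaviour is my assumption:

```python
from pathlib import Path

def _is_duplicate(self, filepath: Path) -> bool:
    """Skip (and delete) a freshly downloaded file whose content was already seen."""
    file_hash = self._calculate_file_hash(filepath)
    if file_hash in self.downloaded_hashes:
        filepath.unlink()  # same bytes under a different name
        return True
    self.downloaded_hashes.add(file_hash)
    return False
```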
💡 Key Takeaway: File names can differ but content can be identical. Always deduplicate by content hash, not filename.
Handling CAPTCHA: When to Give Up
Here's the uncomfortable truth: Sometimes the best scraper is the one you don't build.
Before diving into CAPTCHA solving services ($2-5 per 1000 solves), I asked:
- ✅ Does this data exist on alternative platforms?
- ✅ Can I request official API access?
- ✅ Does the ToS even allow automated access?
In this case, I found an alternative data source that was CAPTCHA-free. Saved hours of development time.
🎨 Part 2: Making AI Responses Actually Beautiful
The Problem
LLMs produce excellent content but present it as plain text with markdown. Users need:
- 📋 Clear visual hierarchy
- 👁️ Quick scanning of key information
- 🎯 Domain-specific section highlighting
Generic markdown rendering is a missed UX opportunity.
Solution: Structured Response Parser
I built a system that detects section types and renders them with custom components:
```typescript
// Inferred from the section types used in this post
type SectionType =
  | "summary"
  | "keyInfo"
  | "riskAssessment"
  | "issues"
  | "procedure"
  | "references"
  | "generic";

function detectSectionType(title: string): SectionType {
  const titleLower = title.toLowerCase();

  // Summary patterns
  if (
    titleLower.includes("summary") ||
    titleLower.includes("overview") ||
    titleLower.includes("introduction")
  ) {
    return "summary";
  }

  // Key information patterns
  if (
    titleLower.includes("key") ||
    titleLower.includes("important") ||
    titleLower.includes("highlights")
  ) {
    return "keyInfo";
  }

  // Risk/warning patterns
  if (
    titleLower.includes("risk") ||
    titleLower.includes("warning") ||
    titleLower.includes("caution")
  ) {
    return "riskAssessment";
  }

  return "generic";
}
```

Component Architecture
Each section type gets specialized rendering:
```typescript
const SectionRenderer = ({ section }) => {
  switch (section.type) {
    case "summary":
      return <SummaryCard />;    // 🟠 Orange accent, overview icon
    case "keyInfo":
      return <KeyInfoCard />;    // 📊 Key-value grid layout
    case "issues":
      return <IssuesCard />;     // 🔴 Red accent, severity badges
    case "procedure":
      return <ProcedureCard />;  // 🔢 Numbered steps with progress
    case "references":
      return <ReferencesCard />; // 📚 Book icon, citation styling
    default:
      return <GenericCard />;
  }
};
```

The Result: User testing showed 60% faster information scanning and higher trust scores compared to generic markdown rendering.
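To connect the two pieces, here is a minimal sketch of how a markdown response could be split into typed sections before rendering. The parseResponse name and the Section shape are illustrative assumptions, not the original API:

```typescript
interface Section {
  type: SectionType;
  title: string;
  content: string;
}

// Split a markdown response on headings and classify each block with detectSectionType.
// Text before the first heading is ignored in this sketch.
function parseResponse(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section | null = null;

  for (const line of markdown.split("\n")) {
    const heading = line.match(/^#{1,3}\s+(.*)/);
    if (heading) {
      if (current) sections.push(current);
      current = { type: detectSectionType(heading[1]), title: heading[1], content: "" };
    } else if (current) {
      current.content += line + "\n";
    }
  }
  if (current) sections.push(current);
  return sections;
}
```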
💡 Key Takeaway: Spend time on output presentation. The same information presented better = more user trust and engagement.
📄 Part 3: Document Chunking for RAG Systems
The Challenge
Long documents need to be split into chunks for vector embeddings. But naive splitting breaks context:
❌ Fixed-Size Chunking (Don't do this):
```typescript
// Simple but BREAKS context
const chunks = [];
for (let i = 0; i < text.length; i += chunkSize) {
  chunks.push(text.slice(i, i + chunkSize));
}
```

Problem: This cuts in the middle of sentences, separates arguments from their context, and breaks references.
Better Approaches
✅ Paragraph-Aware Chunking
```typescript
// Class method on the chunking service
private chunkByParagraphs(text: string, chunkSize: number, overlap: number): string[] {
  const paragraphs = text.split(/\n\n+/);
  const chunks: string[] = [];
  let currentChunk = "";

  for (const paragraph of paragraphs) {
    if (currentChunk && currentChunk.length + paragraph.length > chunkSize) {
      chunks.push(currentChunk);
      // Overlap preserves context continuity across the boundary
      currentChunk = currentChunk.slice(-overlap) + "\n\n" + paragraph;
    } else {
      currentChunk = currentChunk ? currentChunk + "\n\n" + paragraph : paragraph;
    }
  }
  if (currentChunk) {
    chunks.push(currentChunk); // don't drop the final partial chunk
  }
  return chunks;
}
```

✅✅ Section-Aware Chunking (Best for Structured Docs)
```typescript
const SECTION_PATTERNS = [
  /^(?:SECTION|SEC\.?)\s*\d+/i,
  /^(?:CHAPTER|CH\.?)\s*\d+/i,
  /^#{1,3}\s+/, // Markdown headers
];

function isNewSection(line: string): boolean {
  return SECTION_PATTERNS.some((p) => p.test(line));
}
```
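The pattern list only detects boundaries. Here is a minimal sketch of a splitter built on isNewSection; the merging step and the chunkBySections name are my assumptions, not the original code:

```typescript
// Group lines into sections, then pack sections into chunks of roughly chunkSize characters
function chunkBySections(text: string, chunkSize: number): string[] {
  const sections: string[] = [];
  let current: string[] = [];

  for (const line of text.split("\n")) {
    if (isNewSection(line) && current.length > 0) {
      sections.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) sections.push(current.join("\n"));

  // Merge small neighbouring sections so chunks stay close to the target size
  const chunks: string[] = [];
  let buffer = "";
  for (const section of sections) {
    if (buffer && buffer.length + section.length > chunkSize) {
      chunks.push(buffer);
      buffer = section;
    } else {
      buffer = buffer ? buffer + "\n" + section : section;
    }
  }
  if (buffer) chunks.push(buffer);
  return chunks;
}
```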
Optimal Parameters I Found Through Testing
| Document Type | Chunk Size | Overlap | Strategy | Retrieval Quality | Best For |
|---|---|---|---|---|---|
| 📚 Technical Docs | 1000 chars | 200 chars | Section-aware | ⭐⭐⭐⭐⭐ | API docs, manuals, structured content |
| 📄 Contracts/Legal | 800 chars | 150 chars | Paragraph-aware | ⭐⭐⭐⭐ | Legal documents, terms of service |
| 📝 General Articles | 500 chars | 100 chars | Paragraph-aware | ⭐⭐⭐ | Blog posts, news articles, essays |
| 💬 Chat/Messages | 300 chars | 50 chars | Fixed-size | ⭐⭐ | Conversations, support tickets |
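One way to keep those settings in one place is a small config map keyed by document type. The numbers mirror the table above; the shape and names are my assumptions:

```typescript
type ChunkStrategy = "section-aware" | "paragraph-aware" | "fixed-size";

interface ChunkConfig {
  chunkSize: number; // characters
  overlap: number;   // characters
  strategy: ChunkStrategy;
}

// Values taken from the table above
const CHUNK_CONFIGS: Record<string, ChunkConfig> = {
  technicalDocs: { chunkSize: 1000, overlap: 200, strategy: "section-aware" },
  legal:         { chunkSize: 800,  overlap: 150, strategy: "paragraph-aware" },
  articles:      { chunkSize: 500,  overlap: 100, strategy: "paragraph-aware" },
  chat:          { chunkSize: 300,  overlap: 50,  strategy: "fixed-size" },
};
```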
Key Insight: Larger chunks preserve more context but make retrieval less precise. Match your chunk size to how users will query the information.
Real Results: After switching from naive to paragraph-aware chunking, retrieval accuracy improved by ~30% in my tests.
💡 Key Takeaway: Naive text splitting is like cutting a book into random pages. Context-aware chunking preserves the author's intended structure.
🧪 Part 4: Stress Testing Edge Cases
Security: XSS Protection ✅
Test Input: I tested various XSS attack vectors including script tags, event handlers, and other malicious HTML patterns to ensure proper content sanitization.
Result: All displayed as plain text. React's default escaping works! But I still audited:
- Any dangerouslySetInnerHTML usage
- URL parameters rendered in links
- Markdown renderers (some allow HTML)
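Where dangerouslySetInnerHTML is genuinely needed (for example, a markdown renderer that emits raw HTML), sanitizing first is the usual mitigation. A minimal sketch using DOMPurify, which is my library choice rather than something from the original stack:

```typescript
import DOMPurify from "dompurify";

// Only ever pass sanitized HTML to dangerouslySetInnerHTML
function SafeHtml({ html }: { html: string }) {
  const clean = DOMPurify.sanitize(html);
  return <div dangerouslySetInnerHTML={{ __html: clean }} />;
}
```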
Long Input Handling ✅
Test: Sent a 3,500+ character message with repetitive content.
Results:
- ✅ System accepted and processed correctly
- ✅ AI extracted the real question from noise
- ✅ Response was accurate and comprehensive
- ✅ No performance degradation
Learning: Don't artificially limit input length too aggressively. Modern LLMs handle long context well.
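If you do want a guard, a soft warning beats a hard cap. A minimal sketch; the 8,000-character threshold is an arbitrary value I'm assuming for illustration:

```typescript
const SOFT_LIMIT = 8000; // characters; warn, don't block

function validateMessage(text: string): { ok: boolean; warning?: string } {
  if (text.trim().length === 0) {
    return { ok: false, warning: "Message is empty." };
  }
  if (text.length > SOFT_LIMIT) {
    // Let it through, but tell the user the model may truncate context
    return { ok: true, warning: "This is a long message; the response may skip some details." };
  }
  return { ok: true };
}
```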
Loading States: Beating ChatGPT
I analyzed loading UX across major AI tools:
| Feature | ChatGPT | Claude | My Implementation |
|---|---|---|---|
| Animation | Dots | Animated | ✅ Spinner |
| Progress Stages | ❌ | ❌ | ✅ "Analyzing..." |
| Time Estimate | ❌ | ❌ | ✅ "~10-15 sec" |
| Cancel Button | ❌ | ✅ | ✅ Yes |
| Context Info | ❌ | ❌ | ✅ "Searching sources..." |
Result: User feedback rated my loading UX 4.5/5, above ChatGPT's baseline.
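For reference, a stripped-down sketch of a staged loading indicator; the stage copy, timings, and component name are illustrative assumptions, not the production code:

```typescript
import { useEffect, useState } from "react";

// Assumed stage copy; the real wording and timings come from the product
const STAGES = [
  "Analyzing your question...",
  "Searching sources...",
  "Drafting a response... (~10-15 sec)",
];

function LoadingIndicator({ onCancel }: { onCancel: () => void }) {
  const [stage, setStage] = useState(0);

  useEffect(() => {
    // Advance the stage label every few seconds so the user sees progress
    const timer = setInterval(
      () => setStage((s) => Math.min(s + 1, STAGES.length - 1)),
      4000
    );
    return () => clearInterval(timer);
  }, []);

  return (
    <div role="status">
      <span className="spinner" aria-hidden />
      <p>{STAGES[stage]}</p>
      <button onClick={onCancel}>Cancel</button>
    </div>
  );
}
```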
💡 Key Takeaway: Loading states are an underrated UX opportunity. Users tolerate longer waits when they understand what's happening.
🚨 Part 5: Error Handling That Users Actually See
The 401 Error Case Study
I found a critical bug where long messages occasionally triggered 401 errors.
❌ Bad Pattern (What I found):
```typescript
// Silent failure - user sees NOTHING
try {
  const response = await fetch(url);
  return response.json();
} catch (error) {
  console.error(error); // Only in console!
}
```

✅ Good Pattern (What I implemented):
```typescript
try {
  const response = await fetch(url);
  if (response.status === 401) {
    // User-visible action
    showToast("Session expired. Please refresh.", "error");
    // Optional: auto-refresh token
    await refreshToken();
    // Retry once and parse the retried response
    const retry = await fetch(url);
    return retry.json();
  }
  return response.json();
} catch (error) {
  showToast("Connection failed. Please try again.", "error");
  throw error;
}
```

💡 Key Takeaway: If your error handling doesn't change what the user sees, you're not handling errors; you're hiding them.
📊 Day 1 Stats
Technical Stack Used:
- Python (requests, Selenium, BeautifulSoup, tqdm)
- TypeScript/React (Next.js, custom components)
- Node.js (backend services)
Achievements:
- 📄 Documents processed: 100-150 PDFs
- 🎯 Chunking improvement: ~30% better retrieval
- ⭐ Loading UX score: 4.5/5
- 🐛 Critical bugs fixed: 1 (401 error handling)
- 🎨 UX improvements: 12 identified and implemented
🎓 What I Learned
The Big Theme: Production-grade code isn't about clever algorithms—it's about handling all the ways things can go wrong.
Key Principles:
- Resilience > Performance - A slower script that finishes is better than a fast one that crashes
- Presentation Matters - The same information, presented better, builds trust
- Test the Edges - An AI that handles "Hello" but fails on "Hello!!!" isn't production-ready
- Errors Should Be Visible - Console logs ≠ user notifications
- Context is King - Whether chunking docs or handling errors, preserve context
Drop your thoughts in the comments or reach out on LinkedIn.
Let's build better systems together. 🚀
— Sidharth