50 FDE Interview Problems: Decomposition and Case Studies
The decomposition interview is the signature FDE interview format. You're given a vague business problem and must break it into technical components, propose an architecture, and discuss trade-offs — all while communicating clearly.
These problems are organized by industry. Each includes a scenario; Problems 1, 6, 11, and 31 are worked examples that add key questions to ask and a suggested approach.
How to Approach Decomposition Problems
The Framework (5 steps, 45 minutes)
- Clarify (5 min) — Ask questions. Understand the customer, constraints, and success metrics.
- Decompose (10 min) — Break the problem into 3-5 sub-problems.
- Prioritize (5 min) — Which sub-problem delivers the most value first?
- Design (15 min) — Architect the solution. Draw diagrams. Discuss data flow.
- Trade-offs (10 min) — What could go wrong? What would you do differently with more time?
Logistics & Supply Chain (Problems 1-5)
Problem 1: Package Routing Optimization
Scenario: A shipping company delivers 500K packages daily across 200 cities. They want to reduce delivery times by 15%.
Key questions to ask:
- What data do they currently collect? (GPS, timestamps, weather, traffic)
- What's their current routing system? (Manual? Basic algorithm?)
- What does "delivery time" mean? (Warehouse to door? Last mile only?)
Approach:
- Sub-problems: (1) Data ingestion from GPS/IoT, (2) Route optimization algorithm, (3) Real-time re-routing, (4) Performance monitoring dashboard
- Start with: Historical data analysis to identify bottleneck routes
- Tech: Graph optimization, real-time streaming (Kafka), geospatial queries (PostGIS)
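To make the "graph optimization" and "real-time re-routing" sub-problems concrete, here is a minimal sketch using Dijkstra's shortest-path algorithm over a toy road graph. The node names, travel times, and the `shortest_route` helper are all hypothetical; a real system would likely need a vehicle-routing solver and live traffic feeds rather than single-pair shortest paths.

```python
import heapq

def shortest_route(graph, source, target):
    """Dijkstra's algorithm over a dict-of-dicts graph of travel minutes."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, minutes in graph.get(node, {}).items():
            candidate = cost + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]

# Hypothetical road segments with travel times in minutes.
roads = {
    "depot": {"A": 10, "B": 15},
    "A": {"customer": 20},
    "B": {"customer": 10},
}
route, minutes = shortest_route(roads, "depot", "customer")

# Simulate a live traffic update on one segment, then re-route.
roads["B"]["customer"] = 30
rerouted, rerouted_minutes = shortest_route(roads, "depot", "customer")
```

In an interview, the point of a sketch like this is to show where the streaming layer plugs in: a traffic event updates an edge weight, and affected routes are recomputed.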
Problem 2: Warehouse Inventory Prediction
Scenario: An e-commerce company has 15 warehouses. They over-stock 30% of items and under-stock 20%. Design a system to predict optimal inventory levels.
Problem 3: Fleet Maintenance Scheduling
Scenario: A trucking company has 3,000 vehicles. They want to predict maintenance needs to reduce breakdowns by 50%.
Problem 4: Port Container Tracking
Scenario: A shipping port processes 10,000 containers daily. Containers get lost or delayed. Build a real-time tracking system.
Problem 5: Last-Mile Delivery Optimization
Scenario: A grocery delivery service operates in 5 cities. Each driver makes 20-30 deliveries per shift. Optimize driver assignment and routing.
Healthcare (Problems 6-10)
Problem 6: Patient Readmission Prediction
Scenario: A hospital network wants to reduce 30-day readmission rates. They have 5 years of patient records across 12 hospitals.
Key questions to ask:
- What data is available? (EHR, labs, medications, demographics)
- HIPAA constraints? Data residency requirements?
- What interventions are possible if we predict high risk?
Approach:
- Sub-problems: (1) Data pipeline from EHR systems (HL7/FHIR), (2) Feature engineering, (3) ML model (XGBoost or similar), (4) Clinical dashboard for care teams, (5) Feedback loop for model improvement
- Start with: Retrospective analysis on historical readmissions
- Critical: Model explainability — clinicians need to understand WHY a patient is flagged
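One way to illustrate the explainability point is a scoring function that returns the reasons alongside the risk. The sketch below swaps in a plain logistic score instead of XGBoost so that each feature's contribution is easy to read; the feature names, coefficients, and intercept are made up for illustration and would be learned from historical data in practice.

```python
import math

# Hypothetical coefficients; a real model would learn these from patient records.
WEIGHTS = {"prior_admissions": 0.6, "num_medications": 0.1, "lives_alone": 0.5}
INTERCEPT = -3.0

def explain_risk(patient):
    """Return readmission probability and per-feature contributions, largest first."""
    contributions = {name: WEIGHTS[name] * patient[name] for name in WEIGHTS}
    logit = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return probability, ranked

# Hypothetical patient record.
patient = {"prior_admissions": 3, "num_medications": 8, "lives_alone": 1}
probability, reasons = explain_risk(patient)
top_reason = reasons[0][0]  # the feature that pushed the score up the most
```

A clinical dashboard could show the top two or three entries of `reasons` next to the flag, which gives care teams something to act on. With a tree-based model such as XGBoost, a feature-attribution method would play the same role.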
Problem 7: Medical Image Triage
Scenario: A radiology department processes 500 scans daily. They want AI to prioritize urgent cases.
Problem 8: Drug Interaction Alert System
Scenario: A pharmacy chain wants real-time alerts when prescriptions have dangerous interactions.
Problem 9: Clinical Trial Patient Matching
Scenario: A pharma company has 50 active clinical trials. They need to match eligible patients faster.
Problem 10: Hospital Bed Capacity Planning
Scenario: A 500-bed hospital frequently runs at 95%+ capacity. Design a prediction system for bed availability.
Financial Services (Problems 11-15)
Problem 11: Fraud Detection Pipeline
Scenario: A fintech processes 10M transactions daily. Current fraud detection catches 60% of fraudulent transactions with a 5% false positive rate. Improve both metrics.
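Before designing anything, it helps to turn the scenario's percentages into daily volumes. The sketch below is a back-of-the-envelope calculation under two assumptions that the scenario does not state: a fraud rate of 0.1% of transactions, and "false positive rate" meaning the share of legitimate transactions that get flagged. Both are worth confirming with the interviewer.

```python
def daily_alert_volume(transactions, fraud_rate, recall, false_positive_rate):
    """Estimate true alerts, false alerts, and alert precision per day."""
    fraud = transactions * fraud_rate
    legitimate = transactions - fraud
    true_alerts = fraud * recall
    false_alerts = legitimate * false_positive_rate
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision

# fraud_rate=0.001 is an assumption for illustration; the scenario gives no base rate.
caught, false_alarms, precision = daily_alert_volume(
    transactions=10_000_000,
    fraud_rate=0.001,
    recall=0.60,
    false_positive_rate=0.05,
)
```

Under these assumptions, roughly 6,000 fraudulent transactions are caught per day against about 499,500 false alarms, so only around 1% of alerts are real fraud. That framing makes it clear why cutting false positives may matter as much to the customer as raising the catch rate.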
Key questions to ask:
- Latency requirements? (Real-time blocking vs. post-transaction review?)
- What data is available? (Transaction details, device info, user behavior, merchant data)
- What happens when fraud is detected? (Block, flag, require verification?)
Approach:
- Sub-problems: (1) Real-time feature computation, (2) ML scoring pipeline, (3) Rules engine for known patterns, (4) Investigation dashboard, (5) Feedback loop from investigators
- Architecture: Streaming pipeline (Kafka → feature store → model serving → decision engine)
- Key trade-off: Latency vs. accuracy. Adding more features can improve detection but increases scoring latency.
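The real-time feature computation and decision engine steps can be sketched in a few lines. Everything here is hypothetical: the `VelocityFeature` class, the `decide` function, and the thresholds are illustrative stand-ins, and the model score is passed in as a plain number rather than coming from a served model. The sketch also assumes events for a card arrive in timestamp order.

```python
from collections import defaultdict, deque

class VelocityFeature:
    """Counts each card's transactions inside a sliding time window."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)

    def update(self, card_id, timestamp):
        """Record a transaction and return the count within the window."""
        events = self._events[card_id]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        return len(events)

def decide(model_score, velocity, block_at=0.9, review_at=0.6, max_velocity=5):
    """Combine a rule (velocity cap) with an ML score into one decision."""
    if velocity > max_velocity or model_score >= block_at:
        return "block"
    if model_score >= review_at:
        return "verify"
    return "allow"

velocity = VelocityFeature(window_seconds=60)
counts = [velocity.update("card-1", t) for t in (0, 30, 61, 200)]
decision = decide(model_score=0.7, velocity=counts[-1])
```

Keeping the feature in memory like this keeps lookups fast, which is one side of the latency trade-off; a production feature store would add persistence and sharing across services at some latency cost.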
Problem 12: Know Your Customer (KYC) Automation
Scenario: A bank spends 45 minutes on average per KYC check. They want to automate 80% of checks.
Problem 13: Portfolio Risk Dashboard
Scenario: An investment firm manages $5B across 200 portfolios. Build a real-time risk monitoring system.
Problem 14: Loan Default Prediction
Scenario: A lending platform wants to predict loan defaults at application time.
Problem 15: Anti-Money Laundering (AML) Graph Analysis
Scenario: A bank needs to detect suspicious transaction networks across 50M accounts.
Defense & Government (Problems 16-20)
Problem 16: Satellite Imagery Analysis
Scenario: A defense agency receives 10TB of satellite imagery daily. They need to detect changes (new construction, vehicle movement) automatically.
Problem 17: Cybersecurity Threat Intelligence
Scenario: A government network operations center monitors 500K endpoints. They want to reduce mean time to detect threats from 72 hours to 4 hours.
Problem 18: Disaster Response Resource Allocation
Scenario: After a natural disaster, coordinate rescue teams, supplies, and medical resources across an affected region.
Problem 19: Border Surveillance System
Scenario: Monitor 500 miles of border using a combination of sensors, cameras, and drones.
Problem 20: Supply Chain Security for Critical Infrastructure
Scenario: A government agency needs to verify that hardware components haven't been tampered with across a global supply chain.
Retail & E-Commerce (Problems 21-25)
Problem 21: Real-Time Pricing Engine
Scenario: An e-commerce platform with 5M products wants dynamic pricing based on demand, competition, and inventory.
Problem 22: Customer Segmentation at Scale
Scenario: A retailer with 20M customers wants to create dynamic segments for personalized marketing.
Problem 23: Store Layout Optimization
Scenario: A grocery chain wants to use purchase data and foot traffic to optimize product placement.
Problem 24: Returns Prediction and Prevention
Scenario: An online fashion retailer has a 35% return rate. Predict and reduce returns.
Problem 25: Omnichannel Inventory Visibility
Scenario: A retailer with 500 stores and an e-commerce site wants unified, real-time inventory visibility.
Energy & Manufacturing (Problems 26-30)
Problem 26: Predictive Maintenance for Wind Turbines
Scenario: A wind farm operator has 200 turbines with IoT sensors. Predict failures before they cause downtime.
Problem 27: Energy Grid Load Balancing
Scenario: A utility company needs to balance supply and demand across a grid with 30% renewable (intermittent) energy.
Problem 28: Quality Control with Computer Vision
Scenario: A manufacturing line produces 50,000 units daily. Detect defects using camera inspection.
Problem 29: Digital Twin for Factory Operations
Scenario: Build a digital twin of a factory floor to simulate and optimize production workflows.
Problem 30: Carbon Emissions Tracking
Scenario: A large corporation needs to track Scope 1, 2, and 3 carbon emissions across their operations and supply chain.
AI / ML Specific (Problems 31-35)
Problem 31: Enterprise RAG System
Scenario: A law firm has 10M legal documents. Lawyers need to query them using natural language and get accurate, cited answers.
Key questions to ask:
- Document types? (PDFs, emails, contracts, case law)
- Accuracy requirements? (Legal context = very high)
- Latency? (Interactive search vs. batch analysis)
Approach:
- Sub-problems: (1) Document ingestion and chunking, (2) Embedding generation, (3) Vector store with metadata filtering, (4) Retrieval pipeline with re-ranking, (5) LLM generation with citations, (6) Evaluation and feedback
- Key trade-offs: Chunk size vs. context preservation. Speed vs. accuracy. Cost of LLM calls.
- Start with: 1,000 documents, one practice area, measure retrieval quality before scaling.
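Two of the ideas above, the chunk-size trade-off and measuring retrieval quality before scaling, are easy to prototype. This is a minimal sketch with hypothetical helper names (`chunk_words`, `recall_at_k`); it chunks by whitespace-separated words rather than tokens and assumes you have a small hand-labeled set of relevant document IDs per query.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into word chunks that overlap to preserve context at boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

chunks = chunk_words("a b c d e f g h i j", chunk_size=4, overlap=2)

# Hypothetical retrieval result and hand-labeled relevant documents for one query.
score = recall_at_k(["d1", "d7", "d3"], ["d3", "d9"], k=3)
```

Running `recall_at_k` over a few dozen labeled queries for different `chunk_size` and `overlap` settings gives a concrete basis for the chunking decision before any LLM calls are paid for.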
Problem 32: AI Agent for Customer Support
Scenario: A SaaS company handles 50K support tickets monthly. Build an AI agent that resolves 60% automatically.
Problem 33: LLM-Powered Data Analyst
Scenario: A business intelligence team wants non-technical users to query data using natural language.
Problem 34: Content Moderation at Scale
Scenario: A social platform needs to moderate 1M posts daily for harmful content.
Problem 35: Multi-Modal Search Engine
Scenario: A media company has 5M images, videos, and documents. Build a search system that accepts text, image, or audio queries.
Cross-Functional (Problems 36-50)
Problem 36: Data Migration Strategy
Scenario: Migrate a Fortune 500 company's data from Oracle + Hadoop to a modern cloud platform with zero downtime.
Problem 37: Real-Time Recommendation System
Scenario: A streaming service wants personalized recommendations updated in real-time as users browse.
Problem 38: IoT Data Platform
Scenario: A smart building company has 100K sensors across 500 buildings. Build a platform for real-time monitoring and analytics.
Problem 39: Compliance Monitoring System
Scenario: A financial institution needs automated monitoring of 500+ regulatory requirements.
Problem 40: Multi-Tenant SaaS Customization
Scenario: Your product serves 200 enterprise customers, each wanting custom workflows. Design a customization layer.
Problem 41: Event-Driven Architecture Migration
Scenario: Migrate a monolithic batch-processing system to event-driven real-time architecture.
Problem 42: Data Quality Monitoring
Scenario: A data platform has 10,000 tables. Build automated data quality checks with alerting.
Problem 43: API Gateway Design
Scenario: A company has 50 microservices. Design an API gateway for external partner access with rate limiting, auth, and versioning.
Problem 44: Search Infrastructure
Scenario: An e-commerce site with 10M products needs search that handles typos, synonyms, and personalized ranking.
Problem 45: Real-Time Dashboard for Operations
Scenario: An operations team needs a dashboard showing real-time metrics from 20 different data sources with <5 second latency.
Problem 46: Customer Data Platform
Scenario: Unify customer data from CRM, website, mobile app, support tickets, and purchase history into a single customer view.
Problem 47: Feature Store Design
Scenario: An ML team has 50 models in production. They're duplicating feature computation. Design a shared feature store.
Problem 48: Data Marketplace
Scenario: A data company wants to let customers discover, preview, and subscribe to datasets through a self-service marketplace.
Problem 49: Workflow Automation Platform
Scenario: A consulting firm has 200 consultants doing repetitive data processing tasks. Build a no-code/low-code automation platform.
Problem 50: AI-Powered Document Processing
Scenario: An insurance company processes 100K claims documents monthly. 80% are still manually reviewed. Automate extraction and classification.
Practice Tips
- Time yourself. 45 minutes per problem. If you can't structure an approach in 5 minutes, your framework needs work.
- Draw diagrams. Interviewers want to see visual thinking. Practice on a whiteboard or drawing tool.
- Talk through trade-offs. There's no single right answer. Show that you understand the implications of your choices.
- Ask questions first. The best FDE candidates spend roughly the first 10-20% of the time clarifying the problem, in line with the 5-minute Clarify step in the framework.
- Start with the simplest version. Deploy a POC in week 1, iterate based on feedback.
Want to discuss specific solutions? Pick a problem number and post your approach in the replies. Community feedback is the best interview prep.