metadata
title: XAI Image Classifier
emoji: 🔬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
tags:
- computer-vision
- image-classification
- explainable-ai
- grad-cam
- resnet
- pytorch
- interpretability
🔬 XAI Image Classifier: ResNet-152 with Grad-CAM
Production-grade explainable image classification powered by the ResNet-152 architecture, with gradient-based visual attribution via Grad-CAM.
🎯 Overview
This Space makes image-classification decisions inspectable. It is built on ResNet-152 (82.3% ImageNet Top-1 accuracy) and uses Captum's LayerGradCam to produce class-specific attribution maps. The maps are computed at the final convolutional block (7×7 for a 224×224 input) and upsampled to image resolution, so they show which regions drive a prediction rather than exact per-pixel attributions.
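For reference, the Grad-CAM map for a class $c$ can be written as follows. This is the formulation from the Grad-CAM paper listed under References, not a description of this app's exact code path. Here $y^c$ is the pre-softmax score for class $c$, $A^k$ is the $k$-th feature map of the target layer, and $Z$ is the number of spatial positions in that map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_k \alpha_k^c \, A^k \right)
```

The weights $\alpha_k^c$ are spatially averaged gradients, so each channel contributes in proportion to how strongly it supports class $c$.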
✨ Key Features
| Feature | Description |
|---|---|
| 🧠 ResNet-152 Architecture | 60.2M parameters, 82.3% ImageNet Top-1 accuracy |
| 🔥 Grad-CAM Visualization | Gradient-weighted class activation mapping |
| ⚡ GPU-Optimized Inference | FP16 mixed precision (~3-5 ms latency on A100) |
| Multi-View Analysis | Original + Heatmap + Overlay + Contours |
| 🎨 1000 ImageNet Classes | Comprehensive object recognition |
How to Use
- Upload an image (JPG, PNG, WebP supported)
- Click "Analyze" to run inference
- View Top-10 predictions with confidence scores
- Examine Grad-CAM heatmaps showing model attention
- Compare multiple colormap visualizations
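The Top-10 list in the steps above amounts to a softmax over the model's logits followed by a sort. A minimal sketch in plain Python, assuming the logits and class labels are already available as lists; the function name `top_k_predictions` and the demo labels are illustrative and not taken from `app.py`:

```python
import math


def top_k_predictions(logits, labels, k=10):
    """Return the k most probable (label, probability) pairs, best first."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort by probability, highest first, and keep the first k entries.
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical four-class example; the real model emits 1000 logits.
demo_logits = [2.0, 0.5, 1.0, -1.0]
demo_labels = ["tabby", "tiger cat", "Egyptian cat", "lynx"]
for label, prob in top_k_predictions(demo_logits, demo_labels, k=3):
    print(f"{label}: {prob:.1%}")
```

The confidence scores are softmax probabilities, so they sum to 1 across all classes and say nothing about whether the image belongs to any ImageNet class at all.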
🔬 Technical Architecture
Model: ResNet-152 (torchvision.models.resnet152)
Weights: IMAGENET1K_V2 (pretrained)
XAI Method: Layer Grad-CAM (Captum)
Target Layer: layer4[-1] (final conv block)
Input Size: 224×224 RGB
Precision: FP16 (GPU) / FP32 (CPU)
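To make the Grad-CAM step concrete, here is a minimal NumPy sketch of the arithmetic applied to the target layer's activations and gradients. It follows the Grad-CAM paper rather than Captum's internals, and it assumes the two arrays have already been captured for one image (for `layer4[-1]` and a 224×224 input they would have shape `(2048, 7, 7)`). The function name `grad_cam` is illustrative.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Combine layer activations and gradients into a normalized Grad-CAM map.

    Both inputs have shape (K, H, W): K channels over an H x W spatial grid.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)
    # Weighted sum of activation maps, then ReLU to keep positive evidence.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)  # (H, W)
    # Scale to [0, 1] for display; an all-zero map stays all zero.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting low-resolution map still has to be upsampled to the input size before it can be overlaid on the image. Whether the app applies the ReLU through Captum or afterwards is not stated in this README.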
Performance Metrics
| Hardware | Inference Time | Memory Usage |
|---|---|---|
| NVIDIA A100 | ~3-4ms | 1.2GB |
| NVIDIA T4 | ~8-10ms | 1.2GB |
| CPU (16 cores) | ~200ms | 2.5GB |
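Latency figures like these depend heavily on how they are measured. The sketch below shows one common approach (warm-up runs, then the median of repeated timings). It is an assumption about methodology, not the procedure used to produce the table, and `benchmark_ms` is a hypothetical helper name.

```python
import statistics
import time


def benchmark_ms(fn, warmup=10, iters=50):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-time costs (lazy init, autotuning)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

When timing GPU inference, `fn` should block until the device has finished (for example by calling `torch.cuda.synchronize()` at its end); otherwise only the kernel launch is measured.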
Model Accuracy
- Top-1 Accuracy: 82.3% (ImageNet validation set)
- Top-5 Accuracy: 96.0%
- Parameter Count: 60.2M
- FLOPs: ~11.5 GFLOPs per 224×224 forward pass (as reported by torchvision)
🛠️ Optimizations Applied
- FP16 Mixed Precision: up to ~2x inference speedup on GPU
- cuDNN Benchmark: Auto-tuned convolution algorithms
- TF32 Operations: up to 8x faster FP32 matmuls on Ampere or newer GPUs
- Gradient Checkpointing: Memory-efficient Grad-CAM computation
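A minimal sketch of how the first three switches are typically enabled in PyTorch. The helper names `configure_inference` and `predict` are illustrative assumptions, not functions from `app.py`, and gradient checkpointing is left out because this README does not say how the app applies it.

```python
import contextlib

import torch


def configure_inference(device):
    """Enable the GPU-side switches listed above; return the compute dtype."""
    if device.type == "cuda":
        torch.backends.cudnn.benchmark = True         # auto-tune conv algorithms
        torch.backends.cuda.matmul.allow_tf32 = True  # TF32 for FP32 matmuls
        torch.backends.cudnn.allow_tf32 = True        # TF32 for FP32 convolutions
        return torch.float16
    return torch.float32


def predict(model, batch, device, dtype):
    """Run one forward pass, under FP16 autocast when on GPU."""
    model = model.to(device).eval()
    autocast = (
        torch.autocast(device_type="cuda", dtype=dtype)
        if device.type == "cuda"
        else contextlib.nullcontext()
    )
    # no_grad is fine for the prediction pass only; the Grad-CAM pass
    # needs gradients and must run outside this context.
    with torch.no_grad(), autocast:
        return model(batch.to(device))
```

Usage would look like `dtype = configure_inference(device)` once at startup, then `logits = predict(model, batch, device, dtype)` per request. On CPU the function falls back to FP32, matching the precision noted under Technical Architecture.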
🎨 Visualization Outputs
- Original Image - Input as-is
- Grad-CAM Heatmap - Pure activation visualization
- Overlay - Heatmap superimposed on original
- Multi-Colormap Comparison - Jet, Hot, Viridis with contours
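The overlay view is essentially an alpha blend of a colorized heatmap onto the input image. A minimal NumPy sketch, assuming the heatmap is already normalized to [0, 1] and upsampled to the image size. The blue-to-red ramp in `colorize` is a simple stand-in for the Matplotlib colormaps named above, and both function names are hypothetical.

```python
import numpy as np


def colorize(cam):
    """Map a [0, 1] heatmap (H, W) to RGB (H, W, 3) on a blue-to-red ramp."""
    cam = np.clip(cam, 0.0, 1.0)
    # Red grows with the activation, blue shrinks, green stays at zero.
    return np.stack([cam, np.zeros_like(cam), 1.0 - cam], axis=-1)


def overlay(image, cam, alpha=0.5):
    """Alpha-blend the colorized heatmap onto a float RGB image in [0, 1]."""
    return (1.0 - alpha) * image + alpha * colorize(cam)
```

A lower `alpha` keeps more of the original image visible; `alpha=0` returns the image unchanged.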
Use Cases
| Domain | Application |
|---|---|
| Medical Imaging | Validate diagnostic AI attention regions |
| Autonomous Systems | Debug object detection focus |
| Security & Surveillance | Audit algorithmic decision-making |
| Research | Study CNN feature representations |
| Education | Teach explainable AI concepts |
Privacy & Ethics
- ✅ No data retention - Images processed in-memory only
- ✅ Zero telemetry - No usage tracking
- ✅ Open source - Full code transparency
- ✅ Bias auditing - Visual inspection of model biases
References
Model Architecture
- He, K., et al. (2016). Deep Residual Learning for Image Recognition. CVPR.
Explainability Method
- Selvaraju, R. R., et al. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. ICCV.
Framework
- Paszke, A., et al. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. NeurIPS.
Links
- GitHub Repository: 0AnshuAditya0/xai
- Documentation: Full Technical Docs
- Paper (Grad-CAM): arXiv:1610.02391
- Paper (ResNet): arXiv:1512.03385
Technical Requirements
# Core Dependencies
torch>=2.0.0
torchvision>=0.15.0
gradio>=4.44.0
captum>=0.6.0
Pillow>=9.0.0
numpy>=1.23.0
matplotlib>=3.5.0
Known Limitations
- Memory: Requires ~1.2GB GPU memory (FP16 mode)
- Latency: CPU inference is much slower (~200 ms vs. roughly 3-10 ms on GPU)
- Classes: Limited to 1000 ImageNet categories
- Input Format: RGB images only (grayscale not supported)
🔮 Roadmap
- Add support for custom model fine-tuning
- Implement batch processing API
- Integrate additional XAI methods (SHAP, Integrated Gradients)
- Add uncertainty quantification
- Support for video frame analysis
License
MIT License - Free for research, education, and commercial use.
👨‍💻 Author
Anshu Aditya
AI Engineer | Explainable AI Researcher
Built with ❤️ for transparent and accountable AI
Making deep learning interpretable, one image at a time
⭐ Star this space if you find it useful!