flowchart TD
classDef processNode fill:#dae8fc,stroke:#6c8ebf,color:#000
classDef imageNode fill:#d5e8d4,stroke:#82b366,color:#000
classDef networkNode fill:#ffe6cc,stroke:#d79b00,color:#000
classDef errorNode fill:#f8cecc,stroke:#b85450,color:#000
classDef successNode fill:#d5e8d4,stroke:#82b366,color:#000
classDef configNode fill:#fff2cc,stroke:#d6b656,color:#000
classDef metricNode fill:#e1d5e7,stroke:#9673a6,color:#000
Input[/"Image URLs List"/]:::imageNode --> BatchProcess
BatchProcess["Batch Processing Initialization
• Setup session with retries
• Configure headers and timeouts
• Initialize batch logging"]:::processNode --> ContextLoop
ContextLoop["For Each Context:
Check skip conditions
Validate URL format"]:::processNode --> SkipCheck
SkipCheck{"Skip Conditions
skip_run OR skip_processing?"}:::processNode
SkipCheck -->|Yes| SkipLogging["Log Skip Reason
Update skip counter"]:::processNode
SkipCheck -->|No| DownloadAttempt
DownloadAttempt["Download Attempt
• Apply retry strategy
• Exponential backoff
• Validate content type"]:::networkNode --> ContentValidation
ContentValidation{"Content Validation
Is valid image?
Check MIME type
Verify file signature"}:::processNode
ContentValidation -->|Valid| ImageProcessing["Image Processing
• Open with PIL
• Extract EXIF data
• Get original dimensions
• Store content in context"]:::successNode
ContentValidation -->|Invalid| RetryLogic{"Retry Logic
Attempts < MAX_RETRIES?
Check error type"}:::errorNode
RetryLogic -->|Yes| BackoffDelay["Exponential Backoff
delay = RETRY_DELAY * (BACKOFF_MULTIPLIER ^ attempt)
Wait before retry"]:::errorNode
BackoffDelay --> DownloadAttempt
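%% Worked example of the backoff formula in the BackoffDelay node (a sketch only).
%% Assumption, not stated in the diagram: attempt counts from 0.
%% Values come from the Application Retry Strategy node: RETRY_DELAY = 2 s, BACKOFF_MULTIPLIER = 1.5.
%%   attempt 0: 2 * 1.5^0 = 2 s
%%   attempt 1: 2 * 1.5^1 = 3 s
%%   attempt 2: 2 * 1.5^2 = 4.5 s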
RetryLogic -->|No| MarkFailed["Mark Context as Failed
Set skip_run = True
Log failure details"]:::errorNode
ImageProcessing --> CleanupContent["Cleanup Downloaded Content
Remove _download_content
Free memory"]:::processNode
CleanupContent --> UpdateLogs["Update Batch Logs
• Record success metrics
• Store performance data
• Update counters"]:::metricNode
MarkFailed --> UpdateLogs
SkipLogging --> UpdateLogs
UpdateLogs --> CheckNext{"More Contexts?"}:::processNode
CheckNext -->|Yes| ContextLoop
CheckNext -->|No| FinalSummary
FinalSummary["Generate Final Summary
• Calculate success rate
• Analyze error distribution
• Generate performance metrics
• Check for batch abort conditions"]:::metricNode --> AbortCheck
AbortCheck{"Any Download Errors?
error_count > 0"}:::processNode
AbortCheck -->|Yes| BatchAbort["Batch Abort Logic
Mark all contexts skip_run = True
Log abort reason"]:::errorNode
AbortCheck -->|No| Output
BatchAbort --> Output
Output[/"Processing Complete
Updated contexts with images or skip flags"/]:::imageNode
subgraph RetryStrategy["Retry Strategy Configuration"]
RetryConfig["Session Retry Strategy:
• MAX_RETRIES_PER_REQUEST = 2
• Status codes: [429, 500, 502, 503, 504]
• Backoff factor: 1
• Allowed methods: [GET]"]:::configNode
CustomRetry["Application Retry Strategy:
• MAX_RETRIES = 3
• RETRY_DELAY = 2 seconds
• BACKOFF_MULTIPLIER = 1.5
• Exponential backoff calculation"]:::configNode
SessionConfig["Session Configuration:
• User-Agent: Mozilla/5.0 (compatible; ImageProcessor/1.0)
• Accept: image/*
• Accept-Encoding: gzip, deflate
• Connection: keep-alive
• Timeout: BATCH_DOWNLOAD_TIMEOUT (30s)"]:::configNode
RetryConfig --> CustomRetry --> SessionConfig
end
BatchProcess -.-> RetryStrategy
subgraph ErrorHandling["Error Categorization & Handling"]
NetworkErrors["Network Errors:
• Connection timeouts
• DNS resolution failures
• SSL certificate issues
• Socket errors"]:::errorNode
ContentErrors["Content Errors:
• Non-image MIME types
• Corrupted image data
• Empty responses
• Invalid file signatures"]:::errorNode
ServerErrors["Server Errors:
• HTTP 4xx/5xx responses
• Rate limiting (429)
• Server unavailable (503)
• Gateway errors (502, 504)"]:::errorNode
ProcessingErrors["Processing Errors:
• PIL image opening failures
• Memory allocation errors
• Unsupported file format"]:::errorNode
NetworkErrors --> ContentErrors --> ServerErrors --> ProcessingErrors
end
DownloadAttempt -.-> ErrorHandling
subgraph PerformanceMetrics["Performance Tracking"]
DownloadMetrics["Download Metrics:
• Download time per image
• Content size tracking
• Attempts per success
• Bandwidth utilization"]:::metricNode
BatchMetrics["Batch Metrics:
• Total processing time
• Success rate calculation
• Error rate by category
• Resource utilization"]:::metricNode
QualityMetrics["Quality Metrics:
• Image dimensions
• File format distribution
• Content type validation
• Error pattern analysis"]:::metricNode
DownloadMetrics --> BatchMetrics --> QualityMetrics
end
UpdateLogs -.-> PerformanceMetrics
subgraph SecurityMeasures["Security & Validation"]
ContentValidation2["Content Type Validation:
• HTTP Content-Type header
• PIL format detection
• Magic byte verification
• File extension matching"]:::configNode
SecurityHeaders["Security Headers:
• User-Agent masking
• Accept header specification
• Connection management
• Timeout enforcement"]:::configNode
MemoryProtection["Memory Protection:
• Streaming downloads
• Content size limits
• Immediate cleanup
• Resource monitoring"]:::configNode
ContentValidation2 --> SecurityHeaders --> MemoryProtection
end
ContentValidation -.-> SecurityMeasures