Spaces: Sleeping
Richard Young committed
Commit · c43a81f · Parent(s): 0

Initial commit for Hugging Face Space
Browse files:
- .gitattributes +7 -0
- .gitignore +22 -0
- README.md +106 -0
- app.py +563 -0
- find_bad_images.py +1670 -0
- rat_finder.py +1223 -0
- requirements.txt +8 -0
- steg_embedder.py +337 -0
.gitattributes ADDED
@@ -0,0 +1,7 @@
+*.jpg filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.jpeg filter=lfs diff=lfs merge=lfs -text
+*.gif filter=lfs diff=lfs merge=lfs -text
+*.pdf filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+docs/*.jpg filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,22 @@
+# Image files
+*.jpg
+*.jpeg
+*.JPG
+*.JPEG
+
+# System files
+.DS_Store
+Thumbs.db
+
+# Python
+__pycache__/
+*.py[cod]
+*.class
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
README.md ADDED
@@ -0,0 +1,106 @@
+---
+title: 2PAC Picture Analyzer & Corruption Killer
+emoji: 🔫
+colorFrom: purple
+colorTo: blue
+sdk: gradio
+sdk_version: 4.44.0
+app_file: app.py
+pinned: false
+license: mit
+---
+
+# 🔫 2PAC: Picture Analyzer & Corruption Killer
+
+**Advanced image security and steganography toolkit**
+
+## Features
+
+### 🔒 Hide Secret Data
+Invisibly hide text messages inside images using **LSB (Least Significant Bit) steganography**:
+- Hide text of any length (capacity depends on image size)
+- Optional password encryption for added security
+- Adjustable LSB depth (1-4 bits per channel)
+- PNG output preserves hidden data perfectly
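As a point of reference, here is a minimal sketch of driving the bundled `StegEmbedder` directly, using the same `embed_data`/`extract_data` calls and return shapes that `app.py` in this commit uses. The file names, message, and password below are placeholders, not part of the repository.

```python
from steg_embedder import StegEmbedder

embedder = StegEmbedder()

# Hide a message (paths and password are placeholders for this sketch)
success, message, stats = embedder.embed_data(
    "cover.png",              # input image (PNG recommended)
    "meet at dawn",           # secret text to hide
    "stego.png",              # output image carrying the hidden data
    password="hunter2",       # optional encryption password
    bits_per_channel=1,       # LSB depth (1-4); 1 is the most subtle
)
if success:
    print(message, "-", stats["utilization"], "of capacity used")

# Recover it later; the settings must match the embed step
success, message, recovered = embedder.extract_data(
    "stego.png", password="hunter2", bits_per_channel=1
)
print(recovered if success else message)
```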
+
+### 🔍 Detect & Extract Hidden Data
+Advanced steganography detection using **RAT Finder** technology:
+- **ELA (Error Level Analysis)** - Highlights compression artifacts
+- **LSB Analysis** - Detects randomness in least significant bits
+- **Histogram Analysis** - Finds statistical anomalies
+- **Metadata Inspection** - Checks EXIF data for suspicious tools
+- **Extract Data** - Recover messages hidden with this tool
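A minimal sketch of running the same detection calls outside the web UI, mirroring how `app.py` invokes `rat_finder` (the image path is a placeholder; the return shapes follow `app.py`'s usage):

```python
import rat_finder

# Analyze a file at one of the sensitivity levels app.py maps its slider to
confidence, details = rat_finder.analyze_image("photo.png", sensitivity="medium")
print(f"Suspicion score: {confidence:.1f}%")
for detail in details:
    print(" -", detail)

# Optional ELA visualization, as shown in the Space's "Detect" tab
ela = rat_finder.perform_ela_analysis("photo.png")
if ela["success"] and ela["ela_image"] is not None:
    print("ELA image generated - inspect bright regions for anomalies")
```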
+
+### 🛡️ Check Image Integrity
+Comprehensive image validation and corruption detection:
+- File format validation (JPEG, PNG, GIF, TIFF, BMP, WebP, HEIC)
+- Header integrity checks
+- Data completeness verification
+- Visual corruption detection (black/gray regions)
+- Structure validation
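A minimal sketch of the integrity check as `app.py` invokes it against `find_bad_images` (placeholder path; `diagnose_image_issue` returns an `(error_type, details)` pair per its docstring later in this commit):

```python
import find_bad_images

# Thorough validation with visual-corruption scanning enabled
ok = find_bad_images.is_valid_image(
    "photo.jpg", thorough=True, sensitivity="medium", check_visual=True
)

if ok:
    print("Image passed validation")
else:
    error_type, details = find_bad_images.diagnose_image_issue("photo.jpg")
    print(f"Image failed validation: {error_type} - {details}")
```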
+
+## How It Works
+
+### LSB Steganography
+The tool hides data in the **least significant bits** of pixel values. Since changing the last 1-2 bits of a pixel value (e.g., changing 200 to 201) is imperceptible to the human eye, we can encode arbitrary data without visible changes to the image.
+
+**Example (hiding one bit in the green channel):**
+- Original pixel: RGB(156, 89, 201) = `10011100, 01011001, 11001001`
+- After hiding bit '1': RGB(156, 89, 201) = `10011100, 01011001, 11001001` (89's last bit is already 1, so nothing changes)
+- After hiding bit '0': RGB(156, 88, 201) = `10011100, 01011000, 11001001` (89→88)
+
+This allows hiding hundreds to thousands of bytes in a typical photo!
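The bit manipulation behind that example is just masking and setting the lowest bit of a channel value. The snippet below is a self-contained illustration of that one step, not the repository's `StegEmbedder` implementation:

```python
def set_lsb(value: int, bit: int) -> int:
    """Clear the lowest bit of an 8-bit channel value and write `bit` into it."""
    return (value & 0b11111110) | bit

def get_lsb(value: int) -> int:
    return value & 1

g = 89                          # green channel of RGB(156, 89, 201)
print(bin(g))                   # 0b1011001 -> last bit is already 1
print(set_lsb(g, 1))            # 89, unchanged
print(set_lsb(g, 0))            # 88, the 89 -> 88 case from the example
print(get_lsb(set_lsb(g, 0)))   # 0, the hidden bit reads back out
```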
+
+### Steganography Detection
+The RAT Finder uses multiple forensic techniques:
+
+1. **ELA (Error Level Analysis)**: Re-saves the image at a known quality and compares compression artifacts. Hidden data or manipulation shows as bright areas.
+
+2. **LSB Analysis**: Statistical tests check if the least significant bits are too random (hidden data) or too uniform (natural image).
+
+3. **Histogram Analysis**: Analyzes color distribution for anomalies typical of steganography.
+
+4. **Metadata Forensics**: Checks EXIF data for steganography tools or suspicious editing history.
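For step 1, the core of ELA can be sketched in a few lines with Pillow: recompress the image at a known JPEG quality and look at the pixel-wise difference. This is a generic illustration of the idea, not the `rat_finder` implementation shipped in this commit:

```python
import io
from PIL import Image, ImageChops

def quick_ela(path: str, quality: int = 90) -> Image.Image:
    """Return a difference image; brighter pixels mean a higher error level."""
    original = Image.open(path).convert("RGB")
    buf = io.BytesIO()
    original.save(buf, "JPEG", quality=quality)   # re-save at a known quality
    buf.seek(0)
    resaved = Image.open(buf).convert("RGB")
    return ImageChops.difference(original, resaved)

# Example with a placeholder path:
# quick_ela("photo.jpg").save("photo_ela.png")
```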
+
+## Usage Tips
+
+### For Hiding Data:
+- ✅ Use **PNG** images (JPEG compression destroys hidden data)
+- ✅ Larger images = more capacity
+- ✅ Use 1-2 bits per channel for undetectable hiding
+- ✅ Add password encryption for sensitive data
+- ⚠️ Don't re-save or edit the output image!
+
+### For Detection:
+- 🔍 Higher sensitivity = more thorough but more false positives
+- 📊 Check the ELA image for bright spots (potential hiding)
+- 💡 High confidence doesn't guarantee hidden data (could be compression artifacts)
+- 🔓 Use "Extract Data" tab if you suspect LSB steganography
+
+### For Corruption Checking:
+- 🛡️ Enable visual corruption check for damaged photos
+- ⚙️ Higher sensitivity for stricter validation
+- 📁 Useful before archiving important photo collections
+
+## About
+
+**2PAC** combines three powerful tools:
+- **LSB Steganography** engine (new!)
+- **RAT Finder** - Advanced steg detection
+- **Image Validator** - Corruption checker
+
+Created by [Richard Young](https://github.com/ricyoung) | Part of [DeepNeuro.AI](https://deepneuro.ai)
+
+🔗 **GitHub Repository:** [github.com/ricyoung/2pac](https://github.com/ricyoung/2pac)
+🌐 **More Tools:** [demo.deepneuro.ai](https://demo.deepneuro.ai)
+
+## Security & Privacy
+
+- ✅ All processing happens in your browser session (Hugging Face Space)
+- ✅ Images are not stored or logged
+- ✅ Temporary files are deleted after processing
+- ✅ Your hidden data and passwords are never saved
+
+---
+
+*"All Eyez On Your Images" 👁️*
app.py ADDED
@@ -0,0 +1,563 @@
+#!/usr/bin/env python3
+"""
+2PAC: Picture Analyzer & Corruption Killer - Gradio Web Interface
+Steganography, image corruption detection, and security analysis
+"""
+
+import os
+import tempfile
+import gradio as gr
+from PIL import Image
+import matplotlib.pyplot as plt
+import io
+import base64
+
+# Import 2PAC modules
+from steg_embedder import StegEmbedder
+import rat_finder
+import find_bad_images
+
+
+# Initialize embedder
+embedder = StegEmbedder()
+
+
+def hide_data_in_image(image, secret_text, password, bits_per_channel):
+    """
+    Tab 1: Hide data in an image using LSB steganography
+    """
+    if image is None:
+        return None, "⚠️ Please upload an image first"
+
+    if not secret_text or len(secret_text.strip()) == 0:
+        return None, "⚠️ Please enter text to hide"
+
+    try:
+        # Save uploaded image to temp file
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp_input:
+            img = Image.fromarray(image)
+            img.save(tmp_input.name, 'PNG')
+            input_path = tmp_input.name
+
+        # Create output file
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp_output:
+            output_path = tmp_output.name
+
+        # Calculate capacity first
+        img = Image.open(input_path)
+        capacity = embedder.calculate_capacity(img, bits_per_channel)
+
+        # Check if data fits
+        data_size = len(secret_text.encode('utf-8'))
+        if data_size > capacity:
+            os.unlink(input_path)
+            return None, f"❌ **Error:** Data too large!\n\n" \
+                         f"- **Data size:** {data_size:,} bytes\n" \
+                         f"- **Maximum capacity:** {capacity:,} bytes\n" \
+                         f"- **Overflow:** {data_size - capacity:,} bytes\n\n" \
+                         f"💡 Try: Shorter text, larger image, or more bits per channel"
+
+        # Embed data
+        pwd = password if password and len(password) > 0 else None
+        success, message, stats = embedder.embed_data(
+            input_path,
+            secret_text,
+            output_path,
+            password=pwd,
+            bits_per_channel=bits_per_channel
+        )
+
+        # Clean up input
+        os.unlink(input_path)
+
+        if not success:
+            if os.path.exists(output_path):
+                os.unlink(output_path)
+            return None, f"❌ **Error:** {message}"
+
+        # Load result image
+        result_img = Image.open(output_path)
+
+        # Format success message
+        result_message = f"""
+✅ **Successfully Hidden!**
+
+📊 **Statistics:**
+- **Data hidden:** {stats['data_size']:,} bytes ({len(secret_text):,} characters)
+- **Image capacity:** {stats['capacity']:,} bytes
+- **Utilization:** {stats['utilization']}
+- **Encryption:** {"🔒 Yes" if stats['encrypted'] else "🔓 No"}
+- **LSB depth:** {stats['bits_per_channel']} bit(s) per channel
+- **Image dimensions:** {stats['image_size']}
+
+💾 **Download the image below** - your data is invisible to the naked eye!
+
+⚠️ **Important:**
+- Save as PNG (not JPEG - will destroy hidden data)
+- Keep your password safe if you used encryption
+"""
+
+        return result_img, result_message
+
+    except Exception as e:
+        if 'input_path' in locals() and os.path.exists(input_path):
+            os.unlink(input_path)
+        if 'output_path' in locals() and os.path.exists(output_path):
+            os.unlink(output_path)
+        return None, f"❌ **Error:** {str(e)}"
+
+
+def detect_hidden_data(image, sensitivity):
+    """
+    Tab 2: Detect steganography using RAT Finder analysis
+    """
+    if image is None:
+        return None, "⚠️ Please upload an image to analyze"
+
+    try:
+        # Save uploaded image to temp file
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp:
+            img = Image.fromarray(image)
+            img.save(tmp.name, 'PNG')
+            image_path = tmp.name
+
+        # Map slider to sensitivity
+        sens_map = {1: 'low', 2: 'low', 3: 'low', 4: 'medium', 5: 'medium',
+                    6: 'medium', 7: 'high', 8: 'high', 9: 'high', 10: 'high'}
+        sensitivity_str = sens_map.get(sensitivity, 'medium')
+
+        # Perform analysis
+        confidence, details = rat_finder.analyze_image(image_path, sensitivity=sensitivity_str)
+
+        # Generate ELA visualization
+        ela_result = rat_finder.perform_ela_analysis(image_path)
+
+        # Clean up
+        os.unlink(image_path)
+
+        # Create confidence indicator
+        if confidence >= 70:
+            confidence_emoji = "🚨"
+            confidence_label = "HIGH SUSPICION"
+        elif confidence >= 40:
+            confidence_emoji = "⚠️"
+            confidence_label = "MODERATE SUSPICION"
+        else:
+            confidence_emoji = "✅"
+            confidence_label = "LOW SUSPICION"
+
+        # Format results
+        result_text = f"""
+{confidence_emoji} **{confidence_label}**
+
+📊 **Confidence Score:** {confidence:.1f}%
+
+🔍 **Analysis Details:**
+"""
+
+        for detail in details:
+            result_text += f"\n• {detail}"
+
+        result_text += f"""
+
+---
+
+**What does this mean?**
+
+- **ELA (Error Level Analysis):** Highlights areas with different compression levels
+  - Bright areas = potential manipulation or hidden data
+  - Uniform appearance = likely unmodified
+
+- **LSB Analysis:** Checks randomness in least significant bits
+- **Histogram Analysis:** Looks for statistical anomalies
+- **Metadata:** Examines EXIF data for suspicious tools
+- **File Structure:** Checks for trailing data
+
+💡 **High confidence doesn't mean data is hidden** - just that anomalies exist.
+Use the "Extract Data" tab if you suspect LSB steganography!
+"""
+
+        # Return ELA plot if available
+        if ela_result['success'] and ela_result['ela_image']:
+            return ela_result['ela_image'], result_text
+
+        return None, result_text
+
+    except Exception as e:
+        if 'image_path' in locals() and os.path.exists(image_path):
+            os.unlink(image_path)
+        return None, f"❌ **Error:** {str(e)}"
+
+
+def extract_hidden_data(image, password, bits_per_channel):
+    """
+    Tab 2b: Extract data hidden with LSB steganography
+    """
+    if image is None:
+        return "⚠️ Please upload an image"
+
+    try:
+        # Save uploaded image to temp file
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp:
+            img = Image.fromarray(image)
+            img.save(tmp.name, 'PNG')
+            image_path = tmp.name
+
+        # Attempt extraction
+        pwd = password if password and len(password) > 0 else None
+        success, message, extracted_data = embedder.extract_data(
+            image_path,
+            password=pwd,
+            bits_per_channel=bits_per_channel
+        )
+
+        # Clean up
+        os.unlink(image_path)
+
+        if not success:
+            return f"❌ **{message}**\n\nPossible reasons:\n" \
+                   f"• No data hidden in this image\n" \
+                   f"• Wrong password (if encrypted)\n" \
+                   f"• Wrong bits-per-channel setting\n" \
+                   f"• Image was modified/re-saved"
+
+        result = f"""
+✅ **Data Successfully Extracted!**
+
+📝 **Hidden Message:**
+
+---
+{extracted_data}
+---
+
+📊 **Extraction Info:**
+- **Data size:** {len(extracted_data)} characters
+- **Decryption:** {"🔒 Used" if pwd else "🔓 Not needed"}
+- **LSB depth:** {bits_per_channel} bit(s) per channel
+
+💡 Copy the message above - it has been successfully recovered from the image!
+"""
+        return result
+
+    except Exception as e:
+        if 'image_path' in locals() and os.path.exists(image_path):
+            os.unlink(image_path)
+        return f"❌ **Error:** {str(e)}"
+
+
+def check_image_corruption(image, sensitivity, check_visual):
+    """
+    Tab 3: Check for image corruption and validate integrity
+    """
+    if image is None:
+        return "⚠️ Please upload an image to check"
+
+    try:
+        # Save uploaded image to temp file
+        with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp:
+            img = Image.fromarray(image)
+            img.save(tmp.name, 'PNG')
+            image_path = tmp.name
+
+        # Map slider to sensitivity
+        sens_map = {1: 'low', 2: 'low', 3: 'low', 4: 'medium', 5: 'medium',
+                    6: 'medium', 7: 'high', 8: 'high', 9: 'high', 10: 'high'}
+        sensitivity_str = sens_map.get(sensitivity, 'medium')
+
+        # Validate image
+        is_valid = find_bad_images.is_valid_image(
+            image_path,
+            thorough=True,
+            sensitivity=sensitivity_str,
+            check_visual=check_visual
+        )
+
+        # Get diagnostic details
+        issues = find_bad_images.diagnose_image_issue(image_path)
+
+        # Clean up
+        os.unlink(image_path)
+
+        # Format results
+        if is_valid:
+            result = f"""
+✅ **IMAGE IS VALID**
+
+The image passed all validation checks:
+- ✅ File structure is intact
+- ✅ Headers are valid
+- ✅ No truncation detected
+- ✅ Metadata is consistent
+"""
+            if check_visual:
+                result += "- ✅ No visual corruption detected\n"
+
+            result += "\n💚 **This image is safe to use!**"
+
+        else:
+            result = f"""
+⚠️ **ISSUES DETECTED**
+
+The image has validation problems:
+
+"""
+            if issues:
+                for issue_type, issue_desc in issues.items():
+                    result += f"**{issue_type}:**\n{issue_desc}\n\n"
+            else:
+                result += "❌ Image failed validation but no specific issues identified.\n\n"
+
+            result += """
+---
+
+**What to do:**
+- Image may be corrupted or incomplete
+- Try re-downloading the original file
+- Check if the file was properly transferred
+- Use image repair tools if needed
+"""
+
+        return result
+
+    except Exception as e:
+        if 'image_path' in locals() and os.path.exists(image_path):
+            os.unlink(image_path)
+        return f"❌ **Error:** {str(e)}"
+
+
+# Create Gradio interface
+with gr.Blocks(
+    title="2PAC: Picture Analyzer & Corruption Killer",
+    theme=gr.themes.Soft(
+        primary_hue="violet",
+        secondary_hue="blue",
+    )
+) as demo:
+
+    gr.Markdown("""
+    # 🔫 2PAC: Picture Analyzer & Corruption Killer
+
+    **Advanced image security and steganography toolkit**
+
+    Hide secret messages in images, detect hidden data, and validate image integrity.
+    """)
+
+    with gr.Tabs():
+
+        # TAB 1: Hide Data
+        with gr.Tab("🔒 Hide Secret Data"):
+            gr.Markdown("""
+            ## Hide Data in Image (LSB Steganography)
+
+            Invisibly hide text inside an image using Least Significant Bit encoding.
+            The image will look identical to the naked eye, but contains your secret message!
+            """)
+
+            with gr.Row():
+                with gr.Column(scale=1):
+                    hide_input_image = gr.Image(
+                        label="Upload Image",
+                        type="numpy",
+                        height=300
+                    )
+                    hide_secret_text = gr.Textbox(
+                        label="Secret Text to Hide",
+                        placeholder="Enter your secret message here...",
+                        lines=5,
+                        max_lines=10
+                    )
+                    with gr.Row():
+                        hide_password = gr.Textbox(
+                            label="Password (Optional - for encryption)",
+                            placeholder="Leave empty for no encryption",
+                            type="password"
+                        )
+                        hide_bits = gr.Slider(
+                            minimum=1,
+                            maximum=4,
+                            value=1,
+                            step=1,
+                            label="LSB Depth (higher = more capacity, less subtle)",
+                            info="1=subtle, 4=maximum capacity"
+                        )
+
+                    hide_button = gr.Button("🔒 Hide Data in Image", variant="primary", size="lg")
+
+                with gr.Column(scale=1):
+                    hide_output_image = gr.Image(label="Result Image (Download This!)", height=300)
+                    hide_output_text = gr.Markdown(label="Status")
+
+            hide_button.click(
+                fn=hide_data_in_image,
+                inputs=[hide_input_image, hide_secret_text, hide_password, hide_bits],
+                outputs=[hide_output_image, hide_output_text]
+            )
+
+            gr.Markdown("""
+            ---
+            **💡 Tips:**
+            - Use PNG images for best results (JPEG will destroy hidden data!)
+            - Larger images can hold more data
+            - Password encryption adds extra security layer
+            - LSB depth: 1-2 bits is undetectable, 3-4 bits provides more capacity
+            """)
+
+        # TAB 2: Detect & Extract
+        with gr.Tab("🔍 Detect & Extract Hidden Data"):
+            gr.Markdown("""
+            ## Detect Steganography & Extract Hidden Data
+
+            Use advanced analysis techniques to detect hidden data in images, or extract data hidden with this tool.
+            """)
+
+            with gr.Tabs():
+
+                # Sub-tab: Detection
+                with gr.Tab("🔎 Detect (Analysis)"):
+                    gr.Markdown("""
+                    ### Steganography Detection (RAT Finder)
+
+                    Analyzes images for signs of hidden data using multiple techniques:
+                    ELA, LSB analysis, histogram analysis, metadata inspection, and more.
+                    """)
+
+                    with gr.Row():
+                        with gr.Column(scale=1):
+                            detect_input_image = gr.Image(
+                                label="Upload Image to Analyze",
+                                type="numpy",
+                                height=300
+                            )
+                            detect_sensitivity = gr.Slider(
+                                minimum=1,
+                                maximum=10,
+                                value=5,
+                                step=1,
+                                label="Detection Sensitivity",
+                                info="Higher = more thorough but more false positives"
+                            )
+                            detect_button = gr.Button("🔍 Analyze for Hidden Data", variant="primary", size="lg")
+
+                        with gr.Column(scale=1):
+                            detect_output_image = gr.Image(label="ELA Visualization", height=300)
+                            detect_output_text = gr.Markdown(label="Analysis Results")
+
+                    detect_button.click(
+                        fn=detect_hidden_data,
+                        inputs=[detect_input_image, detect_sensitivity],
+                        outputs=[detect_output_image, detect_output_text]
+                    )
+
+                # Sub-tab: Extraction
+                with gr.Tab("📤 Extract Data"):
+                    gr.Markdown("""
+                    ### Extract Hidden Data (LSB Extraction)
+
+                    If you have an image created with the "Hide Data" tool, extract the hidden message here.
+                    """)
+
+                    with gr.Row():
+                        with gr.Column(scale=1):
+                            extract_input_image = gr.Image(
+                                label="Upload Image with Hidden Data",
+                                type="numpy",
+                                height=300
+                            )
+                            with gr.Row():
+                                extract_password = gr.Textbox(
+                                    label="Password (if encrypted)",
+                                    placeholder="Leave empty if not encrypted",
+                                    type="password"
+                                )
+                                extract_bits = gr.Slider(
+                                    minimum=1,
+                                    maximum=4,
+                                    value=1,
+                                    step=1,
+                                    label="LSB Depth (must match encoding)",
+                                    info="Use same value as when hiding"
+                                )
+                            extract_button = gr.Button("📤 Extract Hidden Data", variant="primary", size="lg")
+
+                        with gr.Column(scale=1):
+                            extract_output_text = gr.Markdown(label="Extracted Data")
+
+                    extract_button.click(
+                        fn=extract_hidden_data,
+                        inputs=[extract_input_image, extract_password, extract_bits],
+                        outputs=[extract_output_text]
+                    )
+
+        # TAB 3: Check Corruption
+        with gr.Tab("🛡️ Check Image Integrity"):
+            gr.Markdown("""
+            ## Image Corruption & Validation
+
+            Thoroughly validate image files for corruption, truncation, and structural issues.
+            Detects damaged headers, incomplete data, and visual artifacts.
+            """)
+
+            with gr.Row():
+                with gr.Column(scale=1):
+                    check_input_image = gr.Image(
+                        label="Upload Image to Validate",
+                        type="numpy",
+                        height=300
+                    )
+                    with gr.Row():
+                        check_sensitivity = gr.Slider(
+                            minimum=1,
+                            maximum=10,
+                            value=5,
+                            step=1,
+                            label="Validation Sensitivity",
+                            info="Higher = more strict validation"
+                        )
+                        check_visual = gr.Checkbox(
+                            label="Check for Visual Corruption",
+                            value=True,
+                            info="Slower but detects visual artifacts"
+                        )
+                    check_button = gr.Button("🛡️ Validate Image", variant="primary", size="lg")
+
+                with gr.Column(scale=1):
+                    check_output_text = gr.Markdown(label="Validation Results")
+
+            check_button.click(
+                fn=check_image_corruption,
+                inputs=[check_input_image, check_sensitivity, check_visual],
+                outputs=[check_output_text]
+            )
+
+            gr.Markdown("""
+            ---
+            **🔍 Checks Performed:**
+            - ✅ File format validation (JPEG, PNG, GIF, etc.)
+            - ✅ Header integrity
+            - ✅ Data completeness
+            - ✅ Metadata consistency
+            - ✅ Visual corruption detection (black/gray regions)
+            - ✅ Structure validation
+            """)
+
+    gr.Markdown("""
+    ---
+
+    ## About 2PAC
+
+    **2PAC** (Picture Analyzer & Corruption Killer) is a comprehensive image security toolkit combining:
+    - **LSB Steganography**: Hide and extract secret messages in images
+    - **RAT Finder**: Advanced steganography detection using 7+ analysis techniques
+    - **Image Validation**: Detect corruption and structural issues
+
+    🔗 **GitHub:** [github.com/ricyoung/2pac](https://github.com/ricyoung/2pac)
+    🌐 **More Tools:** [demo.deepneuro.ai](https://demo.deepneuro.ai)
+
+    ---
+
+    *Built with ❤️ by DeepNeuro.AI | Powered by Gradio & Hugging Face Spaces*
+    """)
+
+
+if __name__ == "__main__":
+    demo.launch()
find_bad_images.py ADDED
@@ -0,0 +1,1670 @@
| 1 |
+
#!/usr/bin/env python3
|
| 2 |
+
"""
|
| 3 |
+
2PAC: The Picture Analyzer & Corruption killer
|
| 4 |
+
Author: Richard Young
|
| 5 |
+
License: MIT
|
| 6 |
+
|
| 7 |
+
In memory of Jeff Young, who loved Tupac's music and lived by his values of helping others.
|
| 8 |
+
Like Tupac, Jeff believed in bringing people together and always lending a hand to those in need.
|
| 9 |
+
May your photos always be as clear as the memories they capture, and may we all strive to help others as Jeff did.
|
| 10 |
+
"""
|
| 11 |
+
|
| 12 |
+
import os
|
| 13 |
+
import argparse
|
| 14 |
+
import concurrent.futures
|
| 15 |
+
import sys
|
| 16 |
+
import time
|
| 17 |
+
import io
|
| 18 |
+
import json
|
| 19 |
+
import shutil
|
| 20 |
+
import hashlib
|
| 21 |
+
import struct
|
| 22 |
+
import tempfile
|
| 23 |
+
import subprocess
|
| 24 |
+
import random
|
| 25 |
+
from datetime import datetime
|
| 26 |
+
from pathlib import Path
|
| 27 |
+
from PIL import Image, ImageFile, UnidentifiedImageError
|
| 28 |
+
from tqdm import tqdm
|
| 29 |
+
import tqdm.auto as tqdm_auto
|
| 30 |
+
import colorama
|
| 31 |
+
import humanize
|
| 32 |
+
import logging
|
| 33 |
+
|
| 34 |
+
# Import 2PAC quotes
|
| 35 |
+
try:
|
| 36 |
+
from quotes import QUOTES
|
| 37 |
+
except ImportError:
|
| 38 |
+
# Default quotes if file is missing
|
| 39 |
+
QUOTES = ["All Eyez On Your Images."]
|
| 40 |
+
|
| 41 |
+
# Initialize colorama (required for Windows)
|
| 42 |
+
colorama.init()
|
| 43 |
+
|
| 44 |
+
# Allow loading of truncated images for repair attempts
|
| 45 |
+
ImageFile.LOAD_TRUNCATED_IMAGES = True
|
| 46 |
+
|
| 47 |
+
# Dictionary of supported image formats with their extensions
|
| 48 |
+
SUPPORTED_FORMATS = {
|
| 49 |
+
'JPEG': ('.jpg', '.jpeg', '.jpe', '.jif', '.jfif', '.jfi'),
|
| 50 |
+
'PNG': ('.png',),
|
| 51 |
+
'GIF': ('.gif',),
|
| 52 |
+
'TIFF': ('.tiff', '.tif'),
|
| 53 |
+
'BMP': ('.bmp', '.dib'),
|
| 54 |
+
'WEBP': ('.webp',),
|
| 55 |
+
'ICO': ('.ico',),
|
| 56 |
+
'HEIC': ('.heic',),
|
| 57 |
+
}
|
| 58 |
+
|
| 59 |
+
# Default formats (all supported formats)
|
| 60 |
+
DEFAULT_FORMATS = list(SUPPORTED_FORMATS.keys())
|
| 61 |
+
|
| 62 |
+
# List of formats that can potentially be repaired
|
| 63 |
+
REPAIRABLE_FORMATS = ['JPEG', 'PNG', 'GIF']
|
| 64 |
+
|
| 65 |
+
# Default progress directory
|
| 66 |
+
DEFAULT_PROGRESS_DIR = os.path.expanduser("~/.bad_image_finder/progress")
|
| 67 |
+
|
| 68 |
+
# Current version
|
| 69 |
+
VERSION = "1.5.1"
|
| 70 |
+
|
| 71 |
+
# Security: Maximum file size to process (100MB) to prevent DoS
|
| 72 |
+
MAX_FILE_SIZE = 100 * 1024 * 1024
|
| 73 |
+
|
| 74 |
+
# Security: Maximum image dimensions (50 megapixels) to prevent decompression bombs
|
| 75 |
+
MAX_IMAGE_PIXELS = 50000 * 50000
|
| 76 |
+
|
| 77 |
+
def setup_logging(verbose, no_color=False):
|
| 78 |
+
level = logging.DEBUG if verbose else logging.INFO
|
| 79 |
+
|
| 80 |
+
# Define color codes
|
| 81 |
+
if not no_color:
|
| 82 |
+
# Color scheme
|
| 83 |
+
COLORS = {
|
| 84 |
+
'DEBUG': colorama.Fore.CYAN,
|
| 85 |
+
'INFO': colorama.Fore.GREEN,
|
| 86 |
+
'WARNING': colorama.Fore.YELLOW,
|
| 87 |
+
'ERROR': colorama.Fore.RED,
|
| 88 |
+
'CRITICAL': colorama.Fore.MAGENTA + colorama.Style.BRIGHT,
|
| 89 |
+
'RESET': colorama.Style.RESET_ALL
|
| 90 |
+
}
|
| 91 |
+
|
| 92 |
+
# Custom formatter with colors
|
| 93 |
+
class ColoredFormatter(logging.Formatter):
|
| 94 |
+
def format(self, record):
|
| 95 |
+
levelname = record.levelname
|
| 96 |
+
if levelname in COLORS:
|
| 97 |
+
record.levelname = f"{COLORS[levelname]}{levelname}{COLORS['RESET']}"
|
| 98 |
+
record.msg = f"{COLORS[levelname]}{record.msg}{COLORS['RESET']}"
|
| 99 |
+
return super().format(record)
|
| 100 |
+
|
| 101 |
+
formatter = ColoredFormatter('%(asctime)s - %(levelname)s - %(message)s')
|
| 102 |
+
else:
|
| 103 |
+
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
|
| 104 |
+
|
| 105 |
+
handler = logging.StreamHandler()
|
| 106 |
+
handler.setFormatter(formatter)
|
| 107 |
+
|
| 108 |
+
logging.basicConfig(
|
| 109 |
+
level=level,
|
| 110 |
+
handlers=[handler]
|
| 111 |
+
)
|
| 112 |
+
|
| 113 |
+
def diagnose_image_issue(file_path):
|
| 114 |
+
"""
|
| 115 |
+
Attempts to diagnose what's wrong with the image.
|
| 116 |
+
Returns: (error_type, details)
|
| 117 |
+
"""
|
| 118 |
+
try:
|
| 119 |
+
with open(file_path, 'rb') as f:
|
| 120 |
+
header = f.read(16) # Read first 16 bytes
|
| 121 |
+
|
| 122 |
+
# Check for zero-byte file
|
| 123 |
+
if len(header) == 0:
|
| 124 |
+
return "empty_file", "File is empty (0 bytes)"
|
| 125 |
+
|
| 126 |
+
# Check for correct JPEG header
|
| 127 |
+
if file_path.lower().endswith(SUPPORTED_FORMATS['JPEG']):
|
| 128 |
+
if not (header.startswith(b'\xff\xd8\xff')):
|
| 129 |
+
return "invalid_header", "Invalid JPEG header"
|
| 130 |
+
|
| 131 |
+
# Check for correct PNG header
|
| 132 |
+
elif file_path.lower().endswith(SUPPORTED_FORMATS['PNG']):
|
| 133 |
+
if not header.startswith(b'\x89PNG\r\n\x1a\n'):
|
| 134 |
+
return "invalid_header", "Invalid PNG header"
|
| 135 |
+
|
| 136 |
+
# Try to open with PIL for more detailed diagnosis
|
| 137 |
+
try:
|
| 138 |
+
with Image.open(file_path) as img:
|
| 139 |
+
img.verify()
|
| 140 |
+
except Exception as e:
|
| 141 |
+
error_str = str(e).lower()
|
| 142 |
+
|
| 143 |
+
if "truncated" in error_str:
|
| 144 |
+
return "truncated", "File is truncated"
|
| 145 |
+
elif "corrupt" in error_str:
|
| 146 |
+
return "corrupt_data", "Data corruption detected"
|
| 147 |
+
elif "incorrect mode" in error_str or "decoder" in error_str:
|
| 148 |
+
return "decoder_issue", "Image decoder issue"
|
| 149 |
+
else:
|
| 150 |
+
return "unknown", f"Unknown issue: {str(e)}"
|
| 151 |
+
|
| 152 |
+
# Now try to load the data
|
| 153 |
+
try:
|
| 154 |
+
with Image.open(file_path) as img:
|
| 155 |
+
img.load()
|
| 156 |
+
except Exception as e:
|
| 157 |
+
return "data_load_failed", f"Image data couldn't be loaded: {str(e)}"
|
| 158 |
+
|
| 159 |
+
# If we got here, there's some other issue
|
| 160 |
+
return "unknown", "Unknown issue"
|
| 161 |
+
|
| 162 |
+
except Exception as e:
|
| 163 |
+
return "access_error", f"Error accessing file: {str(e)}"
|
| 164 |
+
|
| 165 |
+
def check_jpeg_structure(file_path):
|
| 166 |
+
"""
|
| 167 |
+
Performs a deep check of JPEG file structure to find corruption that PIL might miss.
|
| 168 |
+
Returns (is_valid, error_message)
|
| 169 |
+
"""
|
| 170 |
+
try:
|
| 171 |
+
with open(file_path, 'rb') as f:
|
| 172 |
+
data = f.read()
|
| 173 |
+
|
| 174 |
+
# Check for correct JPEG header (SOI marker)
|
| 175 |
+
if not data.startswith(b'\xFF\xD8'):
|
| 176 |
+
return False, "Invalid JPEG header (missing SOI marker)"
|
| 177 |
+
|
| 178 |
+
# Check for proper EOI marker at the end
|
| 179 |
+
if not data.endswith(b'\xFF\xD9'):
|
| 180 |
+
return False, "Missing EOI marker at end of file"
|
| 181 |
+
|
| 182 |
+
# Check for key JPEG segments
|
| 183 |
+
# SOF marker (Start of Frame) - At least one should be present
|
| 184 |
+
sof_markers = [b'\xFF\xC0', b'\xFF\xC1', b'\xFF\xC2', b'\xFF\xC3']
|
| 185 |
+
has_sof = any(marker in data for marker in sof_markers)
|
| 186 |
+
if not has_sof:
|
| 187 |
+
return False, "No Start of Frame (SOF) marker found"
|
| 188 |
+
|
| 189 |
+
# Check for SOS marker (Start of Scan)
|
| 190 |
+
if b'\xFF\xDA' not in data:
|
| 191 |
+
return False, "No Start of Scan (SOS) marker found"
|
| 192 |
+
|
| 193 |
+
# Scan through the file to check marker structure
|
| 194 |
+
i = 2 # Skip SOI marker
|
| 195 |
+
while i < len(data) - 1:
|
| 196 |
+
if data[i] == 0xFF and data[i+1] != 0x00 and data[i+1] != 0xFF:
|
| 197 |
+
# Found a marker
|
| 198 |
+
marker = data[i:i+2]
|
| 199 |
+
|
| 200 |
+
# For markers with length fields, validate length
|
| 201 |
+
if (0xC0 <= data[i+1] <= 0xCF and data[i+1] != 0xC4 and data[i+1] != 0xC8) or \
|
| 202 |
+
(0xDB <= data[i+1] <= 0xFE):
|
| 203 |
+
if i + 4 >= len(data):
|
| 204 |
+
return False, f"Truncated marker {data[i+1]:02X} at position {i}"
|
| 205 |
+
length = struct.unpack('>H', data[i+2:i+4])[0]
|
| 206 |
+
if i + 2 + length > len(data):
|
| 207 |
+
return False, f"Invalid segment length for marker {data[i+1]:02X}"
|
| 208 |
+
i += 2 + length
|
| 209 |
+
continue
|
| 210 |
+
|
| 211 |
+
# Move to next byte
|
| 212 |
+
i += 1
|
| 213 |
+
|
| 214 |
+
return True, "JPEG structure appears valid"
|
| 215 |
+
except Exception as e:
|
| 216 |
+
return False, f"Error during JPEG structure check: {str(e)}"
|
| 217 |
+
|
| 218 |
+
def check_png_structure(file_path):
|
| 219 |
+
"""
|
| 220 |
+
Performs a deep check of PNG file structure to find corruption.
|
| 221 |
+
Returns (is_valid, error_message)
|
| 222 |
+
"""
|
| 223 |
+
try:
|
| 224 |
+
with open(file_path, 'rb') as f:
|
| 225 |
+
data = f.read()
|
| 226 |
+
|
| 227 |
+
# Check for PNG signature
|
| 228 |
+
png_signature = b'\x89PNG\r\n\x1a\n'
|
| 229 |
+
if not data.startswith(png_signature):
|
| 230 |
+
return False, "Invalid PNG signature"
|
| 231 |
+
|
| 232 |
+
# Check minimum viable PNG (signature + IHDR chunk)
|
| 233 |
+
if len(data) < 8 + 12: # 8 bytes signature + 12 bytes min IHDR chunk
|
| 234 |
+
return False, "PNG file too small to contain valid header"
|
| 235 |
+
|
| 236 |
+
# Check for IEND chunk at the end
|
| 237 |
+
if not data.endswith(b'IEND\xaeB`\x82'):
|
| 238 |
+
return False, "Missing IEND chunk at end of file"
|
| 239 |
+
|
| 240 |
+
# Parse chunks
|
| 241 |
+
pos = 8 # Skip signature
|
| 242 |
+
required_chunks = {'IHDR': False}
|
| 243 |
+
|
| 244 |
+
while pos < len(data):
|
| 245 |
+
if pos + 8 > len(data):
|
| 246 |
+
return False, "Truncated chunk header"
|
| 247 |
+
|
| 248 |
+
# Read chunk length and type
|
| 249 |
+
chunk_len = struct.unpack('>I', data[pos:pos+4])[0]
|
| 250 |
+
chunk_type = data[pos+4:pos+8].decode('ascii', errors='replace')
|
| 251 |
+
|
| 252 |
+
# Validate chunk length
|
| 253 |
+
if pos + chunk_len + 12 > len(data):
|
| 254 |
+
return False, f"Truncated {chunk_type} chunk"
|
| 255 |
+
|
| 256 |
+
# Track required chunks
|
| 257 |
+
if chunk_type in required_chunks:
|
| 258 |
+
required_chunks[chunk_type] = True
|
| 259 |
+
|
| 260 |
+
# Special validation for IHDR chunk
|
| 261 |
+
if chunk_type == 'IHDR' and chunk_len != 13:
|
| 262 |
+
return False, "Invalid IHDR chunk length"
|
| 263 |
+
|
| 264 |
+
# Mandatory IHDR must be first chunk
|
| 265 |
+
if pos == 8 and chunk_type != 'IHDR':
|
| 266 |
+
return False, "First chunk must be IHDR"
|
| 267 |
+
|
| 268 |
+
# IEND must be the last chunk
|
| 269 |
+
if chunk_type == 'IEND' and pos + chunk_len + 12 != len(data):
|
| 270 |
+
return False, "Data after IEND chunk"
|
| 271 |
+
|
| 272 |
+
# Move to next chunk
|
| 273 |
+
pos += chunk_len + 12 # Length (4) + Type (4) + Data (chunk_len) + CRC (4)
|
| 274 |
+
|
| 275 |
+
# Verify required chunks
|
| 276 |
+
for chunk, present in required_chunks.items():
|
| 277 |
+
if not present:
|
| 278 |
+
return False, f"Missing required {chunk} chunk"
|
| 279 |
+
|
| 280 |
+
return True, "PNG structure appears valid"
|
| 281 |
+
except Exception as e:
|
| 282 |
+
return False, f"Error during PNG structure check: {str(e)}"
|
| 283 |
+
|
| 284 |
+
def validate_subprocess_path(file_path):
|
| 285 |
+
"""
|
| 286 |
+
Validate file path before passing to subprocess to prevent command injection.
|
| 287 |
+
|
| 288 |
+
Args:
|
| 289 |
+
file_path: Path to validate
|
| 290 |
+
|
| 291 |
+
Returns:
|
| 292 |
+
True if path is safe
|
| 293 |
+
|
| 294 |
+
Raises:
|
| 295 |
+
ValueError: If path contains dangerous characters or patterns
|
| 296 |
+
"""
|
| 297 |
+
import re
|
| 298 |
+
|
| 299 |
+
# Must be an absolute path
|
| 300 |
+
if not os.path.isabs(file_path):
|
| 301 |
+
raise ValueError(f"Path must be absolute: {file_path}")
|
| 302 |
+
|
| 303 |
+
# File must exist
|
| 304 |
+
if not os.path.exists(file_path):
|
| 305 |
+
raise ValueError(f"File does not exist: {file_path}")
|
| 306 |
+
|
| 307 |
+
# Check for shell metacharacters and dangerous patterns
|
| 308 |
+
# Allow: alphanumeric, spaces, dots, dashes, underscores, forward slashes
|
| 309 |
+
# Block: semicolons, pipes, backticks, $, &, >, <, etc.
|
| 310 |
+
dangerous_chars = ['`', '$', '&', '|', ';', '>', '<', '\n', '\r', '(', ')']
|
| 311 |
+
for char in dangerous_chars:
|
| 312 |
+
if char in file_path:
|
| 313 |
+
raise ValueError(f"Dangerous character '{char}' found in path: {file_path}")
|
| 314 |
+
|
| 315 |
+
# Block path traversal attempts
|
| 316 |
+
if '..' in file_path:
|
| 317 |
+
raise ValueError(f"Path traversal pattern '..' detected: {file_path}")
|
| 318 |
+
|
| 319 |
+
# Block null bytes
|
| 320 |
+
if '\x00' in file_path:
|
| 321 |
+
raise ValueError("Null byte detected in path")
|
| 322 |
+
|
| 323 |
+
return True
|
| 324 |
+
|
| 325 |
+
|
| 326 |
+
def try_external_tools(file_path):
|
| 327 |
+
"""
|
| 328 |
+
Try using external tools to validate the image if they're available.
|
| 329 |
+
Returns (is_valid, message)
|
| 330 |
+
|
| 331 |
+
Security: Validates file path before passing to subprocess to prevent
|
| 332 |
+
command injection attacks.
|
| 333 |
+
"""
|
| 334 |
+
# Validate path before passing to subprocess
|
| 335 |
+
try:
|
| 336 |
+
validate_subprocess_path(file_path)
|
| 337 |
+
except ValueError as e:
|
| 338 |
+
logging.warning(f"Skipping external tool validation due to security check: {e}")
|
| 339 |
+
return True, "External tools check skipped (security)"
|
| 340 |
+
|
| 341 |
+
# Try using exiftool if available
|
| 342 |
+
try:
|
| 343 |
+
result = subprocess.run(['exiftool', '-m', '-p', '$Error', file_path],
|
| 344 |
+
capture_output=True, text=True, timeout=5)
|
| 345 |
+
if result.returncode == 0 and result.stdout.strip():
|
| 346 |
+
return False, f"Exiftool error: {result.stdout.strip()}"
|
| 347 |
+
|
| 348 |
+
# Check with identify (ImageMagick) if available
|
| 349 |
+
result = subprocess.run(['identify', '-verbose', file_path],
|
| 350 |
+
capture_output=True, text=True, timeout=5)
|
| 351 |
+
if result.returncode != 0:
|
| 352 |
+
return False, "ImageMagick identify failed to read the image"
|
| 353 |
+
|
| 354 |
+
return True, "Passed external tool validation"
|
| 355 |
+
except (subprocess.SubprocessError, FileNotFoundError):
|
| 356 |
+
# External tools not available or failed
|
| 357 |
+
return True, "External tools check skipped"
|
| 358 |
+
|
| 359 |
+
def try_full_decode_check(file_path):
|
| 360 |
+
"""
|
| 361 |
+
Try to fully decode the image to a temporary file.
|
| 362 |
+
This catches more subtle corruption that might otherwise be missed.
|
| 363 |
+
"""
|
| 364 |
+
try:
|
| 365 |
+
# For JPEGs, try to decode and re-encode the image
|
| 366 |
+
with Image.open(file_path) as img:
|
| 367 |
+
# Create a temporary file for testing
|
| 368 |
+
with tempfile.NamedTemporaryFile(delete=True) as tmp:
|
| 369 |
+
# Try to save a decoded copy
|
| 370 |
+
img.save(tmp.name, format="BMP")
|
| 371 |
+
|
| 372 |
+
# If we get here, the image data could be fully decoded
|
| 373 |
+
return True, "Full decode test passed"
|
| 374 |
+
except Exception as e:
|
| 375 |
+
return False, f"Full decode test failed: {str(e)}"
|
| 376 |
+
|
| 377 |
+
def check_visual_corruption(file_path, block_threshold=0.20, uniform_threshold=10, strict_mode=False):
|
| 378 |
+
"""
|
| 379 |
+
Analyze image content to detect visual corruption like large uniform areas.
|
| 380 |
+
|
| 381 |
+
Args:
|
| 382 |
+
file_path: Path to the image file
|
| 383 |
+
block_threshold: Percentage of image that must be uniform to be considered corrupt (0.0-1.0)
|
| 384 |
+
uniform_threshold: Color variation threshold for considering pixels "uniform"
|
| 385 |
+
strict_mode: If True, only detect gray/black areas as corruption indicators
|
| 386 |
+
|
| 387 |
+
Returns:
|
| 388 |
+
(is_visually_corrupt, details)
|
| 389 |
+
"""
|
| 390 |
+
try:
|
| 391 |
+
with Image.open(file_path) as img:
|
| 392 |
+
# Get image dimensions
|
| 393 |
+
width, height = img.size
|
| 394 |
+
total_pixels = width * height
|
| 395 |
+
|
| 396 |
+
# Convert to RGB to ensure consistent analysis
|
| 397 |
+
if img.mode != "RGB":
|
| 398 |
+
img = img.convert("RGB")
|
| 399 |
+
|
| 400 |
+
# Sample the image (analyzing every pixel would be too slow)
|
| 401 |
+
# We'll sample a grid of points; a denser grid gives more accuracy at the cost of speed
|
| 402 |
+
sample_step = max(1, min(width, height) // 150) # Adjust based on image size
|
| 403 |
+
|
| 404 |
+
# Track unique colors and their counts
|
| 405 |
+
color_counts = {}
|
| 406 |
+
total_samples = 0
|
| 407 |
+
|
| 408 |
+
# Sample the image
|
| 409 |
+
for y in range(0, height, sample_step):
|
| 410 |
+
for x in range(0, width, sample_step):
|
| 411 |
+
total_samples += 1
|
| 412 |
+
pixel = img.getpixel((x, y))
|
| 413 |
+
|
| 414 |
+
# Round pixel values to reduce sensitivity to minor variations
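# For example, with uniform_threshold=10 a pixel (123, 45, 208) is bucketed
# as (120, 40, 200), so shades that differ by less than the threshold
# collapse into the same color key.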
|
| 415 |
+
rounded_pixel = (
|
| 416 |
+
pixel[0] // uniform_threshold * uniform_threshold,
|
| 417 |
+
pixel[1] // uniform_threshold * uniform_threshold,
|
| 418 |
+
pixel[2] // uniform_threshold * uniform_threshold
|
| 419 |
+
)
|
| 420 |
+
|
| 421 |
+
if rounded_pixel in color_counts:
|
| 422 |
+
color_counts[rounded_pixel] += 1
|
| 423 |
+
else:
|
| 424 |
+
color_counts[rounded_pixel] = 1
|
| 425 |
+
|
| 426 |
+
# Find the most common color
|
| 427 |
+
most_common_color = max(color_counts.items(), key=lambda x: x[1])
|
| 428 |
+
most_common_percentage = most_common_color[1] / total_samples
|
| 429 |
+
|
| 430 |
+
# Check for large blocks of uniform color (potential corruption)
|
| 431 |
+
if most_common_percentage > block_threshold:
|
| 432 |
+
# Calculate approximate percentage of the image affected
|
| 433 |
+
affected_pct = most_common_percentage * 100
|
| 434 |
+
color_value = most_common_color[0]
|
| 435 |
+
|
| 436 |
+
# Determine if this is likely corruption
|
| 437 |
+
# Gray/black areas are common in corruption
|
| 438 |
+
is_dark = sum(color_value) < 3 * uniform_threshold # Very dark areas
|
| 439 |
+
|
| 440 |
+
# Check if it's a gray area (equal R,G,B values)
|
| 441 |
+
is_gray = abs(color_value[0] - color_value[1]) < uniform_threshold and \
|
| 442 |
+
abs(color_value[1] - color_value[2]) < uniform_threshold and \
|
| 443 |
+
abs(color_value[0] - color_value[2]) < uniform_threshold
|
| 444 |
+
|
| 445 |
+
# Only consider mid-range grays as corruption indicators (not white/black)
|
| 446 |
+
is_mid_gray = is_gray and 30 < sum(color_value)/3 < 220
|
| 447 |
+
|
| 448 |
+
# Special case: almost pure white is often legitimate content
|
| 449 |
+
is_white = color_value[0] > 240 and color_value[1] > 240 and color_value[2] > 240
|
| 450 |
+
|
| 451 |
+
# Determine likelihood of corruption based on color and percentage
|
| 452 |
+
if (is_dark or is_mid_gray) and not is_white:
|
| 453 |
+
# Higher threshold for white areas since they're common in legitimate images
|
| 454 |
+
white_threshold = 0.4 # 40% of image
|
| 455 |
+
if is_white and most_common_percentage < white_threshold:
|
| 456 |
+
return False, f"Large white area ({affected_pct:.1f}%) but likely not corruption"
|
| 457 |
+
|
| 458 |
+
# More likely to be corruption
|
| 459 |
+
return True, f"Visual corruption detected: {affected_pct:.1f}% of image is uniform {color_value}"
|
| 460 |
+
else:
|
| 461 |
+
# Could be a legitimate image with a uniform background
|
| 462 |
+
return False, f"Large uniform area ({affected_pct:.1f}%) but likely not corruption"
|
| 463 |
+
|
| 464 |
+
# Check for other telltale signs of corruption - but only in strict mode
|
| 465 |
+
if strict_mode:
|
| 466 |
+
# 1. Excessive color blocks (fragmentation) - this works well for detecting noise
|
| 467 |
+
if len(color_counts) > total_samples * 0.85 and total_samples > 200:
|
| 468 |
+
return True, f"Excessive color fragmentation detected ({len(color_counts)} colors in {total_samples} samples)"
|
| 469 |
+
|
| 470 |
+
# 2. Check for very specific corruption patterns
|
| 471 |
+
# Analyze distribution of colors to look for unusual patterns
|
| 472 |
+
if total_samples > 500: # Only for larger images with enough samples
|
| 473 |
+
# Check if there's an unnatural color distribution
|
| 474 |
+
# Normal photos have a more gradual distribution rather than spikes
|
| 475 |
+
sorted_counts = sorted(color_counts.values(), reverse=True)
|
| 476 |
+
|
| 477 |
+
# Calculate the color distribution ratio
|
| 478 |
+
if len(sorted_counts) > 5:
|
| 479 |
+
top5_ratio = sum(sorted_counts[:5]) / sum(sorted_counts)
|
| 480 |
+
# In a natural photo the top 5 colors still cover a meaningful share of the samples;
|
| 481 |
+
# when they cover almost nothing, the distribution looks like random noise rather than real content
|
| 482 |
+
if top5_ratio < 0.2 and most_common_percentage < 0.1:
|
| 483 |
+
return True, f"Unusual color distribution (possible noise/corruption)"
|
| 484 |
+
|
| 485 |
+
return False, "No visual corruption detected"
|
| 486 |
+
|
| 487 |
+
except Exception as e:
|
| 488 |
+
return False, f"Error during visual analysis: {str(e)}"
|
| 489 |
+
|
| 490 |
+
def is_valid_image(file_path, thorough=True, sensitivity='medium', ignore_eof=False, check_visual=False, visual_strictness='medium'):
|
| 491 |
+
"""
|
| 492 |
+
Validate image file integrity using multiple methods.
|
| 493 |
+
|
| 494 |
+
Args:
|
| 495 |
+
file_path: Path to the image file
|
| 496 |
+
thorough: Whether to perform deep structure validation
|
| 497 |
+
sensitivity: 'low', 'medium', or 'high'
|
| 498 |
+
ignore_eof: Whether to ignore missing end-of-file markers
|
| 499 |
+
check_visual: Whether to perform visual content analysis to detect corruption
|
| 500 |
+
visual_strictness: 'low', 'medium', or 'high' strictness for visual corruption detection
|
| 501 |
+
|
| 502 |
+
Returns:
|
| 503 |
+
True if valid, False if corrupt.
|
| 504 |
+
"""
|
| 505 |
+
# Basic PIL validation first (fast check)
|
| 506 |
+
try:
|
| 507 |
+
with Image.open(file_path) as img:
|
| 508 |
+
# verify() checks the file header
|
| 509 |
+
img.verify()
|
| 510 |
+
|
| 511 |
+
# Additional step: try to load the image data
|
| 512 |
+
# This catches more corruption issues
|
| 513 |
+
with Image.open(file_path) as img2:
|
| 514 |
+
img2.load()
|
| 515 |
+
|
| 516 |
+
# If check_visual is enabled, analyze the image content
|
| 517 |
+
if check_visual:
|
| 518 |
+
# Set thresholds based on strictness level
|
| 519 |
+
if visual_strictness == 'low':
|
| 520 |
+
# More permissive - only detect very obvious corruption
|
| 521 |
+
block_threshold = 0.3 # 30% of the image must be uniform
|
| 522 |
+
uniform_threshold = 5 # Smaller color variations are allowed
|
| 523 |
+
elif visual_strictness == 'high':
|
| 524 |
+
# Most strict - catches subtle corruption but may have false positives
|
| 525 |
+
block_threshold = 0.15 # Only 15% of the image needs to be uniform
|
| 526 |
+
uniform_threshold = 15 # Larger color variations are considered uniform
|
| 527 |
+
else: # medium (default)
|
| 528 |
+
block_threshold = 0.20 # 20% threshold
|
| 529 |
+
uniform_threshold = 10
|
| 530 |
+
|
| 531 |
+
# Check for visual corruption with appropriate thresholds
|
| 532 |
+
is_visually_corrupt, msg = check_visual_corruption(
|
| 533 |
+
file_path,
|
| 534 |
+
block_threshold=block_threshold,
|
| 535 |
+
uniform_threshold=uniform_threshold,
|
| 536 |
+
# Only use additional detection methods in high strictness mode
|
| 537 |
+
strict_mode=(visual_strictness == 'high')
|
| 538 |
+
)
|
| 539 |
+
|
| 540 |
+
if is_visually_corrupt:
|
| 541 |
+
logging.debug(f"Visual corruption detected in {file_path}: {msg}")
|
| 542 |
+
return False
|
| 543 |
+
|
| 544 |
+
# If thorough checking is disabled, return after basic check
|
| 545 |
+
if not thorough or sensitivity == 'low':
|
| 546 |
+
return True
|
| 547 |
+
|
| 548 |
+
# For JPEG files, do additional structure checking
|
| 549 |
+
if file_path.lower().endswith(tuple(SUPPORTED_FORMATS['JPEG'])):
|
| 550 |
+
# Check JPEG structure
|
| 551 |
+
is_valid, error_msg = check_jpeg_structure(file_path)
|
| 552 |
+
if not is_valid:
|
| 553 |
+
# If ignore_eof is enabled and the only issue is missing EOI marker, consider it valid
|
| 554 |
+
if ignore_eof and error_msg == "Missing EOI marker at end of file":
|
| 555 |
+
logging.debug(f"Ignoring missing EOI marker for {file_path} as requested")
|
| 556 |
+
else:
|
| 557 |
+
logging.debug(f"JPEG structure invalid for {file_path}: {error_msg}")
|
| 558 |
+
return False
|
| 559 |
+
|
| 560 |
+
# Try full decode test (catches subtle corruption)
|
| 561 |
+
is_valid, error_msg = try_full_decode_check(file_path)
|
| 562 |
+
if not is_valid:
|
| 563 |
+
logging.debug(f"Full decode test failed for {file_path}: {error_msg}")
|
| 564 |
+
return False
|
| 565 |
+
|
| 566 |
+
# Try external tools if applicable
|
| 567 |
+
is_valid, error_msg = try_external_tools(file_path)
|
| 568 |
+
if not is_valid:
|
| 569 |
+
logging.debug(f"External tool validation failed for {file_path}: {error_msg}")
|
| 570 |
+
return False
|
| 571 |
+
|
| 572 |
+
# For PNG files, do additional structure checking
|
| 573 |
+
elif file_path.lower().endswith(tuple(SUPPORTED_FORMATS['PNG'])):
|
| 574 |
+
# Check PNG structure
|
| 575 |
+
is_valid, error_msg = check_png_structure(file_path)
|
| 576 |
+
if not is_valid:
|
| 577 |
+
logging.debug(f"PNG structure invalid for {file_path}: {error_msg}")
|
| 578 |
+
return False
|
| 579 |
+
|
| 580 |
+
# Try full decode test (catches subtle corruption)
|
| 581 |
+
is_valid, error_msg = try_full_decode_check(file_path)
|
| 582 |
+
if not is_valid:
|
| 583 |
+
logging.debug(f"Full decode test failed for {file_path}: {error_msg}")
|
| 584 |
+
return False
|
| 585 |
+
|
| 586 |
+
return True
|
| 587 |
+
except Exception as e:
|
| 588 |
+
logging.debug(f"Invalid image {file_path}: {str(e)}")
|
| 589 |
+
return False
|
| 590 |
+
|
| 591 |
+
def attempt_repair(file_path, backup_dir=None):
|
| 592 |
+
"""
|
| 593 |
+
Attempts to repair corrupt image files.
|
| 594 |
+
Returns: (success, message, fixed_width, fixed_height)
|
| 595 |
+
"""
|
| 596 |
+
# Create backup if requested
|
| 597 |
+
if backup_dir:
|
| 598 |
+
backup_path = os.path.join(backup_dir, os.path.basename(file_path) + ".bak")
|
| 599 |
+
try:
|
| 600 |
+
shutil.copy2(file_path, backup_path)
|
| 601 |
+
logging.debug(f"Created backup at {backup_path}")
|
| 602 |
+
except Exception as e:
|
| 603 |
+
logging.warning(f"Could not create backup: {str(e)}")
|
| 604 |
+
|
| 605 |
+
try:
|
| 606 |
+
# First, diagnose the issue
|
| 607 |
+
issue_type, details = diagnose_image_issue(file_path)
|
| 608 |
+
logging.debug(f"Diagnosis for {file_path}: {issue_type} - {details}")
|
| 609 |
+
|
| 610 |
+
file_ext = os.path.splitext(file_path)[1].lower()
|
| 611 |
+
|
| 612 |
+
# Check if file format is supported for repair
|
| 613 |
+
format_supported = False
|
| 614 |
+
for fmt in REPAIRABLE_FORMATS:
|
| 615 |
+
if file_ext in SUPPORTED_FORMATS[fmt]:
|
| 616 |
+
format_supported = True
|
| 617 |
+
break
|
| 618 |
+
|
| 619 |
+
if not format_supported:
|
| 620 |
+
return False, f"Format not supported for repair ({file_ext})", None, None
|
| 621 |
+
|
| 622 |
+
# Try to open and resave the image with PIL's error forgiveness
|
| 623 |
+
# This works for many truncated files
|
| 624 |
+
try:
|
| 625 |
+
with Image.open(file_path) as img:
|
| 626 |
+
width, height = img.size
|
| 627 |
+
format = img.format
|
| 628 |
+
|
| 629 |
+
# Create a buffer for the fixed image
|
| 630 |
+
buffer = io.BytesIO()
|
| 631 |
+
img.save(buffer, format=format)
|
| 632 |
+
|
| 633 |
+
# Write the repaired image back to the original file
|
| 634 |
+
with open(file_path, 'wb') as f:
|
| 635 |
+
f.write(buffer.getvalue())
|
| 636 |
+
|
| 637 |
+
# Verify the repaired image
|
| 638 |
+
if is_valid_image(file_path):
|
| 639 |
+
return True, f"Repaired {issue_type} issue", width, height
|
| 640 |
+
else:
|
| 641 |
+
# If verification fails, retry JPEG files with JPEG-specific save options
|
| 642 |
+
if format == 'JPEG':
|
| 643 |
+
with Image.open(file_path) as img:
|
| 644 |
+
buffer = io.BytesIO()
|
| 645 |
+
# Use optimize=True and quality=85 for better repair chances
|
| 646 |
+
img.save(buffer, format='JPEG', optimize=True, quality=85)
|
| 647 |
+
with open(file_path, 'wb') as f:
|
| 648 |
+
f.write(buffer.getvalue())
|
| 649 |
+
|
| 650 |
+
if is_valid_image(file_path):
|
| 651 |
+
return True, f"Repaired {issue_type} issue with JPEG optimization", width, height
|
| 652 |
+
|
| 653 |
+
return False, f"Failed to repair {issue_type} issue", None, None
|
| 654 |
+
|
| 655 |
+
except Exception as e:
|
| 656 |
+
logging.debug(f"Repair attempt failed for {file_path}: {str(e)}")
|
| 657 |
+
return False, f"Repair failed: {str(e)}", None, None
|
| 658 |
+
|
| 659 |
+
except Exception as e:
|
| 660 |
+
logging.debug(f"Error during repair of {file_path}: {str(e)}")
|
| 661 |
+
return False, f"Repair error: {str(e)}", None, None
|
| 662 |
+
|
| 663 |
+
def process_file(args):
|
| 664 |
+
"""Process a single image file."""
|
| 665 |
+
file_path, repair_mode, repair_dir, thorough_check, sensitivity, ignore_eof, check_visual, visual_strictness, enable_security_checks = args
|
| 666 |
+
|
| 667 |
+
# Security validation (if enabled)
|
| 668 |
+
if enable_security_checks:
|
| 669 |
+
try:
|
| 670 |
+
is_safe, warnings = validate_file_security(file_path, check_size=True, check_dimensions=True)
|
| 671 |
+
|
| 672 |
+
# Log security warnings
|
| 673 |
+
for warning in warnings:
|
| 674 |
+
logging.warning(f"Security warning for {file_path}: {warning}")
|
| 675 |
+
|
| 676 |
+
if not is_safe:
|
| 677 |
+
# File failed security checks - treat as invalid
|
| 678 |
+
size = os.path.getsize(file_path)
|
| 679 |
+
return file_path, False, size, "security_failed", "Failed security validation", None
|
| 680 |
+
|
| 681 |
+
except ValueError as e:
|
| 682 |
+
# Critical security failure (file too large, dimensions too big, etc.)
|
| 683 |
+
logging.error(f"Security check failed for {file_path}: {e}")
|
| 684 |
+
size = os.path.getsize(file_path) if os.path.exists(file_path) else 0
|
| 685 |
+
return file_path, False, size, "security_failed", str(e), None
|
| 686 |
+
except Exception as e:
|
| 687 |
+
# Unexpected error during security validation
|
| 688 |
+
logging.debug(f"Security validation error for {file_path}: {e}")
|
| 689 |
+
# Fall through and continue processing the file anyway
|
| 690 |
+
|
| 691 |
+
# Check if the image is valid
|
| 692 |
+
is_valid = is_valid_image(file_path, thorough=thorough_check, sensitivity=sensitivity,
|
| 693 |
+
ignore_eof=ignore_eof, check_visual=check_visual, visual_strictness=visual_strictness)
|
| 694 |
+
|
| 695 |
+
if not is_valid and repair_mode:
|
| 696 |
+
# Try to repair the file
|
| 697 |
+
repair_success, repair_msg, width, height = attempt_repair(file_path, repair_dir)
|
| 698 |
+
|
| 699 |
+
if repair_success:
|
| 700 |
+
# File was repaired
|
| 701 |
+
return file_path, True, 0, "repaired", repair_msg, (width, height)
|
| 702 |
+
else:
|
| 703 |
+
# File is still corrupt
|
| 704 |
+
size = os.path.getsize(file_path)
|
| 705 |
+
return file_path, False, size, "repair_failed", repair_msg, None
|
| 706 |
+
else:
|
| 707 |
+
# No repair attempted or file is valid
|
| 708 |
+
size = os.path.getsize(file_path) if not is_valid else 0
|
| 709 |
+
return file_path, is_valid, size, "not_repaired", None, None
|
| 710 |
+
|
| 711 |
+
def get_session_id(directory, formats, recursive):
|
| 712 |
+
"""Generate a unique session ID based on scan parameters."""
|
| 713 |
+
# Create a unique identifier for this scan session
|
| 714 |
+
dir_path = str(directory).encode('utf-8')
|
| 715 |
+
formats_str = ",".join(sorted(formats)).encode('utf-8')
|
| 716 |
+
recursive_str = str(recursive).encode('utf-8')
|
| 717 |
+
|
| 718 |
+
# Use SHA256 instead of MD5 for better security
|
| 719 |
+
# MD5 is cryptographically broken and should not be used
|
| 720 |
+
hash_obj = hashlib.sha256()
|
| 721 |
+
hash_obj.update(dir_path)
|
| 722 |
+
hash_obj.update(formats_str)
|
| 723 |
+
hash_obj.update(recursive_str)
|
| 724 |
+
|
| 725 |
+
return hash_obj.hexdigest()[:16] # Use first 16 chars of hash for uniqueness
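# Example (illustrative arguments): the same directory/format/recursion triple
# always maps to the same 16-character ID, which is what makes --resume work.
#
#     session_id = get_session_id("/photos", ["JPEG", "PNG"], True)   # e.g. a 16-char hex string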
|
| 726 |
+
|
| 727 |
+
def _deduplicate(seq):
|
| 728 |
+
"""Return a list with duplicates removed while preserving order."""
|
| 729 |
+
seen = set()
|
| 730 |
+
deduped = []
|
| 731 |
+
for item in seq:
|
| 732 |
+
if item not in seen:
|
| 733 |
+
deduped.append(item)
|
| 734 |
+
seen.add(item)
|
| 735 |
+
return deduped
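# Example: _deduplicate(['a.jpg', 'b.jpg', 'a.jpg']) -> ['a.jpg', 'b.jpg']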
|
| 736 |
+
|
| 737 |
+
|
| 738 |
+
def validate_file_security(file_path, check_size=True, check_dimensions=True):
|
| 739 |
+
"""
|
| 740 |
+
Perform security validation on a file before processing.
|
| 741 |
+
|
| 742 |
+
Args:
|
| 743 |
+
file_path: Path to the file
|
| 744 |
+
check_size: Whether to check file size limits
|
| 745 |
+
check_dimensions: Whether to check image dimension limits
|
| 746 |
+
|
| 747 |
+
Returns:
|
| 748 |
+
(is_safe, warnings) - tuple of boolean and list of warning messages
|
| 749 |
+
|
| 750 |
+
Raises:
|
| 751 |
+
ValueError: If file fails critical security checks
|
| 752 |
+
"""
|
| 753 |
+
warnings = []
|
| 754 |
+
|
| 755 |
+
# Check if file exists
|
| 756 |
+
if not os.path.exists(file_path):
|
| 757 |
+
raise ValueError(f"File does not exist: {file_path}")
|
| 758 |
+
|
| 759 |
+
# Check file size to prevent DoS via huge files
|
| 760 |
+
if check_size:
|
| 761 |
+
file_size = os.path.getsize(file_path)
|
| 762 |
+
if file_size > MAX_FILE_SIZE:
|
| 763 |
+
raise ValueError(f"File too large ({file_size} bytes, max {MAX_FILE_SIZE}). "
|
| 764 |
+
f"This could indicate a malicious file or decompression bomb.")
|
| 765 |
+
|
| 766 |
+
# Warn about suspiciously large files (over 10MB for images is unusual)
|
| 767 |
+
if file_size > 10 * 1024 * 1024:
|
| 768 |
+
warnings.append(f"Large file size: {humanize.naturalsize(file_size)}")
|
| 769 |
+
|
| 770 |
+
# Check image dimensions to prevent decompression bombs
|
| 771 |
+
if check_dimensions:
|
| 772 |
+
try:
|
| 773 |
+
with Image.open(file_path) as img:
|
| 774 |
+
width, height = img.size
|
| 775 |
+
total_pixels = width * height
|
| 776 |
+
|
| 777 |
+
if total_pixels > MAX_IMAGE_PIXELS:
|
| 778 |
+
raise ValueError(f"Image dimensions too large ({width}x{height} = {total_pixels} pixels, "
|
| 779 |
+
f"max {MAX_IMAGE_PIXELS}). This could be a decompression bomb attack.")
|
| 780 |
+
|
| 781 |
+
# Warn about very large images
|
| 782 |
+
if total_pixels > 10000 * 10000:
|
| 783 |
+
warnings.append(f"Large image dimensions: {width}x{height}")
|
| 784 |
+
|
| 785 |
+
# Check for format mismatch (file extension vs actual format)
|
| 786 |
+
actual_format = img.format
|
| 787 |
+
expected_formats = []
|
| 788 |
+
for fmt, extensions in SUPPORTED_FORMATS.items():
|
| 789 |
+
if file_path.lower().endswith(tuple(extensions)):
|
| 790 |
+
expected_formats.append(fmt)
|
| 791 |
+
|
| 792 |
+
if actual_format and expected_formats and actual_format not in expected_formats:
|
| 793 |
+
warnings.append(f"Format mismatch: file has '{file_path.split('.')[-1]}' extension "
|
| 794 |
+
f"but is actually '{actual_format}' format")
|
| 795 |
+
|
| 796 |
+
except UnidentifiedImageError:
|
| 797 |
+
raise ValueError(f"Cannot identify image format - file may be corrupted or malicious")
|
| 798 |
+
except Exception as e:
|
| 799 |
+
raise ValueError(f"Error validating image: {str(e)}")
|
| 800 |
+
|
| 801 |
+
return True, warnings
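# Usage sketch (illustrative path): non-fatal findings come back as warnings,
# while hard failures (oversized files, decompression bombs) raise ValueError.
#
#     try:
#         ok, notes = validate_file_security("photo.jpg")
#         for note in notes:
#             logging.warning(note)
#     except ValueError as exc:
#         logging.error(f"Rejected: {exc}")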
|
| 802 |
+
|
| 803 |
+
|
| 804 |
+
def calculate_file_hash(file_path, algorithm='sha256'):
|
| 805 |
+
"""
|
| 806 |
+
Calculate cryptographic hash of a file.
|
| 807 |
+
|
| 808 |
+
Args:
|
| 809 |
+
file_path: Path to the file
|
| 810 |
+
algorithm: Hash algorithm to use (sha256, sha512, etc.)
|
| 811 |
+
|
| 812 |
+
Returns:
|
| 813 |
+
Hexadecimal hash string
|
| 814 |
+
"""
|
| 815 |
+
hash_obj = hashlib.new(algorithm)
|
| 816 |
+
|
| 817 |
+
# Read file in chunks to handle large files
|
| 818 |
+
with open(file_path, 'rb') as f:
|
| 819 |
+
for chunk in iter(lambda: f.read(4096), b''):
|
| 820 |
+
hash_obj.update(chunk)
|
| 821 |
+
|
| 822 |
+
return hash_obj.hexdigest()
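# Example (illustrative path): hashes are computed in 4 KB chunks, so this is
# safe to call on large files as well.
#
#     digest = calculate_file_hash("photo.jpg")               # SHA-256 by default
#     digest512 = calculate_file_hash("photo.jpg", "sha512")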
|
| 823 |
+
|
| 824 |
+
|
| 825 |
+
def safe_join_path(base_dir, user_path):
|
| 826 |
+
"""
|
| 827 |
+
Safely join paths and prevent path traversal attacks.
|
| 828 |
+
|
| 829 |
+
Args:
|
| 830 |
+
base_dir: Base directory (trusted)
|
| 831 |
+
user_path: User-provided path component (untrusted)
|
| 832 |
+
|
| 833 |
+
Returns:
|
| 834 |
+
Safe absolute path within base_dir
|
| 835 |
+
|
| 836 |
+
Raises:
|
| 837 |
+
ValueError: If path traversal is detected
|
| 838 |
+
"""
|
| 839 |
+
# Normalize base directory
|
| 840 |
+
base_dir = os.path.abspath(base_dir)
|
| 841 |
+
|
| 842 |
+
# Join paths
|
| 843 |
+
full_path = os.path.normpath(os.path.join(base_dir, user_path))
|
| 844 |
+
|
| 845 |
+
# Normalize to an absolute path (note: abspath does not resolve symlinks)
|
| 846 |
+
full_path = os.path.abspath(full_path)
|
| 847 |
+
|
| 848 |
+
# Ensure the result is within base_dir
|
| 849 |
+
if not full_path.startswith(base_dir + os.sep) and full_path != base_dir:
|
| 850 |
+
raise ValueError(f"Path traversal detected: '{user_path}' resolves outside base directory")
|
| 851 |
+
|
| 852 |
+
return full_path
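# Sketch of the intended behaviour (hypothetical paths):
#
#     safe_join_path("/quarantine", "2023/bad.jpg")      # -> "/quarantine/2023/bad.jpg"
#     safe_join_path("/quarantine", "../../etc/passwd")  # raises ValueError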
|
| 853 |
+
|
| 854 |
+
|
| 855 |
+
def save_progress(session_id, directory, formats, recursive, processed_files,
|
| 856 |
+
bad_files, repaired_files, progress_dir=DEFAULT_PROGRESS_DIR):
|
| 857 |
+
"""Save the current progress to a file."""
|
| 858 |
+
# Create progress directory if it doesn't exist
|
| 859 |
+
if not os.path.exists(progress_dir):
|
| 860 |
+
os.makedirs(progress_dir, exist_ok=True)
|
| 861 |
+
|
| 862 |
+
# Create a progress state object
|
| 863 |
+
progress_state = {
|
| 864 |
+
'version': VERSION,
|
| 865 |
+
'timestamp': datetime.now().isoformat(),
|
| 866 |
+
'directory': str(directory),
|
| 867 |
+
'formats': formats,
|
| 868 |
+
'recursive': recursive,
|
| 869 |
+
'processed_files': _deduplicate(processed_files),
|
| 870 |
+
'bad_files': _deduplicate(bad_files),
|
| 871 |
+
'repaired_files': _deduplicate(repaired_files)
|
| 872 |
+
}
|
| 873 |
+
|
| 874 |
+
# Save to file using JSON instead of pickle for security
|
| 875 |
+
# This prevents arbitrary code execution via malicious progress files
|
| 876 |
+
progress_file = os.path.join(progress_dir, f"session_{session_id}.progress.json")
|
| 877 |
+
with open(progress_file, 'w') as f:
|
| 878 |
+
json.dump(progress_state, f, indent=2)
|
| 879 |
+
|
| 880 |
+
logging.debug(f"Progress saved to {progress_file}")
|
| 881 |
+
return progress_file
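# The resulting session_<id>.progress.json file is plain JSON along these lines
# (values are illustrative; "version" mirrors the module's VERSION constant):
#
#     {
#       "version": "<VERSION>",
#       "timestamp": "2024-01-01T12:00:00",
#       "directory": "/photos",
#       "formats": ["JPEG", "PNG"],
#       "recursive": true,
#       "processed_files": ["/photos/a.jpg"],
#       "bad_files": [],
#       "repaired_files": []
#     }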
|
| 882 |
+
|
| 883 |
+
def load_progress(session_id, progress_dir=DEFAULT_PROGRESS_DIR):
|
| 884 |
+
"""Load progress from a saved session."""
|
| 885 |
+
# Try new JSON format first (more secure)
|
| 886 |
+
progress_file_json = os.path.join(progress_dir, f"session_{session_id}.progress.json")
|
| 887 |
+
progress_file_legacy = os.path.join(progress_dir, f"session_{session_id}.progress")
|
| 888 |
+
|
| 889 |
+
# Prefer JSON format for security
|
| 890 |
+
if os.path.exists(progress_file_json):
|
| 891 |
+
progress_file = progress_file_json
|
| 892 |
+
use_json = True
|
| 893 |
+
elif os.path.exists(progress_file_legacy):
|
| 894 |
+
progress_file = progress_file_legacy
|
| 895 |
+
use_json = False
|
| 896 |
+
logging.warning("Loading legacy pickle format. This format is deprecated for security reasons.")
|
| 897 |
+
else:
|
| 898 |
+
return None
|
| 899 |
+
|
| 900 |
+
try:
|
| 901 |
+
if use_json:
|
| 902 |
+
# Secure JSON deserialization
|
| 903 |
+
with open(progress_file, 'r') as f:
|
| 904 |
+
progress_state = json.load(f)
|
| 905 |
+
else:
|
| 906 |
+
# Legacy pickle support (with warning)
|
| 907 |
+
# TODO: Remove pickle support in future versions
|
| 908 |
+
import pickle
|
| 909 |
+
with open(progress_file, 'rb') as f:
|
| 910 |
+
progress_state = pickle.load(f)
|
| 911 |
+
logging.warning("SECURITY WARNING: Loaded progress file using unsafe pickle format. "
|
| 912 |
+
"Please delete old .progress files and use new .progress.json format.")
|
| 913 |
+
|
| 914 |
+
# Remove any duplicate entries from lists
|
| 915 |
+
for key in ('processed_files', 'bad_files', 'repaired_files'):
|
| 916 |
+
if key in progress_state:
|
| 917 |
+
progress_state[key] = _deduplicate(progress_state[key])
|
| 918 |
+
|
| 919 |
+
# Check version compatibility
|
| 920 |
+
if progress_state.get('version', '0.0.0') != VERSION:
|
| 921 |
+
logging.warning("Progress file was created with a different version. Some incompatibilities may exist.")
|
| 922 |
+
|
| 923 |
+
logging.info(f"Loaded progress from {progress_file}")
|
| 924 |
+
return progress_state
|
| 925 |
+
except Exception as e:
|
| 926 |
+
logging.error(f"Failed to load progress: {str(e)}")
|
| 927 |
+
return None
|
| 928 |
+
|
| 929 |
+
def list_saved_sessions(progress_dir=DEFAULT_PROGRESS_DIR):
|
| 930 |
+
"""List all saved sessions with their details."""
|
| 931 |
+
if not os.path.exists(progress_dir):
|
| 932 |
+
return []
|
| 933 |
+
|
| 934 |
+
sessions = []
|
| 935 |
+
for filename in os.listdir(progress_dir):
|
| 936 |
+
# Support both new JSON format and legacy pickle format
|
| 937 |
+
if filename.endswith('.progress.json') or filename.endswith('.progress'):
|
| 938 |
+
try:
|
| 939 |
+
filepath = os.path.join(progress_dir, filename)
|
| 940 |
+
use_json = filename.endswith('.progress.json')
|
| 941 |
+
|
| 942 |
+
if use_json:
|
| 943 |
+
with open(filepath, 'r') as f:
|
| 944 |
+
progress_state = json.load(f)
|
| 945 |
+
else:
|
| 946 |
+
# Legacy pickle format
|
| 947 |
+
import pickle
|
| 948 |
+
with open(filepath, 'rb') as f:
|
| 949 |
+
progress_state = pickle.load(f)
|
| 950 |
+
|
| 951 |
+
# Extract session ID from filename
|
| 952 |
+
if filename.endswith('.progress.json'):
|
| 953 |
+
session_id = filename.replace('session_', '').replace('.progress.json', '')
|
| 954 |
+
else:
|
| 955 |
+
session_id = filename.replace('session_', '').replace('.progress', '')
|
| 956 |
+
|
| 957 |
+
session_info = {
|
| 958 |
+
'id': session_id,
|
| 959 |
+
'timestamp': progress_state.get('timestamp', 'Unknown'),
|
| 960 |
+
'directory': progress_state.get('directory', 'Unknown'),
|
| 961 |
+
'formats': progress_state.get('formats', []),
|
| 962 |
+
'processed_count': len(progress_state.get('processed_files', [])),
|
| 963 |
+
'bad_count': len(progress_state.get('bad_files', [])),
|
| 964 |
+
'repaired_count': len(progress_state.get('repaired_files', [])),
|
| 965 |
+
'filepath': filepath,
|
| 966 |
+
'format': 'JSON' if use_json else 'Pickle (Legacy)'
|
| 967 |
+
}
|
| 968 |
+
sessions.append(session_info)
|
| 969 |
+
except Exception as e:
|
| 970 |
+
logging.debug(f"Failed to load session from {filename}: {str(e)}")
|
| 971 |
+
|
| 972 |
+
# Sort by timestamp, newest first
|
| 973 |
+
sessions.sort(key=lambda x: x['timestamp'], reverse=True)
|
| 974 |
+
return sessions
|
| 975 |
+
|
| 976 |
+
def get_extensions_for_formats(formats):
|
| 977 |
+
"""Get all file extensions for the specified formats."""
|
| 978 |
+
extensions = []
|
| 979 |
+
for fmt in formats:
|
| 980 |
+
if fmt in SUPPORTED_FORMATS:
|
| 981 |
+
extensions.extend(SUPPORTED_FORMATS[fmt])
|
| 982 |
+
return tuple(extensions)
|
| 983 |
+
|
| 984 |
+
def find_image_files(directory, formats, recursive=True):
|
| 985 |
+
"""Find all image files of specified formats in a directory."""
|
| 986 |
+
image_files = []
|
| 987 |
+
extensions = get_extensions_for_formats(formats)
|
| 988 |
+
|
| 989 |
+
if not extensions:
|
| 990 |
+
logging.warning("No valid image formats specified!")
|
| 991 |
+
return []
|
| 992 |
+
|
| 993 |
+
format_names = ", ".join(formats)
|
| 994 |
+
if recursive:
|
| 995 |
+
logging.info(f"Recursively scanning for {format_names} files...")
|
| 996 |
+
for root, _, files in os.walk(directory):
|
| 997 |
+
for file in files:
|
| 998 |
+
if file.lower().endswith(extensions):
|
| 999 |
+
image_files.append(os.path.join(root, file))
|
| 1000 |
+
else:
|
| 1001 |
+
logging.info(f"Scanning for {format_names} files in {directory} (non-recursive)...")
|
| 1002 |
+
for file in os.listdir(directory):
|
| 1003 |
+
if os.path.isfile(os.path.join(directory, file)) and file.lower().endswith(extensions):
|
| 1004 |
+
image_files.append(os.path.join(directory, file))
|
| 1005 |
+
|
| 1006 |
+
logging.info(f"Found {len(image_files)} image files")
|
| 1007 |
+
return image_files
|
| 1008 |
+
|
| 1009 |
+
def process_images(directory, formats, dry_run=True, repair=False,
|
| 1010 |
+
max_workers=None, recursive=True, move_to=None, repair_dir=None,
|
| 1011 |
+
save_progress_interval=5, resume_session=None, progress_dir=DEFAULT_PROGRESS_DIR,
|
| 1012 |
+
thorough_check=False, sensitivity='medium', ignore_eof=False, check_visual=False,
|
| 1013 |
+
visual_strictness='medium', enable_security_checks=False):
|
| 1014 |
+
"""Find corrupt image files and optionally repair, delete, or move them."""
|
| 1015 |
+
start_time = time.time()
|
| 1016 |
+
|
| 1017 |
+
# Generate session ID for this scan
|
| 1018 |
+
session_id = get_session_id(directory, formats, recursive)
|
| 1019 |
+
processed_files = []
|
| 1020 |
+
bad_files = []
|
| 1021 |
+
repaired_files = []
|
| 1022 |
+
total_size_saved = 0
|
| 1023 |
+
last_progress_save = time.time()
|
| 1024 |
+
|
| 1025 |
+
# If resuming, load previous progress
|
| 1026 |
+
if resume_session:
|
| 1027 |
+
try:
|
| 1028 |
+
progress = load_progress(resume_session, progress_dir)
|
| 1029 |
+
if progress and progress['directory'] == str(directory) and progress['formats'] == formats:
|
| 1030 |
+
processed_files = progress['processed_files']
|
| 1031 |
+
bad_files = progress['bad_files']
|
| 1032 |
+
repaired_files = progress['repaired_files']
|
| 1033 |
+
logging.info(f"Resuming session: {len(processed_files)} files already processed")
|
| 1034 |
+
else:
|
| 1035 |
+
if progress:
|
| 1036 |
+
logging.warning("Session parameters don't match current parameters. Starting fresh scan.")
|
| 1037 |
+
else:
|
| 1038 |
+
logging.warning(f"Couldn't find session {resume_session}. Starting fresh scan.")
|
| 1039 |
+
except Exception as e:
|
| 1040 |
+
logging.error(f"Error loading session: {str(e)}. Starting fresh scan.")
|
| 1041 |
+
|
| 1042 |
+
# Find all image files
|
| 1043 |
+
image_files = find_image_files(directory, formats, recursive)
|
| 1044 |
+
if not image_files:
|
| 1045 |
+
logging.warning("No image files found!")
|
| 1046 |
+
return [], [], 0
|
| 1047 |
+
|
| 1048 |
+
# Filter out already processed files if resuming
|
| 1049 |
+
if processed_files:
|
| 1050 |
+
remaining_files = [f for f in image_files if f not in processed_files]
|
| 1051 |
+
skipped_count = len(image_files) - len(remaining_files)
|
| 1052 |
+
image_files = remaining_files
|
| 1053 |
+
logging.info(f"Skipping {skipped_count} already processed files")
|
| 1054 |
+
|
| 1055 |
+
if not image_files:
|
| 1056 |
+
logging.info("All files have already been processed in the previous session!")
|
| 1057 |
+
return bad_files, repaired_files, total_size_saved
|
| 1058 |
+
|
| 1059 |
+
# Create directories if they don't exist
|
| 1060 |
+
if move_to and not os.path.exists(move_to):
|
| 1061 |
+
os.makedirs(move_to)
|
| 1062 |
+
logging.info(f"Created directory for corrupt files: {move_to}")
|
| 1063 |
+
|
| 1064 |
+
if repair and repair_dir and not os.path.exists(repair_dir):
|
| 1065 |
+
os.makedirs(repair_dir)
|
| 1066 |
+
logging.info(f"Created directory for backup files: {repair_dir}")
|
| 1067 |
+
|
| 1068 |
+
# Prepare input arguments for workers
|
| 1069 |
+
input_args = [(file_path, repair, repair_dir, thorough_check, sensitivity, ignore_eof, check_visual, visual_strictness, enable_security_checks) for file_path in image_files]
|
| 1070 |
+
|
| 1071 |
+
# Process files in parallel
|
| 1072 |
+
logging.info("Processing files in parallel...")
|
| 1073 |
+
|
| 1074 |
+
# Create a custom progress bar class that saves progress periodically
|
| 1075 |
+
class ProgressSavingBar(tqdm_auto.tqdm):
|
| 1076 |
+
def update(self, n=1):
|
| 1077 |
+
nonlocal last_progress_save, processed_files
|
| 1078 |
+
result = super().update(n)
|
| 1079 |
+
|
| 1080 |
+
# Save progress periodically
|
| 1081 |
+
current_time = time.time()
|
| 1082 |
+
if save_progress_interval > 0 and current_time - last_progress_save >= save_progress_interval * 60:
|
| 1083 |
+
# Save the progress using the list of files that have actually
|
| 1084 |
+
# completed processing. ``processed_files`` is updated as each
|
| 1085 |
+
# future finishes so we can safely persist it as-is.
|
| 1086 |
+
save_progress(
|
| 1087 |
+
session_id,
|
| 1088 |
+
directory,
|
| 1089 |
+
formats,
|
| 1090 |
+
recursive,
|
| 1091 |
+
processed_files,
|
| 1092 |
+
bad_files,
|
| 1093 |
+
repaired_files,
|
| 1094 |
+
progress_dir,
|
| 1095 |
+
)
|
| 1096 |
+
|
| 1097 |
+
last_progress_save = current_time
|
| 1098 |
+
logging.debug(f"Progress saved at {self.n} / {len(image_files)} files")
|
| 1099 |
+
|
| 1100 |
+
return result
|
| 1101 |
+
|
| 1102 |
+
try:
|
| 1103 |
+
with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
|
| 1104 |
+
# Colorful progress bar with progress saving
|
| 1105 |
+
results = []
|
| 1106 |
+
futures = {executor.submit(process_file, arg): arg[0] for arg in input_args}
|
| 1107 |
+
|
| 1108 |
+
with ProgressSavingBar(
|
| 1109 |
+
total=len(image_files),
|
| 1110 |
+
desc=f"{colorama.Fore.BLUE}Checking image files{colorama.Style.RESET_ALL}",
|
| 1111 |
+
unit="file",
|
| 1112 |
+
bar_format="{desc}: {percentage:3.0f}%|{bar:30}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}]",
|
| 1113 |
+
colour="blue"
|
| 1114 |
+
) as pbar:
|
| 1115 |
+
for future in concurrent.futures.as_completed(futures):
|
| 1116 |
+
file_path = futures[future]
|
| 1117 |
+
try:
|
| 1118 |
+
result = future.result()
|
| 1119 |
+
results.append(result)
|
| 1120 |
+
|
| 1121 |
+
# Track this file as processed for resuming later if needed
|
| 1122 |
+
processed_files.append(file_path)
|
| 1123 |
+
|
| 1124 |
+
# Update progress for successful or failed processing
|
| 1125 |
+
pbar.update(1)
|
| 1126 |
+
|
| 1127 |
+
# Update our tracking of bad/repaired files in real-time for progress saving
|
| 1128 |
+
file_path, is_valid, size, repair_status, repair_msg, dimensions = result
|
| 1129 |
+
if repair_status == "repaired":
|
| 1130 |
+
repaired_files.append(file_path)
|
| 1131 |
+
elif not is_valid:
|
| 1132 |
+
bad_files.append(file_path)
|
| 1133 |
+
|
| 1134 |
+
except Exception as e:
|
| 1135 |
+
logging.error(f"Error processing {file_path}: {str(e)}")
|
| 1136 |
+
pbar.update(1)
|
| 1137 |
+
except KeyboardInterrupt:
|
| 1138 |
+
# If the user interrupts, save progress before exiting
|
| 1139 |
+
logging.warning("Process interrupted by user. Saving progress...")
|
| 1140 |
+
save_progress(session_id, directory, formats, recursive,
|
| 1141 |
+
processed_files, bad_files, repaired_files, progress_dir)
|
| 1142 |
+
logging.info(f"Progress saved. You can resume with --resume {session_id}")
|
| 1143 |
+
raise
|
| 1144 |
+
|
| 1145 |
+
# Process results
|
| 1146 |
+
total_size_saved = 0
|
| 1147 |
+
for file_path, is_valid, size, repair_status, repair_msg, dimensions in results:
|
| 1148 |
+
if repair_status == "repaired":
|
| 1149 |
+
# File was successfully repaired (already added to repaired_files during processing)
|
| 1150 |
+
width, height = dimensions
|
| 1151 |
+
msg = f"Repaired: {file_path} ({width}x{height}) - {repair_msg}"
|
| 1152 |
+
logging.info(msg)
|
| 1153 |
+
elif not is_valid:
|
| 1154 |
+
# File is corrupt and wasn't repaired (or repair failed)
|
| 1155 |
+
# (already added to bad_files during processing)
|
| 1156 |
+
total_size_saved += size
|
| 1157 |
+
|
| 1158 |
+
size_str = humanize.naturalsize(size)
|
| 1159 |
+
if repair_status == "repair_failed":
|
| 1160 |
+
fail_msg = f"Repair failed: {file_path} ({size_str}) - {repair_msg}"
|
| 1161 |
+
logging.warning(fail_msg)
|
| 1162 |
+
|
| 1163 |
+
if dry_run:
|
| 1164 |
+
msg = f"Would delete: {file_path} ({size_str})"
|
| 1165 |
+
logging.info(msg)
|
| 1166 |
+
elif move_to:
|
| 1167 |
+
# Preserve the subdirectory structure by getting the relative path from the search directory
|
| 1168 |
+
try:
|
| 1169 |
+
# Get the relative path from the base directory
|
| 1170 |
+
rel_path = os.path.relpath(file_path, str(directory))
|
| 1171 |
+
# If relpath starts with ".." it means file_path is not within directory
|
| 1172 |
+
# In this case, just use the basename as fallback
|
| 1173 |
+
if rel_path.startswith('..'):
|
| 1174 |
+
rel_path = os.path.basename(file_path)
|
| 1175 |
+
|
| 1176 |
+
# Use safe path joining to prevent path traversal attacks
|
| 1177 |
+
# This ensures files can't be written outside the move_to directory
|
| 1178 |
+
try:
|
| 1179 |
+
dest_path = safe_join_path(move_to, rel_path)
|
| 1180 |
+
except ValueError as ve:
|
| 1181 |
+
logging.error(f"Security error moving {file_path}: {ve}")
|
| 1182 |
+
continue
|
| 1183 |
+
|
| 1184 |
+
# Create parent directories if they don't exist
|
| 1185 |
+
os.makedirs(os.path.dirname(dest_path), exist_ok=True)
|
| 1186 |
+
|
| 1187 |
+
# Use shutil.move instead of os.rename to handle cross-device file movements
|
| 1188 |
+
shutil.move(file_path, dest_path)
|
| 1189 |
+
|
| 1190 |
+
# Add arrow with color
|
| 1191 |
+
arrow = f"{colorama.Fore.CYAN}→{colorama.Style.RESET_ALL}"
|
| 1192 |
+
msg = f"Moved: {file_path} {arrow} {dest_path} ({size_str})"
|
| 1193 |
+
logging.info(msg)
|
| 1194 |
+
except Exception as e:
|
| 1195 |
+
logging.error(f"Failed to move {file_path}: {e}")
|
| 1196 |
+
else:
|
| 1197 |
+
try:
|
| 1198 |
+
os.remove(file_path)
|
| 1199 |
+
msg = f"Deleted: {file_path} ({size_str})"
|
| 1200 |
+
logging.info(msg)
|
| 1201 |
+
except Exception as e:
|
| 1202 |
+
logging.error(f"Failed to delete {file_path}: {e}")
|
| 1203 |
+
|
| 1204 |
+
# Final progress save
|
| 1205 |
+
save_progress(session_id, directory, formats, recursive,
|
| 1206 |
+
processed_files, bad_files, repaired_files, progress_dir)
|
| 1207 |
+
|
| 1208 |
+
elapsed = time.time() - start_time
|
| 1209 |
+
logging.info(f"Processed {len(processed_files)} files in {elapsed:.2f} seconds")
|
| 1210 |
+
logging.info(f"Session ID: {session_id} (use --resume {session_id} to resume if needed)")
|
| 1211 |
+
|
| 1212 |
+
return bad_files, repaired_files, total_size_saved
|
| 1213 |
+
|
| 1214 |
+
def print_banner():
|
| 1215 |
+
"""Print 2PAC-themed ASCII art banner"""
|
| 1216 |
+
banner = r"""
|
| 1217 |
+
░▒▓███████▓▒░░▒▓███████▓▒░ ░▒▓██████▓▒░ ░▒▓██████▓▒░
|
| 1218 |
+
░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░
|
| 1219 |
+
░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
|
| 1220 |
+
░▒▓██████▓▒░░▒▓███████▓▒░░▒▓████████▓▒░▒▓█▓▒░
|
| 1221 |
+
░▒▓█▓▒░ ░▒▓█▓▒░ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
|
| 1222 |
+
░▒▓█▓▒░ ░▒▓█▓▒░ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░
|
| 1223 |
+
░▒▓████████▓▒░▒▓█▓▒░ ░▒▓█▓▒░░▒▓█▓▒░░▒▓██████▓▒░
|
| 1224 |
+
╔═════════════════════════════════════════════════════════╗
|
| 1225 |
+
║ The Picture Analyzer & Corruption killer ║
|
| 1226 |
+
║ In memory of Jeff Young - Bringing people together ║
|
| 1227 |
+
╚═════════════════════════════════════════════════════════╝
|
| 1228 |
+
"""
|
| 1229 |
+
|
| 1230 |
+
# Colored version of the banner, highlighting PAC for Picture Analyzer Corruption
|
| 1231 |
+
if 'colorama' in sys.modules:
|
| 1232 |
+
banner_lines = banner.strip().split('\n')
|
| 1233 |
+
colored_banner = []
|
| 1234 |
+
|
| 1235 |
+
# Color the new gradient ASCII art logo (lines 0-6)
|
| 1236 |
+
for i, line in enumerate(banner_lines):
|
| 1237 |
+
if i < 7: # The ASCII art logo lines for the new gradient style
|
| 1238 |
+
# For "2" part (first column)
|
| 1239 |
+
part1 = line[:11]
|
| 1240 |
+
# For "P" part (second column)
|
| 1241 |
+
part2 = line[11:24]
|
| 1242 |
+
# For "A" part (third column)
|
| 1243 |
+
part3 = line[24:38]
|
| 1244 |
+
# For "C" part (fourth column)
|
| 1245 |
+
part4 = line[38:]
|
| 1246 |
+
|
| 1247 |
+
colored_line = f"{colorama.Fore.WHITE}{part1}" + \
|
| 1248 |
+
f"{colorama.Fore.RED}{part2}" + \
|
| 1249 |
+
f"{colorama.Fore.GREEN}{part3}" + \
|
| 1250 |
+
f"{colorama.Fore.BLUE}{part4}{colorama.Style.RESET_ALL}"
|
| 1251 |
+
|
| 1252 |
+
colored_banner.append(colored_line)
|
| 1253 |
+
elif i >= 7 and i <= 10: # The box and text lines
|
| 1254 |
+
if i == 8: # Title line with PAC highlighted
|
| 1255 |
+
parts = line.split("Picture Analyzer & Corruption")
|
| 1256 |
+
if len(parts) == 2:
|
| 1257 |
+
prefix = parts[0]
|
| 1258 |
+
suffix = parts[1]
|
| 1259 |
+
colored_title = f"{colorama.Fore.YELLOW}{prefix}" + \
|
| 1260 |
+
f"{colorama.Fore.RED}Picture " + \
|
| 1261 |
+
f"{colorama.Fore.GREEN}Analyzer " + \
|
| 1262 |
+
f"{colorama.Fore.WHITE}& " + \
|
| 1263 |
+
f"{colorama.Fore.BLUE}Corruption" + \
|
| 1264 |
+
f"{colorama.Fore.YELLOW}{suffix}{colorama.Style.RESET_ALL}"
|
| 1265 |
+
colored_banner.append(colored_title)
|
| 1266 |
+
else:
|
| 1267 |
+
colored_banner.append(f"{colorama.Fore.YELLOW}{line}{colorama.Style.RESET_ALL}")
|
| 1268 |
+
elif i == 9: # Jeff Young tribute line
|
| 1269 |
+
colored_banner.append(f"{colorama.Fore.CYAN}{line}{colorama.Style.RESET_ALL}")
|
| 1270 |
+
else: # Box border lines
|
| 1271 |
+
colored_banner.append(f"{colorama.Fore.YELLOW}{line}{colorama.Style.RESET_ALL}")
|
| 1272 |
+
else:
|
| 1273 |
+
colored_banner.append(f"{colorama.Fore.WHITE}{line}{colorama.Style.RESET_ALL}")
|
| 1274 |
+
|
| 1275 |
+
print('\n'.join(colored_banner))
|
| 1276 |
+
else:
|
| 1277 |
+
print(banner)
|
| 1278 |
+
print()
|
| 1279 |
+
|
| 1280 |
+
def main():
|
| 1281 |
+
print_banner()
|
| 1282 |
+
|
| 1283 |
+
# Check for 'q' command to quit
|
| 1284 |
+
if len(sys.argv) == 2 and sys.argv[1].lower() == 'q':
|
| 1285 |
+
print(f"{colorama.Fore.YELLOW}Exiting 2PAC. Stay safe!{colorama.Style.RESET_ALL}")
|
| 1286 |
+
sys.exit(0)
|
| 1287 |
+
|
| 1288 |
+
parser = argparse.ArgumentParser(
|
| 1289 |
+
description='2PAC: The Picture Analyzer & Corruption killer',
|
| 1290 |
+
epilog='Created by Richard Young - "All Eyez On Your Images" - https://github.com/ricyoung/2pac'
|
| 1291 |
+
)
|
| 1292 |
+
|
| 1293 |
+
# Main action (mutually exclusive)
|
| 1294 |
+
action_group = parser.add_mutually_exclusive_group()
|
| 1295 |
+
action_group.add_argument('directory', nargs='?', help='Directory to search for image files')
|
| 1296 |
+
action_group.add_argument('--list-sessions', action='store_true', help='List all saved sessions')
|
| 1297 |
+
action_group.add_argument('--check-file', type=str, help='Check a specific file for corruption (useful for testing)')
|
| 1298 |
+
|
| 1299 |
+
# Basic options
|
| 1300 |
+
parser.add_argument('--delete', action='store_true', help='Delete corrupt image files (without this flag, runs in dry-run mode)')
|
| 1301 |
+
parser.add_argument('--move-to', type=str, help='Move corrupt files to this directory instead of deleting them')
|
| 1302 |
+
parser.add_argument('--workers', type=int, default=None, help='Number of worker processes (default: CPU count)')
|
| 1303 |
+
parser.add_argument('--non-recursive', action='store_true', help='Only search in the specified directory, not subdirectories')
|
| 1304 |
+
parser.add_argument('--output', type=str, help='Save list of corrupt files to this file')
|
| 1305 |
+
parser.add_argument('--verbose', '-v', action='store_true', help='Enable verbose logging')
|
| 1306 |
+
parser.add_argument('--no-color', action='store_true', help='Disable colored output')
|
| 1307 |
+
parser.add_argument('--version', action='version', version=f'Bad Image Finder v{VERSION} by Richard Young')
|
| 1308 |
+
|
| 1309 |
+
# Repair options
|
| 1310 |
+
repair_group = parser.add_argument_group('Repair options')
|
| 1311 |
+
repair_group.add_argument('--repair', action='store_true', help='Attempt to repair corrupt image files')
|
| 1312 |
+
repair_group.add_argument('--backup-dir', type=str, help='Directory to store backups of files before repair')
|
| 1313 |
+
repair_group.add_argument('--repair-report', type=str, help='Save list of repaired files to this file')
|
| 1314 |
+
|
| 1315 |
+
# Format options
|
| 1316 |
+
format_group = parser.add_argument_group('Image format options')
|
| 1317 |
+
format_group.add_argument('--formats', type=str, nargs='+', choices=SUPPORTED_FORMATS.keys(),
|
| 1318 |
+
help=f'Image formats to check (default: all formats)')
|
| 1319 |
+
format_group.add_argument('--jpeg', action='store_true', help='Check JPEG files only')
|
| 1320 |
+
format_group.add_argument('--png', action='store_true', help='Check PNG files only')
|
| 1321 |
+
format_group.add_argument('--tiff', action='store_true', help='Check TIFF files only')
|
| 1322 |
+
format_group.add_argument('--gif', action='store_true', help='Check GIF files only')
|
| 1323 |
+
format_group.add_argument('--bmp', action='store_true', help='Check BMP files only')
|
| 1324 |
+
|
| 1325 |
+
# Validation options
|
| 1326 |
+
validation_group = parser.add_argument_group('Validation options')
|
| 1327 |
+
validation_group.add_argument('--thorough', action='store_true',
|
| 1328 |
+
help='Perform thorough image validation (slower but catches more subtle corruption)')
|
| 1329 |
+
validation_group.add_argument('--sensitivity', type=str, choices=['low', 'medium', 'high'], default='medium',
|
| 1330 |
+
help='Set validation sensitivity level: low (basic checks), medium (standard checks), high (most strict)')
|
| 1331 |
+
validation_group.add_argument('--ignore-eof', action='store_true',
|
| 1332 |
+
help='Ignore missing end-of-file markers (useful for truncated but viewable files)')
|
| 1333 |
+
validation_group.add_argument('--check-visual', action='store_true',
|
| 1334 |
+
help='Analyze image content to detect visible corruption like gray/black areas')
|
| 1335 |
+
validation_group.add_argument('--visual-strictness', type=str, choices=['low', 'medium', 'high'], default='medium',
|
| 1336 |
+
help='Set strictness level for visual corruption detection: low (most permissive), medium (balanced), high (only clear corruption)')
|
| 1337 |
+
|
| 1338 |
+
# Security options
|
| 1339 |
+
security_group = parser.add_argument_group('Security options')
|
| 1340 |
+
security_group.add_argument('--security-checks', action='store_true',
|
| 1341 |
+
help='Enable enhanced security validation (file size limits, dimension checks, format verification)')
|
| 1342 |
+
security_group.add_argument('--max-file-size', type=int, default=MAX_FILE_SIZE,
|
| 1343 |
+
help=f'Maximum file size in bytes to process (default: {MAX_FILE_SIZE} = 100MB)')
|
| 1344 |
+
security_group.add_argument('--max-pixels', type=int, default=MAX_IMAGE_PIXELS,
|
| 1345 |
+
help=f'Maximum image dimensions in pixels (default: {MAX_IMAGE_PIXELS} = 50MP)')
|
| 1346 |
+
|
| 1347 |
+
# Progress saving options
|
| 1348 |
+
progress_group = parser.add_argument_group('Progress options')
|
| 1349 |
+
progress_group.add_argument('--save-interval', type=int, default=5,
|
| 1350 |
+
help='Save progress every N minutes (0 to disable progress saving)')
|
| 1351 |
+
progress_group.add_argument('--progress-dir', type=str, default=DEFAULT_PROGRESS_DIR,
|
| 1352 |
+
help='Directory to store progress files')
|
| 1353 |
+
progress_group.add_argument('--resume', type=str, metavar='SESSION_ID',
|
| 1354 |
+
help='Resume from a previously saved session')
|
| 1355 |
+
|
| 1356 |
+
args = parser.parse_args()
|
| 1357 |
+
|
| 1358 |
+
# Setup logging
|
| 1359 |
+
setup_logging(args.verbose, args.no_color)
|
| 1360 |
+
|
| 1361 |
+
# Handle specific file check mode
|
| 1362 |
+
if args.check_file:
|
| 1363 |
+
file_path = args.check_file
|
| 1364 |
+
if not os.path.exists(file_path):
|
| 1365 |
+
logging.error(f"Error: File not found: {file_path}")
|
| 1366 |
+
sys.exit(1)
|
| 1367 |
+
|
| 1368 |
+
print(f"\n{colorama.Style.BRIGHT}Checking file: {file_path}{colorama.Style.RESET_ALL}\n")
|
| 1369 |
+
|
| 1370 |
+
# Basic check
|
| 1371 |
+
print(f"{colorama.Fore.CYAN}Basic validation:{colorama.Style.RESET_ALL}")
|
| 1372 |
+
try:
|
| 1373 |
+
with Image.open(file_path) as img:
|
| 1374 |
+
print(f"✓ File can be opened by PIL")
|
| 1375 |
+
print(f" Format: {img.format}")
|
| 1376 |
+
print(f" Mode: {img.mode}")
|
| 1377 |
+
print(f" Size: {img.size[0]}x{img.size[1]}")
|
| 1378 |
+
|
| 1379 |
+
try:
|
| 1380 |
+
img.verify()
|
| 1381 |
+
print(f"✓ Header verification passed")
|
| 1382 |
+
except Exception as e:
|
| 1383 |
+
print(f"❌ Header verification failed: {str(e)}")
|
| 1384 |
+
|
| 1385 |
+
try:
|
| 1386 |
+
with Image.open(file_path) as img2:
|
| 1387 |
+
img2.load()
|
| 1388 |
+
print(f"✓ Data loading test passed")
|
| 1389 |
+
except Exception as e:
|
| 1390 |
+
print(f"❌ Data loading test failed: {str(e)}")
|
| 1391 |
+
except Exception as e:
|
| 1392 |
+
print(f"❌ Cannot open file with PIL: {str(e)}")
|
| 1393 |
+
|
| 1394 |
+
# Detailed format-specific checks
|
| 1395 |
+
if file_path.lower().endswith(tuple(SUPPORTED_FORMATS['JPEG'])):
|
| 1396 |
+
print(f"\n{colorama.Fore.CYAN}JPEG structure checks:{colorama.Style.RESET_ALL}")
|
| 1397 |
+
is_valid, msg = check_jpeg_structure(file_path)
|
| 1398 |
+
if is_valid:
|
| 1399 |
+
print(f"✓ JPEG structure valid: {msg}")
|
| 1400 |
+
else:
|
| 1401 |
+
print(f"❌ JPEG structure invalid: {msg}")
|
| 1402 |
+
elif file_path.lower().endswith(tuple(SUPPORTED_FORMATS['PNG'])):
|
| 1403 |
+
print(f"\n{colorama.Fore.CYAN}PNG structure checks:{colorama.Style.RESET_ALL}")
|
| 1404 |
+
is_valid, msg = check_png_structure(file_path)
|
| 1405 |
+
if is_valid:
|
| 1406 |
+
print(f"✓ PNG structure valid: {msg}")
|
| 1407 |
+
else:
|
| 1408 |
+
print(f"❌ PNG structure invalid: {msg}")
|
| 1409 |
+
|
| 1410 |
+
# Decode test
|
| 1411 |
+
print(f"\n{colorama.Fore.CYAN}Full decode test:{colorama.Style.RESET_ALL}")
|
| 1412 |
+
is_valid, msg = try_full_decode_check(file_path)
|
| 1413 |
+
if is_valid:
|
| 1414 |
+
print(f"✓ Full decode test passed: {msg}")
|
| 1415 |
+
else:
|
| 1416 |
+
print(f"❌ Full decode test failed: {msg}")
|
| 1417 |
+
|
| 1418 |
+
# External tools check
|
| 1419 |
+
print(f"\n{colorama.Fore.CYAN}External tools check:{colorama.Style.RESET_ALL}")
|
| 1420 |
+
is_valid, msg = try_external_tools(file_path)
|
| 1421 |
+
if is_valid:
|
| 1422 |
+
print(f"✓ External tools: {msg}")
|
| 1423 |
+
else:
|
| 1424 |
+
print(f"❌ External tools: {msg}")
|
| 1425 |
+
|
| 1426 |
+
# Visual corruption check
|
| 1427 |
+
print(f"\n{colorama.Fore.CYAN}Visual content analysis:{colorama.Style.RESET_ALL}")
|
| 1428 |
+
is_visually_corrupt, vis_msg = check_visual_corruption(file_path)
|
| 1429 |
+
if not is_visually_corrupt:
|
| 1430 |
+
print(f"✓ No visual corruption detected: {vis_msg}")
|
| 1431 |
+
else:
|
| 1432 |
+
print(f"❌ {vis_msg}")
|
| 1433 |
+
|
| 1434 |
+
# Final verdict
|
| 1435 |
+
print(f"\n{colorama.Fore.CYAN}Final verdict:{colorama.Style.RESET_ALL}")
|
| 1436 |
+
is_valid_basic = is_valid_image(file_path, thorough=False)
|
| 1437 |
+
is_valid_thorough = is_valid_image(file_path, thorough=True)
|
| 1438 |
+
is_valid_visual = not is_visually_corrupt
|
| 1439 |
+
|
| 1440 |
+
if is_valid_basic and is_valid_thorough and is_valid_visual:
|
| 1441 |
+
print(f"{colorama.Fore.GREEN}This file appears to be valid by all checks.{colorama.Style.RESET_ALL}")
|
| 1442 |
+
elif not is_valid_visual:
|
| 1443 |
+
print(f"{colorama.Fore.RED}This file shows visible corruption in the image content.{colorama.Style.RESET_ALL}")
|
| 1444 |
+
print(f"Recommendation: Use --check-visual to detect this type of corruption.")
|
| 1445 |
+
elif is_valid_basic and not is_valid_thorough:
|
| 1446 |
+
print(f"{colorama.Fore.YELLOW}This file passes basic validation but fails thorough checks.{colorama.Style.RESET_ALL}")
|
| 1447 |
+
print(f"Recommendation: Use --thorough mode to detect this type of corruption.")
|
| 1448 |
+
else:
|
| 1449 |
+
print(f"{colorama.Fore.RED}This file is corrupt and would be detected by the basic scan.{colorama.Style.RESET_ALL}")
|
| 1450 |
+
|
| 1451 |
+
sys.exit(0)
|
| 1452 |
+
|
| 1453 |
+
# Handle session listing mode
|
| 1454 |
+
if args.list_sessions:
|
| 1455 |
+
sessions = list_saved_sessions(args.progress_dir)
|
| 1456 |
+
if sessions:
|
| 1457 |
+
print(f"\n{colorama.Style.BRIGHT}Saved Sessions:{colorama.Style.RESET_ALL}")
|
| 1458 |
+
for i, session in enumerate(sessions):
|
| 1459 |
+
ts = datetime.fromisoformat(session['timestamp']).strftime('%Y-%m-%d %H:%M:%S')
|
| 1460 |
+
print(f"\n{colorama.Fore.CYAN}Session ID: {session['id']}{colorama.Style.RESET_ALL}")
|
| 1461 |
+
print(f" Created: {ts}")
|
| 1462 |
+
print(f" Directory: {session['directory']}")
|
| 1463 |
+
print(f" Formats: {', '.join(session['formats'])}")
|
| 1464 |
+
print(f" Progress: {session['processed_count']} files processed, "
|
| 1465 |
+
f"{session['bad_count']} corrupt, {session['repaired_count']} repaired")
|
| 1466 |
+
|
| 1467 |
+
# Show resume command
|
| 1468 |
+
resume_cmd = f"find_bad_images.py --resume {session['id']}"
|
| 1469 |
+
if os.path.exists(session['directory']):
|
| 1470 |
+
print(f" {colorama.Fore.GREEN}Resume command: {resume_cmd}{colorama.Style.RESET_ALL}")
|
| 1471 |
+
else:
|
| 1472 |
+
print(f" {colorama.Fore.YELLOW}Directory no longer exists, cannot resume{colorama.Style.RESET_ALL}")
|
| 1473 |
+
else:
|
| 1474 |
+
print("No saved sessions found.")
|
| 1475 |
+
sys.exit(0)
|
| 1476 |
+
|
| 1477 |
+
# Check if directory is specified for a new scan
|
| 1478 |
+
if not args.directory and not args.resume:
|
| 1479 |
+
logging.error("Error: You must specify a directory to scan or use --resume to continue a session")
|
| 1480 |
+
sys.exit(1)
|
| 1481 |
+
|
| 1482 |
+
# If we're resuming without a directory, load from previous session
|
| 1483 |
+
directory = None
|
| 1484 |
+
if args.resume and not args.directory:
|
| 1485 |
+
progress = load_progress(args.resume, args.progress_dir)
|
| 1486 |
+
if progress:
|
| 1487 |
+
directory = Path(progress['directory'])
|
| 1488 |
+
logging.info(f"Using directory from saved session: {directory}")
|
| 1489 |
+
else:
|
| 1490 |
+
logging.error(f"Could not load session {args.resume}")
|
| 1491 |
+
sys.exit(1)
|
| 1492 |
+
elif args.directory:
|
| 1493 |
+
directory = Path(args.directory)
|
| 1494 |
+
|
| 1495 |
+
# Verify the directory exists
|
| 1496 |
+
if not directory.exists() or not directory.is_dir():
|
| 1497 |
+
logging.error(f"Error: {directory} is not a valid directory")
|
| 1498 |
+
sys.exit(1)
|
| 1499 |
+
|
| 1500 |
+
# Check for incompatible options
|
| 1501 |
+
if args.delete and args.move_to:
|
| 1502 |
+
logging.error("Error: Cannot use both --delete and --move-to options")
|
| 1503 |
+
sys.exit(1)
|
| 1504 |
+
|
| 1505 |
+
# Determine which formats to check
|
| 1506 |
+
formats = []
|
| 1507 |
+
if args.formats:
|
| 1508 |
+
formats = args.formats
|
| 1509 |
+
elif args.jpeg:
|
| 1510 |
+
formats.append('JPEG')
|
| 1511 |
+
elif args.png:
|
| 1512 |
+
formats.append('PNG')
|
| 1513 |
+
elif args.tiff:
|
| 1514 |
+
formats.append('TIFF')
|
| 1515 |
+
elif args.gif:
|
| 1516 |
+
formats.append('GIF')
|
| 1517 |
+
elif args.bmp:
|
| 1518 |
+
formats.append('BMP')
|
| 1519 |
+
else:
|
| 1520 |
+
# Default: check all formats
|
| 1521 |
+
formats = DEFAULT_FORMATS
|
| 1522 |
+
|
| 1523 |
+
dry_run = not (args.delete or args.move_to)
|
| 1524 |
+
|
| 1525 |
+
# Colorful mode indicators
|
| 1526 |
+
if args.repair:
|
| 1527 |
+
mode_str = f"{colorama.Fore.MAGENTA}REPAIR MODE{colorama.Style.RESET_ALL}: Attempting to fix corrupt files"
|
| 1528 |
+
logging.info(mode_str)
|
| 1529 |
+
|
| 1530 |
+
repairable_formats = [fmt for fmt in formats if fmt in REPAIRABLE_FORMATS]
|
| 1531 |
+
if repairable_formats:
|
| 1532 |
+
logging.info(f"Repairable formats: {', '.join(repairable_formats)}")
|
| 1533 |
+
else:
|
| 1534 |
+
logging.warning("None of the selected formats support repair")
|
| 1535 |
+
|
| 1536 |
+
if dry_run:
|
| 1537 |
+
mode_str = f"{colorama.Fore.YELLOW}DRY RUN MODE{colorama.Style.RESET_ALL}: No files will be deleted or moved"
|
| 1538 |
+
logging.info(mode_str)
|
| 1539 |
+
elif args.move_to:
|
| 1540 |
+
mode_str = f"{colorama.Fore.BLUE}MOVE MODE{colorama.Style.RESET_ALL}: Corrupt files will be moved to {args.move_to}"
|
| 1541 |
+
logging.info(mode_str)
|
| 1542 |
+
else:
|
| 1543 |
+
mode_str = f"{colorama.Fore.RED}DELETE MODE{colorama.Style.RESET_ALL}: Corrupt files will be permanently deleted"
|
| 1544 |
+
logging.info(mode_str)
|
| 1545 |
+
|
| 1546 |
+
# Add progress saving info
|
| 1547 |
+
if args.save_interval > 0:
|
| 1548 |
+
save_interval_str = f"{colorama.Fore.CYAN}PROGRESS SAVING{colorama.Style.RESET_ALL}: Every {args.save_interval} minutes"
|
| 1549 |
+
logging.info(save_interval_str)
|
| 1550 |
+
else:
|
| 1551 |
+
logging.info("Progress saving is disabled")
|
| 1552 |
+
|
| 1553 |
+
if args.resume:
|
| 1554 |
+
resume_str = f"{colorama.Fore.CYAN}RESUMING{colorama.Style.RESET_ALL}: From session {args.resume}"
|
| 1555 |
+
logging.info(resume_str)
|
| 1556 |
+
|
| 1557 |
+
if args.thorough:
|
| 1558 |
+
thorough_str = f"{colorama.Fore.MAGENTA}THOROUGH MODE{colorama.Style.RESET_ALL}: Using deep validation checks (slower but more accurate)"
|
| 1559 |
+
logging.info(thorough_str)
|
| 1560 |
+
|
| 1561 |
+
# Show sensitivity level
|
| 1562 |
+
sensitivity_colors = {
|
| 1563 |
+
'low': colorama.Fore.GREEN,
|
| 1564 |
+
'medium': colorama.Fore.YELLOW,
|
| 1565 |
+
'high': colorama.Fore.RED
|
| 1566 |
+
}
|
| 1567 |
+
sensitivity_color = sensitivity_colors.get(args.sensitivity, colorama.Fore.YELLOW)
|
| 1568 |
+
sensitivity_str = f"{sensitivity_color}SENSITIVITY: {args.sensitivity.upper()}{colorama.Style.RESET_ALL}"
|
| 1569 |
+
logging.info(sensitivity_str)
|
| 1570 |
+
|
| 1571 |
+
# Show EOF handling
|
| 1572 |
+
if args.ignore_eof:
|
| 1573 |
+
eof_str = f"{colorama.Fore.CYAN}IGNORING EOF MARKERS{colorama.Style.RESET_ALL}: Allowing truncated but viewable files"
|
| 1574 |
+
logging.info(eof_str)
|
| 1575 |
+
|
| 1576 |
+
# Show visual corruption checking status
|
| 1577 |
+
if args.check_visual:
|
| 1578 |
+
strictness_color = {
|
| 1579 |
+
'low': colorama.Fore.GREEN,
|
| 1580 |
+
'medium': colorama.Fore.YELLOW,
|
| 1581 |
+
'high': colorama.Fore.RED
|
| 1582 |
+
}.get(args.visual_strictness, colorama.Fore.YELLOW)
|
| 1583 |
+
|
| 1584 |
+
visual_str = f"{colorama.Fore.MAGENTA}VISUAL CHECK{colorama.Style.RESET_ALL}: " + \
|
| 1585 |
+
f"Analyzing image content (strictness: {strictness_color}{args.visual_strictness.upper()}{colorama.Style.RESET_ALL})"
|
| 1586 |
+
logging.info(visual_str)
|
| 1587 |
+
|
| 1588 |
+
# Show security checks status
|
| 1589 |
+
if args.security_checks:
|
| 1590 |
+
security_str = f"{colorama.Fore.RED}SECURITY CHECKS ENABLED{colorama.Style.RESET_ALL}: " + \
|
| 1591 |
+
f"Validating file sizes (max {humanize.naturalsize(MAX_FILE_SIZE)}), " + \
|
| 1592 |
+
f"dimensions (max {MAX_IMAGE_PIXELS:,} pixels), and format integrity"
|
| 1593 |
+
logging.info(security_str)
|
| 1594 |
+
|
| 1595 |
+
# Show which formats we're checking
|
| 1596 |
+
format_list = ", ".join(formats)
|
| 1597 |
+
logging.info(f"Checking image formats: {format_list}")
|
| 1598 |
+
logging.info(f"Searching for corrupt image files in {directory}")
|
| 1599 |
+
|
| 1600 |
+
try:
|
| 1601 |
+
bad_files, repaired_files, total_size_saved = process_images(
|
| 1602 |
+
directory,
|
| 1603 |
+
formats,
|
| 1604 |
+
dry_run=dry_run,
|
| 1605 |
+
repair=args.repair,
|
| 1606 |
+
max_workers=args.workers,
|
| 1607 |
+
recursive=not args.non_recursive,
|
| 1608 |
+
move_to=args.move_to,
|
| 1609 |
+
repair_dir=args.backup_dir,
|
| 1610 |
+
save_progress_interval=args.save_interval,
|
| 1611 |
+
resume_session=args.resume,
|
| 1612 |
+
progress_dir=args.progress_dir,
|
| 1613 |
+
thorough_check=args.thorough,
|
| 1614 |
+
sensitivity=args.sensitivity,
|
| 1615 |
+
ignore_eof=args.ignore_eof,
|
| 1616 |
+
check_visual=args.check_visual,
|
| 1617 |
+
visual_strictness=args.visual_strictness,
|
| 1618 |
+
enable_security_checks=args.security_checks
|
| 1619 |
+
)
|
| 1620 |
+
|
| 1621 |
+
# Colorful summary
|
| 1622 |
+
count_color = colorama.Fore.RED if bad_files else colorama.Fore.GREEN
|
| 1623 |
+
file_count = f"{count_color}{len(bad_files)}{colorama.Style.RESET_ALL}"
|
| 1624 |
+
logging.info(f"Found {file_count} corrupt image files")
|
| 1625 |
+
|
| 1626 |
+
if args.repair:
|
| 1627 |
+
repair_color = colorama.Fore.GREEN if repaired_files else colorama.Fore.YELLOW
|
| 1628 |
+
repair_count = f"{repair_color}{len(repaired_files)}{colorama.Style.RESET_ALL}"
|
| 1629 |
+
logging.info(f"Successfully repaired {repair_count} files")
|
| 1630 |
+
|
| 1631 |
+
if args.repair_report and repaired_files:
|
| 1632 |
+
with open(args.repair_report, 'w') as f:
|
| 1633 |
+
for file_path in repaired_files:
|
| 1634 |
+
f.write(f"{file_path}\n")
|
| 1635 |
+
logging.info(f"Saved list of repaired files to {args.repair_report}")
|
| 1636 |
+
|
| 1637 |
+
savings_str = humanize.naturalsize(total_size_saved)
|
| 1638 |
+
savings_color = colorama.Fore.GREEN if total_size_saved > 0 else colorama.Fore.RESET
|
| 1639 |
+
savings_msg = f"Total space savings: {savings_color}{savings_str}{colorama.Style.RESET_ALL}"
|
| 1640 |
+
logging.info(savings_msg)
|
| 1641 |
+
|
| 1642 |
+
if not args.no_color:
|
| 1643 |
+
# Add signature at the end of the run
|
| 1644 |
+
signature = f"\n{colorama.Fore.CYAN}2PAC v{VERSION} by Richard Young{colorama.Style.RESET_ALL}"
|
| 1645 |
+
quote = f"{colorama.Fore.YELLOW}\"{random.choice(QUOTES)}\"{colorama.Style.RESET_ALL}"
|
| 1646 |
+
print(signature)
|
| 1647 |
+
print(quote)
|
| 1648 |
+
|
| 1649 |
+
# Save list of corrupt files if requested
|
| 1650 |
+
if args.output and bad_files:
|
| 1651 |
+
with open(args.output, 'w') as f:
|
| 1652 |
+
for file_path in bad_files:
|
| 1653 |
+
f.write(f"{file_path}\n")
|
| 1654 |
+
logging.info(f"Saved list of corrupt files to {args.output}")
|
| 1655 |
+
|
| 1656 |
+
if bad_files and dry_run:
|
| 1657 |
+
logging.info("Run with --delete to remove these files or --move-to to relocate them")
|
| 1658 |
+
|
| 1659 |
+
except KeyboardInterrupt:
|
| 1660 |
+
logging.info("Operation cancelled by user")
|
| 1661 |
+
sys.exit(130)
|
| 1662 |
+
except Exception as e:
|
| 1663 |
+
logging.error(f"Error: {str(e)}")
|
| 1664 |
+
if args.verbose:
|
| 1665 |
+
import traceback
|
| 1666 |
+
traceback.print_exc()
|
| 1667 |
+
sys.exit(1)
|
| 1668 |
+
|
| 1669 |
+
if __name__ == "__main__":
|
| 1670 |
+
main()
|
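For readers who want to drive the same scan from Python rather than the command line, the sketch below mirrors the `process_images()` call made in `main()` above. It is a minimal sketch only: the import path `find_bad_images` and the assumption that the omitted keyword arguments have defaults are not confirmed by the diff, and the directory path is a placeholder.

```python
# Hedged programmatic sketch of the scan entry point used by main() above.
from pathlib import Path

from find_bad_images import process_images  # assumed import path

bad_files, repaired_files, total_size_saved = process_images(
    Path("/photos"),          # placeholder directory
    ["JPEG", "PNG"],          # formats to check
    dry_run=True,             # report only: nothing is deleted or moved
    repair=False,
    recursive=True,
    thorough_check=False,
    sensitivity="medium",
)
print(f"{len(bad_files)} corrupt, {len(repaired_files)} repaired, "
      f"{total_size_saved} bytes reclaimable")
```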
rat_finder.py
ADDED
|
@@ -0,0 +1,1223 @@
|
| 1 |
+
#!/usr/bin/env python3
|
| 2 |
+
"""
|
| 3 |
+
RAT Finder - Beta steganography detection tool for 2PAC
|
| 4 |
+
|
| 5 |
+
This tool is designed to detect potential steganography in images.
|
| 6 |
+
It's part of the 2PAC toolkit but focused on security aspects.
|
| 7 |
+
|
| 8 |
+
Author: Richard Young
|
| 9 |
+
License: MIT
|
| 10 |
+
"""
|
| 11 |
+
|
| 12 |
+
import os
|
| 13 |
+
import sys
|
| 14 |
+
import argparse
|
| 15 |
+
import concurrent.futures
|
| 16 |
+
import logging
|
| 17 |
+
import tempfile
|
| 18 |
+
import numpy as np
|
| 19 |
+
from pathlib import Path
|
| 20 |
+
from PIL import Image
|
| 21 |
+
import matplotlib.pyplot as plt
|
| 22 |
+
from scipy import stats
|
| 23 |
+
import colorama
|
| 24 |
+
from tqdm import tqdm
|
| 25 |
+
|
| 26 |
+
# Initialize colorama
|
| 27 |
+
colorama.init()
|
| 28 |
+
|
| 29 |
+
# Version
|
| 30 |
+
VERSION = "0.2.0"
|
| 31 |
+
|
| 32 |
+
# Set up logging
|
| 33 |
+
def setup_logging(verbose, no_color=False):
|
| 34 |
+
level = logging.DEBUG if verbose else logging.INFO
|
| 35 |
+
|
| 36 |
+
# Define color codes
|
| 37 |
+
if not no_color:
|
| 38 |
+
# Color scheme
|
| 39 |
+
COLORS = {
|
| 40 |
+
'DEBUG': colorama.Fore.CYAN,
|
| 41 |
+
'INFO': colorama.Fore.GREEN,
|
| 42 |
+
'WARNING': colorama.Fore.YELLOW,
|
| 43 |
+
'ERROR': colorama.Fore.RED,
|
| 44 |
+
'CRITICAL': colorama.Fore.MAGENTA + colorama.Style.BRIGHT,
|
| 45 |
+
'RESET': colorama.Style.RESET_ALL
|
| 46 |
+
}
|
| 47 |
+
|
| 48 |
+
# Custom formatter with colors
|
| 49 |
+
class ColoredFormatter(logging.Formatter):
|
| 50 |
+
def format(self, record):
|
| 51 |
+
levelname = record.levelname
|
| 52 |
+
if levelname in COLORS:
|
| 53 |
+
record.levelname = f"{COLORS[levelname]}{levelname}{COLORS['RESET']}"
|
| 54 |
+
record.msg = f"{COLORS[levelname]}{record.msg}{COLORS['RESET']}"
|
| 55 |
+
return super().format(record)
|
| 56 |
+
|
| 57 |
+
formatter = ColoredFormatter('%(asctime)s - %(levelname)s - %(message)s')
|
| 58 |
+
else:
|
| 59 |
+
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
|
| 60 |
+
|
| 61 |
+
handler = logging.StreamHandler()
|
| 62 |
+
handler.setFormatter(formatter)
|
| 63 |
+
|
| 64 |
+
logging.basicConfig(
|
| 65 |
+
level=level,
|
| 66 |
+
handlers=[handler]
|
| 67 |
+
)
|
| 68 |
+
|
| 69 |
+
def print_banner():
|
| 70 |
+
"""Print RAT Finder themed ASCII art banner"""
|
| 71 |
+
banner = r"""
|
| 72 |
+
██████╗ █████╗ ████████╗ ███████╗██╗███╗ ██╗██████╗ ███████╗██████╗
|
| 73 |
+
██╔══██╗██╔══██╗╚══██╔══╝ ██╔════╝██║████╗ ██║██╔══██╗██╔════╝██╔══██╗
|
| 74 |
+
██████╔╝███████║ ██║█████╗█████╗ ██║██╔██╗ ██║██║ ██║█████╗ ██████╔╝
|
| 75 |
+
██╔══██╗██╔══██║ ██║╚════╝██╔══╝ ██║██║╚██╗██║██║ ██║██╔══╝ ██╔══██╗
|
| 76 |
+
██║ ██║██║ ██║ ██║ ██║ ██║██║ ╚████║██████╔╝███████╗██║ ██║
|
| 77 |
+
╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═══╝╚═════╝ ╚══════╝╚═╝ ╚═╝
|
| 78 |
+
╔═══════════════════════════════════════════════════════════════════════╗
|
| 79 |
+
║ Steganography Detection Tool (v0.2.0) - Part of the 2PAC toolkit ║
|
| 80 |
+
║ "What the eyes see and the ears hear, the mind believes" ║
|
| 81 |
+
╚═══════════════════════════════════════════════════════════════════════╝
|
| 82 |
+
"""
|
| 83 |
+
|
| 84 |
+
if 'colorama' in sys.modules:
|
| 85 |
+
banner_lines = banner.strip().split('\n')
|
| 86 |
+
colored_banner = []
|
| 87 |
+
|
| 88 |
+
# Color the RAT part in red, the FINDER part in blue
|
| 89 |
+
for i, line in enumerate(banner_lines):
|
| 90 |
+
if i < 6: # The logo lines
|
| 91 |
+
# Add the RAT part in red
|
| 92 |
+
part1 = line[:24]
|
| 93 |
+
# Add the FINDER part in blue
|
| 94 |
+
part2 = line[24:]
|
| 95 |
+
colored_line = f"{colorama.Fore.RED}{part1}{colorama.Fore.BLUE}{part2}{colorama.Style.RESET_ALL}"
|
| 96 |
+
colored_banner.append(colored_line)
|
| 97 |
+
elif i >= 6 and i <= 9: # The box with text
|
| 98 |
+
colored_banner.append(f"{colorama.Fore.YELLOW}{line}{colorama.Style.RESET_ALL}")
|
| 99 |
+
else:
|
| 100 |
+
colored_banner.append(f"{colorama.Fore.WHITE}{line}{colorama.Style.RESET_ALL}")
|
| 101 |
+
|
| 102 |
+
print('\n'.join(colored_banner))
|
| 103 |
+
else:
|
| 104 |
+
print(banner)
|
| 105 |
+
print()
|
| 106 |
+
|
| 107 |
+
#------------------------------------------------------------------------------
|
| 108 |
+
# STEGANOGRAPHY DETECTION TECHNIQUES
|
| 109 |
+
#------------------------------------------------------------------------------
|
| 110 |
+
|
| 111 |
+
def perform_ela_analysis(image_path, quality=75):
|
| 112 |
+
"""
|
| 113 |
+
Performs Error Level Analysis (ELA) to detect manipulated areas in an image.
|
| 114 |
+
|
| 115 |
+
ELA works by intentionally resaving an image at a known quality level and
|
| 116 |
+
analyzing the differences between the original and resaved versions.
|
| 117 |
+
Areas that have been manipulated often show up as having different error levels.
|
| 118 |
+
|
| 119 |
+
Args:
|
| 120 |
+
image_path: Path to the image
|
| 121 |
+
quality: JPEG quality level to use for recompression (default: 75)
|
| 122 |
+
|
| 123 |
+
Returns:
|
| 124 |
+
(is_suspicious, confidence, details)
|
| 125 |
+
"""
|
| 126 |
+
try:
|
| 127 |
+
# Only perform ELA on JPEG images
|
| 128 |
+
if not image_path.lower().endswith(('.jpg', '.jpeg', '.jfif')):
|
| 129 |
+
return False, 0, {"error": "ELA is only effective for JPEG images"}
|
| 130 |
+
|
| 131 |
+
with Image.open(image_path) as original_img:
|
| 132 |
+
# Convert to RGB if needed
|
| 133 |
+
if original_img.mode != 'RGB':
|
| 134 |
+
original_img = original_img.convert('RGB')
|
| 135 |
+
|
| 136 |
+
# Create a temporary file for the resaved image
|
| 137 |
+
temp_file = tempfile.NamedTemporaryFile(suffix='.jpg', delete=True)
|
| 138 |
+
resaved_path = temp_file.name
|
| 139 |
+
|
| 140 |
+
# Save the image with the specified quality
|
| 141 |
+
original_img.save(resaved_path, quality=quality)
|
| 142 |
+
|
| 143 |
+
# Read the resaved image
|
| 144 |
+
with Image.open(resaved_path) as resaved_img:
|
| 145 |
+
# Convert both to numpy arrays
|
| 146 |
+
original_array = np.array(original_img).astype('int32')
|
| 147 |
+
resaved_array = np.array(resaved_img).astype('int32')
|
| 148 |
+
|
| 149 |
+
# Calculate absolute difference
|
| 150 |
+
diff = np.abs(original_array - resaved_array)
|
| 151 |
+
|
| 152 |
+
# Calculate statistics from the difference
|
| 153 |
+
mean_diff = np.mean(diff)
|
| 154 |
+
std_diff = np.std(diff)
|
| 155 |
+
max_diff = np.max(diff)
|
| 156 |
+
|
| 157 |
+
# Scale the differences to make them more visible (for visualization)
|
| 158 |
+
diff_scaled = diff * 10
|
| 159 |
+
|
| 160 |
+
# Look for suspicious patterns
|
| 161 |
+
# 1. High variance in error levels can indicate manipulation
|
| 162 |
+
# 2. Localized areas with significantly different error levels are suspicious
|
| 163 |
+
# 3. Unnaturally low error in complex areas can indicate steganography
|
| 164 |
+
|
| 165 |
+
# Calculate local variation using sliding window approach
|
| 166 |
+
# We're looking for areas where the difference between neighboring pixels
|
| 167 |
+
# has unusually high or low variance
|
| 168 |
+
|
| 169 |
+
# Use a simple method - check variance in blocks
|
| 170 |
+
block_size = 8 # 8x8 blocks, common in JPEG
|
| 171 |
+
shape = diff.shape
|
| 172 |
+
block_variance = []
|
| 173 |
+
|
| 174 |
+
# Sample blocks throughout the image
|
| 175 |
+
for i in range(0, shape[0] - block_size, block_size):
|
| 176 |
+
for j in range(0, shape[1] - block_size, block_size):
|
| 177 |
+
# Extract block for each channel
|
| 178 |
+
for c in range(3): # RGB channels
|
| 179 |
+
block = diff[i:i+block_size, j:j+block_size, c]
|
| 180 |
+
block_var = np.var(block)
|
| 181 |
+
if block_var > 0: # Skip zero-variance blocks (avoids division by zero below)
|
| 182 |
+
block_variance.append(block_var)
|
| 183 |
+
|
| 184 |
+
if not block_variance:
|
| 185 |
+
return False, 0, {"error": "Could not calculate block variance"}
|
| 186 |
+
|
| 187 |
+
# Calculate statistics on block variances
|
| 188 |
+
mean_block_var = np.mean(block_variance)
|
| 189 |
+
max_block_var = np.max(block_variance)
|
| 190 |
+
std_block_var = np.std(block_variance)
|
| 191 |
+
|
| 192 |
+
# What we're looking for:
|
| 193 |
+
# 1. Unusually high block variance in some areas (significantly above the mean)
|
| 194 |
+
# 2. Unusually consistent error levels (too perfect - could indicate manipulation)
|
| 195 |
+
|
| 196 |
+
# Determine suspiciousness based on these factors
|
| 197 |
+
# Calculate a normalized ratio of max variance to mean variance
|
| 198 |
+
if mean_block_var > 0:
|
| 199 |
+
var_ratio = max_block_var / mean_block_var
|
| 200 |
+
else:
|
| 201 |
+
var_ratio = 0
|
| 202 |
+
|
| 203 |
+
# Calculate coefficient of variation for block variances
|
| 204 |
+
if mean_block_var > 0:
|
| 205 |
+
coeff_var = std_block_var / mean_block_var
|
| 206 |
+
else:
|
| 207 |
+
coeff_var = 0
|
| 208 |
+
|
| 209 |
+
# Heuristics based on ELA characteristics
|
| 210 |
+
# Unusually high variation ratio can indicate manipulation
|
| 211 |
+
is_suspicious_var_ratio = var_ratio > 50
|
| 212 |
+
|
| 213 |
+
# High coefficient of variation indicates inconsistent error levels
|
| 214 |
+
is_suspicious_coeff_var = coeff_var > 2.0
|
| 215 |
+
|
| 216 |
+
# Unusually high mean difference can indicate manipulation
|
| 217 |
+
is_suspicious_mean_diff = mean_diff > 15
|
| 218 |
+
|
| 219 |
+
# Combine factors
|
| 220 |
+
is_suspicious = (is_suspicious_var_ratio or
|
| 221 |
+
is_suspicious_coeff_var or
|
| 222 |
+
is_suspicious_mean_diff)
|
| 223 |
+
|
| 224 |
+
# Calculate confidence based on these factors
|
| 225 |
+
confidence = 0
|
| 226 |
+
if is_suspicious_var_ratio:
|
| 227 |
+
# Scale based on how extreme the ratio is
|
| 228 |
+
confidence += min(40, var_ratio / 2)
|
| 229 |
+
if is_suspicious_coeff_var:
|
| 230 |
+
# Scale based on coefficient of variation
|
| 231 |
+
confidence += min(30, coeff_var * 10)
|
| 232 |
+
if is_suspicious_mean_diff:
|
| 233 |
+
# Scale based on mean difference
|
| 234 |
+
confidence += min(30, mean_diff)
|
| 235 |
+
|
| 236 |
+
# Cap confidence at 90%
|
| 237 |
+
confidence = min(confidence, 90)
|
| 238 |
+
|
| 239 |
+
# Save results for return
|
| 240 |
+
details = {
|
| 241 |
+
"mean_diff": float(mean_diff),
|
| 242 |
+
"max_diff": float(max_diff),
|
| 243 |
+
"var_ratio": float(var_ratio),
|
| 244 |
+
"coeff_var": float(coeff_var),
|
| 245 |
+
"diff_image": diff_scaled.astype(np.uint8), # For visualization
|
| 246 |
+
"quality_used": quality
|
| 247 |
+
}
|
| 248 |
+
|
| 249 |
+
return is_suspicious, confidence, details
|
| 250 |
+
|
| 251 |
+
except Exception as e:
|
| 252 |
+
logging.debug(f"Error performing ELA on {image_path}: {str(e)}")
|
| 253 |
+
return False, 0, {"error": str(e)}
|
| 254 |
+
|
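A minimal usage sketch of the ELA check above. The function name, arguments and returned keys ("var_ratio", "coeff_var", "mean_diff") are as defined in this file; `photo.jpg` is a placeholder path.

```python
# Call the ELA detector on a single JPEG and read back its metrics.
is_suspicious, confidence, info = perform_ela_analysis("photo.jpg", quality=75)
if "error" in info:
    print("ELA skipped:", info["error"])
elif is_suspicious:
    print(f"ELA flags possible manipulation ({confidence:.0f}% confidence)")
    print("variance ratio:", round(info["var_ratio"], 2))
else:
    print("ELA found nothing unusual")
```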
| 255 |
+
def check_lsb_anomalies(image_path, threshold=0.03):
|
| 256 |
+
"""
|
| 257 |
+
Detect potential LSB steganography by analyzing bit plane patterns.
|
| 258 |
+
|
| 259 |
+
Args:
|
| 260 |
+
image_path: Path to the image
|
| 261 |
+
threshold: Threshold for statistical anomaly detection
|
| 262 |
+
|
| 263 |
+
Returns:
|
| 264 |
+
(is_suspicious, confidence, details)
|
| 265 |
+
"""
|
| 266 |
+
try:
|
| 267 |
+
with Image.open(image_path) as img:
|
| 268 |
+
# Convert to RGB
|
| 269 |
+
if img.mode != 'RGB':
|
| 270 |
+
img = img.convert('RGB')
|
| 271 |
+
|
| 272 |
+
# Get image data as numpy array
|
| 273 |
+
img_array = np.array(img)
|
| 274 |
+
|
| 275 |
+
# Extract least significant bits from each channel
|
| 276 |
+
red_lsb = img_array[:,:,0] % 2
|
| 277 |
+
green_lsb = img_array[:,:,1] % 2
|
| 278 |
+
blue_lsb = img_array[:,:,2] % 2
|
| 279 |
+
|
| 280 |
+
# Calculate statistics
|
| 281 |
+
# Chi-square test to detect non-random patterns in LSB
|
| 282 |
+
red_chi = stats.chisquare(np.bincount(red_lsb.flatten()))[1]
|
| 283 |
+
green_chi = stats.chisquare(np.bincount(green_lsb.flatten()))[1]
|
| 284 |
+
blue_chi = stats.chisquare(np.bincount(blue_lsb.flatten()))[1]
|
| 285 |
+
|
| 286 |
+
# Calculate entropy of the LSB plane
|
| 287 |
+
red_entropy = stats.entropy(np.bincount(red_lsb.flatten()), base=2)
|
| 288 |
+
green_entropy = stats.entropy(np.bincount(green_lsb.flatten()), base=2)
|
| 289 |
+
blue_entropy = stats.entropy(np.bincount(blue_lsb.flatten()), base=2)
|
| 290 |
+
|
| 291 |
+
# Suspicious if chi-square test shows non-random distribution
|
| 292 |
+
# or if the mean base-2 entropy of the LSB planes deviates from 1.0 (a balanced plane) by more than 0.1
|
| 293 |
+
chi_suspicious = min(red_chi, green_chi, blue_chi) < threshold
|
| 294 |
+
entropy_suspicious = abs(np.mean([red_entropy, green_entropy, blue_entropy]) - 1.0) > 0.1
|
| 295 |
+
|
| 296 |
+
# Calculate a confidence score (0-100%)
|
| 297 |
+
confidence = 0
|
| 298 |
+
if chi_suspicious:
|
| 299 |
+
confidence += 50
|
| 300 |
+
if entropy_suspicious:
|
| 301 |
+
confidence += 30
|
| 302 |
+
|
| 303 |
+
# Additional checks for common LSB steganography patterns
|
| 304 |
+
# Check for abnormal color distributions
|
| 305 |
+
color_distribution = np.std([np.std(red_lsb), np.std(green_lsb), np.std(blue_lsb)])
|
| 306 |
+
if color_distribution < 0.1: # Suspicious if too uniform
|
| 307 |
+
confidence += 20
|
| 308 |
+
|
| 309 |
+
is_suspicious = confidence > 50
|
| 310 |
+
|
| 311 |
+
details = {
|
| 312 |
+
"chi_square_values": [red_chi, green_chi, blue_chi],
|
| 313 |
+
"entropy_values": [red_entropy, green_entropy, blue_entropy],
|
| 314 |
+
"color_distribution": color_distribution
|
| 315 |
+
}
|
| 316 |
+
|
| 317 |
+
return is_suspicious, confidence, details
|
| 318 |
+
except Exception as e:
|
| 319 |
+
logging.debug(f"Error analyzing LSB in {image_path}: {str(e)}")
|
| 320 |
+
return False, 0, {"error": str(e)}
|
| 321 |
+
|
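To see the chi-square statistic in isolation, the sketch below applies the same `scipy.stats.chisquare` call to two synthetic LSB planes. The plane sizes and bias are made up for illustration; the detector above treats very small p-values as suspicious.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# A slightly biased plane (roughly what untouched photos tend to show) and a
# near-uniform plane (what a random embedded payload tends to produce).
biased_plane = (rng.random((64, 64)) > 0.55).astype(np.uint8)
uniform_plane = rng.integers(0, 2, size=(64, 64), dtype=np.uint8)

for name, plane in [("biased", biased_plane), ("uniform", uniform_plane)]:
    p_value = stats.chisquare(np.bincount(plane.flatten()))[1]
    print(f"{name}: chi-square p-value = {p_value:.4f}")
```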
| 322 |
+
def check_file_size_anomalies(image_path):
|
| 323 |
+
"""
|
| 324 |
+
Check if file size is suspicious compared to image dimensions.
|
| 325 |
+
|
| 326 |
+
Args:
|
| 327 |
+
image_path: Path to the image
|
| 328 |
+
|
| 329 |
+
Returns:
|
| 330 |
+
(is_suspicious, confidence, details)
|
| 331 |
+
"""
|
| 332 |
+
try:
|
| 333 |
+
# Get file size
|
| 334 |
+
file_size = os.path.getsize(image_path)
|
| 335 |
+
|
| 336 |
+
with Image.open(image_path) as img:
|
| 337 |
+
width, height = img.size
|
| 338 |
+
pixel_count = width * height
|
| 339 |
+
|
| 340 |
+
# Calculate expected file size range based on image type
|
| 341 |
+
expected_size = 0
|
| 342 |
+
if image_path.lower().endswith('.png'):
|
| 343 |
+
# PNG files have variable compression but generally follow a pattern
|
| 344 |
+
# This is a very rough estimate
|
| 345 |
+
expected_min = pixel_count * 0.1 # Minimum expected size
|
| 346 |
+
expected_max = pixel_count * 3 # Maximum expected size
|
| 347 |
+
elif image_path.lower().endswith(('.jpg', '.jpeg')):
|
| 348 |
+
# JPEG files are typically smaller due to compression
|
| 349 |
+
expected_min = pixel_count * 0.05 # Minimum for very compressed JPEG
|
| 350 |
+
expected_max = pixel_count * 1.5 # Maximum for high quality JPEG
|
| 351 |
+
else:
|
| 352 |
+
# For other formats, use a more generic range
|
| 353 |
+
expected_min = pixel_count * 0.1
|
| 354 |
+
expected_max = pixel_count * 4
|
| 355 |
+
|
| 356 |
+
# Check if file size is within expected range
|
| 357 |
+
is_too_small = file_size < expected_min
|
| 358 |
+
is_too_large = file_size > expected_max
|
| 359 |
+
is_suspicious = is_too_small or is_too_large
|
| 360 |
+
|
| 361 |
+
# Calculate confidence
|
| 362 |
+
confidence = 0
|
| 363 |
+
if is_too_large:
|
| 364 |
+
# More likely to contain hidden data if too large
|
| 365 |
+
ratio = file_size / expected_max
|
| 366 |
+
confidence = min(int((ratio - 1) * 100), 90) # Cap at 90%
|
| 367 |
+
elif is_too_small:
|
| 368 |
+
# Less likely but still suspicious if too small
|
| 369 |
+
ratio = expected_min / file_size
|
| 370 |
+
confidence = min(int((ratio - 1) * 50), 70) # Cap at 70%
|
| 371 |
+
|
| 372 |
+
details = {
|
| 373 |
+
"file_size": file_size,
|
| 374 |
+
"expected_min": expected_min,
|
| 375 |
+
"expected_max": expected_max,
|
| 376 |
+
"pixel_count": pixel_count,
|
| 377 |
+
"width": width,
|
| 378 |
+
"height": height
|
| 379 |
+
}
|
| 380 |
+
|
| 381 |
+
return is_suspicious, confidence, details
|
| 382 |
+
except Exception as e:
|
| 383 |
+
logging.debug(f"Error analyzing file size in {image_path}: {str(e)}")
|
| 384 |
+
return False, 0, {"error": str(e)}
|
| 385 |
+
|
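As a worked example of the heuristic above, a hypothetical 1920x1080 JPEG is considered plausible anywhere between roughly 100 KiB and 3 MiB; sizes outside that band raise the confidence score.

```python
# Worked example of the JPEG size bounds used above (values are illustrative).
width, height = 1920, 1080
pixel_count = width * height          # 2,073,600 pixels
expected_min = pixel_count * 0.05     # 103,680 bytes, ~101 KiB
expected_max = pixel_count * 1.5      # 3,110,400 bytes, ~3.0 MiB
print(expected_min, expected_max)
```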
| 386 |
+
def check_histogram_anomalies(image_path):
|
| 387 |
+
"""
|
| 388 |
+
Analyze image histogram for unusual patterns that might indicate steganography.
|
| 389 |
+
|
| 390 |
+
Args:
|
| 391 |
+
image_path: Path to the image
|
| 392 |
+
|
| 393 |
+
Returns:
|
| 394 |
+
(is_suspicious, confidence, details)
|
| 395 |
+
"""
|
| 396 |
+
try:
|
| 397 |
+
with Image.open(image_path) as img:
|
| 398 |
+
# Convert to RGB
|
| 399 |
+
if img.mode != 'RGB':
|
| 400 |
+
img = img.convert('RGB')
|
| 401 |
+
|
| 402 |
+
# Get image data as numpy array
|
| 403 |
+
img_array = np.array(img)
|
| 404 |
+
|
| 405 |
+
# Calculate histograms for each color channel
|
| 406 |
+
hist_r = np.histogram(img_array[:,:,0], bins=256, range=(0, 256))[0]
|
| 407 |
+
hist_g = np.histogram(img_array[:,:,1], bins=256, range=(0, 256))[0]
|
| 408 |
+
hist_b = np.histogram(img_array[:,:,2], bins=256, range=(0, 256))[0]
|
| 409 |
+
|
| 410 |
+
# Normalize histograms
|
| 411 |
+
pixel_count = img_array.shape[0] * img_array.shape[1]
|
| 412 |
+
hist_r = hist_r / pixel_count
|
| 413 |
+
hist_g = hist_g / pixel_count
|
| 414 |
+
hist_b = hist_b / pixel_count
|
| 415 |
+
|
| 416 |
+
# Analyze histogram characteristics
|
| 417 |
+
# 1. Check for comb patterns (alternating peaks/valleys) which can indicate LSB steganography
|
| 418 |
+
comb_pattern_r = np.sum(np.abs(np.diff(np.diff(hist_r))))
|
| 419 |
+
comb_pattern_g = np.sum(np.abs(np.diff(np.diff(hist_g))))
|
| 420 |
+
comb_pattern_b = np.sum(np.abs(np.diff(np.diff(hist_b))))
|
| 421 |
+
|
| 422 |
+
# 2. Check for unusual peaks at specific values
|
| 423 |
+
# LSB steganography often causes unusual spikes at even or odd values
|
| 424 |
+
even_odd_ratio_r = np.sum(hist_r[::2]) / np.sum(hist_r[1::2]) if np.sum(hist_r[1::2]) > 0 else 1
|
| 425 |
+
even_odd_ratio_g = np.sum(hist_g[::2]) / np.sum(hist_g[1::2]) if np.sum(hist_g[1::2]) > 0 else 1
|
| 426 |
+
even_odd_ratio_b = np.sum(hist_b[::2]) / np.sum(hist_b[1::2]) if np.sum(hist_b[1::2]) > 0 else 1
|
| 427 |
+
|
| 428 |
+
# Calculate an evenness score - how far from 1.0 (perfect balance) are we?
|
| 429 |
+
even_odd_deviation = max(
|
| 430 |
+
abs(even_odd_ratio_r - 1.0),
|
| 431 |
+
abs(even_odd_ratio_g - 1.0),
|
| 432 |
+
abs(even_odd_ratio_b - 1.0)
|
| 433 |
+
)
|
| 434 |
+
|
| 435 |
+
# 3. Calculate histogram smoothness (natural images tend to have smoother histograms)
|
| 436 |
+
smoothness_r = np.mean(np.abs(np.diff(hist_r)))
|
| 437 |
+
smoothness_g = np.mean(np.abs(np.diff(hist_g)))
|
| 438 |
+
smoothness_b = np.mean(np.abs(np.diff(hist_b)))
|
| 439 |
+
|
| 440 |
+
# Suspicious if large even/odd ratio deviation or high comb pattern values
|
| 441 |
+
is_suspicious_comb = max(comb_pattern_r, comb_pattern_g, comb_pattern_b) > 0.015
|
| 442 |
+
is_suspicious_even_odd = even_odd_deviation > 0.1
|
| 443 |
+
is_suspicious_smoothness = max(smoothness_r, smoothness_g, smoothness_b) > 0.01
|
| 444 |
+
|
| 445 |
+
is_suspicious = is_suspicious_comb or is_suspicious_even_odd or is_suspicious_smoothness
|
| 446 |
+
|
| 447 |
+
# Calculate confidence
|
| 448 |
+
confidence = 0
|
| 449 |
+
if is_suspicious_comb:
|
| 450 |
+
confidence += 30
|
| 451 |
+
if is_suspicious_even_odd:
|
| 452 |
+
confidence += 40
|
| 453 |
+
if is_suspicious_smoothness:
|
| 454 |
+
confidence += 20
|
| 455 |
+
|
| 456 |
+
# Cap confidence at 90%
|
| 457 |
+
confidence = min(confidence, 90)
|
| 458 |
+
|
| 459 |
+
details = {
|
| 460 |
+
"comb_pattern_values": [comb_pattern_r, comb_pattern_g, comb_pattern_b],
|
| 461 |
+
"even_odd_ratios": [even_odd_ratio_r, even_odd_ratio_g, even_odd_ratio_b],
|
| 462 |
+
"smoothness_values": [smoothness_r, smoothness_g, smoothness_b],
|
| 463 |
+
"even_odd_deviation": even_odd_deviation
|
| 464 |
+
}
|
| 465 |
+
|
| 466 |
+
return is_suspicious, confidence, details
|
| 467 |
+
except Exception as e:
|
| 468 |
+
logging.debug(f"Error analyzing histogram in {image_path}: {str(e)}")
|
| 469 |
+
return False, 0, {"error": str(e)}
|
| 470 |
+
|
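The even/odd ratio at the heart of this check can be reproduced on a single synthetic channel as follows; the Gaussian test data is made up for illustration and should land close to the balanced value of 1.0.

```python
import numpy as np

rng = np.random.default_rng(1)
channel = np.clip(rng.normal(128, 40, 100_000), 0, 255).astype(np.uint8)

hist = np.histogram(channel, bins=256, range=(0, 256))[0] / channel.size
even_odd_ratio = hist[::2].sum() / hist[1::2].sum()
print(f"even/odd ratio: {even_odd_ratio:.3f}")   # near 1.0 for smooth data
```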
| 471 |
+
def check_metadata_anomalies(image_path):
|
| 472 |
+
"""
|
| 473 |
+
Look for unusual metadata or metadata inconsistencies that could indicate steganography.
|
| 474 |
+
|
| 475 |
+
Args:
|
| 476 |
+
image_path: Path to the image
|
| 477 |
+
|
| 478 |
+
Returns:
|
| 479 |
+
(is_suspicious, confidence, details)
|
| 480 |
+
"""
|
| 481 |
+
try:
|
| 482 |
+
with Image.open(image_path) as img:
|
| 483 |
+
# Extract metadata (EXIF, etc)
|
| 484 |
+
metadata = {}
|
| 485 |
+
if hasattr(img, '_getexif') and img._getexif() is not None:
|
| 486 |
+
metadata = {k: v for k, v in img._getexif().items()}
|
| 487 |
+
|
| 488 |
+
# Check for known steganography software markers
|
| 489 |
+
steganography_markers = [
|
| 490 |
+
'outguess', 'stegano', 'steghide', 'jsteg', 'f5', 'secret',
|
| 491 |
+
'hidden', 'conceal', 'invisible', 'steganography'
|
| 492 |
+
]
|
| 493 |
+
|
| 494 |
+
found_markers = []
|
| 495 |
+
for key, value in metadata.items():
|
| 496 |
+
if isinstance(value, str):
|
| 497 |
+
value_lower = value.lower()
|
| 498 |
+
for marker in steganography_markers:
|
| 499 |
+
if marker in value_lower:
|
| 500 |
+
found_markers.append((key, marker, value))
|
| 501 |
+
|
| 502 |
+
# Check for unusual metadata structure
|
| 503 |
+
is_suspicious = len(found_markers) > 0
|
| 504 |
+
confidence = min(len(found_markers) * 30, 90) if is_suspicious else 0
|
| 505 |
+
|
| 506 |
+
# Check for metadata size anomalies
|
| 507 |
+
if len(metadata) > 30: # Unusually large metadata
|
| 508 |
+
is_suspicious = True
|
| 509 |
+
confidence = max(confidence, 50)
|
| 510 |
+
|
| 511 |
+
details = {
|
| 512 |
+
"metadata_count": len(metadata),
|
| 513 |
+
"suspicious_markers": found_markers
|
| 514 |
+
}
|
| 515 |
+
|
| 516 |
+
return is_suspicious, confidence, details
|
| 517 |
+
except Exception as e:
|
| 518 |
+
logging.debug(f"Error analyzing metadata in {image_path}: {str(e)}")
|
| 519 |
+
return False, 0, {"error": str(e)}
|
| 520 |
+
|
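A stand-alone sketch of the marker scan above. `photo.jpg` is a placeholder, and `_getexif()` is the same private Pillow helper the function relies on; the marker list is a shortened version of the one defined above.

```python
from PIL import Image

MARKERS = ("steghide", "outguess", "secret", "hidden")

with Image.open("photo.jpg") as img:
    exif = img._getexif() if hasattr(img, "_getexif") else None
exif = exif or {}

hits = [(tag, value) for tag, value in exif.items()
        if isinstance(value, str) and any(m in value.lower() for m in MARKERS)]
print(hits or "no suspicious EXIF strings")
```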
| 521 |
+
def check_trailing_data(image_path):
|
| 522 |
+
"""Detect suspicious data appended after the official end markers."""
|
| 523 |
+
try:
|
| 524 |
+
with open(image_path, 'rb') as f:
|
| 525 |
+
data = f.read()
|
| 526 |
+
|
| 527 |
+
appended_bytes = 0
|
| 528 |
+
lower = image_path.lower()
|
| 529 |
+
|
| 530 |
+
if lower.endswith(('.jpg', '.jpeg', '.jfif')):
|
| 531 |
+
marker = data.rfind(b'\xFF\xD9')
|
| 532 |
+
if marker != -1 and marker < len(data) - 2:
|
| 533 |
+
appended_bytes = len(data) - marker - 2
|
| 534 |
+
elif lower.endswith('.png'):
|
| 535 |
+
marker = data.rfind(b'\x00\x00\x00\x00IEND\xAEB\x60\x82')
|
| 536 |
+
if marker != -1 and marker < len(data) - 12:
|
| 537 |
+
appended_bytes = len(data) - marker - 12
|
| 538 |
+
else:
|
| 539 |
+
return False, 0, {"error": "unsupported format"}
|
| 540 |
+
|
| 541 |
+
is_suspicious = appended_bytes > 0
|
| 542 |
+
confidence = 0
|
| 543 |
+
if is_suspicious:
|
| 544 |
+
ratio = appended_bytes / len(data)
|
| 545 |
+
confidence = min(95, 50 + int(ratio * 500))
|
| 546 |
+
|
| 547 |
+
details = {
|
| 548 |
+
"appended_bytes": appended_bytes
|
| 549 |
+
}
|
| 550 |
+
|
| 551 |
+
return is_suspicious, confidence, details
|
| 552 |
+
except Exception as e:
|
| 553 |
+
logging.debug(f"Error analyzing trailing data in {image_path}: {str(e)}")
|
| 554 |
+
return False, 0, {"error": str(e)}
|
| 555 |
+
|
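The same appended-data check can be run on raw bytes outside the tool; the JPEG end-of-image marker is `FF D9`, exactly as used above (`photo.jpg` is a placeholder path).

```python
with open("photo.jpg", "rb") as f:
    data = f.read()

eoi = data.rfind(b"\xFF\xD9")                 # last End-Of-Image marker
appended = len(data) - eoi - 2 if eoi != -1 else 0
print(f"{appended} bytes after EOI" if appended else "no trailing data")
```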
| 556 |
+
def check_visual_noise_anomalies(image_path):
|
| 557 |
+
"""
|
| 558 |
+
Analyze visual noise patterns to detect potential steganography.
|
| 559 |
+
|
| 560 |
+
Args:
|
| 561 |
+
image_path: Path to the image
|
| 562 |
+
|
| 563 |
+
Returns:
|
| 564 |
+
(is_suspicious, confidence, details)
|
| 565 |
+
"""
|
| 566 |
+
try:
|
| 567 |
+
with Image.open(image_path) as img:
|
| 568 |
+
# Convert to RGB
|
| 569 |
+
if img.mode != 'RGB':
|
| 570 |
+
img = img.convert('RGB')
|
| 571 |
+
|
| 572 |
+
# Resize if image is too large for faster processing
|
| 573 |
+
width, height = img.size
|
| 574 |
+
if width > 1000 or height > 1000:
|
| 575 |
+
ratio = min(1000 / width, 1000 / height)
|
| 576 |
+
new_width = int(width * ratio)
|
| 577 |
+
new_height = int(height * ratio)
|
| 578 |
+
img = img.resize((new_width, new_height))
|
| 579 |
+
|
| 580 |
+
# Get image data as numpy array
|
| 581 |
+
img_array = np.array(img)
|
| 582 |
+
|
| 583 |
+
# Apply noise detection
|
| 584 |
+
# Calculate noise in each channel by looking at differences between adjacent pixels
|
| 585 |
+
red_noise = np.mean(np.abs(np.diff(img_array[:,:,0], axis=0))) + np.mean(np.abs(np.diff(img_array[:,:,0], axis=1)))
|
| 586 |
+
green_noise = np.mean(np.abs(np.diff(img_array[:,:,1], axis=0))) + np.mean(np.abs(np.diff(img_array[:,:,1], axis=1)))
|
| 587 |
+
blue_noise = np.mean(np.abs(np.diff(img_array[:,:,2], axis=0))) + np.mean(np.abs(np.diff(img_array[:,:,2], axis=1)))
|
| 588 |
+
|
| 589 |
+
# Calculate noise ratio between channels
|
| 590 |
+
# In natural images, noise should be roughly similar across channels
|
| 591 |
+
# Large differences might indicate steganographic content
|
| 592 |
+
avg_noise = (red_noise + green_noise + blue_noise) / 3
|
| 593 |
+
noise_diffs = [abs(red_noise - avg_noise), abs(green_noise - avg_noise), abs(blue_noise - avg_noise)]
|
| 594 |
+
max_diff_ratio = max(noise_diffs) / avg_noise if avg_noise > 0 else 0
|
| 595 |
+
|
| 596 |
+
# Suspicious if significant differences between channels
|
| 597 |
+
is_suspicious = max_diff_ratio > 0.2
|
| 598 |
+
confidence = min(int(max_diff_ratio * 100), 90) if is_suspicious else 0
|
| 599 |
+
|
| 600 |
+
details = {
|
| 601 |
+
"red_noise": red_noise,
|
| 602 |
+
"green_noise": green_noise,
|
| 603 |
+
"blue_noise": blue_noise,
|
| 604 |
+
"max_diff_ratio": max_diff_ratio
|
| 605 |
+
}
|
| 606 |
+
|
| 607 |
+
return is_suspicious, confidence, details
|
| 608 |
+
except Exception as e:
|
| 609 |
+
logging.debug(f"Error analyzing visual noise in {image_path}: {str(e)}")
|
| 610 |
+
return False, 0, {"error": str(e)}
|
| 611 |
+
|
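The per-channel noise estimate boils down to mean absolute differences between neighbouring pixels. The sketch below shows it on random data; as a precaution it casts to a signed type first so the subtraction cannot wrap around, which the illustration (not the function above) assumes is acceptable.

```python
import numpy as np

rng = np.random.default_rng(2)
img_array = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)

red = img_array[:, :, 0].astype(np.int16)     # signed copy avoids uint8 wraparound
red_noise = (np.mean(np.abs(np.diff(red, axis=0)))
             + np.mean(np.abs(np.diff(red, axis=1))))
print(f"red-channel noise: {red_noise:.2f}")
```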
| 612 |
+
def analyze_image(image_path, sensitivity='medium'):
|
| 613 |
+
"""
|
| 614 |
+
Perform comprehensive steganography detection on an image.
|
| 615 |
+
|
| 616 |
+
Args:
|
| 617 |
+
image_path: Path to the image
|
| 618 |
+
sensitivity: 'low', 'medium', or 'high'
|
| 619 |
+
|
| 620 |
+
Returns:
|
| 621 |
+
(is_suspicious, overall_confidence, detection_details)
|
| 622 |
+
"""
|
| 623 |
+
# Set threshold based on sensitivity
|
| 624 |
+
thresholds = {
|
| 625 |
+
'low': 0.01, # More likely to find steganography but more false positives
|
| 626 |
+
'medium': 0.03, # Balanced detection
|
| 627 |
+
'high': 0.05 # Fewer false positives but might miss some steganography
|
| 628 |
+
}
|
| 629 |
+
|
| 630 |
+
confidence_required = {
|
| 631 |
+
'low': 60, # Lower bar for detection
|
| 632 |
+
'medium': 70, # Moderate confidence required
|
| 633 |
+
'high': 80 # High confidence required to report
|
| 634 |
+
}
|
| 635 |
+
|
| 636 |
+
threshold = thresholds.get(sensitivity, 0.03)
|
| 637 |
+
min_confidence = confidence_required.get(sensitivity, 70)
|
| 638 |
+
|
| 639 |
+
try:
|
| 640 |
+
results = {}
|
| 641 |
+
|
| 642 |
+
# Run all detection methods
|
| 643 |
+
lsb_result = check_lsb_anomalies(image_path, threshold)
|
| 644 |
+
results['lsb_analysis'] = {
|
| 645 |
+
'suspicious': lsb_result[0],
|
| 646 |
+
'confidence': lsb_result[1],
|
| 647 |
+
'details': lsb_result[2]
|
| 648 |
+
}
|
| 649 |
+
|
| 650 |
+
size_result = check_file_size_anomalies(image_path)
|
| 651 |
+
results['file_size_analysis'] = {
|
| 652 |
+
'suspicious': size_result[0],
|
| 653 |
+
'confidence': size_result[1],
|
| 654 |
+
'details': size_result[2]
|
| 655 |
+
}
|
| 656 |
+
|
| 657 |
+
metadata_result = check_metadata_anomalies(image_path)
|
| 658 |
+
results['metadata_analysis'] = {
|
| 659 |
+
'suspicious': metadata_result[0],
|
| 660 |
+
'confidence': metadata_result[1],
|
| 661 |
+
'details': metadata_result[2]
|
| 662 |
+
}
|
| 663 |
+
|
| 664 |
+
trailing_result = check_trailing_data(image_path)
|
| 665 |
+
results['trailing_data_analysis'] = {
|
| 666 |
+
'suspicious': trailing_result[0],
|
| 667 |
+
'confidence': trailing_result[1],
|
| 668 |
+
'details': trailing_result[2]
|
| 669 |
+
}
|
| 670 |
+
|
| 671 |
+
noise_result = check_visual_noise_anomalies(image_path)
|
| 672 |
+
results['visual_noise_analysis'] = {
|
| 673 |
+
'suspicious': noise_result[0],
|
| 674 |
+
'confidence': noise_result[1],
|
| 675 |
+
'details': noise_result[2]
|
| 676 |
+
}
|
| 677 |
+
|
| 678 |
+
# Add the new histogram analysis
|
| 679 |
+
histogram_result = check_histogram_anomalies(image_path)
|
| 680 |
+
results['histogram_analysis'] = {
|
| 681 |
+
'suspicious': histogram_result[0],
|
| 682 |
+
'confidence': histogram_result[1],
|
| 683 |
+
'details': histogram_result[2]
|
| 684 |
+
}
|
| 685 |
+
|
| 686 |
+
# Add Error Level Analysis (ELA) for JPEG images
|
| 687 |
+
if image_path.lower().endswith(('.jpg', '.jpeg', '.jfif')):
|
| 688 |
+
ela_result = perform_ela_analysis(image_path)
|
| 689 |
+
results['ela_analysis'] = {
|
| 690 |
+
'suspicious': ela_result[0],
|
| 691 |
+
'confidence': ela_result[1],
|
| 692 |
+
'details': ela_result[2]
|
| 693 |
+
}
|
| 694 |
+
|
| 695 |
+
# Calculate overall confidence
|
| 696 |
+
# Weight the different tests
|
| 697 |
+
weights = {
|
| 698 |
+
'lsb_analysis': 0.25, # LSB is a common technique
|
| 699 |
+
'histogram_analysis': 0.20, # Histogram patterns are strong indicators
|
| 700 |
+
'file_size_analysis': 0.10, # Size can be indicative
|
| 701 |
+
'metadata_analysis': 0.10, # Metadata less common but useful indicator
|
| 702 |
+
'trailing_data_analysis': 0.10, # Detects data after EOF markers
|
| 703 |
+
'visual_noise_analysis': 0.15, # Visual noise can be a good indicator
|
| 704 |
+
'ela_analysis': 0.20 # Error Level Analysis is effective for JPEG manipulation
|
| 705 |
+
}
|
| 706 |
+
|
| 707 |
+
# Only include weights for methods that were actually run
|
| 708 |
+
used_weights = {k: v for k, v in weights.items() if k in results}
|
| 709 |
+
|
| 710 |
+
# Normalize the weights to ensure they sum to 1.0
|
| 711 |
+
weight_sum = sum(used_weights.values())
|
| 712 |
+
if weight_sum > 0:
|
| 713 |
+
used_weights = {k: v/weight_sum for k, v in used_weights.items()}
|
| 714 |
+
|
| 715 |
+
# Calculate weighted confidence
|
| 716 |
+
overall_confidence = sum(
|
| 717 |
+
results[key]['confidence'] * used_weights[key] for key in used_weights
|
| 718 |
+
)
|
| 719 |
+
|
| 720 |
+
# Determine if image is suspicious overall
|
| 721 |
+
is_suspicious = overall_confidence >= min_confidence
|
| 722 |
+
|
| 723 |
+
return is_suspicious, overall_confidence, results
|
| 724 |
+
except Exception as e:
|
| 725 |
+
logging.debug(f"Error analyzing {image_path}: {str(e)}")
|
| 726 |
+
return False, 0, {"error": str(e)}
|
| 727 |
+
|
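To make the weighting step concrete, here is the same normalise-and-combine arithmetic on made-up per-method confidences. Only three methods "ran" in this example, so their weights are rescaled to sum to 1.0 before the weighted average is taken.

```python
confidences = {"lsb_analysis": 80, "histogram_analysis": 40, "file_size_analysis": 10}
weights = {"lsb_analysis": 0.25, "histogram_analysis": 0.20, "file_size_analysis": 0.10}

used = {k: w for k, w in weights.items() if k in confidences}
total = sum(used.values())                       # 0.55
used = {k: w / total for k, w in used.items()}   # renormalise to sum to 1.0

overall = sum(confidences[k] * used[k] for k in used)
print(f"{overall:.1f}")                          # 52.7
```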
| 728 |
+
def process_file(args):
|
| 729 |
+
"""Process a single image file."""
|
| 730 |
+
image_path, sensitivity, output_dir = args
|
| 731 |
+
|
| 732 |
+
try:
|
| 733 |
+
is_suspicious, confidence, details = analyze_image(image_path, sensitivity)
|
| 734 |
+
|
| 735 |
+
result = {
|
| 736 |
+
'path': image_path,
|
| 737 |
+
'suspicious': is_suspicious,
|
| 738 |
+
'confidence': confidence,
|
| 739 |
+
'details': details
|
| 740 |
+
}
|
| 741 |
+
|
| 742 |
+
# Create visual report if output directory is specified
|
| 743 |
+
if output_dir and is_suspicious:
|
| 744 |
+
create_visual_report(image_path, confidence, details, output_dir)
|
| 745 |
+
|
| 746 |
+
return result
|
| 747 |
+
except Exception as e:
|
| 748 |
+
logging.debug(f"Error processing {image_path}: {str(e)}")
|
| 749 |
+
return {
|
| 750 |
+
'path': image_path,
|
| 751 |
+
'suspicious': False,
|
| 752 |
+
'confidence': 0,
|
| 753 |
+
'details': {'error': str(e)}
|
| 754 |
+
}
|
| 755 |
+
|
| 756 |
+
def create_visual_report(image_path, confidence, details, output_dir):
|
| 757 |
+
"""
|
| 758 |
+
Create a visual report showing the analysis of a suspicious image.
|
| 759 |
+
|
| 760 |
+
Args:
|
| 761 |
+
image_path: Path to the analyzed image
|
| 762 |
+
confidence: Detection confidence
|
| 763 |
+
details: Analysis details
|
| 764 |
+
output_dir: Directory to save report
|
| 765 |
+
"""
|
| 766 |
+
try:
|
| 767 |
+
# Create output directory if it doesn't exist
|
| 768 |
+
os.makedirs(output_dir, exist_ok=True)
|
| 769 |
+
|
| 770 |
+
# Create a figure with 3x3 subplots to accommodate ELA visualization
|
| 771 |
+
fig, axs = plt.subplots(3, 3, figsize=(15, 15))
|
| 772 |
+
fig.suptitle(f"Steganography Analysis: {os.path.basename(image_path)}\nConfidence: {confidence:.1f}%", fontsize=16)
|
| 773 |
+
|
| 774 |
+
# Original image
|
| 775 |
+
with Image.open(image_path) as img:
|
| 776 |
+
axs[0, 0].imshow(img)
|
| 777 |
+
axs[0, 0].set_title("Original Image")
|
| 778 |
+
axs[0, 0].axis('off')
|
| 779 |
+
|
| 780 |
+
# LSB visualization
|
| 781 |
+
img_array = np.array(img.convert('RGB'))
|
| 782 |
+
lsb_img = np.zeros_like(img_array)
|
| 783 |
+
|
| 784 |
+
# Amplify LSB data by 255 for visibility
|
| 785 |
+
lsb_img[:,:,0] = (img_array[:,:,0] % 2) * 255
|
| 786 |
+
lsb_img[:,:,1] = (img_array[:,:,1] % 2) * 255
|
| 787 |
+
lsb_img[:,:,2] = (img_array[:,:,2] % 2) * 255
|
| 788 |
+
|
| 789 |
+
axs[0, 1].imshow(lsb_img)
|
| 790 |
+
axs[0, 1].set_title("LSB Visualization")
|
| 791 |
+
axs[0, 1].axis('off')
|
| 792 |
+
|
| 793 |
+
# ELA visualization (NEW)
|
| 794 |
+
if 'ela_analysis' in details and 'details' in details['ela_analysis']:
|
| 795 |
+
ela_data = details['ela_analysis']['details']
|
| 796 |
+
if 'diff_image' in ela_data and not isinstance(ela_data.get('error', ''), str):
|
| 797 |
+
# Display the ELA image
|
| 798 |
+
axs[0, 2].imshow(ela_data['diff_image'])
|
| 799 |
+
axs[0, 2].set_title("Error Level Analysis (ELA)")
|
| 800 |
+
axs[0, 2].axis('off')
|
| 801 |
+
|
| 802 |
+
# Add annotation with key metrics
|
| 803 |
+
metrics = []
|
| 804 |
+
if 'var_ratio' in ela_data:
|
| 805 |
+
metrics.append(f"Variance ratio: {ela_data['var_ratio']:.2f}")
|
| 806 |
+
if 'coeff_var' in ela_data:
|
| 807 |
+
metrics.append(f"Coefficient of var: {ela_data['coeff_var']:.2f}")
|
| 808 |
+
if 'mean_diff' in ela_data:
|
| 809 |
+
metrics.append(f"Mean diff: {ela_data['mean_diff']:.2f}")
|
| 810 |
+
|
| 811 |
+
if metrics:
|
| 812 |
+
axs[0, 2].text(0.05, 0.05, "\n".join(metrics), transform=axs[0, 2].transAxes,
|
| 813 |
+
fontsize=9, verticalalignment='bottom',
|
| 814 |
+
bbox=dict(boxstyle='round,pad=0.5',
|
| 815 |
+
facecolor='white', alpha=0.7))
|
| 816 |
+
else:
|
| 817 |
+
axs[0, 2].text(0.5, 0.5, "ELA data not available",
|
| 818 |
+
horizontalalignment='center', verticalalignment='center')
|
| 819 |
+
axs[0, 2].axis('off')
|
| 820 |
+
else:
|
| 821 |
+
axs[0, 2].text(0.5, 0.5, "ELA analysis not available",
|
| 822 |
+
horizontalalignment='center', verticalalignment='center')
|
| 823 |
+
axs[0, 2].axis('off')
|
| 824 |
+
|
| 825 |
+
# Histogram visualization
|
| 826 |
+
if 'histogram_analysis' in details:
|
| 827 |
+
# Create histograms for each channel
|
| 828 |
+
hist_r = np.histogram(img_array[:,:,0], bins=256, range=(0, 256))[0]
|
| 829 |
+
hist_g = np.histogram(img_array[:,:,1], bins=256, range=(0, 256))[0]
|
| 830 |
+
hist_b = np.histogram(img_array[:,:,2], bins=256, range=(0, 256))[0]
|
| 831 |
+
|
| 832 |
+
# Plot the histograms
|
| 833 |
+
bin_edges = np.arange(0, 257)
|
| 834 |
+
axs[1, 0].plot(bin_edges[:-1], hist_r, color='red', alpha=0.7)
|
| 835 |
+
axs[1, 0].plot(bin_edges[:-1], hist_g, color='green', alpha=0.7)
|
| 836 |
+
axs[1, 0].plot(bin_edges[:-1], hist_b, color='blue', alpha=0.7)
|
| 837 |
+
axs[1, 0].set_title("Color Channel Histograms")
|
| 838 |
+
axs[1, 0].set_xlabel("Pixel Value")
|
| 839 |
+
axs[1, 0].set_ylabel("Frequency")
|
| 840 |
+
axs[1, 0].legend(['Red', 'Green', 'Blue'])
|
| 841 |
+
|
| 842 |
+
# Show odd/even distribution analysis
|
| 843 |
+
histogram_data = details['histogram_analysis']['details']
|
| 844 |
+
|
| 845 |
+
# Get even/odd ratio values
|
| 846 |
+
if 'even_odd_ratios' in histogram_data:
|
| 847 |
+
even_odd_ratios = histogram_data['even_odd_ratios']
|
| 848 |
+
|
| 849 |
+
# Plot as bar chart
|
| 850 |
+
axs[1, 1].bar(['Red', 'Green', 'Blue'], even_odd_ratios,
|
| 851 |
+
color=['red', 'green', 'blue'], alpha=0.7)
|
| 852 |
+
axs[1, 1].axhline(y=1.0, linestyle='--', color='gray')
|
| 853 |
+
axs[1, 1].set_title("Even/Odd Value Ratios")
|
| 854 |
+
axs[1, 1].set_ylabel("Ratio (1.0 = balanced)")
|
| 855 |
+
|
| 856 |
+
# Annotate with explanatory text
|
| 857 |
+
deviation = histogram_data.get('even_odd_deviation', 0)
|
| 858 |
+
assessment = "SUSPICIOUS" if deviation > 0.1 else "NORMAL"
|
| 859 |
+
axs[1, 1].annotate(f"Deviation: {deviation:.3f}\nAssessment: {assessment}",
|
| 860 |
+
xy=(0.05, 0.05), xycoords='axes fraction')
|
| 861 |
+
else:
|
| 862 |
+
axs[1, 1].text(0.5, 0.5, "Histogram ratio data not available",
|
| 863 |
+
horizontalalignment='center', verticalalignment='center')
|
| 864 |
+
axs[1, 1].axis('off')
|
| 865 |
+
else:
|
| 866 |
+
axs[1, 0].text(0.5, 0.5, "Histogram analysis not available",
|
| 867 |
+
horizontalalignment='center', verticalalignment='center')
|
| 868 |
+
axs[1, 0].axis('off')
|
| 869 |
+
axs[1, 1].axis('off')
|
| 870 |
+
|
| 871 |
+
# Noise visualization
|
| 872 |
+
if 'visual_noise_analysis' in details:
|
| 873 |
+
noise_data = details['visual_noise_analysis']['details']
|
| 874 |
+
noise_values = [noise_data.get('red_noise', 0),
|
| 875 |
+
noise_data.get('green_noise', 0),
|
| 876 |
+
noise_data.get('blue_noise', 0)]
|
| 877 |
+
|
| 878 |
+
axs[1, 2].bar(['Red', 'Green', 'Blue'], noise_values, color=['red', 'green', 'blue'])
|
| 879 |
+
axs[1, 2].set_title("Noise Levels by Channel")
|
| 880 |
+
axs[1, 2].set_ylabel("Noise Level")
|
| 881 |
+
else:
|
| 882 |
+
axs[1, 2].text(0.5, 0.5, "Noise analysis not available",
|
| 883 |
+
horizontalalignment='center', verticalalignment='center')
|
| 884 |
+
axs[1, 2].axis('off')
|
| 885 |
+
|
| 886 |
+
# File size analysis visualization
|
| 887 |
+
if 'file_size_analysis' in details and 'details' in details['file_size_analysis']:
|
| 888 |
+
size_data = details['file_size_analysis']['details']
|
| 889 |
+
|
| 890 |
+
if ('file_size' in size_data and 'expected_min' in size_data
|
| 891 |
+
and 'expected_max' in size_data and 'pixel_count' in size_data):
|
| 892 |
+
|
| 893 |
+
# Create a simple bar chart comparing actual vs expected size
|
| 894 |
+
sizes = [size_data['file_size'],
|
| 895 |
+
size_data['expected_min'],
|
| 896 |
+
size_data['expected_max']]
|
| 897 |
+
|
| 898 |
+
labels = ['Actual Size', 'Min Expected', 'Max Expected']
|
| 899 |
+
colors = ['blue', 'green', 'green']
|
| 900 |
+
|
| 901 |
+
axs[2, 0].bar(labels, sizes, color=colors, alpha=0.7)
|
| 902 |
+
axs[2, 0].set_title("File Size Analysis")
|
| 903 |
+
axs[2, 0].set_ylabel("Size (bytes)")
|
| 904 |
+
|
| 905 |
+
# Format y-axis to show human-readable sizes
|
| 906 |
+
axs[2, 0].get_yaxis().set_major_formatter(
|
| 907 |
+
plt.FuncFormatter(lambda x, loc: f"{x/1024:.1f}KB" if x >= 1024 else f"{x}B"))
|
| 908 |
+
|
| 909 |
+
# Is it suspiciously large?
|
| 910 |
+
is_too_large = size_data['file_size'] > size_data['expected_max']
|
| 911 |
+
is_too_small = size_data['file_size'] < size_data['expected_min']
|
| 912 |
+
|
| 913 |
+
if is_too_large:
|
| 914 |
+
assessment = f"SUSPICIOUS: {(size_data['file_size'] - size_data['expected_max'])/1024:.1f}KB larger than expected"
|
| 915 |
+
elif is_too_small:
|
| 916 |
+
assessment = f"SUSPICIOUS: {(size_data['expected_min'] - size_data['file_size'])/1024:.1f}KB smaller than expected"
|
| 917 |
+
else:
|
| 918 |
+
assessment = "NORMAL: Size within expected range"
|
| 919 |
+
|
| 920 |
+
axs[2, 0].annotate(assessment, xy=(0.05, 0.05), xycoords='axes fraction',
|
| 921 |
+
fontsize=9, verticalalignment='bottom')
|
| 922 |
+
|
| 923 |
+
if 'trailing_data_analysis' in details:
|
| 924 |
+
tdata = details['trailing_data_analysis']['details']
|
| 925 |
+
if tdata.get('appended_bytes', 0) > 0:
|
| 926 |
+
axs[2, 0].annotate(
|
| 927 |
+
f"Appended data: {tdata['appended_bytes']} bytes",
|
| 928 |
+
xy=(0.05, 0.85), xycoords='axes fraction',
|
| 929 |
+
fontsize=9, verticalalignment='bottom',
|
| 930 |
+
color='red'
|
| 931 |
+
)
|
| 932 |
+
else:
|
| 933 |
+
axs[2, 0].text(0.5, 0.5, "Size analysis data not available",
|
| 934 |
+
horizontalalignment='center', verticalalignment='center')
|
| 935 |
+
axs[2, 0].axis('off')
|
| 936 |
+
else:
|
| 937 |
+
axs[2, 0].text(0.5, 0.5, "Size analysis not available",
|
| 938 |
+
horizontalalignment='center', verticalalignment='center')
|
| 939 |
+
axs[2, 0].axis('off')
|
| 940 |
+
|
| 941 |
+
# Metadata analysis visualization
|
| 942 |
+
if 'metadata_analysis' in details and 'details' in details['metadata_analysis']:
|
| 943 |
+
metadata = details['metadata_analysis']['details']
|
| 944 |
+
|
| 945 |
+
metadata_text = f"Total metadata entries: {metadata.get('metadata_count', 0)}\n\n"
|
| 946 |
+
|
| 947 |
+
if 'suspicious_markers' in metadata and metadata['suspicious_markers']:
|
| 948 |
+
metadata_text += "Suspicious markers found:\n"
|
| 949 |
+
for key, marker, value in metadata['suspicious_markers'][:3]: # Show top 3
|
| 950 |
+
metadata_text += f"- '{marker}' in {key}\n"
|
| 951 |
+
|
| 952 |
+
if len(metadata['suspicious_markers']) > 3:
|
| 953 |
+
metadata_text += f"...and {len(metadata['suspicious_markers'])-3} more\n"
|
| 954 |
+
else:
|
| 955 |
+
metadata_text += "No suspicious metadata markers found"
|
| 956 |
+
|
| 957 |
+
axs[2, 1].text(0.1, 0.5, metadata_text, fontsize=10,
|
| 958 |
+
verticalalignment='center', horizontalalignment='left')
|
| 959 |
+
axs[2, 1].set_title("Metadata Analysis")
|
| 960 |
+
axs[2, 1].axis('off')
|
| 961 |
+
else:
|
| 962 |
+
axs[2, 1].text(0.5, 0.5, "Metadata analysis not available",
|
| 963 |
+
horizontalalignment='center', verticalalignment='center')
|
| 964 |
+
axs[2, 1].axis('off')
|
| 965 |
+
|
| 966 |
+
# Overall analysis metrics
|
| 967 |
+
axs[2, 2].axis('off')
|
| 968 |
+
metrics_text = "Detection Confidence by Method:\n\n"
|
| 969 |
+
|
| 970 |
+
for analysis_type, results in details.items():
|
| 971 |
+
if isinstance(results, dict) and 'confidence' in results:
|
| 972 |
+
confidence_value = results['confidence']
|
| 973 |
+
if confidence_value > 70:
|
| 974 |
+
highlight = " 🚨 HIGH"
|
| 975 |
+
elif confidence_value > 40:
|
| 976 |
+
highlight = " ⚠️ MEDIUM"
|
| 977 |
+
else:
|
| 978 |
+
highlight = ""
|
| 979 |
+
metrics_text += f"{analysis_type.replace('_', ' ').title()}: {confidence_value:.1f}%{highlight}\n"
|
| 980 |
+
|
| 981 |
+
axs[2, 2].text(0.1, 0.5, metrics_text, fontsize=10, verticalalignment='center')
|
| 982 |
+
axs[2, 2].set_title("Overall Analysis Results")
|
| 983 |
+
|
| 984 |
+
# Adjust layout
|
| 985 |
+
plt.tight_layout(rect=[0, 0, 1, 0.95])
|
| 986 |
+
|
| 987 |
+
# Save figure
|
| 988 |
+
report_filename = os.path.join(output_dir, f"steganalysis_{os.path.basename(image_path)}.png")
|
| 989 |
+
plt.savefig(report_filename)
|
| 990 |
+
plt.close()
|
| 991 |
+
|
| 992 |
+
logging.debug(f"Created visual report: {report_filename}")
|
| 993 |
+
return report_filename
|
| 994 |
+
except Exception as e:
|
| 995 |
+
logging.debug(f"Error creating visual report for {image_path}: {str(e)}")
|
| 996 |
+
return None
|
| 997 |
+
|
| 998 |
+
def find_image_files(directory, recursive=True):
|
| 999 |
+
"""Find all image files in a directory."""
|
| 1000 |
+
image_extensions = ('.jpg', '.jpeg', '.png', '.bmp', '.gif', '.tiff', '.tif', '.webp')
|
| 1001 |
+
image_files = []
|
| 1002 |
+
|
| 1003 |
+
if recursive:
|
| 1004 |
+
for root, _, files in os.walk(directory):
|
| 1005 |
+
for file in files:
|
| 1006 |
+
if file.lower().endswith(image_extensions):
|
| 1007 |
+
image_files.append(os.path.join(root, file))
|
| 1008 |
+
else:
|
| 1009 |
+
for file in os.listdir(directory):
|
| 1010 |
+
if os.path.isfile(os.path.join(directory, file)) and file.lower().endswith(image_extensions):
|
| 1011 |
+
image_files.append(os.path.join(directory, file))
|
| 1012 |
+
|
| 1013 |
+
return image_files
|
| 1014 |
+
|
| 1015 |
+
def analyze_images(directory, sensitivity='medium', recursive=True, output_dir=None, max_workers=None):
|
| 1016 |
+
"""
|
| 1017 |
+
Analyze all images in a directory for steganography.
|
| 1018 |
+
|
| 1019 |
+
Args:
|
| 1020 |
+
directory: Directory to scan
|
| 1021 |
+
sensitivity: 'low', 'medium', or 'high'
|
| 1022 |
+
recursive: Whether to scan subdirectories
|
| 1023 |
+
output_dir: Directory to save visual reports
|
| 1024 |
+
max_workers: Number of worker processes
|
| 1025 |
+
|
| 1026 |
+
Returns:
|
| 1027 |
+
List of suspicious image details
|
| 1028 |
+
"""
|
| 1029 |
+
# Find all image files
|
| 1030 |
+
image_files = find_image_files(directory, recursive)
|
| 1031 |
+
if not image_files:
|
| 1032 |
+
logging.warning("No image files found!")
|
| 1033 |
+
return []
|
| 1034 |
+
|
| 1035 |
+
logging.info(f"Found {len(image_files)} image files to analyze")
|
| 1036 |
+
|
| 1037 |
+
# Create output directory if specified
|
| 1038 |
+
if output_dir:
|
| 1039 |
+
os.makedirs(output_dir, exist_ok=True)
|
| 1040 |
+
logging.info(f"Visual reports will be saved to: {output_dir}")
|
| 1041 |
+
|
| 1042 |
+
# Prepare input arguments for workers
|
| 1043 |
+
input_args = [(file_path, sensitivity, output_dir) for file_path in image_files]
|
| 1044 |
+
|
| 1045 |
+
suspicious_images = []
|
| 1046 |
+
|
| 1047 |
+
# Process files in parallel
|
| 1048 |
+
with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
|
| 1049 |
+
# Colorful progress bar
|
| 1050 |
+
results = []
|
| 1051 |
+
futures = {executor.submit(process_file, arg): arg[0] for arg in input_args}
|
| 1052 |
+
|
| 1053 |
+
with tqdm(
|
| 1054 |
+
total=len(image_files),
|
| 1055 |
+
desc=f"{colorama.Fore.RED}Analyzing images for steganography{colorama.Style.RESET_ALL}",
|
| 1056 |
+
unit="file",
|
| 1057 |
+
bar_format="{desc}: {percentage:3.0f}%|{bar:30}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}]",
|
| 1058 |
+
colour="red"
|
| 1059 |
+
) as pbar:
|
| 1060 |
+
for future in concurrent.futures.as_completed(futures):
|
| 1061 |
+
file_path = futures[future]
|
| 1062 |
+
try:
|
| 1063 |
+
result = future.result()
|
| 1064 |
+
results.append(result)
|
| 1065 |
+
|
| 1066 |
+
# Update progress
|
| 1067 |
+
pbar.update(1)
|
| 1068 |
+
|
| 1069 |
+
# Add to suspicious images if applicable
|
| 1070 |
+
if result['suspicious']:
|
| 1071 |
+
suspicious_images.append(result)
|
| 1072 |
+
logging.info(f"Suspicious image found: {file_path} (confidence: {result['confidence']:.1f}%)")
|
| 1073 |
+
except Exception as e:
|
| 1074 |
+
logging.error(f"Error analyzing {file_path}: {str(e)}")
|
| 1075 |
+
pbar.update(1)
|
| 1076 |
+
|
| 1077 |
+
# Sort suspicious images by confidence
|
| 1078 |
+
suspicious_images.sort(key=lambda x: x['confidence'], reverse=True)
|
| 1079 |
+
|
| 1080 |
+
logging.info(f"Analysis complete. Found {len(suspicious_images)} suspicious images")
|
| 1081 |
+
return suspicious_images
|
| 1082 |
+
|
| 1083 |
+
def main():
|
| 1084 |
+
print_banner()
|
| 1085 |
+
|
| 1086 |
+
# Check for 'q' command to quit
|
| 1087 |
+
if len(sys.argv) == 2 and sys.argv[1].lower() == 'q':
|
| 1088 |
+
print(f"{colorama.Fore.YELLOW}Exiting RAT Finder. Stay vigilant!{colorama.Style.RESET_ALL}")
|
| 1089 |
+
sys.exit(0)
|
| 1090 |
+
|
| 1091 |
+
parser = argparse.ArgumentParser(
|
| 1092 |
+
description='RAT Finder: Steganography Detection Tool (v0.2.0)',
|
| 1093 |
+
epilog='Part of the 2PAC toolkit - Created by Richard Young'
|
| 1094 |
+
)
|
| 1095 |
+
|
| 1096 |
+
# Main action
|
| 1097 |
+
parser.add_argument('directory', nargs='?', help='Directory to search for images')
|
| 1098 |
+
parser.add_argument('--check-file', type=str, help='Check a specific file for steganography')
|
| 1099 |
+
|
| 1100 |
+
# Options
|
| 1101 |
+
parser.add_argument('--sensitivity', type=str, choices=['low', 'medium', 'high'], default='medium',
|
| 1102 |
+
help='Set detection sensitivity level (default: medium)')
|
| 1103 |
+
parser.add_argument('--non-recursive', action='store_true', help='Only search in the specified directory, not subdirectories')
|
| 1104 |
+
parser.add_argument('--output', type=str, help='Save list of suspicious files to this file')
|
| 1105 |
+
parser.add_argument('--visual-reports', type=str, help='Directory to save visual analysis reports')
|
| 1106 |
+
parser.add_argument('--workers', type=int, default=None, help='Number of worker processes (default: CPU count)')
|
| 1107 |
+
parser.add_argument('--verbose', '-v', action='store_true', help='Enable verbose logging')
|
| 1108 |
+
parser.add_argument('--no-color', action='store_true', help='Disable colored output')
|
| 1109 |
+
parser.add_argument('--version', action='version', version=f'RAT Finder v{VERSION} by Richard Young')
|
| 1110 |
+
|
| 1111 |
+
args = parser.parse_args()
|
| 1112 |
+
|
| 1113 |
+
# Setup logging
|
| 1114 |
+
setup_logging(args.verbose, args.no_color)
|
| 1115 |
+
|
| 1116 |
+
# Handle specific file check mode
|
| 1117 |
+
if args.check_file:
|
| 1118 |
+
file_path = args.check_file
|
| 1119 |
+
if not os.path.exists(file_path):
|
| 1120 |
+
logging.error(f"Error: File not found: {file_path}")
|
| 1121 |
+
sys.exit(1)
|
| 1122 |
+
|
| 1123 |
+
print(f"\n{colorama.Style.BRIGHT}Analyzing file for steganography: {file_path}{colorama.Style.RESET_ALL}\n")
|
| 1124 |
+
|
| 1125 |
+
is_suspicious, confidence, details = analyze_image(file_path, args.sensitivity)
|
| 1126 |
+
|
| 1127 |
+
# Print results
|
| 1128 |
+
if is_suspicious:
|
| 1129 |
+
print(f"{colorama.Fore.RED}[!] SUSPICIOUS: This image may contain hidden data{colorama.Style.RESET_ALL}")
|
| 1130 |
+
print(f"Confidence: {confidence:.1f}%\n")
|
| 1131 |
+
else:
|
| 1132 |
+
print(f"{colorama.Fore.GREEN}[✓] No steganography detected in this image{colorama.Style.RESET_ALL}")
|
| 1133 |
+
print(f"Confidence: {(100 - confidence):.1f}% clean\n")
|
| 1134 |
+
|
| 1135 |
+
# Details of analysis
|
| 1136 |
+
print(f"{colorama.Fore.CYAN}Detection Details:{colorama.Style.RESET_ALL}")
|
| 1137 |
+
|
| 1138 |
+
for analysis_type, results in details.items():
|
| 1139 |
+
if isinstance(results, dict) and 'confidence' in results:
|
| 1140 |
+
detection_status = f"{colorama.Fore.RED}[DETECTED]" if results['suspicious'] else f"{colorama.Fore.GREEN}[OK]"
|
| 1141 |
+
print(f"{detection_status} {analysis_type.replace('_', ' ').title()}: {results['confidence']:.1f}%{colorama.Style.RESET_ALL}")
|
| 1142 |
+
|
| 1143 |
+
# Print specific findings
|
| 1144 |
+
if 'details' in results and isinstance(results['details'], dict):
|
| 1145 |
+
for key, value in results['details'].items():
|
| 1146 |
+
if key != 'error':
|
| 1147 |
+
print(f" - {key}: {value}")
|
| 1148 |
+
|
| 1149 |
+
# Create visual report if requested
|
| 1150 |
+
if args.visual_reports:
|
| 1151 |
+
report_path = create_visual_report(file_path, confidence, details, args.visual_reports)
|
| 1152 |
+
if report_path:
|
| 1153 |
+
print(f"\n{colorama.Fore.CYAN}Visual report saved to: {report_path}{colorama.Style.RESET_ALL}")
|
| 1154 |
+
|
| 1155 |
+
sys.exit(0)
|
| 1156 |
+
|
| 1157 |
+
# Check if directory is specified
|
| 1158 |
+
if not args.directory:
|
| 1159 |
+
logging.error("Error: You must specify a directory to scan or use --check-file for a specific file")
|
| 1160 |
+
sys.exit(1)
|
| 1161 |
+
|
| 1162 |
+
directory = Path(args.directory)
|
| 1163 |
+
|
| 1164 |
+
# Verify the directory exists
|
| 1165 |
+
if not directory.exists() or not directory.is_dir():
|
| 1166 |
+
logging.error(f"Error: {directory} is not a valid directory")
|
| 1167 |
+
sys.exit(1)
|
| 1168 |
+
|
| 1169 |
+
# Begin analysis
|
| 1170 |
+
logging.info(f"Starting steganography analysis with {args.sensitivity} sensitivity")
|
| 1171 |
+
logging.info(f"Scanning for images in {directory}")
|
| 1172 |
+
|
| 1173 |
+
try:
|
| 1174 |
+
suspicious_images = analyze_images(
|
| 1175 |
+
directory,
|
| 1176 |
+
sensitivity=args.sensitivity,
|
| 1177 |
+
recursive=not args.non_recursive,
|
| 1178 |
+
output_dir=args.visual_reports,
|
| 1179 |
+
max_workers=args.workers
|
| 1180 |
+
)
|
| 1181 |
+
|
| 1182 |
+
# Print summary
|
| 1183 |
+
if suspicious_images:
|
| 1184 |
+
count_str = f"{colorama.Fore.RED}{len(suspicious_images)}{colorama.Style.RESET_ALL}"
|
| 1185 |
+
logging.info(f"Found {count_str} suspicious images that may contain hidden data")
|
| 1186 |
+
|
| 1187 |
+
# Print top findings
|
| 1188 |
+
print("\nTop suspicious images:")
|
| 1189 |
+
for i, result in enumerate(suspicious_images[:10]): # Show top 10
|
| 1190 |
+
confidence_color = colorama.Fore.RED if result['confidence'] > 80 else colorama.Fore.YELLOW
|
| 1191 |
+
print(f"{i+1}. {result['path']} - Confidence: {confidence_color}{result['confidence']:.1f}%{colorama.Style.RESET_ALL}")
|
| 1192 |
+
|
| 1193 |
+
if len(suspicious_images) > 10:
|
| 1194 |
+
print(f"... and {len(suspicious_images) - 10} more")
|
| 1195 |
+
else:
|
| 1196 |
+
logging.info(f"{colorama.Fore.GREEN}No suspicious images found{colorama.Style.RESET_ALL}")
|
| 1197 |
+
|
| 1198 |
+
# Save output if requested
|
| 1199 |
+
if args.output and suspicious_images:
|
| 1200 |
+
with open(args.output, 'w') as f:
|
| 1201 |
+
for result in suspicious_images:
|
| 1202 |
+
f.write(f"{result['path']},{result['confidence']:.1f}\n")
|
| 1203 |
+
logging.info(f"Saved list of suspicious files to {args.output}")
|
| 1204 |
+
|
| 1205 |
+
except KeyboardInterrupt:
|
| 1206 |
+
logging.info("Operation cancelled by user")
|
| 1207 |
+
sys.exit(130)
|
| 1208 |
+
except Exception as e:
|
| 1209 |
+
logging.error(f"Error: {str(e)}")
|
| 1210 |
+
if args.verbose:
|
| 1211 |
+
import traceback
|
| 1212 |
+
traceback.print_exc()
|
| 1213 |
+
sys.exit(1)
|
| 1214 |
+
|
| 1215 |
+
# Add signature at the end
|
| 1216 |
+
if not args.no_color:
|
| 1217 |
+
signature = f"\n{colorama.Fore.RED}RAT Finder v{VERSION} by Richard Young{colorama.Style.RESET_ALL}"
|
| 1218 |
+
tagline = f"{colorama.Fore.YELLOW}\"Uncovering what's hidden in plain sight.\"{colorama.Style.RESET_ALL}"
|
| 1219 |
+
print(signature)
|
| 1220 |
+
print(tagline)
|
| 1221 |
+
|
| 1222 |
+
if __name__ == "__main__":
|
| 1223 |
+
main()
|
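
The CLI above is the primary entry point, but the same functions can also be driven from Python. The snippet below is an illustrative sketch only: it assumes rat_finder.py is importable from the working directory and keeps the signatures shown in this diff (analyze_image returning (is_suspicious, confidence, details), analyze_images returning a confidence-sorted list of result dicts); the paths are placeholders.

# Sketch: programmatic use of rat_finder (paths are placeholders).
from rat_finder import analyze_image, analyze_images

def scan_examples():
    # Single file: analyze_image returns (is_suspicious, confidence, details)
    is_suspicious, confidence, details = analyze_image("suspect.png", "high")
    print(f"suspicious={is_suspicious}, confidence={confidence:.1f}%")

    # Directory scan: result dicts sorted by confidence, highest first
    findings = analyze_images("photos/", sensitivity="medium", recursive=True,
                              output_dir="reports", max_workers=2)
    for hit in findings[:5]:
        print(f"{hit['path']}: {hit['confidence']:.1f}%")

if __name__ == "__main__":
    # The guard matters: analyze_images spawns worker processes.
    scan_examples()
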
requirements.txt
ADDED
@@ -0,0 +1,8 @@
Pillow
tqdm
humanize
colorama
numpy
scipy
matplotlib
gradio>=4.0.0
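
Only Gradio carries a version constraint; the rest are unpinned. A quick way to confirm an environment satisfies the list is to attempt each import (a sketch; note that Pillow is imported as PIL):

# Sketch: check that the requirements.txt dependencies are importable.
import importlib

for name in ("PIL", "tqdm", "humanize", "colorama", "numpy", "scipy", "matplotlib", "gradio"):
    try:
        importlib.import_module(name)
        print(f"{name}: OK")
    except ImportError as err:
        print(f"{name}: missing ({err})")
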
steg_embedder.py
ADDED
@@ -0,0 +1,337 @@
#!/usr/bin/env python3
"""
LSB Steganography Embedder for 2PAC
Hides and extracts data in images using Least Significant Bit technique
"""

import io
import hashlib
import struct
from typing import Tuple, Optional
from PIL import Image
import numpy as np


class StegEmbedder:
    """
    LSB (Least Significant Bit) Steganography implementation
    Hides data in the least significant bits of image pixels
    """

    HEADER_SIZE = 12  # 4 bytes for data length + 8 bytes for checksum
    MAGIC_NUMBER = b'2PAC'  # Signature to identify embedded data

    def __init__(self):
        self.last_capacity = 0
        self.last_used = 0

    def calculate_capacity(self, image: Image.Image, bits_per_channel: int = 1) -> int:
        """
        Calculate how many bytes can be hidden in the image

        Args:
            image: PIL Image object
            bits_per_channel: Number of LSBs to use per color channel (1-4)

        Returns:
            Maximum bytes that can be hidden
        """
        if image.mode not in ['RGB', 'RGBA']:
            raise ValueError(f"Unsupported image mode: {image.mode}. Use RGB or RGBA.")

        width, height = image.size
        channels = len(image.mode)  # 3 for RGB, 4 for RGBA

        # Total bits available
        total_bits = width * height * channels * bits_per_channel

        # Account for header (magic number + length + checksum)
        header_bits = (len(self.MAGIC_NUMBER) + self.HEADER_SIZE) * 8

        available_bits = total_bits - header_bits
        capacity = available_bits // 8  # Convert to bytes

        self.last_capacity = capacity
        return capacity

    def _string_to_bits(self, data: str) -> str:
        """Convert string to binary representation"""
        return ''.join(format(byte, '08b') for byte in data.encode('utf-8'))

    def _bits_to_string(self, bits: str) -> str:
        """Convert binary representation back to string"""
        chars = []
        for i in range(0, len(bits), 8):
            byte = bits[i:i+8]
            if len(byte) == 8:
                chars.append(chr(int(byte, 2)))
        return ''.join(chars)

    def _encrypt_data(self, data: str, password: str) -> bytes:
        """Simple XOR encryption with password-derived key"""
        key = hashlib.sha256(password.encode()).digest()
        data_bytes = data.encode('utf-8')

        encrypted = bytearray()
        for i, byte in enumerate(data_bytes):
            encrypted.append(byte ^ key[i % len(key)])

        return bytes(encrypted)

    def _decrypt_data(self, encrypted_data: bytes, password: str) -> str:
        """Decrypt XOR-encrypted data"""
        key = hashlib.sha256(password.encode()).digest()

        decrypted = bytearray()
        for i, byte in enumerate(encrypted_data):
            decrypted.append(byte ^ key[i % len(key)])

        return bytes(decrypted).decode('utf-8', errors='replace')

    def embed_data(
        self,
        image_path: str,
        data: str,
        output_path: str,
        password: Optional[str] = None,
        bits_per_channel: int = 1
    ) -> Tuple[bool, str, dict]:
        """
        Hide data in an image using LSB steganography

        Args:
            image_path: Path to input image
            data: Text data to hide
            output_path: Path for output image (will be PNG)
            password: Optional password for encryption
            bits_per_channel: LSBs to use per channel (1=subtle, 2-4=more capacity)

        Returns:
            Tuple of (success, message, stats_dict)
        """
        try:
            # Load image
            img = Image.open(image_path)
            if img.mode not in ['RGB', 'RGBA']:
                img = img.convert('RGB')

            # Calculate capacity
            capacity = self.calculate_capacity(img, bits_per_channel)

            # Encrypt data if password provided
            if password:
                data_bytes = self._encrypt_data(data, password)
                is_encrypted = True
            else:
                data_bytes = data.encode('utf-8')
                is_encrypted = False

            data_length = len(data_bytes)

            if data_length > capacity:
                return False, f"Data too large! Maximum: {capacity} bytes, Provided: {data_length} bytes", {}

            # Create header: MAGIC + encrypted_flag + length + checksum
            checksum = hashlib.md5(data_bytes).digest()[:8]
            encrypted_flag = b'\x01' if is_encrypted else b'\x00'
            header = self.MAGIC_NUMBER + encrypted_flag + struct.pack('<I', data_length) + checksum

            # Combine header and data
            full_data = header + data_bytes

            # Convert to bit string
            bit_string = ''.join(format(byte, '08b') for byte in full_data)

            # Embed in image
            img_array = np.array(img, dtype=np.uint8)
            flat_array = img_array.flatten()

            bit_index = 0
            for i in range(len(flat_array)):
                if bit_index >= len(bit_string):
                    break

                # Clear LSBs and set new bits
                pixel = flat_array[i]
                for bit in range(bits_per_channel):
                    if bit_index >= len(bit_string):
                        break
                    # Clear bit
                    pixel = (pixel & ~(1 << bit))
                    # Set new bit
                    if bit_string[bit_index] == '1':
                        pixel = pixel | (1 << bit)
                    bit_index += 1

                flat_array[i] = pixel

            # Reshape and save
            steg_img_array = flat_array.reshape(img_array.shape)
            steg_img = Image.fromarray(steg_img_array, img.mode)

            # Save as PNG to preserve data
            steg_img.save(output_path, 'PNG', optimize=False)

            self.last_used = data_length

            stats = {
                'data_size': data_length,
                'capacity': capacity,
                'utilization': f"{(data_length / capacity * 100):.1f}%",
                'encrypted': is_encrypted,
                'bits_per_channel': bits_per_channel,
                'image_size': f"{img.width}x{img.height}"
            }

            return True, f"Successfully embedded {data_length} bytes", stats

        except Exception as e:
            return False, f"Error embedding data: {str(e)}", {}

    def extract_data(
        self,
        image_path: str,
        password: Optional[str] = None,
        bits_per_channel: int = 1
    ) -> Tuple[bool, str, str]:
        """
        Extract hidden data from a steganographic image

        Args:
            image_path: Path to image with hidden data
            password: Password if data is encrypted
            bits_per_channel: LSBs used per channel (must match embedding)

        Returns:
            Tuple of (success, message, extracted_data)
        """
        try:
            # Load image
            img = Image.open(image_path)
            img_array = np.array(img, dtype=np.uint8)
            flat_array = img_array.flatten()

            # Extract header first
            header_bits = (len(self.MAGIC_NUMBER) + 1 + 4 + 8) * 8
            extracted_bits = []

            bit_index = 0
            for i in range(len(flat_array)):
                if bit_index >= header_bits:
                    break
                pixel = flat_array[i]
                for bit in range(bits_per_channel):
                    if bit_index >= header_bits:
                        break
                    extracted_bits.append(str((pixel >> bit) & 1))
                    bit_index += 1

            # Convert bits to bytes
            header_bytes = bytearray()
            for i in range(0, len(extracted_bits), 8):
                byte_bits = ''.join(extracted_bits[i:i+8])
                if len(byte_bits) == 8:
                    header_bytes.append(int(byte_bits, 2))

            # Verify magic number
            magic = bytes(header_bytes[:len(self.MAGIC_NUMBER)])
            if magic != self.MAGIC_NUMBER:
                return False, "No hidden data found (invalid magic number)", ""

            # Parse header
            offset = len(self.MAGIC_NUMBER)
            is_encrypted = header_bytes[offset] == 1
            offset += 1

            data_length = struct.unpack('<I', bytes(header_bytes[offset:offset+4]))[0]
            offset += 4

            stored_checksum = bytes(header_bytes[offset:offset+8])
            offset += 8

            # Extract data
            total_bits_needed = (len(self.MAGIC_NUMBER) + 1 + 4 + 8 + data_length) * 8
            extracted_bits = []

            bit_index = 0
            for i in range(len(flat_array)):
                if bit_index >= total_bits_needed:
                    break
                pixel = flat_array[i]
                for bit in range(bits_per_channel):
                    if bit_index >= total_bits_needed:
                        break
                    extracted_bits.append(str((pixel >> bit) & 1))
                    bit_index += 1

            # Convert to bytes
            data_bytes = bytearray()
            for i in range(0, len(extracted_bits), 8):
                byte_bits = ''.join(extracted_bits[i:i+8])
                if len(byte_bits) == 8:
                    data_bytes.append(int(byte_bits, 2))

            # Skip header and get data
            data_bytes = bytes(data_bytes[offset:offset+data_length])

            # Verify checksum
            calculated_checksum = hashlib.md5(data_bytes).digest()[:8]
            if calculated_checksum != stored_checksum:
                return False, "Data corruption detected (checksum mismatch)", ""

            # Decrypt if needed
            if is_encrypted:
                if not password:
                    return False, "Data is encrypted but no password provided", ""
                try:
                    data_str = self._decrypt_data(data_bytes, password)
                except Exception as e:
                    return False, f"Decryption failed (wrong password?): {str(e)}", ""
            else:
                data_str = data_bytes.decode('utf-8', errors='replace')

            return True, f"Successfully extracted {data_length} bytes", data_str

        except Exception as e:
            return False, f"Error extracting data: {str(e)}", ""


def main():
    """Command-line interface for testing"""
    import argparse

    parser = argparse.ArgumentParser(description='LSB Steganography Tool')
    parser.add_argument('mode', choices=['embed', 'extract'], help='Operation mode')
    parser.add_argument('image', help='Input image path')
    parser.add_argument('--data', help='Data to embed (for embed mode)')
    parser.add_argument('--output', help='Output image path (for embed mode)')
    parser.add_argument('--password', help='Encryption password (optional)')
    parser.add_argument('--bits', type=int, default=1, help='Bits per channel (1-4)')

    args = parser.parse_args()

    embedder = StegEmbedder()

    if args.mode == 'embed':
        if not args.data or not args.output:
            print("Error: --data and --output required for embed mode")
            return

        success, message, stats = embedder.embed_data(
            args.image, args.data, args.output, args.password, args.bits
        )
        print(message)
        if success:
            print(f"Stats: {stats}")

    elif args.mode == 'extract':
        success, message, data = embedder.extract_data(
            args.image, args.password, args.bits
        )
        print(message)
        if success:
            print(f"Extracted data:\n{data}")


if __name__ == '__main__':
    main()
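
To make the capacity math above concrete: a 1920x1080 RGB cover at one LSB per channel holds 1920 * 1080 * 3 = 6,220,800 bits; after the 128 header bits that calculate_capacity reserves, that leaves 6,220,672 / 8 = 777,584 bytes of payload. The round-trip sketch below is illustrative only, assuming steg_embedder.py is importable from the working directory; the file names, message, and password are placeholders.

# Sketch: embed/extract round trip with StegEmbedder (placeholder file names).
import numpy as np
from PIL import Image
from steg_embedder import StegEmbedder

# Synthetic 512x512 RGB cover image, purely for illustration.
cover = Image.fromarray(np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8), "RGB")
cover.save("cover.png")

embedder = StegEmbedder()
print("capacity:", embedder.calculate_capacity(cover), "bytes")  # 512*512*3//8 - 16 = 98288

ok, msg, stats = embedder.embed_data("cover.png", "meet at dawn", "stego.png",
                                     password="hunter2", bits_per_channel=1)
print(msg, stats)

ok, msg, secret = embedder.extract_data("stego.png", password="hunter2", bits_per_channel=1)
print(msg, "->", secret)

Because the magic number, length, checksum, and optional XOR layer all ride inside the LSB payload, extraction only succeeds when bits_per_channel matches the value used at embed time.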