Richard Young committed on
Commit
c43a81f
·
0 Parent(s):

Initial commit for Hugging Face Space

Files changed (8)
  1. .gitattributes +7 -0
  2. .gitignore +22 -0
  3. README.md +106 -0
  4. app.py +563 -0
  5. find_bad_images.py +1670 -0
  6. rat_finder.py +1223 -0
  7. requirements.txt +8 -0
  8. steg_embedder.py +337 -0
.gitattributes ADDED
@@ -0,0 +1,7 @@
1
+ *.jpg filter=lfs diff=lfs merge=lfs -text
2
+ *.png filter=lfs diff=lfs merge=lfs -text
3
+ *.jpeg filter=lfs diff=lfs merge=lfs -text
4
+ *.gif filter=lfs diff=lfs merge=lfs -text
5
+ *.pdf filter=lfs diff=lfs merge=lfs -text
6
+ *.zip filter=lfs diff=lfs merge=lfs -text
7
+ docs/*.jpg filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,22 @@
1
+ # Image files
2
+ *.jpg
3
+ *.jpeg
4
+ *.JPG
5
+ *.JPEG
6
+
7
+ # System files
8
+ .DS_Store
9
+ Thumbs.db
10
+
11
+ # Python
12
+ __pycache__/
13
+ *.py[cod]
14
+ *.class
15
+ .env
16
+ .venv
17
+ env/
18
+ venv/
19
+ ENV/
20
+ env.bak/
21
+ venv.bak/
22
+
README.md ADDED
@@ -0,0 +1,106 @@
1
+ ---
2
+ title: 2PAC Picture Analyzer & Corruption Killer
3
+ emoji: 🔫
4
+ colorFrom: purple
5
+ colorTo: blue
6
+ sdk: gradio
7
+ sdk_version: 4.44.0
8
+ app_file: app.py
9
+ pinned: false
10
+ license: mit
11
+ ---
12
+
13
+ # 🔫 2PAC: Picture Analyzer & Corruption Killer
14
+
15
+ **Advanced image security and steganography toolkit**
16
+
17
+ ## Features
18
+
19
+ ### 🔒 Hide Secret Data
20
+ Invisibly hide text messages inside images using **LSB (Least Significant Bit) steganography**:
21
+ - Hide text up to the image's capacity (which scales with image size; see the rough estimate below)
22
+ - Optional password encryption for added security
23
+ - Adjustable LSB depth (1-4 bits per channel)
24
+ - PNG output preserves hidden data perfectly
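+ - Rough capacity estimate (illustrative): a 1024×768 RGB image at 1 bit per channel holds about 1024 × 768 × 3 / 8 ≈ 295 KB of raw payload, before any header overhead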
25
+
26
+ ### 🔍 Detect & Extract Hidden Data
27
+ Advanced steganography detection using **RAT Finder** technology:
28
+ - **ELA (Error Level Analysis)** - Highlights compression artifacts
29
+ - **LSB Analysis** - Detects randomness in least significant bits
30
+ - **Histogram Analysis** - Finds statistical anomalies
31
+ - **Metadata Inspection** - Checks EXIF data for suspicious tools
32
+ - **Extract Data** - Recover messages hidden with this tool
33
+
34
+ ### 🛡️ Check Image Integrity
35
+ Comprehensive image validation and corruption detection:
36
+ - File format validation (JPEG, PNG, GIF, TIFF, BMP, WebP, HEIC)
37
+ - Header integrity checks (see the magic-byte sketch below)
38
+ - Data completeness verification
39
+ - Visual corruption detection (black/gray regions)
40
+ - Structure validation
41
+
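+ As a flavor of the header check, here is a minimal magic-byte sketch (illustrative only; `looks_like_valid_header` is not the project's API, and the full validation logic lives in `find_bad_images.py`):
+
+ ```python
+ def looks_like_valid_header(path):
+     """Cheap magic-byte check for JPEG and PNG before attempting a full decode."""
+     with open(path, "rb") as f:
+         head = f.read(8)
+     if head.startswith(b"\xff\xd8\xff"):         # JPEG SOI marker
+         return True
+     if head.startswith(b"\x89PNG\r\n\x1a\n"):    # PNG signature
+         return True
+     return False
+ ```
+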
42
+ ## How It Works
43
+
44
+ ### LSB Steganography
45
+ The tool hides data in the **least significant bits** of pixel values. Since changing the last 1-2 bits of a pixel value (e.g., changing 200 to 201) is imperceptible to the human eye, we can encode arbitrary data without visible changes to the image.
46
+
47
+ **Example:**
48
+ - Original pixel: RGB(156, 89, 201) = `10011100, 01011001, 11001001`
49
+ - After hiding bit '1': RGB(156, 89, 201) = `10011100, 01011001, 11001001` (last bit already 1)
50
+ - After hiding bit '0': RGB(156, 88, 201) = `10011100, 01011000, 11001001` (89→88)
51
+
52
+ This allows hiding hundreds to thousands of bytes in a typical photo!
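+
+ A minimal sketch of the embedding step (illustrative only; `embed_bits` and its signature are assumptions, not the routine used by `steg_embedder.py`; it assumes a Pillow RGB image at 1 bit per channel):
+
+ ```python
+ from PIL import Image
+
+ def embed_bits(png_path, bits, out_path):
+     """Write each payload bit into the LSB of successive R, G, B values."""
+     img = Image.open(png_path).convert("RGB")
+     flat = [c for px in img.getdata() for c in px]   # R, G, B, R, G, B, ...
+     for i, bit in enumerate(bits):                   # bits: iterable of 0/1, shorter than flat
+         flat[i] = (flat[i] & ~1) | bit               # clear the LSB, then set the payload bit
+     img.putdata([tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)])
+     img.save(out_path, "PNG")                        # PNG is lossless, so the LSBs survive
+ ```
+
+ Following the example above, hiding a '1' in the green value 89 (binary 01011001) leaves it at 89, while hiding a '0' clears the last bit and gives 88.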
53
+
54
+ ### Steganography Detection
55
+ The RAT Finder uses multiple forensic techniques (a rough ELA sketch follows the list):
56
+
57
+ 1. **ELA (Error Level Analysis)**: Re-saves the image at a known quality and compares compression artifacts. Hidden data or manipulation shows as bright areas.
58
+
59
+ 2. **LSB Analysis**: Statistical tests check if the least significant bits are too random (hidden data) or too uniform (natural image).
60
+
61
+ 3. **Histogram Analysis**: Analyzes color distribution for anomalies typical of steganography.
62
+
63
+ 4. **Metadata Forensics**: Checks EXIF data for steganography tools or suspicious editing history.
64
+
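+ A minimal ELA sketch with Pillow (illustrative only; `quick_ela`, the quality of 90, and the brightness scale are assumptions, not the parameters `rat_finder.py` uses):
+
+ ```python
+ import io
+ from PIL import Image, ImageChops, ImageEnhance
+
+ def quick_ela(path, quality=90, scale=15):
+     """Recompress at a known JPEG quality and amplify the per-pixel difference."""
+     original = Image.open(path).convert("RGB")
+     buf = io.BytesIO()
+     original.save(buf, "JPEG", quality=quality)          # re-save at a fixed, known quality
+     buf.seek(0)
+     resaved = Image.open(buf)
+     diff = ImageChops.difference(original, resaved)      # bright pixels = high error level
+     return ImageEnhance.Brightness(diff).enhance(scale)  # boost so anomalies stand out
+ ```
+
+ Regions that recompress very differently from their surroundings show up as bright patches in the returned image.
+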
65
+ ## Usage Tips
66
+
67
+ ### For Hiding Data:
68
+ - ✅ Use **PNG** images (JPEG compression destroys hidden data)
69
+ - ✅ Larger images = more capacity
70
+ - ✅ Use 1-2 bits per channel for undetectable hiding
71
+ - ✅ Add password encryption for sensitive data
72
+ - ⚠️ Don't re-save or edit the output image!
73
+
74
+ ### For Detection:
75
+ - 🔍 Higher sensitivity = more thorough but more false positives
76
+ - 📊 Check the ELA image for bright spots (potential hiding)
77
+ - 💡 High confidence doesn't guarantee hidden data (could be compression artifacts)
78
+ - 🔓 Use "Extract Data" tab if you suspect LSB steganography
79
+
80
+ ### For Corruption Checking:
81
+ - 🛡️ Enable visual corruption check for damaged photos
82
+ - ⚙️ Higher sensitivity for stricter validation
83
+ - 📁 Useful before archiving important photo collections
84
+
85
+ ## About
86
+
87
+ **2PAC** combines three powerful tools:
88
+ - **LSB Steganography** engine (new!)
89
+ - **RAT Finder** - Advanced steg detection
90
+ - **Image Validator** - Corruption checker
91
+
92
+ Created by [Richard Young](https://github.com/ricyoung) | Part of [DeepNeuro.AI](https://deepneuro.ai)
93
+
94
+ 🔗 **GitHub Repository:** [github.com/ricyoung/2pac](https://github.com/ricyoung/2pac)
95
+ 🌐 **More Tools:** [demo.deepneuro.ai](https://demo.deepneuro.ai)
96
+
97
+ ## Security & Privacy
98
+
99
+ - ✅ All processing happens inside a temporary Hugging Face Space session
100
+ - ✅ Images are not stored or logged
101
+ - ✅ Temporary files are deleted after processing
102
+ - ✅ Your hidden data and passwords are never saved
103
+
104
+ ---
105
+
106
+ *"All Eyez On Your Images" 👁️*
app.py ADDED
@@ -0,0 +1,563 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ 2PAC: Picture Analyzer & Corruption Killer - Gradio Web Interface
4
+ Steganography, image corruption detection, and security analysis
5
+ """
6
+
7
+ import os
8
+ import tempfile
9
+ import gradio as gr
10
+ from PIL import Image
11
+ import matplotlib.pyplot as plt
12
+ import io
13
+ import base64
14
+
15
+ # Import 2PAC modules
16
+ from steg_embedder import StegEmbedder
17
+ import rat_finder
18
+ import find_bad_images
19
+
20
+
21
+ # Initialize embedder
22
+ embedder = StegEmbedder()
23
+
24
+
25
+ def hide_data_in_image(image, secret_text, password, bits_per_channel):
26
+ """
27
+ Tab 1: Hide data in an image using LSB steganography
28
+ """
29
+ if image is None:
30
+ return None, "⚠️ Please upload an image first"
31
+
32
+ if not secret_text or len(secret_text.strip()) == 0:
33
+ return None, "⚠️ Please enter text to hide"
34
+
35
+ try:
36
+ # Save uploaded image to temp file
37
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp_input:
38
+ img = Image.fromarray(image)
39
+ img.save(tmp_input.name, 'PNG')
40
+ input_path = tmp_input.name
41
+
42
+ # Create output file
43
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp_output:
44
+ output_path = tmp_output.name
45
+
46
+ # Calculate capacity first
47
+ img = Image.open(input_path)
48
+ capacity = embedder.calculate_capacity(img, bits_per_channel)
49
+
50
+ # Check if data fits
51
+ data_size = len(secret_text.encode('utf-8'))
52
+ if data_size > capacity:
53
+ os.unlink(input_path)
54
+ return None, f"❌ **Error:** Data too large!\n\n" \
55
+ f"- **Data size:** {data_size:,} bytes\n" \
56
+ f"- **Maximum capacity:** {capacity:,} bytes\n" \
57
+ f"- **Overflow:** {data_size - capacity:,} bytes\n\n" \
58
+ f"💡 Try: Shorter text, larger image, or more bits per channel"
59
+
60
+ # Embed data
61
+ pwd = password if password and len(password) > 0 else None
62
+ success, message, stats = embedder.embed_data(
63
+ input_path,
64
+ secret_text,
65
+ output_path,
66
+ password=pwd,
67
+ bits_per_channel=bits_per_channel
68
+ )
69
+
70
+ # Clean up input
71
+ os.unlink(input_path)
72
+
73
+ if not success:
74
+ if os.path.exists(output_path):
75
+ os.unlink(output_path)
76
+ return None, f"❌ **Error:** {message}"
77
+
78
+ # Load result image
79
+ result_img = Image.open(output_path)
80
+
81
+ # Format success message
82
+ result_message = f"""
83
+ ✅ **Successfully Hidden!**
84
+
85
+ 📊 **Statistics:**
86
+ - **Data hidden:** {stats['data_size']:,} bytes ({len(secret_text):,} characters)
87
+ - **Image capacity:** {stats['capacity']:,} bytes
88
+ - **Utilization:** {stats['utilization']}
89
+ - **Encryption:** {"🔒 Yes" if stats['encrypted'] else "🔓 No"}
90
+ - **LSB depth:** {stats['bits_per_channel']} bit(s) per channel
91
+ - **Image dimensions:** {stats['image_size']}
92
+
93
+ 💾 **Download the image below** - your data is invisible to the naked eye!
94
+
95
+ ⚠️ **Important:**
96
+ - Save as PNG (JPEG re-encoding will destroy the hidden data)
97
+ - Keep your password safe if you used encryption
98
+ """
99
+
100
+ return result_img, result_message
101
+
102
+ except Exception as e:
103
+ if 'input_path' in locals() and os.path.exists(input_path):
104
+ os.unlink(input_path)
105
+ if 'output_path' in locals() and os.path.exists(output_path):
106
+ os.unlink(output_path)
107
+ return None, f"❌ **Error:** {str(e)}"
108
+
109
+
110
+ def detect_hidden_data(image, sensitivity):
111
+ """
112
+ Tab 2: Detect steganography using RAT Finder analysis
113
+ """
114
+ if image is None:
115
+ return None, "⚠️ Please upload an image to analyze"
116
+
117
+ try:
118
+ # Save uploaded image to temp file
119
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp:
120
+ img = Image.fromarray(image)
121
+ img.save(tmp.name, 'PNG')
122
+ image_path = tmp.name
123
+
124
+ # Map slider to sensitivity
125
+ sens_map = {1: 'low', 2: 'low', 3: 'low', 4: 'medium', 5: 'medium',
126
+ 6: 'medium', 7: 'high', 8: 'high', 9: 'high', 10: 'high'}
127
+ sensitivity_str = sens_map.get(sensitivity, 'medium')
128
+
129
+ # Perform analysis
130
+ confidence, details = rat_finder.analyze_image(image_path, sensitivity=sensitivity_str)
131
+
132
+ # Generate ELA visualization
133
+ ela_result = rat_finder.perform_ela_analysis(image_path)
134
+
135
+ # Clean up
136
+ os.unlink(image_path)
137
+
138
+ # Create confidence indicator
139
+ if confidence >= 70:
140
+ confidence_emoji = "🚨"
141
+ confidence_label = "HIGH SUSPICION"
142
+ elif confidence >= 40:
143
+ confidence_emoji = "⚠️"
144
+ confidence_label = "MODERATE SUSPICION"
145
+ else:
146
+ confidence_emoji = "✅"
147
+ confidence_label = "LOW SUSPICION"
148
+
149
+ # Format results
150
+ result_text = f"""
151
+ {confidence_emoji} **{confidence_label}**
152
+
153
+ 📊 **Confidence Score:** {confidence:.1f}%
154
+
155
+ 🔍 **Analysis Details:**
156
+ """
157
+
158
+ for detail in details:
159
+ result_text += f"\n• {detail}"
160
+
161
+ result_text += f"""
162
+
163
+ ---
164
+
165
+ **What does this mean?**
166
+
167
+ - **ELA (Error Level Analysis):** Highlights areas with different compression levels
168
+ - Bright areas = potential manipulation or hidden data
169
+ - Uniform appearance = likely unmodified
170
+
171
+ - **LSB Analysis:** Checks randomness in least significant bits
172
+ - **Histogram Analysis:** Looks for statistical anomalies
173
+ - **Metadata:** Examines EXIF data for suspicious tools
174
+ - **File Structure:** Checks for trailing data
175
+
176
+ 💡 **High confidence doesn't mean data is hidden** - just that anomalies exist.
177
+ Use the "Extract Data" tab if you suspect LSB steganography!
178
+ """
179
+
180
+ # Return ELA plot if available
181
+ if ela_result['success'] and ela_result['ela_image']:
182
+ return ela_result['ela_image'], result_text
183
+
184
+ return None, result_text
185
+
186
+ except Exception as e:
187
+ if 'image_path' in locals() and os.path.exists(image_path):
188
+ os.unlink(image_path)
189
+ return None, f"❌ **Error:** {str(e)}"
190
+
191
+
192
+ def extract_hidden_data(image, password, bits_per_channel):
193
+ """
194
+ Tab 2b: Extract data hidden with LSB steganography
195
+ """
196
+ if image is None:
197
+ return "⚠️ Please upload an image"
198
+
199
+ try:
200
+ # Save uploaded image to temp file
201
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp:
202
+ img = Image.fromarray(image)
203
+ img.save(tmp.name, 'PNG')
204
+ image_path = tmp.name
205
+
206
+ # Attempt extraction
207
+ pwd = password if password and len(password) > 0 else None
208
+ success, message, extracted_data = embedder.extract_data(
209
+ image_path,
210
+ password=pwd,
211
+ bits_per_channel=bits_per_channel
212
+ )
213
+
214
+ # Clean up
215
+ os.unlink(image_path)
216
+
217
+ if not success:
218
+ return f"❌ **{message}**\n\nPossible reasons:\n" \
219
+ f"• No data hidden in this image\n" \
220
+ f"• Wrong password (if encrypted)\n" \
221
+ f"• Wrong bits-per-channel setting\n" \
222
+ f"• Image was modified/re-saved"
223
+
224
+ result = f"""
225
+ ✅ **Data Successfully Extracted!**
226
+
227
+ 📝 **Hidden Message:**
228
+
229
+ ---
230
+ {extracted_data}
231
+ ---
232
+
233
+ 📊 **Extraction Info:**
234
+ - **Data size:** {len(extracted_data)} characters
235
+ - **Decryption:** {"🔒 Used" if pwd else "🔓 Not needed"}
236
+ - **LSB depth:** {bits_per_channel} bit(s) per channel
237
+
238
+ 💡 Copy the message above - it has been successfully recovered from the image!
239
+ """
240
+ return result
241
+
242
+ except Exception as e:
243
+ if 'image_path' in locals() and os.path.exists(image_path):
244
+ os.unlink(image_path)
245
+ return f"❌ **Error:** {str(e)}"
246
+
247
+
248
+ def check_image_corruption(image, sensitivity, check_visual):
249
+ """
250
+ Tab 3: Check for image corruption and validate integrity
251
+ """
252
+ if image is None:
253
+ return "⚠️ Please upload an image to check"
254
+
255
+ try:
256
+ # Save uploaded image to temp file
257
+ with tempfile.NamedTemporaryFile(delete=False, suffix='.png') as tmp:
258
+ img = Image.fromarray(image)
259
+ img.save(tmp.name, 'PNG')
260
+ image_path = tmp.name
261
+
262
+ # Map slider to sensitivity
263
+ sens_map = {1: 'low', 2: 'low', 3: 'low', 4: 'medium', 5: 'medium',
264
+ 6: 'medium', 7: 'high', 8: 'high', 9: 'high', 10: 'high'}
265
+ sensitivity_str = sens_map.get(sensitivity, 'medium')
266
+
267
+ # Validate image
268
+ is_valid = find_bad_images.is_valid_image(
269
+ image_path,
270
+ thorough=True,
271
+ sensitivity=sensitivity_str,
272
+ check_visual=check_visual
273
+ )
274
+
275
+ # Get diagnostic details
276
+ issue_type, issue_desc = find_bad_images.diagnose_image_issue(image_path)
277
+
278
+ # Clean up
279
+ os.unlink(image_path)
280
+
281
+ # Format results
282
+ if is_valid:
283
+ result = f"""
284
+ ✅ **IMAGE IS VALID**
285
+
286
+ The image passed all validation checks:
287
+ - ✅ File structure is intact
288
+ - ✅ Headers are valid
289
+ - ✅ No truncation detected
290
+ - ✅ Metadata is consistent
291
+ """
292
+ if check_visual:
293
+ result += "- ✅ No visual corruption detected\n"
294
+
295
+ result += "\n💚 **This image is safe to use!**"
296
+
297
+ else:
298
+ result = f"""
299
+ ⚠️ **ISSUES DETECTED**
300
+
301
+ The image has validation problems:
302
+
303
+ """
304
+ if issue_desc:
305
+ # diagnose_image_issue returns a single (issue_type, description) tuple
306
+ result += f"**{issue_type}:**\n{issue_desc}\n\n"
307
+ else:
308
+ result += "❌ Image failed validation but no specific issues identified.\n\n"
309
+
310
+ result += """
311
+ ---
312
+
313
+ **What to do:**
314
+ - Image may be corrupted or incomplete
315
+ - Try re-downloading the original file
316
+ - Check if the file was properly transferred
317
+ - Use image repair tools if needed
318
+ """
319
+
320
+ return result
321
+
322
+ except Exception as e:
323
+ if 'image_path' in locals() and os.path.exists(image_path):
324
+ os.unlink(image_path)
325
+ return f"❌ **Error:** {str(e)}"
326
+
327
+
328
+ # Create Gradio interface
329
+ with gr.Blocks(
330
+ title="2PAC: Picture Analyzer & Corruption Killer",
331
+ theme=gr.themes.Soft(
332
+ primary_hue="violet",
333
+ secondary_hue="blue",
334
+ )
335
+ ) as demo:
336
+
337
+ gr.Markdown("""
338
+ # 🔫 2PAC: Picture Analyzer & Corruption Killer
339
+
340
+ **Advanced image security and steganography toolkit**
341
+
342
+ Hide secret messages in images, detect hidden data, and validate image integrity.
343
+ """)
344
+
345
+ with gr.Tabs():
346
+
347
+ # TAB 1: Hide Data
348
+ with gr.Tab("🔒 Hide Secret Data"):
349
+ gr.Markdown("""
350
+ ## Hide Data in Image (LSB Steganography)
351
+
352
+ Invisibly hide text inside an image using Least Significant Bit encoding.
353
+ The image will look identical to the naked eye, but contains your secret message!
354
+ """)
355
+
356
+ with gr.Row():
357
+ with gr.Column(scale=1):
358
+ hide_input_image = gr.Image(
359
+ label="Upload Image",
360
+ type="numpy",
361
+ height=300
362
+ )
363
+ hide_secret_text = gr.Textbox(
364
+ label="Secret Text to Hide",
365
+ placeholder="Enter your secret message here...",
366
+ lines=5,
367
+ max_lines=10
368
+ )
369
+ with gr.Row():
370
+ hide_password = gr.Textbox(
371
+ label="Password (Optional - for encryption)",
372
+ placeholder="Leave empty for no encryption",
373
+ type="password"
374
+ )
375
+ hide_bits = gr.Slider(
376
+ minimum=1,
377
+ maximum=4,
378
+ value=1,
379
+ step=1,
380
+ label="LSB Depth (higher = more capacity, less subtle)",
381
+ info="1=subtle, 4=maximum capacity"
382
+ )
383
+
384
+ hide_button = gr.Button("🔒 Hide Data in Image", variant="primary", size="lg")
385
+
386
+ with gr.Column(scale=1):
387
+ hide_output_image = gr.Image(label="Result Image (Download This!)", height=300)
388
+ hide_output_text = gr.Markdown(label="Status")
389
+
390
+ hide_button.click(
391
+ fn=hide_data_in_image,
392
+ inputs=[hide_input_image, hide_secret_text, hide_password, hide_bits],
393
+ outputs=[hide_output_image, hide_output_text]
394
+ )
395
+
396
+ gr.Markdown("""
397
+ ---
398
+ **💡 Tips:**
399
+ - Use PNG images for best results (JPEG will destroy hidden data!)
400
+ - Larger images can hold more data
401
+ - Password encryption adds extra security layer
402
+ - LSB depth: 1-2 bits is undetectable, 3-4 bits provides more capacity
403
+ """)
404
+
405
+ # TAB 2: Detect & Extract
406
+ with gr.Tab("🔍 Detect & Extract Hidden Data"):
407
+ gr.Markdown("""
408
+ ## Detect Steganography & Extract Hidden Data
409
+
410
+ Use advanced analysis techniques to detect hidden data in images, or extract data hidden with this tool.
411
+ """)
412
+
413
+ with gr.Tabs():
414
+
415
+ # Sub-tab: Detection
416
+ with gr.Tab("🔎 Detect (Analysis)"):
417
+ gr.Markdown("""
418
+ ### Steganography Detection (RAT Finder)
419
+
420
+ Analyzes images for signs of hidden data using multiple techniques:
421
+ ELA, LSB analysis, histogram analysis, metadata inspection, and more.
422
+ """)
423
+
424
+ with gr.Row():
425
+ with gr.Column(scale=1):
426
+ detect_input_image = gr.Image(
427
+ label="Upload Image to Analyze",
428
+ type="numpy",
429
+ height=300
430
+ )
431
+ detect_sensitivity = gr.Slider(
432
+ minimum=1,
433
+ maximum=10,
434
+ value=5,
435
+ step=1,
436
+ label="Detection Sensitivity",
437
+ info="Higher = more thorough but more false positives"
438
+ )
439
+ detect_button = gr.Button("🔍 Analyze for Hidden Data", variant="primary", size="lg")
440
+
441
+ with gr.Column(scale=1):
442
+ detect_output_image = gr.Image(label="ELA Visualization", height=300)
443
+ detect_output_text = gr.Markdown(label="Analysis Results")
444
+
445
+ detect_button.click(
446
+ fn=detect_hidden_data,
447
+ inputs=[detect_input_image, detect_sensitivity],
448
+ outputs=[detect_output_image, detect_output_text]
449
+ )
450
+
451
+ # Sub-tab: Extraction
452
+ with gr.Tab("📤 Extract Data"):
453
+ gr.Markdown("""
454
+ ### Extract Hidden Data (LSB Extraction)
455
+
456
+ If you have an image created with the "Hide Data" tool, extract the hidden message here.
457
+ """)
458
+
459
+ with gr.Row():
460
+ with gr.Column(scale=1):
461
+ extract_input_image = gr.Image(
462
+ label="Upload Image with Hidden Data",
463
+ type="numpy",
464
+ height=300
465
+ )
466
+ with gr.Row():
467
+ extract_password = gr.Textbox(
468
+ label="Password (if encrypted)",
469
+ placeholder="Leave empty if not encrypted",
470
+ type="password"
471
+ )
472
+ extract_bits = gr.Slider(
473
+ minimum=1,
474
+ maximum=4,
475
+ value=1,
476
+ step=1,
477
+ label="LSB Depth (must match encoding)",
478
+ info="Use same value as when hiding"
479
+ )
480
+ extract_button = gr.Button("📤 Extract Hidden Data", variant="primary", size="lg")
481
+
482
+ with gr.Column(scale=1):
483
+ extract_output_text = gr.Markdown(label="Extracted Data")
484
+
485
+ extract_button.click(
486
+ fn=extract_hidden_data,
487
+ inputs=[extract_input_image, extract_password, extract_bits],
488
+ outputs=[extract_output_text]
489
+ )
490
+
491
+ # TAB 3: Check Corruption
492
+ with gr.Tab("🛡️ Check Image Integrity"):
493
+ gr.Markdown("""
494
+ ## Image Corruption & Validation
495
+
496
+ Thoroughly validate image files for corruption, truncation, and structural issues.
497
+ Detects damaged headers, incomplete data, and visual artifacts.
498
+ """)
499
+
500
+ with gr.Row():
501
+ with gr.Column(scale=1):
502
+ check_input_image = gr.Image(
503
+ label="Upload Image to Validate",
504
+ type="numpy",
505
+ height=300
506
+ )
507
+ with gr.Row():
508
+ check_sensitivity = gr.Slider(
509
+ minimum=1,
510
+ maximum=10,
511
+ value=5,
512
+ step=1,
513
+ label="Validation Sensitivity",
514
+ info="Higher = more strict validation"
515
+ )
516
+ check_visual = gr.Checkbox(
517
+ label="Check for Visual Corruption",
518
+ value=True,
519
+ info="Slower but detects visual artifacts"
520
+ )
521
+ check_button = gr.Button("🛡️ Validate Image", variant="primary", size="lg")
522
+
523
+ with gr.Column(scale=1):
524
+ check_output_text = gr.Markdown(label="Validation Results")
525
+
526
+ check_button.click(
527
+ fn=check_image_corruption,
528
+ inputs=[check_input_image, check_sensitivity, check_visual],
529
+ outputs=[check_output_text]
530
+ )
531
+
532
+ gr.Markdown("""
533
+ ---
534
+ **🔍 Checks Performed:**
535
+ - ✅ File format validation (JPEG, PNG, GIF, etc.)
536
+ - ✅ Header integrity
537
+ - ✅ Data completeness
538
+ - ✅ Metadata consistency
539
+ - ✅ Visual corruption detection (black/gray regions)
540
+ - ✅ Structure validation
541
+ """)
542
+
543
+ gr.Markdown("""
544
+ ---
545
+
546
+ ## About 2PAC
547
+
548
+ **2PAC** (Picture Analyzer & Corruption Killer) is a comprehensive image security toolkit combining:
549
+ - **LSB Steganography**: Hide and extract secret messages in images
550
+ - **RAT Finder**: Advanced steganography detection using 7+ analysis techniques
551
+ - **Image Validation**: Detect corruption and structural issues
552
+
553
+ 🔗 **GitHub:** [github.com/ricyoung/2pac](https://github.com/ricyoung/2pac)
554
+ 🌐 **More Tools:** [demo.deepneuro.ai](https://demo.deepneuro.ai)
555
+
556
+ ---
557
+
558
+ *Built with ❤️ by DeepNeuro.AI | Powered by Gradio & Hugging Face Spaces*
559
+ """)
560
+
561
+
562
+ if __name__ == "__main__":
563
+ demo.launch()
find_bad_images.py ADDED
@@ -0,0 +1,1670 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ 2PAC: The Picture Analyzer & Corruption Killer
4
+ Author: Richard Young
5
+ License: MIT
6
+
7
+ In memory of Jeff Young, who loved Tupac's music and lived by his values of helping others.
8
+ Like Tupac, Jeff believed in bringing people together and always lending a hand to those in need.
9
+ May your photos always be as clear as the memories they capture, and may we all strive to help others as Jeff did.
10
+ """
11
+
12
+ import os
13
+ import argparse
14
+ import concurrent.futures
15
+ import sys
16
+ import time
17
+ import io
18
+ import json
19
+ import shutil
20
+ import hashlib
21
+ import struct
22
+ import tempfile
23
+ import subprocess
24
+ import random
25
+ from datetime import datetime
26
+ from pathlib import Path
27
+ from PIL import Image, ImageFile, UnidentifiedImageError
28
+ from tqdm import tqdm
29
+ import tqdm.auto as tqdm_auto
30
+ import colorama
31
+ import humanize
32
+ import logging
33
+
34
+ # Import 2PAC quotes
35
+ try:
36
+ from quotes import QUOTES
37
+ except ImportError:
38
+ # Default quotes if file is missing
39
+ QUOTES = ["All Eyez On Your Images."]
40
+
41
+ # Initialize colorama (required for Windows)
42
+ colorama.init()
43
+
44
+ # Allow loading of truncated images for repair attempts
45
+ ImageFile.LOAD_TRUNCATED_IMAGES = True
46
+
47
+ # Dictionary of supported image formats with their extensions
48
+ SUPPORTED_FORMATS = {
49
+ 'JPEG': ('.jpg', '.jpeg', '.jpe', '.jif', '.jfif', '.jfi'),
50
+ 'PNG': ('.png',),
51
+ 'GIF': ('.gif',),
52
+ 'TIFF': ('.tiff', '.tif'),
53
+ 'BMP': ('.bmp', '.dib'),
54
+ 'WEBP': ('.webp',),
55
+ 'ICO': ('.ico',),
56
+ 'HEIC': ('.heic',),
57
+ }
58
+
59
+ # Default formats (all supported formats)
60
+ DEFAULT_FORMATS = list(SUPPORTED_FORMATS.keys())
61
+
62
+ # List of formats that can potentially be repaired
63
+ REPAIRABLE_FORMATS = ['JPEG', 'PNG', 'GIF']
64
+
65
+ # Default progress directory
66
+ DEFAULT_PROGRESS_DIR = os.path.expanduser("~/.bad_image_finder/progress")
67
+
68
+ # Current version
69
+ VERSION = "1.5.1"
70
+
71
+ # Security: Maximum file size to process (100MB) to prevent DoS
72
+ MAX_FILE_SIZE = 100 * 1024 * 1024
73
+
74
+ # Security: Maximum image dimensions (50 megapixels) to prevent decompression bombs
75
+ MAX_IMAGE_PIXELS = 50000 * 50000
76
+
77
+ def setup_logging(verbose, no_color=False):
78
+ level = logging.DEBUG if verbose else logging.INFO
79
+
80
+ # Define color codes
81
+ if not no_color:
82
+ # Color scheme
83
+ COLORS = {
84
+ 'DEBUG': colorama.Fore.CYAN,
85
+ 'INFO': colorama.Fore.GREEN,
86
+ 'WARNING': colorama.Fore.YELLOW,
87
+ 'ERROR': colorama.Fore.RED,
88
+ 'CRITICAL': colorama.Fore.MAGENTA + colorama.Style.BRIGHT,
89
+ 'RESET': colorama.Style.RESET_ALL
90
+ }
91
+
92
+ # Custom formatter with colors
93
+ class ColoredFormatter(logging.Formatter):
94
+ def format(self, record):
95
+ levelname = record.levelname
96
+ if levelname in COLORS:
97
+ record.levelname = f"{COLORS[levelname]}{levelname}{COLORS['RESET']}"
98
+ record.msg = f"{COLORS[levelname]}{record.msg}{COLORS['RESET']}"
99
+ return super().format(record)
100
+
101
+ formatter = ColoredFormatter('%(asctime)s - %(levelname)s - %(message)s')
102
+ else:
103
+ formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
104
+
105
+ handler = logging.StreamHandler()
106
+ handler.setFormatter(formatter)
107
+
108
+ logging.basicConfig(
109
+ level=level,
110
+ handlers=[handler]
111
+ )
112
+
113
+ def diagnose_image_issue(file_path):
114
+ """
115
+ Attempts to diagnose what's wrong with the image.
116
+ Returns: (error_type, details)
117
+ """
118
+ try:
119
+ with open(file_path, 'rb') as f:
120
+ header = f.read(16) # Read first 16 bytes
121
+
122
+ # Check for zero-byte file
123
+ if len(header) == 0:
124
+ return "empty_file", "File is empty (0 bytes)"
125
+
126
+ # Check for correct JPEG header
127
+ if file_path.lower().endswith(SUPPORTED_FORMATS['JPEG']):
128
+ if not (header.startswith(b'\xff\xd8\xff')):
129
+ return "invalid_header", "Invalid JPEG header"
130
+
131
+ # Check for correct PNG header
132
+ elif file_path.lower().endswith(SUPPORTED_FORMATS['PNG']):
133
+ if not header.startswith(b'\x89PNG\r\n\x1a\n'):
134
+ return "invalid_header", "Invalid PNG header"
135
+
136
+ # Try to open with PIL for more detailed diagnosis
137
+ try:
138
+ with Image.open(file_path) as img:
139
+ img.verify()
140
+ except Exception as e:
141
+ error_str = str(e).lower()
142
+
143
+ if "truncated" in error_str:
144
+ return "truncated", "File is truncated"
145
+ elif "corrupt" in error_str:
146
+ return "corrupt_data", "Data corruption detected"
147
+ elif "incorrect mode" in error_str or "decoder" in error_str:
148
+ return "decoder_issue", "Image decoder issue"
149
+ else:
150
+ return "unknown", f"Unknown issue: {str(e)}"
151
+
152
+ # Now try to load the data
153
+ try:
154
+ with Image.open(file_path) as img:
155
+ img.load()
156
+ except Exception as e:
157
+ return "data_load_failed", f"Image data couldn't be loaded: {str(e)}"
158
+
159
+ # If we got here, there's some other issue
160
+ return "unknown", "Unknown issue"
161
+
162
+ except Exception as e:
163
+ return "access_error", f"Error accessing file: {str(e)}"
164
+
165
+ def check_jpeg_structure(file_path):
166
+ """
167
+ Performs a deep check of JPEG file structure to find corruption that PIL might miss.
168
+ Returns (is_valid, error_message)
169
+ """
170
+ try:
171
+ with open(file_path, 'rb') as f:
172
+ data = f.read()
173
+
174
+ # Check for correct JPEG header (SOI marker)
175
+ if not data.startswith(b'\xFF\xD8'):
176
+ return False, "Invalid JPEG header (missing SOI marker)"
177
+
178
+ # Check for proper EOI marker at the end
179
+ if not data.endswith(b'\xFF\xD9'):
180
+ return False, "Missing EOI marker at end of file"
181
+
182
+ # Check for key JPEG segments
183
+ # SOF marker (Start of Frame) - At least one should be present
184
+ sof_markers = [b'\xFF\xC0', b'\xFF\xC1', b'\xFF\xC2', b'\xFF\xC3']
185
+ has_sof = any(marker in data for marker in sof_markers)
186
+ if not has_sof:
187
+ return False, "No Start of Frame (SOF) marker found"
188
+
189
+ # Check for SOS marker (Start of Scan)
190
+ if b'\xFF\xDA' not in data:
191
+ return False, "No Start of Scan (SOS) marker found"
192
+
193
+ # Scan through the file to check marker structure
194
+ i = 2 # Skip SOI marker
195
+ while i < len(data) - 1:
196
+ if data[i] == 0xFF and data[i+1] != 0x00 and data[i+1] != 0xFF:
197
+ # Found a marker
198
+ marker = data[i:i+2]
199
+
200
+ # For markers with length fields, validate length
201
+ if (0xC0 <= data[i+1] <= 0xCF and data[i+1] != 0xC4 and data[i+1] != 0xC8) or \
202
+ (0xDB <= data[i+1] <= 0xFE):
203
+ if i + 4 >= len(data):
204
+ return False, f"Truncated marker {data[i+1]:02X} at position {i}"
205
+ length = struct.unpack('>H', data[i+2:i+4])[0]
206
+ if i + 2 + length > len(data):
207
+ return False, f"Invalid segment length for marker {data[i+1]:02X}"
208
+ i += 2 + length
209
+ continue
210
+
211
+ # Move to next byte
212
+ i += 1
213
+
214
+ return True, "JPEG structure appears valid"
215
+ except Exception as e:
216
+ return False, f"Error during JPEG structure check: {str(e)}"
217
+
218
+ def check_png_structure(file_path):
219
+ """
220
+ Performs a deep check of PNG file structure to find corruption.
221
+ Returns (is_valid, error_message)
222
+ """
223
+ try:
224
+ with open(file_path, 'rb') as f:
225
+ data = f.read()
226
+
227
+ # Check for PNG signature
228
+ png_signature = b'\x89PNG\r\n\x1a\n'
229
+ if not data.startswith(png_signature):
230
+ return False, "Invalid PNG signature"
231
+
232
+ # Check minimum viable PNG (signature + IHDR chunk)
233
+ if len(data) < 8 + 12: # 8 bytes signature + 12 bytes min IHDR chunk
234
+ return False, "PNG file too small to contain valid header"
235
+
236
+ # Check for IEND chunk at the end
237
+ if not data.endswith(b'IEND\xaeB`\x82'):
238
+ return False, "Missing IEND chunk at end of file"
239
+
240
+ # Parse chunks
241
+ pos = 8 # Skip signature
242
+ required_chunks = {'IHDR': False}
243
+
244
+ while pos < len(data):
245
+ if pos + 8 > len(data):
246
+ return False, "Truncated chunk header"
247
+
248
+ # Read chunk length and type
249
+ chunk_len = struct.unpack('>I', data[pos:pos+4])[0]
250
+ chunk_type = data[pos+4:pos+8].decode('ascii', errors='replace')
251
+
252
+ # Validate chunk length
253
+ if pos + chunk_len + 12 > len(data):
254
+ return False, f"Truncated {chunk_type} chunk"
255
+
256
+ # Track required chunks
257
+ if chunk_type in required_chunks:
258
+ required_chunks[chunk_type] = True
259
+
260
+ # Special validation for IHDR chunk
261
+ if chunk_type == 'IHDR' and chunk_len != 13:
262
+ return False, "Invalid IHDR chunk length"
263
+
264
+ # Mandatory IHDR must be first chunk
265
+ if pos == 8 and chunk_type != 'IHDR':
266
+ return False, "First chunk must be IHDR"
267
+
268
+ # IEND must be the last chunk
269
+ if chunk_type == 'IEND' and pos + chunk_len + 12 != len(data):
270
+ return False, "Data after IEND chunk"
271
+
272
+ # Move to next chunk
273
+ pos += chunk_len + 12 # Length (4) + Type (4) + Data (chunk_len) + CRC (4)
274
+
275
+ # Verify required chunks
276
+ for chunk, present in required_chunks.items():
277
+ if not present:
278
+ return False, f"Missing required {chunk} chunk"
279
+
280
+ return True, "PNG structure appears valid"
281
+ except Exception as e:
282
+ return False, f"Error during PNG structure check: {str(e)}"
283
+
284
+ def validate_subprocess_path(file_path):
285
+ """
286
+ Validate file path before passing to subprocess to prevent command injection.
287
+
288
+ Args:
289
+ file_path: Path to validate
290
+
291
+ Returns:
292
+ True if path is safe
293
+
294
+ Raises:
295
+ ValueError: If path contains dangerous characters or patterns
296
+ """
297
+ import re
298
+
299
+ # Must be an absolute path
300
+ if not os.path.isabs(file_path):
301
+ raise ValueError(f"Path must be absolute: {file_path}")
302
+
303
+ # File must exist
304
+ if not os.path.exists(file_path):
305
+ raise ValueError(f"File does not exist: {file_path}")
306
+
307
+ # Check for shell metacharacters and dangerous patterns
308
+ # Allow: alphanumeric, spaces, dots, dashes, underscores, forward slashes
309
+ # Block: semicolons, pipes, backticks, $, &, >, <, etc.
310
+ dangerous_chars = ['`', '$', '&', '|', ';', '>', '<', '\n', '\r', '(', ')']
311
+ for char in dangerous_chars:
312
+ if char in file_path:
313
+ raise ValueError(f"Dangerous character '{char}' found in path: {file_path}")
314
+
315
+ # Block path traversal attempts
316
+ if '..' in file_path:
317
+ raise ValueError(f"Path traversal pattern '..' detected: {file_path}")
318
+
319
+ # Block null bytes
320
+ if '\x00' in file_path:
321
+ raise ValueError("Null byte detected in path")
322
+
323
+ return True
324
+
325
+
326
+ def try_external_tools(file_path):
327
+ """
328
+ Try using external tools to validate the image if they're available.
329
+ Returns (is_valid, message)
330
+
331
+ Security: Validates file path before passing to subprocess to prevent
332
+ command injection attacks.
333
+ """
334
+ # Validate path before passing to subprocess
335
+ try:
336
+ validate_subprocess_path(file_path)
337
+ except ValueError as e:
338
+ logging.warning(f"Skipping external tool validation due to security check: {e}")
339
+ return True, "External tools check skipped (security)"
340
+
341
+ # Try using exiftool if available
342
+ try:
343
+ result = subprocess.run(['exiftool', '-m', '-p', '$Error', file_path],
344
+ capture_output=True, text=True, timeout=5)
345
+ if result.returncode == 0 and result.stdout.strip():
346
+ return False, f"Exiftool error: {result.stdout.strip()}"
347
+
348
+ # Check with identify (ImageMagick) if available
349
+ result = subprocess.run(['identify', '-verbose', file_path],
350
+ capture_output=True, text=True, timeout=5)
351
+ if result.returncode != 0:
352
+ return False, "ImageMagick identify failed to read the image"
353
+
354
+ return True, "Passed external tool validation"
355
+ except (subprocess.SubprocessError, FileNotFoundError):
356
+ # External tools not available or failed
357
+ return True, "External tools check skipped"
358
+
359
+ def try_full_decode_check(file_path):
360
+ """
361
+ Try to fully decode the image to a temporary file.
362
+ This catches more subtle corruption that might otherwise be missed.
363
+ """
364
+ try:
365
+ # For JPEGs, try to decode and re-encode the image
366
+ with Image.open(file_path) as img:
367
+ # Create a temporary file for testing
368
+ with tempfile.NamedTemporaryFile(delete=True) as tmp:
369
+ # Try to save a decoded copy
370
+ img.save(tmp.name, format="BMP")
371
+
372
+ # If we get here, the image data could be fully decoded
373
+ return True, "Full decode test passed"
374
+ except Exception as e:
375
+ return False, f"Full decode test failed: {str(e)}"
376
+
377
+ def check_visual_corruption(file_path, block_threshold=0.20, uniform_threshold=10, strict_mode=False):
378
+ """
379
+ Analyze image content to detect visual corruption like large uniform areas.
380
+
381
+ Args:
382
+ file_path: Path to the image file
383
+ block_threshold: Percentage of image that must be uniform to be considered corrupt (0.0-1.0)
384
+ uniform_threshold: Color variation threshold for considering pixels "uniform"
385
+ strict_mode: If True, only detect gray/black areas as corruption indicators
386
+
387
+ Returns:
388
+ (is_visually_corrupt, details)
389
+ """
390
+ try:
391
+ with Image.open(file_path) as img:
392
+ # Get image dimensions
393
+ width, height = img.size
394
+ total_pixels = width * height
395
+
396
+ # Convert to RGB to ensure consistent analysis
397
+ if img.mode != "RGB":
398
+ img = img.convert("RGB")
399
+
400
+ # Sample the image (analyzing every pixel would be too slow)
401
+ # We'll create a grid of sample points - we'll use more samples for more accuracy
402
+ sample_step = max(1, min(width, height) // 150) # Adjust based on image size
403
+
404
+ # Track unique colors and their counts
405
+ color_counts = {}
406
+ total_samples = 0
407
+
408
+ # Sample the image
409
+ for y in range(0, height, sample_step):
410
+ for x in range(0, width, sample_step):
411
+ total_samples += 1
412
+ pixel = img.getpixel((x, y))
413
+
414
+ # Round pixel values to reduce sensitivity to minor variations
415
+ rounded_pixel = (
416
+ pixel[0] // uniform_threshold * uniform_threshold,
417
+ pixel[1] // uniform_threshold * uniform_threshold,
418
+ pixel[2] // uniform_threshold * uniform_threshold
419
+ )
420
+
421
+ if rounded_pixel in color_counts:
422
+ color_counts[rounded_pixel] += 1
423
+ else:
424
+ color_counts[rounded_pixel] = 1
425
+
426
+ # Find the most common color
427
+ most_common_color = max(color_counts.items(), key=lambda x: x[1])
428
+ most_common_percentage = most_common_color[1] / total_samples
429
+
430
+ # Check for large blocks of uniform color (potential corruption)
431
+ if most_common_percentage > block_threshold:
432
+ # Calculate approximate percentage of the image affected
433
+ affected_pct = most_common_percentage * 100
434
+ color_value = most_common_color[0]
435
+
436
+ # Determine if this is likely corruption
437
+ # Gray/black areas are common in corruption
438
+ is_dark = sum(color_value) < 3 * uniform_threshold # Very dark areas
439
+
440
+ # Check if it's a gray area (equal R,G,B values)
441
+ is_gray = abs(color_value[0] - color_value[1]) < uniform_threshold and \
442
+ abs(color_value[1] - color_value[2]) < uniform_threshold and \
443
+ abs(color_value[0] - color_value[2]) < uniform_threshold
444
+
445
+ # Only consider mid-range grays as corruption indicators (not white/black)
446
+ is_mid_gray = is_gray and 30 < sum(color_value)/3 < 220
447
+
448
+ # Special case: almost pure white is often legitimate content
449
+ is_white = color_value[0] > 240 and color_value[1] > 240 and color_value[2] > 240
450
+
451
+ # Determine likelihood of corruption based on color and percentage
452
+ if (is_dark or is_mid_gray) and not is_white:
453
+ # Higher threshold for white areas since they're common in legitimate images
454
+ white_threshold = 0.4 # 40% of image
455
+ if is_white and most_common_percentage < white_threshold:
456
+ return False, f"Large white area ({affected_pct:.1f}%) but likely not corruption"
457
+
458
+ # More likely to be corruption
459
+ return True, f"Visual corruption detected: {affected_pct:.1f}% of image is uniform {color_value}"
460
+ else:
461
+ # Could be a legitimate image with a uniform background
462
+ return False, f"Large uniform area ({affected_pct:.1f}%) but likely not corruption"
463
+
464
+ # Check for other telltale signs of corruption - but only in strict mode
465
+ if strict_mode:
466
+ # 1. Excessive color blocks (fragmentation) - this works well for detecting noise
467
+ if len(color_counts) > total_samples * 0.85 and total_samples > 200:
468
+ return True, f"Excessive color fragmentation detected ({len(color_counts)} colors in {total_samples} samples)"
469
+
470
+ # 2. Check for very specific corruption patterns
471
+ # Analyze distribution of colors to look for unusual patterns
472
+ if total_samples > 500: # Only for larger images with enough samples
473
+ # Check if there's an unnatural color distribution
474
+ # Normal photos have a more gradual distribution rather than spikes
475
+ sorted_counts = sorted(color_counts.values(), reverse=True)
476
+
477
+ # Calculate the color distribution ratio
478
+ if len(sorted_counts) > 5:
479
+ top5_ratio = sum(sorted_counts[:5]) / sum(sorted_counts)
480
+ # Usually, the top 5 colors shouldn't dominate more than 80% of the image
481
+ # unless it's a graphic or very simple image
482
+ if top5_ratio < 0.2 and most_common_percentage < 0.1:
483
+ return True, f"Unusual color distribution (possible noise/corruption)"
484
+
485
+ return False, "No visual corruption detected"
486
+
487
+ except Exception as e:
488
+ return False, f"Error during visual analysis: {str(e)}"
489
+
490
+ def is_valid_image(file_path, thorough=True, sensitivity='medium', ignore_eof=False, check_visual=False, visual_strictness='medium'):
491
+ """
492
+ Validate image file integrity using multiple methods.
493
+
494
+ Args:
495
+ file_path: Path to the image file
496
+ thorough: Whether to perform deep structure validation
497
+ sensitivity: 'low', 'medium', or 'high'
498
+ ignore_eof: Whether to ignore missing end-of-file markers
499
+ check_visual: Whether to perform visual content analysis to detect corruption
500
+ visual_strictness: 'low', 'medium', or 'high' strictness for visual corruption detection
501
+
502
+ Returns:
503
+ True if valid, False if corrupt.
504
+ """
505
+ # Basic PIL validation first (fast check)
506
+ try:
507
+ with Image.open(file_path) as img:
508
+ # verify() checks the file header
509
+ img.verify()
510
+
511
+ # Additional step: try to load the image data
512
+ # This catches more corruption issues
513
+ with Image.open(file_path) as img2:
514
+ img2.load()
515
+
516
+ # If check_visual is enabled, analyze the image content
517
+ if check_visual:
518
+ # Set thresholds based on strictness level
519
+ if visual_strictness == 'low':
520
+ # More permissive - only detect very obvious corruption
521
+ block_threshold = 0.3 # 30% of the image must be uniform
522
+ uniform_threshold = 5 # Smaller color variations are allowed
523
+ elif visual_strictness == 'high':
524
+ # Most strict - catches subtle corruption but may have false positives
525
+ block_threshold = 0.15 # Only 15% of the image needs to be uniform
526
+ uniform_threshold = 15 # Larger color variations are considered uniform
527
+ else: # medium (default)
528
+ block_threshold = 0.20 # 20% threshold
529
+ uniform_threshold = 10
530
+
531
+ # Check for visual corruption with appropriate thresholds
532
+ is_visually_corrupt, msg = check_visual_corruption(
533
+ file_path,
534
+ block_threshold=block_threshold,
535
+ uniform_threshold=uniform_threshold,
536
+ # Only use additional detection methods in high strictness mode
537
+ strict_mode=(visual_strictness == 'high')
538
+ )
539
+
540
+ if is_visually_corrupt:
541
+ logging.debug(f"Visual corruption detected in {file_path}: {msg}")
542
+ return False
543
+
544
+ # If thorough checking is disabled, return after basic check
545
+ if not thorough or sensitivity == 'low':
546
+ return True
547
+
548
+ # For JPEG files, do additional structure checking
549
+ if file_path.lower().endswith(tuple(SUPPORTED_FORMATS['JPEG'])):
550
+ # Check JPEG structure
551
+ is_valid, error_msg = check_jpeg_structure(file_path)
552
+ if not is_valid:
553
+ # If ignore_eof is enabled and the only issue is missing EOI marker, consider it valid
554
+ if ignore_eof and error_msg == "Missing EOI marker at end of file":
555
+ logging.debug(f"Ignoring missing EOI marker for {file_path} as requested")
556
+ else:
557
+ logging.debug(f"JPEG structure invalid for {file_path}: {error_msg}")
558
+ return False
559
+
560
+ # Try full decode test (catches subtle corruption)
561
+ is_valid, error_msg = try_full_decode_check(file_path)
562
+ if not is_valid:
563
+ logging.debug(f"Full decode test failed for {file_path}: {error_msg}")
564
+ return False
565
+
566
+ # Try external tools if applicable
567
+ is_valid, error_msg = try_external_tools(file_path)
568
+ if not is_valid:
569
+ logging.debug(f"External tool validation failed for {file_path}: {error_msg}")
570
+ return False
571
+
572
+ # For PNG files, do additional structure checking
573
+ elif file_path.lower().endswith(tuple(SUPPORTED_FORMATS['PNG'])):
574
+ # Check PNG structure
575
+ is_valid, error_msg = check_png_structure(file_path)
576
+ if not is_valid:
577
+ logging.debug(f"PNG structure invalid for {file_path}: {error_msg}")
578
+ return False
579
+
580
+ # Try full decode test (catches subtle corruption)
581
+ is_valid, error_msg = try_full_decode_check(file_path)
582
+ if not is_valid:
583
+ logging.debug(f"Full decode test failed for {file_path}: {error_msg}")
584
+ return False
585
+
586
+ return True
587
+ except Exception as e:
588
+ logging.debug(f"Invalid image {file_path}: {str(e)}")
589
+ return False
590
+
591
+ def attempt_repair(file_path, backup_dir=None):
592
+ """
593
+ Attempts to repair corrupt image files.
594
+ Returns: (success, message, fixed_width, fixed_height)
595
+ """
596
+ # Create backup if requested
597
+ if backup_dir:
598
+ backup_path = os.path.join(backup_dir, os.path.basename(file_path) + ".bak")
599
+ try:
600
+ shutil.copy2(file_path, backup_path)
601
+ logging.debug(f"Created backup at {backup_path}")
602
+ except Exception as e:
603
+ logging.warning(f"Could not create backup: {str(e)}")
604
+
605
+ try:
606
+ # First, diagnose the issue
607
+ issue_type, details = diagnose_image_issue(file_path)
608
+ logging.debug(f"Diagnosis for {file_path}: {issue_type} - {details}")
609
+
610
+ file_ext = os.path.splitext(file_path)[1].lower()
611
+
612
+ # Check if file format is supported for repair
613
+ format_supported = False
614
+ for fmt in REPAIRABLE_FORMATS:
615
+ if file_ext in SUPPORTED_FORMATS[fmt]:
616
+ format_supported = True
617
+ break
618
+
619
+ if not format_supported:
620
+ return False, f"Format not supported for repair ({file_ext})", None, None
621
+
622
+ # Try to open and resave the image with PIL's error forgiveness
623
+ # This works for many truncated files
624
+ try:
625
+ with Image.open(file_path) as img:
626
+ width, height = img.size
627
+ format = img.format
628
+
629
+ # Create a buffer for the fixed image
630
+ buffer = io.BytesIO()
631
+ img.save(buffer, format=format)
632
+
633
+ # Write the repaired image back to the original file
634
+ with open(file_path, 'wb') as f:
635
+ f.write(buffer.getvalue())
636
+
637
+ # Verify the repaired image
638
+ if is_valid_image(file_path):
639
+ return True, f"Repaired {issue_type} issue", width, height
640
+ else:
641
+ # If verification fails, try again with JPEG specific options for JPEG files
642
+ if format == 'JPEG':
643
+ with Image.open(file_path) as img:
644
+ buffer = io.BytesIO()
645
+ # Use optimize=True and quality=85 for better repair chances
646
+ img.save(buffer, format='JPEG', optimize=True, quality=85)
647
+ with open(file_path, 'wb') as f:
648
+ f.write(buffer.getvalue())
649
+
650
+ if is_valid_image(file_path):
651
+ return True, f"Repaired {issue_type} issue with JPEG optimization", width, height
652
+
653
+ return False, f"Failed to repair {issue_type} issue", None, None
654
+
655
+ except Exception as e:
656
+ logging.debug(f"Repair attempt failed for {file_path}: {str(e)}")
657
+ return False, f"Repair failed: {str(e)}", None, None
658
+
659
+ except Exception as e:
660
+ logging.debug(f"Error during repair of {file_path}: {str(e)}")
661
+ return False, f"Repair error: {str(e)}", None, None
662
+
663
+ def process_file(args):
664
+ """Process a single image file."""
665
+ file_path, repair_mode, repair_dir, thorough_check, sensitivity, ignore_eof, check_visual, visual_strictness, enable_security_checks = args
666
+
667
+ # Security validation (if enabled)
668
+ if enable_security_checks:
669
+ try:
670
+ is_safe, warnings = validate_file_security(file_path, check_size=True, check_dimensions=True)
671
+
672
+ # Log security warnings
673
+ for warning in warnings:
674
+ logging.warning(f"Security warning for {file_path}: {warning}")
675
+
676
+ if not is_safe:
677
+ # File failed security checks - treat as invalid
678
+ size = os.path.getsize(file_path)
679
+ return file_path, False, size, "security_failed", "Failed security validation", None
680
+
681
+ except ValueError as e:
682
+ # Critical security failure (file too large, dimensions too big, etc.)
683
+ logging.error(f"Security check failed for {file_path}: {e}")
684
+ size = os.path.getsize(file_path) if os.path.exists(file_path) else 0
685
+ return file_path, False, size, "security_failed", str(e), None
686
+ except Exception as e:
687
+ # Unexpected error during security validation
688
+ logging.debug(f"Security validation error for {file_path}: {e}")
689
+ # Continue processing anyway for this case
690
+
691
+ # Check if the image is valid
692
+ is_valid = is_valid_image(file_path, thorough=thorough_check, sensitivity=sensitivity,
693
+ ignore_eof=ignore_eof, check_visual=check_visual, visual_strictness=visual_strictness)
694
+
695
+ if not is_valid and repair_mode:
696
+ # Try to repair the file
697
+ repair_success, repair_msg, width, height = attempt_repair(file_path, repair_dir)
698
+
699
+ if repair_success:
700
+ # File was repaired
701
+ return file_path, True, 0, "repaired", repair_msg, (width, height)
702
+ else:
703
+ # File is still corrupt
704
+ size = os.path.getsize(file_path)
705
+ return file_path, False, size, "repair_failed", repair_msg, None
706
+ else:
707
+ # No repair attempted or file is valid
708
+ size = os.path.getsize(file_path) if not is_valid else 0
709
+ return file_path, is_valid, size, "not_repaired", None, None
710
+
711
+ def get_session_id(directory, formats, recursive):
712
+ """Generate a unique session ID based on scan parameters."""
713
+ # Create a unique identifier for this scan session
714
+ dir_path = str(directory).encode('utf-8')
715
+ formats_str = ",".join(sorted(formats)).encode('utf-8')
716
+ recursive_str = str(recursive).encode('utf-8')
717
+
718
+ # Use SHA256 instead of MD5 for better security
719
+ # MD5 is cryptographically broken and should not be used
720
+ hash_obj = hashlib.sha256()
721
+ hash_obj.update(dir_path)
722
+ hash_obj.update(formats_str)
723
+ hash_obj.update(recursive_str)
724
+
725
+ return hash_obj.hexdigest()[:16] # Use first 16 chars of hash for uniqueness
726
+
727
+ def _deduplicate(seq):
728
+ """Return a list with duplicates removed while preserving order."""
729
+ seen = set()
730
+ deduped = []
731
+ for item in seq:
732
+ if item not in seen:
733
+ deduped.append(item)
734
+ seen.add(item)
735
+ return deduped
736
+
737
+
738
+ def validate_file_security(file_path, check_size=True, check_dimensions=True):
739
+ """
740
+ Perform security validation on a file before processing.
741
+
742
+ Args:
743
+ file_path: Path to the file
744
+ check_size: Whether to check file size limits
745
+ check_dimensions: Whether to check image dimension limits
746
+
747
+ Returns:
748
+ (is_safe, warnings) - tuple of boolean and list of warning messages
749
+
750
+ Raises:
751
+ ValueError: If file fails critical security checks
752
+ """
753
+ warnings = []
754
+
755
+ # Check if file exists
756
+ if not os.path.exists(file_path):
757
+ raise ValueError(f"File does not exist: {file_path}")
758
+
759
+ # Check file size to prevent DoS via huge files
760
+ if check_size:
761
+ file_size = os.path.getsize(file_path)
762
+ if file_size > MAX_FILE_SIZE:
763
+ raise ValueError(f"File too large ({file_size} bytes, max {MAX_FILE_SIZE}). "
764
+ f"This could indicate a malicious file or decompression bomb.")
765
+
766
+ # Warn about suspiciously large files (over 10MB for images is unusual)
767
+ if file_size > 10 * 1024 * 1024:
768
+ warnings.append(f"Large file size: {humanize.naturalsize(file_size)}")
769
+
770
+ # Check image dimensions to prevent decompression bombs
771
+ if check_dimensions:
772
+ try:
773
+ with Image.open(file_path) as img:
774
+ width, height = img.size
775
+ total_pixels = width * height
776
+
777
+ if total_pixels > MAX_IMAGE_PIXELS:
778
+ raise ValueError(f"Image dimensions too large ({width}x{height} = {total_pixels} pixels, "
779
+ f"max {MAX_IMAGE_PIXELS}). This could be a decompression bomb attack.")
780
+
781
+ # Warn about very large images
782
+ if total_pixels > 10000 * 10000:
783
+ warnings.append(f"Large image dimensions: {width}x{height}")
784
+
785
+ # Check for format mismatch (file extension vs actual format)
786
+ actual_format = img.format
787
+ expected_formats = []
788
+ for fmt, extensions in SUPPORTED_FORMATS.items():
789
+ if file_path.lower().endswith(tuple(extensions)):  # str.endswith needs a str or tuple, not a list
790
+ expected_formats.append(fmt)
791
+
792
+ if actual_format and expected_formats and actual_format not in expected_formats:
793
+ warnings.append(f"Format mismatch: file has '{file_path.split('.')[-1]}' extension "
794
+ f"but is actually '{actual_format}' format")
795
+
796
+ except UnidentifiedImageError:
797
+ raise ValueError(f"Cannot identify image format - file may be corrupted or malicious")
798
+ except Exception as e:
799
+ raise ValueError(f"Error validating image: {str(e)}")
800
+
801
+ return True, warnings
802
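As a complementary illustration (not part of this file), Pillow can enforce its own pixel ceiling so that merely opening an oversized image fails early; the limit and filename below are assumptions for the sketch:

```python
from PIL import Image

# Assumed ceiling, similar in spirit to the MAX_IMAGE_PIXELS check above.
Image.MAX_IMAGE_PIXELS = 50_000_000  # Pillow raises DecompressionBombError past ~2x this value

try:
    with Image.open("suspect.png") as img:  # hypothetical file
        img.load()
except Image.DecompressionBombError as exc:
    print(f"Rejected oversized image: {exc}")
```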
+
803
+
804
+ def calculate_file_hash(file_path, algorithm='sha256'):
805
+ """
806
+ Calculate cryptographic hash of a file.
807
+
808
+ Args:
809
+ file_path: Path to the file
810
+ algorithm: Hash algorithm to use (sha256, sha512, etc.)
811
+
812
+ Returns:
813
+ Hexadecimal hash string
814
+ """
815
+ hash_obj = hashlib.new(algorithm)
816
+
817
+ # Read file in chunks to handle large files
818
+ with open(file_path, 'rb') as f:
819
+ for chunk in iter(lambda: f.read(4096), b''):
820
+ hash_obj.update(chunk)
821
+
822
+ return hash_obj.hexdigest()
823
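For what it's worth, on Python 3.11+ the chunked read can be replaced by a standard-library helper; a small, hedged equivalent (the filename is illustrative):

```python
import hashlib

# Python 3.11+ equivalent of the chunked loop above.
with open("photo.jpg", "rb") as f:  # hypothetical file
    digest = hashlib.file_digest(f, "sha256").hexdigest()
print(digest)
```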
+
824
+
825
+ def safe_join_path(base_dir, user_path):
826
+ """
827
+ Safely join paths and prevent path traversal attacks.
828
+
829
+ Args:
830
+ base_dir: Base directory (trusted)
831
+ user_path: User-provided path component (untrusted)
832
+
833
+ Returns:
834
+ Safe absolute path within base_dir
835
+
836
+ Raises:
837
+ ValueError: If path traversal is detected
838
+ """
839
+ # Normalize base directory
840
+ base_dir = os.path.abspath(base_dir)
841
+
842
+ # Join paths
843
+ full_path = os.path.normpath(os.path.join(base_dir, user_path))
844
+
845
+ # Resolve any symlinks
846
+ full_path = os.path.abspath(full_path)
847
+
848
+ # Ensure the result is within base_dir
849
+ if not full_path.startswith(base_dir + os.sep) and full_path != base_dir:
850
+ raise ValueError(f"Path traversal detected: '{user_path}' resolves outside base directory")
851
+
852
+ return full_path
853
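A quick sketch of the intended behaviour, assuming safe_join_path is importable from this module; the paths are illustrative:

```python
from find_bad_images import safe_join_path  # assumes this module is importable

# A benign relative path stays inside the base directory...
print(safe_join_path("/quarantine", "vacation/IMG_0001.jpg"))  # -> /quarantine/vacation/IMG_0001.jpg

# ...while a traversal attempt is rejected.
try:
    safe_join_path("/quarantine", "../../etc/passwd")
except ValueError as exc:
    print(f"blocked: {exc}")
```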
+
854
+
855
+ def save_progress(session_id, directory, formats, recursive, processed_files,
856
+ bad_files, repaired_files, progress_dir=DEFAULT_PROGRESS_DIR):
857
+ """Save the current progress to a file."""
858
+ # Create progress directory if it doesn't exist
859
+ if not os.path.exists(progress_dir):
860
+ os.makedirs(progress_dir, exist_ok=True)
861
+
862
+ # Create a progress state object
863
+ progress_state = {
864
+ 'version': VERSION,
865
+ 'timestamp': datetime.now().isoformat(),
866
+ 'directory': str(directory),
867
+ 'formats': formats,
868
+ 'recursive': recursive,
869
+ 'processed_files': _deduplicate(processed_files),
870
+ 'bad_files': _deduplicate(bad_files),
871
+ 'repaired_files': _deduplicate(repaired_files)
872
+ }
873
+
874
+ # Save to file using JSON instead of pickle for security
875
+ # This prevents arbitrary code execution via malicious progress files
876
+ progress_file = os.path.join(progress_dir, f"session_{session_id}.progress.json")
877
+ with open(progress_file, 'w') as f:
878
+ json.dump(progress_state, f, indent=2)
879
+
880
+ logging.debug(f"Progress saved to {progress_file}")
881
+ return progress_file
882
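Because the state is plain JSON, a saved session can be inspected without the tool; a hedged sketch (the progress directory and session ID are made up):

```python
import json

# Hypothetical session file following the session_<id>.progress.json naming used above.
with open("progress/session_0123456789abcdef.progress.json") as f:
    state = json.load(f)

print(state["timestamp"])
print(len(state["processed_files"]), "processed,", len(state["bad_files"]), "corrupt")
```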
+
883
+ def load_progress(session_id, progress_dir=DEFAULT_PROGRESS_DIR):
884
+ """Load progress from a saved session."""
885
+ # Try new JSON format first (more secure)
886
+ progress_file_json = os.path.join(progress_dir, f"session_{session_id}.progress.json")
887
+ progress_file_legacy = os.path.join(progress_dir, f"session_{session_id}.progress")
888
+
889
+ # Prefer JSON format for security
890
+ if os.path.exists(progress_file_json):
891
+ progress_file = progress_file_json
892
+ use_json = True
893
+ elif os.path.exists(progress_file_legacy):
894
+ progress_file = progress_file_legacy
895
+ use_json = False
896
+ logging.warning("Loading legacy pickle format. This format is deprecated for security reasons.")
897
+ else:
898
+ return None
899
+
900
+ try:
901
+ if use_json:
902
+ # Secure JSON deserialization
903
+ with open(progress_file, 'r') as f:
904
+ progress_state = json.load(f)
905
+ else:
906
+ # Legacy pickle support (with warning)
907
+ # TODO: Remove pickle support in future versions
908
+ import pickle
909
+ with open(progress_file, 'rb') as f:
910
+ progress_state = pickle.load(f)
911
+ logging.warning("SECURITY WARNING: Loaded progress file using unsafe pickle format. "
912
+ "Please delete old .progress files and use new .progress.json format.")
913
+
914
+ # Remove any duplicate entries from lists
915
+ for key in ('processed_files', 'bad_files', 'repaired_files'):
916
+ if key in progress_state:
917
+ progress_state[key] = _deduplicate(progress_state[key])
918
+
919
+ # Check version compatibility
920
+ if progress_state.get('version', '0.0.0') != VERSION:
921
+ logging.warning("Progress file was created with a different version. Some incompatibilities may exist.")
922
+
923
+ logging.info(f"Loaded progress from {progress_file}")
924
+ return progress_state
925
+ except Exception as e:
926
+ logging.error(f"Failed to load progress: {str(e)}")
927
+ return None
928
+
929
+ def list_saved_sessions(progress_dir=DEFAULT_PROGRESS_DIR):
930
+ """List all saved sessions with their details."""
931
+ if not os.path.exists(progress_dir):
932
+ return []
933
+
934
+ sessions = []
935
+ for filename in os.listdir(progress_dir):
936
+ # Support both new JSON format and legacy pickle format
937
+ if filename.endswith('.progress.json') or filename.endswith('.progress'):
938
+ try:
939
+ filepath = os.path.join(progress_dir, filename)
940
+ use_json = filename.endswith('.progress.json')
941
+
942
+ if use_json:
943
+ with open(filepath, 'r') as f:
944
+ progress_state = json.load(f)
945
+ else:
946
+ # Legacy pickle format
947
+ import pickle
948
+ with open(filepath, 'rb') as f:
949
+ progress_state = pickle.load(f)
950
+
951
+ # Extract session ID from filename
952
+ if filename.endswith('.progress.json'):
953
+ session_id = filename.replace('session_', '').replace('.progress.json', '')
954
+ else:
955
+ session_id = filename.replace('session_', '').replace('.progress', '')
956
+
957
+ session_info = {
958
+ 'id': session_id,
959
+ 'timestamp': progress_state.get('timestamp', 'Unknown'),
960
+ 'directory': progress_state.get('directory', 'Unknown'),
961
+ 'formats': progress_state.get('formats', []),
962
+ 'processed_count': len(progress_state.get('processed_files', [])),
963
+ 'bad_count': len(progress_state.get('bad_files', [])),
964
+ 'repaired_count': len(progress_state.get('repaired_files', [])),
965
+ 'filepath': filepath,
966
+ 'format': 'JSON' if use_json else 'Pickle (Legacy)'
967
+ }
968
+ sessions.append(session_info)
969
+ except Exception as e:
970
+ logging.debug(f"Failed to load session from {filename}: {str(e)}")
971
+
972
+ # Sort by timestamp, newest first
973
+ sessions.sort(key=lambda x: x['timestamp'], reverse=True)
974
+ return sessions
975
+
976
+ def get_extensions_for_formats(formats):
977
+ """Get all file extensions for the specified formats."""
978
+ extensions = []
979
+ for fmt in formats:
980
+ if fmt in SUPPORTED_FORMATS:
981
+ extensions.extend(SUPPORTED_FORMATS[fmt])
982
+ return tuple(extensions)
983
+
984
+ def find_image_files(directory, formats, recursive=True):
985
+ """Find all image files of specified formats in a directory."""
986
+ image_files = []
987
+ extensions = get_extensions_for_formats(formats)
988
+
989
+ if not extensions:
990
+ logging.warning("No valid image formats specified!")
991
+ return []
992
+
993
+ format_names = ", ".join(formats)
994
+ if recursive:
995
+ logging.info(f"Recursively scanning for {format_names} files...")
996
+ for root, _, files in os.walk(directory):
997
+ for file in files:
998
+ if file.lower().endswith(extensions):
999
+ image_files.append(os.path.join(root, file))
1000
+ else:
1001
+ logging.info(f"Scanning for {format_names} files in {directory} (non-recursive)...")
1002
+ for file in os.listdir(directory):
1003
+ if os.path.isfile(os.path.join(directory, file)) and file.lower().endswith(extensions):
1004
+ image_files.append(os.path.join(directory, file))
1005
+
1006
+ logging.info(f"Found {len(image_files)} image files")
1007
+ return image_files
1008
+
1009
+ def process_images(directory, formats, dry_run=True, repair=False,
1010
+ max_workers=None, recursive=True, move_to=None, repair_dir=None,
1011
+ save_progress_interval=5, resume_session=None, progress_dir=DEFAULT_PROGRESS_DIR,
1012
+ thorough_check=False, sensitivity='medium', ignore_eof=False, check_visual=False,
1013
+ visual_strictness='medium', enable_security_checks=False):
1014
+ """Find corrupt image files and optionally repair, delete, or move them."""
1015
+ start_time = time.time()
1016
+
1017
+ # Generate session ID for this scan
1018
+ session_id = get_session_id(directory, formats, recursive)
1019
+ processed_files = []
1020
+ bad_files = []
1021
+ repaired_files = []
1022
+ total_size_saved = 0
1023
+ last_progress_save = time.time()
1024
+
1025
+ # If resuming, load previous progress
1026
+ if resume_session:
1027
+ try:
1028
+ progress = load_progress(resume_session, progress_dir)
1029
+ if progress and progress['directory'] == str(directory) and progress['formats'] == formats:
1030
+ processed_files = progress['processed_files']
1031
+ bad_files = progress['bad_files']
1032
+ repaired_files = progress['repaired_files']
1033
+ logging.info(f"Resuming session: {len(processed_files)} files already processed")
1034
+ else:
1035
+ if progress:
1036
+ logging.warning("Session parameters don't match current parameters. Starting fresh scan.")
1037
+ else:
1038
+ logging.warning(f"Couldn't find session {resume_session}. Starting fresh scan.")
1039
+ except Exception as e:
1040
+ logging.error(f"Error loading session: {str(e)}. Starting fresh scan.")
1041
+
1042
+ # Find all image files
1043
+ image_files = find_image_files(directory, formats, recursive)
1044
+ if not image_files:
1045
+ logging.warning("No image files found!")
1046
+ return [], [], 0
1047
+
1048
+ # Filter out already processed files if resuming
1049
+ if processed_files:
1050
+ remaining_files = [f for f in image_files if f not in processed_files]
1051
+ skipped_count = len(image_files) - len(remaining_files)
1052
+ image_files = remaining_files
1053
+ logging.info(f"Skipping {skipped_count} already processed files")
1054
+
1055
+ if not image_files:
1056
+ logging.info("All files have already been processed in the previous session!")
1057
+ return bad_files, repaired_files, total_size_saved
1058
+
1059
+ # Create directories if they don't exist
1060
+ if move_to and not os.path.exists(move_to):
1061
+ os.makedirs(move_to)
1062
+ logging.info(f"Created directory for corrupt files: {move_to}")
1063
+
1064
+ if repair and repair_dir and not os.path.exists(repair_dir):
1065
+ os.makedirs(repair_dir)
1066
+ logging.info(f"Created directory for backup files: {repair_dir}")
1067
+
1068
+ # Prepare input arguments for workers
1069
+ input_args = [(file_path, repair, repair_dir, thorough_check, sensitivity, ignore_eof, check_visual, visual_strictness, enable_security_checks) for file_path in image_files]
1070
+
1071
+ # Process files in parallel
1072
+ logging.info("Processing files in parallel...")
1073
+
1074
+ # Create a custom progress bar class that saves progress periodically
1075
+ class ProgressSavingBar(tqdm_auto.tqdm):
1076
+ def update(self, n=1):
1077
+ nonlocal last_progress_save, processed_files
1078
+ result = super().update(n)
1079
+
1080
+ # Save progress periodically
1081
+ current_time = time.time()
1082
+ if save_progress_interval > 0 and current_time - last_progress_save >= save_progress_interval * 60:
1083
+ # Save the progress using the list of files that have actually
1084
+ # completed processing. ``processed_files`` is updated as each
1085
+ # future finishes so we can safely persist it as-is.
1086
+ save_progress(
1087
+ session_id,
1088
+ directory,
1089
+ formats,
1090
+ recursive,
1091
+ processed_files,
1092
+ bad_files,
1093
+ repaired_files,
1094
+ progress_dir,
1095
+ )
1096
+
1097
+ last_progress_save = current_time
1098
+ logging.debug(f"Progress saved at {self.n} / {len(image_files)} files")
1099
+
1100
+ return result
1101
+
1102
+ try:
1103
+ with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
1104
+ # Colorful progress bar with progress saving
1105
+ results = []
1106
+ futures = {executor.submit(process_file, arg): arg[0] for arg in input_args}
1107
+
1108
+ with ProgressSavingBar(
1109
+ total=len(image_files),
1110
+ desc=f"{colorama.Fore.BLUE}Checking image files{colorama.Style.RESET_ALL}",
1111
+ unit="file",
1112
+ bar_format="{desc}: {percentage:3.0f}%|{bar:30}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}]",
1113
+ colour="blue"
1114
+ ) as pbar:
1115
+ for future in concurrent.futures.as_completed(futures):
1116
+ file_path = futures[future]
1117
+ try:
1118
+ result = future.result()
1119
+ results.append(result)
1120
+
1121
+ # Track this file as processed for resuming later if needed
1122
+ processed_files.append(file_path)
1123
+
1124
+ # Update progress for successful or failed processing
1125
+ pbar.update(1)
1126
+
1127
+ # Update our tracking of bad/repaired files in real-time for progress saving
1128
+ file_path, is_valid, size, repair_status, repair_msg, dimensions = result
1129
+ if repair_status == "repaired":
1130
+ repaired_files.append(file_path)
1131
+ elif not is_valid:
1132
+ bad_files.append(file_path)
1133
+
1134
+ except Exception as e:
1135
+ logging.error(f"Error processing {file_path}: {str(e)}")
1136
+ pbar.update(1)
1137
+ except KeyboardInterrupt:
1138
+ # If the user interrupts, save progress before exiting
1139
+ logging.warning("Process interrupted by user. Saving progress...")
1140
+ save_progress(session_id, directory, formats, recursive,
1141
+ processed_files, bad_files, repaired_files, progress_dir)
1142
+ logging.info(f"Progress saved. You can resume with --resume {session_id}")
1143
+ raise
1144
+
1145
+ # Process results
1146
+ total_size_saved = 0
1147
+ for file_path, is_valid, size, repair_status, repair_msg, dimensions in results:
1148
+ if repair_status == "repaired":
1149
+ # File was successfully repaired (already added to repaired_files during processing)
1150
+ width, height = dimensions
1151
+ msg = f"Repaired: {file_path} ({width}x{height}) - {repair_msg}"
1152
+ logging.info(msg)
1153
+ elif not is_valid:
1154
+ # File is corrupt and wasn't repaired (or repair failed)
1155
+ # (already added to bad_files during processing)
1156
+ total_size_saved += size
1157
+
1158
+ size_str = humanize.naturalsize(size)
1159
+ if repair_status == "repair_failed":
1160
+ fail_msg = f"Repair failed: {file_path} ({size_str}) - {repair_msg}"
1161
+ logging.warning(fail_msg)
1162
+
1163
+ if dry_run:
1164
+ msg = f"Would delete: {file_path} ({size_str})"
1165
+ logging.info(msg)
1166
+ elif move_to:
1167
+ # Preserve the subdirectory structure by getting the relative path from the search directory
1168
+ try:
1169
+ # Get the relative path from the base directory
1170
+ rel_path = os.path.relpath(file_path, str(directory))
1171
+ # If relpath starts with ".." it means file_path is not within directory
1172
+ # In this case, just use the basename as fallback
1173
+ if rel_path.startswith('..'):
1174
+ rel_path = os.path.basename(file_path)
1175
+
1176
+ # Use safe path joining to prevent path traversal attacks
1177
+ # This ensures files can't be written outside the move_to directory
1178
+ try:
1179
+ dest_path = safe_join_path(move_to, rel_path)
1180
+ except ValueError as ve:
1181
+ logging.error(f"Security error moving {file_path}: {ve}")
1182
+ continue
1183
+
1184
+ # Create parent directories if they don't exist
1185
+ os.makedirs(os.path.dirname(dest_path), exist_ok=True)
1186
+
1187
+ # Use shutil.move instead of os.rename to handle cross-device file movements
1188
+ shutil.move(file_path, dest_path)
1189
+
1190
+ # Add arrow with color
1191
+ arrow = f"{colorama.Fore.CYAN}→{colorama.Style.RESET_ALL}"
1192
+ msg = f"Moved: {file_path} {arrow} {dest_path} ({size_str})"
1193
+ logging.info(msg)
1194
+ except Exception as e:
1195
+ logging.error(f"Failed to move {file_path}: {e}")
1196
+ else:
1197
+ try:
1198
+ os.remove(file_path)
1199
+ msg = f"Deleted: {file_path} ({size_str})"
1200
+ logging.info(msg)
1201
+ except Exception as e:
1202
+ logging.error(f"Failed to delete {file_path}: {e}")
1203
+
1204
+ # Final progress save
1205
+ save_progress(session_id, directory, formats, recursive,
1206
+ processed_files, bad_files, repaired_files, progress_dir)
1207
+
1208
+ elapsed = time.time() - start_time
1209
+ logging.info(f"Processed {len(processed_files)} files in {elapsed:.2f} seconds")
1210
+ logging.info(f"Session ID: {session_id} (use --resume {session_id} to resume if needed)")
1211
+
1212
+ return bad_files, repaired_files, total_size_saved
1213
+
1214
+ def print_banner():
1215
+ """Print 2PAC-themed ASCII art banner"""
1216
+ banner = r"""
1217
+ ░▒▓███████▓▒░░▒▓███████▓▒░ ░▒▓██████▓▒░ ░▒▓██████▓▒░
1218
+ ░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░
1219
+ ░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
1220
+ ░▒▓██████▓▒░░▒▓███████▓▒░░▒▓████████▓▒░▒▓█▓▒░
1221
+ ░▒▓█▓▒░ ░▒▓█▓▒░ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░
1222
+ ░▒▓█▓▒░ ░▒▓█▓▒░ ░▒▓█▓▒░░▒▓█▓▒░▒▓█▓▒░░▒▓█▓▒░
1223
+ ░▒▓████████▓▒░▒▓█▓▒░ ░▒▓█▓▒░░▒▓█▓▒░░▒▓██████▓▒░
1224
+ ╔═════════════════════════════════════════════════════════╗
1225
+ ║ The Picture Analyzer & Corruption killer ║
1226
+ ║ In memory of Jeff Young - Bringing people together ║
1227
+ ╚═════════════════════════════════════════════════════════╝
1228
+ """
1229
+
1230
+ # Colored version of the banner, highlighting PAC for Picture Analyzer Corruption
1231
+ if 'colorama' in sys.modules:
1232
+ banner_lines = banner.strip().split('\n')
1233
+ colored_banner = []
1234
+
1235
+ # Color the new gradient ASCII art logo (lines 0-6)
1236
+ for i, line in enumerate(banner_lines):
1237
+ if i < 7: # The ASCII art logo lines for the new gradient style
1238
+ # For "2" part (first column)
1239
+ part1 = line[:11]
1240
+ # For "P" part (second column)
1241
+ part2 = line[11:24]
1242
+ # For "A" part (third column)
1243
+ part3 = line[24:38]
1244
+ # For "C" part (fourth column)
1245
+ part4 = line[38:]
1246
+
1247
+ colored_line = f"{colorama.Fore.WHITE}{part1}" + \
1248
+ f"{colorama.Fore.RED}{part2}" + \
1249
+ f"{colorama.Fore.GREEN}{part3}" + \
1250
+ f"{colorama.Fore.BLUE}{part4}{colorama.Style.RESET_ALL}"
1251
+
1252
+ colored_banner.append(colored_line)
1253
+ elif i >= 7 and i <= 10: # The box and text lines
1254
+ if i == 8: # Title line with PAC highlighted
1255
+ parts = line.split("Picture Analyzer & Corruption")
1256
+ if len(parts) == 2:
1257
+ prefix = parts[0]
1258
+ suffix = parts[1]
1259
+ colored_title = f"{colorama.Fore.YELLOW}{prefix}" + \
1260
+ f"{colorama.Fore.RED}Picture " + \
1261
+ f"{colorama.Fore.GREEN}Analyzer " + \
1262
+ f"{colorama.Fore.WHITE}& " + \
1263
+ f"{colorama.Fore.BLUE}Corruption" + \
1264
+ f"{colorama.Fore.YELLOW}{suffix}{colorama.Style.RESET_ALL}"
1265
+ colored_banner.append(colored_title)
1266
+ else:
1267
+ colored_banner.append(f"{colorama.Fore.YELLOW}{line}{colorama.Style.RESET_ALL}")
1268
+ elif i == 9: # Jeff Young tribute line
1269
+ colored_banner.append(f"{colorama.Fore.CYAN}{line}{colorama.Style.RESET_ALL}")
1270
+ else: # Box border lines
1271
+ colored_banner.append(f"{colorama.Fore.YELLOW}{line}{colorama.Style.RESET_ALL}")
1272
+ else:
1273
+ colored_banner.append(f"{colorama.Fore.WHITE}{line}{colorama.Style.RESET_ALL}")
1274
+
1275
+ print('\n'.join(colored_banner))
1276
+ else:
1277
+ print(banner)
1278
+ print()
1279
+
1280
+ def main():
1281
+ print_banner()
1282
+
1283
+ # Check for 'q' command to quit
1284
+ if len(sys.argv) == 2 and sys.argv[1].lower() == 'q':
1285
+ print(f"{colorama.Fore.YELLOW}Exiting 2PAC. Stay safe!{colorama.Style.RESET_ALL}")
1286
+ sys.exit(0)
1287
+
1288
+ parser = argparse.ArgumentParser(
1289
+ description='2PAC: The Picture Analyzer & Corruption Killer',
1290
+ epilog='Created by Richard Young - "All Eyez On Your Images" - https://github.com/ricyoung/2pac'
1291
+ )
1292
+
1293
+ # Main action (mutually exclusive)
1294
+ action_group = parser.add_mutually_exclusive_group()
1295
+ action_group.add_argument('directory', nargs='?', help='Directory to search for image files')
1296
+ action_group.add_argument('--list-sessions', action='store_true', help='List all saved sessions')
1297
+ action_group.add_argument('--check-file', type=str, help='Check a specific file for corruption (useful for testing)')
1298
+
1299
+ # Basic options
1300
+ parser.add_argument('--delete', action='store_true', help='Delete corrupt image files (without this flag, runs in dry-run mode)')
1301
+ parser.add_argument('--move-to', type=str, help='Move corrupt files to this directory instead of deleting them')
1302
+ parser.add_argument('--workers', type=int, default=None, help='Number of worker processes (default: CPU count)')
1303
+ parser.add_argument('--non-recursive', action='store_true', help='Only search in the specified directory, not subdirectories')
1304
+ parser.add_argument('--output', type=str, help='Save list of corrupt files to this file')
1305
+ parser.add_argument('--verbose', '-v', action='store_true', help='Enable verbose logging')
1306
+ parser.add_argument('--no-color', action='store_true', help='Disable colored output')
1307
+ parser.add_argument('--version', action='version', version=f'Bad Image Finder v{VERSION} by Richard Young')
1308
+
1309
+ # Repair options
1310
+ repair_group = parser.add_argument_group('Repair options')
1311
+ repair_group.add_argument('--repair', action='store_true', help='Attempt to repair corrupt image files')
1312
+ repair_group.add_argument('--backup-dir', type=str, help='Directory to store backups of files before repair')
1313
+ repair_group.add_argument('--repair-report', type=str, help='Save list of repaired files to this file')
1314
+
1315
+ # Format options
1316
+ format_group = parser.add_argument_group('Image format options')
1317
+ format_group.add_argument('--formats', type=str, nargs='+', choices=SUPPORTED_FORMATS.keys(),
1318
+ help=f'Image formats to check (default: all formats)')
1319
+ format_group.add_argument('--jpeg', action='store_true', help='Check JPEG files only')
1320
+ format_group.add_argument('--png', action='store_true', help='Check PNG files only')
1321
+ format_group.add_argument('--tiff', action='store_true', help='Check TIFF files only')
1322
+ format_group.add_argument('--gif', action='store_true', help='Check GIF files only')
1323
+ format_group.add_argument('--bmp', action='store_true', help='Check BMP files only')
1324
+
1325
+ # Validation options
1326
+ validation_group = parser.add_argument_group('Validation options')
1327
+ validation_group.add_argument('--thorough', action='store_true',
1328
+ help='Perform thorough image validation (slower but catches more subtle corruption)')
1329
+ validation_group.add_argument('--sensitivity', type=str, choices=['low', 'medium', 'high'], default='medium',
1330
+ help='Set validation sensitivity level: low (basic checks), medium (standard checks), high (most strict)')
1331
+ validation_group.add_argument('--ignore-eof', action='store_true',
1332
+ help='Ignore missing end-of-file markers (useful for truncated but viewable files)')
1333
+ validation_group.add_argument('--check-visual', action='store_true',
1334
+ help='Analyze image content to detect visible corruption like gray/black areas')
1335
+ validation_group.add_argument('--visual-strictness', type=str, choices=['low', 'medium', 'high'], default='medium',
1336
+ help='Set strictness level for visual corruption detection: low (most permissive), medium (balanced), high (only clear corruption)')
1337
+
1338
+ # Security options
1339
+ security_group = parser.add_argument_group('Security options')
1340
+ security_group.add_argument('--security-checks', action='store_true',
1341
+ help='Enable enhanced security validation (file size limits, dimension checks, format verification)')
1342
+ security_group.add_argument('--max-file-size', type=int, default=MAX_FILE_SIZE,
1343
+ help=f'Maximum file size in bytes to process (default: {MAX_FILE_SIZE} = 100MB)')
1344
+ security_group.add_argument('--max-pixels', type=int, default=MAX_IMAGE_PIXELS,
1345
+ help=f'Maximum image dimensions in pixels (default: {MAX_IMAGE_PIXELS} = 50MP)')
1346
+
1347
+ # Progress saving options
1348
+ progress_group = parser.add_argument_group('Progress options')
1349
+ progress_group.add_argument('--save-interval', type=int, default=5,
1350
+ help='Save progress every N minutes (0 to disable progress saving)')
1351
+ progress_group.add_argument('--progress-dir', type=str, default=DEFAULT_PROGRESS_DIR,
1352
+ help='Directory to store progress files')
1353
+ progress_group.add_argument('--resume', type=str, metavar='SESSION_ID',
1354
+ help='Resume from a previously saved session')
1355
+
1356
+ args = parser.parse_args()
1357
+
1358
+ # Setup logging
1359
+ setup_logging(args.verbose, args.no_color)
1360
+
1361
+ # Handle specific file check mode
1362
+ if args.check_file:
1363
+ file_path = args.check_file
1364
+ if not os.path.exists(file_path):
1365
+ logging.error(f"Error: File not found: {file_path}")
1366
+ sys.exit(1)
1367
+
1368
+ print(f"\n{colorama.Style.BRIGHT}Checking file: {file_path}{colorama.Style.RESET_ALL}\n")
1369
+
1370
+ # Basic check
1371
+ print(f"{colorama.Fore.CYAN}Basic validation:{colorama.Style.RESET_ALL}")
1372
+ try:
1373
+ with Image.open(file_path) as img:
1374
+ print(f"✓ File can be opened by PIL")
1375
+ print(f" Format: {img.format}")
1376
+ print(f" Mode: {img.mode}")
1377
+ print(f" Size: {img.size[0]}x{img.size[1]}")
1378
+
1379
+ try:
1380
+ img.verify()
1381
+ print(f"✓ Header verification passed")
1382
+ except Exception as e:
1383
+ print(f"❌ Header verification failed: {str(e)}")
1384
+
1385
+ try:
1386
+ with Image.open(file_path) as img2:
1387
+ img2.load()
1388
+ print(f"✓ Data loading test passed")
1389
+ except Exception as e:
1390
+ print(f"❌ Data loading test failed: {str(e)}")
1391
+ except Exception as e:
1392
+ print(f"❌ Cannot open file with PIL: {str(e)}")
1393
+
1394
+ # Detailed format-specific checks
1395
+ if file_path.lower().endswith(tuple(SUPPORTED_FORMATS['JPEG'])):
1396
+ print(f"\n{colorama.Fore.CYAN}JPEG structure checks:{colorama.Style.RESET_ALL}")
1397
+ is_valid, msg = check_jpeg_structure(file_path)
1398
+ if is_valid:
1399
+ print(f"✓ JPEG structure valid: {msg}")
1400
+ else:
1401
+ print(f"❌ JPEG structure invalid: {msg}")
1402
+ elif file_path.lower().endswith(tuple(SUPPORTED_FORMATS['PNG'])):
1403
+ print(f"\n{colorama.Fore.CYAN}PNG structure checks:{colorama.Style.RESET_ALL}")
1404
+ is_valid, msg = check_png_structure(file_path)
1405
+ if is_valid:
1406
+ print(f"✓ PNG structure valid: {msg}")
1407
+ else:
1408
+ print(f"❌ PNG structure invalid: {msg}")
1409
+
1410
+ # Decode test
1411
+ print(f"\n{colorama.Fore.CYAN}Full decode test:{colorama.Style.RESET_ALL}")
1412
+ is_valid, msg = try_full_decode_check(file_path)
1413
+ if is_valid:
1414
+ print(f"✓ Full decode test passed: {msg}")
1415
+ else:
1416
+ print(f"❌ Full decode test failed: {msg}")
1417
+
1418
+ # External tools check
1419
+ print(f"\n{colorama.Fore.CYAN}External tools check:{colorama.Style.RESET_ALL}")
1420
+ is_valid, msg = try_external_tools(file_path)
1421
+ if is_valid:
1422
+ print(f"✓ External tools: {msg}")
1423
+ else:
1424
+ print(f"❌ External tools: {msg}")
1425
+
1426
+ # Visual corruption check
1427
+ print(f"\n{colorama.Fore.CYAN}Visual content analysis:{colorama.Style.RESET_ALL}")
1428
+ is_visually_corrupt, vis_msg = check_visual_corruption(file_path)
1429
+ if not is_visually_corrupt:
1430
+ print(f"✓ No visual corruption detected: {vis_msg}")
1431
+ else:
1432
+ print(f"❌ {vis_msg}")
1433
+
1434
+ # Final verdict
1435
+ print(f"\n{colorama.Fore.CYAN}Final verdict:{colorama.Style.RESET_ALL}")
1436
+ is_valid_basic = is_valid_image(file_path, thorough=False)
1437
+ is_valid_thorough = is_valid_image(file_path, thorough=True)
1438
+ is_valid_visual = not is_visually_corrupt
1439
+
1440
+ if is_valid_basic and is_valid_thorough and is_valid_visual:
1441
+ print(f"{colorama.Fore.GREEN}This file appears to be valid by all checks.{colorama.Style.RESET_ALL}")
1442
+ elif not is_valid_visual:
1443
+ print(f"{colorama.Fore.RED}This file shows visible corruption in the image content.{colorama.Style.RESET_ALL}")
1444
+ print(f"Recommendation: Use --check-visual to detect this type of corruption.")
1445
+ elif is_valid_basic and not is_valid_thorough:
1446
+ print(f"{colorama.Fore.YELLOW}This file passes basic validation but fails thorough checks.{colorama.Style.RESET_ALL}")
1447
+ print(f"Recommendation: Use --thorough mode to detect this type of corruption.")
1448
+ else:
1449
+ print(f"{colorama.Fore.RED}This file is corrupt and would be detected by the basic scan.{colorama.Style.RESET_ALL}")
1450
+
1451
+ sys.exit(0)
1452
+
1453
+ # Handle session listing mode
1454
+ if args.list_sessions:
1455
+ sessions = list_saved_sessions(args.progress_dir)
1456
+ if sessions:
1457
+ print(f"\n{colorama.Style.BRIGHT}Saved Sessions:{colorama.Style.RESET_ALL}")
1458
+ for i, session in enumerate(sessions):
1459
+ ts = datetime.fromisoformat(session['timestamp']).strftime('%Y-%m-%d %H:%M:%S')
1460
+ print(f"\n{colorama.Fore.CYAN}Session ID: {session['id']}{colorama.Style.RESET_ALL}")
1461
+ print(f" Created: {ts}")
1462
+ print(f" Directory: {session['directory']}")
1463
+ print(f" Formats: {', '.join(session['formats'])}")
1464
+ print(f" Progress: {session['processed_count']} files processed, "
1465
+ f"{session['bad_count']} corrupt, {session['repaired_count']} repaired")
1466
+
1467
+ # Show resume command
1468
+ resume_cmd = f"find_bad_images.py --resume {session['id']}"
1469
+ if os.path.exists(session['directory']):
1470
+ print(f" {colorama.Fore.GREEN}Resume command: {resume_cmd}{colorama.Style.RESET_ALL}")
1471
+ else:
1472
+ print(f" {colorama.Fore.YELLOW}Directory no longer exists, cannot resume{colorama.Style.RESET_ALL}")
1473
+ else:
1474
+ print("No saved sessions found.")
1475
+ sys.exit(0)
1476
+
1477
+ # Check if directory is specified for a new scan
1478
+ if not args.directory and not args.resume:
1479
+ logging.error("Error: You must specify a directory to scan or use --resume to continue a session")
1480
+ sys.exit(1)
1481
+
1482
+ # If we're resuming without a directory, load from previous session
1483
+ directory = None
1484
+ if args.resume and not args.directory:
1485
+ progress = load_progress(args.resume, args.progress_dir)
1486
+ if progress:
1487
+ directory = Path(progress['directory'])
1488
+ logging.info(f"Using directory from saved session: {directory}")
1489
+ else:
1490
+ logging.error(f"Could not load session {args.resume}")
1491
+ sys.exit(1)
1492
+ elif args.directory:
1493
+ directory = Path(args.directory)
1494
+
1495
+ # Verify the directory exists
1496
+ if not directory.exists() or not directory.is_dir():
1497
+ logging.error(f"Error: {directory} is not a valid directory")
1498
+ sys.exit(1)
1499
+
1500
+ # Check for incompatible options
1501
+ if args.delete and args.move_to:
1502
+ logging.error("Error: Cannot use both --delete and --move-to options")
1503
+ sys.exit(1)
1504
+
1505
+ # Determine which formats to check
1506
+ formats = []
1507
+ if args.formats:
1508
+ formats = args.formats
1509
+ elif args.jpeg:
1510
+ formats.append('JPEG')
1511
+ elif args.png:
1512
+ formats.append('PNG')
1513
+ elif args.tiff:
1514
+ formats.append('TIFF')
1515
+ elif args.gif:
1516
+ formats.append('GIF')
1517
+ elif args.bmp:
1518
+ formats.append('BMP')
1519
+ else:
1520
+ # Default: check all formats
1521
+ formats = DEFAULT_FORMATS
1522
+
1523
+ dry_run = not (args.delete or args.move_to)
1524
+
1525
+ # Colorful mode indicators
1526
+ if args.repair:
1527
+ mode_str = f"{colorama.Fore.MAGENTA}REPAIR MODE{colorama.Style.RESET_ALL}: Attempting to fix corrupt files"
1528
+ logging.info(mode_str)
1529
+
1530
+ repairable_formats = [fmt for fmt in formats if fmt in REPAIRABLE_FORMATS]
1531
+ if repairable_formats:
1532
+ logging.info(f"Repairable formats: {', '.join(repairable_formats)}")
1533
+ else:
1534
+ logging.warning("None of the selected formats support repair")
1535
+
1536
+ if dry_run:
1537
+ mode_str = f"{colorama.Fore.YELLOW}DRY RUN MODE{colorama.Style.RESET_ALL}: No files will be deleted or moved"
1538
+ logging.info(mode_str)
1539
+ elif args.move_to:
1540
+ mode_str = f"{colorama.Fore.BLUE}MOVE MODE{colorama.Style.RESET_ALL}: Corrupt files will be moved to {args.move_to}"
1541
+ logging.info(mode_str)
1542
+ else:
1543
+ mode_str = f"{colorama.Fore.RED}DELETE MODE{colorama.Style.RESET_ALL}: Corrupt files will be permanently deleted"
1544
+ logging.info(mode_str)
1545
+
1546
+ # Add progress saving info
1547
+ if args.save_interval > 0:
1548
+ save_interval_str = f"{colorama.Fore.CYAN}PROGRESS SAVING{colorama.Style.RESET_ALL}: Every {args.save_interval} minutes"
1549
+ logging.info(save_interval_str)
1550
+ else:
1551
+ logging.info("Progress saving is disabled")
1552
+
1553
+ if args.resume:
1554
+ resume_str = f"{colorama.Fore.CYAN}RESUMING{colorama.Style.RESET_ALL}: From session {args.resume}"
1555
+ logging.info(resume_str)
1556
+
1557
+ if args.thorough:
1558
+ thorough_str = f"{colorama.Fore.MAGENTA}THOROUGH MODE{colorama.Style.RESET_ALL}: Using deep validation checks (slower but more accurate)"
1559
+ logging.info(thorough_str)
1560
+
1561
+ # Show sensitivity level
1562
+ sensitivity_colors = {
1563
+ 'low': colorama.Fore.GREEN,
1564
+ 'medium': colorama.Fore.YELLOW,
1565
+ 'high': colorama.Fore.RED
1566
+ }
1567
+ sensitivity_color = sensitivity_colors.get(args.sensitivity, colorama.Fore.YELLOW)
1568
+ sensitivity_str = f"{sensitivity_color}SENSITIVITY: {args.sensitivity.upper()}{colorama.Style.RESET_ALL}"
1569
+ logging.info(sensitivity_str)
1570
+
1571
+ # Show EOF handling
1572
+ if args.ignore_eof:
1573
+ eof_str = f"{colorama.Fore.CYAN}IGNORING EOF MARKERS{colorama.Style.RESET_ALL}: Allowing truncated but viewable files"
1574
+ logging.info(eof_str)
1575
+
1576
+ # Show visual corruption checking status
1577
+ if args.check_visual:
1578
+ strictness_color = {
1579
+ 'low': colorama.Fore.GREEN,
1580
+ 'medium': colorama.Fore.YELLOW,
1581
+ 'high': colorama.Fore.RED
1582
+ }.get(args.visual_strictness, colorama.Fore.YELLOW)
1583
+
1584
+ visual_str = f"{colorama.Fore.MAGENTA}VISUAL CHECK{colorama.Style.RESET_ALL}: " + \
1585
+ f"Analyzing image content (strictness: {strictness_color}{args.visual_strictness.upper()}{colorama.Style.RESET_ALL})"
1586
+ logging.info(visual_str)
1587
+
1588
+ # Show security checks status
1589
+ if args.security_checks:
1590
+ security_str = f"{colorama.Fore.RED}SECURITY CHECKS ENABLED{colorama.Style.RESET_ALL}: " + \
1591
+ f"Validating file sizes (max {humanize.naturalsize(MAX_FILE_SIZE)}), " + \
1592
+ f"dimensions (max {MAX_IMAGE_PIXELS:,} pixels), and format integrity"
1593
+ logging.info(security_str)
1594
+
1595
+ # Show which formats we're checking
1596
+ format_list = ", ".join(formats)
1597
+ logging.info(f"Checking image formats: {format_list}")
1598
+ logging.info(f"Searching for corrupt image files in {directory}")
1599
+
1600
+ try:
1601
+ bad_files, repaired_files, total_size_saved = process_images(
1602
+ directory,
1603
+ formats,
1604
+ dry_run=dry_run,
1605
+ repair=args.repair,
1606
+ max_workers=args.workers,
1607
+ recursive=not args.non_recursive,
1608
+ move_to=args.move_to,
1609
+ repair_dir=args.backup_dir,
1610
+ save_progress_interval=args.save_interval,
1611
+ resume_session=args.resume,
1612
+ progress_dir=args.progress_dir,
1613
+ thorough_check=args.thorough,
1614
+ sensitivity=args.sensitivity,
1615
+ ignore_eof=args.ignore_eof,
1616
+ check_visual=args.check_visual,
1617
+ visual_strictness=args.visual_strictness,
1618
+ enable_security_checks=args.security_checks
1619
+ )
1620
+
1621
+ # Colorful summary
1622
+ count_color = colorama.Fore.RED if bad_files else colorama.Fore.GREEN
1623
+ file_count = f"{count_color}{len(bad_files)}{colorama.Style.RESET_ALL}"
1624
+ logging.info(f"Found {file_count} corrupt image files")
1625
+
1626
+ if args.repair:
1627
+ repair_color = colorama.Fore.GREEN if repaired_files else colorama.Fore.YELLOW
1628
+ repair_count = f"{repair_color}{len(repaired_files)}{colorama.Style.RESET_ALL}"
1629
+ logging.info(f"Successfully repaired {repair_count} files")
1630
+
1631
+ if args.repair_report and repaired_files:
1632
+ with open(args.repair_report, 'w') as f:
1633
+ for file_path in repaired_files:
1634
+ f.write(f"{file_path}\n")
1635
+ logging.info(f"Saved list of repaired files to {args.repair_report}")
1636
+
1637
+ savings_str = humanize.naturalsize(total_size_saved)
1638
+ savings_color = colorama.Fore.GREEN if total_size_saved > 0 else colorama.Fore.RESET
1639
+ savings_msg = f"Total space savings: {savings_color}{savings_str}{colorama.Style.RESET_ALL}"
1640
+ logging.info(savings_msg)
1641
+
1642
+ if not args.no_color:
1643
+ # Add signature at the end of the run
1644
+ signature = f"\n{colorama.Fore.CYAN}2PAC v{VERSION} by Richard Young{colorama.Style.RESET_ALL}"
1645
+ quote = f"{colorama.Fore.YELLOW}\"{random.choice(QUOTES)}\"{colorama.Style.RESET_ALL}"
1646
+ print(signature)
1647
+ print(quote)
1648
+
1649
+ # Save list of corrupt files if requested
1650
+ if args.output and bad_files:
1651
+ with open(args.output, 'w') as f:
1652
+ for file_path in bad_files:
1653
+ f.write(f"{file_path}\n")
1654
+ logging.info(f"Saved list of corrupt files to {args.output}")
1655
+
1656
+ if bad_files and dry_run:
1657
+ logging.info("Run with --delete to remove these files or --move-to to relocate them")
1658
+
1659
+ except KeyboardInterrupt:
1660
+ logging.info("Operation cancelled by user")
1661
+ sys.exit(130)
1662
+ except Exception as e:
1663
+ logging.error(f"Error: {str(e)}")
1664
+ if args.verbose:
1665
+ import traceback
1666
+ traceback.print_exc()
1667
+ sys.exit(1)
1668
+
1669
+ if __name__ == "__main__":
1670
+ main()
rat_finder.py ADDED
@@ -0,0 +1,1223 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ RAT Finder - Beta steganography detection tool for 2PAC
4
+
5
+ This tool is designed to detect potential steganography in images.
6
+ It's part of the 2PAC toolkit but focused on security aspects.
7
+
8
+ Author: Richard Young
9
+ License: MIT
10
+ """
11
+
12
+ import os
13
+ import sys
14
+ import argparse
15
+ import concurrent.futures
16
+ import logging
17
+ import tempfile
18
+ import numpy as np
19
+ from pathlib import Path
20
+ from PIL import Image
21
+ import matplotlib.pyplot as plt
22
+ from scipy import stats
23
+ import colorama
24
+ from tqdm import tqdm
25
+
26
+ # Initialize colorama
27
+ colorama.init()
28
+
29
+ # Version
30
+ VERSION = "0.2.0"
31
+
32
+ # Set up logging
33
+ def setup_logging(verbose, no_color=False):
34
+ level = logging.DEBUG if verbose else logging.INFO
35
+
36
+ # Define color codes
37
+ if not no_color:
38
+ # Color scheme
39
+ COLORS = {
40
+ 'DEBUG': colorama.Fore.CYAN,
41
+ 'INFO': colorama.Fore.GREEN,
42
+ 'WARNING': colorama.Fore.YELLOW,
43
+ 'ERROR': colorama.Fore.RED,
44
+ 'CRITICAL': colorama.Fore.MAGENTA + colorama.Style.BRIGHT,
45
+ 'RESET': colorama.Style.RESET_ALL
46
+ }
47
+
48
+ # Custom formatter with colors
49
+ class ColoredFormatter(logging.Formatter):
50
+ def format(self, record):
51
+ levelname = record.levelname
52
+ if levelname in COLORS:
53
+ record.levelname = f"{COLORS[levelname]}{levelname}{COLORS['RESET']}"
54
+ record.msg = f"{COLORS[levelname]}{record.msg}{COLORS['RESET']}"
55
+ return super().format(record)
56
+
57
+ formatter = ColoredFormatter('%(asctime)s - %(levelname)s - %(message)s')
58
+ else:
59
+ formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
60
+
61
+ handler = logging.StreamHandler()
62
+ handler.setFormatter(formatter)
63
+
64
+ logging.basicConfig(
65
+ level=level,
66
+ handlers=[handler]
67
+ )
68
+
69
+ def print_banner():
70
+ """Print RAT Finder themed ASCII art banner"""
71
+ banner = r"""
72
+ ██████╗ █████╗ ████████╗ ███████╗██╗███╗ ██╗██████╗ ███████╗██████╗
73
+ ██╔══██╗██╔══██╗╚══██╔══╝ ██╔════╝██║████╗ ██║██╔══██╗██╔════╝██╔══██╗
74
+ ██████╔╝███████║ ██║█████╗█████╗ ██║██╔██╗ ██║██║ ██║█████╗ ██████╔╝
75
+ ██╔══██╗██╔══██║ ██║╚════╝██╔══╝ ██║██║╚██╗██║██║ ██║██╔══╝ ██╔══██╗
76
+ ██║ ██║██║ ██║ ██║ ██║ ██║██║ ╚████║██████╔╝███████╗██║ ██║
77
+ ╚═╝ ╚═╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═══╝╚═════╝ ╚══════╝╚═╝ ╚═╝
78
+ ╔═══════════════════════════════════════════════════════════════════════╗
79
+ ║ Steganography Detection Tool (v0.2.0) - Part of the 2PAC toolkit ║
80
+ ║ "What the eyes see and the ears hear, the mind believes" ║
81
+ ╚═══════════════════════════════════════════════════════════════════════╝
82
+ """
83
+
84
+ if 'colorama' in sys.modules:
85
+ banner_lines = banner.strip().split('\n')
86
+ colored_banner = []
87
+
88
+ # Color the RAT part in red, the FINDER part in blue
89
+ for i, line in enumerate(banner_lines):
90
+ if i < 6: # The logo lines
91
+ # Add the RAT part in red
92
+ part1 = line[:24]
93
+ # Add the FINDER part in blue
94
+ part2 = line[24:]
95
+ colored_line = f"{colorama.Fore.RED}{part1}{colorama.Fore.BLUE}{part2}{colorama.Style.RESET_ALL}"
96
+ colored_banner.append(colored_line)
97
+ elif i >= 6 and i <= 9: # The box with text
98
+ colored_banner.append(f"{colorama.Fore.YELLOW}{line}{colorama.Style.RESET_ALL}")
99
+ else:
100
+ colored_banner.append(f"{colorama.Fore.WHITE}{line}{colorama.Style.RESET_ALL}")
101
+
102
+ print('\n'.join(colored_banner))
103
+ else:
104
+ print(banner)
105
+ print()
106
+
107
+ #------------------------------------------------------------------------------
108
+ # STEGANOGRAPHY DETECTION TECHNIQUES
109
+ #------------------------------------------------------------------------------
110
+
111
+ def perform_ela_analysis(image_path, quality=75):
112
+ """
113
+ Performs Error Level Analysis (ELA) to detect manipulated areas in an image.
114
+
115
+ ELA works by intentionally resaving an image at a known quality level and
116
+ analyzing the differences between the original and resaved versions.
117
+ Areas that have been manipulated often show up as having different error levels.
118
+
119
+ Args:
120
+ image_path: Path to the image
121
+ quality: JPEG quality level to use for recompression (default: 75)
122
+
123
+ Returns:
124
+ (is_suspicious, confidence, details)
125
+ """
126
+ try:
127
+ # Only perform ELA on JPEG images
128
+ if not image_path.lower().endswith(('.jpg', '.jpeg', '.jfif')):
129
+ return False, 0, {"error": "ELA is only effective for JPEG images"}
130
+
131
+ with Image.open(image_path) as original_img:
132
+ # Convert to RGB if needed
133
+ if original_img.mode != 'RGB':
134
+ original_img = original_img.convert('RGB')
135
+
136
+ # Create a temporary file for the resaved image
137
+ temp_file = tempfile.NamedTemporaryFile(suffix='.jpg', delete=True)
138
+ resaved_path = temp_file.name
139
+
140
+ # Save the image with the specified quality
141
+ original_img.save(resaved_path, quality=quality)
142
+
143
+ # Read the resaved image
144
+ with Image.open(resaved_path) as resaved_img:
145
+ # Convert both to numpy arrays
146
+ original_array = np.array(original_img).astype('int32')
147
+ resaved_array = np.array(resaved_img).astype('int32')
148
+
149
+ # Calculate absolute difference
150
+ diff = np.abs(original_array - resaved_array)
151
+
152
+ # Calculate statistics from the difference
153
+ mean_diff = np.mean(diff)
154
+ std_diff = np.std(diff)
155
+ max_diff = np.max(diff)
156
+
157
+ # Scale the differences to make them more visible (for visualization)
158
+ diff_scaled = diff * 10
159
+
160
+ # Look for suspicious patterns
161
+ # 1. High variance in error levels can indicate manipulation
162
+ # 2. Localized areas with significantly different error levels are suspicious
163
+ # 3. Unnaturally low error in complex areas can indicate steganography
164
+
165
+ # Calculate local variation using sliding window approach
166
+ # We're looking for areas where the difference between neighboring pixels
167
+ # has unusually high or low variance
168
+
169
+ # Use a simple method - check variance in blocks
170
+ block_size = 8 # 8x8 blocks, common in JPEG
171
+ shape = diff.shape
172
+ block_variance = []
173
+
174
+ # Sample blocks throughout the image
175
+ for i in range(0, shape[0] - block_size, block_size):
176
+ for j in range(0, shape[1] - block_size, block_size):
177
+ # Extract block for each channel
178
+ for c in range(3): # RGB channels
179
+ block = diff[i:i+block_size, j:j+block_size, c]
180
+ block_var = np.var(block)
181
+ if block_var > 0: # Avoid divisions by zero
182
+ block_variance.append(block_var)
183
+
184
+ if not block_variance:
185
+ return False, 0, {"error": "Could not calculate block variance"}
186
+
187
+ # Calculate statistics on block variances
188
+ mean_block_var = np.mean(block_variance)
189
+ max_block_var = np.max(block_variance)
190
+ std_block_var = np.std(block_variance)
191
+
192
+ # What we're looking for:
193
+ # 1. Unusually high block variance in some areas (significantly above the mean)
194
+ # 2. Unusually consistent error levels (too perfect - could indicate manipulation)
195
+
196
+ # Determine suspiciousness based on these factors
197
+ # Calculate a normalized ratio of max variance to mean variance
198
+ if mean_block_var > 0:
199
+ var_ratio = max_block_var / mean_block_var
200
+ else:
201
+ var_ratio = 0
202
+
203
+ # Calculate coefficient of variation for block variances
204
+ if mean_block_var > 0:
205
+ coeff_var = std_block_var / mean_block_var
206
+ else:
207
+ coeff_var = 0
208
+
209
+ # Heuristics based on ELA characteristics
210
+ # Unusually high variation ratio can indicate manipulation
211
+ is_suspicious_var_ratio = var_ratio > 50
212
+
213
+ # High coefficient of variation indicates inconsistent error levels
214
+ is_suspicious_coeff_var = coeff_var > 2.0
215
+
216
+ # Unusually high mean difference can indicate manipulation
217
+ is_suspicious_mean_diff = mean_diff > 15
218
+
219
+ # Combine factors
220
+ is_suspicious = (is_suspicious_var_ratio or
221
+ is_suspicious_coeff_var or
222
+ is_suspicious_mean_diff)
223
+
224
+ # Calculate confidence based on these factors
225
+ confidence = 0
226
+ if is_suspicious_var_ratio:
227
+ # Scale based on how extreme the ratio is
228
+ confidence += min(40, var_ratio / 2)
229
+ if is_suspicious_coeff_var:
230
+ # Scale based on coefficient of variation
231
+ confidence += min(30, coeff_var * 10)
232
+ if is_suspicious_mean_diff:
233
+ # Scale based on mean difference
234
+ confidence += min(30, mean_diff)
235
+
236
+ # Cap confidence at 90%
237
+ confidence = min(confidence, 90)
238
+
239
+ # Save results for return
240
+ details = {
241
+ "mean_diff": float(mean_diff),
242
+ "max_diff": float(max_diff),
243
+ "var_ratio": float(var_ratio),
244
+ "coeff_var": float(coeff_var),
245
+ "diff_image": diff_scaled.astype(np.uint8), # For visualization
246
+ "quality_used": quality
247
+ }
248
+
249
+ return is_suspicious, confidence, details
250
+
251
+ except Exception as e:
252
+ logging.debug(f"Error performing ELA on {image_path}: {str(e)}")
253
+ return False, 0, {"error": str(e)}
254
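The heart of ELA is just a recompress-and-diff; a minimal, hedged sketch of that idea (filenames and the quality value are illustrative, and this is not the full scoring logic above):

```python
import numpy as np
from PIL import Image, ImageChops

with Image.open("original.jpg") as img:  # hypothetical input
    img = img.convert("RGB")
    img.save("resaved.jpg", quality=75)  # recompress at a known quality
    with Image.open("resaved.jpg") as resaved:
        ela = ImageChops.difference(img, resaved)  # per-pixel error level
        print("mean error level:", np.array(ela).mean())
        ela.point(lambda p: min(255, p * 10)).save("ela_map.png")  # brightened map for inspection
```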
+
255
+ def check_lsb_anomalies(image_path, threshold=0.03):
256
+ """
257
+ Detect potential LSB steganography by analyzing bit plane patterns.
258
+
259
+ Args:
260
+ image_path: Path to the image
261
+ threshold: Threshold for statistical anomaly detection
262
+
263
+ Returns:
264
+ (is_suspicious, confidence, details)
265
+ """
266
+ try:
267
+ with Image.open(image_path) as img:
268
+ # Convert to RGB
269
+ if img.mode != 'RGB':
270
+ img = img.convert('RGB')
271
+
272
+ # Get image data as numpy array
273
+ img_array = np.array(img)
274
+
275
+ # Extract least significant bits from each channel
276
+ red_lsb = img_array[:,:,0] % 2
277
+ green_lsb = img_array[:,:,1] % 2
278
+ blue_lsb = img_array[:,:,2] % 2
279
+
280
+ # Calculate statistics
281
+ # Chi-square test to detect non-random patterns in LSB
282
+ red_chi = stats.chisquare(np.bincount(red_lsb.flatten()))[1]
283
+ green_chi = stats.chisquare(np.bincount(green_lsb.flatten()))[1]
284
+ blue_chi = stats.chisquare(np.bincount(blue_lsb.flatten()))[1]
285
+
286
+ # Calculate entropy of the LSB plane in bits (base 2): ~1.0 means the bits look random
287
+ red_entropy = stats.entropy(np.bincount(red_lsb.flatten()), base=2)
288
+ green_entropy = stats.entropy(np.bincount(green_lsb.flatten()), base=2)
289
+ blue_entropy = stats.entropy(np.bincount(blue_lsb.flatten()), base=2)
290
+
291
+ # Suspicious if chi-square test shows non-random distribution
292
+ # or if the LSB entropy deviates noticeably from the ~1 bit expected of random data
293
+ chi_suspicious = min(red_chi, green_chi, blue_chi) < threshold
294
+ entropy_suspicious = abs(np.mean([red_entropy, green_entropy, blue_entropy]) - 1.0) > 0.1
295
+
296
+ # Calculate a confidence score (0-100%)
297
+ confidence = 0
298
+ if chi_suspicious:
299
+ confidence += 50
300
+ if entropy_suspicious:
301
+ confidence += 30
302
+
303
+ # Additional checks for common LSB steganography patterns
304
+ # Check for abnormal color distributions
305
+ color_distribution = np.std([np.std(red_lsb), np.std(green_lsb), np.std(blue_lsb)])
306
+ if color_distribution < 0.1: # Suspicious if too uniform
307
+ confidence += 20
308
+
309
+ is_suspicious = confidence > 50
310
+
311
+ details = {
312
+ "chi_square_values": [red_chi, green_chi, blue_chi],
313
+ "entropy_values": [red_entropy, green_entropy, blue_entropy],
314
+ "color_distribution": color_distribution
315
+ }
316
+
317
+ return is_suspicious, confidence, details
318
+ except Exception as e:
319
+ logging.debug(f"Error analyzing LSB in {image_path}: {str(e)}")
320
+ return False, 0, {"error": str(e)}
321
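For intuition, the bit-plane statistics above boil down to a handful of NumPy/SciPy calls; a hedged, standalone sketch on a single channel (filename illustrative):

```python
import numpy as np
from PIL import Image
from scipy import stats

with Image.open("suspect.png") as img:  # hypothetical input
    lsb = np.array(img.convert("RGB"))[:, :, 0] & 1  # red-channel LSB plane

counts = np.bincount(lsb.ravel(), minlength=2)
p_value = stats.chisquare(counts).pvalue      # low p-value: 0/1 counts far from uniform
entropy_bits = stats.entropy(counts, base=2)  # close to 1.0 when the LSBs look random
print(p_value, entropy_bits)
```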
+
322
+ def check_file_size_anomalies(image_path):
323
+ """
324
+ Check if file size is suspicious compared to image dimensions.
325
+
326
+ Args:
327
+ image_path: Path to the image
328
+
329
+ Returns:
330
+ (is_suspicious, confidence, details)
331
+ """
332
+ try:
333
+ # Get file size
334
+ file_size = os.path.getsize(image_path)
335
+
336
+ with Image.open(image_path) as img:
337
+ width, height = img.size
338
+ pixel_count = width * height
339
+
340
+ # Calculate expected file size range based on image type
341
+ expected_size = 0
342
+ if image_path.lower().endswith('.png'):
343
+ # PNG files have variable compression but generally follow a pattern
344
+ # This is a very rough estimate
345
+ expected_min = pixel_count * 0.1 # Minimum expected size
346
+ expected_max = pixel_count * 3 # Maximum expected size
347
+ elif image_path.lower().endswith(('.jpg', '.jpeg')):
348
+ # JPEG files are typically smaller due to compression
349
+ expected_min = pixel_count * 0.05 # Minimum for very compressed JPEG
350
+ expected_max = pixel_count * 1.5 # Maximum for high quality JPEG
351
+ else:
352
+ # For other formats, use a more generic range
353
+ expected_min = pixel_count * 0.1
354
+ expected_max = pixel_count * 4
355
+
356
+ # Check if file size is within expected range
357
+ is_too_small = file_size < expected_min
358
+ is_too_large = file_size > expected_max
359
+ is_suspicious = is_too_small or is_too_large
360
+
361
+ # Calculate confidence
362
+ confidence = 0
363
+ if is_too_large:
364
+ # More likely to contain hidden data if too large
365
+ ratio = file_size / expected_max
366
+ confidence = min(int((ratio - 1) * 100), 90) # Cap at 90%
367
+ elif is_too_small:
368
+ # Less likely but still suspicious if too small
369
+ ratio = expected_min / file_size
370
+ confidence = min(int((ratio - 1) * 50), 70) # Cap at 70%
371
+
372
+ details = {
373
+ "file_size": file_size,
374
+ "expected_min": expected_min,
375
+ "expected_max": expected_max,
376
+ "pixel_count": pixel_count,
377
+ "width": width,
378
+ "height": height
379
+ }
380
+
381
+ return is_suspicious, confidence, details
382
+ except Exception as e:
383
+ logging.debug(f"Error analyzing file size in {image_path}: {str(e)}")
384
+ return False, 0, {"error": str(e)}
385
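The same heuristic can be collapsed to a single bytes-per-pixel figure; a hedged sketch (the filename is illustrative, and the rough threshold mirrors the JPEG range used above):

```python
import os
from PIL import Image

path = "suspect.jpg"  # hypothetical input
with Image.open(path) as img:
    width, height = img.size

bytes_per_pixel = os.path.getsize(path) / (width * height)
print(f"{bytes_per_pixel:.2f} bytes/pixel")  # JPEGs well above ~1.5 deserve a closer look
```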
+
386
+ def check_histogram_anomalies(image_path):
387
+ """
388
+ Analyze image histogram for unusual patterns that might indicate steganography.
389
+
390
+ Args:
391
+ image_path: Path to the image
392
+
393
+ Returns:
394
+ (is_suspicious, confidence, details)
395
+ """
396
+ try:
397
+ with Image.open(image_path) as img:
398
+ # Convert to RGB
399
+ if img.mode != 'RGB':
400
+ img = img.convert('RGB')
401
+
402
+ # Get image data as numpy array
403
+ img_array = np.array(img)
404
+
405
+ # Calculate histograms for each color channel
406
+ hist_r = np.histogram(img_array[:,:,0], bins=256, range=(0, 256))[0]
407
+ hist_g = np.histogram(img_array[:,:,1], bins=256, range=(0, 256))[0]
408
+ hist_b = np.histogram(img_array[:,:,2], bins=256, range=(0, 256))[0]
409
+
410
+ # Normalize histograms
411
+ pixel_count = img_array.shape[0] * img_array.shape[1]
412
+ hist_r = hist_r / pixel_count
413
+ hist_g = hist_g / pixel_count
414
+ hist_b = hist_b / pixel_count
415
+
416
+ # Analyze histogram characteristics
417
+ # 1. Check for comb patterns (alternating peaks/valleys) which can indicate LSB steganography
418
+ comb_pattern_r = np.sum(np.abs(np.diff(np.diff(hist_r))))
419
+ comb_pattern_g = np.sum(np.abs(np.diff(np.diff(hist_g))))
420
+ comb_pattern_b = np.sum(np.abs(np.diff(np.diff(hist_b))))
421
+
422
+ # 2. Check for unusual peaks at specific values
423
+ # LSB steganography often causes unusual spikes at even or odd values
424
+ even_odd_ratio_r = np.sum(hist_r[::2]) / np.sum(hist_r[1::2]) if np.sum(hist_r[1::2]) > 0 else 1
425
+ even_odd_ratio_g = np.sum(hist_g[::2]) / np.sum(hist_g[1::2]) if np.sum(hist_g[1::2]) > 0 else 1
426
+ even_odd_ratio_b = np.sum(hist_b[::2]) / np.sum(hist_b[1::2]) if np.sum(hist_b[1::2]) > 0 else 1
427
+
428
+ # Calculate an evenness score - how far from 1.0 (perfect balance) are we?
429
+ even_odd_deviation = max(
430
+ abs(even_odd_ratio_r - 1.0),
431
+ abs(even_odd_ratio_g - 1.0),
432
+ abs(even_odd_ratio_b - 1.0)
433
+ )
434
+
435
+ # 3. Calculate histogram smoothness (natural images tend to have smoother histograms)
436
+ smoothness_r = np.mean(np.abs(np.diff(hist_r)))
437
+ smoothness_g = np.mean(np.abs(np.diff(hist_g)))
438
+ smoothness_b = np.mean(np.abs(np.diff(hist_b)))
439
+
440
+ # Suspicious if large even/odd ratio deviation or high comb pattern values
441
+ is_suspicious_comb = max(comb_pattern_r, comb_pattern_g, comb_pattern_b) > 0.015
442
+ is_suspicious_even_odd = even_odd_deviation > 0.1
443
+ is_suspicious_smoothness = max(smoothness_r, smoothness_g, smoothness_b) > 0.01
444
+
445
+ is_suspicious = is_suspicious_comb or is_suspicious_even_odd or is_suspicious_smoothness
446
+
447
+ # Calculate confidence
448
+ confidence = 0
449
+ if is_suspicious_comb:
450
+ confidence += 30
451
+ if is_suspicious_even_odd:
452
+ confidence += 40
453
+ if is_suspicious_smoothness:
454
+ confidence += 20
455
+
456
+ # Cap confidence at 90%
457
+ confidence = min(confidence, 90)
458
+
459
+ details = {
460
+ "comb_pattern_values": [comb_pattern_r, comb_pattern_g, comb_pattern_b],
461
+ "even_odd_ratios": [even_odd_ratio_r, even_odd_ratio_g, even_odd_ratio_b],
462
+ "smoothness_values": [smoothness_r, smoothness_g, smoothness_b],
463
+ "even_odd_deviation": even_odd_deviation
464
+ }
465
+
466
+ return is_suspicious, confidence, details
467
+ except Exception as e:
468
+ logging.debug(f"Error analyzing histogram in {image_path}: {str(e)}")
469
+ return False, 0, {"error": str(e)}
470
+
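A compact sketch of the even/odd balance idea behind the histogram check above: naive LSB embedding tends to shift mass between adjacent even and odd intensity values, so the even-to-odd ratio drifts away from 1.0. The arrays below are synthetic stand-ins, not real image data.

```python
import numpy as np

def even_odd_deviation(channel):
    hist = np.histogram(channel, bins=256, range=(0, 256))[0].astype(float)
    hist /= hist.sum()
    odd_mass = hist[1::2].sum()
    ratio = hist[::2].sum() / odd_mass if odd_mass > 0 else 1.0
    return abs(ratio - 1.0)

rng = np.random.default_rng(0)
natural = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
# Crude stand-in for heavy LSB embedding: force ~80% of pixels to even values.
stego = (natural & 0xFE) | (rng.random((256, 256)) < 0.2)

print(even_odd_deviation(natural))  # close to 0
print(even_odd_deviation(stego))    # well above the 0.1 threshold used above
```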
471
+ def check_metadata_anomalies(image_path):
472
+ """
473
+ Look for unusual metadata or metadata inconsistencies that could indicate steganography.
474
+
475
+ Args:
476
+ image_path: Path to the image
477
+
478
+ Returns:
479
+ (is_suspicious, confidence, details)
480
+ """
481
+ try:
482
+ with Image.open(image_path) as img:
483
+ # Extract metadata (EXIF, etc)
484
+ metadata = {}
485
+ if hasattr(img, '_getexif') and img._getexif() is not None:
486
+ metadata = {k: v for k, v in img._getexif().items()}
487
+
488
+ # Check for known steganography software markers
489
+ steganography_markers = [
490
+ 'outguess', 'stegano', 'steghide', 'jsteg', 'f5', 'secret',
491
+ 'hidden', 'conceal', 'invisible', 'steganography'
492
+ ]
493
+
494
+ found_markers = []
495
+ for key, value in metadata.items():
496
+ if isinstance(value, str):
497
+ value_lower = value.lower()
498
+ for marker in steganography_markers:
499
+ if marker in value_lower:
500
+ found_markers.append((key, marker, value))
501
+
502
+ # Check for unusual metadata structure
503
+ is_suspicious = len(found_markers) > 0
504
+ confidence = min(len(found_markers) * 30, 90) if is_suspicious else 0
505
+
506
+ # Check for metadata size anomalies
507
+ if len(metadata) > 30: # Unusually large metadata
508
+ is_suspicious = True
509
+ confidence = max(confidence, 50)
510
+
511
+ details = {
512
+ "metadata_count": len(metadata),
513
+ "suspicious_markers": found_markers
514
+ }
515
+
516
+ return is_suspicious, confidence, details
517
+ except Exception as e:
518
+ logging.debug(f"Error analyzing metadata in {image_path}: {str(e)}")
519
+ return False, 0, {"error": str(e)}
520
+
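The marker scan above boils down to a case-insensitive substring search over string-valued EXIF entries. A minimal sketch follows; the sample dictionary is made up for illustration (tag 305 is the standard EXIF Software field).

```python
STEG_MARKERS = ['outguess', 'stegano', 'steghide', 'jsteg', 'f5', 'secret',
                'hidden', 'conceal', 'invisible', 'steganography']

def find_marker_hits(metadata):
    hits = []
    for key, value in metadata.items():
        if isinstance(value, str):
            lowered = value.lower()
            hits.extend((key, marker) for marker in STEG_MARKERS if marker in lowered)
    return hits

# Hypothetical EXIF dictionary of the kind PIL's _getexif() returns:
sample = {305: 'StegHide 0.5.1', 306: '2024:01:01 12:00:00'}
print(find_marker_hits(sample))  # [(305, 'steghide')]
```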
521
+ def check_trailing_data(image_path):
522
+ """Detect suspicious data appended after the official end markers."""
523
+ try:
524
+ with open(image_path, 'rb') as f:
525
+ data = f.read()
526
+
527
+ appended_bytes = 0
528
+ lower = image_path.lower()
529
+
530
+ if lower.endswith(('.jpg', '.jpeg', '.jfif')):
531
+ marker = data.rfind(b'\xFF\xD9')
532
+ if marker != -1 and marker < len(data) - 2:
533
+ appended_bytes = len(data) - marker - 2
534
+ elif lower.endswith('.png'):
535
+ marker = data.rfind(b'\x00\x00\x00\x00IEND\xAE\x42\x60\x82')
536
+ if marker != -1 and marker < len(data) - 12:
537
+ appended_bytes = len(data) - marker - 12
538
+ else:
539
+ return False, 0, {"error": "unsupported format"}
540
+
541
+ is_suspicious = appended_bytes > 0
542
+ confidence = 0
543
+ if is_suspicious:
544
+ ratio = appended_bytes / len(data)
545
+ confidence = min(95, 50 + int(ratio * 500))
546
+
547
+ details = {
548
+ "appended_bytes": appended_bytes
549
+ }
550
+
551
+ return is_suspicious, confidence, details
552
+ except Exception as e:
553
+ logging.debug(f"Error analyzing trailing data in {image_path}: {str(e)}")
554
+ return False, 0, {"error": str(e)}
555
+
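As a quick illustration of the trailing-data check, bytes appended after a JPEG's EOI marker (FF D9) leave the image perfectly viewable but are detectable with the same rfind logic used above. The fake payload below is illustrative only.

```python
def appended_bytes_after_jpeg_eoi(data: bytes) -> int:
    eoi = data.rfind(b'\xFF\xD9')
    if eoi == -1:
        return 0
    return len(data) - eoi - 2

# Simulated JPEG: header bytes, padding, EOI marker, then a smuggled payload.
fake_jpeg = b'\xFF\xD8' + b'\x00' * 100 + b'\xFF\xD9' + b'hidden payload'
print(appended_bytes_after_jpeg_eoi(fake_jpeg))  # 14
```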
556
+ def check_visual_noise_anomalies(image_path):
557
+ """
558
+ Analyze visual noise patterns to detect potential steganography.
559
+
560
+ Args:
561
+ image_path: Path to the image
562
+
563
+ Returns:
564
+ (is_suspicious, confidence, details)
565
+ """
566
+ try:
567
+ with Image.open(image_path) as img:
568
+ # Convert to RGB
569
+ if img.mode != 'RGB':
570
+ img = img.convert('RGB')
571
+
572
+ # Resize if image is too large for faster processing
573
+ width, height = img.size
574
+ if width > 1000 or height > 1000:
575
+ ratio = min(1000 / width, 1000 / height)
576
+ new_width = int(width * ratio)
577
+ new_height = int(height * ratio)
578
+ img = img.resize((new_width, new_height))
579
+
580
+ # Get image data as numpy array
581
+ img_array = np.array(img)
582
+
583
+ # Apply noise detection
584
+ # Calculate noise in each channel by looking at differences between adjacent pixels
585
+ red_noise = np.mean(np.abs(np.diff(img_array[:,:,0], axis=0))) + np.mean(np.abs(np.diff(img_array[:,:,0], axis=1)))
586
+ green_noise = np.mean(np.abs(np.diff(img_array[:,:,1], axis=0))) + np.mean(np.abs(np.diff(img_array[:,:,1], axis=1)))
587
+ blue_noise = np.mean(np.abs(np.diff(img_array[:,:,2], axis=0))) + np.mean(np.abs(np.diff(img_array[:,:,2], axis=1)))
588
+
589
+ # Calculate noise ratio between channels
590
+ # In natural images, noise should be roughly similar across channels
591
+ # Large differences might indicate steganographic content
592
+ avg_noise = (red_noise + green_noise + blue_noise) / 3
593
+ noise_diffs = [abs(red_noise - avg_noise), abs(green_noise - avg_noise), abs(blue_noise - avg_noise)]
594
+ max_diff_ratio = max(noise_diffs) / avg_noise if avg_noise > 0 else 0
595
+
596
+ # Suspicious if significant differences between channels
597
+ is_suspicious = max_diff_ratio > 0.2
598
+ confidence = min(int(max_diff_ratio * 100), 90) if is_suspicious else 0
599
+
600
+ details = {
601
+ "red_noise": red_noise,
602
+ "green_noise": green_noise,
603
+ "blue_noise": blue_noise,
604
+ "max_diff_ratio": max_diff_ratio
605
+ }
606
+
607
+ return is_suspicious, confidence, details
608
+ except Exception as e:
609
+ logging.debug(f"Error analyzing visual noise in {image_path}: {str(e)}")
610
+ return False, 0, {"error": str(e)}
611
+
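A minimal sketch of the per-channel noise estimate used above (mean absolute difference between adjacent pixels along both axes), run on synthetic arrays so the contrast between a smooth gradient and pure noise is obvious:

```python
import numpy as np

def channel_noise(channel):
    channel = channel.astype(float)
    return (np.mean(np.abs(np.diff(channel, axis=0))) +
            np.mean(np.abs(np.diff(channel, axis=1))))

rng = np.random.default_rng(1)
smooth = np.tile(np.arange(64, dtype=float), (64, 1))   # gradient: low noise
noisy = rng.integers(0, 256, (64, 64)).astype(float)    # random: high noise
print(channel_noise(smooth), channel_noise(noisy))
```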
612
+ def analyze_image(image_path, sensitivity='medium'):
613
+ """
614
+ Perform comprehensive steganography detection on an image.
615
+
616
+ Args:
617
+ image_path: Path to the image
618
+ sensitivity: 'low', 'medium', or 'high'
619
+
620
+ Returns:
621
+ (is_suspicious, overall_confidence, detection_details)
622
+ """
623
+ # Set threshold based on sensitivity
624
+ thresholds = {
625
+ 'low': 0.01, # More likely to find steganography but more false positives
626
+ 'medium': 0.03, # Balanced detection
627
+ 'high': 0.05 # Fewer false positives but might miss some steganography
628
+ }
629
+
630
+ confidence_required = {
631
+ 'low': 60, # Lower bar for detection
632
+ 'medium': 70, # Moderate confidence required
633
+ 'high': 80 # High confidence required to report
634
+ }
635
+
636
+ threshold = thresholds.get(sensitivity, 0.03)
637
+ min_confidence = confidence_required.get(sensitivity, 70)
638
+
639
+ try:
640
+ results = {}
641
+
642
+ # Run all detection methods
643
+ lsb_result = check_lsb_anomalies(image_path, threshold)
644
+ results['lsb_analysis'] = {
645
+ 'suspicious': lsb_result[0],
646
+ 'confidence': lsb_result[1],
647
+ 'details': lsb_result[2]
648
+ }
649
+
650
+ size_result = check_file_size_anomalies(image_path)
651
+ results['file_size_analysis'] = {
652
+ 'suspicious': size_result[0],
653
+ 'confidence': size_result[1],
654
+ 'details': size_result[2]
655
+ }
656
+
657
+ metadata_result = check_metadata_anomalies(image_path)
658
+ results['metadata_analysis'] = {
659
+ 'suspicious': metadata_result[0],
660
+ 'confidence': metadata_result[1],
661
+ 'details': metadata_result[2]
662
+ }
663
+
664
+ trailing_result = check_trailing_data(image_path)
665
+ results['trailing_data_analysis'] = {
666
+ 'suspicious': trailing_result[0],
667
+ 'confidence': trailing_result[1],
668
+ 'details': trailing_result[2]
669
+ }
670
+
671
+ noise_result = check_visual_noise_anomalies(image_path)
672
+ results['visual_noise_analysis'] = {
673
+ 'suspicious': noise_result[0],
674
+ 'confidence': noise_result[1],
675
+ 'details': noise_result[2]
676
+ }
677
+
678
+ # Add the new histogram analysis
679
+ histogram_result = check_histogram_anomalies(image_path)
680
+ results['histogram_analysis'] = {
681
+ 'suspicious': histogram_result[0],
682
+ 'confidence': histogram_result[1],
683
+ 'details': histogram_result[2]
684
+ }
685
+
686
+ # Add Error Level Analysis (ELA) for JPEG images
687
+ if image_path.lower().endswith(('.jpg', '.jpeg', '.jfif')):
688
+ ela_result = perform_ela_analysis(image_path)
689
+ results['ela_analysis'] = {
690
+ 'suspicious': ela_result[0],
691
+ 'confidence': ela_result[1],
692
+ 'details': ela_result[2]
693
+ }
694
+
695
+ # Calculate overall confidence
696
+ # Weight the different tests
697
+ weights = {
698
+ 'lsb_analysis': 0.25, # LSB is a common technique
699
+ 'histogram_analysis': 0.20, # Histogram patterns are strong indicators
700
+ 'file_size_analysis': 0.10, # Size can be indicative
701
+ 'metadata_analysis': 0.10, # Metadata less common but useful indicator
702
+ 'trailing_data_analysis': 0.10, # Detects data after EOF markers
703
+ 'visual_noise_analysis': 0.15, # Visual noise can be a good indicator
704
+ 'ela_analysis': 0.20 # Error Level Analysis is effective for JPEG manipulation
705
+ }
706
+
707
+ # Only include weights for methods that were actually run
708
+ used_weights = {k: v for k, v in weights.items() if k in results}
709
+
710
+ # Normalize the weights to ensure they sum to 1.0
711
+ weight_sum = sum(used_weights.values())
712
+ if weight_sum > 0:
713
+ used_weights = {k: v/weight_sum for k, v in used_weights.items()}
714
+
715
+ # Calculate weighted confidence
716
+ overall_confidence = sum(
717
+ results[key]['confidence'] * used_weights[key] for key in used_weights
718
+ )
719
+
720
+ # Determine if image is suspicious overall
721
+ is_suspicious = overall_confidence >= min_confidence
722
+
723
+ return is_suspicious, overall_confidence, results
724
+ except Exception as e:
725
+ logging.debug(f"Error analyzing {image_path}: {str(e)}")
726
+ return False, 0, {"error": str(e)}
727
+
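The overall score above is a weighted average over whichever checks actually ran, with the weights renormalised to sum to 1. A small sketch with made-up per-method confidences for a PNG (so no ELA result):

```python
WEIGHTS = {'lsb_analysis': 0.25, 'histogram_analysis': 0.20, 'file_size_analysis': 0.10,
           'metadata_analysis': 0.10, 'trailing_data_analysis': 0.10,
           'visual_noise_analysis': 0.15, 'ela_analysis': 0.20}

def combine(confidences):
    used = {k: w for k, w in WEIGHTS.items() if k in confidences}
    total = sum(used.values())
    return sum(confidences[k] * (w / total) for k, w in used.items())

# Hypothetical PNG result: strong LSB and histogram hits dominate the score.
print(combine({'lsb_analysis': 80, 'histogram_analysis': 70, 'file_size_analysis': 0,
               'metadata_analysis': 0, 'trailing_data_analysis': 0,
               'visual_noise_analysis': 30}))  # ~42.8
```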
728
+ def process_file(args):
729
+ """Process a single image file."""
730
+ image_path, sensitivity, output_dir = args
731
+
732
+ try:
733
+ is_suspicious, confidence, details = analyze_image(image_path, sensitivity)
734
+
735
+ result = {
736
+ 'path': image_path,
737
+ 'suspicious': is_suspicious,
738
+ 'confidence': confidence,
739
+ 'details': details
740
+ }
741
+
742
+ # Create visual report if output directory is specified
743
+ if output_dir and is_suspicious:
744
+ create_visual_report(image_path, confidence, details, output_dir)
745
+
746
+ return result
747
+ except Exception as e:
748
+ logging.debug(f"Error processing {image_path}: {str(e)}")
749
+ return {
750
+ 'path': image_path,
751
+ 'suspicious': False,
752
+ 'confidence': 0,
753
+ 'details': {'error': str(e)}
754
+ }
755
+
756
+ def create_visual_report(image_path, confidence, details, output_dir):
757
+ """
758
+ Create a visual report showing the analysis of a suspicious image.
759
+
760
+ Args:
761
+ image_path: Path to the analyzed image
762
+ confidence: Detection confidence
763
+ details: Analysis details
764
+ output_dir: Directory to save report
765
+ """
766
+ try:
767
+ # Create output directory if it doesn't exist
768
+ os.makedirs(output_dir, exist_ok=True)
769
+
770
+ # Create a figure with 3x3 subplots to accommodate ELA visualization
771
+ fig, axs = plt.subplots(3, 3, figsize=(15, 15))
772
+ fig.suptitle(f"Steganography Analysis: {os.path.basename(image_path)}\nConfidence: {confidence:.1f}%", fontsize=16)
773
+
774
+ # Original image
775
+ with Image.open(image_path) as img:
776
+ axs[0, 0].imshow(img)
777
+ axs[0, 0].set_title("Original Image")
778
+ axs[0, 0].axis('off')
779
+
780
+ # LSB visualization
781
+ img_array = np.array(img.convert('RGB'))
782
+ lsb_img = np.zeros_like(img_array)
783
+
784
+ # Amplify LSB data by 255 for visibility
785
+ lsb_img[:,:,0] = (img_array[:,:,0] % 2) * 255
786
+ lsb_img[:,:,1] = (img_array[:,:,1] % 2) * 255
787
+ lsb_img[:,:,2] = (img_array[:,:,2] % 2) * 255
788
+
789
+ axs[0, 1].imshow(lsb_img)
790
+ axs[0, 1].set_title("LSB Visualization")
791
+ axs[0, 1].axis('off')
792
+
793
+ # ELA visualization (NEW)
794
+ if 'ela_analysis' in details and 'details' in details['ela_analysis']:
795
+ ela_data = details['ela_analysis']['details']
796
+ if 'diff_image' in ela_data and 'error' not in ela_data:
797
+ # Display the ELA image
798
+ axs[0, 2].imshow(ela_data['diff_image'])
799
+ axs[0, 2].set_title("Error Level Analysis (ELA)")
800
+ axs[0, 2].axis('off')
801
+
802
+ # Add annotation with key metrics
803
+ metrics = []
804
+ if 'var_ratio' in ela_data:
805
+ metrics.append(f"Variance ratio: {ela_data['var_ratio']:.2f}")
806
+ if 'coeff_var' in ela_data:
807
+ metrics.append(f"Coefficient of var: {ela_data['coeff_var']:.2f}")
808
+ if 'mean_diff' in ela_data:
809
+ metrics.append(f"Mean diff: {ela_data['mean_diff']:.2f}")
810
+
811
+ if metrics:
812
+ axs[0, 2].text(0.05, 0.05, "\n".join(metrics), transform=axs[0, 2].transAxes,
813
+ fontsize=9, verticalalignment='bottom',
814
+ bbox=dict(boxstyle='round,pad=0.5',
815
+ facecolor='white', alpha=0.7))
816
+ else:
817
+ axs[0, 2].text(0.5, 0.5, "ELA data not available",
818
+ horizontalalignment='center', verticalalignment='center')
819
+ axs[0, 2].axis('off')
820
+ else:
821
+ axs[0, 2].text(0.5, 0.5, "ELA analysis not available",
822
+ horizontalalignment='center', verticalalignment='center')
823
+ axs[0, 2].axis('off')
824
+
825
+ # Histogram visualization
826
+ if 'histogram_analysis' in details:
827
+ # Create histograms for each channel
828
+ hist_r = np.histogram(img_array[:,:,0], bins=256, range=(0, 256))[0]
829
+ hist_g = np.histogram(img_array[:,:,1], bins=256, range=(0, 256))[0]
830
+ hist_b = np.histogram(img_array[:,:,2], bins=256, range=(0, 256))[0]
831
+
832
+ # Plot the histograms
833
+ bin_edges = np.arange(0, 257)
834
+ axs[1, 0].plot(bin_edges[:-1], hist_r, color='red', alpha=0.7)
835
+ axs[1, 0].plot(bin_edges[:-1], hist_g, color='green', alpha=0.7)
836
+ axs[1, 0].plot(bin_edges[:-1], hist_b, color='blue', alpha=0.7)
837
+ axs[1, 0].set_title("Color Channel Histograms")
838
+ axs[1, 0].set_xlabel("Pixel Value")
839
+ axs[1, 0].set_ylabel("Frequency")
840
+ axs[1, 0].legend(['Red', 'Green', 'Blue'])
841
+
842
+ # Show odd/even distribution analysis
843
+ histogram_data = details['histogram_analysis']['details']
844
+
845
+ # Get even/odd ratio values
846
+ if 'even_odd_ratios' in histogram_data:
847
+ even_odd_ratios = histogram_data['even_odd_ratios']
848
+
849
+ # Plot as bar chart
850
+ axs[1, 1].bar(['Red', 'Green', 'Blue'], even_odd_ratios,
851
+ color=['red', 'green', 'blue'], alpha=0.7)
852
+ axs[1, 1].axhline(y=1.0, linestyle='--', color='gray')
853
+ axs[1, 1].set_title("Even/Odd Value Ratios")
854
+ axs[1, 1].set_ylabel("Ratio (1.0 = balanced)")
855
+
856
+ # Annotate with explanatory text
857
+ deviation = histogram_data.get('even_odd_deviation', 0)
858
+ assessment = "SUSPICIOUS" if deviation > 0.1 else "NORMAL"
859
+ axs[1, 1].annotate(f"Deviation: {deviation:.3f}\nAssessment: {assessment}",
860
+ xy=(0.05, 0.05), xycoords='axes fraction')
861
+ else:
862
+ axs[1, 1].text(0.5, 0.5, "Histogram ratio data not available",
863
+ horizontalalignment='center', verticalalignment='center')
864
+ axs[1, 1].axis('off')
865
+ else:
866
+ axs[1, 0].text(0.5, 0.5, "Histogram analysis not available",
867
+ horizontalalignment='center', verticalalignment='center')
868
+ axs[1, 0].axis('off')
869
+ axs[1, 1].axis('off')
870
+
871
+ # Noise visualization
872
+ if 'visual_noise_analysis' in details:
873
+ noise_data = details['visual_noise_analysis']['details']
874
+ noise_values = [noise_data.get('red_noise', 0),
875
+ noise_data.get('green_noise', 0),
876
+ noise_data.get('blue_noise', 0)]
877
+
878
+ axs[1, 2].bar(['Red', 'Green', 'Blue'], noise_values, color=['red', 'green', 'blue'])
879
+ axs[1, 2].set_title("Noise Levels by Channel")
880
+ axs[1, 2].set_ylabel("Noise Level")
881
+ else:
882
+ axs[1, 2].text(0.5, 0.5, "Noise analysis not available",
883
+ horizontalalignment='center', verticalalignment='center')
884
+ axs[1, 2].axis('off')
885
+
886
+ # File size analysis visualization
887
+ if 'file_size_analysis' in details and 'details' in details['file_size_analysis']:
888
+ size_data = details['file_size_analysis']['details']
889
+
890
+ if ('file_size' in size_data and 'expected_min' in size_data
891
+ and 'expected_max' in size_data and 'pixel_count' in size_data):
892
+
893
+ # Create a simple bar chart comparing actual vs expected size
894
+ sizes = [size_data['file_size'],
895
+ size_data['expected_min'],
896
+ size_data['expected_max']]
897
+
898
+ labels = ['Actual Size', 'Min Expected', 'Max Expected']
899
+ colors = ['blue', 'green', 'green']
900
+
901
+ axs[2, 0].bar(labels, sizes, color=colors, alpha=0.7)
902
+ axs[2, 0].set_title("File Size Analysis")
903
+ axs[2, 0].set_ylabel("Size (bytes)")
904
+
905
+ # Format y-axis to show human-readable sizes
906
+ axs[2, 0].get_yaxis().set_major_formatter(
907
+ plt.FuncFormatter(lambda x, loc: f"{x/1024:.1f}KB" if x >= 1024 else f"{x}B"))
908
+
909
+ # Is it suspiciously large?
910
+ is_too_large = size_data['file_size'] > size_data['expected_max']
911
+ is_too_small = size_data['file_size'] < size_data['expected_min']
912
+
913
+ if is_too_large:
914
+ assessment = f"SUSPICIOUS: {(size_data['file_size'] - size_data['expected_max'])/1024:.1f}KB larger than expected"
915
+ elif is_too_small:
916
+ assessment = f"SUSPICIOUS: {(size_data['expected_min'] - size_data['file_size'])/1024:.1f}KB smaller than expected"
917
+ else:
918
+ assessment = "NORMAL: Size within expected range"
919
+
920
+ axs[2, 0].annotate(assessment, xy=(0.05, 0.05), xycoords='axes fraction',
921
+ fontsize=9, verticalalignment='bottom')
922
+
923
+ if 'trailing_data_analysis' in details:
924
+ tdata = details['trailing_data_analysis']['details']
925
+ if tdata.get('appended_bytes', 0) > 0:
926
+ axs[2, 0].annotate(
927
+ f"Appended data: {tdata['appended_bytes']} bytes",
928
+ xy=(0.05, 0.85), xycoords='axes fraction',
929
+ fontsize=9, verticalalignment='bottom',
930
+ color='red'
931
+ )
932
+ else:
933
+ axs[2, 0].text(0.5, 0.5, "Size analysis data not available",
934
+ horizontalalignment='center', verticalalignment='center')
935
+ axs[2, 0].axis('off')
936
+ else:
937
+ axs[2, 0].text(0.5, 0.5, "Size analysis not available",
938
+ horizontalalignment='center', verticalalignment='center')
939
+ axs[2, 0].axis('off')
940
+
941
+ # Metadata analysis visualization
942
+ if 'metadata_analysis' in details and 'details' in details['metadata_analysis']:
943
+ metadata = details['metadata_analysis']['details']
944
+
945
+ metadata_text = f"Total metadata entries: {metadata.get('metadata_count', 0)}\n\n"
946
+
947
+ if 'suspicious_markers' in metadata and metadata['suspicious_markers']:
948
+ metadata_text += "Suspicious markers found:\n"
949
+ for key, marker, value in metadata['suspicious_markers'][:3]: # Show top 3
950
+ metadata_text += f"- '{marker}' in {key}\n"
951
+
952
+ if len(metadata['suspicious_markers']) > 3:
953
+ metadata_text += f"...and {len(metadata['suspicious_markers'])-3} more\n"
954
+ else:
955
+ metadata_text += "No suspicious metadata markers found"
956
+
957
+ axs[2, 1].text(0.1, 0.5, metadata_text, fontsize=10,
958
+ verticalalignment='center', horizontalalignment='left')
959
+ axs[2, 1].set_title("Metadata Analysis")
960
+ axs[2, 1].axis('off')
961
+ else:
962
+ axs[2, 1].text(0.5, 0.5, "Metadata analysis not available",
963
+ horizontalalignment='center', verticalalignment='center')
964
+ axs[2, 1].axis('off')
965
+
966
+ # Overall analysis metrics
967
+ axs[2, 2].axis('off')
968
+ metrics_text = "Detection Confidence by Method:\n\n"
969
+
970
+ for analysis_type, results in details.items():
971
+ if isinstance(results, dict) and 'confidence' in results:
972
+ confidence_value = results['confidence']
973
+ if confidence_value > 70:
974
+ highlight = " 🚨 HIGH"
975
+ elif confidence_value > 40:
976
+ highlight = " ⚠️ MEDIUM"
977
+ else:
978
+ highlight = ""
979
+ metrics_text += f"{analysis_type.replace('_', ' ').title()}: {confidence_value:.1f}%{highlight}\n"
980
+
981
+ axs[2, 2].text(0.1, 0.5, metrics_text, fontsize=10, verticalalignment='center')
982
+ axs[2, 2].set_title("Overall Analysis Results")
983
+
984
+ # Adjust layout
985
+ plt.tight_layout(rect=[0, 0, 1, 0.95])
986
+
987
+ # Save figure
988
+ report_filename = os.path.join(output_dir, f"steganalysis_{os.path.basename(image_path)}.png")
989
+ plt.savefig(report_filename)
990
+ plt.close()
991
+
992
+ logging.debug(f"Created visual report: {report_filename}")
993
+ return report_filename
994
+ except Exception as e:
995
+ logging.debug(f"Error creating visual report for {image_path}: {str(e)}")
996
+ return None
997
+
998
+ def find_image_files(directory, recursive=True):
999
+ """Find all image files in a directory."""
1000
+ image_extensions = ('.jpg', '.jpeg', '.png', '.bmp', '.gif', '.tiff', '.tif', '.webp')
1001
+ image_files = []
1002
+
1003
+ if recursive:
1004
+ for root, _, files in os.walk(directory):
1005
+ for file in files:
1006
+ if file.lower().endswith(image_extensions):
1007
+ image_files.append(os.path.join(root, file))
1008
+ else:
1009
+ for file in os.listdir(directory):
1010
+ if os.path.isfile(os.path.join(directory, file)) and file.lower().endswith(image_extensions):
1011
+ image_files.append(os.path.join(directory, file))
1012
+
1013
+ return image_files
1014
+
1015
+ def analyze_images(directory, sensitivity='medium', recursive=True, output_dir=None, max_workers=None):
1016
+ """
1017
+ Analyze all images in a directory for steganography.
1018
+
1019
+ Args:
1020
+ directory: Directory to scan
1021
+ sensitivity: 'low', 'medium', or 'high'
1022
+ recursive: Whether to scan subdirectories
1023
+ output_dir: Directory to save visual reports
1024
+ max_workers: Number of worker processes
1025
+
1026
+ Returns:
1027
+ List of suspicious image details
1028
+ """
1029
+ # Find all image files
1030
+ image_files = find_image_files(directory, recursive)
1031
+ if not image_files:
1032
+ logging.warning("No image files found!")
1033
+ return []
1034
+
1035
+ logging.info(f"Found {len(image_files)} image files to analyze")
1036
+
1037
+ # Create output directory if specified
1038
+ if output_dir:
1039
+ os.makedirs(output_dir, exist_ok=True)
1040
+ logging.info(f"Visual reports will be saved to: {output_dir}")
1041
+
1042
+ # Prepare input arguments for workers
1043
+ input_args = [(file_path, sensitivity, output_dir) for file_path in image_files]
1044
+
1045
+ suspicious_images = []
1046
+
1047
+ # Process files in parallel
1048
+ with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
1049
+ # Colorful progress bar
1050
+ results = []
1051
+ futures = {executor.submit(process_file, arg): arg[0] for arg in input_args}
1052
+
1053
+ with tqdm(
1054
+ total=len(image_files),
1055
+ desc=f"{colorama.Fore.RED}Analyzing images for steganography{colorama.Style.RESET_ALL}",
1056
+ unit="file",
1057
+ bar_format="{desc}: {percentage:3.0f}%|{bar:30}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}]",
1058
+ colour="red"
1059
+ ) as pbar:
1060
+ for future in concurrent.futures.as_completed(futures):
1061
+ file_path = futures[future]
1062
+ try:
1063
+ result = future.result()
1064
+ results.append(result)
1065
+
1066
+ # Update progress
1067
+ pbar.update(1)
1068
+
1069
+ # Add to suspicious images if applicable
1070
+ if result['suspicious']:
1071
+ suspicious_images.append(result)
1072
+ logging.info(f"Suspicious image found: {file_path} (confidence: {result['confidence']:.1f}%)")
1073
+ except Exception as e:
1074
+ logging.error(f"Error analyzing {file_path}: {str(e)}")
1075
+ pbar.update(1)
1076
+
1077
+ # Sort suspicious images by confidence
1078
+ suspicious_images.sort(key=lambda x: x['confidence'], reverse=True)
1079
+
1080
+ logging.info(f"Analysis complete. Found {len(suspicious_images)} suspicious images")
1081
+ return suspicious_images
1082
+
1083
+ def main():
1084
+ print_banner()
1085
+
1086
+ # Check for 'q' command to quit
1087
+ if len(sys.argv) == 2 and sys.argv[1].lower() == 'q':
1088
+ print(f"{colorama.Fore.YELLOW}Exiting RAT Finder. Stay vigilant!{colorama.Style.RESET_ALL}")
1089
+ sys.exit(0)
1090
+
1091
+ parser = argparse.ArgumentParser(
1092
+ description=f'RAT Finder: Steganography Detection Tool (v{VERSION})',
1093
+ epilog='Part of the 2PAC toolkit - Created by Richard Young'
1094
+ )
1095
+
1096
+ # Main action
1097
+ parser.add_argument('directory', nargs='?', help='Directory to search for images')
1098
+ parser.add_argument('--check-file', type=str, help='Check a specific file for steganography')
1099
+
1100
+ # Options
1101
+ parser.add_argument('--sensitivity', type=str, choices=['low', 'medium', 'high'], default='medium',
1102
+ help='Set detection sensitivity level (default: medium)')
1103
+ parser.add_argument('--non-recursive', action='store_true', help='Only search in the specified directory, not subdirectories')
1104
+ parser.add_argument('--output', type=str, help='Save list of suspicious files to this file')
1105
+ parser.add_argument('--visual-reports', type=str, help='Directory to save visual analysis reports')
1106
+ parser.add_argument('--workers', type=int, default=None, help='Number of worker processes (default: CPU count)')
1107
+ parser.add_argument('--verbose', '-v', action='store_true', help='Enable verbose logging')
1108
+ parser.add_argument('--no-color', action='store_true', help='Disable colored output')
1109
+ parser.add_argument('--version', action='version', version=f'RAT Finder v{VERSION} by Richard Young')
1110
+
1111
+ args = parser.parse_args()
1112
+
1113
+ # Setup logging
1114
+ setup_logging(args.verbose, args.no_color)
1115
+
1116
+ # Handle specific file check mode
1117
+ if args.check_file:
1118
+ file_path = args.check_file
1119
+ if not os.path.exists(file_path):
1120
+ logging.error(f"Error: File not found: {file_path}")
1121
+ sys.exit(1)
1122
+
1123
+ print(f"\n{colorama.Style.BRIGHT}Analyzing file for steganography: {file_path}{colorama.Style.RESET_ALL}\n")
1124
+
1125
+ is_suspicious, confidence, details = analyze_image(file_path, args.sensitivity)
1126
+
1127
+ # Print results
1128
+ if is_suspicious:
1129
+ print(f"{colorama.Fore.RED}[!] SUSPICIOUS: This image may contain hidden data{colorama.Style.RESET_ALL}")
1130
+ print(f"Confidence: {confidence:.1f}%\n")
1131
+ else:
1132
+ print(f"{colorama.Fore.GREEN}[✓] No steganography detected in this image{colorama.Style.RESET_ALL}")
1133
+ print(f"Confidence: {(100 - confidence):.1f}% clean\n")
1134
+
1135
+ # Details of analysis
1136
+ print(f"{colorama.Fore.CYAN}Detection Details:{colorama.Style.RESET_ALL}")
1137
+
1138
+ for analysis_type, results in details.items():
1139
+ if isinstance(results, dict) and 'confidence' in results:
1140
+ detection_status = f"{colorama.Fore.RED}[DETECTED]" if results['suspicious'] else f"{colorama.Fore.GREEN}[OK]"
1141
+ print(f"{detection_status} {analysis_type.replace('_', ' ').title()}: {results['confidence']:.1f}%{colorama.Style.RESET_ALL}")
1142
+
1143
+ # Print specific findings
1144
+ if 'details' in results and isinstance(results['details'], dict):
1145
+ for key, value in results['details'].items():
1146
+ if key != 'error':
1147
+ print(f" - {key}: {value}")
1148
+
1149
+ # Create visual report if requested
1150
+ if args.visual_reports:
1151
+ report_path = create_visual_report(file_path, confidence, details, args.visual_reports)
1152
+ if report_path:
1153
+ print(f"\n{colorama.Fore.CYAN}Visual report saved to: {report_path}{colorama.Style.RESET_ALL}")
1154
+
1155
+ sys.exit(0)
1156
+
1157
+ # Check if directory is specified
1158
+ if not args.directory:
1159
+ logging.error("Error: You must specify a directory to scan or use --check-file for a specific file")
1160
+ sys.exit(1)
1161
+
1162
+ directory = Path(args.directory)
1163
+
1164
+ # Verify the directory exists
1165
+ if not directory.exists() or not directory.is_dir():
1166
+ logging.error(f"Error: {directory} is not a valid directory")
1167
+ sys.exit(1)
1168
+
1169
+ # Begin analysis
1170
+ logging.info(f"Starting steganography analysis with {args.sensitivity} sensitivity")
1171
+ logging.info(f"Scanning for images in {directory}")
1172
+
1173
+ try:
1174
+ suspicious_images = analyze_images(
1175
+ directory,
1176
+ sensitivity=args.sensitivity,
1177
+ recursive=not args.non_recursive,
1178
+ output_dir=args.visual_reports,
1179
+ max_workers=args.workers
1180
+ )
1181
+
1182
+ # Print summary
1183
+ if suspicious_images:
1184
+ count_str = f"{colorama.Fore.RED}{len(suspicious_images)}{colorama.Style.RESET_ALL}"
1185
+ logging.info(f"Found {count_str} suspicious images that may contain hidden data")
1186
+
1187
+ # Print top findings
1188
+ print("\nTop suspicious images:")
1189
+ for i, result in enumerate(suspicious_images[:10]): # Show top 10
1190
+ confidence_color = colorama.Fore.RED if result['confidence'] > 80 else colorama.Fore.YELLOW
1191
+ print(f"{i+1}. {result['path']} - Confidence: {confidence_color}{result['confidence']:.1f}%{colorama.Style.RESET_ALL}")
1192
+
1193
+ if len(suspicious_images) > 10:
1194
+ print(f"... and {len(suspicious_images) - 10} more")
1195
+ else:
1196
+ logging.info(f"{colorama.Fore.GREEN}No suspicious images found{colorama.Style.RESET_ALL}")
1197
+
1198
+ # Save output if requested
1199
+ if args.output and suspicious_images:
1200
+ with open(args.output, 'w') as f:
1201
+ for result in suspicious_images:
1202
+ f.write(f"{result['path']},{result['confidence']:.1f}\n")
1203
+ logging.info(f"Saved list of suspicious files to {args.output}")
1204
+
1205
+ except KeyboardInterrupt:
1206
+ logging.info("Operation cancelled by user")
1207
+ sys.exit(130)
1208
+ except Exception as e:
1209
+ logging.error(f"Error: {str(e)}")
1210
+ if args.verbose:
1211
+ import traceback
1212
+ traceback.print_exc()
1213
+ sys.exit(1)
1214
+
1215
+ # Add signature at the end
1216
+ if not args.no_color:
1217
+ signature = f"\n{colorama.Fore.RED}RAT Finder v{VERSION} by Richard Young{colorama.Style.RESET_ALL}"
1218
+ tagline = f"{colorama.Fore.YELLOW}\"Uncovering what's hidden in plain sight.\"{colorama.Style.RESET_ALL}"
1219
+ print(signature)
1220
+ print(tagline)
1221
+
1222
+ if __name__ == "__main__":
1223
+ main()
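Besides the CLI above, the detection functions can be called directly. A minimal sketch, assuming the script is importable as rat_finder and that a photos/ directory exists (both assumptions, not part of the commit):

```python
# Hypothetical programmatic use of the detector functions defined above.
from rat_finder import analyze_image, analyze_images  # assumes module name rat_finder

suspicious, confidence, details = analyze_image('photos/holiday.png', sensitivity='high')
if suspicious:
    print(f'Likely hidden data ({confidence:.1f}% confidence)')

# Batch scan with visual reports; analyze_images uses a process pool,
# so on spawn-based platforms run this under an if __name__ == '__main__' guard.
hits = analyze_images('photos', sensitivity='medium', recursive=True, output_dir='reports')
for hit in hits[:5]:
    print(hit['path'], hit['confidence'])
```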
requirements.txt ADDED
@@ -0,0 +1,8 @@

1
+ Pillow
2
+ tqdm
3
+ humanize
4
+ colorama
5
+ numpy
6
+ scipy
7
+ matplotlib
8
+ gradio>=4.0.0
steg_embedder.py ADDED
@@ -0,0 +1,337 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ LSB Steganography Embedder for 2PAC
4
+ Hides and extracts data in images using Least Significant Bit technique
5
+ """
6
+
7
+ import io
8
+ import hashlib
9
+ import struct
10
+ from typing import Tuple, Optional
11
+ from PIL import Image
12
+ import numpy as np
13
+
14
+
15
+ class StegEmbedder:
16
+ """
17
+ LSB (Least Significant Bit) Steganography implementation
18
+ Hides data in the least significant bits of image pixels
19
+ """
20
+
21
+ HEADER_SIZE = 13 # 1 byte encrypted flag + 4 bytes data length + 8 bytes checksum
22
+ MAGIC_NUMBER = b'2PAC' # Signature to identify embedded data
23
+
24
+ def __init__(self):
25
+ self.last_capacity = 0
26
+ self.last_used = 0
27
+
28
+ def calculate_capacity(self, image: Image.Image, bits_per_channel: int = 1) -> int:
29
+ """
30
+ Calculate how many bytes can be hidden in the image
31
+
32
+ Args:
33
+ image: PIL Image object
34
+ bits_per_channel: Number of LSBs to use per color channel (1-4)
35
+
36
+ Returns:
37
+ Maximum bytes that can be hidden
38
+ """
39
+ if image.mode not in ['RGB', 'RGBA']:
40
+ raise ValueError(f"Unsupported image mode: {image.mode}. Use RGB or RGBA.")
41
+
42
+ width, height = image.size
43
+ channels = len(image.mode) # 3 for RGB, 4 for RGBA
44
+
45
+ # Total bits available
46
+ total_bits = width * height * channels * bits_per_channel
47
+
48
+ # Account for header (magic number + encrypted flag + length + checksum)
49
+ header_bits = (len(self.MAGIC_NUMBER) + self.HEADER_SIZE) * 8
50
+
51
+ available_bits = total_bits - header_bits
52
+ capacity = available_bits // 8 # Convert to bytes
53
+
54
+ self.last_capacity = capacity
55
+ return capacity
56
+
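To make the arithmetic concrete: with 1 bit per channel, an RGB image carries width x height x 3 usable bits, minus the fixed header overhead. A worked example for a hypothetical 800x600 photo:

```python
# 800 x 600 RGB image, 1 LSB per channel:
total_bits = 800 * 600 * 3 * 1                 # 1,440,000 bits
header_bits = (4 + 1 + 4 + 8) * 8              # magic, encrypted flag, length, checksum
capacity = (total_bits - header_bits) // 8
print(capacity)                                # 179983 bytes, roughly 176 KB of hidden text
```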
57
+ def _string_to_bits(self, data: str) -> str:
58
+ """Convert string to binary representation"""
59
+ return ''.join(format(byte, '08b') for byte in data.encode('utf-8'))
60
+
61
+ def _bits_to_string(self, bits: str) -> str:
62
+ """Convert binary representation back to string"""
63
+ chars = []
64
+ for i in range(0, len(bits), 8):
65
+ byte = bits[i:i+8]
66
+ if len(byte) == 8:
67
+ chars.append(int(byte, 2))  # collect raw byte values rather than chr() so UTF-8 survives
68
+ return bytes(chars).decode('utf-8', errors='replace')
69
+
70
+ def _encrypt_data(self, data: str, password: str) -> bytes:
71
+ """Simple XOR encryption with password-derived key"""
72
+ key = hashlib.sha256(password.encode()).digest()
73
+ data_bytes = data.encode('utf-8')
74
+
75
+ encrypted = bytearray()
76
+ for i, byte in enumerate(data_bytes):
77
+ encrypted.append(byte ^ key[i % len(key)])
78
+
79
+ return bytes(encrypted)
80
+
81
+ def _decrypt_data(self, encrypted_data: bytes, password: str) -> str:
82
+ """Decrypt XOR-encrypted data"""
83
+ key = hashlib.sha256(password.encode()).digest()
84
+
85
+ decrypted = bytearray()
86
+ for i, byte in enumerate(encrypted_data):
87
+ decrypted.append(byte ^ key[i % len(key)])
88
+
89
+ return bytes(decrypted).decode('utf-8', errors='replace')
90
+
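The password option above is a repeating-key XOR with a SHA-256-derived keystream, so encrypting and decrypting are the same operation. A standalone round-trip sketch (the message and password are placeholders):

```python
import hashlib

def xor_with_password(data: bytes, password: str) -> bytes:
    key = hashlib.sha256(password.encode()).digest()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

ciphertext = xor_with_password('meet at dawn'.encode('utf-8'), 'hunter2')
plaintext = xor_with_password(ciphertext, 'hunter2')
print(plaintext.decode('utf-8'))  # meet at dawn
```

Note that a repeating 32-byte XOR keystream is lightweight obfuscation rather than authenticated encryption; the README's "password encryption" should be read in that light.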
91
+ def embed_data(
92
+ self,
93
+ image_path: str,
94
+ data: str,
95
+ output_path: str,
96
+ password: Optional[str] = None,
97
+ bits_per_channel: int = 1
98
+ ) -> Tuple[bool, str, dict]:
99
+ """
100
+ Hide data in an image using LSB steganography
101
+
102
+ Args:
103
+ image_path: Path to input image
104
+ data: Text data to hide
105
+ output_path: Path for output image (will be PNG)
106
+ password: Optional password for encryption
107
+ bits_per_channel: LSBs to use per channel (1=subtle, 2-4=more capacity)
108
+
109
+ Returns:
110
+ Tuple of (success, message, stats_dict)
111
+ """
112
+ try:
113
+ # Load image
114
+ img = Image.open(image_path)
115
+ if img.mode not in ['RGB', 'RGBA']:
116
+ img = img.convert('RGB')
117
+
118
+ # Calculate capacity
119
+ capacity = self.calculate_capacity(img, bits_per_channel)
120
+
121
+ # Encrypt data if password provided
122
+ if password:
123
+ data_bytes = self._encrypt_data(data, password)
124
+ is_encrypted = True
125
+ else:
126
+ data_bytes = data.encode('utf-8')
127
+ is_encrypted = False
128
+
129
+ data_length = len(data_bytes)
130
+
131
+ if data_length > capacity:
132
+ return False, f"Data too large! Maximum: {capacity} bytes, Provided: {data_length} bytes", {}
133
+
134
+ # Create header: MAGIC + encrypted_flag + length + checksum
135
+ checksum = hashlib.md5(data_bytes).digest()[:8]
136
+ encrypted_flag = b'\x01' if is_encrypted else b'\x00'
137
+ header = self.MAGIC_NUMBER + encrypted_flag + struct.pack('<I', data_length) + checksum
138
+
139
+ # Combine header and data
140
+ full_data = header + data_bytes
141
+
142
+ # Convert to bit string
143
+ bit_string = ''.join(format(byte, '08b') for byte in full_data)
144
+
145
+ # Embed in image
146
+ img_array = np.array(img, dtype=np.uint8)
147
+ flat_array = img_array.flatten()
148
+
149
+ bit_index = 0
150
+ for i in range(len(flat_array)):
151
+ if bit_index >= len(bit_string):
152
+ break
153
+
154
+ # Clear LSBs and set new bits
155
+ pixel = int(flat_array[i])  # work on a plain int so masking with ~(1 << bit) stays in range
156
+ for bit in range(bits_per_channel):
157
+ if bit_index >= len(bit_string):
158
+ break
159
+ # Clear bit
160
+ pixel = (pixel & ~(1 << bit))
161
+ # Set new bit
162
+ if bit_string[bit_index] == '1':
163
+ pixel = pixel | (1 << bit)
164
+ bit_index += 1
165
+
166
+ flat_array[i] = pixel
167
+
168
+ # Reshape and save
169
+ steg_img_array = flat_array.reshape(img_array.shape)
170
+ steg_img = Image.fromarray(steg_img_array, img.mode)
171
+
172
+ # Save as PNG to preserve data
173
+ steg_img.save(output_path, 'PNG', optimize=False)
174
+
175
+ self.last_used = data_length
176
+
177
+ stats = {
178
+ 'data_size': data_length,
179
+ 'capacity': capacity,
180
+ 'utilization': f"{(data_length / capacity * 100):.1f}%",
181
+ 'encrypted': is_encrypted,
182
+ 'bits_per_channel': bits_per_channel,
183
+ 'image_size': f"{img.width}x{img.height}"
184
+ }
185
+
186
+ return True, f"Successfully embedded {data_length} bytes", stats
187
+
188
+ except Exception as e:
189
+ return False, f"Error embedding data: {str(e)}", {}
190
+
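The embedded stream therefore starts with a fixed 17-byte header: the 4-byte magic, a 1-byte encrypted flag, a 4-byte little-endian length, and the first 8 bytes of the payload's MD5. A small sketch of packing and re-parsing it, mirroring embed_data and extract_data (the payload text is illustrative):

```python
import hashlib
import struct

payload = 'attack at dawn'.encode('utf-8')
header = b'2PAC' + b'\x00' + struct.pack('<I', len(payload)) + hashlib.md5(payload).digest()[:8]
stream = header + payload

# Re-parse the stream the same way extract_data does:
assert stream[:4] == b'2PAC'
encrypted = stream[4] == 1
length = struct.unpack('<I', stream[5:9])[0]
checksum = stream[9:17]
data = stream[17:17 + length]
assert checksum == hashlib.md5(data).digest()[:8]
print(encrypted, length, data.decode())  # False 14 attack at dawn
```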
191
+ def extract_data(
192
+ self,
193
+ image_path: str,
194
+ password: Optional[str] = None,
195
+ bits_per_channel: int = 1
196
+ ) -> Tuple[bool, str, str]:
197
+ """
198
+ Extract hidden data from a steganographic image
199
+
200
+ Args:
201
+ image_path: Path to image with hidden data
202
+ password: Password if data is encrypted
203
+ bits_per_channel: LSBs used per channel (must match embedding)
204
+
205
+ Returns:
206
+ Tuple of (success, message, extracted_data)
207
+ """
208
+ try:
209
+ # Load image
210
+ img = Image.open(image_path)
211
+ img_array = np.array(img, dtype=np.uint8)
212
+ flat_array = img_array.flatten()
213
+
214
+ # Extract header first
215
+ header_bits = (len(self.MAGIC_NUMBER) + 1 + 4 + 8) * 8
216
+ extracted_bits = []
217
+
218
+ bit_index = 0
219
+ for i in range(len(flat_array)):
220
+ if bit_index >= header_bits:
221
+ break
222
+ pixel = flat_array[i]
223
+ for bit in range(bits_per_channel):
224
+ if bit_index >= header_bits:
225
+ break
226
+ extracted_bits.append(str((pixel >> bit) & 1))
227
+ bit_index += 1
228
+
229
+ # Convert bits to bytes
230
+ header_bytes = bytearray()
231
+ for i in range(0, len(extracted_bits), 8):
232
+ byte_bits = ''.join(extracted_bits[i:i+8])
233
+ if len(byte_bits) == 8:
234
+ header_bytes.append(int(byte_bits, 2))
235
+
236
+ # Verify magic number
237
+ magic = bytes(header_bytes[:len(self.MAGIC_NUMBER)])
238
+ if magic != self.MAGIC_NUMBER:
239
+ return False, "No hidden data found (invalid magic number)", ""
240
+
241
+ # Parse header
242
+ offset = len(self.MAGIC_NUMBER)
243
+ is_encrypted = header_bytes[offset] == 1
244
+ offset += 1
245
+
246
+ data_length = struct.unpack('<I', bytes(header_bytes[offset:offset+4]))[0]
247
+ offset += 4
248
+
249
+ stored_checksum = bytes(header_bytes[offset:offset+8])
250
+ offset += 8
251
+
252
+ # Extract data
253
+ total_bits_needed = (len(self.MAGIC_NUMBER) + 1 + 4 + 8 + data_length) * 8
254
+ extracted_bits = []
255
+
256
+ bit_index = 0
257
+ for i in range(len(flat_array)):
258
+ if bit_index >= total_bits_needed:
259
+ break
260
+ pixel = flat_array[i]
261
+ for bit in range(bits_per_channel):
262
+ if bit_index >= total_bits_needed:
263
+ break
264
+ extracted_bits.append(str((pixel >> bit) & 1))
265
+ bit_index += 1
266
+
267
+ # Convert to bytes
268
+ data_bytes = bytearray()
269
+ for i in range(0, len(extracted_bits), 8):
270
+ byte_bits = ''.join(extracted_bits[i:i+8])
271
+ if len(byte_bits) == 8:
272
+ data_bytes.append(int(byte_bits, 2))
273
+
274
+ # Skip header and get data
275
+ data_bytes = bytes(data_bytes[offset:offset+data_length])
276
+
277
+ # Verify checksum
278
+ calculated_checksum = hashlib.md5(data_bytes).digest()[:8]
279
+ if calculated_checksum != stored_checksum:
280
+ return False, "Data corruption detected (checksum mismatch)", ""
281
+
282
+ # Decrypt if needed
283
+ if is_encrypted:
284
+ if not password:
285
+ return False, "Data is encrypted but no password provided", ""
286
+ try:
287
+ data_str = self._decrypt_data(data_bytes, password)
288
+ except Exception as e:
289
+ return False, f"Decryption failed (wrong password?): {str(e)}", ""
290
+ else:
291
+ data_str = data_bytes.decode('utf-8', errors='replace')
292
+
293
+ return True, f"Successfully extracted {data_length} bytes", data_str
294
+
295
+ except Exception as e:
296
+ return False, f"Error extracting data: {str(e)}", ""
297
+
298
+
299
+ def main():
300
+ """Command-line interface for testing"""
301
+ import argparse
302
+
303
+ parser = argparse.ArgumentParser(description='LSB Steganography Tool')
304
+ parser.add_argument('mode', choices=['embed', 'extract'], help='Operation mode')
305
+ parser.add_argument('image', help='Input image path')
306
+ parser.add_argument('--data', help='Data to embed (for embed mode)')
307
+ parser.add_argument('--output', help='Output image path (for embed mode)')
308
+ parser.add_argument('--password', help='Encryption password (optional)')
309
+ parser.add_argument('--bits', type=int, default=1, help='Bits per channel (1-4)')
310
+
311
+ args = parser.parse_args()
312
+
313
+ embedder = StegEmbedder()
314
+
315
+ if args.mode == 'embed':
316
+ if not args.data or not args.output:
317
+ print("Error: --data and --output required for embed mode")
318
+ return
319
+
320
+ success, message, stats = embedder.embed_data(
321
+ args.image, args.data, args.output, args.password, args.bits
322
+ )
323
+ print(message)
324
+ if success:
325
+ print(f"Stats: {stats}")
326
+
327
+ elif args.mode == 'extract':
328
+ success, message, data = embedder.extract_data(
329
+ args.image, args.password, args.bits
330
+ )
331
+ print(message)
332
+ if success:
333
+ print(f"Extracted data:\n{data}")
334
+
335
+
336
+ if __name__ == '__main__':
337
+ main()
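A quick end-to-end sketch of using the class above outside the CLI; the file names are placeholders and assume a cover image and writable working directory (not part of the commit):

```python
# Hypothetical round trip: hide a note in cover.png, then recover it.
from steg_embedder import StegEmbedder  # assumes the module name steg_embedder

embedder = StegEmbedder()
ok, msg, stats = embedder.embed_data(
    'cover.png', 'rendezvous at 21:00', 'cover_with_note.png',
    password='hunter2', bits_per_channel=1,
)
print(msg, stats)

ok, msg, text = embedder.extract_data('cover_with_note.png', password='hunter2', bits_per_channel=1)
print(msg)
print(text)  # rendezvous at 21:00
```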