SDXL-Deepfake-Detector

Detecting AI-Generated Faces with Precision and Purpose

Not just another classifier — a tool for digital truth.

Why This Matters

As generative AI (like SDXL, DALL·E, and Midjourney) becomes more accessible, the line between real and synthetic media blurs — especially for vulnerable communities. This project started as a technical experiment but evolved into a privacy-aware, open-source defense against visual misinformation, with a focus on ethical AI deployment.

Model Overview

SDXL-Deepfake-Detector is a fine-tuned vision transformer that classifies face images as artificial (label 0) or human (label 1).

Training Approach

This model was obtained by fine-tuning Organika/sdxl-detector, a vision transformer pre-trained specifically to detect SDXL-generated images, on the 140k Real and Fake Faces dataset.

This approach leverages:

Prior knowledge of SDXL artifacts from the base model
Broader generalization from a large-scale real/fake face dataset
Efficient training on limited hardware (single RTX 3060)

The result is a lightweight detector (86.8M parameters, 0.910 self-reported accuracy on 140k Real and Fake Faces) aimed at both SDXL and general diffusion-based deepfakes.

Key Highlights

Architecture: Fine-tuned Vision Transformer (ViT) via Hugging Face transformers
Dataset: 140k balanced real/fake face images
License: MIT — free for research and commercial use
Hardware: Trained on a single NVIDIA RTX 3060 (12GB VRAM) — proving high impact doesn’t require massive resources

Quick Start

Dependencies

pip install transformers torch pillow

Python Script

# predict.py
import argparse
from transformers import AutoModelForImageClassification, AutoImageProcessor
from PIL import Image
import torch
import os

def main():
    parser = argparse.ArgumentParser(
        description="Classify an image as 'artificial' or 'human' using the SDXL-Deepfake-Detector."
    )
    parser.add_argument("--image", type=str, required=True, help="Path to the input image file")
    args = parser.parse_args()

    # Validate image path
    if not os.path.isfile(args.image):
        raise FileNotFoundError(f"Image file not found: {args.image}")

    # Load model and image processor from Hugging Face Hub
    # (AutoImageProcessor replaces the deprecated AutoFeatureExtractor)
    model_name = "SadraCoding/SDXL-Deepfake-Detector"
    print(f"Loading model '{model_name}'...")
    model = AutoModelForImageClassification.from_pretrained(model_name)
    image_processor = AutoImageProcessor.from_pretrained(model_name)

    # Set device (GPU if available)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()
    print(f"Running on device: {device}")

    # Load and preprocess image
    image = Image.open(args.image).convert("RGB")
    inputs = image_processor(images=image, return_tensors="pt").to(device)

    # Inference
    with torch.no_grad():
        outputs = model(**inputs)
    
    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    predicted_label = model.config.id2label[predicted_class_idx]

    # Output
    print("Prediction Result")
    print(f"Class Index: {predicted_class_idx}")
    print(f"Label      : {predicted_label}")

if __name__ == "__main__":
    main()

How to use

python predict.py --image path/to/image

Performance & Limitations

Note: Final test accuracy will be reported after full evaluation. Preliminary results show strong generalization on SDXL- and diffusion-based face forgeries.

Known Limitations

Trained primarily on frontal, well-lit, aligned face crops — may underperform on:
- Low-resolution or blurry images
- Heavily occluded or non-frontal faces
- GAN-generated faces (e.g., StyleGAN2/3)

Label mapping:
- 0 → "artificial" (AI-generated / Deepfake)
- 1 → "human" (authentic human face)

⚠️ This tool is a probabilistic detector, not forensic proof. Use responsibly.
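
Because the output is probabilistic, it can help to look at the softmax probability rather than only the argmax label. The following is a minimal sketch, not part of the released script: `interpret_logits` and the 0.9 threshold are hypothetical choices made for illustration, and the label mapping is the one documented above.

```python
import math

# Label mapping as documented in this model card.
ID2LABEL = {0: "artificial", 1: "human"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret_logits(logits, threshold=0.9):
    """Return (label, confidence); the label is "uncertain" below the threshold.

    The threshold is an illustrative assumption, not a calibrated value.
    """
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[idx]
    label = ID2LABEL[idx] if confidence >= threshold else "uncertain"
    return label, confidence


if __name__ == "__main__":
    # Example logits; in predict.py these would come from outputs.logits[0].tolist()
    print(interpret_logits([2.5, -1.0]))
    print(interpret_logits([0.2, 0.1]))
```

Returning "uncertain" for low-confidence inputs is one way to avoid presenting a borderline score as a verdict; a suitable threshold would need to be chosen on held-out data.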

Philosophy & Ethics

This model is open-source because:

Transparency is essential in the fight against synthetic media.
Accessibility ensures researchers, journalists, and civil society can audit and use detection tools without gatekeeping.
Privacy matters: after the weights are downloaded once from the Hugging Face Hub, inference runs locally, so your images never leave your device.

As a developer from a vulnerable community, I believe AI safety tools must be inclusive, ethical, and human-centered — not just technically accurate.

Acknowledgements

Dataset: 140k Real and Fake Faces by xhlulu
Framework: Hugging Face Transformers
Model & Code: GitHub Repository | Hugging Face Hub

How to Contribute

Fine-tune this model on your domain-specific data using Hugging Face Trainer.
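
As a starting point, fine-tuning with `Trainer` could look roughly like the sketch below. It is not the author's training script. The folder layout (`data_dir/train/<class>/...`, with class folders sorting so that the synthetic class gets index 0), the hyperparameters, and the names `fine_tune` and `TRAINING_KWARGS` are all assumptions made for illustration. It also assumes the `datasets` package is installed alongside `transformers` and `torch`.

```python
# Label mapping documented in this model card.
ID2LABEL = {0: "artificial", 1: "human"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

# Hypothetical hyperparameters for a single 12GB GPU; tune them for your data.
TRAINING_KWARGS = {
    "output_dir": "finetuned-detector",
    "per_device_train_batch_size": 16,
    "num_train_epochs": 2,
    "learning_rate": 2e-5,
    # Keep the raw "image" column so the on-the-fly transform can read it.
    "remove_unused_columns": False,
}


def fine_tune(data_dir, base_model="SadraCoding/SDXL-Deepfake-Detector"):
    """Fine-tune on an image folder laid out as data_dir/train/<class>/<files>.

    Assumes the class folders sort so that index 0 is the synthetic class
    (for example "fake" before "real"), matching ID2LABEL.
    """
    # Imported lazily so the constants above can be inspected without
    # the heavy dependencies installed.
    import torch
    from datasets import load_dataset
    from transformers import (
        AutoImageProcessor,
        AutoModelForImageClassification,
        Trainer,
        TrainingArguments,
    )

    processor = AutoImageProcessor.from_pretrained(base_model)
    model = AutoModelForImageClassification.from_pretrained(
        base_model, id2label=ID2LABEL, label2id=LABEL2ID
    )

    dataset = load_dataset("imagefolder", data_dir=data_dir)

    def transform(batch):
        # Preprocess lazily, one batch at a time, when examples are accessed.
        images = [img.convert("RGB") for img in batch["image"]]
        pixel_values = processor(images=images, return_tensors="pt")["pixel_values"]
        batch["pixel_values"] = list(pixel_values)
        return batch

    train_set = dataset["train"].with_transform(transform)

    def collate(examples):
        # Stack per-example tensors into the batch the model expects.
        return {
            "pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
            "labels": torch.tensor([ex["label"] for ex in examples]),
        }

    trainer = Trainer(
        model=model,
        args=TrainingArguments(**TRAINING_KWARGS),
        train_dataset=train_set,
        data_collator=collate,
    )
    trainer.train()
    trainer.save_model(TRAINING_KWARGS["output_dir"])
    processor.save_pretrained(TRAINING_KWARGS["output_dir"])
```

Calling `fine_tune("path/to/your/data")` would then write the adapted model and its image processor to the `finetuned-detector` directory. Check the label order of your own folders before training, since a swapped mapping silently inverts the predictions.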

Built with curiosity, ethics, and a 12GB GPU — because impactful AI doesn’t require a data center, just purpose.
— Sadra Milani Moghaddam

Model size: 86.8M params (Safetensors; tensor types I64 and F32)

Evaluation results

Accuracy on 140k Real and Fake Faces (self-reported): 0.910
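
For context, accuracy is the fraction of test images whose predicted label matches the ground truth. The sketch below shows how such a figure, together with per-class recall, could be computed from a list of predictions. The function names and the toy labels are illustrative only; this is not the author's evaluation code or data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each label: correct predictions / true examples of that label."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Toy labels: 0 = artificial, 1 = human
    y_true = [0, 0, 0, 1, 1, 1, 1, 0]
    y_pred = [0, 0, 1, 1, 1, 0, 1, 0]
    print("accuracy:", accuracy(y_true, y_pred))
    print("recall per class:", per_class_recall(y_true, y_pred))
```

Reporting recall per class alongside accuracy shows whether errors fall mostly on synthetic or on authentic faces, which a single accuracy number on a balanced dataset does not reveal.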