---
license: mit
tags:
- image-classification
- deepfake-detection
- computer-vision
- vision-transformer
- sdxl
- fake-face-detection
datasets:
- xhlulu/140k-real-and-fake-faces
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: SDXL-Deepfake-Detector
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: 140k Real and Fake Faces
      type: xhlulu/140k-real-and-fake-faces
    metrics:
    - type: accuracy
      value: 0.86
      name: Accuracy
---
# SDXL-Deepfake-Detector
### Detecting AI-Generated Faces with Precision and Purpose
> *Not just another classifier — a tool for digital truth.*

Developed by **[Sadra Milani Moghaddam](https://sadramilani.ir/)**

---
## Why This Matters
As generative AI (like SDXL, DALL·E, and Midjourney) becomes more accessible, the line between real and synthetic media blurs — especially for vulnerable communities. This project started as a technical experiment but evolved into a **privacy-aware, open-source defense** against visual misinformation, with a focus on **ethical AI deployment**.

---
## Model Overview
**SDXL-Deepfake-Detector** is a fine-tuned vision transformer that classifies face images as **artificial (0)** or **human (1)**, reaching **86% accuracy** on the [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) dataset.
## Training Approach
This model was obtained by **fine-tuning** the [`Organika/sdxl-detector`](https://huggingface.co/Organika/sdxl-detector) — a vision transformer pre-trained specifically to detect SDXL-generated faces — on the [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) dataset.
This approach leverages:
- Prior knowledge of SDXL artifacts from the base model
- Broader generalization from a large-scale real/fake face dataset
- Efficient training on limited hardware (single RTX 3060)

The result is a lightweight, high-accuracy detector optimized for **both SDXL and general diffusion-based deepfakes**.
### Key Highlights
- **Architecture**: Fine-tuned Vision Transformer (ViT) via Hugging Face `transformers`
- **Dataset**: 140k balanced real/fake face images
- **License**: [MIT](https://opensource.org/licenses/MIT) — free for research and commercial use
- **Hardware**: Trained on a single NVIDIA RTX 3060 (12GB VRAM) — proving high impact doesn’t require massive resources
---
## Quick Start
### Dependencies
```bash
pip install transformers torch pillow
```
### Python Script
```python
# predict.py
import argparse
import os

import torch
from PIL import Image
from transformers import AutoFeatureExtractor, AutoModelForImageClassification


def main():
    parser = argparse.ArgumentParser(
        description="Classify an image as 'artificial' or 'human' using the SDXL-Deepfake-Detector."
    )
    parser.add_argument("--image", type=str, required=True, help="Path to the input image file")
    args = parser.parse_args()

    # Validate the image path
    if not os.path.isfile(args.image):
        raise FileNotFoundError(f"Image file not found: {args.image}")

    # Load the model and feature extractor from the Hugging Face Hub
    model_name = "SADRACODING/SDXL-Deepfake-Detector"
    print(f"Loading model '{model_name}'...")
    model = AutoModelForImageClassification.from_pretrained(model_name)
    feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)

    # Set the device (GPU if available)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()
    print(f"Running on device: {device}")

    # Load and preprocess the image
    image = Image.open(args.image).convert("RGB")
    inputs = feature_extractor(images=image, return_tensors="pt").to(device)

    # Inference
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    predicted_label = model.config.id2label[predicted_class_idx]

    # Output
    print("Prediction Result")
    print(f"Class Index: {predicted_class_idx}")
    print(f"Label      : {predicted_label}")


if __name__ == "__main__":
    main()
```
### How to Use
```bash
python predict.py --image path/to/image
```
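
If you prefer a shorter path, the same checkpoint can also be run through the `transformers` `pipeline` API, which bundles preprocessing, inference, and label mapping in one call (a minimal sketch using the dependencies listed above):

```python
# Minimal sketch: one-call classification via the transformers pipeline.
from transformers import pipeline

detector = pipeline("image-classification", model="SADRACODING/SDXL-Deepfake-Detector")

# Returns a list of {"label": ..., "score": ...} dicts, best match first.
print(detector("path/to/image"))
```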
## Performance & Limitations
> **Note**: Reported accuracy is **86%** on the 140k Real and Fake Faces dataset; a fuller evaluation is still in progress. Preliminary results show strong generalization to SDXL- and other diffusion-based face forgeries.
### Known Limitations
- Trained primarily on **frontal, well-lit, aligned face crops** — may underperform on:
  - Low-resolution or blurry images
  - Heavily occluded or non-frontal faces
  - GAN-generated faces (e.g., StyleGAN2/3)
- Label mapping:
  - `0` → `"artificial"` (AI-generated / Deepfake)
  - `1` → `"human"` (authentic human face)
> ⚠️ This tool is a **probabilistic detector, not forensic proof**. Use it responsibly.
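
Because the output is probabilistic, it is often more informative to inspect both class scores than the argmax label alone. A minimal, self-contained sketch (the image path is a placeholder):

```python
# Minimal sketch: print softmax scores for both classes instead of
# only the winning label. Scores are confidences, not forensic proof.
import torch
from PIL import Image
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

model_name = "SADRACODING/SDXL-Deepfake-Detector"
model = AutoModelForImageClassification.from_pretrained(model_name).eval()
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)

image = Image.open("path/to/image").convert("RGB")  # placeholder path
inputs = feature_extractor(images=image, return_tensors="pt")

with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

for idx, score in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {score:.2%}")
```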
---
## Philosophy & Ethics
This model is open-source because:
- **Transparency** is essential in the fight against synthetic media.
- **Accessibility** ensures researchers, journalists, and civil society can audit and use detection tools without gatekeeping.
- **Privacy matters**: The model runs **entirely offline** — your images never leave your device.

As a developer from a vulnerable community, I believe AI safety tools must be **inclusive, ethical, and human-centered** — not just technically accurate.
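
Once the weights are cached locally, inference needs no network access at all. To make that guarantee explicit, `from_pretrained` can be restricted to the local cache (a minimal sketch; it assumes the model has already been downloaded once):

```python
# Minimal sketch: after a one-time download, load strictly from the
# local cache so no request (and no image) ever leaves the machine.
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

model_name = "SADRACODING/SDXL-Deepfake-Detector"
model = AutoModelForImageClassification.from_pretrained(model_name, local_files_only=True)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name, local_files_only=True)
```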
---
## Acknowledgements
- **Dataset**: [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) by xhlulu
- **Framework**: [Hugging Face Transformers](https://huggingface.co/docs/transformers)
- **Model & Code**: [GitHub Repository](https://github.com/SadraCoding/SDXL-Deepfake-Detector) | [Hugging Face Hub](https://huggingface.co/SADRACODING/SDXL-Deepfake-Detector)
---
## How to Contribute
Fine-tune this model on your domain-specific data using Hugging Face `Trainer`.
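
A minimal fine-tuning sketch, assuming an `imagefolder`-style dataset whose subdirectory names serve as labels; the `data` path and hyperparameters below are illustrative, not tuned values:

```python
# Minimal sketch: continue training the detector on a domain-specific
# imagefolder dataset (e.g. data/train/artificial/, data/train/human/).
from datasets import load_dataset
from transformers import (
    AutoFeatureExtractor,
    AutoModelForImageClassification,
    Trainer,
    TrainingArguments,
)

model_name = "SADRACODING/SDXL-Deepfake-Detector"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)

# "imagefolder" infers class labels from subdirectory names.
train_dataset = load_dataset("imagefolder", data_dir="data")["train"]

def transform(batch):
    # Applied lazily per batch: preprocess images, pass labels through.
    inputs = feature_extractor(
        [img.convert("RGB") for img in batch["image"]], return_tensors="pt"
    )
    inputs["labels"] = batch["label"]
    return inputs

train_dataset.set_transform(transform)

training_args = TrainingArguments(
    output_dir="sdxl-detector-finetuned",
    remove_unused_columns=False,  # keep the raw "image" column for the transform
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

Trainer(model=model, args=training_args, train_dataset=train_dataset).train()
```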
---
> *Built with curiosity, ethics, and a 12GB GPU — because impactful AI doesn’t require a data center, just purpose.*
> — Sadra Milani Moghaddam