---
license: mit
tags:
- image-classification
- deepfake-detection
- computer-vision
- vision-transformer
- sdxl
- fake-face-detection
datasets:
- xhlulu/140k-real-and-fake-faces
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: SDXL-Deepfake-Detector
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: 140k Real and Fake Faces
      type: xhlulu/140k-real-and-fake-faces
    metrics:
    - type: accuracy
      value: 0.86
      name: Accuracy
---

# SDXL-Deepfake-Detector

### Detecting AI-Generated Faces with Precision and Purpose

> *Not just another classifier — a tool for digital truth.*

Developed by **[Sadra Milani Moghaddam](https://sadramilani.ir/)**

---

## Why This Matters

As generative AI (like SDXL, DALL·E, and Midjourney) becomes more accessible, the line between real and synthetic media blurs — especially for vulnerable communities. This project started as a technical experiment but evolved into a **privacy-aware, open-source defense** against visual misinformation, with a focus on **ethical AI deployment**.

---

## Model Overview

**SDXL-Deepfake-Detector** is a fine-tuned vision transformer that classifies human faces as **artificial (0)** or **human (1)**, achieving an accuracy of **86%**.

## Training Approach

This model was obtained by **fine-tuning** the [`Organika/sdxl-detector`](https://huggingface.co/Organika/sdxl-detector) — a vision transformer pre-trained specifically to detect SDXL-generated faces — on the [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) dataset.

This approach leverages:

- Prior knowledge of SDXL artifacts from the base model
- Broader generalization from a large-scale real/fake face dataset
- Efficient training on limited hardware (single RTX 3060)

The result is a lightweight, high-accuracy detector optimized for **both SDXL and general diffusion-based deepfakes**.
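
The original training script is not reproduced on this card; the sketch below only illustrates how a fine-tune of this kind can be initialized from the base checkpoint with this model's label mapping. The `ignore_mismatched_sizes` flag and the processor choice are assumptions made for illustration, not details of the actual run.

```python
# Illustrative sketch: initialize a fine-tune from the Organika/sdxl-detector
# checkpoint with the label mapping used by this model (0 = artificial, 1 = human).
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

base_checkpoint = "Organika/sdxl-detector"
id2label = {0: "artificial", 1: "human"}
label2id = {label: idx for idx, label in id2label.items()}

model = AutoModelForImageClassification.from_pretrained(
    base_checkpoint,
    num_labels=2,
    id2label=id2label,
    label2id=label2id,
    ignore_mismatched_sizes=True,  # allow a fresh classification head if shapes differ
)
feature_extractor = AutoFeatureExtractor.from_pretrained(base_checkpoint)
```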

### Key Highlights

- **Architecture**: Fine-tuned Vision Transformer (ViT) via Hugging Face `transformers`
- **Dataset**: 140k balanced real/fake face images
- **License**: [MIT](https://opensource.org/licenses/MIT) — free for research and commercial use
- **Hardware**: Trained on a single NVIDIA RTX 3060 (12GB VRAM) — proving high impact doesn’t require massive resources

---

## Quick Start

### Dependencies

```bash
pip install transformers torch pillow
```

### Python Script

```python
# predict.py
import argparse
import os

import torch
from PIL import Image
from transformers import AutoModelForImageClassification, AutoFeatureExtractor


def main():
    parser = argparse.ArgumentParser(
        description="Classify an image as 'artificial' or 'human' using the SDXL-Deepfake-Detector."
    )
    parser.add_argument("--image", type=str, required=True, help="Path to the input image file")
    args = parser.parse_args()

    # Validate image path
    if not os.path.isfile(args.image):
        raise FileNotFoundError(f"Image file not found: {args.image}")

    # Load model and feature extractor from the Hugging Face Hub
    model_name = "SADRACODING/SDXL-Deepfake-Detector"
    print(f"Loading model '{model_name}'...")
    model = AutoModelForImageClassification.from_pretrained(model_name)
    feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)

    # Set device (GPU if available)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()
    print(f"Running on device: {device}")

    # Load and preprocess the image
    image = Image.open(args.image).convert("RGB")
    inputs = feature_extractor(images=image, return_tensors="pt").to(device)

    # Inference
    with torch.no_grad():
        outputs = model(**inputs)

    logits = outputs.logits
    predicted_class_idx = logits.argmax(-1).item()
    predicted_label = model.config.id2label[predicted_class_idx]

    # Output
    print("Prediction Result")
    print(f"Class Index: {predicted_class_idx}")
    print(f"Label: {predicted_label}")


if __name__ == "__main__":
    main()
```

### How to Use

```bash
python predict.py --image path/to/image
```
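
If you prefer a one-liner over the full script, the same checkpoint can also be loaded through the high-level `pipeline` API. This is a minimal sketch rather than part of the repository:

```python
# Minimal sketch: the same model via the transformers pipeline API.
from transformers import pipeline

detector = pipeline("image-classification", model="SADRACODING/SDXL-Deepfake-Detector")

# Returns a list of {"label", "score"} dicts, highest score first.
print(detector("path/to/image.jpg"))
```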

## Performance & Limitations

> **Note**: Final test accuracy will be reported after full evaluation. Preliminary results show strong generalization on SDXL- and diffusion-based face forgeries.

### Known Limitations

- Trained primarily on **frontal, well-lit, aligned face crops** — may underperform on:
  - Low-resolution or blurry images
  - Heavily occluded or non-frontal faces
  - GAN-generated faces (e.g., StyleGAN2/3)
- Label mapping:
  - `0` → `"artificial"` (AI-generated / Deepfake)
  - `1` → `"human"` (authentic human face)

> ⚠️ This tool is **not a forensic proof**, but a probabilistic detector. Use responsibly.
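
Because the detector is probabilistic, a confidence score is often more informative than the hard label printed by the script above. A minimal sketch, assuming `model` and `inputs` have been prepared as in `predict.py`:

```python
# Minimal sketch: report per-class probabilities instead of only the argmax label.
import torch

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)[0]  # one row per image; here a single image
for idx, score in enumerate(probs.tolist()):
    # id2label maps 0 -> "artificial", 1 -> "human"
    print(f"{model.config.id2label[idx]}: {score:.3f}")
```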

---

## Philosophy & Ethics

This model is open-source because:

- **Transparency** is essential in the fight against synthetic media.
- **Accessibility** ensures researchers, journalists, and civil society can audit and use detection tools without gatekeeping.
- **Privacy matters**: The model runs **entirely offline** — your images never leave your device.
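
The only network access is the one-time download of the weights; inference itself stays on your machine. If you want to guarantee that later runs never contact the Hub, you can force loading from the local cache. This is an optional sketch, not part of `predict.py`:

```python
# Optional sketch: after the first download, load strictly from the local cache
# so no network requests are made at all.
from transformers import AutoFeatureExtractor, AutoModelForImageClassification

model_name = "SADRACODING/SDXL-Deepfake-Detector"
model = AutoModelForImageClassification.from_pretrained(model_name, local_files_only=True)
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name, local_files_only=True)
# Alternatively, set the environment variable HF_HUB_OFFLINE=1 before running.
```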

As a developer from a vulnerable community, I believe AI safety tools must be **inclusive, ethical, and human-centered** — not just technically accurate.

---

## Acknowledgements

- **Dataset**: [140k Real and Fake Faces](https://www.kaggle.com/datasets/xhlulu/140k-real-and-fake-faces) by xhlulu
- **Framework**: [Hugging Face Transformers](https://huggingface.co/docs/transformers)
- **Model & Code**: [GitHub Repository](https://github.com/SadraCoding/SDXL-Deepfake-Detector) | [Hugging Face Hub](https://huggingface.co/SADRACODING/SDXL-Deepfake-Detector)

---

## How to Contribute

Fine-tune this model on your own domain-specific data using the Hugging Face `Trainer` API; a minimal starting point is sketched below.
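
The sketch is only a starting point, not the original training script. It assumes the `datasets` library is installed, an ImageFolder-style directory with one subfolder per class, and placeholder hyperparameters that you should tune for your own data and hardware:

```python
# Illustrative fine-tuning sketch with the Hugging Face Trainer.
# Assumptions: `pip install datasets`, images laid out as data/train/<class>/*.jpg,
# and hyperparameters chosen only as placeholders.
import torch
from datasets import load_dataset
from transformers import (
    AutoFeatureExtractor,
    AutoModelForImageClassification,
    Trainer,
    TrainingArguments,
)

model_name = "SADRACODING/SDXL-Deepfake-Detector"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)

dataset = load_dataset("imagefolder", data_dir="data")

def preprocess(batch):
    # Convert PIL images to pixel values; keep the integer class labels.
    inputs = feature_extractor([img.convert("RGB") for img in batch["image"]], return_tensors="pt")
    inputs["labels"] = batch["label"]
    return inputs

dataset = dataset.with_transform(preprocess)

def collate_fn(examples):
    return {
        "pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
        "labels": torch.tensor([ex["labels"] for ex in examples]),
    }

training_args = TrainingArguments(
    output_dir="sdxl-deepfake-detector-finetuned",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
    remove_unused_columns=False,  # keep the "image" column so the transform can see it
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    data_collator=collate_fn,
)
trainer.train()
```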

---

> *Built with curiosity, ethics, and a 12GB GPU — because impactful AI doesn’t require a data center, just purpose.*
>
> — Sadra Milani Moghaddam