Spaces:
				
			
			
	
			
			
		Runtime error
		
	
	
	
			
			
	
	
	
	
		
		A newer version of the Gradio SDK is available:
									5.49.1
OminiControl Training ๐ ๏ธ
Preparation
Setup
- Environment
conda create -n omini python=3.10 conda activate omini - Requirements
pip install -r train/requirements.txt 
Dataset
- Download dataset Subject200K. (subject-driven generation)
bash train/script/data_download/data_download1.sh - Download dataset text-to-image-2M. (spatial control task)
Note: By default, only a few files are downloaded. You can modifybash train/script/data_download/data_download2.shdata_download2.shto download additional datasets. Remember to update the config file to specify the training data accordingly. 
Training
Start training training
Config file path: ./train/config
Scripts path: ./train/script
- Subject-driven generation
bash train/script/train_subject.sh - Spatial control task
bash train/script/train_canny.sh 
Note: Detailed WanDB settings and GPU settings can be found in the script files and the config files.
Other spatial control tasks
This repository supports 5 spatial control tasks:
- Canny edge to image (
canny) - Image colorization (
coloring) - Image deblurring (
deblurring) - Depth map to image (
depth) - Image to depth map  (
depth_pred) - Image inpainting (
fill) - Super resolution (
sr) 
You can modify the condition_type parameter in config file config/canny_512.yaml to switch between different tasks.
Customize your own task
You can customize your own task by constructing a new dataset and modifying the training code.
Instructions
Dataset :
Construct a new dataset with the following format: (
src/train/data.py)class MyDataset(Dataset): def __init__(self, ...): ... def __len__(self): ... def __getitem__(self, idx): ... return { "image": image, "condition": condition_img, "condition_type": "your_condition_type", "description": description, "position_delta": position_delta }Note: For spatial control tasks, set the
position_deltato be[0, 0]. For non-spatial control tasks, setposition_deltato be[0, -condition_width // 16].Condition:
Add a new condition type in the
Conditionclass. (src/flux/condition.py)condition_dict = { ... "your_condition_type": your_condition_id_number, # Add your condition type here } ... if condition_type in [ ... "your_condition_type", # Add your condition type here ]: ...Test:
Add a new test function for your task. (
src/train/callbacks.py)if self.condition_type == "your_condition_type": condition_img = ( Image.open("images/vase.jpg") .resize((condition_size, condition_size)) .convert("RGB") ) ... test_list.append((condition_img, [0, 0], "A beautiful vase on a table."))Import relevant dataset in the training script Update the file in the following section. (
src/train/train.py)from .data import ( ImageConditionDataset, Subject200KDateset, MyDataset ) ... # Initialize dataset and dataloader if training_config["dataset"]["type"] == "your_condition_type": ...
Hardware requirement
Note: Memory optimization (like dynamic T5 model loading) is pending implementation.
Recommanded
- Hardware: 2x NVIDIA H100 GPUs
 - Memory: ~80GB GPU memory
 
Minimal
- Hardware: 1x NVIDIA L20 GPU
 - Memory: ~48GB GPU memory