
	# Fashion Matrix: Editing Photos by Just Talking
	[![Framework: PyTorch](https://img.shields.io/badge/Framework-PyTorch-orange.svg)](https://pytorch.org/)
	[![License](https://img.shields.io/badge/License-MIT-red.svg)](https://opensource.org/licenses/MIT)

	[[`Project page`](https://zheng-chong.github.io/FashionMatrix/)]
	[[`ArXiv`](https://arxiv.org/abs/2307.13240)]
	[[`PDF`](https://arxiv.org/pdf/2307.13240.pdf)]
	[[`Video`](https://www.youtube.com/watch?v=1z-v0RSleMg&t=3s)]
	[[`Demo(temporarily offline)`](https://0742dc8730a5a94a7a.gradio.live)]

	Fashion Matrix bridges various visual and language models and continuously refines its capabilities as a comprehensive fashion AI assistant.
	The project will continue to receive new features and optimizations.

	<div align="center">
	<img src="static/images/teaser.jpeg" width="100%" height="100%"/>
	</div>

	## Updates
	- `2023/08/01`: Code of v2.0 is released.
	- `2023/08/01`: Code of v1.1 is released. The details are a bit different from the original version (Paper).
	- `2023/08/01`: [Demo(Label) v1.1](https://0742dc8730a5a94a7a.gradio.live) is released, adding the new AI model task and security updates.
	- `2023/07/28`: Demo(Label) v1.0 is released.
	- `2023/07/26`: [Video](https://www.youtube.com/watch?v=1z-v0RSleMg&t=3s) and [Project Page](https://zheng-chong.github.io/FashionMatrix/) are released.
	- `2023/07/25`: [Arxiv Preprint](https://arxiv.org/abs/2307.13240) is released.

	## Versions

	April 28, 2023

	Fashion Matrix (Label version) v2.0

	We simplified the use of the supporting models: this version needs fewer models and less GPU memory, while retaining the original image resolution (up to 1024x1024).


	August 01, 2023

	Fashion Matrix (Label version) v1.1

	We updated the use of ControlNet, which currently uses inpaint, openpose, lineart and (softedge).
	+ Add the AI model task, which replaces the model while keeping the pose and outfits.
	+ Add NSFW (Not Safe For Work) detection to prevent inappropriate use.


	July 28, 2023

	Fashion Matrix (Label version) v1.0
	+ Basic functions: replace, remove, add, and recolor.

	## Installation
	Follow the steps in the [Installation Guide](INSTALL.md) for environment configuration and model deployment.
	All models except the LLM can be deployed on a single GPU with 13G+ VRAM.
	(A simplified version of Fashion Matrix that sacrifices some functions can be realized without the LLM;
	it may be released in the future.)
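Before deploying the models, it can help to confirm that the GPU meets the memory requirement stated above. The following is a minimal sketch, not part of the Fashion Matrix codebase: the function names `has_enough_vram` and `detect_vram_bytes` are hypothetical, and it assumes that "13G" means 13 GiB. The only library call used is PyTorch's `torch.cuda.get_device_properties`.

```python
GIB = 1024 ** 3  # bytes per GiB; reading "13G" as 13 GiB is an assumption


def has_enough_vram(total_bytes, required_gib=13.0):
    """Return True if total_bytes meets the required_gib threshold."""
    return total_bytes >= required_gib * GIB


def detect_vram_bytes(device_index=0):
    """Return the total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device_index).total_memory


if __name__ == "__main__":
    total = detect_vram_bytes()
    if total is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    elif has_enough_vram(total):
        print(f"OK: {total / GIB:.1f} GiB of VRAM available.")
    else:
        print(f"Only {total / GIB:.1f} GiB of VRAM; the README suggests 13G+.")
```

The reported total is the card's physical memory, not the memory currently free, so other processes on the same GPU can still cause out-of-memory errors.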


	## Acknowledgement
	Our work is based on the following excellent works:

	+ [Realistic Vision](https://civitai.com/models/4201/realistic-vision-v20) is a fine-tuned model derived from
	[Stable Diffusion](https://github.com/Stability-AI/stablediffusion) v1.5, designed to enhance the realism of generated
	images, with a particular focus on human portraits.
	+ [ControlNet](https://github.com/lllyasviel/ControlNet-v1-1-nightly) v1.1 offers more comprehensive and user-friendly
	conditional control models, enabling
	[the concurrent use of multiple ControlNets](https://huggingface.co/docs/diffusers/v0.18.2/en/api/pipelines/controlnet#diffusers.StableDiffusionControlNetPipeline).
	This significantly broadens the potential and applicability of text-to-image techniques.
	+ [BLIP](https://github.com/salesforce/BLIP) provides fast visual question answering within our system.
	+ [Grounded-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything) combines
	[Grounding DINO](https://github.com/IDEA-Research/GroundingDINO) and
	[Segment Anything](https://github.com/facebookresearch/segment-anything) to detect and segment anything from text inputs.
	+ [Matting Anything Model (MAM)](https://github.com/SHI-Labs/Matting-Anything) is an efficient and
	versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive
	visual or linguistic user prompt guidance.
	+ [Detectron2](https://github.com/facebookresearch/detectron2) is a next-generation library that provides state-of-the-art
	detection and segmentation algorithms. The DensePose code we adopted is based on Detectron2.
	+ [Graphonomy](https://github.com/Gaoyiminggithub/Graphonomy) provides fast human parsing across
	diverse regions of the human body.
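
To illustrate the concurrent use of multiple ControlNets mentioned in the ControlNet entry above, here is a minimal sketch using the `diffusers` API that the entry links to. It is not the Fashion Matrix implementation: the helper names are hypothetical, the Hugging Face Hub ids are assumed ControlNet v1.1 checkpoints that may differ from the ones the project loads, and the base model id is left as a parameter because the project's choice is not stated here.

```python
# Hub ids of ControlNet v1.1 checkpoints, assumed for illustration only.
CONTROLNET_IDS = {
    "openpose": "lllyasviel/control_v11p_sd15_openpose",
    "lineart": "lllyasviel/control_v11p_sd15_lineart",
    "softedge": "lllyasviel/control_v11p_sd15_softedge",
}


def resolve_controlnet_ids(names):
    """Map condition names to Hub ids, rejecting unknown names."""
    unknown = [name for name in names if name not in CONTROLNET_IDS]
    if unknown:
        raise ValueError(f"Unknown ControlNet condition(s): {unknown}")
    return [CONTROLNET_IDS[name] for name in names]


def build_pipeline(base_model_id, names):
    """Build a Stable Diffusion pipeline conditioned on several ControlNets.

    base_model_id is a Stable Diffusion v1.5-compatible checkpoint chosen
    by the caller (hypothetical parameter, not taken from the project).
    """
    # Heavy dependencies are imported lazily; they download model weights.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnets = [
        ControlNetModel.from_pretrained(model_id, torch_dtype=torch.float16)
        for model_id in resolve_controlnet_ids(names)
    ]
    # Passing a list of ControlNets enables multi-condition control.
    return StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnets, torch_dtype=torch.float16
    )
```

When calling the resulting pipeline, pass one conditioning image per ControlNet in the same order, for example `image=[pose_map, lineart_map]`, and optionally a matching list for `controlnet_conditioning_scale`.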


	## Citation

	```bibtex
	@misc{chong2023fashion,
	title={Fashion Matrix: Editing Photos by Just Talking},
	author={Zheng Chong and Xujie Zhang and Fuwei Zhao and Zhenyu Xie and Xiaodan Liang},
	year={2023},
	eprint={2307.13240},
	archivePrefix={arXiv},
	primaryClass={cs.CV}
	}
	```