Space: iBrokeTheCode/Multimodal_Product_Classification

iBrokeTheCode committed on Aug 29

Commit 5ff38a4 (1 parent: a061490)

chore: Add About and Model sections

Files changed (3)

.gitignore +216 -0
__pycache__/predictor.cpython-310.pyc +0 -0
app.py +48 -25

.gitignore ADDED

	@@ -0,0 +1,216 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[codz]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py.cover
+.hypothesis/
+.pytest_cache/
+cover/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+.pybuilder/
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# UV
+#   Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#uv.lock
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+#poetry.toml
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#   pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python.
+#   https://pdm-project.org/en/latest/usage/project/#working-with-version-control
+#pdm.lock
+#pdm.toml
+.pdm-python
+.pdm-build/
+# pixi
+#   Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control.
+#pixi.lock
+#   Pixi creates a virtual environment in the .pixi directory, just like venv module creates one
+#   in the .venv directory. It is recommended not to include this directory in version control.
+.pixi
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# Redis
+*.rdb
+*.aof
+*.pid
+# RabbitMQ
+mnesia/
+rabbitmq/
+rabbitmq-data/
+# ActiveMQ
+activemq-data/
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.envrc
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# pytype static type analyzer
+.pytype/
+# Cython debug symbols
+cython_debug/
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+# Abstra
+# Abstra is an AI-powered process automation framework.
+# Ignore directories containing user credentials, local state, and settings.
+# Learn more at https://abstra.io/docs
+.abstra/
+# Visual Studio Code
+#  Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
+#  that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
+#  and can be added to the global gitignore or merged into this file. However, if you prefer,
+#  you could uncomment the following to ignore the entire vscode folder
+# .vscode/
+# Ruff stuff:
+.ruff_cache/
+# PyPI configuration file
+.pypirc
+# Marimo
+marimo/_static/
+marimo/_lsp/
+__marimo__/
+# Streamlit
+.streamlit/secrets.toml

__pycache__/predictor.cpython-310.pyc DELETED

Binary file (1.27 kB)
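
Review note: the deleted `.pyc` is covered by two of the rules added to `.gitignore` above (`__pycache__/` and `*.py[codz]`). Ignore rules only apply to untracked files, which is why the already-committed file also had to be deleted in this commit. The sketch below checks the filename rule with Python's `fnmatch`, used here only as an approximation of Git's matcher (it does not model `/` anchoring, `**`, or negation). All file names except `predictor.cpython-310.pyc` are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern copied from the .gitignore added in this commit.
PATTERN = "*.py[codz]"

# Only the first name appears in the commit; the rest are made-up examples.
names = [
    "predictor.cpython-310.pyc",
    "predictor.pyo",
    "app.py",
    "notes.pyx",
]

# fnmatchcase is case-sensitive on every platform.
matches = {name: fnmatchcase(name, PATTERN) for name in names}

for name, ignored in matches.items():
    print(f"{name}: {'ignored' if ignored else 'kept'}")
```

The bracket expression requires exactly one of `c`, `o`, `d`, or `z` after `.py`, so plain `.py` sources are not matched.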

app.py CHANGED

@@ -41,7 +41,7 @@ with gr.Blocks(
 ) as demo:
     with gr.Tabs():
         # 📌 APP TAB
-        with gr.TabItem("App"):
             gr.Markdown("""
                 <div style="text-align: center;">
                     <h1>🛍️ Multimodal Product Classification</h1>
@@ -101,39 +101,62 @@ with gr.Blocks(
                         )
         # 📌 ABOUT TAB
-        with gr.TabItem("About"):
             gr.Markdown("""
-## About This Project
-- This project is an image classification app powered by a Convolutional Neural Network (CNN).
-- Simply upload an image, and the app predicts its category from over 1,000 classes using a pre-trained ResNet50 model.
-- Originally developed as a multi-service ML system (FastAPI + Redis + Streamlit), this version has been adapted into a single Streamlit app for lightweight, cost-effective deployment on Hugging Face Spaces.
-## Model & Description
-- Model: ResNet50 (pre-trained on the ImageNet dataset with 1,000+ categories).
-- Pipeline: Images are resized, normalized, and passed to the model.
-- Output: The app displays the Top prediction with confidence score.
-ResNet50 is widely used in both research and production, making it an excellent showcase of deep learning capabilities and transferable ML skills.
 """)
         # 📌 MODEL TAB
-        with gr.TabItem("Model"):
             gr.Markdown("""
-## Original Architecture
-- FastAPI → REST API for image processing
-- Redis → Message broker for service communication
-- Streamlit → Interactive web UI
-- TensorFlow → Deep learning inference engine
-- Locust → Load testing & benchmarking
-- Docker Compose → Service orchestration
-## Simplified Version
-- Streamlit only → UI and model combined in a single app
-- TensorFlow (ResNet50) → Core prediction engine
-- Docker → Containerized for Hugging Face Spaces deployment
-This evolution demonstrates the ability to design a scalable microservices system and also adapt it into a lightweight single-service solution for cost-effective demos.
 """)
     # 📌 FOOTER

 ) as demo:
     with gr.Tabs():
         # 📌 APP TAB
+        with gr.TabItem("🚀 App"):
             gr.Markdown("""
                 <div style="text-align: center;">
                     <h1>🛍️ Multimodal Product Classification</h1>
                         )
         # 📌 ABOUT TAB
+        with gr.TabItem("ℹ️ About"):
             gr.Markdown("""
+## Project Overview
+- This project is a multimodal product classification system for Best Buy products.
+- The core objective is to categorize products using both their text descriptions and images.
+- The system was trained on a dataset of **almost 50,000** products and their corresponding images to generate embeddings and train the classification models.
+<br>
+## Technical Workflow
+1.  **Data Preprocessing:** Product descriptions and images are extracted from the dataset, and a `categories.json` file is used to map product IDs to human-readable category names.
+2.  **Embedding Generation:**
+    - **Text:** A pre-trained `SentenceTransformer` model (`all-MiniLM-L6-v2`) is used to generate dense vector embeddings from the product descriptions.
+    - **Image:** A pre-trained computer vision model from the Hugging Face `transformers` library (`TFConvNextV2Model`) is used to extract image features.
+3.  **Model Training:** The generated text and image embeddings are then used to train a multi-layer perceptron (MLP) model for classification. Separate models were trained for text-only, image-only, and multimodal (combined embeddings) classification.
+4.  **Deployment:** The trained models are deployed via a Gradio web interface, allowing for live prediction on new product data.
+<br>
+> **💡 Want to explore the process in detail?**
+> See the full 👉 [Jupyter notebook](https://huggingface.co/spaces/iBrokeTheCode/Multimodal_Product_Classification/blob/main/notebook_guide.ipynb) 👈️ for an end-to-end walkthrough, including exploratory data analysis, embedding generation, model training, evaluation, and model selection.
 """)
         # 📌 MODEL TAB
+        with gr.TabItem("🎯 Model"):
             gr.Markdown("""
+## Model Details
+The final classification is performed by a Multi-layer Perceptron (MLP) trained on the embeddings. This architecture allows the model to learn the relationships between the textual and visual features.
+<br>
+## Performance Summary
+The following table summarizes the performance of all models trained in this project.
+<br>
+| Model               | Modality     | Accuracy | Macro Avg F1-Score | Weighted Avg F1-Score |
+| :------------------ | :----------- | :------- | :----------------- | :-------------------- |
+| Random Forest       | Text         | 0.90     | 0.83               | 0.90                  |
+| Logistic Regression | Text         | 0.90     | 0.84               | 0.90                  |
+| Random Forest       | Image        | 0.80     | 0.70               | 0.79                  |
+| Random Forest       | Combined     | 0.89     | 0.79               | 0.89                  |
+| Logistic Regression | Combined     | 0.89     | 0.83               | 0.89                  |
+| **MLP**             | **Image**    | **0.84** | **0.77**           | **0.84**              |
+| **MLP**             | **Text**     | **0.92** | **0.87**           | **0.92**              |
+| **MLP**             | **Combined** | **0.92** | **0.85**           | **0.92**              |
+<br>
+## Conclusion
+- Based on the overall results, the MLP models consistently outperformed their classical machine learning counterparts, demonstrating their ability to learn intricate, non-linear relationships within the data.
+- The Text MLP and Combined MLP models tied for the highest accuracy and weighted F1-score (0.92); the Text MLP had the higher macro F1-score (0.87 vs. 0.85).
+- This modular approach demonstrates the ability to handle various data modalities and evaluate the contribution of each to the final prediction.
 """)
     # 📌 FOOTER
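
Review note: in the performance table added to the Model tab, the macro-averaged F1 is lower than the weighted-averaged F1 in every row. That pattern usually appears when some classes are rarer and harder than others, since macro averaging counts every class equally while weighted averaging scales each class by its support. The following is a minimal pure-Python sketch of the two averages on made-up labels. It is not the project's evaluation code, and the class names `tv` and `cable` are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for each label in y_true."""
    scores = {}
    for label in set(y_true):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts the same."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true samples per class."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy, made-up data: one frequent class and one rare class.
y_true = ["tv"] * 8 + ["cable"] * 2
y_pred = ["tv"] * 8 + ["tv", "cable"]  # one rare-class sample is missed

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(round(macro, 2), round(weighted, 2))  # prints: 0.8 0.89
```

Here the single error falls on the rare class, so its F1 (2/3) pulls the macro average down much more than the weighted one, mirroring the gap seen in the table.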