altawil
committed
Update README.md
README.md
CHANGED
@@ -1,9 +1,3 @@
-You got it. Here is the complete, professional, and copy-paste-ready README.md content in English for your model repository on the Hugging Face Hub.
-
-This version is structured to be clear, informative, and follows the best practices for creating a Model Card.
-
-README.md for Adam-IT/Interfuser-Baseer-v1
-Generated markdown
 ---
 license: mit
 language:

@@ -22,115 +16,146 @@ datasets:
 pipeline_tag: object-detection
 ---

-# InterFuser
-
-* **Multi-Task Learning:** Simultaneously performs two critical tasks:
-  1. **Traffic Object Detection:** Identifies cars, motorcycles, and pedestrians in a 20x20 meter grid in front of the vehicle.
-  2. **Waypoint Prediction:** Predicts a safe and drivable trajectory for the next 10 waypoints.
-* **Scene Understanding:** Provides logits for crucial environmental factors, including the presence of junctions, red light hazards, and stop signs.
-* **Optimized for CARLA:** Fine-tuned on the `PDM_Lite_Carla` dataset, making it highly effective for scenarios within the CARLA simulator.
-
-* **LiDAR Backbone:** `ResNet-18` (architecture defined, but LiDAR input is disabled in this version)
-* **Transformer:**
-  * **Embedding Dimension:** 256
-  * **Encoder Depth:** 6 Layers
-  * **Decoder Depth:** 6 Layers
-  * **Attention Heads:** 8
-* **Prediction Heads:**
-  * **Waypoints:** Gated Recurrent Unit (GRU) based predictor.
-  * **Traffic Detection:** A detection head that outputs a `20x20x7` grid representing object confidence, position offsets, dimensions, and orientation.
-
-**1. Installation**
-```bash
-pip install torch torchvision timm huggingface_hub
-```
-
-The recommended way to load the model is to use the custom `load_and_prepare_model` function from the project, which handles configuration and weight loading automatically.
-
-```python
-import torch
-from config_loader import load_and_prepare_model
-
-device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-
-try:
-    model = load_and_prepare_model(device)
-    model.eval()
-    print("Model loaded successfully!")
-except Exception as e:
-    print(f"Error loading model: {e}")
-
-# dummy_input = ...
-# with torch.no_grad():
-#     outputs = model(dummy_input)
-```
-IGNORE_WHEN_COPYING_START
-content_copy
-download
-Use code with caution.
-Python
-IGNORE_WHEN_COPYING_END
-
-Training and Fine-tuning
-
-download
-Use code with caution.
-IGNORE_WHEN_COPYING_END

# 🚗 InterFuser-Baseer-v1: Autonomous Driving Model

[License: MIT](https://opensource.org/licenses/MIT)
[PyTorch](https://pytorch.org/)
[CARLA](https://carla.org/)

## 📋 Overview

InterFuser-Baseer-v1 is a transformer-based model for autonomous driving, fine-tuned for the **Baseer Self-Driving API**. It combines a convolutional image backbone with a transformer to provide real-time traffic object detection and trajectory planning in simulated driving environments.

### 🎯 Key Capabilities

- **Multi-Task Learning**: Simultaneous traffic object detection and waypoint prediction
- **Transformer Architecture**: Attention mechanisms for global scene understanding
- **Real-Time Processing**: Optimized for real-time inference in driving scenarios
- **CARLA Integration**: Specifically tuned for the CARLA simulation environment

## 🏗️ Architecture

### Model Components

| Component | Specification |
|-----------|---------------|
| **Image Backbone** | ResNet-50 (ImageNet pretrained) |
| **LiDAR Backbone** | ResNet-18 (defined but disabled in this version) |
| **Transformer** | 6-layer encoder / 6-layer decoder, 8 attention heads |
| **Embedding Dimension** | 256 |
| **Prediction Heads** | GRU-based waypoint predictor + detection head |
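
For readers wiring the model up themselves, the table above translates into a small configuration object. This is a minimal sketch; `InterFuserConfig` and its field names are illustrative, not the repository's actual config schema.

```python
from dataclasses import dataclass

@dataclass
class InterFuserConfig:
    # Values taken from the component table above; names are illustrative.
    image_backbone: str = "resnet50"      # ImageNet pretrained
    lidar_backbone: str = "resnet18"      # defined but disabled here
    embed_dim: int = 256
    encoder_depth: int = 6
    decoder_depth: int = 6
    num_heads: int = 8
    num_waypoints: int = 10
    detection_grid: tuple = (20, 20, 7)   # rows, cols, channels per cell
```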

### Output Format

- **Traffic Detection**: 20×20×7 grid (confidence, position offsets, dimensions, orientation); see the decoding sketch below
- **Waypoint Prediction**: 10 future trajectory points
- **Scene Understanding**: Junction, traffic light, and stop sign detection
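
The detection grid can be turned into a list of per-object records by thresholding the confidence channel. The helper below is a minimal sketch: the channel ordering, the sigmoid on the confidence logit, and the offset convention are assumptions (check the project's actual post-processing code), and `decode_detection_grid` is a hypothetical name.

```python
import torch

def decode_detection_grid(grid: torch.Tensor, conf_threshold: float = 0.5):
    """Decode a 20x20x7 detection grid into per-object records.

    Assumes channels are ordered (confidence, x offset, y offset, width,
    length, yaw, extra); the real layout may differ.
    """
    detections = []
    for row in range(grid.shape[0]):
        for col in range(grid.shape[1]):
            cell = grid[row, col]
            conf = torch.sigmoid(cell[0]).item()  # confidence logit -> [0, 1]
            if conf < conf_threshold:
                continue
            detections.append({
                "confidence": conf,
                # Cell index plus predicted offset, in grid units of the
                # 20x20 m area ahead of the vehicle
                "x": col + cell[1].item(),
                "y": row + cell[2].item(),
                "width": cell[3].item(),
                "length": cell[4].item(),
                "yaw": cell[5].item(),
            })
    return detections

# Example with a random stand-in for the model's detection output
print(len(decode_detection_grid(torch.randn(20, 20, 7))), "objects kept")
```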

## 🚀 Quick Start

### Installation

```bash
pip install torch torchvision timm huggingface_hub
```

### Usage Example

```python
import torch
from huggingface_hub import hf_hub_download

# Download the model weights from the Hub
model_path = hf_hub_download(
    repo_id="Adam-IT/Interfuser-Baseer-v1",
    filename="best_model.pth"
)

# Load the model. The checkpoint is unpickled directly, so the InterFuser
# class definition must be importable in your environment.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load(model_path, map_location=device)
model.eval()

# Inference: input_data must match the model's expected input format
# (e.g., preprocessed camera tensors); see the project code for details.
with torch.no_grad():
    outputs = model(input_data)
```
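
If `best_model.pth` turns out to hold a plain `state_dict` rather than a pickled module, instantiate the network first and load the weights into it. The snippet below is a hedged illustration only: the `interfuser_model` import path and the constructor arguments are hypothetical, chosen to mirror the component table above.

```python
# Hypothetical state_dict loading path; the import path and constructor
# arguments are illustrative, not a confirmed API of this repository.
from interfuser_model import InterFuser

model = InterFuser(embed_dim=256, encoder_depth=6, decoder_depth=6, num_heads=8)
state_dict = torch.load(model_path, map_location=device)
model.load_state_dict(state_dict)
model.to(device).eval()
```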

## 📊 Performance

### Training Details

- **Dataset**: PDM-Lite-CARLA (urban driving scenarios)
- **Training Objective**: Multi-task learning with IoU optimization (a schematic loss is sketched after this list)
- **Framework**: PyTorch
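
The multi-task objective can be pictured as a weighted sum of a detection term over the 20×20×7 grid and a waypoint regression term. The function below is a schematic reconstruction, not the training code: the binary cross-entropy on confidence, the L1 terms, and the unit weights are all assumptions.

```python
import torch
import torch.nn.functional as F

def multi_task_loss(pred_grid, gt_grid, pred_wp, gt_wp,
                    w_det: float = 1.0, w_wp: float = 1.0):
    """Schematic InterFuser-style objective (illustrative weights).

    pred_grid, gt_grid: (B, 20, 20, 7) detection grids.
    pred_wp, gt_wp:     (B, 10, 2) waypoint trajectories.
    """
    # Confidence channel: binary cross-entropy against object presence
    det_conf = F.binary_cross_entropy_with_logits(pred_grid[..., 0],
                                                  gt_grid[..., 0])
    # Remaining channels: L1 regression, counted only where objects exist
    obj_mask = (gt_grid[..., :1] > 0.5).float()
    det_reg = (F.l1_loss(pred_grid[..., 1:], gt_grid[..., 1:],
                         reduction="none") * obj_mask).mean()
    # Waypoints: plain L1 over the 10 predicted points
    wp_loss = F.l1_loss(pred_wp, gt_wp)
    return w_det * (det_conf + det_reg) + w_wp * wp_loss
```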

### Key Metrics

- Optimized for traffic detection accuracy
- Enhanced bounding box IoU performance
- Robust waypoint prediction in urban scenarios

## ⚠️ Limitations

### Current Constraints

- **Simulation Only**: Trained exclusively on CARLA data
- **Single Camera**: Front-facing camera view only
- **No LiDAR**: Vision-based approach without LiDAR fusion
- **Dataset Scope**: Limited to PDM-Lite-CARLA scenarios

### Recommended Use Cases

- ✅ CARLA simulation environments
- ✅ Research and development
- ✅ Autonomous driving prototyping
- ❌ Real-world deployment (requires additional training)

## 🛠️ Integration

This model is designed to work with:

- **Baseer Self-Driving API**
- **CARLA Simulator**
- **PyTorch Inference Pipeline** (see the preprocessing sketch after this list)
- **Custom Autonomous Driving Systems**
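
As a concrete example of the PyTorch inference pipeline mentioned above, a front-camera RGB frame can be preprocessed into a model-ready batch roughly as follows. The 224×224 resize and ImageNet normalization are assumptions inferred from the ResNet-50 backbone, not confirmed training settings.

```python
import numpy as np
import torch
from torchvision import transforms

# Assumed preprocessing for a front-camera frame of shape (H, W, 3), uint8
preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),        # assumed input resolution
    transforms.ToTensor(),                # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet stats
                         std=[0.229, 0.224, 0.225]),
])

frame = np.zeros((600, 800, 3), dtype=np.uint8)  # stand-in camera image
batch = preprocess(frame).unsqueeze(0)           # (1, 3, 224, 224)
```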

## 📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{interfuser-baseer-v1,
  title={InterFuser-Baseer-v1: Fine-tuned Autonomous Driving Model},
  author={Adam-IT},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Adam-IT/Interfuser-Baseer-v1}}
}
```

## 👨‍💻 Development

- **Developed by**: Adam-IT
- **Project Type**: Graduation Project - AI & Autonomous Driving
- **Institution**: [Your Institution Name]

## 📄 License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## 🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the [issues page](../../issues).

## 📞 Support

For questions and support:

- Create an issue in this repository
- Contact: [Your Contact Information]

---

<div align="center">
  <strong>🚗 Drive the Future with AI 🚗</strong>
</div>