ACT: Condiment Handover (s0101_act_condiment_handover)
An Action Chunking with Transformers (ACT) policy fine-tuned on ASGARD condiment handover demonstrations.
This is an ACT (Action Chunking with Transformers) model trained on 40 episodes of condiment manipulation and handover demonstrations collected with the ASGARD so101_follower robot arm.
Model Details
Model Type
- Architecture: ACT (Action Chunking with Transformers)
- Parameters: ~52M
- Task: Condiment manipulation and handover
- Checkpoint: Step 1860 (final)
Training Configuration
- Dataset: asgard-robot/asgard_training_data_condiment
- Episodes: 40 demonstrations
- Total Frames: 31,522
- Robot: ASGARD so101_follower (single-arm 6 DOF)
- Training Steps: 1,860
- Logged Epochs: ~120 (epoch count affected by a MetricsTracker accounting bug)
Hardware
- GPUs: 4x NVIDIA H100 PCIe (80GB VRAM each)
- Total VRAM: ~320GB
- Effective Batch Size: 512 (128 per GPU × 4 GPUs)
- Training Time: ~27 minutes
Hyperparameters
- Learning Rate: 1e-5
- Learning Rate (Backbone): 1e-5
- Weight Decay: 1e-4
- Batch Size: 128 per GPU
- Optimizer: AdamW (betas: 0.9, 0.999, eps: 1e-8)
- Gradient Clipping: 10.0
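As a rough sketch of how these settings map onto PyTorch (the `policy` module below is a stand-in, not the released model code):

```python
import torch

# Stand-in module; in practice this is the ACT policy network.
policy = torch.nn.Linear(512, 6)

# AdamW with the hyperparameters listed above; the backbone uses the same 1e-5 learning rate.
optimizer = torch.optim.AdamW(
    policy.parameters(),
    lr=1e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=1e-4,
)

# Gradient norm clipping as applied at each optimization step.
torch.nn.utils.clip_grad_norm_(policy.parameters(), max_norm=10.0)
```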
ACT-Specific Parameters
- Chunk Size: 100
- Action Steps: 100
- VAE Training: Yes
- KL Weight: 10.0
- Dropout: 0.1
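With chunk size and action steps both set to 100, each forward pass predicts 100 future joint targets and all 100 are executed before the policy is queried again. The loop below is a schematic illustration with hypothetical `policy.predict_chunk` and `env` interfaces; LeRobot's `ACTPolicy.select_action` handles this queueing internally.

```python
CHUNK_SIZE = 100       # actions predicted per forward pass
N_ACTION_STEPS = 100   # actions executed before the next prediction

def run_episode(policy, env, max_steps=3000):
    """Schematic chunked control loop: one policy call per 100 control steps."""
    obs = env.reset()
    step = 0
    while step < max_steps:
        # Hypothetical API: returns a (CHUNK_SIZE, action_dim) block of joint targets.
        action_chunk = policy.predict_chunk(obs)
        for action in action_chunk[:N_ACTION_STEPS]:
            obs, done = env.step(action)  # executed at the 30 Hz control rate
            step += 1
            if done or step >= max_steps:
                return
```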
Architecture Details
- Vision Backbone: ResNet-18 (pretrained on ImageNet)
- Hidden Dimension: 512
- Feedforward Dimension: 3,200
- Attention Heads: 8
- Encoder Layers: 4
- Decoder Layers: 1
- VAE Encoder Layers: 4
- Latent Dimension: 32
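These dimensions correspond to standard transformer blocks; the shape-only sketch below uses plain PyTorch modules and is not the actual LeRobot implementation (which also handles the ResNet features, latent token, and positional embeddings):

```python
import torch
import torch.nn as nn

hidden_dim = 512

encoder_layer = nn.TransformerEncoderLayer(
    d_model=hidden_dim, nhead=8, dim_feedforward=3200, dropout=0.1, batch_first=True
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)   # 4 encoder layers

decoder_layer = nn.TransformerDecoderLayer(
    d_model=hidden_dim, nhead=8, dim_feedforward=3200, dropout=0.1, batch_first=True
)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=1)   # 1 decoder layer

# Dummy shapes: encoded camera/state tokens in, 100 action queries out.
memory = encoder(torch.randn(1, 300, hidden_dim))
action_queries = torch.randn(1, 100, hidden_dim)   # one query per action in the chunk
decoded = decoder(action_queries, memory)          # (1, 100, 512), fed to the action head
```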
Training Results
- Initial Loss: 12.852
- Final Loss: 0.262 (at step 1860)
- Loss Reduction: 98%
- Training Speed: ~0.64 steps/second
- Memory Usage: ~40-50GB per GPU
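Because the model is trained with the VAE objective, the reported loss combines an L1 reconstruction term with the KL term scaled by the weight of 10.0. A sketch of that composition (standard ACT-style objective; tensor names are illustrative):

```python
import torch
import torch.nn.functional as F

def act_training_loss(pred_actions, target_actions, mu, log_var, kl_weight=10.0):
    """Illustrative ACT-style loss: L1 reconstruction + weighted KL divergence."""
    l1 = F.l1_loss(pred_actions, target_actions)
    # KL divergence between the VAE posterior N(mu, sigma^2) and a unit Gaussian.
    kld = (-0.5 * (1 + log_var - mu.pow(2) - log_var.exp())).sum(-1).mean()
    return l1 + kl_weight * kld
```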
Model Files
- model.safetensors: 198MB (model weights)
- config.json: ACT configuration
- train_config.json: Training configuration
- policy_preprocessor.json: Input preprocessing
- policy_postprocessor.json: Output postprocessing
- Normalizer weights: 7.5KB each
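If you only need the raw weights, `model.safetensors` can be inspected directly with the `safetensors` library (independent of LeRobot):

```python
from safetensors.torch import load_file

# Load the raw state dict from the downloaded checkpoint file.
state_dict = load_file("model.safetensors")
total_params = sum(t.numel() for t in state_dict.values())
print(f"{len(state_dict)} tensors, {total_params / 1e6:.1f}M parameters")
```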
Intended Use
This model is designed for:
- Condiment manipulation: Picking up, moving, and handling condiment bottles
- Handover operations: Coordinated handoff to humans
- Fine-grained manipulation: Precise gripper control
- Home assistant applications: Domestic robot tasks
Performance Characteristics
Training Metrics
- Smooth loss convergence
- No overfitting observed
- Stable gradient magnitudes (~12-28)
- Consistent learning throughout training
- Faster convergence than the potato task
Expected Performance
- High success rate on similar condiment manipulation tasks
- Smooth action generation
- Proper force control during handover
- Robust to slight variations in setup
Limitations
- Limited demonstrations: Trained on 40 episodes only
- Overfitting risk: May not generalize to drastically different scenarios
- Camera dependency: Requires similar camera setup (wrist + external)
- Robot-specific: Designed for ASGARD so101_follower robot
- Task-specific: Optimized for condiment handover task
Usage Example
```python
import torch
from lerobot.policies.act.modeling_act import ACTPolicy  # import path may differ between LeRobot versions

# Load the trained model from the Hub
policy = ACTPolicy.from_pretrained("asgard-robot/s0101-act-condiment-handover")
policy.to("cuda")
policy.eval()

# Run inference
# `observation` is a dict of tensors containing:
# - Images from the wrist and external cameras
# - Current joint positions
with torch.inference_mode():
    action = policy.select_action(observation)
```
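The observation keys must match the feature names used during training; the block below sketches the expected structure assuming the common LeRobot naming scheme (`observation.images.*` and `observation.state`), which is an assumption and may differ for this dataset:

```python
import torch

# Hypothetical observation batch (batch size 1); key names and image resolution
# must match the dataset features the policy was trained on.
observation = {
    "observation.images.wrist": torch.rand(1, 3, 480, 640, device="cuda"),     # wrist RGB in [0, 1]
    "observation.images.external": torch.rand(1, 3, 480, 640, device="cuda"),  # external RGB in [0, 1]
    "observation.state": torch.zeros(1, 6, device="cuda"),                     # 6 joint positions
}
```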
Dataset
- Source: ASGARD teleoperation demonstrations
- Format: LeRobot v3.0
- Cameras: Dual RGB (wrist + external)
- Control: 6 DOF joint positions
- Frequency: 30 FPS
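The demonstrations can be inspected with LeRobot's dataset class (a sketch; the import path varies between LeRobot versions, e.g. older releases use `lerobot.common.datasets.lerobot_dataset`):

```python
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Stream the condiment handover demonstrations from the Hub.
dataset = LeRobotDataset("asgard-robot/asgard_training_data_condiment")
print(dataset.num_episodes, dataset.num_frames)  # expected: 40 episodes, 31,522 frames

sample = dataset[0]   # one timestep: camera frames, joint state, action
print(sample.keys())
```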
Training Environment
- Framework: LeRobot (Hugging Face)
- Branch: ASGARD teleop control
- Python: 3.10
- PyTorch: With CUDA support
- Multi-GPU: Distributed training with Accelerate
Citation
If you use this model, please cite:
```bibtex
@misc{s0101_act_condiment_handover_2024,
  title={ACT: Condiment Handover for ASGARD Robot},
  author={ASGARD Team},
  year={2024},
  url={https://huggingface.co/asgard-robot/s0101-act-condiment-handover}
}
```
Model Card Author
ASGARD Robot Team
Acknowledgments
- Base architecture: ACT (Zhao et al., 2023) - Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
- Training framework: LeRobot by Hugging Face
- Hardware: Shadeform H100 Multi-GPU Cluster
- Dataset: Collected via ASGARD teleoperation system