Websight-7B (Merged)

This is a merged version of the Websight-7B model, ready for deployment and inference.

Model Details

Base Model: ByteDance-Seed/UI-TARS-1.5-7B
Source PEFT Model: Asanshay/websight-7B (previous model saved here)
Model Type: Vision-Language Model for Web Agent Tasks
License: Apache 2.0

Usage

from transformers import pipeline

# Load the model
pipe = pipeline("image-text-to-text", model="tanvirb/websight-7B")

# Use for web agent tasks
result = pipe(text="Click the login button", images=[screenshot])

Deployment

This model is ready for:

Hugging Face Inference Endpoints
Local inference
Integration with web automation pipelines

Training

This model was fine-tuned using PEFT (Parameter Efficient Fine-Tuning) techniques on web interaction data.

Downloads last month: 98

Safetensors

Model size

8B params

Tensor type

F16

Inference Providers NEW

Image-Text-to-Text

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for tanvirb/websight-7B

Base model

ByteDance-Seed/UI-TARS-1.5-7B

Finetuned

(7)

this model