REAL-MM-RAG-Bench is a benchmark designed to evaluate multi-modal retrieval models under realistic and challenging conditions.
AI & ML interests
Enterprise AI and ML, Foundation Models, Responsible AI
Datasets and models of the Otter-Knowledge project
GGUF-formatted versions of IBM Granite 3.2 models. Licensed under the Apache 2.0 license.
- ibm-research/granite-3.2-2b-instruct-GGUF
  Text Generation • 3B • Updated • 770 • 9
- ibm-research/granite-3.2-8b-instruct-GGUF
  Text Generation • 8B • Updated • 917 • 8
- ibm-research/granite-vision-3.2-2b-GGUF
  3B • Updated • 626 • 11
- ibm-research/granite-guardian-3.2-3b-a800m-GGUF
  Text Generation • 3B • Updated • 104 • 1
This category highlights the collective efforts of the AI Automation team in advancing Industry 4.0 applications and exploring innovations beyond it.
- AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance
  Paper • 2506.03828 • Published • 15
- FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes
  Paper • 2506.03278 • Published • 6
- ibm-research/AssetOpsBench
  Viewer • Updated • 141 • 1.36k • 4
- AssetOpsBench
  📉 Evaluating Autonomous AI Agents for Industry 4.0 Tasks • 1
Welcome to IBM’s multi-modal foundation model for materials, FM4M, designed to support and advance research in materials science and chemistry.