---

<div align="center">
<b><font size="6">AI45Research</font></b>
</div>

Welcome to AI45Lab! We are a research group from Shanghai AI Lab focused on Vision-Centric AI research. The GV in our name, OpenGVLab, stands for general vision: a general understanding of vision, so that little effort is needed to adapt to new vision-based tasks.

# Models

- [InternVL](https://github.com/OpenGVLab/InternVL): a pioneering open-source alternative to GPT-4V.
- [InternImage](https://github.com/OpenGVLab/InternImage): a large-scale vision foundation model with deformable convolutions.
- [InternVideo](https://github.com/OpenGVLab/InternVideo): large-scale video foundation models for multimodal understanding.
- [VideoChat](https://github.com/OpenGVLab/Ask-Anything): an end-to-end chat assistant for video comprehension.
- [All-Seeing-Project](https://github.com/OpenGVLab/all-seeing): towards panoptic visual recognition and understanding of the open world.
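
The models above are also distributed through the Hugging Face Hub. As a rough sketch (the repo id `OpenGVLab/InternVL2-8B` and the loading options are illustrative assumptions; each model card documents the recommended usage), an InternVL checkpoint can typically be loaded with `transformers`:

```python
# Sketch only: load an InternVL checkpoint from the Hugging Face Hub.
# The repo id is an assumed example, not a recommendation; InternVL ships
# custom modeling code, hence trust_remote_code=True.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "OpenGVLab/InternVL2-8B"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to reduce GPU memory
    trust_remote_code=True,
).eval()
```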

# Datasets

- [ShareGPT4o](https://sharegpt4o.github.io/): a groundbreaking large-scale resource that we plan to open-source, comprising 200K meticulously annotated images, 10K videos with highly descriptive captions, and 10K audio files with detailed descriptions.
- [InternVid](https://github.com/OpenGVLab/InternVideo/tree/main/Data/InternVid): a large-scale video-text dataset for multimodal understanding and generation.
- [MMPR](https://huggingface.co/datasets/OpenGVLab/MMPR): a high-quality, large-scale multimodal preference dataset.
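
MMPR above, and GMAI-MMBench in the benchmark list below, are hosted as dataset repositories on the Hugging Face Hub, so their files can be pulled programmatically. A minimal sketch with `huggingface_hub` (the repo id comes from the MMPR link above; the layout of the downloaded files is described on the dataset card, not here):

```python
# Sketch only: download a dataset repository from the Hugging Face Hub.
# snapshot_download works regardless of the repo's file layout; whether
# datasets.load_dataset also works depends on how the repo is structured.
from huggingface_hub import snapshot_download

# Download (or reuse from the local cache) the full MMPR dataset repo.
local_dir = snapshot_download(repo_id="OpenGVLab/MMPR", repo_type="dataset")
print("MMPR files cached at:", local_dir)
```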

# Benchmarks
- [MVBench](https://github.com/OpenGVLab/Ask-Anything/tree/main/video_chat2): a comprehensive benchmark for multimodal video understanding.
- [CRPE](https://github.com/OpenGVLab/all-seeing/tree/main/all-seeing-v2): a benchmark covering all elements of the relation triplets (subject, predicate, object), providing a systematic platform for the evaluation of relation comprehension ability.
- [MM-NIAH](https://github.com/uni-medical/GMAI-MMBench): a comprehensive benchmark for long multimodal document comprehension.
- [GMAI-MMBench](https://huggingface.co/datasets/OpenGVLab/GMAI-MMBench): a comprehensive multimodal evaluation benchmark towards general medical AI.