Musk’s xAI reveals Grok 1.5 Vision, claims top spatial understanding

2024-09-20 00:29:42

Elon Musk’s artificial intelligence (AI) company, xAI, has unveiled its first multimodal model, Grok 1.5 Vision, as it looks to compete with OpenAI.

As per the preview, in addition to understanding text, the AI model can also work with documents, charts, diagrams, screenshots, and photos.

One of OpenAI’s funders, Musk argues that AI can help humanity in unimaginable ways. However, after falling out over the vision for how OpenAI should proceed, Musk started xAI last year with a group of influential AI researchers keen on developing AI models more openly.


Last November, the company rolled out the first iteration of its AI model, Grok. It then underlined its push for openness by open-sourcing its base model weights and network architecture last month. The company’s pace is evident: its first multimodal AI model arrived barely a month after the architecture was open-sourced.

What can Grok 1.5V do?

According to its website, Grok 1.5V connects the physical and digital worlds. The company has highlighted seven examples of its capabilities to explain how the multimodal model works.

A user can share a picture of a flowchart with Grok, and the AI model can translate it into Python code. By simply showing the model a nutrition label, a user can ask how many calories a given number of portions of the product would contain.

While this might seem like an easy case of multiplication, the AI model can also take a child’s drawing and build an entire bedtime story using it. The model can do the converse, too. Show it a meme, and it will explain why it is funny and provide the context needed to understand it.
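The nutrition label question comes down to scaling the per-serving calories by the amount eaten. A minimal sketch of that arithmetic follows; the function name and the label values are hypothetical, chosen only for illustration, and the sketch says nothing about how Grok 1.5V arrives at its answer.

```python
def calories_consumed(calories_per_serving: float,
                      serving_size_g: float,
                      grams_eaten: float) -> float:
    """Scale the label's per-serving calories to the amount actually eaten."""
    servings = grams_eaten / serving_size_g
    return calories_per_serving * servings


# Hypothetical label: 120 kcal per 30 g serving; the user eats 75 g.
total = calories_consumed(calories_per_serving=120,
                          serving_size_g=30,
                          grams_eaten=75)
print(total)  # 300.0
```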

The AI model isn’t just built for play. It can convert a table into CSV format or help you correct a piece of code that might not be working. And if you need home repair advice, just share images of the affected area; the model is designed to help with that as well, according to the company’s website.
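For readers unfamiliar with the format, here is a rough sketch of what a table-to-CSV conversion produces, using Python’s standard `csv` module. The table contents and the `table_to_csv` helper are made up for illustration; this shows the target format only, not how Grok 1.5V performs the conversion.

```python
import csv
import io

# Hypothetical table, standing in for rows read off an image.
header = ["Item", "Quantity", "Unit price"]
rows = [
    ["Widget", 4, 2.50],
    ["Gadget, large", 1, 19.99],
]


def table_to_csv(header, rows):
    """Serialize a header and data rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = table_to_csv(header, rows)
print(csv_text)
```

Note that the writer wraps `Gadget, large` in double quotes, since the value itself contains a comma.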

xAI has also released a new benchmark dubbed RealWorldQA to evaluate the spatial understanding of multimodal models. In examples shared by the company, Grok 1.5V looks at images and judges the relative size of objects or gives driving advice.

According to the company’s data, shared in the chart below, Grok 1.5V comfortably beats other AI models on this benchmark as well as on others.

Grok 1.5V and other AI models’ performance on various benchmarks. Image credit: x.AI

What’s in store for the future?

With Elon Musk stating in a recent interview that he expects AI to be smarter than any human by the end of 2025, all eyes are on what improvements his company will bring to the AI race in the upcoming months.

xAI has said that in its aim to build beneficial artificial general intelligence (AGI) that can understand the universe, the company will make significant improvements to the capabilities of its models in other areas, such as audio, voice, and video, in the coming months.

Grok 1.5V will soon become available to the company’s testers and existing users, the company added in its blog.
