The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

Microsoft Releases Fara-7B Vision Agent Model

The 7-billion-parameter model is designed to understand and interact with graphical user interfaces, building on Alibaba's open-source Qwen2.5-VL.

Oct 30, 2025
Notable · MIT
Microsoft · Vision-Language
Fara-7B

Microsoft has introduced Fara-7B, a new 7-billion-parameter vision-language model aimed at a specific and challenging task: controlling a computer. Unlike general-purpose multimodal models, Fara-7B is designed to function as an agent, interpreting graphical user interfaces (GUIs) to understand and execute tasks.

This specialization allows the model to go beyond simply describing what's on a screen. The goal is for Fara-7B to comprehend the layout, elements, and interactive possibilities within an application, paving the way for more sophisticated AI-powered automation and assistance.
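To make "interpreting a screen and executing tasks" concrete, here is a minimal sketch of the observe-then-act loop that GUI agents of this kind are generally wrapped in. It does not load Fara-7B and does not reflect its actual action format: the `Action` fields, `run_agent`, `fake_screenshot`, and `scripted_policy` are all hypothetical names chosen for illustration, and the "model" is a scripted stand-in so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One GUI action. The fields are illustrative, not Fara-7B's real schema."""
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask the model for an action, act, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = choose_action(task, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break  # the model reports that the task is finished
        execute(action)
    return history


# Stand-ins so the sketch runs without a model or a real screen.
def fake_screenshot() -> bytes:
    return b"fake-pixels"


def scripted_policy(task: str, screenshot: bytes, history: List[Action]) -> Action:
    # Hypothetical policy: click once, then report completion.
    if not history:
        return Action(kind="click", x=120, y=48)
    return Action(kind="done")


executed: List[Action] = []
steps = run_agent("open the settings menu", fake_screenshot, scripted_policy, executed.append)
print([a.kind for a in steps])
```

In a real deployment, `choose_action` would call the vision-language model with the task and the current screenshot, and `execute` would drive a browser or desktop automation layer. The `max_steps` cap is a common safeguard against an agent that never decides it is done.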

Fara-7B is not built from the ground up. According to its official model card, the model is based on Alibaba's Qwen2.5-VL. This approach reflects a growing trend of major AI labs building on and refining foundation models released by others, which speeds up iteration across the open-source community.

Why it matters

The release of specialized agent models like Fara-7B under a permissive MIT license provides a powerful building block for developers. It opens up new possibilities for creating advanced accessibility tools, automating repetitive software tasks, and developing more capable personal AI assistants that can interact with technology the same way humans do: by seeing and clicking.

Sources

  • microsoft/Fara-7B (Hugging Face)

Specs

  • Parameters: 7B
  • Context window: not listed
  • License: MIT
  • Downloads: 9.1K

Modalities

Vision-Language
