The Open Weights

The daily record of open-source AI. New model releases, leaderboards, and what's coming next — written for people who ship.

© 2026 The Open Weights. An independent publication.

Aggregated by Claude · written with Gemini · curated by humans.

ByteDance Releases Tar-7B for 'Any-to-Any' Multimodality

The new 7-billion-parameter model from the company's SEED team can process and generate a mix of text, images, audio, and video in a single unified framework.

Jul 2, 2025
Notable · Apache 2.0
ByteDance · Any-to-Any
Tar-7B

ByteDance's SEED research team has introduced Tar-7B, a new open-source model aimed at unifying multimodal AI. At 7 billion parameters, Tar-7B is designed for "any-to-any" tasks, meaning it can accept any combination of text, images, audio, or video as input and generate any combination in response.

Tar-7B is built on Qwen2.5 and is a step toward more flexible, general-purpose AI systems. The model is released under the permissive Apache 2.0 license, which allows commercial use and further research.

A Unified Approach

Unlike specialized models that handle one type of conversion (e.g., text-to-image), Tar-7B uses a unified architecture to manage different data types within a common framework. This allows it to perform a wide range of tasks, including:

  • Generating video from a text prompt
  • Describing a video in text
  • Creating audio to match an image
  • Answering questions about a combination of inputs

This single-model approach could simplify the development of complex, media-rich applications. By moving beyond discrete tasks, Tar-7B and similar models point to a future where AI can understand and create content with the same fluidity as humans. The model and its components are detailed on its Hugging Face page (ByteDance-Seed/Tar-7B).

Sources

  • ByteDance-Seed/Tar-7B (Hugging Face)

Specs

  • Parameters: 7B
  • Context window: not listed
  • License: Apache-2.0
  • Downloads: 31

Modalities

Any-to-Any
