Alibaba's Qwen Releases Open 27B Vision Model
The new dense model, licensed under Apache 2.0, brings both text and image understanding to the midrange parameter space.
The new Qwen3.6-35B-A3B from Alibaba's Qwen team combines vision and language capabilities using an efficient sparse architecture.
The new 800-million-parameter model is the smallest in the Qwen3.5 family, designed for efficient multimodal tasks on consumer-grade hardware.
The new Qwen3.5-4B model combines text and image understanding in a compact, permissively licensed package for developers.
The new open-source vision-language model from Alibaba's Qwen team offers strong performance in a compact, Apache 2.0-licensed package.
The new Qwen3.5-122B-A10B combines a massive parameter count with an efficient Mixture-of-Experts architecture for advanced vision and language tasks.
The new model from Alibaba's Qwen team combines multimodal understanding with a 131K token context window under a permissive Apache 2.0 license.
The new Qwen3.5-35B-A3B model from Alibaba combines vision and language capabilities with a resource-friendly Mixture-of-Experts design.
The new open-source model from Alibaba uses a Mixture-of-Experts architecture to balance massive scale with efficient inference.
The new model from Alibaba's Qwen team uses a Mixture-of-Experts architecture and is released under the commercially friendly Apache 2.0 license.
The new open-source tool, based on the Qwen3 architecture, precisely synchronizes audio recordings with their corresponding text transcripts.
Alibaba's Qwen team has released a new 1.7-billion-parameter model designed specifically for automatic speech recognition.
The new 600-million-parameter Qwen3-ASR model is designed for efficient, high-quality audio transcription under a permissive license.
The makers of the popular Qwen language models have published their first open-source text-to-image generator with a permissive Apache 2.0 license.
The new 600-million-parameter Qwen3-TTS model can generate speech in multiple languages and clone voices from short audio clips.
The new 600-million-parameter model from Alibaba's Qwen team can clone voices from short audio clips for multilingual speech synthesis.
Alibaba's Qwen team has released a new text-to-speech model capable of cloning voices from just a few seconds of audio.
The new 1.7-billion-parameter text-to-speech model from Alibaba's Qwen team can generate novel voices from short audio prompts.
Alibaba's latest text-to-image generator, Qwen-Image 2512, is optimized for creating visuals from both English and Chinese prompts.
The 8-billion-parameter model from Alibaba's Qwen team understands and generates spoken responses, enabling more natural audio-first applications.
The new diffusion model from Alibaba's team allows for precise, instruction-based image modifications in both English and Chinese.
The new Fun-ASR-Nano model from Alibaba's team packs real-time multilingual transcription, speaker diarization, and hotword detection into an efficient package.
The new 500-million-parameter text-to-speech model from the Qwen team offers multilingual voice cloning and emotional control.
The new text-to-image model from the team behind Qwen uses a diffusion transformer to generate high-resolution images in just a single step.
The latest vision-language model from the popular Qwen series is instruction-tuned and available under an Apache 2.0 license.
The new open-source model from Alibaba uses a Mixture-of-Experts architecture to make its powerful vision-language capabilities more efficient to run.
The new model from Alibaba's Qwen team allows users to modify images using natural language prompts instead of complex tools or masks.
The new 30B Mixture-of-Experts model from Alibaba's Qwen team can process and generate content across text, image, and audio formats.
The new 30-billion-parameter Mixture-of-Experts model from Alibaba's Qwen team is designed to show its reasoning process for complex multimodal tasks.
The new Mixture-of-Experts model from Alibaba is fine-tuned to generate detailed, multilingual descriptions for complex audio content.
The new 14-billion-parameter model from Alibaba's PAI team offers fine-grained control over video generation using inputs like sketches and depth maps.
The new Qwen3-Next model from Alibaba combines a large parameter count with an efficient MoE architecture to balance performance and computational cost.
The new Wan2.2-S2V model takes a still image and a speech track to generate a realistic talking-head animation, available under a permissive license.
The new open-source model from Alibaba lets users edit images with simple text commands in both English and Chinese.
The new Apache 2.0 diffusion model from Alibaba's Qwen team focuses on accurately rendering both English and Chinese characters within generated images.
The new Apache 2.0 model from Alibaba's Qwen team uses a Mixture-of-Experts architecture to deliver strong performance with only 3B active parameters.
The new open-source diffusion model from the team behind Qwen uses a Mixture-of-Experts architecture to animate still images.
The new Apache 2.0-licensed model from Alibaba's team generates video from either text prompts or still images, offering a unified approach in a compact package.
The new Apache 2.0-licensed model from Alibaba's team uses a Mixture-of-Experts architecture for efficient, high-quality video generation.
The new 14-billion-parameter model from Alibaba's AI team uses a Mixture-of-Experts design and is available under the permissive Apache 2.0 license.
The new flagship coding model from Alibaba's Qwen team uses a massive Mixture-of-Experts architecture and is released under a permissive Apache 2.0 license.
The new Apache 2.0-licensed model from Alibaba's team can generate video from both text and image prompts, adding a new tool to the open-source creative ecosystem.