$ timeahead.in
$ articles --tag qwen

#qwen

39 articles

01
vLLM Tops the Artificial Analysis Leaderboard
May 11, 2026 · 15 min read · How vLLM built the leading deployments of DeepSeek V3.2, MiniMax-M2.5, and Qwen 3.5 397B.
vLLM Blog · Research · #qwen #inference #benchmark
35d
03
CyberSecQwen-4B: Why Defensive Cyber Needs Small, Specialized, Locally-Runnable Models
Why this matters: Frontier models …
Hugging Face Blog · Model · #qwen
38d
04
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
22nd April 2026 · Big claims from Qwen about the…
Simon Willison Blog · Research · #qwen #agents #coding
54d
05
Advancing Emerging Optimizers for Accelerated LLM Training with NVIDIA Megatron
Higher-order optimization algorithms such as Shampoo have been effectively applied in neural network training for at lea…
NVIDIA Developer Blog · Infra · #qwen #inference #observability
54d
06
Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances
As the demand for g…
AWS Machine Learning Blog · Hardware · #qwen #inference #multimodal
56d
07
Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents
TL;DR: We extend the RLVE framework fr…
Hugging Face Blog · Infra · #qwen #agents
60d
08
TorchSpec: Speculative Decoding Training at Scale
Over the past year, large language models have rapidly expanded in both scale and capability. Frontier mode…
PyTorch Blog · Model · #qwen #coding #training
88d
09
Training-Inference Parity in MoE Models: Where Numerics Drift
3/10/2026 · Kernel fusions that are mathematically equivalent can still drift numerically. Here are the parity bugs we …
Fireworks AI Blog · Infra · #qwen #inference #training
97d
10
Develop Native Multimodal Agents with Qwen3.5 VLM Using NVIDIA GPU-Accelerated Endpoints
Alibaba has introduced the new open source Qwen3.5 series built for native multimodal agents. The first model in this se…
NVIDIA Developer Blog · Hardware · #qwen #fine-tuning #multimodal
108d
11
Best Open Source LLMs in 2026: We Reviewed 7 Models
1/13/2026 · With new open source LLMs launching nearly every week, figuring out which model actually fits your use case has become i…
Fireworks AI Blog · Research · #qwen #benchmark #open-source
153d
12
Continuous batching from first principles
TL;DR: in this blog post, starting from attention mechanisms and KV caching, we derive continuous ba…
Hugging Face Blog · Infra · #claude #qwen #inference
202d
13
OVHcloud on Hugging Face Inference Providers 🔥
We're thrilled to share that OVHcloud is now a supported Inference Provi…
Hugging Face Blog · Research · #llama #qwen #fine-tuning
203d
14
New coding models & integrations
October 16, 2025 · GLM-4.6 and Qwen3-coder-480B are available on Ollama’s cloud service with easy integrations to the tools you are familiar with. Qwen3-Coder-30B has been updated for faster, more reliable tool calling in Ollama’s new engine.
Ollama Blog · Agents · #llama #qwen #coding
242d
15
Qwen3-VL
October 14, 2025 · Ollama now supports Alibaba's Qwen3-VL.
Qwen3-VL, the most powerful vision language model in the Qwen series, is now available on Ollam…
Ollama Blog · Model · #llama #qwen
244d
16
Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models
TL;DR: Qwen3-8B is one of the most exci…
Hugging Face Blog · Agents · #qwen #agents
259d
17
Scaleway on Hugging Face Inference Providers 🔥
We're thrilled to share that Scaleway is now a supported Inference Provi…
Hugging Face Blog · Research · #qwen #inference
269d
18
vLLM Now Supports Qwen3-Next: Hybrid Architecture with Extreme Efficiency
Sep 11, 2025 · 3 min read · We’re excited to announce that vLLM now supports Qwen3-Next, the latest generation of foundation models from the Qwen team. Qwen3-Next introduces a hybrid architecture with extreme efficiency for...
vLLM Blog · Infra · #qwen #inference
277d
19
Jupyter Agents: training LLMs to reason with notebooks
A natural way to display multi-step code execution together with …
Hugging Face Blog · Research · #qwen #agents #multimodal
278d
20
Understanding and Implementing Qwen3 From Scratch
A Detailed Look at One of the Leading Open-Source LLMs · Previously, I c…
Ahead of AI (Sebastian Raschka) · Open Source · #qwen #open-source
282d
21
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
And How They Stack Up Against Qwen3 · OpenAI just released the…
Ahead of AI (Sebastian Raschka) · Model · #qwen
310d
22
Qwen3 Coder 480B is Live on Cerebras
August 01, 2025 · Alibaba's Qwen3 Coder 480B Instruct model is now available on Cerebras. Qwen3 Coder is one of the top coding models in t…
Cerebras Blog · Infra · #qwen #inference
315d
23
Qwen3 235B 2507 Instruct Now Available on Cerebras
July 29, 2025 · Alibaba's Qwen3 235B 2507 Instruct model is now available on Cerebras. The world’s leading non-reasoning model – Qwen3 2…
Cerebras Blog · Tutorial · #qwen #inference #training
320d
24
Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models
Numina & Kimi Team · We're excited to announc…
Hugging Face Blog · Research · #qwen #agents #benchmark
340d
25
SmolLM3: smol, multilingual, long-context reasoner
Base model: https://hf.co/HuggingFaceTB/SmolLM3-3B-Base · Instruct …
Hugging Face Blog · Model · #llama #qwen #training
342d
26
Featherless AI on Hugging Face Inference Providers 🔥
We're thrilled to share that Featherless AI is now a supported Inf…
Hugging Face Blog · Hardware · #qwen #inference #open-source
368d
27
GroqCloud™ Now Supports Qwen3 32B
Delivering Fast Inference with the Full 131k Context Window · GroqCloud™ now supports Qw…
370d
28
Qwen 3 on Fireworks AI: Controllable Chain-of-Thought and Tool Calling at Frontier Scale
5/6/2025 · TL;DR: Reasoning meets function calls. Qwen 3 now streams an explicit … trace and the exact JSON tool call in the same…
Fireworks AI Blog · Infra · #qwen #inference #coding
375d
29
🐯 Liger GRPO meets TRL
Thank you for your great work. Anyway, I tested the liger loss with deepspeed zero3 using Qwen/Qwen2.5-0.5B-Instruct in …
Hugging Face Blog · Infra · #qwen #training
386d
30
The Transformers Library: standardizing model definitions
Transformers was created in 2019, shortly following the releas…
Hugging Face Blog · Infra · #llama #qwen #rag
396d
31
Improving Hugging Face Model Access for Kaggle Users
Beginning today, Kaggle is launching an integration that enhances v…
Hugging Face Blog · Tutorial · #qwen #coding
397d
32
The 4 Things Qwen-3’s Chat Template Teaches Us
The new Qwen-3 model by Qwen ships with a much more sophisticated chat te…
Hugging Face Blog · Model · #qwen
411d
33
FastRTC: The Real-Time Communication Library for Python
OpenAI and Google released their live multimodal APIs for Chat…
Hugging Face Blog · Infra · #gpt #gemini #qwen
475d
34
Visual Document Retrieval Goes Multilingual
TL;DR: We present vdr-2b-multi-v1, the best multilingual embedding model for…
Hugging Face Blog · Frameworks · #llama #qwen #llamaindex
521d
35
Qwen3 Decoded: Choosing the Right Model For Your Task
8/1/2025 · With Thinking, Instruct, and Coder released simultaneously, confusion spiked. We stress-tested all three on your real wo…
Fireworks AI Blog · Tutorial · #qwen #benchmark
523d
36
SmolVLM - small yet mighty Vision Language Model
TL;DR: This blog post introduces SmolVLM, a 2B VLM, SOTA for its memory f…
Hugging Face Blog · Infra · #qwen #inference #multimodal
566d
37
SmolLM - blazingly fast and remarkably powerful
TL;DR: This blog post introduces SmolLM, a family of state-of-the-art sma…
Hugging Face Blog · Research · #phi #qwen #inference
699d
38
From cloud to developers: Hugging Face and Microsoft Deepen Collaboration
A collaboration for Cloud AI Builders · we are e…
Hugging Face Blog · Infra · #llama #mistral #qwen
755d
39
RVPO: Risk-Sensitive Alignment via Variance Regularization
Authors: Ivan Montero, Tomasz Jurczyk, Bhuwan Dhingra · RVPO: Ris…
Apple Machine Learning Research · Model · #qwen #fine-tuning #safety
773d