
Qwen 3 Hits Hard: Alibaba's Open LLM Drop Turns Heads


Qwen 3 Just Landed. And it's bringing serious heat.

  • Alibaba releases full Qwen 3 line under Apache 2.0 license
  • Models range from 0.6B up to 235B Mixture of Experts
  • Small models run well on low-end setups
  • Qwen 3 holds up against LLaMA, DeepSeek, and more
  • Reddit users give mixed but mostly solid reviews

Alibaba’s Qwen team just dropped their full stack of open LLMs. We’re talking everything from a tiny 0.6B model to a beastly 235B MoE version, all under Apache 2.0, so yeah, you can pretty much use these however you want. The models are up on Hugging Face, and you can try the chat online at https://chat.qwen.ai

Qwen 3 benchmarks (image)

Qwen's Full Lineup

  • Qwen3-0.6B. Pocket-sized but still capable
  • Qwen3-1.7B. A small step up
  • Qwen3-4B. The lightweight worker
  • Qwen3-8B. Handles decent complexity
  • Qwen3-14B. Versatile enough for creative stuff
  • Qwen3-30B-A3B. A Mixture of Experts model that activates about 3B parameters per token
  • Qwen3-32B. Dense model with strong results
  • Qwen3-235B-A22B. Mega model using 22B active parameters

All models are trained and tuned to follow instructions and can handle up to 128K context. Even the small ones hold their own in logic, basic coding, and storytelling.

Who Should Pay Attention?

If you're into running LLMs locally, this is for you. Hobbyist, coder, or researcher, these models work in setups like LM Studio, text-generation-webui, and Ollama. Even better, the sub-4B ones don’t need fancy GPUs, making them perfect for light gear or edge devices.
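
Don't want a GUI at all? A plain transformers script is enough to poke at one of the small models. Here's a minimal sketch, assuming a recent transformers install (plus accelerate for device_map) and the Qwen/Qwen3-0.6B repo id on Hugging Face; swap in whichever size your hardware can take:

    # Minimal sketch: chat with Qwen3-0.6B through Hugging Face transformers.
    # Assumes: pip install transformers accelerate torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen3-0.6B"  # assumed repo id; check the Qwen collection on Hugging Face
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    messages = [{"role": "user", "content": "Write one line of NPC dialogue for a grumpy blacksmith."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

Per the Qwen3 model cards, apply_chat_template also takes an enable_thinking flag if you'd rather get answers without the reasoning preamble, which matters later when we get to Ollama.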

How Does Qwen Stack Up?

Vs LLaMA. Similar build, but Qwen models are more open and sometimes handle longer context better
Vs DeepSeek or Mistral. Benchmarks say Qwen3-32B reasons well, but DeepSeek V3 and GLM-4 might beat it at coding
Vs Gemini 2.5 Pro. Not really a fair fight since Gemini’s a cloud-only beast, but yeah, Qwen still can’t touch it on deep tool use

Local Run Tests and Ollama Drama

Can you run Qwen3 on Ollama? Kinda. It works, but not without some rough edges. People on Reddit report:

  • Weird cutoffs
  • Too much “thinking” before answering
  • Models ignoring the full context window

Some tricks help:

  • Set the context to something like 8192 manually
  • Use /no_think in the system prompt to shut down the extra fluff (both tricks shown in the sketch below)
  • Try Unsloth versions for smoother runs
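
Here's what those first two tricks look like in code. A rough sketch using the official ollama Python client; the qwen3:8b tag and the /no_think behavior are taken from the Reddit reports above, so treat it as a starting point, not gospel:

    # Sketch: apply the two Ollama workarounds via the Python client.
    # Assumes: pip install ollama, the Ollama server running, and "ollama pull qwen3:8b" done.
    import ollama

    response = ollama.chat(
        model="qwen3:8b",
        messages=[
            # Reddit trick: /no_think in the system prompt suppresses the long thinking preamble
            {"role": "system", "content": "/no_think"},
            {"role": "user", "content": "Name three lightweight Linux distros."},
        ],
        options={"num_ctx": 8192},  # set the context window manually instead of trusting the default
    )
    print(response["message"]["content"])

If you drive Ollama from the terminal instead, /set parameter num_ctx 8192 inside an ollama run session does the same context override.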

LM Studio and text-generation-webui handle Qwen more smoothly, especially the 14B Q4_K_M quant.

The 4B, 8B, 14B, 30B-A3B, and 32B versions run well on laptops through LM Studio.
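
LM Studio also ships an OpenAI-compatible local server (by default at http://localhost:1234/v1), so you can script against whatever Qwen 3 model you've loaded. A quick sketch with the openai Python package; the model name below is a placeholder, so use whatever id LM Studio lists for your loaded model:

    # Sketch: query a Qwen 3 model loaded in LM Studio via its OpenAI-compatible server.
    # Assumes: pip install openai, and LM Studio's local server started on the default port.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # key is ignored locally

    resp = client.chat.completions.create(
        model="qwen3-14b",  # placeholder id; match what LM Studio shows
        messages=[{"role": "user", "content": "In two sentences, what does 30B-A3B mean?"}],
    )
    print(resp.choices[0].message.content)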

And you can run the smaller models, like 0.6B and 1.7B, locally on your phone with PocketPal.

Real Talk from Reddit

Here’s what the /r/LocalLLaMA crowd is saying:

  • Qwen3-0.6B. “Great for its size,” one user said. It handles NPC logic and basic tasks better than expected
  • Qwen3-14B. Creative enough and solid at coding, just not as good as Gemma 3 at translation
  • Qwen3-30B-A3B. Super fast, not super smart. Think of it as smart autocomplete
  • Qwen3-32B. A lot of folks are calling this their new favorite
  • Qwen3-235B-A22B. Mixed reviews. Some say it flops on math and code; it might be limited by how it was fine-tuned

Final Thoughts

Qwen 3 doesn’t look like a hype drop that fades out next week. It’s a real player in the open LLM world, especially for folks who run things locally: strong logic, long context, and solid efficiency in the 30B-and-under range. Just know you might need to do a little tweaking.

Wanna try something new outside the usual names? Qwen 3 might surprise you. The 32B model could be the sweet spot if you want power without all the hassle.
