Jan v1 delivers 91% SimpleQA accuracy, slightly outperforming Perplexity Pro while running fully locally. It’s built on the latest version of Qwen’s Qwen3-4B-Thinking (up to 256k context length), fine-tuned for reasoning and tool use in Jan.

The model runs in llama.cpp and vLLM, and uses serper-mcp (https://github.com/marcopesani/mcp-server-serper) to access the web.
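For a local setup, serving the model with llama.cpp might look like the sketch below. The GGUF filename and port are assumptions (pick whichever quantization you downloaded); the serper-mcp server needs a Serper API key per its repo.

```shell
# Serve the model with llama.cpp's OpenAI-compatible server.
# ./jan-v1-4b-Q4_K_M.gguf is a hypothetical local filename.
llama-server \
  -m ./jan-v1-4b-Q4_K_M.gguf \
  -c 32768 \
  --host 127.0.0.1 \
  --port 8080

# The serper-mcp server (see the repo above) expects a Serper API key
# in its environment, e.g.:
export SERPER_API_KEY="your-key-here"
```

vLLM works the same way via its own OpenAI-compatible server; only the launch command differs.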

Model links:

Recommended parameters:

    temperature: 0.6
    top_p: 0.95
    top_k: 20
    min_p: 0.0
    max_tokens: 2048
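Applied to a chat completion request against a local OpenAI-compatible server (such as llama.cpp's), the parameters above would look like this. The endpoint URL and model name are assumptions; `top_k` and `min_p` are extra fields that llama.cpp accepts beyond the standard OpenAI schema.

```python
import json

# Recommended sampling parameters from above. Model name and endpoint
# are assumptions for a local llama.cpp server.
payload = {
    "model": "jan-v1-4b",  # hypothetical local model name
    "messages": [{"role": "user", "content": "Who wrote The Selfish Gene?"}],
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,    # llama.cpp-specific extension field
    "min_p": 0.0,   # llama.cpp-specific extension field
    "max_tokens": 2048,
}

# POST this body to http://127.0.0.1:8080/v1/chat/completions
# with Content-Type: application/json.
print(json.dumps(payload, indent=2))
```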