V2EX rune15
$V2EX
Solana
Give SOL to Copy Address
使用 SOL 向 rune15 打赏,数额会 100% 进入 rune15 的钱包。
 rune15's recent timeline updates
rune15

rune15

V2EX member #85383, joined on 2014-12-05 21:05:44 +08:00
rune15's recent replies
@KaiWuBOSS 收到,我再试试
D:\AI\models>kaiwu run Qwen3-30B-A3B







本地大模型部署器 vv0.1.1 llama.cpp b8864
by llmbbs.ai 本地 AI 技术社区

[1/6] Probing hardware...
GPU: NVIDIA GeForce RTX 4070 Ti (SM89, 12282 MB VRAM, 0 GB/s)
RAM: 31 GB DDR4
OS: windows amd64

[2/6] Selecting configuration...
Model: Qwen3-30B-A3B (moe, 30B total / 3B active)
Quant: ud-q3-k-xl (14.0 GB)
Mode: moe_offload (experts on CPU)
Accel: Flash Attention + MTP (native)

[3/6] Checking files...
Using bundled iso3 binary: llama-server-cuda.exe
Binary: llama-server-cuda.exe [cached]
Downloading model: Qwen3-30B-A3B-UD-Q3_K_XL.gguf
From: https://hf-mirror.com/unsloth/Qwen3-30B-A3B-GGUF/resolve/main/Qwen3-30B-A3B-UD-Q3_K_XL.gguf
Downloading 100% || (14/14 GB, 25 MB/s) [9m10s:0s]
Model: Qwen3-30B-A3B-UD-Q3_K_XL.gguf [cached]

[4/6] Preflight check...
VRAM sufficient

[5/6] Warmup benchmark...
Probe 1: ctx=128K ... OOM
Probe 2: ctx=64K ... OOM
Probe 3: ctx=32K ... OOM
Probe 4: ctx=16K ... OOM
Probe 5: ctx=8K ... OOM
Warmup failed: all ctx probes failed (tried down to 4K)
Using default parameters

[6/6] Starting server...
llama-server 不支持 iso3 ,回退到 q8_0/q4_0
Waiting for llama-server to be ready (port 11434)...
显存不足,降低上下文至 4K 重试...
Waiting for llama-server to be ready (port 11434)...
Error: failed to start llama-server: 连续 2 次启动失败,即使最小上下文(4K)也无法运行
建议:选择更小的量化或使用 MoE offload 模型
Usage:
kaiwu run <model> [flags]

Flags:
--bench Run benchmark after starting
--ctx-size int 手动指定上下文大小( 0=自动)
--fast Skip warmup, use cached profile
-h, --help help for run
--reset 清除缓存,重新 warmup 探测最优参数

我的 4070-Ti 也同样加载不了
供应商
@b1iy 我是 7x 车主,感觉售后并不差。
Feb 24
Replied to a topic by 6581 生活 2026 年,大伙有啥年度计划呢
换个轻松点的工作
Feb 24
Replied to a topic by jonty 职场话题 开工第一天提了离职
别犹豫,开弓没有回头箭
Feb 23
Replied to a topic by resten VPS 请教一下稳定的 VPS 推荐。
我在用 upcloud 这家的 vps ,体验下来还算稳定吧
7x 车主路过,提车半年多,目前开了 8 千多公里,感觉还行吧。
公司顺的普通戴尔键盘
准备看看《犯罪现场调查》
About     Help     Advertise     Blog     API     FAQ     Solana     5191 Online   Highest 6679       Select Language
创意工作者们的社区
World is powered by solitude
VERSION: 3.9.8.5 17ms UTC 07:54 PVG 15:54 LAX 00:54 JFK 03:54
Do have faith in what you're doing.
ubao msn snddm index pchome yahoo rakuten mypaper meadowduck bidyahoo youbao zxmzxm asda bnvcg cvbfg dfscv mmhjk xxddc yybgb zznbn ccubao uaitu acv GXCV ET GDG YH FG BCVB FJFH CBRE CBC GDG ET54 WRWR RWER WREW WRWER RWER SDG EW SF DSFSF fbbs ubao fhd dfg ewr dg df ewwr ewwr et ruyut utut dfg fgd gdfgt etg dfgt dfgd ert4 gd fgg wr 235 wer3 we vsdf sdf gdf ert xcv sdf rwer hfd dfg cvb rwf afb dfh jgh bmn lgh rty gfds cxv xcv xcs vdas fdf fgd cv sdf tert sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf sdf shasha9178 shasha9178 shasha9178 shasha9178 shasha9178 liflif2 liflif2 liflif2 liflif2 liflif2 liblib3 liblib3 liblib3 liblib3 liblib3 zhazha444 zhazha444 zhazha444 zhazha444 zhazha444 dende5 dende denden denden2 denden21 fenfen9 fenf619 fen619 fenfe9 fe619 sdf sdf sdf sdf sdf zhazh90 zhazh0 zhaa50 zha90 zh590 zho zhoz zhozh zhozho zhozho2 lislis lls95 lili95 lils5 liss9 sdf0ty987 sdft876 sdft9876 sdf09876 sd0t9876 sdf0ty98 sdf0976 sdf0ty986 sdf0ty96 sdf0t76 sdf0876 df0ty98 sf0t876 sd0ty76 sdy76 sdf76 sdf0t76 sdf0ty9 sdf0ty98 sdf0ty987 sdf0ty98 sdf6676 sdf876 sd876 sd876 sdf6 sdf6 sdf9876 sdf0t sdf06 sdf0ty9776 sdf0ty9776 sdf0ty76 sdf8876 sdf0t sd6 sdf06 s688876 sd688 sdf86