new-api

Author	SHA1	Message	Date
CaIon	fddf54ccc5	perf: reduce heap residency for large base64 relay requests Three layered optimizations targeting Gemini-style 5MB base64 payloads where RSS could balloon to tens of GB under concurrent load: 1. Byte-based param override (relay/common/override.go) - Switch legacy/operations hot paths from common.Marshal round-trips and map[string]any conversions to gjson/sjson on []byte directly. - Avoids cloning 5MB strings during each Set/Delete operation. 2. strings.Builder for Gemini response markdown (relay/channel/gemini/relay-gemini.go) - Replace string concatenation + strings.Join when assembling "![image](data:...;base64,DATA)" content for inline image responses. - Pre-allocates capacity from inline_data byte sizes. 3. Outbound BodyStorage + streaming Decoder (this commit's core) - New relay/common/outbound_body.go helper wraps marshaled upstream bodies in common.BodyStorage, allowing disk-cache mode to offload jsonData to a temp file while waiting for upstream TTFB. The original []byte can then be GC'd, removing ~5MB/req of heap residency during the longest window of a request. - All 7 relay handlers (gemini/claude/responses/embedding/image/compatible/ rerank) plus chat_completions_via_responses adopt the helper with defer closer.Close() and explicit jsonData = nil. - relay/common/relay_info.go: new UpstreamRequestBodySize so relay/channel/api_request.go can populate req.ContentLength (lost when body becomes a type-erased io.Reader). - common/gin.go UnmarshalBodyReusable: when storage is disk-backed and content-type is JSON, decode via DecodeJson(storage) instead of storage.Bytes()+Unmarshal, removing one transient 5MB copy per request. memory mode and form/multipart paths unchanged.	2026-05-22 19:08:38 +08:00
Seefs	ae6a03364d	perf: optimize request metadata extraction and disabled field filtering (#5009 ) * perf: optimize request metadata extraction and disabled field filtering * perf: optimize stream usage estimation path	2026-05-22 10:32:11 +08:00
Seefs	0936e25046	perf: avoid eager formatting in debug log calls (#4929 )	2026-05-19 12:11:24 +08:00
CaIon	aa56667b8f	feat: track upstream request ID and prevent response header override When proxying through another new-api instance, the upstream X-Oneapi-Request-Id was overwriting the local one in client responses. This adds a new `upstream_request_id` field to the logs table, captures the upstream ID during relay, and filters it from being copied back to the client. Frontend gains search/filter and detail display support.	2026-05-12 21:53:54 +08:00
Seefs	38a3314b9b	fix: preserve OpenAI image edit reference fields (#4646 ) * fix: preserve OpenAI image edit reference fields * feat: support json image edit requests	2026-05-06 21:27:47 +08:00
Calcium-Ion	5114ad0677	Merge pull request #4200 from yyhhyyyyyy/fix/vertex-gateway-base-url fix(vertex): honor custom base_url as gateway prefix	2026-04-30 20:11:17 +08:00
yyhhyyyyyy	987b7ecd22	fix(vertex): honor custom base_url as gateway prefix	2026-04-30 15:08:10 +08:00
heimoshuiyu	8ca103342d	fix: Message.ReasoningContent/Reasoning 改为 string，修复空思考内容在请求转发时被静默丢弃的问题问题：在非 passThrough 模式下，客户端发送的 reasoning_content: "" 经过 Go struct 反序列化再序列化后，因 string + omitempty 无法区分空串和字段缺失，导致空的思考内容被静默丢弃。根因： dto.Message.ReasoningContent 和 Message.Reasoning 使用 string（非指针）加 omitempty，违反 AGENTS.md Rule 6（可选标量字段必须用指针类型）。修复： 1. Message.ReasoningContent/Reasoning 类型从 string 改为 string - nil = 字段缺失 → JSON 省略 - &"" = 显式空串 → JSON 保留 reasoning_content: "" 2. 新增 Message.GetReasoningContent() 辅助方法 3. 更新所有读写处：relay-openai, relay-claude, relay-gemini, ollama 4. 新增测试覆盖空串保留、字段省略、getter 回退逻辑	2026-04-29 13:43:26 +08:00
Calcium-Ion	d604f48c06	Merge pull request #4469 from seefs001/fix/tool-arguments-object fix: support raw JSON response tool arguments	2026-04-26 20:20:03 +08:00
Seefs	db89b57e1c	fix: support raw JSON response tool arguments	2026-04-26 13:47:37 +08:00
Seefs	62d4b63fc3	feat: configure native messages model matching	2026-04-26 13:37:59 +08:00
HynoR	435d7ae0dd	feat: support DeepSeek V4 reasoning suffix handling	2026-04-24 16:50:35 +08:00
CaIon	eab478bdc8	fix: miscellaneous quick fixes from CodeRabbit review - log_info_generate.go: add nil guard in InjectTieredBillingInfo - billing_expr_request.go: merge headers instead of replacing - go.mod: remove incorrect // indirect on expr-lang/expr - ToolPriceSettings.jsx: add null check in syncToVisual - tool_billing.go: fix PricePer1K for image_generation (per-call, not per-1K) - utils.jsx: add minute() to time condition regex - useUsageLogsData.jsx: pass displayMode to renderTieredModelPrice - AGENTS.md, CLAUDE.md: fix Rule 6/7 ordering - relay-gemini.go: add TEXT modality case in CandidatesTokensDetails	2026-04-24 00:34:06 +08:00
CaIon	6bde1a9c8d	Merge origin/main into nightly Resolve conflicts: - .gitignore: keep nightly additions (.test, skills-lock.json) - relay/helper/price.go: keep both billingexpr and model imports - en.json / zh-CN.json: keep nightly's superset of i18n entries - service/billing_session.go: add missing 3rd arg to DecreaseUserQuota - en.json / zh-CN.json: deduplicate 129+320 duplicate i18n keys	2026-04-23 21:37:03 +08:00
papersnake	47d7bca268	feat: support claude-opus-4-7 (#4293 ) * feat: support claude-opus-4-7 * feat: summarized display for opus 4.7	2026-04-17 13:52:34 +08:00
CaIon	3cad6b9d7f	fix(claude): improve handling of empty string content in OpenAI to Claude message conversion Some checks failed Publish Docker image (Multi Registries, native amd64+arm64) / Build & push (amd64) [native] (push) Has been cancelled Details Publish Docker image (Multi Registries, native amd64+arm64) / Build & push (arm64) [native] (push) Has been cancelled Details Publish Docker image (Multi Registries, native amd64+arm64) / Create multi-arch manifests (Docker Hub) (push) Has been cancelled Details Build Electron App / build (windows-latest) (push) Has been cancelled Details Build Electron App / release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Linux Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / macOS Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Windows Release (push) Has been cancelled Details	2026-04-16 17:44:38 +08:00
woan1136	3ab65a8221	fix: add Azure channel support for /v1/responses/compact URL routing (#4149 ) Some checks failed Publish Docker image (Multi Registries, native amd64+arm64) / Build & push (amd64) [native] (push) Has been cancelled Details Publish Docker image (Multi Registries, native amd64+arm64) / Build & push (arm64) [native] (push) Has been cancelled Details Publish Docker image (Multi Registries, native amd64+arm64) / Create multi-arch manifests (Docker Hub) (push) Has been cancelled Details Build Electron App / build (windows-latest) (push) Has been cancelled Details Build Electron App / release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Linux Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / macOS Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Windows Release (push) Has been cancelled Details The Azure channel's GetRequestURL method only handled RelayModeResponses but missed RelayModeResponsesCompact. This caused compact requests to fall through to the generic deployments URL pattern, producing an incorrect path that Azure returns 404 for. This fix extends the existing responses API special handling to also cover the compact mode, appending /compact to the subUrl when the relay mode is ResponsesCompact. Affected URLs (before → after): - Normal Azure: /openai/deployments/{model}/responses/compact → /openai/v1/responses/compact - cognitiveservices: same pattern → /openai/responses/compact - Custom AzureResponsesVersion: properly respected for compact too Co-authored-by: 彭俊杰 <pengjunjie@onero.com>	2026-04-13 15:23:38 +08:00
CaIon	8b22161527	fix: set TopP to nil in Claude request configuration	2026-04-13 14:36:22 +08:00
CaIon	4d2993e4cc	Merge remote-tracking branch 'origin/main' into nightly Some checks failed Release (Linux, macOS, Windows) / Linux Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / macOS Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Windows Release (push) Has been cancelled Details # Conflicts: # web/src/helpers/render.jsx # web/src/hooks/usage-logs/useUsageLogsData.jsx # web/src/i18n/locales/en.json	2026-04-09 17:12:21 +08:00
Calcium-Ion	b07f0b9626	Merge pull request #4154 from seefs001/feature/vllm-extensions-params feat: fill in some custom fields for vllm-omini	2026-04-09 14:35:05 +08:00
Calcium-Ion	53cf37a469	fix(ali): accept string usage values in task polling (#4155 )	2026-04-09 14:34:44 +08:00
NyaMisty	160cb28572	fix(zhipu_4v): use correct endpoint for coding plan image generation (#4146 )	2026-04-09 14:33:48 +08:00
Seefs	274307b0a9	fix(ali): accept string usage values in task polling	2026-04-09 12:48:17 +08:00
Seefs	a19a63b98c	feat: fill in some custom fields for vllm-omini.	2026-04-09 12:41:51 +08:00
forsakenyang	c734db34e8	feat: add minimax image generation relay support (#4103 )	2026-04-08 16:57:44 +08:00
zuiho	c66636a0c7	fix: 采纳 CodeRabbit 建议，!Done 时也用 fallback 覆盖占位 CompletionTokens message_start 阶段可能给 CompletionTokens 非零占位值，只检查 == 0 不够，加上 !Done && fallback > current 条件。	2026-04-07 17:52:11 +08:00
zuiho	f7cdc727df	fix: Claude 流式断流时不再整份覆盖 usage，保留 cache 计费字段 HandleStreamFinalResponse 在 !Done 时调用 ResponseText2Usage 整份覆盖 claudeInfo.Usage，导致 message_start 已获取的 CacheReadInputTokens、 CacheCreationInputTokens 等字段丢失，prompt 退化为占位值 1。修复： - 只补缺失的 CompletionTokens/PromptTokens，保留已有 cache 数据 - PromptTokens 兜底改用 info.GetEstimatePromptTokens()（与其他渠道对齐） Fixes #4127	2026-04-07 17:41:08 +08:00
CaIon	03758a4a85	refactor(file-source): unify file source creation and enhance caching mechanisms	2026-04-06 15:54:55 +08:00
CaIon	8fc0eb78e2	feat(billing): enhance task billing process with video input detection and updated pricing logic - Added `EstimateBilling` function to check for video input in request metadata and return corresponding discount ratios. - Updated `ModelPriceHelperPerCall` to incorporate new pricing logic based on model ratios and video input. - Enhanced task billing logs to include model ratio information and adjusted calculations for actual quota based on additional multipliers. - Introduced `renderTaskBillingProcess` to improve rendering of task billing information in the UI.	2026-04-06 15:54:55 +08:00
Seefs	82c2008d2c	fix: emit claude message_delta for usage-only final stream chunk	2026-04-04 20:21:13 +08:00
CaIon	bb5b9eaca2	fix(relay-claude): set TopP to nil in Claude request to align with API requirements	2026-04-03 20:18:28 +08:00
Calcium-Ion	9816ad87e3	feat: add HEIC/HEIF image format support for Gemini channel (#4049 ) * feat: add HEIC/HEIF image format support Add detection, MIME type mapping, and dimension parsing for HEIC/HEIF images via ISOBMFF ftyp brand inspection and ispe box parsing. Update Gemini relay to accept these formats and refactor getImageConfig to properly retry decoders using buffered data. * fix: handle ISOBMFF extended sizes in HEIF dimension parser parseHEIFDimensions now correctly handles boxSize==1 (64-bit extended size) and boxSize==0 (box-to-EOF), preventing the parser from breaking out of the loop when encountering these valid ISOBMFF box headers before reaching the meta box.	2026-04-02 21:32:42 +08:00
Calcium-Ion	0193018af6	Merge pull request #4042 from feitianbubu/pr/fe9713dcbf8795e127fbea2fcb1f3011da86ad54 新增seedance2.0视频接口支持	2026-04-02 21:30:31 +08:00
RedwindA	79527c0ab1	feat: add HEIC/HEIF image format support Add detection, MIME type mapping, and dimension parsing for HEIC/HEIF images via ISOBMFF ftyp brand inspection and ispe box parsing. Update Gemini relay to accept these formats and refactor getImageConfig to properly retry decoders using buffered data.	2026-04-02 16:40:45 +08:00
Calcium-Ion	41cd051ea9	Merge pull request #3505 from seefs001/fix/claude-media-support fix: add basic inline file support for Claude relay	2026-04-02 13:29:21 +08:00
Seefs	c04f82bfb5	TODO: fix chat -> messages file type	2026-04-02 13:16:58 +08:00
feitianbubu	dafc7618c3	feat: add seedance fail reason	2026-04-02 12:26:44 +08:00
feitianbubu	22692b3f87	feat: seedance support seconds	2026-04-02 12:26:37 +08:00
feitianbubu	d36e892905	fix: seedance only one text	2026-04-02 12:26:33 +08:00
feitianbubu	3cd1ba4673	fix: seedance metadata override prompt	2026-04-02 12:26:27 +08:00
feitianbubu	b7c0f754ad	feat: add seedance2.0 video api	2026-04-02 12:24:24 +08:00
CaIon	35d0704640	Merge branch 'origin/main' into nightly Resolve 4 conflicts: - relay/compatible_handler.go: accept main's refactor (postConsumeQuota -> service.PostTextConsumeQuota) - service/quota.go: accept main's PostClaudeConsumeQuota deletion, keep nightly's tiered billing in PostWssConsumeQuota and PostAudioConsumeQuota - web/src/i18n/locales/{en,zh-CN}.json: merge both sets of translation keys Post-merge integration: - Add tiered billing (TryTieredSettle, InjectTieredBillingInfo) to PostTextConsumeQuota - Update tool pricing calls to use nightly's generic GetToolPriceForModel/GetToolPrice API	2026-04-02 00:39:13 +08:00
Calcium-Ion	7efb1922fe	Merge pull request #3526 from feitianbubu/pr/e560265b6e57aa7b95bc98cb53397ef0a3082d9d 支持wan2.7生图-wan2.7-image	2026-04-02 00:15:04 +08:00
Calcium-Ion	89fe99f3bd	Merge pull request #3512 from imlhb/patch-2 fix: prevent double-counting of image count n in billing	2026-04-02 00:14:39 +08:00
feitianbubu	e5b5331d3b	feat: wan 2.7 support N for gen images number	2026-04-01 17:39:50 +08:00
feitianbubu	18373c6eac	feat: add wan 2.7	2026-04-01 17:39:11 +08:00
CaIon	ab99c30884	fix: move image count n to OtherRatio to prevent double-counting The previous commit commented out AddOtherRatio("n") in the Ali adaptor to fix double-counting but this could cause billing evasion when n is specified via extra["parameters"] instead of request.N. Root cause: ImagePriceRatio in GetTokenCountMeta() already included n, AND channel adaptors added OtherRatio("n"), resulting in n² billing. Proper fix: - Remove n from ImagePriceRatio (keep sizeRatio * qualityRatio only) - In ImageHelper, add default OtherRatio("n") when adaptor hasn't set one; set fallback tokens to 1 (base unit) - Restore Ali adaptor's AddOtherRatio("n") — it uses actual upstream parameters/response count, preventing billing evasion	2026-03-31 23:58:10 +08:00
CaIon	d22f889e5d	fix(xAI): set MaxTokens to nil when MaxCompletionTokens is 0 for grok-3-mini model	2026-03-31 19:16:16 +08:00
刘泓宾	53aeee4ff7	Comment out price data adjustment logic Comment out code that modifies price data based on image count. 测试发现，如果是接入阿里百炼平台的qwen-image-2.0系列模型，这边计费的时候会出现 0.2n倍率n的情况，最前面的0.2n会直接显示为模型价格。例如：日志详情模型价格 $0.600000，专属倍率 0.3 其他详情大小 10801920, 品质 standard, 生成数量 3, 其他倍率 n: 3.000000 计费过程模型价格：$0.600000 专属倍率：0.3 = $0.180000	2026-03-31 17:12:06 +08:00
CaIon	5238f279db	feat: record stream interruption reasons via StreamStatus Some checks failed Publish Docker image (Multi Registries, native amd64+arm64) / Build & push (amd64) [native] (push) Has been cancelled Details Publish Docker image (Multi Registries, native amd64+arm64) / Build & push (arm64) [native] (push) Has been cancelled Details Publish Docker image (Multi Registries, native amd64+arm64) / Create multi-arch manifests (Docker Hub) (push) Has been cancelled Details Build Electron App / build (windows-latest) (push) Has been cancelled Details Build Electron App / release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Linux Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / macOS Release (push) Has been cancelled Details Release (Linux, macOS, Windows) / Windows Release (push) Has been cancelled Details - Add StreamStatus type (relay/common) to track stream end reason (done/timeout/client_gone/scanner_error/eof/panic/ping_fail) and accumulate soft errors during streaming via sync.Once + sync.Mutex. - Add StreamResult (relay/helper) as the callback interface: adapters call sr.Error() for soft errors, sr.Stop() for fatal, sr.Done() for normal completion. No early-return problem — multiple errors per chunk are naturally supported. - Refactor StreamScannerHandler callback from func(string) bool to func(string, *StreamResult). All 9 channel adapters updated. - Write stream_status into log other JSON field (admin-only) with status ok/error, end_reason, error_count, and error messages. - Frontend: display stream status in log detail expansion for admins.	2026-03-31 16:54:39 +08:00

1 2 3 4 5 ...

1143 Commits