Google's native Gemini API uses the URL action :streamGenerateContent
to indicate streaming intent, not just the ?alt=sse query parameter.
The current IsStream() only checks c.Query("alt") == "sse", causing
all :streamGenerateContent requests (without ?alt=sse) to be treated
as non-streaming.
This adds a strings.Contains check for "streamGenerateContent" in the
request URL path, so both streaming indicators are recognized.
- Introduced a new rule for the Billing Expression System, emphasizing the importance of reading `pkg/billingexpr/expr.md` for dynamic billing.
- Updated the billing expression logic to support new variables and improved handling of image and audio tokens.
- Enhanced the tiered billing functionality with versioning support for expressions and refined quota calculations.
- Added tests to validate the new billing expression features and ensure correctness in pricing calculations.
This commit introduces a major architectural refactoring to improve quota management, centralize logging, and streamline the relay handling logic.
Key changes:
- **Pre-consume Quota:** Implements a new mechanism to check and reserve user quota *before* making the request to the upstream provider. This ensures more accurate quota deduction and prevents users from exceeding their limits due to concurrent requests.
- **Unified Relay Handlers:** Refactors the relay logic to use generic handlers (e.g., `ChatHandler`, `ImageHandler`) instead of provider-specific implementations. This significantly reduces code duplication and simplifies adding new channels.
- **Centralized Logger:** A new dedicated `logger` package is introduced, and all system logging calls are migrated to use it, moving this responsibility out of the `common` package.
- **Code Reorganization:** DTOs are generalized (e.g., `dalle.go` -> `openai_image.go`) and utility code is moved to more appropriate packages (e.g., `common/http.go` -> `service/http.go`) for better code structure.