A brief record of my SillyTavern configuration at the end of 2025 - THsInk

1. SillyTavern server configuration

docker-compose.yml

services:
  sillytavern:
    container_name: sillytavern
    hostname: sillytavern
    image: ghcr.io/sillytavern/sillytavern:latest
    environment:
      - NODE_ENV=production
      - FORCE_COLOR=1
      - TZ=Asia/Hong_Kong
    ports:
      - "8000:8000"
    volumes:
      - "./config:/home/node/app/config"
      - "./data:/home/node/app/data"
      - "./plugins:/home/node/app/plugins"
      - "./extensions:/home/node/app/public/scripts/extensions/third-party"
    restart: unless-stopped
    logging:
      driver: json-file
      options:
        max-size: "5m"   # maximum size of a single log file: 5 MB
        max-file: "3"     # keep at most 3 rotated files
        compress: "true"  # compress rotated logs (available in Docker 20.10+)
    networks:
      - dockernetwork

networks:
  dockernetwork:
    external: true
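
The logging options above cap how much disk space the container's logs can take. As a rough sketch (assuming Docker reads "5m" as binary megabytes and ignoring the savings from compression; the helper names are hypothetical), the worst case can be estimated like this:

```python
# Rough upper bound on disk used by one container's json-file logs.
# Values mirror the compose file above; helper names are hypothetical.

UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text: str) -> int:
    """Convert a Docker-style size such as "5m" into bytes (binary units assumed)."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def max_log_bytes(max_size: str, max_file: str) -> int:
    """Worst case: every retained log file is full and uncompressed."""
    return parse_size(max_size) * int(max_file)


print(max_log_bytes("5m", "3") / 1024**2, "MiB")
```

With max-size "5m" and max-file "3", the bound is about 15 MiB per container, and usually less because the rotated files are compressed.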

./config/config.yaml

    # -- DATA CONFIGURATION --
    # Root directory for user data storage
    dataRoot: ./data
    # -- SERVER CONFIGURATION --
    # Listen for incoming connections
    listen: true
    # Listen on a specific address, supports IPv4 and IPv6
    listenAddress:
      ipv4: 0.0.0.0
      ipv6: '[::]'
    # Enables IPv6 and/or IPv4 protocols. Need to have at least one enabled!
    # - Use option "auto" to automatically detect support
    # - Use true or false (no quotes) to enable or disable each protocol
    protocol:
      ipv4: true
      ipv6: false
    # Prefers IPv6 for DNS. Enable this on ISPs that don't have issues with IPv6
    dnsPreferIPv6: false
    # -- BROWSER LAUNCH CONFIGURATION --
    browserLaunch:
      # Open the browser automatically on server startup.
      enabled: true
      # Browser to use for opening the URL.
      # NOT SUPPORTED ON ANDROID DEVICES.
      # - Use "default" to use the system default browser
      # - Use "firefox", "chrome", "edge"
      browser: 'default'
      # Overrides the hostname that opens in the browser.
      # - Use "auto" to let the server decide
      # - Use options like 'localhost', 'st.example.com'
      hostname: 'auto'
      # Overrides the port for run in the browser.
      # - Use -1 to use the server port.
      # - Specify a port to override the default.
      port: -1
      # Avoids using 'localhost' as the hostname in auto mode.
      # Use if you don't have 'localhost' in your hosts file
      avoidLocalhost: false
    # Server port
    port: 8000
    # -- SSL options --
    ssl:
      # Enable SSL/TLS encryption
      enabled: false
      # Path to certificate (relative to server root)
      certPath: "./certs/cert.pem"
      # Path to private key (relative to server root)
      keyPath: "./certs/privkey.pem"
      # Private key passphrase (leave empty if not needed)
      # For better security, use a CLI argument or an environment variable (SILLYTAVERN_SSL_KEYPASSPHRASE)
      keyPassphrase: ""
    # -- SECURITY CONFIGURATION --
    # Toggle whitelist mode
    whitelistMode: false
    # Whitelist will also verify IP in X-Forwarded-For / X-Real-IP headers
    enableForwardedWhitelist: true
    # Whitelist of allowed IP addresses
    whitelist:
      - ::1
      - 127.0.0.1
    # Automatically whitelist Docker host and gateway IPs
    whitelistDockerHosts: true
    # Toggle basic authentication for endpoints
    basicAuthMode: true
    # Basic authentication credentials
    basicAuthUser:
      username: "CHANGE_ME"
      password: "CHANGE_ME"
    # Enables CORS proxy middleware
    enableCorsProxy: false
    # -- REQUEST PROXY CONFIGURATION --
    requestProxy:
      # If a proxy is enabled, all outgoing HTTP/HTTPS requests will be routed through it.
      enabled: false
      # Proxy URL. Possible protocols: http, https, socks, socks5, socks4, pac
      url: "socks5://username:password@example.com:1080"
      # Proxy bypass list. Requests to these hosts won't be routed through the proxy.
      bypass:
        - localhost
        - 127.0.0.1
    # Enable multi-user mode
    enableUserAccounts: true
    # Enable discreet login mode: hides user list on the login screen
    enableDiscreetLogin: false
    # If `basicAuthMode` and this are enabled then
    # the username and passwords for basic auth are the same as those
    # for the individual accounts
    perUserBasicAuth: false
    
    # -- SSO LOGIN CONFIGURATION --
    sso:
      # Enables Authelia-based auto login. Only enable this if you
      # have set up and installed Authelia as middleware on your
      # reverse proxy.
      # https://www.authelia.com/
      # This will auto login to an account with the same username
      # as the one used for Authelia. (Ensure the username in Authelia
      # is an exact match in lowercase with the one in SillyTavern.)
      autheliaAuth: false
      # Enables Authentik-based auto login. Only enable this if you
      # have set up and installed Authentik as middleware on your
      # reverse proxy.
      # https://goauthentik.io/
      # This will auto login to an account with the same username
      # as the one used for Authentik. (Ensure the username in Authentik
      # is an exact match in lowercase with the one in SillyTavern.)
      authentikAuth: false
    
    # Host whitelist configuration. Recommended if you're using a listen mode
    hostWhitelist:
      # Enable or disable host whitelisting
      enabled: false
      # Scan incoming requests for potential host header spoofing
      scan: true
      # List of allowed hosts. Do not include localhost or IPs, these are safe.
      # Use a dot to create subdomain patterns.
      # Examples:
      # - example.com
      # - .trycloudflare.com
      hosts: []
    
    # User session timeout *in seconds* (defaults to 24 hours).
    ## Set to a positive number to expire session after a certain time of inactivity
    ## Set to 0 to expire session when the browser is closed
    ## Set to a negative number to disable session expiration
    sessionTimeout: -1
    # Disable CSRF protection - NOT RECOMMENDED
    disableCsrfProtection: false
    # Disable startup security checks - NOT RECOMMENDED
    securityOverride: false
    # -- LOGGING CONFIGURATION --
    logging:
      # Enable access logging to access.log file and console output
      # Records new connections with timestamp, IP address and user agent
      enableAccessLog: true
      # Minimum log level to display in the terminal (DEBUG = 0, INFO = 1, WARN = 2, ERROR = 3)
      minLogLevel: 0
    # -- RATE LIMITING CONFIGURATION --
    rateLimiting:
      # Use X-Real-IP header instead of socket IP for rate limiting
      # Only enable this if you are using a properly configured reverse proxy (like Nginx/traefik/Caddy)
      preferRealIpHeader: false
    
    ## BACKUP CONFIGURATION
    backups:
      # Common settings for all backup types
      common:
        # Number of backups to keep for each chat and settings file
        numberOfBackups: 50
      chat:
        # Enable automatic chat backups
        enabled: true
        # Verify integrity of chat files before saving
        checkIntegrity: true
        # Maximum number of chat backups to keep per user (starting from the most recent). Set to -1 to keep all backups.
        maxTotalBackups: -1
        # Interval in milliseconds to throttle chat backups per user
        throttleInterval: 10000
    
    # THUMBNAILING CONFIGURATION
    thumbnails:
      # Enable thumbnail generation
      enabled: true
      # Image format of avatar thumbnails:
      # * "jpg": best compression with adjustable quality, no transparency
      # * "png": preserves transparency but increases filesize by about 100%
      # Changing this only affects new thumbnails. To recreate the old ones, clear out /thumbnails folder in your user data.
      format: "jpg"
      # JPG thumbnail quality (0-100)
      quality: 95
      # Maximum thumbnail dimensions per type [width, height]
      dimensions: { 'bg': [160, 90], 'avatar': [96, 144], 'persona': [96, 144] }
    
    # PERFORMANCE-RELATED CONFIGURATION
    performance:
      # Enables lazy loading of character cards. Improves performances with large card libraries.
      # May have compatibility issues with some extensions.
      lazyLoadCharacters: false
      # The maximum amount of memory that parsed character cards can use. Set to 0 to disable memory caching.
      memoryCacheCapacity: '100mb'
      # Enables disk caching for character cards. Improves performances with large card libraries.
      useDiskCache: true
    
    # CACHE BUSTER CONFIGURATION
    # IMPORTANT: Requires localhost or a domain with HTTPS, otherwise will not work!
    cacheBuster:
      # Clear browser cache on first load or after uploading image files
      enabled: false
      # Only clear cache for the specified user agent regex pattern
      # Example: 'firefox|safari' (case-insensitive)
      userAgentPattern: ''
    
    # Allow secret keys exposure via API
    allowKeysExposure: false
    # Skip new default content checks
    skipContentCheck: false
    # Allowed hosts for card downloads
    whitelistImportDomains:
      - localhost
      - cdn.discordapp.com
      - files.catbox.moe
      - raw.githubusercontent.com
      - char-archive.evulid.cc
    # API request overrides (for KoboldAI and Text Completion APIs)
    ## Note: host includes the port number if it's not the default (80 or 443)
    ## Format is an array of objects:
    ## - hosts:
    ##   - example.com
    ##   headers:
    ##     Content-Type: application/json
    ##   - 127.0.0.1:5001
    ##   headers:
    ##     User-Agent: "Googlebot/2.1 (+http://www.google.com/bot.html)"
    requestOverrides: []
    
    # EXTENSIONS CONFIGURATION
    extensions:
      # Enable UI extensions
      enabled: true
      # Automatically update extensions when a release version changes
      autoUpdate: true
      models:
        # Enables automatic model download from HuggingFace
        autoDownload: true
        # Additional models for extensions. Expects model IDs from HuggingFace model hub in ONNX format
        classification: Cohee/distilbert-base-uncased-go-emotions-onnx
        captioning: Xenova/vit-gpt2-image-captioning
        embedding: Cohee/jina-embeddings-v2-base-en
        speechToText: Xenova/whisper-small
        textToSpeech: Xenova/speecht5_tts
    
    # Additional model tokenizers can be downloaded on demand.
    # Disabling will fallback to another locally available tokenizer.
    enableDownloadableTokenizers: true
    # -- OPENAI CONFIGURATION --
    # A placeholder message to use in strict prompt post-processing mode when the prompt doesn't start with a user message
    promptPlaceholder: "[Start a new chat]"
    openai:
      # Will send a random user ID to OpenAI completion API
      randomizeUserId: false
      # If not empty, will add this as a system message to the start of every caption completion prompt
      # Example: "Perform the instructions to the best of your ability.\n" (for LLaVA)
      # Not used in image inlining mode
      captionSystemPrompt: ""
    # -- DEEPL TRANSLATION CONFIGURATION --
    deepl:
      # Available options: default, more, less, prefer_more, prefer_less
      formality: default
    # -- MISTRAL API CONFIGURATION --
    mistral:
      # Enables prefilling of the reply with the last assistant message in the prompt
      # CAUTION: The prefix is echoed into the completion. You may want to use regex to trim it out.
      enablePrefix: false
    # -- OLLAMA API CONFIGURATION --
    ollama:
      # Controls how long the model will stay loaded into memory following the request
      # * -1: Keep the model loaded indefinitely
      # * 0: Unload the model immediately after the request
      # * N (any positive number): Keep the model loaded for N seconds after the request.
      keepAlive: -1
      # Controls the "num_batch" (batch size) parameter of the generation request
      # * -1: Use the default value of the model
      # * N (positive number): Use the specified value. Must be a power of 2, e.g. 128, 256, 512, etc.
      batchSize: -1
    # -- ANTHROPIC CLAUDE API CONFIGURATION --
    claude:
      # Enables caching of the system prompt (if supported).
      # https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
      # -- IMPORTANT! --
      # Use only when the prompt before the chat history is static and doesn't change between requests
      # (e.g {{random}} macro or lorebooks not as in-chat injections).
      # Otherwise, you'll just waste money on cache misses.
      enableSystemPromptCache: false
      # Enables caching of the message history at depth (if supported).
      # https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
      # -- IMPORTANT! --
      # Use with caution. Behavior may be unpredictable and no guarantees can or will be made.
      # Set to an integer to specify the desired depth. 0 (which does NOT include the prefill)
      # should be ideal for most use cases.
      # Any value other than a non-negative integer will be ignored and caching at depth will not be enabled.
      cachingAtDepth: -1
      # Use 1h TTL instead of the default 5m.
      ## 5m: base price x 1.25
      ## 1h: base price x 2
      extendedTTL: false
    # -- GOOGLE GEMINI API CONFIGURATION --
    gemini:
      # API endpoint version ("v1beta" or "v1alpha")
      apiVersion: 'v1beta'
    # -- SERVER PLUGIN CONFIGURATION --
    enableServerPlugins: false
    # Attempt to automatically update server plugins on startup
    enableServerPluginsAutoUpdate: true

The main changes are:

    listen: true
    whitelistMode: false
    basicAuthUser:
      username: "CHANGE_ME"
      password: "CHANGE_ME"
    # Enable multi-user mode
    enableUserAccounts: true

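This configuration enables multiple users, but it does not hide the user list and does not authenticate each user separately. I plan to use the user accounts as categories.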
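
With basicAuthMode enabled, every request has to carry an HTTP Basic Authorization header built from the basicAuthUser credentials. A minimal sketch of how a script could build such a request (the URL and credentials in the usage comment are placeholders, and the helper names are hypothetical):

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare a GET request that carries the Basic auth credentials."""
    req = urllib.request.Request(base_url)
    req.add_header("Authorization", basic_auth_header(username, password))
    return req


# Usage (needs a running server, so it is left as a comment):
# req = build_request("http://localhost:8000/", "CHANGE_ME", "CHANGE_ME")
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.status)
```

Basic auth only base64-encodes the credentials, so it should sit behind HTTPS (for example at the reverse proxy) when the server is reachable from the internet.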

When using Nginx Proxy Manager, add the following custom configuration so that the container sees the correct visitor IP:

real_ip_header CF-Connecting-IP;
proxy_set_header X-Real-IP         $remote_addr;
proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Host  $host;
proxy_read_timeout 360s;
proxy_send_timeout 360s;
client_body_timeout 360s;
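
To illustrate what these headers give the backend, here is a minimal sketch of how an application behind the proxy could pick the visitor IP from them. This is a generic illustration and not SillyTavern's actual logic; the function name and the trust_proxy flag are assumptions.

```python
from ipaddress import ip_address


def client_ip(headers: dict[str, str], socket_ip: str, trust_proxy: bool) -> str:
    """Pick the visitor IP, preferring proxy headers only when the proxy is trusted."""
    if not trust_proxy:
        # Without a trusted proxy, headers can be forged by the client.
        return socket_ip
    real_ip = headers.get("X-Real-IP", "").strip()
    if real_ip:
        return str(ip_address(real_ip))  # raises ValueError on malformed input
    # X-Forwarded-For is a comma-separated chain; the first entry is the original client.
    first = headers.get("X-Forwarded-For", "").split(",")[0].strip()
    if first:
        return str(ip_address(first))
    return socket_ip


headers = {"X-Real-IP": "203.0.113.7", "X-Forwarded-For": "203.0.113.7, 172.18.0.2"}
print(client_ip(headers, "172.18.0.5", trust_proxy=True))
```

The headers are only trustworthy when all traffic really passes through the proxy, which is why the container should not also be exposed directly to the internet.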

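2. Web UI configuration

Multiple users can be managed under User Settings > Admin Panel.

As I recall, you used to have to manually tick the option that allows {{user}} in bot messages; a fresh SillyTavern install now has it ticked by default. I skimmed through the remaining settings and have not found anything else that needs changing by hand so far.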

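3. API

3.1 CLI to API

When I first started with SillyTavern I used a reverse-proxied Claude, then the cheaper you[.]com. Once account bans there became severe it turned into a chore, and getting set up took a long time. Later I switched to freeloading on the generous Google's Gemini CLI, which amounts to 1,000 free 2.5 Pro calls per account per day. The experience is probably still a bit behind Claude, but it is usable, and there is none of the pointless tinkering.

I use this project: https://github.com/su-kaka/gcli2api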

docker-compose.yml

services:
  gcli2api:
    image: ghcr.io/su-kaka/gcli2api:latest
    container_name: gcli2api
    restart: unless-stopped
    environment:
      - PASSWORD=CHANGE_ME
      - PORT=7861
    volumes:
      - ./data/creds:/app/creds
    ports:
      - "7861:7861"   # port exposed to the outside
    networks:
      - dockernetwork # lets other containers reach it by service name
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import sys, urllib.request, os; port = os.environ.get('PORT', '7861'); req = urllib.request.Request(f'http://localhost:{port}/v1/models', headers={'Authorization': 'Bearer ' + os.environ.get('PASSWORD', 'pwd')}); sys.exit(0 if urllib.request.urlopen(req, timeout=5).getcode() == 200 else 1)\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

networks:
  dockernetwork:
    external: true

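I ran into some problems the first time I fetched the configuration. It was not smooth, and I have forgotten how I solved them. Hopefully the process is smoother in newer versions. If you hit problems, try searching the project's issues. I vaguely remember that, back then, fetching the configuration could only succeed on the first attempt after the container started.

For reverse proxying through NPM (Nginx Proxy Manager), the configuration can be changed along these lines: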

services:
  gcli2api:
    image: ghcr.io/cetaceang/gcli2api:latest
    container_name: gcli2api
    restart: unless-stopped
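    expose:           # expose lists container ports only (no host:container mapping)
      - "7861"
      - "8080"
    environment:
      - PASSWORD=CHANGE_ME
      # - GOOGLE_CREDENTIALS=${GOOGLE_CREDENTIALS}  # optional: pass credentials in via an environment variable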

    volumes:
      - ./data/creds:/app/creds
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import sys, urllib.request, os; req = urllib.request.Request('http://localhost:7861/v1/models', headers={'Authorization': 'Bearer ' + os.environ.get('PASSWORD', 'pwd')}); sys.exit(0 if urllib.request.urlopen(req, timeout=5).getcode() == 200 else 1)\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

networks:
  default:
    external: true
    name: dockernetwork

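Containers on the same Docker network can set http://gcli2api:7861 as the custom endpoint (base URL).

As of this post's first publication, Gemini 3 is not yet available, though it should be soon. For now I usually use: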
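
The healthcheck one-liner in the compose files can be unpacked into a small script for checking the endpoint by hand. This sketch assumes the same /v1/models path and Bearer password that the healthcheck uses; the function names are hypothetical and the response is assumed to be JSON.

```python
import json
import urllib.request


def build_models_request(base_url: str, password: str) -> urllib.request.Request:
    """Prepare the same request the healthcheck sends: GET /v1/models with a Bearer token."""
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {password}"})


def list_models(base_url: str, password: str, timeout: float = 5.0):
    """Send the request and return the parsed JSON body (shape not assumed here)."""
    with urllib.request.urlopen(build_models_request(base_url, password), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage from another container on the same Docker network (left as a comment):
# print(list_models("http://gcli2api:7861", "CHANGE_ME"))
```

From the host, use http://localhost:7861 instead of the service name, provided the port is published.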

gemini-2.5-pro-preview-06-05

3.2 Google AI Studio to API

The CLI route still could not use 3.0 after a long wait, so I tried this approach as well, using this project:

https://gcn02iwpisfi.feishu.cn/wiki/UMDzwFu0ki3AEfkQ3A7c7bHvnIL

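P.S. After running it for a while in my tests, memory usage stays a little above 600 MB (currently 633.9 MB), somewhat less than the author states, so it has probably been optimized.

Obtain the auth file as described in the tutorial. The project has not been adapted for Mac.

The tutorial starts the container with docker run via a one-click script; I converted it to run with Docker Compose.

docker-compose.yml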

services:
  ais2api:
    image: ellinalopez/cloud-studio:latest
    container_name: ais2api
    restart: unless-stopped
    ports:
      - "8889:7860"
    env_file:
      - app.env                  # equivalent to --env-file app.env
    # If a proxy is needed, provide PROXY_URL in the environment (e.g. a .env file or a shell environment variable)
    environment:
      - HTTP_PROXY=${PROXY_URL:-}
      - HTTPS_PROXY=${PROXY_URL:-}
    logging:
      driver: json-file
      options:
        max-size: "5m"
        max-file: "3"
        compress: "true"
    volumes:
      - ./auth:/app/auth

app.env

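With newer models and SillyTavern versions, you can (and I recommend you do) turn off streaming in the preset; this gives you the fake streaming, which is less prone to truncation, as intended. With clients that can only use streaming (such as Cherry Studio), you need to set STREAMING_MODE to fake, or switch it with the button in the web UI after startup.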

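# Proxy password to use in SillyTavern
API_KEYS=

# To load auth information from app.env, remove the # below to uncomment the line, and paste in the entire contents of the auth_single auth file obtained earlier.
# AUTH_JSON_1=
# To add more accounts, continue the pattern with AUTH_JSON_2, AUTH_JSON_3, and so on.

# The parameters below are optional
# (Optional) Switch to the next account after 40 uses (50 or lower recommended; higher values lead to high memory usage and then stuttering)
SWITCH_ON_USES=40

# (Optional) Server-side retry count: 1. (Details: the internal WebSocket rarely disconnects and the browser side already retries automatically 3 times, so the server-side count defaults to 1. Setting it to 2 makes the total 2*3=6 retries.)
MAX_RETRIES=1

# (Optional) Retry after 3 failures
FAILURE_THRESHOLD=3

# (Optional) Switch accounts immediately on 429 and 503 errors
IMMEDIATE_SWITCH_STATUS_CODES=429,503

# (Optional) Starting account; defaults to the first one
INITIAL_AUTH_INDEX=null

# Streaming mode: real or fake (set to fake to use fake streaming)
STREAMING_MODE=real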
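
For readers unfamiliar with the term, "fake streaming" generally means that the proxy waits for the complete upstream reply and then replays it to the client in small pieces that look like a stream. The sketch below illustrates that general idea only. It is not this project's implementation, and the event payload shape is made up for the example.

```python
import json
from typing import Iterator


def fake_stream(full_text: str, chunk_size: int = 20) -> Iterator[str]:
    """Replay an already complete reply as a sequence of SSE-style events."""
    for start in range(0, len(full_text), chunk_size):
        piece = full_text[start:start + chunk_size]
        # Hypothetical payload shape; real APIs define their own event format.
        yield f"data: {json.dumps({'delta': piece})}\n\n"
    yield "data: [DONE]\n\n"


for event in fake_stream("The whole reply was generated before streaming began.", 16):
    print(event, end="")
```

Because the full reply exists before anything is sent, a dropped connection in the middle of generation cannot leave the client with half a message, which fits the observation that this mode truncates less often.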

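AI Studio limits 2.5 to 100 calls per day; I have not tried 3.0 yet, so I am not sure about it. I therefore set up multiple accounts in the auth folder. In the env file, only API_KEYS (the proxy password used in SillyTavern) needs to be changed.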
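
The numbers above make for simple arithmetic when deciding how many accounts to add. The sketch below assumes the 100 calls per day figure from this post and a plain round-robin rotation every SWITCH_ON_USES requests; the project's real rotation logic may differ, and the function names are hypothetical.

```python
def daily_capacity(accounts: int, per_account_limit: int = 100) -> int:
    """Total calls per day across all accounts, given a per-account daily limit."""
    return accounts * per_account_limit


def account_for_request(n: int, accounts: int, switch_on_uses: int = 40) -> int:
    """0-based index of the account serving the n-th request (n starts at 0),
    assuming round-robin rotation every `switch_on_uses` requests."""
    return (n // switch_on_uses) % accounts


def total_retries(server_retries: int, browser_retries: int = 3) -> int:
    """Each server-side retry triggers the browser's own retries, so the counts multiply."""
    return server_retries * browser_retries


print(daily_capacity(3))            # three accounts
print(account_for_request(40, 3))   # the 41st request moves to the second account
print(total_retries(2))             # MAX_RETRIES=2 with 3 browser-side retries
```

With three accounts this gives 300 calls per day, and it shows why raising MAX_RETRIES from 1 to 2 doubles the total number of attempts from 3 to 6.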

As recommended, use the chat completion source Google AI Studio. Proxy password: the API_KEYS value set earlier. API key: leave empty.

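4. Commonly used extensions

Essentially a must-install: Prompt Template https://github.com/zonde306/ST-Prompt-Template/

Essentially a must-install: Tavern Helper (酒馆助手) https://github.com/n0vi028/JS-Slash-Runner

Many presets require the Tavern Helper plugin SoliUmbra (SPreset). It does not seem to have a GitHub link, so I am not including one. Once you find you need it, you will be able to find where to download it.

Other extensions can usually be installed as recommended by preset and character card authors, for example:

Memory Table (记忆表格) https://github.com/muyoou/st-memory-enhancement

LittleWhiteBox (小白box) https://github.com/RT15548/LittleWhiteBox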

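5. Text-to-image

I started out with a preset called 小画师 (Little Painter), which felt convenient and smooth when paired with a local Stable Diffusion. It no longer seems to be updated, and many people now use a Tampermonkey userscript: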

https://update.greasyfork.org/scripts/489295/%E9%80%9A%E7%94%A8ai%E6%8F%92%E5%9B%BE%E8%84%9A%E6%9C%AC8.user.js

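When I tried it a couple of months ago, its SD support felt mediocre and the image results (the generated prompts) were not very accurate or practical, so I stopped using it.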

Copyright belongs to the author. Please do not repost without permission.
