TinyMCE AI on-premises

The TinyMCE AI on-premises service is a self-hosted back end that powers AI writing assistance. It can be used with the TinyMCE rich text editor, particularly the TinyMCE AI plugin, or as a standalone service. It runs entirely within the host infrastructure. Document content, conversation history, file attachments, and user data stay within the host network and are not stored by Tiny. Data sent to a configured LLM provider is subject to that provider’s data handling policies.

The service ships as a single Open Container Initiative (OCI) container image (registry.containers.tiny.cloud/ai-service-tiny). It exposes a REST API, a Management Panel, Server-Sent Events streaming, and an OpenAPI spec.

Architecture

[Figure: High-level architecture showing the client, token endpoint, AI service, LLM provider, and data layer]

Data flow for a single AI request:

  1. The client application requests a JWT from the token endpoint.

  2. The client sends a prompt with the JWT to the AI service over HTTPS.

  3. The AI service verifies the token, checks per-feature permissions, and forwards the prompt to the configured large language model (LLM).

  4. The LLM streams its response back to the AI service.

  5. The AI service relays the response to the client through Server-Sent Events (SSE).

When used with the TinyMCE AI plugin (tinymceai), the plugin handles steps 1, 2, and 5 automatically through the tinymceai_token_provider callback.
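When the service is used standalone, without the plugin, the client has to read the SSE stream from step 5 itself. The following is a minimal sketch of parsing a generic text/event-stream body. It assumes only the standard SSE framing (event: and data: fields, blank-line separators); the function name parse_sse and the event name "chunk" are hypothetical, and the actual event names and payloads are defined in the Server-Sent Events reference.

```python
import re


def parse_sse(stream_text: str) -> list[dict]:
    """Parse a text/event-stream body into a list of {"event", "data"} dicts."""
    events = []
    event_name, data_lines = "message", []
    # SSE lines end with CRLF, LF, or CR.
    for line in re.split(r"\r\n|\n|\r", stream_text):
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data_lines:
                events.append({"event": event_name, "data": "\n".join(data_lines)})
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is ignored
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
    return events
```

A production client would parse incrementally as bytes arrive rather than buffering the whole body; the framing rules stay the same.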

The browser connects directly to the AI service; requests do not pass through the application back end. The AI service must therefore be network-reachable from the end-user browser, either through a public URL or, for intranet deployments, through a VPN or internal network. Configure CORS and TLS for the AI service accordingly. Because the service does not terminate TLS itself, TLS is typically handled by a reverse proxy in front of it.

The shared secret (API Secret) is never sent to the browser. The token endpoint signs tokens with it on the back end, and the editor only ever handles signed tokens.
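As an illustration of step 1, the following is a minimal sketch of how a token endpoint might sign an HS256 JWT using only the Python standard library. The function names, the five-minute lifetime, and the nested layout of the auth.ai.permissions claim are assumptions made for this sketch, and the permission strings passed in are placeholders. The JWT authentication guide defines the claims the service actually requires.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_token(api_secret: str, user_id: str, permissions: list[str],
               ttl_seconds: int = 300) -> str:
    """Build and sign an HS256 JWT. Run this on the back end only."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "sub": user_id,  # conversation history is isolated per sub
        "iat": now,
        "exp": now + ttl_seconds,
        # Assumed nesting for the auth.ai.permissions claim.
        "auth": {"ai": {"permissions": permissions}},
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"
```

In practice a maintained JWT library is usually preferable to hand-rolled signing; the sketch only shows that the secret is used on the server and that the browser receives nothing but the resulting token.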

Capabilities

| Capability | Details |
| --- | --- |
| Conversational AI assistant | Multi-turn AI chat with support for document and file context. Conversation history is isolated per user through the JWT sub claim. |
| Document review | Review a document for correctness, clarity, readability, tone, and more, or translate to another language. |
| Quick actions | Rewrite, summarize, expand, change tone, fix grammar, translate, continue, and improve writing. |
| LLM provider flexibility | OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Google Vertex AI, or any self-hosted OpenAI-compatible endpoint. Multiple providers can coexist. |
| Model Context Protocol (MCP) integration | Connect internal tools, databases, and knowledge bases through Model Context Protocol over Streamable HTTP transport. |
| Web scraping and web search | Pluggable endpoints for fetching web pages and running searches. |
| Multi-tenant environments | Isolated conversation history and per-tenant access keys through Environments. |
| Per-user, per-feature permissions | Fine-grained control through the auth.ai.permissions JWT claim. |
| Streaming responses | Server-Sent Events from the LLM back to the browser. |
| File attachments | Use additional files as context for AI conversations. Storage options include database, filesystem, Amazon S3, or Azure Blob Storage. |
| Observability | Structured request logs, OpenTelemetry, and Langfuse. All three run as independent simultaneous pipelines. |
| Horizontal scaling | The service is stateless; add replicas behind a load balancer without shared local state. |
| OpenAPI specification | Published at /v1/api/doc.json with interactive documentation at /docs/. Auto-generate clients in any language. |

Credentials

Three credentials are involved in an on-premises deployment. They are distinct and serve different purposes.

| Credential | Where it lives | What it does | Required? |
| --- | --- | --- | --- |
| LICENSE_KEY | AI service container (environment variable) | Activates the AI service. A long string provided by the Tiny account representative. | Yes. The service refuses to start without it. |
| TINYMCE_API_KEY | Editor page (CDN script URL) or build configuration | Authenticates against cdn.tiny.cloud when loading TinyMCE from the CDN. This is the short string from the tiny.cloud dashboard. | Only when loading TinyMCE from the CDN. Omit for self-hosted editor bundles. |
| license_key (init option) | tinymce.init({ license_key: 'T8LK:…' }) | Activates premium TinyMCE features when using a self-hosted editor bundle (not the CDN). | Only for self-hosted editor deployments. Provided by the Tiny account representative. |

LICENSE_KEY (the AI service license) and TINYMCE_API_KEY / license_key (the editor license) are different credentials from different sources. Do not interchange them.

Prerequisites

| Requirement | Details |
| --- | --- |
| Container runtime | Docker 20.10+, Podman 4+, or any OCI-compatible runtime. Kubernetes, AWS ECS, and Azure Container Apps are also supported. |
| SQL database | MySQL 8.0 or PostgreSQL 13+ (16 recommended). |
| Redis | 3.2.6+ (7.x recommended). Single-node and Cluster modes are supported; Sentinel is not. |
| LLM access | At least one provider. Multiple providers can coexist. |
| License key and registry credentials | Provided by a Tiny account representative. |
| Token endpoint | A back end that signs HS256 JWTs. |
| Reverse proxy (recommended) | The AI service does not terminate Transport Layer Security (TLS). A reverse proxy such as nginx, HAProxy, or a cloud load balancer is recommended for TLS termination in production. |
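The version floors above can be checked mechanically before a deployment. The following is a small sketch, not part of the product: the MINIMUMS table copies only the prerequisites that state an explicit lower bound (MySQL is left out because it is listed as 8.0 rather than a minimum), and the component keys and function names are hypothetical.

```python
import re

# Lower bounds copied from the prerequisites; the keys are illustrative labels.
MINIMUMS = {
    "docker": (20, 10),
    "podman": (4,),
    "postgresql": (13,),
    "redis": (3, 2, 6),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading dotted number from a string such as '20.10.21-ce'."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(component: str, version: str) -> bool:
    """Return True when the reported version is at or above the listed floor."""
    return parse_version(version) >= MINIMUMS[component]
```

For example, meets_minimum("redis", "3.2.5") is False because 3.2.5 is below the 3.2.6 floor.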

Choosing a setup path

[Figure: Setup path decision tree]

All setup paths lead to the same set of topic guides listed below. The decision tree helps identify which guides to prioritize based on the deployment target.

Topic guides

For a first-time deployment, progress through the guides in order. Each guide can also be used independently as a reference for a specific topic.

| Guide | Scope |
| --- | --- |
| Getting started | Five-minute Docker Compose quick start. Stand up the AI service, database, Redis, token server, and a browser editor. |
| Database, Redis, and storage | Data layer setup: MySQL and PostgreSQL configuration, Redis connectivity, file storage options (S3, Azure Blob, filesystem, database), and host-local database connectivity. |
| LLM providers | Connect to OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Google Vertex AI, or any OpenAI-compatible endpoint (Ollama, vLLM, LM Studio). Also covers the custom model catalog and API key rotation. |
| MCP and web integrations | Model Context Protocol (MCP) tool integration, web scraping endpoints, and web search endpoints. |
| JWT authentication | HS256 signing model, required and optional claims, permissions reference, token endpoint examples in eight languages, and multi-tenant deployment patterns. |
| TinyMCE integration | Editor-side configuration: plugin options, token provider, authentication patterns, Cross-Origin Resource Sharing (CORS), and deployment checklists. |
| Production deployment | Kubernetes manifests, AWS ECS task definitions, horizontal scaling, security hardening, rate limiting, observability, backup and recovery, and upgrades. |
| Troubleshooting | Quick triage, container startup failures, JWT errors, LLM provider errors, editor issues, performance, and diagnostic recipes. |
| Reference | Environment variable reference, API endpoint reference, Server-Sent Events reference, and error code reference. |

Support

When submitting a support request, include:

Container logs

  docker logs ai-service --tail 200

Health check

  curl -fsS http://localhost:8000/health

Expected response:

  {"serviceName":"on-premises-http","uptime":1234}

Decoded JWT payload

Strip the signature before sharing the token, and decode the payload with a JWT library.
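Where a JWT library is not at hand, the payload can also be read with the standard library alone. The following is a minimal sketch under that assumption; the function names are hypothetical, and it deliberately does not verify the signature, so it is suitable for diagnostics only.

```python
import base64
import json


def _split(token: str) -> list[str]:
    """Split a compact JWT into its three dot-separated segments."""
    segments = token.strip().split(".")
    if len(segments) != 3:
        raise ValueError("expected header.payload.signature")
    return segments


def strip_signature(token: str) -> str:
    """Return header.payload only, so the shared value cannot be used as a token."""
    return ".".join(_split(token)[:2])


def decode_payload(token: str) -> dict:
    """Decode the payload segment WITHOUT verifying the signature."""
    payload_b64 = _split(token)[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

Printing json.dumps(decode_payload(token), indent=2) gives a readable payload to attach to the request.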

Environment variables

Redact secrets before submitting.

  docker inspect ai-service | jq '.[0].Config.Env'
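Redaction can be scripted so that nothing is missed by hand. The following is a minimal sketch that masks values by variable name. The name pattern (KEY, SECRET, PASSWORD, TOKEN) and the function name are assumptions for illustration, not a list published for the AI service, and values such as connection strings with embedded passwords still need a manual check.

```python
import re

# Assumed heuristic: variable names containing these words hold secrets.
SENSITIVE_NAME = re.compile(r"KEY|SECRET|PASSWORD|TOKEN", re.IGNORECASE)


def redact_env(env: list[str]) -> list[str]:
    """Mask values of sensitive-looking entries in a list of NAME=value strings."""
    redacted = []
    for entry in env:
        name, sep, value = entry.partition("=")
        if sep and SENSITIVE_NAME.search(name):
            value = "***REDACTED***"
        redacted.append(f"{name}{sep}{value}")
    return redacted
```

The jq command above prints a JSON array of NAME=value strings, so its output can be loaded with json.load and passed to redact_env before it is attached to the request.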
Image version

  docker inspect ai-service | jq '.[0].Config.Image'