Running all modules together requires at least 6 GB of VRAM. With Stable Diffusion disabled, it will probably fit in 4 GB. Alternatively, everything can be run on the CPU.

How to run

1. Install Python 3.10.
2. Run `pip install -r requirements.txt`.
3. Run `python server.py`.
4. Note the API URL. It defaults to `http://localhost:5100` when you run the server locally.
5. Start TavernAI with extensions support: set `enableExtensions` to `true` in `config.conf`.
6. Navigate to the TavernAI settings, enter the API URL, and click "Connect" to load the extensions.

UI Extensions

Name	Description	Required Modules
Image Captioning	Send a cute picture to your bot!	`caption`

Modules

Name	Description
`caption`	Image captioning
`summarize`	Text summarization
`classify`	Text sentiment classification
`keywords`	Text key phrases extraction
`prompt`	SD prompt generation from text
`sd`	Stable Diffusion image generation

API Endpoints

Get UI extensions list

GET /api/extensions

Input

None

Output

{"extensions":[{"metadata":{"css":"file.css","display_name":"human-friendly name","js":"file.js","requires":["module_id"]},"name":"extension_name"}]}
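
As a rough illustration of how a client might consume this response, the sketch below parses the documented output and keeps only the extensions whose `requires` list is satisfied by the enabled modules. The helper name `usable_extensions` and the filtering rule are assumptions made for this example; they are not taken from the server or from TavernAI.

```python
import json

# Sample response, copied from the output format documented above.
SAMPLE_RESPONSE = (
    '{"extensions":[{"metadata":{"css":"file.css",'
    '"display_name":"human-friendly name","js":"file.js",'
    '"requires":["module_id"]},"name":"extension_name"}]}'
)


def usable_extensions(response_text, enabled_modules):
    """Return the names of extensions whose required modules are all enabled.

    Hypothetical helper for illustration; not part of the server code.
    """
    enabled = set(enabled_modules)
    extensions = json.loads(response_text)["extensions"]
    return [
        ext["name"]
        for ext in extensions
        if set(ext["metadata"]["requires"]) <= enabled
    ]


print(usable_extensions(SAMPLE_RESPONSE, ["module_id"]))  # ['extension_name']
print(usable_extensions(SAMPLE_RESPONSE, ["caption"]))    # []
```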

Get UI extension JS script

GET /api/script/<name>

Input

Extension name in the route

Output

File content

Get UI extension CSS stylesheet

GET /api/style/<name>

Input

Extension name in the route

Output

File content
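
Since both routes take the extension name as a path segment, a client could derive the two asset URLs from a name returned by `/api/extensions`. The following is a minimal sketch; `extension_asset_urls` is a hypothetical helper, and percent-encoding the name is a precaution assumed here rather than something the server is known to require.

```python
from urllib.parse import quote


def extension_asset_urls(name, api_url="http://localhost:5100"):
    """Build the script and stylesheet URLs for one extension.

    Hypothetical helper; the default api_url is the documented local default.
    """
    base = api_url.rstrip("/")
    # Percent-encode the name so it is safe to use as a single path segment.
    safe_name = quote(name, safe="")
    return {
        "script": f"{base}/api/script/{safe_name}",
        "style": f"{base}/api/style/{safe_name}",
    }


urls = extension_asset_urls("extension_name")
print(urls["script"])
print(urls["style"])
```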

BLIP model for image captioning

POST /api/caption

Input

{ "image": "base64 encoded image" }

Output

{ "caption": "caption of the posted image" }
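
A caption request could be assembled with the standard library roughly as follows. This is a sketch under assumptions: the function names are invented for the example, the `Content-Type: application/json` header is assumed to be what the server expects, and the image format accepted by the server is not specified here.

```python
import base64
import json
import urllib.request


def build_caption_request(image_bytes, api_url="http://localhost:5100"):
    """Wrap raw image bytes in the documented {"image": ...} JSON body."""
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        f"{api_url}/api/caption",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the server reads the body as JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def caption_image(image_bytes, api_url="http://localhost:5100"):
    """Send the request and return the "caption" field of the response."""
    request = build_caption_request(image_bytes, api_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["caption"]
```

With the server running and the `caption` module enabled, `caption_image(open("picture.png", "rb").read())` would return the caption string; `picture.png` is a placeholder file name.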

BART model for text summarization

POST /api/summarize

Input

{ "text": "text to be summarized", "params": {} }

Output

{ "summary": "summarized text" }

Optional: `params` object for control over summarization:

Name	Default value
`temperature`	1.0
`repetition_penalty`	1.0
`max_length`	500
`min_length`	200
`length_penalty`	1.5
`bad_words`	["\n", '"', "*", "[", "]", "{", "}", ":", "(", ")", "<", ">"]
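
To show how the optional `params` object fits into the request body, here is a small sketch that builds a summarization payload and rejects parameter names that are not in the table above. `build_summarize_payload` is a hypothetical helper, and it assumes that any parameter left out falls back to the listed default on the server side.

```python
import json

# Parameter names taken from the table above.
DOCUMENTED_PARAMS = {
    "temperature",
    "repetition_penalty",
    "max_length",
    "min_length",
    "length_penalty",
    "bad_words",
}


def build_summarize_payload(text, **params):
    """Return the JSON-serializable body for POST /api/summarize.

    Only the overrides are sent; omitted parameters are assumed to use
    the server defaults.
    """
    unknown = set(params) - DOCUMENTED_PARAMS
    if unknown:
        raise ValueError(f"Unknown summarization parameters: {sorted(unknown)}")
    return {"text": text, "params": params}


payload = build_summarize_payload("A long chat log", max_length=150, min_length=50)
print(json.dumps(payload))
```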

BERT model for text classification

POST /api/classify

Input

{ "text": "text to classify sentiment of" }

Output

{
    "classification": [
        {
            "label": "joy",
            "score": 1.0
        },
        {
            "label": "anger",
            "score": 0.7
        },
        {
            "label": "love",
            "score": 0.6
        },
        {
            "label": "sadness",
            "score": 0.5
        },
        {
            "label": "fear",
            "score": 0.4
        },
        {
            "label": "surprise",
            "score": 0.3
        }
    ]
}

NOTES

The results are sorted by score in descending order.

There are six fixed categories.

Scores range from 0.0 to 1.0.
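
One way a client might use this output is to pick the strongest emotion and ignore weak results. The sketch below does that on a shortened copy of the sample response; `dominant_emotion` and the `min_score` cutoff of 0.5 are assumptions for illustration, not part of the API.

```python
# Shortened copy of the sample output shown above.
SAMPLE_OUTPUT = {
    "classification": [
        {"label": "joy", "score": 1.0},
        {"label": "anger", "score": 0.7},
        {"label": "love", "score": 0.6},
    ]
}


def dominant_emotion(response, min_score=0.5):
    """Return the highest-scoring label, or None if it is below min_score.

    max() is used instead of taking the first item, so the result does not
    depend on the response ordering.
    """
    items = response["classification"]
    if not items:
        return None
    best = max(items, key=lambda item: item["score"])
    return best["label"] if best["score"] >= min_score else None


print(dominant_emotion(SAMPLE_OUTPUT))  # joy
```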

Key phrase extraction

POST /api/keywords

Input

{ "text": "text to be scanned for key phrases" }

Output

{
    "keywords": [
        "array of",
        "extracted",
        "keywords"
    ]
}

GPT-2 for Stable Diffusion prompt generation

POST /api/prompt

Input

{ "name": "character name (optional)", "text": "textual summary of a character" }

Output

{ "prompts": [ "array of generated prompts" ] }

Stable Diffusion for image generation

POST /api/image

Input

{ "prompt": "prompt to generate an image from" }

Output

{ "image": "base64 encoded image" }
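
Because the image comes back as a base64 string, a client has to decode it before displaying or saving it. The following sketch shows one possible approach; the helper names are hypothetical, and the actual image format (for example PNG or JPEG) is not stated here, so the file extension is left to the caller.

```python
import base64
from pathlib import Path


def decode_image(response):
    """Return the raw image bytes from a parsed /api/image response."""
    return base64.b64decode(response["image"])


def save_image(response, path):
    """Write the decoded image to path and return the number of bytes written."""
    data = decode_image(response)
    Path(path).write_bytes(data)
    return len(data)


# Stand-in response for demonstration; a real one would hold image data.
fake_response = {"image": base64.b64encode(b"hello").decode("ascii")}
print(decode_image(fake_response))  # b'hello'
```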

Additional options

Flag	Description
`--port`	Specify the port on which the application is hosted. Default: 5100
`--listen`	Hosts the app on the local network
`--share`	Shares the app through a Cloudflare tunnel
`--cpu`	Run the models on the CPU instead of CUDA
`--summarization-model`	Load a custom BART summarization model. Expects a HuggingFace model ID. Default: Qiliang/bart-large-cnn-samsum-ChatGPT_v3
`--classification-model`	Load a custom BERT sentiment classification model. Expects a HuggingFace model ID. Default: bhadresh-savani/distilbert-base-uncased-emotion
`--captioning-model`	Load a custom BLIP captioning model. Expects a HuggingFace model ID. Default: Salesforce/blip-image-captioning-base
`--keyphrase-model`	Load a custom key phrase extraction model. Expects a HuggingFace model ID. Default: ml6team/keyphrase-extraction-distilbert-inspec
`--prompt-model`	Load a custom GPT-2 prompt generation model. Expects a HuggingFace model ID. Default: FredZhang7/anime-anything-promptgen-v2
`--sd-model`	Load a custom Stable Diffusion image generation model. Expects a HuggingFace model ID. Default: ckpt/anything-v4.5-vae-swapped. The model must have the VAE pre-baked in PyTorch format, or the output will look drab!
`--sd-cpu`	Forces the Stable Diffusion generation pipeline to run on the CPU. SLOW!
`--enable-modules`	Override a list of enabled modules. Runs with everything enabled by default. Expects a comma-separated list of module names. See Modules
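
The flags above can be combined on one command line. As a sketch, the helper below assembles an argument list for a launcher script, checking module names against the Modules table. `build_server_command` is hypothetical, and passing each value as a separate argument after its flag is an assumption about how `server.py` parses its options.

```python
import sys

# Module names from the Modules table.
KNOWN_MODULES = ("caption", "summarize", "classify", "keywords", "prompt", "sd")


def build_server_command(modules=None, port=None, cpu=False):
    """Return an argument list that starts server.py with the given options."""
    command = [sys.executable, "server.py"]
    if port is not None:
        command += ["--port", str(port)]
    if cpu:
        command.append("--cpu")
    if modules:
        unknown = [name for name in modules if name not in KNOWN_MODULES]
        if unknown:
            raise ValueError(f"Unknown modules: {unknown}")
        # --enable-modules expects a comma-separated list of module names.
        command += ["--enable-modules", ",".join(modules)]
    return command


# Skip the interpreter path, which differs between machines.
print(build_server_command(modules=["caption", "summarize"], port=5100)[1:])
```

The resulting list could be handed to `subprocess.run` from the directory that contains `server.py`.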
