I'm using the fairydreaming/T5-branch; I'm not sure the current llama-cpp-python server supports T5.
Model-Q6_K-GGUF, Reference1
Select the AI model to use for chat
Define the AI assistant's personality and behavior
Maximum response length in tokens (higher = longer replies)
Creativity level (higher = more creative, lower = more focused)
Nucleus sampling threshold (higher = sample from a wider slice of the probability mass)
Limit vocabulary choices to the top K most likely tokens
Penalize repeated words (higher = less repetition)
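The settings above map naturally onto the fields of an OpenAI-style chat completion request, which is the interface the llama-cpp-python server exposes. Below is a minimal sketch of assembling such a request payload; the helper name `build_chat_request`, the default values, and the use of the "Model-Q6_K-GGUF" name as the model identifier are assumptions for illustration, not confirmed details of this setup.

```python
import json

def build_chat_request(
    model: str,
    system_prompt: str,
    user_message: str,
    max_tokens: int = 256,       # maximum response length (higher = longer replies)
    temperature: float = 0.7,    # creativity level (higher = more creative)
    top_p: float = 0.95,         # nucleus sampling threshold
    top_k: int = 40,             # limit vocabulary choices to top K tokens
    repeat_penalty: float = 1.1, # penalize repeated words (higher = less repetition)
) -> dict:
    """Assemble an OpenAI-style chat request dict (hypothetical helper).

    Field names mirror the sampling options described above; top_k and
    repeat_penalty are llama.cpp extensions, not standard OpenAI fields.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "repeat_penalty": repeat_penalty,
    }

# Placeholder model name taken from the note above; adjust to your server's
# loaded model alias.
payload = build_chat_request(
    "Model-Q6_K-GGUF",
    "You are a concise assistant.",
    "Hello!",
)
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the server's `/v1/chat/completions` endpoint; whether the T5 branch honors all of these sampling fields is something to verify against that branch.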