CLI — gunni.ai docs

Install & Configure

The Gunni CLI lets you generate, edit, and transform media from your terminal. All operations go through the Gunni API server, so you only need your API key.

Install

# Use directly (no install)
npx gunni

# Or install globally
npm install -g gunni

Configure

Set your API key once. All subsequent commands authenticate automatically.

gunni config --set-gunni-key YOUR_KEY

Verify

gunni list models

If you see the model list, you are connected and ready to go.
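
For scripts, the same check can be wrapped in a small helper. This is a minimal sketch: the function name check_gunni is hypothetical, and it assumes the CLI exits with a non-zero status when the request fails, which these docs do not state.

```shell
# Hypothetical helper: reports whether `gunni list models` runs cleanly.
# Assumption: the CLI exits non-zero on failure (not stated in the docs).
check_gunni() {
  if gunni list models > /dev/null 2>&1; then
    echo "gunni: connected"
  else
    echo "gunni: not connected; review your settings with 'gunni config --show'" >&2
    return 1
  fi
}
```

Call check_gunni at the top of a script and stop early if it returns a non-zero status.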

image

Unified image command. The CLI auto-routes based on what you provide:

Text only: generates an image.

Image + text: edits the image.

Image only: describes the image.

Image + --upscale: upscales the image.

Image + --remove-bg: removes the background.

Usage

gunni image [image_path] [prompt] [flags]

Examples

# Generate from text
gunni image "a fox in watercolor style" -o output.png

# Edit an image with a prompt
gunni image photo.jpg "make it warmer and more cinematic" -o edited.png

# Describe an image
gunni image photo.jpg

# Upscale an image
gunni image photo.jpg --upscale -o upscaled.png
gunni image photo.jpg --upscale --scale 4 -o upscaled-4x.png

# Remove background
gunni image photo.jpg --remove-bg -o clean.png

# Generate multiple variants
gunni image "product shot on white" --variants 4

Flags

Name	Type	Default	Description
-m, --model	string	—	Override the default model for this operation.
-o, --output	string	—	Output file path.
--width	number	—	Output width in pixels.
--height	number	—	Output height in pixels.
--seed	number	—	Seed for reproducible results.
--upscale	boolean	—	Upscale the input image.
--remove-bg	boolean	—	Remove the background from the input image.
--scale	number	2	Upscale factor. Accepts 2 or 4.
--variants	number	1	Number of image variants to generate.
--style	string	—	Apply a saved visual style by name.
--preset	string	—	Apply a saved production preset by name.
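
The --seed flag makes it possible to render the same prompt under several known seeds and return to any one of them later. The sketch below uses only the flags listed above; the function name render_seeds and the seed-N.png naming scheme are this example's own choices, not part of the CLI.

```shell
# Render one prompt once per seed so each result can be reproduced later.
# render_seeds and the output file names are hypothetical.
render_seeds() {
  prompt=$1
  shift
  for seed in "$@"; do
    gunni image "$prompt" --seed "$seed" -o "seed-$seed.png" || return 1
  done
}
```

Usage: render_seeds "product shot on white" 1 2 3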

Default models

Generate: nano-banana

Edit: nano-banana-edit

Describe: florence-2

Upscale: topaz-upscale

Background removal: bria-bg-remove
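
The routing rules and default models above can be summarized as a small decision function. This is an illustration of the documented behavior only, not the CLI's actual code. The name image_route is hypothetical, and the sketch assumes that a positional argument is an image when it ends in a common image extension; how the real CLI tells a path from a prompt is not documented.

```shell
# Illustrative only: prints the operation and default model that the
# documented rules would select for a given `gunni image` argument list.
# Assumption: positional arguments ending in .png/.jpg/.jpeg/.webp are images.
image_route() {
  has_image=0; has_prompt=0; flag_op=""; skip=0
  for arg in "$@"; do
    if [ "$skip" -eq 1 ]; then skip=0; continue; fi
    case "$arg" in
      --upscale)   flag_op="upscale (topaz-upscale)" ;;
      --remove-bg) flag_op="remove-bg (bria-bg-remove)" ;;
      -m|--model|-o|--output|--width|--height|--seed|--scale|--variants|--style|--preset)
        skip=1 ;;                       # these flags take a value; skip it
      -*) ;;                            # other flags do not affect routing here
      *.png|*.jpg|*.jpeg|*.webp) has_image=1 ;;
      *) has_prompt=1 ;;
    esac
  done
  if [ "$has_image" -eq 1 ] && [ -n "$flag_op" ]; then
    echo "$flag_op"
  elif [ "$has_image" -eq 1 ] && [ "$has_prompt" -eq 1 ]; then
    echo "edit (nano-banana-edit)"
  elif [ "$has_image" -eq 1 ]; then
    echo "describe (florence-2)"
  elif [ "$has_prompt" -eq 1 ]; then
    echo "generate (nano-banana)"
  else
    echo "nothing to do" >&2
    return 1
  fi
}
```

Usage: image_route photo.jpg --upscale --scale 4 -o upscaled-4x.png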

video

Generate video from a still image (image-to-video) or a text prompt (text-to-video). The CLI auto-selects the correct model variant based on input.

Usage

gunni video [image_path] [prompt] [flags]

Examples

# Image to video
gunni video photo.jpg "slow zoom out" -o video.mp4

# Text to video
gunni video "ocean waves at sunset, golden hour" -o video.mp4

# Specify model and duration
gunni video photo.jpg "pan left" -m veo-3.1 --duration 10 -o video.mp4

Flags

Name	Type	Default	Description
-p, --prompt	string	—	Text prompt describing the desired motion or scene.
-m, --model	string	—	Override the default video model.
-o, --output	string	—	Output file path.
--duration	number	—	Video duration in seconds (typically 5 or 10).
--text-only	boolean	—	Force text-to-video mode (no input image).
--style	string	—	Apply a saved visual style by name.
--preset	string	—	Apply a saved production preset by name.

Default models

Image-to-video: kling-v3-pro

Text-to-video: kling-v3-pro-t2v (auto-selected when no image provided)
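
To animate a folder of stills with the same motion prompt, the image-to-video form can be run in a loop. A minimal sketch using only the flags documented above; the function name animate_dir, the .jpg filter, and the 5-second duration are choices made for this example.

```shell
# Animate every .jpg in a directory, writing each .mp4 next to its source.
# animate_dir is a hypothetical helper, not part of the CLI.
animate_dir() {
  dir=$1
  prompt=$2
  for img in "$dir"/*.jpg; do
    [ -e "$img" ] || continue           # the directory had no .jpg files
    out="${img%.jpg}.mp4"
    gunni video "$img" "$prompt" --duration 5 -o "$out" || return 1
  done
}
```

Usage: animate_dir ./stills "slow zoom out"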

audio

Convert text to natural-sounding speech.

Usage

gunni audio "text to speak" [flags]

Examples

gunni audio "Welcome to Gunni" -o speech.mp3
gunni audio "Breaking news: AI can now talk" -m elevenlabs-tts -o news.mp3

Flags

Name	Type	Default	Description
-m, --model	string	minimax-speech	TTS model to use.
-o, --output	string	—	Output file path.
--voice	string	—	Voice ID or name (model-dependent).

lipsync

Lip synchronization: sync audio onto a video, or generate a talking avatar from a single image plus audio.

Usage

gunni lipsync <audio> [flags]

Examples

# Lip sync: video + audio
gunni lipsync narration.mp3 -v speaker.mp4 -o synced.mp4

# Avatar: image + audio
gunni lipsync narration.mp3 -i headshot.jpg --model kling-avatar -o avatar.mp4

Flags

Name	Type	Default	Description
-v, --video	string	—	Input video file for lip sync mode.
-i, --image	string	—	Input image file for avatar mode.
-m, --model	string	—	Model to use (kling-lipsync, kling-avatar, sync-lipsync).
-o, --output	string	—	Output file path.

Modes

Lip sync: provide a video + audio. The model syncs lips to match the audio.

Avatar: provide an image + audio. The model animates the face to speak the audio.
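
Avatar mode pairs naturally with the audio command: generate the speech first, then animate a headshot with it. The sketch below chains the two documented commands. The function name talking_avatar is hypothetical, and the sketch assumes a failed command exits non-zero so that the second step is skipped.

```shell
# Text + headshot -> talking avatar video.
# The speech file is written next to the output, with an .mp3 extension.
talking_avatar() {
  text=$1
  image=$2
  out=$3
  audio="${out%.*}.mp3"
  gunni audio "$text" -o "$audio" &&
    gunni lipsync "$audio" -i "$image" -m kling-avatar -o "$out"
}
```

Usage: talking_avatar "Welcome to Gunni" headshot.jpg avatar.mp4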

learn

Access the Gunni creative knowledge base. Browse topics on prompting technique, brand design, UI mockups, advertising, product photography, video production, and more.

Usage

gunni learn [topic]

Available topics

overview — General introduction to creative AI

exploration — Techniques for exploring visual directions

prompting — Prompt engineering for image and video models

brand — Brand identity and visual consistency

ui-design — UI and interface mockup generation

advertising — Ad creative and campaign visuals

product-photo — Product photography and compositing

concept-art — Concept art and illustration workflows

video — Video generation tips and motion techniques

models

List all available models and their capabilities.

Usage

gunni list models [--type category]

Examples

# List all models
gunni list models

# Filter by category
gunni list models --type video
gunni list models --type image

config

Manage CLI configuration. Set your API key and view current settings.

Usage

gunni config [flags]

Flags

Name	Type	Default	Description
--set-gunni-key	string	—	Set your Gunni API key.
--show	boolean	—	Display current configuration.

history

Search and browse your past generations.

Usage

gunni history [query]

Examples

# Browse recent generations
gunni history

# Search by prompt text
gunni history "fox watercolor"

style

CRUD operations for visual styles. Styles encapsulate a visual direction (description, negative prompts, model preferences, reference images) that can be applied to any generation with --style.

Usage

gunni style list
gunni style get <name>
gunni style create <name> [flags]
gunni style delete <name>

preset

CRUD operations for production presets. Presets define platform-specific output settings (aspect ratio, framing rules, prompt suffixes) for consistent delivery across platforms like Instagram, TikTok, or web.

Usage

gunni preset list
gunni preset get <name>
gunni preset create <name> [flags]
gunni preset delete <name>

template

CRUD operations for prompt templates. Templates are reusable prompt structures with variables, letting you standardize prompts across a team or project.

Usage

gunni template list
gunni template get <name>
gunni template create <name> [flags]
gunni template delete <name>

pipeline

CRUD operations for multi-step workflows. Pipelines define a sequence of generation steps that execute in order, passing outputs forward.

Usage

gunni pipeline list
gunni pipeline get <name>
gunni pipeline create <name> [flags]
gunni pipeline delete <name>

ref

Reference asset management. Save, organize, and retrieve reference images by name and collection.

Usage

gunni ref list
gunni ref get <name>
gunni ref save <url> --name <name> [--collection <collection>]
gunni ref delete <name>

research

Visual search. Find reference images and inspiration for a given query.

Usage

gunni research "query" [--num count]

Examples

# Default: 6 results
gunni research "minimalist packaging design"

# More results
gunni research "retro poster art" --num 15

Chaining Workflows

The output file of one Gunni operation can be passed as the input to the next. Chain steps to build complete production workflows without leaving the terminal.

Example: product image pipeline

Generate a product shot, remove the background, composite into a new scene, then upscale for print:

# Step 1: Generate product shot
gunni image "sneaker on clean surface" -o product.png

# Step 2: Remove background
gunni image product.png --remove-bg -o clean.png

# Step 3: Composite into new scene
gunni image clean.png "place on rainy city street" -o scene.png

# Step 4: Upscale to 4x for print
gunni image scene.png --upscale --scale 4 -o final.png

Each command reads the file written by the previous step and applies the next operation to it. This works because Gunni auto-routes based on the combination of inputs and flags.
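
The four steps can also be wrapped in one function so that a failure stops the chain instead of feeding a missing file to the next step. This is a sketch under one assumption: that the CLI exits non-zero when an operation fails. The function name product_pipeline and its arguments are hypothetical.

```shell
# Generate -> remove background -> composite -> upscale, stopping at the
# first failing step. Intermediate files are written to the given directory.
product_pipeline() {
  prompt=$1
  scene=$2
  dir=${3:-.}
  gunni image "$prompt" -o "$dir/product.png" &&
    gunni image "$dir/product.png" --remove-bg -o "$dir/clean.png" &&
    gunni image "$dir/clean.png" "$scene" -o "$dir/scene.png" &&
    gunni image "$dir/scene.png" --upscale --scale 4 -o "$dir/final.png"
}
```

Usage: product_pipeline "sneaker on clean surface" "place on rainy city street" ./out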

Global Flags

These flags work on any command.

Name	Type	Default	Description
--json	boolean	—	Output structured JSON instead of human-readable text. Useful for scripting and piping.
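
For scripting, the JSON output can be captured to a file and inspected before anything parses it. The structure of the JSON is not documented here, so the sketch below only stores the raw output. The function name gunni_json is hypothetical, and placing --json after the other arguments is an assumption.

```shell
# Run any gunni command with --json and store the raw output in a file.
# Succeeds only if the command succeeded and wrote something.
gunni_json() {
  out=$1
  shift
  gunni "$@" --json > "$out" || return 1
  [ -s "$out" ]
}
```

Usage: gunni_json models.json list models --type video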