title: AI chat interfaces (ChatGPT, Claude, LLM-based UX)
applies_to: [ai-chat, llm, chatgpt, conversational]
version: 1.0.0
last_updated: 2026-05
stability: stable


AI chat interfaces

LLM-based chat (ChatGPT, Claude, Gemini, Perplexity, character.ai) has its own emerging UX conventions. Compared with rule-based chatbots, it is more capable and more open-ended, but it brings new UX challenges around hallucinations, latency, context length, and trust.

Read conversational-ui-fundamentals.md and chatbot-design.md first.

What's different about AI chat

Compared with a traditional chatbot:

  • Free-form input: users type anything, and the model interprets it.
  • Long-form output: paragraphs, code, and lists, not just one-line replies.
  • Streamed responses: shown token by token, not after waiting for completion.
  • Multi-turn context: the system holds prior messages within a session.
  • Multimodal: image upload, file analysis, voice (varies by model).
  • Hallucinations: the model can be confidently wrong.
  • Cost per token: long conversations are expensive.

These shape the UI.
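
The cost point can be made concrete with a little arithmetic. The sketch below assumes a stateless chat API that resends the whole history on every turn, with no prompt caching and no truncation, and a hypothetical uniform 500 tokens added per turn. Both are illustrative assumptions, not figures from any provider.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens over a conversation when every request resends
    the full history (assumption: no caching, no truncation)."""
    # Turn i carries i * tokens_per_turn tokens of context.
    return sum(i * tokens_per_turn for i in range(1, turns + 1))


# Hypothetical numbers, for illustration only.
short_chat = cumulative_input_tokens(turns=10, tokens_per_turn=500)
long_chat = cumulative_input_tokens(turns=20, tokens_per_turn=500)
print(short_chat, long_chat)  # 27500 105000
```

Under these assumptions, doubling the number of turns roughly quadruples the input tokens, which is one reason an always-visible "New chat" control matters.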

Anatomy of an AI chat UI

┌──────────────────────────────────────────┐
│ [Brand] Chat                       [⚙ ⏷] │   ← header (model selector)
├──────────────────────────────────────────┤
│ [Sidebar]  │  Today                       │
│            │  ─────                       │
│ + New chat │                              │
│            │  User: How do I fix...?      │
│ Today      │                              │
│ • Chat 1   │  Assistant: To fix this...   │
│ • Chat 2   │  [code block]                │
│            │  [continue][regenerate][copy]│
│ Yesterday  │                              │
│ • Chat 3   │  ───── input ─────           │
│            │ [Type a message...]    [▶]   │
│            │ [Attach] [Voice] [Search]    │
└──────────────────────────────────────────┘

Input affordances

Multi-line input

Most AI chat UIs allow multi-line input (Enter to send, Shift+Enter for a newline):

[Long multi-line text area, expanding as user types]
[Send button]

This supports complex queries, pasted code, and full-document review.
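
The Enter / Shift+Enter rule can be sketched as a small pure function. `KeyEvent`, `input_action`, and the `composing` flag are hypothetical names. The IME check is an added assumption, relevant for Korean input, where Enter may confirm a syllable rather than submit the message.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    key: str                 # e.g. "Enter", "a"
    shift: bool = False
    composing: bool = False  # True while an IME is mid-composition (assumption)


def input_action(event: KeyEvent, draft: str) -> str:
    """Return "send", "newline", "ignore", or "type" for a key press."""
    if event.key != "Enter":
        return "type"
    if event.composing:
        return "ignore"      # let the IME finish; do not submit a half-typed word
    if event.shift:
        return "newline"
    # Plain Enter sends, but only if there is something to send.
    return "send" if draft.strip() else "ignore"


print(input_action(KeyEvent("Enter"), "How do I fix this?"))        # send
print(input_action(KeyEvent("Enter", shift=True), "line one"))      # newline
```

Keeping the decision separate from the widget code makes it easy to unit-test and to swap the binding for products that send on Cmd/Ctrl + Enter instead.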

Attachments

  • Image upload (Claude, GPT-4o): drag-drop or button.
  • File upload (PDFs, docs, code): often limited to paid tiers; drag-drop.
  • Multiple files in one message.
  • Show attachments as chips above the input field before sending.

Slash commands / mentions

Some AI tools have:

  • Slash commands: /help, /reset, /model gpt-4.
  • Mentions: @persona or @tool to invoke a specific AI or tool.
  • Sample prompts: pre-written suggestion chips.

For developer tools (Cursor, Claude Code): slash commands are core.
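
A minimal sketch of how input might be split into slash commands and plain messages. The command set comes from the examples above; `parse_input`, `Parsed`, and the choice to treat unknown slash-words as ordinary text are assumptions, not any product's behavior.

```python
from typing import List, NamedTuple, Optional

KNOWN_COMMANDS = {"help", "reset", "model"}  # from the examples in the text


class Parsed(NamedTuple):
    kind: str              # "command" or "message"
    name: Optional[str]    # command name without the slash
    args: List[str]
    text: str              # trimmed original input


def parse_input(raw: str) -> Parsed:
    text = raw.strip()
    if text.startswith("/"):
        # "/model gpt-4" -> name="model", args=["gpt-4"]; bare "/" -> name=""
        name, *args = text[1:].split() or [""]
        if name.lower() in KNOWN_COMMANDS:
            return Parsed("command", name.lower(), args, text)
    # Unknown slash-words (e.g. a pasted file path) go to the model as text.
    return Parsed("message", None, [], text)


print(parse_input("/model gpt-4"))
print(parse_input("/usr/bin/python is missing").kind)  # message
```

Falling back to a plain message avoids swallowing pasted paths that happen to start with a slash, which is common in developer tools.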

Voice input

Increasingly common (ChatGPT voice mode, Claude voice on mobile):

  • Tap-to-talk.
  • Real-time transcription preview.
  • Voice + text hybrid.

Output rendering

AI responses are richer than chatbot replies:

Element           Render
Markdown          Bold, italic, lists, headings rendered
Code blocks       Syntax-highlighted, with copy button
Inline code       Monospace
Math (LaTeX)      Rendered with KaTeX / MathJax
Tables            Proper HTML tables
Links             Clickable, sometimes with preview
Mermaid diagrams  Rendered if requested
Images            Inline if generated / referenced

Don't show raw markdown — render it.

For Korean: the same markdown rendering applies. Korean text in code blocks works as long as the code font stack includes a face with Hangul glyphs (for example, Pretendard as a fallback behind Monaspace Argon).

Streaming responses

The single most important UX difference:

[blank screen — bad]
[• word           ]  ← starts at 200ms
[• word word      ]
[• word word word ]  ← user reads while it streams
[...]

Implementation:

  • Server-Sent Events (SSE) stream tokens.
  • Render incrementally: append to the message bubble.
  • Cursor / blinking indicator at the end of the streaming text.
  • Stop button: let the user halt mid-stream.

Streaming makes 30-second responses feel acceptable. Without streaming, 30s feels broken.
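
The implementation points above can be sketched as a tiny SSE consumer. It assumes each event's `data:` field carries one text chunk and that the stream ends with a `[DONE]` sentinel (a convention some APIs use, not part of SSE itself). `iter_sse_data`, `render_stream`, and `stop_requested` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in an SSE text stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                 # a blank line dispatches the event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):       # comment line, often a keep-alive ping
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # Only a single leading space is stripped, so " word" keeps its space.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An unterminated event at end of stream is discarded.


def render_stream(lines: Iterable[str],
                  stop_requested: Callable[[], bool] = lambda: False) -> Iterator[str]:
    """Yield the growing message text after each chunk; honor the stop button."""
    text = ""
    for chunk in iter_sse_data(lines):
        if chunk == "[DONE]" or stop_requested():
            break
        text += chunk
        yield text                     # the UI would re-render the bubble here


demo = ["data: Hel\n", "\n", ": ping\n", "data: lo\n", "\n", "data: [DONE]\n", "\n"]
print(list(render_stream(demo)))  # ['Hel', 'Hello']
```

In a real client the `lines` iterable would come from the HTTP response body, and `stop_requested` would be wired to the Stop button and should also cancel the underlying request.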

Stop / regenerate / continue

Standard controls below an AI response:

Action               Use
Stop generating      While streaming; halt response
Regenerate           New response to same prompt
Continue             Long response truncated; ask for more
Copy                 Copy entire response or code block
Edit (user message)  Edit user's prior message; regenerates everything after
Branch               Save current path; explore alternative
Share                Share conversation link

Place inline below each AI response, not in a hidden menu.
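
Regenerate, edit, and branch fit naturally on a message tree rather than a flat list. The following is a sketch under that assumption; `Node`, `Conversation`, and their methods are hypothetical names, not any product's data model.

```python
import itertools
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent/children
class Node:
    id: int
    role: str                      # "user" or "assistant"
    text: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


class Conversation:
    def __init__(self) -> None:
        self.root = Node(0, "root", "")
        self.active = self.root    # tip of the branch currently displayed
        self._ids = itertools.count(1)

    def append(self, role: str, text: str, parent: Optional[Node] = None) -> Node:
        if parent is None:
            parent = self.active
        node = Node(next(self._ids), role, text, parent)
        parent.children.append(node)
        self.active = node
        return node

    def regenerate(self, assistant_node: Node, new_text: str) -> Node:
        # New sibling under the same user message; the old reply stays as a branch.
        return self.append("assistant", new_text, parent=assistant_node.parent)

    def edit(self, user_node: Node, new_text: str) -> Node:
        # The edited message is a sibling of the original, so everything after
        # the original remains reachable on the old branch.
        return self.append("user", new_text, parent=user_node.parent)

    def path(self) -> List[Node]:
        """Messages from the first turn to the active tip (what the UI shows)."""
        out, node = [], self.active
        while node is not self.root:
            out.append(node)
            node = node.parent
        return out[::-1]
```

With this shape, a "1 / 2" switcher under a message is just an index into `parent.children`, and nothing is lost when the user edits an earlier message.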

Context length and truncation

Models have context limits (8k, 32k, 128k, 200k+ tokens). Long conversations eventually exceed them, so the UI needs a strategy:

  • Sliding window — drop oldest messages when over limit.
  • Summarization — auto-summarize old messages, keep recent.
  • Visible indicator — "context near limit" warning.
  • Manual reset — "New chat" button always visible.

For long conversations, file uploads or "memory" features can offload context.
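
A sliding window with a "near limit" flag might look like the sketch below. The `role` / `content` message shape, the 4-characters-per-token estimate, and the 80% warning threshold are all assumptions; a production system would count tokens with the model's real tokenizer.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def estimate_tokens(text: str) -> int:
    # Rough heuristic only (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def fit_to_window(messages: List[Message], limit: int,
                  warn_ratio: float = 0.8) -> Tuple[List[Message], int, bool]:
    """Return (messages to send, number dropped, near_limit flag).

    System messages are always kept; the oldest other messages are dropped first.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = limit - sum(estimate_tokens(m["content"]) for m in system)

    kept: List[Message] = []
    used = 0
    for m in reversed(rest):              # walk from newest to oldest
        cost = estimate_tokens(m["content"])
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    kept.reverse()

    total = sum(estimate_tokens(m["content"]) for m in messages)
    near_limit = total >= warn_ratio * limit   # drives the visible warning
    return system + kept, len(rest) - len(kept), near_limit
```

Returning the dropped count alongside the flag lets the UI say how much history is no longer visible to the model instead of silently forgetting it.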

Hallucination handling

LLMs confidently produce wrong answers. UX mitigations:

  • Source citations: show which docs / web results informed the answer.
  • Confidence indicators: hedge the wording or flag low confidence when the model is unsure.
  • Disclaimers: "AI can make mistakes. Verify important info."
  • Verifiable claims: direct quotes / page references.
  • User feedback: thumbs up/down to improve future responses.

For factual products (search, research): always provide citations. For creative products (writing, ideation): less critical.
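
One way to render source citations is to number each source on first use and list footnotes under the answer. This is a sketch only: it assumes the backend already returns answer segments paired with a source label, and the file names in the example are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Segment = Tuple[str, Optional[str]]  # (text, source label or None)


def render_with_citations(segments: List[Segment]) -> Tuple[str, List[str]]:
    """Return (answer text with [n] markers, footnote lines)."""
    numbers: Dict[str, int] = {}     # source -> footnote number, in first-use order
    parts: List[str] = []
    for text, source in segments:
        if source is None:
            parts.append(text)       # uncited sentence: shown without a marker
        else:
            n = numbers.setdefault(source, len(numbers) + 1)
            parts.append(f"{text} [{n}]")
    footnotes = [f"[{n}] {src}" for src, n in numbers.items()]
    return " ".join(parts), footnotes


body, notes = render_with_citations([
    ("Claim one.", "doc-a.pdf, p. 3"),      # hypothetical sources
    ("Connecting sentence.", None),
    ("Claim two.", "doc-a.pdf, p. 3"),
    ("Claim three.", "doc-b.pdf, p. 7"),
])
print(body)   # Claim one. [1] Connecting sentence. Claim two. [1] Claim three. [2]
print(notes)  # ['[1] doc-a.pdf, p. 3', '[2] doc-b.pdf, p. 7']
```

Reusing the same number for a repeated source keeps the footnote list short and makes it obvious when an answer leans on a single document.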

Trust and disclosure

Be transparent:

  • AI-generated label: small but visible.
  • Model version: "Powered by GPT-4o" / "Claude 3.5 Sonnet".
  • Data usage policy: are prompts used for training? Disclose it.
  • Conversation storage: retained / deletable.

Chat history sidebar

Sidebar shows past conversations:

[+ New chat]

Today
• Working on the design system
• Help with React component
• ...

Yesterday
• Korean language practice
• ...

Last week
• ...

Features:

  • Search through history.
  • Rename conversations.
  • Delete individual or all.
  • Export conversation.
  • Pinned / favorites at top.
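
The date grouping in the sidebar can be sketched as a small bucketing function. The exact cut-offs (treating "Last week" as 2 to 7 days old) and the extra "Pinned" and "Older" groups are assumptions for illustration.

```python
from datetime import date
from typing import List, Tuple

Chat = Tuple[str, date, bool]  # (title, last updated, pinned)

GROUP_ORDER = ["Pinned", "Today", "Yesterday", "Last week", "Older"]


def bucket_label(updated: date, today: date) -> str:
    age = (today - updated).days
    if age <= 0:
        return "Today"
    if age == 1:
        return "Yesterday"
    if age <= 7:
        return "Last week"   # assumption: anything 2 to 7 days old
    return "Older"           # hypothetical catch-all group


def group_history(chats: List[Chat], today: date) -> List[Tuple[str, List[str]]]:
    """Return non-empty groups in display order, newest chats first in each."""
    groups = {label: [] for label in GROUP_ORDER}
    for title, updated, pinned in sorted(chats, key=lambda c: c[1], reverse=True):
        label = "Pinned" if pinned else bucket_label(updated, today)
        groups[label].append(title)
    return [(label, groups[label]) for label in GROUP_ORDER if groups[label]]
```

Passing `today` in explicitly, rather than calling `date.today()` inside, keeps the function deterministic and easy to test across time zones.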

Multi-modal output

Some AI models output more than text:

Output                                                  Render
Generated image (DALL-E, Imagen)                        Inline image with download
Generated audio (TTS, music)                            Audio player
Generated code execution (Code Interpreter, Artifacts)  Sandboxed runtime, output rendering
Generated documents (Claude Artifacts)                  Side-pane preview

For Claude / GPT artifacts (live HTML / React preview): split-pane interface, code on left, preview on right.

Suggestions / next steps

After a response, suggest follow-ups:

[AI response about React performance]

Ask follow-up:
[Show me an example with hooks] [What about Vue?] [Profile this code]

Reduces typing for common follow-ups.

Voice mode UX

ChatGPT / Claude voice modes:

Element       Behavior
Activation    Mic icon or wake word (varies)
Visual        Animated orb / wave during listening / speaking
Interruption  User can interrupt AI mid-speech
Transcript    Optional visible text
Modes         "push to talk" or "open conversation"

Voice mode latency target: < 1 second to first audio for a natural feel. This is hard to achieve with a full LLM round-trip plus TTS.

Mobile vs desktop

Surface  Differences
Mobile   Vertical stack; full-screen chat; voice prominent
Desktop  Sidebar + main; multi-column; keyboard shortcuts
Tablet   Either; usually closer to desktop

For developer tools (Cursor, Claude Code): heavy keyboard shortcuts, side-by-side code + chat, multi-window.

Memory / personalization

Modern AI tools have "memory":

  • Persistent memory: the model remembers facts across sessions.
  • Custom instructions: user-defined preferences applied to all chats.
  • Project / Workspace: scoped memory + files for related work.

UX:

  • Memory indicator: visible when memory is used / created.
  • Edit / delete memory: the user can manage it.
  • Per-project scope: separate from general memory.
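
The scoping rules above can be sketched with a small store where each memory is either general or tied to one project. `MemoryStore` and its methods are hypothetical; real products persist this server-side and expose it through their own settings UI.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MemoryStore:
    """Illustrative in-memory store; names and fields are assumptions."""
    # id -> (fact, project); project None means general memory
    items: Dict[int, Tuple[str, Optional[str]]] = field(default_factory=dict)
    next_id: int = 1

    def remember(self, fact: str, project: Optional[str] = None) -> int:
        """Save a fact; the returned id lets the UI show a 'memory updated' chip."""
        memory_id = self.next_id
        self.items[memory_id] = (fact, project)
        self.next_id += 1
        return memory_id

    def visible_in(self, project: Optional[str] = None) -> List[str]:
        """General memories apply everywhere; project memories only in that project."""
        return [fact for fact, scope in self.items.values()
                if scope is None or scope == project]

    def forget(self, memory_id: int) -> bool:
        """Delete one memory; returns False if it was already gone."""
        return self.items.pop(memory_id, None) is not None
```

Returning ids from `remember` and a boolean from `forget` gives the UI what it needs for the indicator and for an undo-friendly delete control.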

Keyboard shortcuts (desktop)

Shortcut                   Action
Enter or Cmd/Ctrl + Enter  Send (varies by product)
Shift + Enter              Newline
Cmd/Ctrl + K               New chat
Cmd/Ctrl + L               Focus input
Cmd/Ctrl + Shift + O       New chat
Cmd/Ctrl + /               Show shortcuts
Esc                        Close current modal / cancel generation
Up arrow                   Edit last message

For developer tools: more shortcuts (Vim-style, multi-cursor).

Korean AI chat

Korean LLM products (2024+):

  • HyperCLOVA X (Naver): Korean-strong LLM.
  • Bixby Text Call (Samsung): AI-assisted voice-to-text and text-to-voice for phone calls.
  • A.X (SK Telecom) / Kakao i (Kakao): Korean assistants.
  • GPT / Claude / Gemini Korean support: adequate to strong.

For Korean AI chat UI:

  • Korean primary; English code / technical content OK.
  • 해요체 default tone (friendly).
  • Pretendard / NanumSquare / Apple SD Gothic Neo for body.
  • Korean code-block content rendered correctly (e.g. JetBrains Mono or Monaspace Argon, with Pretendard as a Hangul fallback).

Common AI chat mistakes

  • No streaming — feels broken at 5+ second responses.
  • Hidden controls — regenerate / stop in submenu.
  • Truncated responses without "continue" — frustrating.
  • No copy button — users can't copy code / responses.
  • Markdown not rendered — raw asterisks / brackets visible.
  • No history search — users can't find past chat.
  • No disclaimer — users trust AI as authoritative.
  • No way to start fresh — context grows indefinitely.
  • No model identification — users don't know what they're using.

Don't

  • Don't ship without streaming.
  • Don't hide stop / regenerate / copy.
  • Don't render raw markdown.
  • Don't pretend the AI is infallible. Disclose.
  • Don't truncate without "continue" affordance.
  • Don't lose chat history without warning.
  • Don't make context-management opaque. Show when limits hit.
  • Don't ship without keyboard shortcuts on desktop.

Cross-reference