summaryrefslogtreecommitdiffhomepage
diff options
context:
space:
mode:
authorAdam Malczewski <[email protected]>2026-03-24 14:01:20 +0900
committerAdam Malczewski <[email protected]>2026-03-24 14:01:20 +0900
commit7f9b25a1479f9897aea7f85c3fb58a568b0bd642 (patch)
treeecb9dec930ba46c4f749d07ec023c2a1960599f8
parent5a44a97111d304945bbfc3da02d29a83191d816c (diff)
downloadai-pulse-obsidian-plugin-7f9b25a1479f9897aea7f85c3fb58a568b0bd642.tar.gz
ai-pulse-obsidian-plugin-7f9b25a1479f9897aea7f85c3fb58a568b0bd642.zip
add streaming of ai text
-rw-r--r--.rules/changelog/2026-03/24/07.md29
-rw-r--r--.rules/plan/streaming.md39
-rw-r--r--src/chat-view.ts81
-rw-r--r--src/ollama-client.ts167
-rw-r--r--styles.css7
5 files changed, 305 insertions, 18 deletions
diff --git a/.rules/changelog/2026-03/24/07.md b/.rules/changelog/2026-03/24/07.md
new file mode 100644
index 0000000..d1327e7
--- /dev/null
+++ b/.rules/changelog/2026-03/24/07.md
@@ -0,0 +1,29 @@
+# Changelog — 2026-03-24 (07)
+
+## Added: Streaming AI Responses
+
+### New
+- **`src/ollama-client.ts`** — Added `sendChatMessageStreaming()` function:
+ - Uses native `fetch()` with `stream: true` for real-time token delivery
+ - `readNdjsonStream()` async generator parses newline-delimited JSON, handles partial lines across chunks
+ - `onChunk` callback fires for every content delta
+ - `AbortSignal` support for cancellation
+ - Full tool-calling agent loop support (tools execute non-streamed, final answer streams)
+ - `StreamingChatOptions` interface for clean parameter passing
+- **Plan file** at `.rules/plan/streaming.md`
+
+### Changed
+- **`src/chat-view.ts`** — Switched from `sendChatMessage` to `sendChatMessageStreaming`:
+ - Creates empty assistant bubble immediately, appends text chunks progressively
+ - `AbortController` created on send, aborted on Stop click or view close
+ - Send button becomes red "Stop" button during streaming
+ - `debouncedScrollToBottom()` limits scroll updates to every 50ms
+ - On completion/abort, streaming class removed from bubble
+- **`styles.css`** — Added `.ai-organizer-stop-btn` red styling for the stop button
+
+### Tool call UI refinements (from previous session)
+- Removed "Tool Use:" prefix from header (icon is sufficient)
+- Removed redundant "Searched for:" / "Read file:" from summaries
+- Added leading `/` to file paths in read_file summary
+- Added result summary line (e.g. "3 results found", "no results found")
+- JSON args and result are now inside a collapsible `<details>` element
diff --git a/.rules/plan/streaming.md b/.rules/plan/streaming.md
new file mode 100644
index 0000000..925a2f1
--- /dev/null
+++ b/.rules/plan/streaming.md
@@ -0,0 +1,39 @@
+# Plan: Streaming AI Responses
+
+## Goal
+Show AI text token-by-token as it arrives, instead of waiting for the full response.
+
+## Key Changes
+
+### 1. `src/ollama-client.ts` — New streaming chat function
+- Add a `sendChatMessageStreaming()` function that sets `stream: true` in the request body.
+- Ollama returns `application/x-ndjson` — each line is a JSON chunk with `message.content` (partial text).
+- The final chunk has `done: true`.
+- Use native `fetch()` API with `response.body.getReader()` + `TextDecoder` to read chunks, split by newlines, parse each JSON line. (`requestUrl()` does not support streaming.)
+- Accept an `onChunk(text: string)` callback that fires for every content delta.
+- Accept an `AbortSignal` for cancellation support.
+- For the tool-calling agent loop: after streaming completes, check the accumulated message for `tool_calls`. If present, execute tools (non-streamed), then stream the next turn. Only the final text turn streams visibly. Tool execution rounds stay non-streamed.
+- Return the full accumulated text at the end.
+
+### 2. `src/chat-view.ts` — Progressive message rendering
+- When sending, create an empty assistant message bubble immediately.
+- Pass an `onChunk` callback that appends each text delta to that bubble's `textContent`.
+- Scroll to bottom on each chunk (debounced to ~50ms to avoid performance issues).
+- While streaming: disable the Send button, show a "Stop" button that can abort the stream via `AbortController`.
+- On stream end or abort: finalize the message, re-enable input.
+
+### 3. `styles.css` — Minor additions
+- Add a blinking cursor indicator on the streaming message bubble (e.g. `::after` pseudo-element).
+- Style the Stop button.
+
+## Considerations
+- **`fetch()` vs `requestUrl()`:** `requestUrl` is Obsidian's abstraction (works on mobile, handles CORS). `fetch` works for localhost calls on desktop. On mobile, `fetch` to `localhost` may not work. Use `fetch` only for streaming and keep `requestUrl` for non-streaming/fallback.
+- **Tool calling + streaming:** Ollama supports streaming with tools. Streamed chunks accumulate `tool_calls` across multiple chunks. Simpler approach: use streaming for the text response; if tool calls come back, fall back to non-streamed agent loop for tool execution, then stream the final response.
+- **Abort handling:** `AbortController` passed to `fetch`. On abort, clean up gracefully — keep partial text visible in the chat but don't add it to message history for future context.
+
+## Estimated file changes
+| File | Change |
+|------|--------|
+| `ollama-client.ts` | Add `sendChatMessageStreaming()`, keep existing `sendChatMessage()` |
+| `chat-view.ts` | New streaming send flow, onChunk handler, stop button, debounced scroll |
+| `styles.css` | Streaming cursor animation, stop button style |
diff --git a/src/chat-view.ts b/src/chat-view.ts
index 16b9676..9263c57 100644
--- a/src/chat-view.ts
+++ b/src/chat-view.ts
@@ -1,7 +1,7 @@
import { ItemView, Notice, WorkspaceLeaf, setIcon } from "obsidian";
import type AIOrganizer from "./main";
import type { ChatMessage, ToolCallEvent } from "./ollama-client";
-import { sendChatMessage } from "./ollama-client";
+import { sendChatMessageStreaming } from "./ollama-client";
import { SettingsModal } from "./settings-modal";
import { ToolModal } from "./tool-modal";
import { TOOL_REGISTRY } from "./tools";
@@ -16,6 +16,8 @@ export class ChatView extends ItemView {
private textarea: HTMLTextAreaElement | null = null;
private sendButton: HTMLButtonElement | null = null;
private toolsButton: HTMLButtonElement | null = null;
+ private abortController: AbortController | null = null;
+ private scrollDebounceTimer: ReturnType<typeof setTimeout> | null = null;
constructor(leaf: WorkspaceLeaf, plugin: AIOrganizer) {
super(leaf);
@@ -86,6 +88,11 @@ export class ChatView extends ItemView {
});
this.sendButton.addEventListener("click", () => {
+ if (this.abortController !== null) {
+ // Currently streaming — abort
+ this.abortController.abort();
+ return;
+ }
void this.handleSend();
});
@@ -94,12 +101,16 @@ export class ChatView extends ItemView {
}
async onClose(): Promise<void> {
+ if (this.abortController !== null) {
+ this.abortController.abort();
+ }
this.contentEl.empty();
this.messages = [];
this.messageContainer = null;
this.textarea = null;
this.sendButton = null;
this.toolsButton = null;
+ this.abortController = null;
}
private getEnabledTools(): OllamaToolDefinition[] {
@@ -146,8 +157,12 @@ export class ChatView extends ItemView {
// Track in message history
this.messages.push({ role: "user", content: text });
- // Disable input
- this.setInputEnabled(false);
+ // Switch to streaming state
+ this.abortController = new AbortController();
+ this.setStreamingState(true);
+
+ // Create the assistant bubble for streaming into
+ const streamingBubble = this.createStreamingBubble();
try {
const enabledTools = this.getEnabledTools();
@@ -158,30 +173,52 @@ export class ChatView extends ItemView {
this.scrollToBottom();
};
- const response = await sendChatMessage(
- this.plugin.settings.ollamaUrl,
- this.plugin.settings.model,
- this.messages,
- hasTools ? enabledTools : undefined,
- hasTools ? this.plugin.app : undefined,
- hasTools ? onToolCall : undefined,
- );
+ const onChunk = (chunk: string): void => {
+ streamingBubble.textContent += chunk;
+ this.debouncedScrollToBottom();
+ };
+ const response = await sendChatMessageStreaming({
+ ollamaUrl: this.plugin.settings.ollamaUrl,
+ model: this.plugin.settings.model,
+ messages: this.messages,
+ tools: hasTools ? enabledTools : undefined,
+ app: hasTools ? this.plugin.app : undefined,
+ onChunk,
+ onToolCall: hasTools ? onToolCall : undefined,
+ abortSignal: this.abortController.signal,
+ });
+
+ // Finalize the streaming bubble
+ streamingBubble.removeClass("ai-organizer-streaming");
this.messages.push({ role: "assistant", content: response });
- this.appendMessage("assistant", response);
this.scrollToBottom();
} catch (err: unknown) {
+ // Finalize bubble even on error
+ streamingBubble.removeClass("ai-organizer-streaming");
+
const errMsg = err instanceof Error ? err.message : "Unknown error.";
new Notice(errMsg);
this.appendMessage("error", `Error: ${errMsg}`);
this.scrollToBottom();
}
- // Re-enable input
- this.setInputEnabled(true);
+ // Restore normal state
+ this.abortController = null;
+ this.setStreamingState(false);
this.textarea.focus();
}
+ private createStreamingBubble(): HTMLDivElement {
+ if (this.messageContainer === null) {
+ // Should not happen, but satisfy TS
+ throw new Error("Message container not initialized.");
+ }
+ return this.messageContainer.createDiv({
+ cls: "ai-organizer-message assistant ai-organizer-streaming",
+ });
+ }
+
private appendMessage(role: "user" | "assistant" | "error", content: string): void {
if (this.messageContainer === null) {
return;
@@ -227,13 +264,21 @@ export class ChatView extends ItemView {
}
}
- private setInputEnabled(enabled: boolean): void {
+ private debouncedScrollToBottom(): void {
+ if (this.scrollDebounceTimer !== null) return;
+ this.scrollDebounceTimer = setTimeout(() => {
+ this.scrollDebounceTimer = null;
+ this.scrollToBottom();
+ }, 50);
+ }
+
+ private setStreamingState(streaming: boolean): void {
if (this.textarea !== null) {
- this.textarea.disabled = !enabled;
+ this.textarea.disabled = streaming;
}
if (this.sendButton !== null) {
- this.sendButton.disabled = !enabled;
- this.sendButton.textContent = enabled ? "Send" : "...";
+ this.sendButton.textContent = streaming ? "Stop" : "Send";
+ this.sendButton.toggleClass("ai-organizer-stop-btn", streaming);
}
}
}
diff --git a/src/ollama-client.ts b/src/ollama-client.ts
index 91bd40c..66badc6 100644
--- a/src/ollama-client.ts
+++ b/src/ollama-client.ts
@@ -190,3 +190,170 @@ export async function sendChatMessage(
throw new Error("Tool calling loop exceeded maximum iterations.");
}
+
/**
 * Streaming chat options.
 *
 * Bundles all parameters for {@link sendChatMessageStreaming} into one object
 * so call sites stay readable as options are added.
 */
export interface StreamingChatOptions {
	// Base URL of the Ollama server; "/api/chat" is appended to it.
	ollamaUrl: string;
	// Model name forwarded in the request body.
	model: string;
	// Conversation history; shallow-copied before the agent loop appends to it,
	// so the caller's array is never mutated.
	messages: ChatMessage[];
	// Tool definitions; included in the request only when present and non-empty.
	tools?: OllamaToolDefinition[];
	// Obsidian App reference; required only when tool calls must be executed.
	app?: App;
	// Fires once for every streamed text delta (may be called many times).
	onChunk: (text: string) => void;
	// Fires after each tool execution with a UI-friendly summary.
	onToolCall?: (event: ToolCallEvent) => void;
	// Passed to fetch; aborting mid-stream returns the text accumulated so far.
	abortSignal?: AbortSignal;
}
+
+/**
+ * Parse ndjson lines from a streamed response body.
+ * Handles partial lines that may span across chunks from the reader.
+ */
+async function* readNdjsonStream(
+ reader: ReadableStreamDefaultReader<Uint8Array>,
+ decoder: TextDecoder,
+): AsyncGenerator<Record<string, unknown>> {
+ let buffer = "";
+
+ while (true) {
+ const { done, value } = await reader.read();
+ if (done) break;
+
+ buffer += decoder.decode(value, { stream: true });
+ const lines = buffer.split("\n");
+ // Last element may be incomplete — keep it in buffer
+ buffer = lines.pop() ?? "";
+
+ for (const line of lines) {
+ const trimmed = line.trim();
+ if (trimmed === "") continue;
+ yield JSON.parse(trimmed) as Record<string, unknown>;
+ }
+ }
+
+ // Process any remaining data in buffer
+ const trimmed = buffer.trim();
+ if (trimmed !== "") {
+ yield JSON.parse(trimmed) as Record<string, unknown>;
+ }
+}
+
/**
 * Send a chat message with streaming.
 * Streams text chunks via onChunk callback. Supports tool-calling agent loop:
 * tool execution rounds are non-streamed, only the final text response streams.
 * Returns the full accumulated response text.
 *
 * Uses native fetch() instead of Obsidian's requestUrl(), which does not
 * expose a streamable response body (see .rules/plan/streaming.md).
 *
 * @throws Error on a non-OK HTTP status, a missing response body, tool calls
 *   arriving without an `app` reference, or too many agent-loop rounds.
 */
export async function sendChatMessageStreaming(
	opts: StreamingChatOptions,
): Promise<string> {
	const { ollamaUrl, model, messages, tools, app, onChunk, onToolCall, abortSignal } = opts;
	// Hard cap on agent-loop rounds so a model that keeps requesting tools
	// cannot loop forever.
	const maxIterations = 10;
	let iterations = 0;

	// Shallow copy so tool/assistant messages appended below never mutate the
	// caller's history array.
	const workingMessages = messages.map((m) => ({ ...m }));

	while (iterations < maxIterations) {
		iterations++;

		const body: Record<string, unknown> = {
			model,
			messages: workingMessages,
			stream: true, // Ollama then responds with ndjson chunks
		};

		if (tools !== undefined && tools.length > 0) {
			body.tools = tools;
		}

		const response = await fetch(`${ollamaUrl}/api/chat`, {
			method: "POST",
			headers: { "Content-Type": "application/json" },
			body: JSON.stringify(body),
			signal: abortSignal,
		});

		if (!response.ok) {
			throw new Error(`Ollama returned status ${response.status}.`);
		}

		if (response.body === null) {
			throw new Error("Response body is null — streaming not supported.");
		}

		const reader = response.body.getReader();
		const decoder = new TextDecoder();

		// Accumulators for THIS round only; re-declared fresh each loop iteration.
		let content = "";
		const toolCalls: ToolCallResponse[] = [];

		try {
			for await (const chunk of readNdjsonStream(reader, decoder)) {
				const msg = chunk.message as Record<string, unknown> | undefined;
				if (msg !== undefined && msg !== null) {
					if (typeof msg.content === "string" && msg.content !== "") {
						content += msg.content;
						onChunk(msg.content);
					}
					// Tool calls may arrive spread over several chunks; collect them all.
					if (Array.isArray(msg.tool_calls)) {
						toolCalls.push(...(msg.tool_calls as ToolCallResponse[]));
					}
				}
			}
		} catch (err: unknown) {
			// fetch/reader abort surfaces as a DOMException named "AbortError"
			if (err instanceof DOMException && err.name === "AbortError") {
				// User cancelled — return whatever we accumulated
				return content;
			}
			throw err;
		}

		// If no tool calls, we're done
		if (toolCalls.length === 0) {
			return content;
		}

		// Tool calling: append assistant message and execute tools
		const assistantMsg: ChatMessage = {
			role: "assistant",
			content,
			tool_calls: toolCalls,
		};
		workingMessages.push(assistantMsg);

		if (app === undefined) {
			throw new Error("App reference required for tool execution.");
		}

		for (const tc of toolCalls) {
			const fnName = tc.function.name;
			const fnArgs = tc.function.arguments;
			const toolEntry = findToolByName(fnName);

			let result: string;
			if (toolEntry === undefined) {
				// Unknown tool: report the error back to the model as a tool result
				// instead of throwing, so the conversation can recover.
				result = `Error: Unknown tool "${fnName}".`;
			} else {
				result = await toolEntry.execute(app, fnArgs);
			}

			if (onToolCall !== undefined) {
				const friendlyName = toolEntry !== undefined ? toolEntry.friendlyName : fnName;
				const summary = toolEntry !== undefined ? toolEntry.summarize(fnArgs) : `Called ${fnName}`;
				const resultSummary = toolEntry !== undefined ? toolEntry.summarizeResult(result) : "";
				onToolCall({ toolName: fnName, friendlyName, summary, resultSummary, args: fnArgs, result });
			}

			workingMessages.push({
				role: "tool",
				tool_name: fnName,
				content: result,
			});
		}

		// Loop again: the next request carries the tool results, and the model's
		// follow-up (possibly final) answer streams on the next iteration.
		// (This round's `content` was intermediate and is intentionally discarded.)
	}

	throw new Error("Tool calling loop exceeded maximum iterations.");
}
diff --git a/styles.css b/styles.css
index c9cd66b..cdf5b54 100644
--- a/styles.css
+++ b/styles.css
@@ -196,3 +196,10 @@
font-size: 0.9em;
margin-bottom: 8px;
}
+
+
/* Red "Stop" styling applied to the send button while a response is
   streaming; !important so it wins over the theme's default button styles. */
.ai-organizer-stop-btn {
	background-color: var(--text-error) !important;
	color: var(--text-on-accent) !important;
	border-color: var(--text-error) !important;
}