Wednesday, April 23, 2025

Running An LLM Locally

When I get home I'd like to do some experimentation with an LLM running locally on my Mac to query all of my blog posts.

I can use something like Ollama to spin up Gemma 3 and then send HTTP requests to it via TypeScript.
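
Before any of that, Ollama needs to be installed and the model pulled with something like ollama pull gemma3:1b. As a quick sanity check that the local server is answering, something rough like this would list whatever models are available locally (the listLocalModels helper is just a sketch, assuming Ollama's default port of 11434 and its /api/tags endpoint):

import axios from "axios";

// List the models the local Ollama server knows about.
// Assumes Ollama is already running on its default port, 11434.
export async function listLocalModels(): Promise<string[]> {
  const response = await axios.get("http://localhost:11434/api/tags");
  return response.data.models.map((m: { name: string }) => m.name);
}

listLocalModels().then(models => console.log("Local models:", models));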

So I'd load all the markdown files like this

import fg from "fast-glob";
import fs from "fs/promises";
import path from "path";

export async function loadMarkdownFiles(folder: string) {
  // Recursively collect every .md file under the folder
  const files = await fg("**/*.md", { cwd: folder, absolute: true });
  const posts: { filename: string; content: string }[] = [];

  for (const file of files) {
    // Keep the filename alongside the raw markdown content
    const content = await fs.readFile(file, "utf8");
    posts.push({ filename: path.basename(file), content });
  }

  return posts;
}

and have an HTTP client

import axios from 'axios';

export async function askGemma(prompt: string): Promise<string> {
  // POST to the local Ollama server (default port 11434);
  // stream: false returns the whole completion in a single JSON payload
  const response = await axios.post('http://localhost:11434/api/generate', {
    model: 'gemma3:1b',
    prompt,
    stream: false,
  });

  // The generated text lives in the `response` field of the payload
  return response.data.response;
}

and then prompt the LLM like this

import { loadMarkdownFiles } from "./blog-loader";
import { askGemma } from "./ollama-client";

const folder = `${process.env.HOME}/Desktop/notes`;

async function main() {
  const posts = await loadMarkdownFiles(folder);
  // Concatenate every post into one big block of context
  const combinedText = posts.map(p => p.content).join("\n\n");

  const question = "What are the main themes across my blog posts?";
  // Posts first, then the question, so the model answers with the posts in context
  const prompt = `${combinedText}\n\nAnswer the question: ${question}`;

  const answer = await askGemma(prompt);
  console.log("🤖 Gemma says:\n", answer);
}

main().catch(console.error);

It seems simple enough. I'm not sure how much value I'll get out of being able to query all of my blog posts, but it's an interesting experiment.