Three Lessons from Introducing AI to a 20-Person NGO

What I learned bringing AI into a 20-person political NGO: choosing retrieval infrastructure, onboarding staff from the bottom up, and where humans still beat the model.

elek 2026-06-13 · 807 words

I had the opportunity to introduce AI to a 20-person NGO office. Here are three lessons I took away.

The office works in the political sphere: it has to react to fast-moving events, research long-term policy, and help laypeople access legal and institutional resources. A knowledge base its members can align with and consult is critical to that mission. At the same time, staff readiness varied widely: some asked how to build a service for monitoring public sentiment, while others had just begun their first conversations with an AI.

So after kicking off with the heads of the office, I set two objectives: build the data pipelines and the MCP server, and, at a micro level, get familiar enough with each person's workflow to earn their trust and buy-in. The three lessons below came out of pursuing both.

Lesson 1: Build the infrastructure first, and choose retrieval for the job

I started from a list of sources to crawl and a set of previously built databases, and experimented to decide which methods of retrieval (SQL, vector RAG, graph RAG) the server should support.
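One way to picture a server that supports more than one retrieval method is a small dispatcher that maps a method name to a handler. The sketch below is illustrative only: the `Retriever` type, the `bills` table, the sample documents, and the keyword-overlap handler (a toy stand-in for real vector search) are all assumptions, not the server described here.

```python
import sqlite3
from typing import Callable, Dict, List

# Hypothetical handler type: takes a query string, returns text snippets.
Retriever = Callable[[str], List[str]]


def make_sql_retriever(conn: sqlite3.Connection) -> Retriever:
    """Lookup over structured records in a hypothetical `bills` table."""
    def retrieve(query: str) -> List[str]:
        rows = conn.execute(
            "SELECT title FROM bills WHERE title LIKE ? ORDER BY title",
            (f"%{query}%",),
        ).fetchall()
        return [row[0] for row in rows]
    return retrieve


def make_keyword_retriever(docs: List[str]) -> Retriever:
    """Toy stand-in for vector search: ranks documents by shared words."""
    def retrieve(query: str) -> List[str]:
        terms = set(query.lower().split())
        scored = [(len(terms & set(d.lower().split())), d) for d in docs]
        ranked = sorted(scored, key=lambda pair: -pair[0])
        return [doc for score, doc in ranked if score > 0]
    return retrieve


def dispatch(method: str, query: str, retrievers: Dict[str, Retriever]) -> List[str]:
    """Route a query to the handler registered under `method`."""
    if method not in retrievers:
        raise ValueError(f"unsupported retrieval method: {method}")
    return retrievers[method](query)


# Made-up sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bills (title TEXT)")
conn.executemany(
    "INSERT INTO bills VALUES (?)", [("Housing Act",), ("Transit Act",)]
)

retrievers: Dict[str, Retriever] = {
    "sql": make_sql_retriever(conn),
    "vector": make_keyword_retriever(
        ["housing subsidy explained", "transit fares rise"]
    ),
}
```

Calling `dispatch("sql", "Housing", retrievers)` goes to the database, while `dispatch("vector", "housing subsidy", retrievers)` goes to the document ranker. Adding a graph handler would mean registering one more entry in the dictionary.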

Graph RAG can look like hype, but it earned its place for two reasons. First, it improves factual validity: it resolves same-name entities without my having to prescribe rules, so the model grounds its answers on the right entity instead of conflating them. Second, it surfaces patterns without prior knowledge: connections that only a seasoned domain expert would otherwise catch. (Surfacing a pattern is not the same as knowing it matters; judging a pattern's value still needs other methods.)
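To make the same-name point concrete, here is a minimal sketch of the underlying idea: when two entities share a display name, the one whose graph neighbours overlap most with the question's context is the likelier match. The people, the graph, and the `resolve` function are all invented for illustration; a real graph RAG stack does this with far richer signals, and this is not a description of any particular library.

```python
from typing import Dict, Set

# Hypothetical mini knowledge graph: entity id -> names of neighbouring entities.
# Two different (fictional) people share the display name "Alex Chen".
GRAPH: Dict[str, Set[str]] = {
    "alex_chen_legislator": {"Housing Committee", "District 4", "Tenant Bill"},
    "alex_chen_journalist": {"Daily Ledger", "Press Gallery"},
}
DISPLAY_NAME: Dict[str, str] = {
    "alex_chen_legislator": "Alex Chen",
    "alex_chen_journalist": "Alex Chen",
}


def resolve(name: str, context_entities: Set[str]) -> str:
    """Pick the same-name candidate whose neighbours overlap most with the context."""
    candidates = [eid for eid, shown in DISPLAY_NAME.items() if shown == name]
    if not candidates:
        raise KeyError(name)
    return max(candidates, key=lambda eid: len(GRAPH[eid] & context_entities))
```

A question that mentions "Tenant Bill" and "District 4" resolves to the legislator, and one that mentions "Daily Ledger" resolves to the journalist, with no per-name rule written by hand.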

Two stack decisions framed everything:

Privacy first: the boundary between external and internal data has to be strictly maintained; in politics, the consequences of getting it wrong can be devastating. Sensitive data is anonymized before ingestion, and the whole thing sits behind a Tailnet as an extra layer of protection.
Vendor-agnostic: neither the architecture nor the underlying models of the knowledge base should be locked in.
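As a rough illustration of the privacy-first step, the sketch below replaces email addresses with salted, stable tokens before a document is ingested. It is deliberately narrow and rests on assumptions: the `anonymize` and `pseudonym` helpers and the salt handling are hypothetical, only emails are covered, and salted hashing is pseudonymization rather than full anonymization. Names, phone numbers, and other identifiers would need their own handling.

```python
import hashlib
import re

# Simple email pattern; good enough for a sketch, not a full RFC 5322 matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonym(value: str, salt: str) -> str:
    """Map a value to a short, stable token that does not reveal the original."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return f"<email:{digest[:8]}>"


def anonymize(text: str, salt: str) -> str:
    """Replace every email address in `text` before it reaches the knowledge base."""
    return EMAIL.sub(lambda m: pseudonym(m.group(0).lower(), salt), text)
```

Because the token is derived from the lowercased address and a secret salt, the same person maps to the same token across documents, so retrieval can still link records without exposing the address itself.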

The build itself was deliberately lean. Opus 4.7/4.8 and I did the planning; more affordable models did the building. On a tight budget, I leaned on prompt engineering and the Handoff skill (by Matt Pocock), had the affordable models consult the advanced models only on the crux of a problem, and cross-reviewed the output with less capable models to hold the quality bar.
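The "consult the advanced model only for the crux" pattern can be sketched as a simple escalation loop. Everything here is an assumption made for illustration: the `Model` type is just a function from prompt to text, the `passes` check stands in for whatever review step is used, and the stub models replace real API calls.

```python
from typing import Callable, Tuple

# Hypothetical model interface: prompt in, text out.
Model = Callable[[str], str]


def solve_with_escalation(
    task: str,
    cheap: Model,
    advanced: Model,
    passes: Callable[[str], bool],
) -> Tuple[str, str]:
    """Let the cheap model try first; ask the advanced model only for a hint."""
    draft = cheap(task)
    if passes(draft):
        return "cheap", draft
    # The advanced model is asked for a short hint, not for the full solution,
    # which keeps the expensive call small.
    hint = advanced(f"Give a short hint for the hard part of: {task}")
    retry = cheap(f"{task}\nHint: {hint}")
    return "cheap+hint", retry


# Stub models so the sketch runs without any vendor SDK.
def cheap_stub(prompt: str) -> str:
    return "done" if "Hint:" in prompt else "stuck"


def advanced_stub(prompt: str) -> str:
    return "split the job in two"
```

With the stubs above, `solve_with_escalation("build crawler", cheap_stub, advanced_stub, lambda d: d == "done")` takes the escalation path, since the first draft fails the check and the hinted retry passes.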

Lesson 2: Top-down segmentation fails; individualize from the bottom up

To onboard the staff, I first surveyed the office. My initial plan was to "divide and conquer": to push the bulk of the curve, where most people cluster, toward the right (with the x-axis being proficiency at using AI). The head of the office and I built skills, extracted from SOPs, for common scenarios, and had those skills call the MCP server to ground responses in facts via RAG. Since most of the staff were still on the web-chat version of the model, I made Claude Desktop's Cowork mode the definitive environment, so no one had to spend disproportionate time getting to grips with the skills and the MCP server on their own.

But in the first two workshops, I quickly learnt that everyone already had their own workflows and preferred models. Building their confidence in AI, with reduced work hours and more autonomy to dismantle and recompose their workflows as the incentive, would take longer, but it is the more solid path.

So I stopped segmenting from the top down and reworked from the bottom up. AI is at its best for individualization: using the same survey data, I offered each staff member the prompts most likely to help in their own use cases, and, because common structures surfaced across office routines, I could connect colleagues sitting at consecutive nodes of the same workflow.
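Finding colleagues who sit at consecutive nodes of the same workflow is mechanical once the survey answers are in a structured form. The sketch below assumes a hypothetical shape for that data (workflow name mapped to ordered steps and their owners); the names and steps are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical survey result: workflow -> ordered list of (step, owner).
WORKFLOW_STEPS: Dict[str, List[Tuple[str, str]]] = {
    "press_response": [
        ("monitor news", "Ana"),
        ("draft statement", "Ben"),
        ("legal check", "Chi"),
    ],
}


def handoff_pairs(
    workflows: Dict[str, List[Tuple[str, str]]],
) -> List[Tuple[str, str, str]]:
    """List (upstream person, downstream person, handoff) for adjacent steps."""
    pairs = []
    for name, steps in workflows.items():
        # zip(steps, steps[1:]) walks each pair of neighbouring steps in order.
        for (step_a, person_a), (step_b, person_b) in zip(steps, steps[1:]):
            if person_a != person_b:
                pairs.append((person_a, person_b, f"{name}: {step_a} -> {step_b}"))
    return pairs
```

Each resulting pair is a candidate introduction: two people whose work already meets at a handoff and who can align their prompts and outputs around it.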

Lesson 3: AI augments; humans orchestrate

Many frame technological change in deterministic terms, and the most scaremongering version is "AI is replacing your work." It is not entirely wrong, but it is right only to a limited extent, and more limited still once you step outside software engineering and the Silicon Valley scene. Most organizations simply want better-quality work at higher output, and getting there means giving the people you already have enough AI literacy to try every task with an agent first, and to know when to spike and when to recompose.

That points to a concrete job for leaders, whether they run a team of five or an organization of thousands. First, lay the infrastructure: a model cannot have all domain knowledge baked into it, so it needs a rich knowledge base and robust retrieval to do the work and still clear the quality bar. Then build capability from the bottom up: let AI coach each colleague individually toward their own best workflow. It is people who get the most out of an advanced model, and they are more valuable still when it comes to connecting the dots into a smooth workflow.

Related: Productivity