
Building an Engineering Knowledge Agent with LlamaIndex, Ollama, and Intent-Driven Engineering
- Mark Kendall
Introduction
One of the biggest challenges in modern software development is not writing code — it’s understanding systems.
Developers constantly ask questions like:
How does this integration work?
Where is the architecture documented?
Why was this service designed this way?
Which runbook applies to this incident?
The answers often exist somewhere — in documentation, Git repositories, Jira tickets, or architecture diagrams — but finding them can take hours.
Using Intent-Driven Engineering, combined with tools like LlamaIndex, LangChain, and local LLMs, we can build something powerful:
An Engineering Knowledge Agent.
This article walks through a complete playbook for building a TeamBrain-style AI assistant that allows engineers to ask questions about their systems and receive answers grounded in real documentation.
The entire system runs locally, requires no API keys, and can be generated using a single intent artifact with Claude Code.
What Is an Engineering Knowledge Agent?
An Engineering Knowledge Agent is an AI system that understands your engineering documentation and can answer questions about it.
Instead of manually searching documentation, developers simply ask questions like:
“How does the Nautobot to Salesforce integration work?”
The system retrieves relevant documentation, analyzes it, and produces an answer.
This approach is powered by Retrieval Augmented Generation (RAG).
RAG combines three capabilities:
Document indexing
Semantic search
Large language model reasoning
The result is an AI assistant that answers questions using your actual documentation, not generic training data.
Architecture Overview
The architecture behind this system is surprisingly simple.
It consists of four main components.
1. Documentation Sources
These are the documents you want the system to understand.
Examples include:
architecture documentation
integration specifications
runbooks
design documents
markdown documentation
These documents are stored locally in a data directory.
2. Knowledge Index (LlamaIndex)
LlamaIndex converts documentation into a searchable knowledge base.
The process works like this:
Documents → chunks → embeddings → vector index
This allows the system to retrieve relevant pieces of documentation based on meaning rather than keyword matching.
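The pipeline above can be illustrated with a toy sketch. A real index calls a learned embedding model; here a bag-of-words vector stands in for an embedding purely to show the retrieve-by-similarity idea (the chunks and query are made up for illustration):

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words vector. A real index would call an
    # embedding model such as nomic-embed-text instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Documents -> chunks -> embeddings -> vector index
chunks = [
    "Kafka is used as the transport layer for the integration",
    "A dead letter queue handles message failures",
    "Observability uses structured logging and metrics",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank chunks by similarity to the query vector, not by keyword match.
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(retrieve("which component handles message failures")[0])
# prints: A dead letter queue handles message failures
```

The real system replaces `embed` with a neural embedding model, which is what lets "location data changes" match "webhook events" even with no words in common.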
3. Local LLM (Ollama)
Instead of sending data to cloud APIs, the system uses a local language model.
This example uses:
Ollama + Llama 3
This allows the entire assistant to run locally without exposing internal documentation to external services.
4. Query Interface
Developers interact with the assistant through a simple command-line interface.
They ask questions, and the system retrieves documentation and generates answers.
Implementation Playbook
This section provides a step-by-step guide to implementing the system.
Step 1 — Install Ollama
First, install Ollama by following the instructions on the Ollama website for your platform.
After installation, start a model:
ollama run llama3
On first run this downloads Llama 3; it is the local language model the assistant will use.
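Before wiring anything else up, it helps to confirm the Ollama server is actually listening. Ollama serves an HTTP API on port 11434 by default; the check below uses only the Python standard library (the function name is my own):

```python
import urllib.request
from urllib.error import URLError

def ollama_is_running(host: str = "localhost", port: int = 11434,
                      timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers on the given host and port."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
            # The root endpoint replies with a small "Ollama is running" page.
            return resp.status == 200
    except (URLError, OSError):
        return False

print("Ollama reachable:", ollama_is_running())
```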
Step 2 — Create the Project Structure
Create the following directory structure:
teambrain-agent/
    requirements.txt
    data/
The data folder will contain the documentation to index.
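Creating the skeleton can be scripted; a small sketch using only the standard library (the helper name is illustrative):

```python
from pathlib import Path

def scaffold(base: str = "teambrain-agent") -> Path:
    """Create the project skeleton: an empty requirements.txt and a data/ folder."""
    root = Path(base)
    (root / "data").mkdir(parents=True, exist_ok=True)  # documentation to index
    (root / "requirements.txt").touch()                 # dependency list
    return root

# scaffold() would create ./teambrain-agent with the layout shown above.
```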
Step 3 — Install Dependencies
Create a file called requirements.txt with the following contents:
llama-index
llama-index-llms-ollama
llama-index-embeddings-ollama
ollama
Install dependencies:
pip install -r requirements.txt
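After installation, a quick sanity check confirms the packages resolved; the helper below is my own, and the package names are the ones from requirements.txt:

```python
from importlib import metadata

def check_installed(packages: list[str]) -> dict[str, str]:
    """Map each package name to its installed version, or 'NOT INSTALLED'."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "NOT INSTALLED"
    return versions

required = ["llama-index", "llama-index-llms-ollama",
            "llama-index-embeddings-ollama", "ollama"]
for name, version in check_installed(required).items():
    print(f"{name}: {version}")
```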
Step 4 — Add Documentation
Create a markdown document inside the data directory.
Example: architecture.md
# Nautobot to Salesforce Integration
The Nautobot to Salesforce integration is implemented as a Spring Boot microservice.
The service receives webhook events from Nautobot when location data changes.
The microservice transforms the Nautobot data model into the Salesforce location schema.
Kafka is used as the transport layer to guarantee delivery and reliability.
A dead letter queue handles message failures.
Observability is implemented using structured logging and metrics.
This document will become part of the assistant’s knowledge base.
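Documentation can also be dropped into data/ programmatically, which is handy when pulling pages out of other systems; a minimal sketch (the helper name is illustrative, content abbreviated):

```python
from pathlib import Path

def add_document(data_dir: str, name: str, text: str) -> Path:
    """Write one markdown document into the directory the reader will index."""
    target = Path(data_dir) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target

# add_document("data", "architecture.md",
#              "# Nautobot to Salesforce Integration\n...")
# would create the file shown above.
```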
Step 5 — Build the Knowledge Index
Create a Python script called ingest.py.
This script loads documents and builds the vector index.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.settings import Settings

# Point LlamaIndex at the local Ollama models.
Settings.llm = Ollama(model="llama3")
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Load every document in data/, embed it, and build the vector index.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Persist the index so the assistant can reload it without re-ingesting.
index.storage_context.persist(persist_dir="./storage")
The script relies on the nomic-embed-text embedding model, so pull it first:
ollama pull nomic-embed-text
Then run the ingestion process:
python ingest.py
This builds the searchable knowledge index.
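By default LlamaIndex chooses its own chunk sizes. If answers come back with too little or too much context, chunking can be tuned before the index is built; a configuration sketch for ingest.py (the values are illustrative, not recommendations):

```python
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.settings import Settings

# Split documents into ~512-token chunks with 50 tokens of overlap so that
# context spanning a chunk boundary is not lost. Add this to ingest.py
# before calling VectorStoreIndex.from_documents().
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)
```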
Step 6 — Launch the AI Assistant
Create a file called app.py.
This script loads the index and allows developers to ask questions.
from llama_index.core import StorageContext, load_index_from_storage
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.settings import Settings

# Use the same local models that built the index.
Settings.llm = Ollama(model="llama3")
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Reload the persisted index and wrap it in a query engine.
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine()

print("TeamBrain Assistant Ready")
while True:
    question = input("Ask a question: ")
    if question.lower() in ["exit", "quit"]:
        break
    response = query_engine.query(question)
    print(response)
Run the assistant:
python app.py
Step 7 — Ask Questions
You can now interact with the system.
Example:
Ask a question:
How does the Nautobot to Salesforce integration work?
The assistant will retrieve relevant documentation and generate an answer.
Using Claude Code and Intent-Driven Engineering
Instead of manually creating the project, you can generate the entire system using Claude Code.
Create an intent file that describes the system requirements.
Claude Code will then:
Generate the project structure
Write the code
Install dependencies
Build the knowledge index
Launch the assistant
This demonstrates the power of Intent-Driven Engineering, where developers describe what they want rather than writing everything manually.
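Claude Code does not prescribe a schema for intent artifacts; a plain-markdown file stating the goal and constraints is enough. A hypothetical sketch, condensed from the requirements in this article:

```markdown
# Intent: TeamBrain Engineering Knowledge Agent

## Goal
A local CLI assistant that answers questions grounded in the markdown
documentation stored in ./data.

## Constraints
- Runs fully locally: Ollama with llama3 for generation,
  nomic-embed-text for embeddings; no external API keys.
- Uses LlamaIndex for ingestion, vector indexing, and querying.
- The index persists to ./storage; ingestion and querying are separate scripts.

## Deliverables
- requirements.txt, ingest.py, app.py, and a sample data/architecture.md
```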
Why This Matters
Engineering organizations suffer from knowledge fragmentation.
Information is spread across:
documentation
Git repositories
ticketing systems
internal chats
runbooks
An Engineering Knowledge Agent solves this by creating a central intelligence layer for engineering teams.
Developers can quickly understand systems, troubleshoot problems, and learn architectures.
The result is faster onboarding, fewer production issues, and stronger architectural understanding across teams.
Key Takeaways
AI assistants can be grounded in real engineering documentation using Retrieval Augmented Generation.
LlamaIndex provides a powerful framework for indexing and retrieving documentation.
Ollama allows organizations to run AI models locally without external APIs.
Intent-Driven Engineering enables systems like this to be generated automatically using tools like Claude Code.
Engineering Knowledge Agents can become the foundation for organizational engineering intelligence.