
Building an Architecture Decision Record Writer Agent

10 min read · Sep 1, 2025

When I joined Nationale Nederlanden (NN Group) not too long ago, one of my very first assignments was to help shape a future-state architecture for data & AI. After mapping out that vision, the next logical step was to document the key choices and rationales behind it. Enter the Architecture Decision Record (ADR): a structured document that captures an important architecture decision together with its context and consequences — the why behind the change.

At first, the task sounded straightforward. But as soon as I realized just how many major decisions we needed to document, it quickly became daunting. Crafting ADRs is valuable but time-intensive: choosing the right wording, ensuring consistency, and articulating the rationale clearly all take effort.

So naturally, I started asking myself: could AI help with this?


From Question to Prototype

That question sparked the journey that led me to create the ADR Writer Agent. In this post, I’ll walk through how I built it, share the code, and reflect on lessons learned along the way.

I had already experimented with prompting and retrieval-augmented generation (RAG) in OpenAI’s Playground. But this project was different: I wanted to explore how large language models (LLMs) could streamline the ADR process end-to-end. Could a single AI agent handle the nuanced tasks involved, or would multiple agents be necessary? My goal was to discover which parts of the workflow could be automated, reducing manual effort without sacrificing quality.

To sharpen my approach, I also enrolled in The Complete Agentic AI Engineering Course. It offers valuable best practices for building practical AI agents, and I was eager to apply those ideas in a real-world setting.

Building in a Corporate Environment

This wasn't just a side project: I wanted the agent to run inside NN Group's corporate environment. That meant aligning with existing controls, services, and compliance requirements from the start.

Within NN, we rely on Azure OpenAI for secure access to LLMs. I initially prototyped on my laptop, but quickly realized that a production-ready version would require a proper app registration for secure authentication. This setup ensured not only security but also full integration into the organization's AI framework.

For the prototype, I used Gradio, which is fantastic for quickly spinning up chat interfaces in a notebook environment. However, once the project matured, I migrated everything to Streamlit, which provided a smoother path for hosting and sharing the tool with others.

Choosing the Right Framework

The code itself begins with the usual step: importing libraries. For the agent framework, I opted for the OpenAI Agents SDK. Its simplicity and clean design made it a natural fit for this project. I did explore alternatives like LangGraph and CrewAI, but ultimately the OpenAI Agents SDK offered the right balance of flexibility and elegance.

It’s worth noting that the OpenAI Agent SDK isn’t limited to OpenAI models — you can use it with other LLM providers as well!

Implementation Walkthrough

The core setup looks familiar at first:

# Import packages
import os
from openai import AsyncAzureOpenAI, OpenAIError
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from dotenv import load_dotenv
from pypdf import PdfReader
import gradio as gr
from agents import Agent, Runner, function_tool, OpenAIChatCompletionsModel, set_tracing_disabled
set_tracing_disabled(disabled=True)

I use environment variables to authenticate securely, then initialize the Azure OpenAI client:

# Load environment variables from .env file
load_dotenv(override=True)

endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment = os.getenv("DEPLOYMENT_NAME", "gpt-4o-mini")

# Initialize Azure OpenAI client with Entra ID authentication
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

openai_client = AsyncAzureOpenAI(
    azure_endpoint=endpoint,
    azure_ad_token_provider=token_provider,
    api_version="2025-01-01-preview",
)

Next, I extract context from our architecture proposal (PDF) and load an ADR template (Markdown). Together, they give the agent the inputs it needs to generate meaningful ADRs.

reader = PdfReader("files/WS2_Architecture_proposal.pdf")
architecture_proposal = ""
for page in reader.pages:
    text = page.extract_text()
    if text:
        architecture_proposal += text

For the ADR template, NN uses the Markdown Architectural Decision Records (MADR) format. Save the template locally as a Markdown file and read it using the code below.
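If you don't have a template at hand, a minimal MADR-style `adr_template.md` might look like the sketch below. This is an illustration only; NN's actual template differs in its details.

```markdown
# [short title of solved problem and solution]

## Status
[proposed | accepted | rejected | deprecated | superseded]

## Context and Problem Statement
[Describe the context and the problem being addressed.]

## Decision
[State the decision that was made and why.]

## Alternatives Considered
* [option 1]
* [option 2]

## Pros and Cons
* Good, because [argument]
* Bad, because [argument]

## Consequences
[Describe the resulting context after applying the decision.]
```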

# Specify the file path where the Markdown file is saved
file_path = 'adr_template.md'

# Read the content from the file
with open(file_path, 'r') as file:
    adr_template = file.read()

With the context in place, we’re ready to move on to the agentic part. The prompt below is for the main agent — and yes, this is a multi-agent setup. The main agent’s role is to write Architecture Decision Records (ADRs). It’s instructed to strictly follow the ADR template provided earlier and to base its output on the context extracted from the architecture document.

Furthermore, the agent is able to collaborate with other agents. Once its own task is complete, it will first request another agent to validate the results, and then delegate to a different agent to write the final output.

system_prompt = f"You are an experienced enterprise architect. You assist in writing Architecture Decision Records (ADRs). \
Given the following context, generate an Architecture Decision Record (ADR) in Markdown format. Use the context from the Architecture document. Include the problem, decision, status, alternatives considered, pros/cons, and consequences. \
You are provided via the chat with instructions about what to write the ADR for. \
You must adhere to the structure of the following template: {adr_template}"

system_prompt += f"\n\n## Architecture document:\n{architecture_proposal}\n\n"
system_prompt += "With this context, please chat with the user."

system_prompt += """Crucial Rules:
- You must welcome the user in a way that you present yourself as an ADR writer.
- You must use the context from the Architecture document.
- You must allow the user to make revisions on the ADR you write.
- You must ask the user whether the ADR is ready to be handed off, and if so, you must first use the validator agent, and then secondly, you hand the ADR off to the Markdown writer agent.
"""

Extending with Tools

One of the biggest insights from this project was the power of tools. In this case, tools let agents do more than just generate text — they can save files, call APIs, or even fetch external knowledge.

For example, later in this post I add a tool that lets the agent ground design choices with Bing search results when explicitly requested, enabling richer ADRs with references to vendor docs or expert sources while still weighting the internal architecture documents more heavily. The first tool, however, is much simpler: it saves the generated ADR to disk as a Markdown file.

@function_tool
def markdown_writer(name: str, content: str):
    """Save the ADR content to a Markdown file under files/."""
    # Open the file in write mode ('w') and store the content
    with open('files/' + name + '.md', 'w') as file:
        file.write(content)
    return {"recorded": "ok"}

The next piece of code defines the second agent. Its task is to validate the ADR produced by the first agent. Specifically, it checks that the ADR is formatted correctly in Markdown. Once validation is complete, it hands the result over to the markdown_writer tool we defined earlier to save the output.

writer_instructions = "You are a Markdown writer. You receive content in Markdown format. Your task is to validate the content. \
If the content looks valid, you use the markdown_writer tool for storing the ADR. If not, you reformat the content and then use the markdown_writer tool."

markdown_writer_agent = Agent(
    name="Markdown writer",
    instructions=writer_instructions,
    tools=[markdown_writer],
    model=OpenAIChatCompletionsModel(
        model="gpt-4o-mini",
        openai_client=openai_client
    ),
    handoff_description="Store the ADR using Markdown")

The next piece of code defines another agent. Its role is to validate the ADR for any missing mandatory fields. If required fields are missing, the agent will prompt the user to provide the missing information before the ADR can move forward.

validator_instructions = "You are an Architecture Decision Record (ADR) validator. \
Your task is to validate whether all mandatory items are present. If mandatory items are missing, you point them out and ask the user to provide them."

adr_validator_agent = Agent(
    name="Architecture Decision Record (ADR) Validator",
    instructions=validator_instructions,
    model=OpenAIChatCompletionsModel(
        model="gpt-4o-mini",
        openai_client=openai_client
    ))

validator_tool = adr_validator_agent.as_tool(
    tool_name="adr_validator_agent",
    tool_description="Validate the ADR content and check if all mandatory items are present")
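For comparison, the core of what the validator agent checks could also be sketched deterministically in plain Python. The helper and section names below are hypothetical stand-ins (loosely based on the template), not part of the agent code; the point is to show the kind of rule the prompt expresses.

```python
# Hypothetical deterministic check of mandatory ADR sections,
# mirroring what the validator agent does through its prompt.
MANDATORY_SECTIONS = ["Status", "Context", "Decision", "Consequences"]

def missing_sections(adr_markdown: str) -> list[str]:
    """Return the mandatory section headings absent from the ADR."""
    # Collect all Markdown headings, stripping the leading '#' characters
    headings = [
        line.lstrip("#").strip()
        for line in adr_markdown.splitlines()
        if line.startswith("#")
    ]
    return [
        section
        for section in MANDATORY_SECTIONS
        if not any(section.lower() in h.lower() for h in headings)
    ]

adr = "# Use Azure OpenAI\n\n## Status\nAccepted\n\n## Decision\nUse Azure OpenAI."
print(missing_sections(adr))  # → ['Context', 'Consequences']
```

In practice the agent-based validator is more flexible (it can judge whether a section is merely present versus actually filled in), which is exactly the nuance that is tedious to hardcode.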

We’ve now reached the final part: the chat interface. Here, we pass in the chat history and connect it to the main agent. Note that this agent can make use of the validator_tool for validation. Once the ADR is ready, it can then hand it off to the writer agent to save the final output.

async def chat(message, history):
    # Process the incoming message history. Only keep the role and content attributes
    processed_history = []
    for msg in history:
        processed_history.append({"role": msg['role'], "content": msg['content']})

    # Append the new user message
    processed_history.append({"role": "user", "content": message})

    try:
        # Make an agent with a name, instructions, and model
        agent = Agent(
            name="ADR Writer",
            instructions=system_prompt,
            model=OpenAIChatCompletionsModel(
                model="gpt-4o-mini",
                openai_client=openai_client
            ),
            handoffs=[markdown_writer_agent],
            tools=[validator_tool]
        )
        result = await Runner.run(agent, processed_history)
        return result.final_output
    except OpenAIError as e:
        # Return the error as the chat reply so Gradio has something to display
        print(f"OpenAI API Error: {str(e)}")
        return f"OpenAI API Error: {str(e)}"
    except Exception as e:
        print(f"An unexpected error occurred: {str(e)}")
        return f"An unexpected error occurred: {str(e)}"

gr.ChatInterface(fn=chat, type="messages").launch()

When ready, all the code blocks can be executed and the chat interface will appear.


This setup already works really well, but we can push the boundaries even further with two major improvements. First, we can allow the agent to retrieve public information from the web, but only when it’s necessary to handle unstructured queries. Second, I plan to host this as a full-fledged application, making it easily accessible for others to use.

For the public content, I decided to leverage Bing Grounding. Below is a snippet of the code. If you want to explore further, check out the official documentation here:
https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/tools/bing-code-samples?pivots=python

from typing import List

@function_tool
def invoke_grounding_with_bing_search(query: str) -> List[str]:
    """Grounding function to get public information from Bing."""

    # Remove any word in the query that starts with "site:"
    filtered_query = " ".join(
        word
        for word in query.split()
        if not word.lower().startswith("site:")
    )

    grounding_prompt = f"""
    Information is needed to justify design choices that address a functional or non-functional requirement for the following subject: {filtered_query}

    Your job is to find:
    1. High-quality technical websites that provide information about {filtered_query}.
    2. Vendor websites that provide the latest information.

    The steps to complete this task are:
    1. Find technical websites that provide information about {filtered_query}. If multiple products are part of the query, consider a product or service comparison.
    2. Output a list of the links and the information that you found.

    In general, pay attention to the following:
    - Avoid promotions and marketing-related content. Focus on expert opinions.
    - Only return links that are relevant.
    - Use trustworthy websites with high-quality content.
    - Use English websites.

    Format instructions:
    - Output a list containing all urls with summaries of your results.
    """

    HEADERS = {
        "api-key": API_KEY,
        "Content-Type": "application/json"
    }

    # Create a thread (new_thread and the API_* constants come from the
    # Bing grounding code samples linked above)
    thread = new_thread(AZURE_AI_FOUNDRY_PROJECT_ENDPOINT, HEADERS, API_VERSION)
    print("Thread:", thread)

Next, we need to adjust the prompt slightly. The crucial rules have been extended to allow the agent to search for public information, but strictly only when explicitly requested.

system_prompt += """Crucial Rules:
- You must welcome the user in a way that you present yourself as an ADR writer.
- You must use the context from the Architecture document.
- You must allow the user to make revisions on the ADR you write.
- Don't make any assumptions or use any fake names.
- You are allowed to use the invoke_grounding_with_bing_search tool for getting more information, but only when the user asks for it. If so, you must include the urls from the results into the ADR. Using the new context, revise the ADR. Importantly, the results are for providing extra context. The information from the Architecture document is weighted more heavily.
- You must ask the user whether the ADR is ready to be handed off, and if so, you must first use the validator agent, and then secondly, you hand the ADR off to the Markdown writer agent.
"""

Furthermore, you need to give the agent access to the Bing search tool. To do this, update the following line in the Gradio code block:

tools=[validator_tool, invoke_grounding_with_bing_search]

Let’s see this in action. The animated GIF below demonstrates how it works. You can see the agent in the background making calls to fetch information from the public web. Once the data is retrieved, the agent revises the ADR using the new context, ensuring the record is complete and up-to-date.


From Notebook to Web App

The final adjustment is turning this into a hostable web app. For this, I plan to use Streamlit. The implementation is straightforward: convert all of your notebook code into a .py file, replace the Gradio code block with the Streamlit block below, and you’re ready to go.

import asyncio
import streamlit as st

# Function to run the agent asynchronously on the full chat history
async def run_agent(messages):
    result = await Runner.run(agent, messages)
    return result

# --- Streamlit UI ---
st.title("ADR Writer Agent")

# Initialize chat history in session state
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat history with bubbles
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

# User input (chat-style)
if user_input := st.chat_input("Enter a query..."):
    # Save user input
    st.session_state.messages.append({"role": "user", "content": user_input})
    with st.chat_message("user"):
        st.markdown(user_input)

    # Run agent
    response = asyncio.run(run_agent(st.session_state.messages))
    agent_reply = response.final_output

    # Save agent response
    st.session_state.messages.append({"role": "assistant", "content": agent_reply})
    with st.chat_message("assistant"):
        st.markdown(agent_reply)

Next, run the script. I’m using uv run to launch the Streamlit application:

uv run streamlit run multi_agent_openai_agent_sdk_streamlit.py

Below is a screenshot of how the application appears in the browser.


Lessons Learned

What surprised me most was how little code is needed once you adopt an agentic approach. The real power comes from agents collaborating with each other — and with tools.

By chaining responsibilities (writer → validator → formatter) and giving agents access to the right tools, you can orchestrate workflows that would otherwise be tedious and error-prone to implement manually.
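Stripped of the SDK, the chaining pattern itself is just sequential delegation. A minimal sketch with plain functions as hypothetical stand-ins for the three agents:

```python
# Hypothetical stand-ins for the writer, validator, and formatter agents,
# chained in the same order the agents hand off to one another.
def write_adr(topic: str) -> str:
    # The writer agent drafts the record
    return f"# ADR: {topic}\n\n## Status\nProposed\n\n## Decision\nTBD"

def validate_adr(adr: str) -> str:
    # The validator agent would flag missing mandatory sections
    assert "## Status" in adr and "## Decision" in adr
    return adr

def format_adr(adr: str) -> str:
    # The formatter agent would normalize the Markdown before saving
    return adr.strip() + "\n"

# writer -> validator -> formatter, mirroring the agent handoff chain
final = format_adr(validate_adr(write_adr("Adopt Azure OpenAI")))
print(final.splitlines()[0])  # → # ADR: Adopt Azure OpenAI
```

The difference with the agentic version is that each step is an LLM with its own instructions and tools rather than a fixed function, so the chain tolerates messy, unstructured input.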

Another key takeaway: prompts can often replace complex business logic. Instead of hardcoding rules, you let agents handle nuance through carefully designed instructions. The Python layer stays lightweight, while the real intelligence lives in the prompting strategy.

I also learned the importance of starting small. Begin with a single agent, validate the value, and then decompose complex problems into specialized agents that can delegate and collaborate.

This project wasn’t just about writing ADRs — it’s a preview of how future AI-driven applications will work: modular, lightweight, and tool-augmented, with agents orchestrating complex tasks through simple glue code.

And truthfully? It’s been a lot of fun building it.

Written by Piethein Strengholt

Piethein Strengholt is a seasoned expert in data management with significant experience in chief data officer (CDO) and chief data architect roles.
