
Add LLM production article and diagrams

Added new community article 'Building Production-Ready LLM Applications' with cover image, multiple SVG architecture diagrams, and summary. These resources illustrate hybrid chat history, MCP protocol, multilingual RAG, pgvector integration, parent-child RAG, reasoning effort, and PostgreSQL architecture for LLM app development.
pull/24256/head
SALİH ÖZKARA 3 months ago
parent
commit
a17c0fa0ea
  1. BIN
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/coverimage.png
  2. 114
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/chat-history-hybrid.svg
  3. 150
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/mcp-architecture.svg
  4. 135
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/multilingual-rag.svg
  5. 112
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/pgvector-integration.svg
  6. 118
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/rag-parent-child.svg
  7. 60
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/reasoning-effort-diagram.svg
  8. 149
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/svg-diagram-example.svg
  9. 414
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/post.md
  10. 1
      docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/summary.md

BIN
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/coverimage.png

Binary file not shown.


114
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/chat-history-hybrid.svg

@@ -0,0 +1,114 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 450">
<!-- Background -->
<rect width="800" height="450" fill="#f8f9fa"/>
<!-- Title -->
<text x="400" y="30" font-family="Arial" font-size="18" font-weight="bold" text-anchor="middle" fill="#0066cc">
Hybrid Chat History: Truncation + RAG on History
</text>
<!-- Full conversation -->
<rect x="50" y="60" width="250" height="200" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
<text x="175" y="85" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#0d47a1">
Full Chat History
</text>
<text x="175" y="105" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
(100 messages, 20K tokens)
</text>
<!-- Messages visual -->
<rect x="60" y="120" width="230" height="15" fill="#90caf9" stroke="#1976d2" stroke-width="1" rx="2"/>
<text x="175" y="131" font-family="Arial" font-size="9" text-anchor="middle" fill="#0d47a1">Messages 1-10 (1 day ago)</text>
<rect x="60" y="140" width="230" height="15" fill="#90caf9" stroke="#1976d2" stroke-width="1" rx="2"/>
<text x="175" y="151" font-family="Arial" font-size="9" text-anchor="middle" fill="#0d47a1">Messages 11-20 (12 hours ago)</text>
<text x="175" y="175" font-family="Arial" font-size="11" text-anchor="middle" fill="#666">...</text>
<rect x="60" y="190" width="230" height="15" fill="#90caf9" stroke="#1976d2" stroke-width="1" rx="2"/>
<text x="175" y="201" font-family="Arial" font-size="9" text-anchor="middle" fill="#0d47a1">Messages 81-90</text>
<rect x="60" y="210" width="230" height="35" fill="#64b5f6" stroke="#1976d2" stroke-width="2" rx="2"/>
<text x="175" y="230" font-family="Arial" font-size="10" font-weight="bold" text-anchor="middle" fill="#0d47a1">Messages 91-100 (Last 10)</text>
<!-- Arrow split -->
<defs>
<marker id="arrow" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#333"/>
</marker>
</defs>
<line x1="300" y1="150" x2="370" y2="100" stroke="#d32f2f" stroke-width="2" marker-end="url(#arrow)"/>
<line x1="300" y1="227" x2="370" y2="250" stroke="#388e3c" stroke-width="2" marker-end="url(#arrow)"/>
<text x="310" y="130" font-family="Arial" font-size="10" fill="#d32f2f">Old Messages</text>
<text x="310" y="270" font-family="Arial" font-size="10" fill="#388e3c">Recent Messages</text>
<!-- Vector DB (Long-term memory) -->
<rect x="370" y="60" width="180" height="100" fill="#ffebee" stroke="#d32f2f" stroke-width="2" rx="5"/>
<text x="460" y="85" font-family="Arial" font-size="12" font-weight="bold" text-anchor="middle" fill="#b71c1c">
Vector DB
</text>
<text x="460" y="105" font-family="Arial" font-size="10" text-anchor="middle" fill="#b71c1c">
(Long-term Memory)
</text>
<text x="460" y="125" font-family="Arial" font-size="9" text-anchor="middle" fill="#b71c1c">
Messages 1-90 with embeddings
</text>
<text x="460" y="140" font-family="Arial" font-size="9" text-anchor="middle" fill="#b71c1c">
Tool: SearchChatHistory()
</text>
<!-- Short-term memory (Prompt) -->
<rect x="370" y="200" width="180" height="100" fill="#c8e6c9" stroke="#388e3c" stroke-width="2" rx="5"/>
<text x="460" y="225" font-family="Arial" font-size="12" font-weight="bold" text-anchor="middle" fill="#1b5e20">
Prompt (Short-term)
</text>
<text x="460" y="245" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">
Messages 91-100
</text>
<text x="460" y="265" font-family="Arial" font-size="9" text-anchor="middle" fill="#1b5e20">
Truncation (Last 10 messages)
</text>
<text x="460" y="280" font-family="Arial" font-size="9" text-anchor="middle" fill="#1b5e20">
Low tokens, fast
</text>
<!-- LLM -->
<line x1="550" y1="110" x2="600" y2="180" stroke="#666" stroke-width="2" stroke-dasharray="5,5" marker-end="url(#arrow)"/>
<line x1="550" y1="250" x2="600" y2="210" stroke="#666" stroke-width="2" marker-end="url(#arrow)"/>
<rect x="600" y="150" width="150" height="100" fill="#f3e5f5" stroke="#7b1fa2" stroke-width="2" rx="5"/>
<text x="675" y="185" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#4a148c">
LLM
</text>
<text x="675" y="205" font-family="Arial" font-size="10" text-anchor="middle" fill="#4a148c">
Short-term context +
</text>
<text x="675" y="220" font-family="Arial" font-size="10" text-anchor="middle" fill="#4a148c">
Long-term memory via tool
</text>
<text x="675" y="235" font-family="Arial" font-size="10" text-anchor="middle" fill="#4a148c">
access when needed
</text>
<!-- Benefits -->
<rect x="50" y="330" width="700" height="100" fill="#fff9c4" stroke="#f57f17" stroke-width="2" rx="5"/>
<text x="400" y="360" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#f57f17">
✅ Hybrid Approach Benefits
</text>
<g id="benefit1">
<circle cx="80" cy="385" r="5" fill="#fbc02d"/>
<text x="95" y="390" font-family="Arial" font-size="11" fill="#f57f17">
<tspan font-weight="bold">Low Cost:</tspan> Only the last 10 messages are sent in the prompt per request (truncation)
</text>
</g>
<g id="benefit2">
<circle cx="80" cy="410" r="5" fill="#fbc02d"/>
<text x="95" y="415" font-family="Arial" font-size="11" fill="#f57f17">
<tspan font-weight="bold">High Fidelity:</tspan> The LLM can retrieve older messages via the SearchChatHistory tool when needed
</text>
</g>
</svg>
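
The split shown in this diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: `HybridHistory`, `embed`, and `cosine` are hypothetical names, and the bag-of-words `embed` is only a stand-in for a real embedding model. Only `search_chat_history` mirrors the `SearchChatHistory()` tool named in the diagram.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Message:
    index: int
    text: str


def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    counts: dict[str, float] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0.0) + 1.0
    return counts


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class HybridHistory:
    """Keeps the last `window` messages for the prompt; older ones are searchable."""

    def __init__(self, window: int = 10) -> None:
        self.window = window
        self.messages: list[Message] = []

    def add(self, text: str) -> None:
        self.messages.append(Message(len(self.messages) + 1, text))

    def prompt_messages(self) -> list[Message]:
        # Short-term memory: plain truncation to the most recent messages.
        return self.messages[-self.window:]

    def search_chat_history(self, query: str, top_k: int = 1) -> list[Message]:
        # Long-term memory: similarity search over everything outside the window.
        older = self.messages[:-self.window]
        query_vector = embed(query)
        ranked = sorted(
            older,
            key=lambda m: cosine(query_vector, embed(m.text)),
            reverse=True,
        )
        return ranked[:top_k]
```

With 100 messages and the default window, `prompt_messages()` returns messages 91-100, while `search_chat_history()` only ranks messages 1-90, matching the two boxes in the diagram.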


150
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/mcp-architecture.svg

@@ -0,0 +1,150 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 900 500">
<!-- Background -->
<rect width="900" height="500" fill="#f8f9fa"/>
<!-- Title -->
<text x="450" y="30" font-family="Arial" font-size="20" font-weight="bold" text-anchor="middle" fill="#0066cc">
Model Context Protocol (MCP): Out-of-Process Tools
</text>
<!-- MCP Hosts -->
<text x="150" y="70" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
MCP Hosts (Clients)
</text>
<g id="semantic-kernel-host">
<rect x="50" y="80" width="200" height="70" fill="#0078d4" stroke="#005a9e" stroke-width="2" rx="5"/>
<text x="150" y="110" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">
Semantic Kernel
</text>
<text x="150" y="130" font-family="Arial" font-size="11" text-anchor="middle" fill="#fff">
(.NET Agent)
</text>
</g>
<g id="vscode-host">
<rect x="50" y="170" width="200" height="70" fill="#007acc" stroke="#005a8c" stroke-width="2" rx="5"/>
<text x="150" y="200" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">
VS Code Copilot
</text>
<text x="150" y="220" font-family="Arial" font-size="11" text-anchor="middle" fill="#fff">
(.vscode/mcp.json)
</text>
</g>
<g id="claude-host">
<rect x="50" y="260" width="200" height="70" fill="#d97706" stroke="#b45309" stroke-width="2" rx="5"/>
<text x="150" y="290" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">
Claude Desktop
</text>
<text x="150" y="310" font-family="Arial" font-size="11" text-anchor="middle" fill="#fff">
(Anthropic)
</text>
</g>
<!-- Arrows -->
<defs>
<marker id="arrow" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#333"/>
</marker>
</defs>
<line x1="250" y1="115" x2="350" y2="200" stroke="#666" stroke-width="2" stroke-dasharray="5,5" marker-end="url(#arrow)"/>
<line x1="250" y1="205" x2="350" y2="200" stroke="#666" stroke-width="2" stroke-dasharray="5,5" marker-end="url(#arrow)"/>
<line x1="250" y1="295" x2="350" y2="200" stroke="#666" stroke-width="2" stroke-dasharray="5,5" marker-end="url(#arrow)"/>
<text x="290" y="160" font-family="Arial" font-size="10" text-anchor="middle" fill="#666">
stdio/http
</text>
<text x="290" y="175" font-family="Arial" font-size="10" text-anchor="middle" fill="#666">
JSON-RPC
</text>
<!-- MCP Protocol Layer -->
<rect x="350" y="160" width="200" height="80" fill="#00c853" stroke="#00a040" stroke-width="3" rx="5"/>
<text x="450" y="190" font-family="Arial" font-size="16" font-weight="bold" text-anchor="middle" fill="#fff">
MCP Protocol
</text>
<text x="450" y="210" font-family="Arial" font-size="11" text-anchor="middle" fill="#fff">
(Standardized Interface)
</text>
<text x="450" y="225" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
ModelContextProtocol SDK
</text>
<!-- Arrows to servers -->
<line x1="550" y1="200" x2="620" y2="130" stroke="#333" stroke-width="2" marker-end="url(#arrow)"/>
<line x1="550" y1="200" x2="620" y2="205" stroke="#333" stroke-width="2" marker-end="url(#arrow)"/>
<line x1="550" y1="200" x2="620" y2="280" stroke="#333" stroke-width="2" marker-end="url(#arrow)"/>
<!-- MCP Servers -->
<text x="725" y="70" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
MCP Servers (Tools)
</text>
<g id="filesystem-server">
<rect x="620" y="90" width="210" height="80" fill="#9c27b0" stroke="#7b1fa2" stroke-width="2" rx="5"/>
<text x="725" y="120" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#fff">
filesystem.mcp.exe
</text>
<text x="725" y="140" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
ReadFile(), ListFiles()
</text>
<text x="725" y="155" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
(.NET Console App)
</text>
</g>
<g id="database-server">
<rect x="620" y="180" width="210" height="80" fill="#1976d2" stroke="#0d47a1" stroke-width="2" rx="5"/>
<text x="725" y="210" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#fff">
sqlserver.mcp.exe
</text>
<text x="725" y="230" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
ExecuteQuery(), GetSchema()
</text>
<text x="725" y="245" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
(.NET Console App)
</text>
</g>
<g id="github-server">
<rect x="620" y="270" width="210" height="80" fill="#333" stroke="#000" stroke-width="2" rx="5"/>
<text x="725" y="300" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#fff">
github.mcp.js
</text>
<text x="725" y="320" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
CreateIssue(), GetPR()
</text>
<text x="725" y="335" font-family="Arial" font-size="10" text-anchor="middle" fill="#fff">
(Node.js / TypeScript)
</text>
</g>
<!-- Benefits box -->
<rect x="50" y="370" width="800" height="120" fill="#e8f5e9" stroke="#388e3c" stroke-width="2" rx="5"/>
<text x="450" y="400" font-family="Arial" font-size="15" font-weight="bold" text-anchor="middle" fill="#1b5e20">
✅ MCP Benefits
</text>
<g id="benefit1">
<circle cx="80" cy="425" r="5" fill="#4caf50"/>
<text x="95" y="430" font-family="Arial" font-size="11" fill="#1b5e20">
<tspan font-weight="bold">Reusability:</tspan> Write once, use everywhere (SK, VS Code, Claude)
</text>
</g>
<g id="benefit2">
<circle cx="80" cy="450" r="5" fill="#4caf50"/>
<text x="95" y="455" font-family="Arial" font-size="11" fill="#1b5e20">
<tspan font-weight="bold">Independence:</tspan> The MCP server runs as a separate process, isolated from the main app (out-of-process)
</text>
</g>
<g id="benefit3">
<circle cx="80" cy="475" r="5" fill="#4caf50"/>
<text x="95" y="480" font-family="Arial" font-size="11" fill="#1b5e20">
<tspan font-weight="bold">Language Agnostic:</tspan> Servers can be written in C#, Python, or Node.js; all speak the same protocol
</text>
</g>
</svg>
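
To make the "stdio/http JSON-RPC" arrow concrete, here is a rough Python sketch of how an out-of-process tool server might answer one JSON-RPC 2.0 message. It is a simplified illustration, not the ModelContextProtocol SDK: the `handle` function and the `TOOLS` table are hypothetical, the `ReadFile` tool returns a placeholder string instead of touching the disk, and the initialization handshake a real MCP server performs is omitted. The method names `tools/list` and `tools/call` follow the MCP specification.

```python
import json

# Hypothetical tool table; a real server would register actual implementations.
TOOLS = {
    "ReadFile": lambda args: f"(contents of {args['path']})",
}


def handle(line: str) -> str:
    """Handle one JSON-RPC 2.0 request line and return the response line."""
    request = json.loads(line)
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        result = {"tools": [{"name": name} for name in TOOLS]}
    elif method == "tools/call":
        params = request["params"]
        text = TOOLS[params["name"]](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})

    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})
```

Because the host and server only exchange messages like these, the server can be written in any language, which is the "Language Agnostic" benefit listed in the diagram.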


135
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/multilingual-rag.svg

@@ -0,0 +1,135 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1000 400">
<!-- Background -->
<rect width="1000" height="400" fill="#f8f9fa"/>
<!-- Title -->
<text x="500" y="30" font-family="Arial" font-size="20" font-weight="bold" text-anchor="middle" fill="#0066cc">
Multilingual RAG: Query Translation Pattern
</text>
<!-- User Query (Turkish) -->
<rect x="50" y="80" width="180" height="80" fill="#fff3cd" stroke="#ffc107" stroke-width="2" rx="5"/>
<text x="140" y="110" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#856404">
User Query
</text>
<text x="140" y="130" font-family="Arial" font-size="12" text-anchor="middle" fill="#856404">
🇹🇷 "Yazıcıyı ağa
</text>
<text x="140" y="145" font-family="Arial" font-size="12" text-anchor="middle" fill="#856404">
nasıl bağlarım?"
</text>
<!-- Arrow -->
<defs>
<marker id="arrow" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#333"/>
</marker>
</defs>
<line x1="230" y1="120" x2="290" y2="120" stroke="#d32f2f" stroke-width="3" marker-end="url(#arrow)"/>
<text x="260" y="110" font-family="Arial" font-size="11" font-weight="bold" text-anchor="middle" fill="#d32f2f">
Tool 1
</text>
<!-- Translation Tool -->
<rect x="290" y="80" width="180" height="80" fill="#ffccbc" stroke="#d84315" stroke-width="2" rx="5"/>
<text x="380" y="105" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#bf360c">
TranslationPlugin
</text>
<text x="380" y="125" font-family="Arial" font-size="11" text-anchor="middle" fill="#bf360c">
TranslateText()
</text>
<text x="380" y="145" font-family="Arial" font-size="10" text-anchor="middle" fill="#bf360c">
Target: English
</text>
<!-- Translated Query (English) -->
<line x1="470" y1="120" x2="530" y2="120" stroke="#388e3c" stroke-width="3" marker-end="url(#arrow)"/>
<text x="500" y="110" font-family="Arial" font-size="11" font-weight="bold" text-anchor="middle" fill="#1b5e20">
Tool 2
</text>
<rect x="530" y="80" width="180" height="80" fill="#c8e6c9" stroke="#388e3c" stroke-width="2" rx="5"/>
<text x="620" y="110" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#1b5e20">
RAGPlugin
</text>
<text x="620" y="130" font-family="Arial" font-size="11" text-anchor="middle" fill="#1b5e20">
🇬🇧 "How do I connect
</text>
<text x="620" y="145" font-family="Arial" font-size="11" text-anchor="middle" fill="#1b5e20">
the printer to the network?"
</text>
<!-- Vector DB -->
<line x1="710" y1="120" x2="770" y2="220" stroke="#1976d2" stroke-width="2" stroke-dasharray="5,5" marker-end="url(#arrow)"/>
<text x="740" y="170" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
Vector Search
</text>
<rect x="770" y="220" width="180" height="80" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
<text x="860" y="245" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#0d47a1">
Vector DB
</text>
<text x="860" y="265" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
(English Docs)
</text>
<text x="860" y="280" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
"Navigate to Settings
</text>
<text x="860" y="292" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
&gt; Network &gt; Wi-Fi..."
</text>
<!-- Retrieved Context -->
<line x1="770" y1="260" x2="710" y2="260" stroke="#1976d2" stroke-width="2" marker-end="url(#arrow)"/>
<rect x="530" y="220" width="180" height="80" fill="#bbdefb" stroke="#1976d2" stroke-width="2" rx="5"/>
<text x="620" y="245" font-family="Arial" font-size="12" font-weight="bold" text-anchor="middle" fill="#0d47a1">
Retrieved Context
</text>
<text x="620" y="265" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
🇬🇧 English text
</text>
<text x="620" y="280" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">
(Manual excerpt)
</text>
<!-- Arrow to LLM -->
<line x1="530" y1="260" x2="470" y2="260" stroke="#9c27b0" stroke-width="3" marker-end="url(#arrow)"/>
<!-- LLM Final Generation -->
<rect x="290" y="220" width="180" height="80" fill="#f3e5f5" stroke="#7b1fa2" stroke-width="2" rx="5"/>
<text x="380" y="245" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#4a148c">
LLM (GPT-5)
</text>
<text x="380" y="265" font-family="Arial" font-size="10" text-anchor="middle" fill="#4a148c">
Context: [English]
</text>
<text x="380" y="280" font-family="Arial" font-size="10" text-anchor="middle" fill="#4a148c">
Generates: [Turkish Response]
</text>
<!-- Arrow to user -->
<line x1="290" y1="260" x2="230" y2="260" stroke="#9c27b0" stroke-width="3" marker-end="url(#arrow)"/>
<!-- Final Answer -->
<rect x="50" y="220" width="180" height="80" fill="#e1bee7" stroke="#7b1fa2" stroke-width="2" rx="5"/>
<text x="140" y="245" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#4a148c">
Response to User
</text>
<text x="140" y="265" font-family="Arial" font-size="11" text-anchor="middle" fill="#4a148c">
🇹🇷 "Ayarlar &gt;&gt;
</text>
<text x="140" y="280" font-family="Arial" font-size="11" text-anchor="middle" fill="#4a148c">
Wi-Fi bölümüne gidin..."
</text>
<!-- Benefit note -->
<rect x="50" y="320" width="900" height="60" fill="#fff9c4" stroke="#f57f17" stroke-width="2" rx="5"/>
<text x="500" y="345" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#f57f17">
✅ Benefit: Single-language (English) docs, multilingual query support
</text>
<text x="500" y="365" font-family="Arial" font-size="11" text-anchor="middle" fill="#f57f17">
Tool Chain: TranslationPlugin → RAGPlugin → LLM Final Generation (Original language)
</text>
</svg>
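
The tool chain in this diagram (translate, retrieve, then generate in the user's language) could be wired together roughly as follows. This is only a sketch: `answer_multilingual` and its three callable parameters are hypothetical stand-ins for the TranslationPlugin, RAGPlugin, and LLM shown above, and the stubs in the usage lines return canned strings rather than calling any real service.

```python
from typing import Callable


def answer_multilingual(
    query: str,
    user_language: str,
    translate: Callable[[str, str], str],
    retrieve: Callable[[str], str],
    generate: Callable[[str, str, str], str],
) -> str:
    """Translate the query to English, search English docs, reply in the user's language."""
    english_query = translate(query, "en")            # Tool 1: translation
    context = retrieve(english_query)                 # Tool 2: vector search over English docs
    return generate(query, context, user_language)    # Final generation in the original language


# Usage with stub callables (assumptions for illustration only).
reply = answer_multilingual(
    "Yazıcıyı ağa nasıl bağlarım?",
    "tr",
    translate=lambda text, target: "How do I connect the printer to the network?",
    retrieve=lambda english_query: "Navigate to Settings > Network > Wi-Fi",
    generate=lambda question, context, language: f"[{language}] {context}",
)
```

Passing the original query (not the translation) to the generation step keeps the user's wording available to the model while the retrieval step sees only English.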


112
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/pgvector-integration.svg

@@ -0,0 +1,112 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 950 400">
<!-- Background -->
<rect width="950" height="400" fill="#f8f9fa"/>
<!-- Title -->
<text x="475" y="30" font-family="Arial" font-size="20" font-weight="bold" text-anchor="middle" fill="#0066cc">
PostgreSQL + pgvector: Integrated RAG with EF Core
</text>
<!-- .NET App -->
<rect x="50" y="80" width="200" height="80" fill="#0078d4" stroke="#005a9e" stroke-width="2" rx="5"/>
<text x="150" y="115" font-family="Arial" font-size="16" font-weight="bold" text-anchor="middle" fill="#fff">
.NET Application
</text>
<text x="150" y="135" font-family="Arial" font-size="12" text-anchor="middle" fill="#fff">
(EF Core DbContext)
</text>
<!-- Arrow -->
<defs>
<marker id="arrow" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#333"/>
</marker>
</defs>
<line x1="250" y1="120" x2="330" y2="120" stroke="#333" stroke-width="2" marker-end="url(#arrow)"/>
<text x="290" y="110" font-family="Arial" font-size="11" text-anchor="middle" fill="#666">
LINQ Query
</text>
<!-- Pgvector.EntityFrameworkCore -->
<rect x="330" y="80" width="240" height="80" fill="#00c853" stroke="#00a040" stroke-width="3" rx="5"/>
<text x="450" y="110" font-family="Arial" font-size="15" font-weight="bold" text-anchor="middle" fill="#fff">
Pgvector.EntityFrameworkCore
</text>
<text x="450" y="130" font-family="Arial" font-size="11" text-anchor="middle" fill="#fff">
CosineDistance(), L2Distance()
</text>
<text x="450" y="145" font-family="Arial" font-size="11" text-anchor="middle" fill="#fff">
EF Core Extensions
</text>
<!-- Arrow -->
<line x1="570" y1="120" x2="650" y2="120" stroke="#333" stroke-width="2" marker-end="url(#arrow)"/>
<text x="610" y="110" font-family="Arial" font-size="11" text-anchor="middle" fill="#666">
SQL Query
</text>
<!-- PostgreSQL -->
<rect x="650" y="60" width="250" height="120" fill="#336791" stroke="#1a4a6d" stroke-width="2" rx="5"/>
<text x="775" y="95" font-family="Arial" font-size="16" font-weight="bold" text-anchor="middle" fill="#fff">
PostgreSQL + pgvector
</text>
<!-- Table visualization -->
<g id="table">
<rect x="670" y="110" width="210" height="60" fill="#4a7ba7" stroke="#fff" stroke-width="1" rx="3"/>
<!-- Header -->
<text x="685" y="127" font-family="Arial" font-size="10" font-weight="bold" fill="#fff">id</text>
<text x="725" y="127" font-family="Arial" font-size="10" font-weight="bold" fill="#fff">content</text>
<text x="800" y="127" font-family="Arial" font-size="10" font-weight="bold" fill="#fff">embedding</text>
<!-- Rows -->
<line x1="670" y1="130" x2="880" y2="130" stroke="#fff" stroke-width="1"/>
<text x="685" y="145" font-family="Arial" font-size="9" fill="#fff">1</text>
<text x="725" y="145" font-family="Arial" font-size="9" fill="#fff">Contoso...</text>
<text x="800" y="145" font-family="Arial" font-size="9" fill="#fff">[0.2, -0.1,...]</text>
<text x="685" y="160" font-family="Arial" font-size="9" fill="#fff">2</text>
<text x="725" y="160" font-family="Arial" font-size="9" fill="#fff">Revenue...</text>
<text x="800" y="160" font-family="Arial" font-size="9" fill="#fff">[0.5, 0.3,...]</text>
</g>
<!-- Benefits Box -->
<rect x="50" y="220" width="850" height="150" fill="#e8f5e9" stroke="#388e3c" stroke-width="2" rx="5"/>
<text x="475" y="250" font-family="Arial" font-size="16" font-weight="bold" text-anchor="middle" fill="#1b5e20">
✅ Benefits
</text>
<g id="benefit1">
<circle cx="80" cy="275" r="5" fill="#4caf50"/>
<text x="95" y="280" font-family="Arial" font-size="12" fill="#1b5e20">
<tspan font-weight="bold">Existing SQL Knowledge:</tspan> PostgreSQL is already a familiar database
</text>
</g>
<g id="benefit2">
<circle cx="80" cy="300" r="5" fill="#4caf50"/>
<text x="95" y="305" font-family="Arial" font-size="12" fill="#1b5e20">
<tspan font-weight="bold">EF Core Integration:</tspan> Vector queries with LINQ (.OrderBy(), .Where())
</text>
</g>
<g id="benefit3">
<circle cx="80" cy="325" r="5" fill="#4caf50"/>
<text x="95" y="330" font-family="Arial" font-size="12" fill="#1b5e20">
<tspan font-weight="bold">Metadata JOIN:</tspan> Vector + relational data in the same query (tenant_id, user_id...)
</text>
</g>
<g id="benefit4">
<circle cx="80" cy="350" r="5" fill="#4caf50"/>
<text x="95" y="355" font-family="Arial" font-size="12" fill="#1b5e20">
<tspan font-weight="bold">ACID Compliant:</tspan> Transaction support (rollback, commit)
</text>
</g>
<!-- Code Example Box -->
<rect x="50" y="390" width="850" height="10" fill="none" stroke="none"/>
</svg>
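
For readers unfamiliar with the two distance functions named in the diagram, the following pure-Python sketch shows what `CosineDistance()` and `L2Distance()` compute and how a metadata filter combines with a nearest-neighbor ordering. It does not use PostgreSQL, pgvector, or EF Core; the `rows` list and the `nearest` helper are hypothetical, with column names borrowed from the table and the `tenant_id` example in the diagram.

```python
from math import sqrt


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean (straight-line) distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical in-memory rows standing in for a table with an embedding column.
rows = [
    {"id": 1, "tenant_id": "a", "content": "Contoso...", "embedding": [0.2, -0.1]},
    {"id": 2, "tenant_id": "a", "content": "Revenue...", "embedding": [0.5, 0.3]},
    {"id": 3, "tenant_id": "b", "content": "Other...", "embedding": [0.5, 0.3]},
]


def nearest(rows: list[dict], query: list[float], tenant_id: str, k: int) -> list[dict]:
    """Filter by metadata first, then order by cosine distance (like WHERE + ORDER BY)."""
    candidates = [row for row in rows if row["tenant_id"] == tenant_id]
    return sorted(candidates, key=lambda row: cosine_distance(row["embedding"], query))[:k]
```

In the database the same idea is a single query: a relational filter narrows the rows and the vector distance decides their order, which is the "Metadata JOIN" benefit listed above.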


118
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/rag-parent-child.svg

@@ -0,0 +1,118 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1000 500">
<!-- Background -->
<rect width="1000" height="500" fill="#f8f9fa"/>
<!-- Title -->
<text x="500" y="30" font-family="Arial" font-size="20" font-weight="bold" text-anchor="middle" fill="#0066cc">
Parent-Child RAG Pattern: Search Small, Respond Large
</text>
<!-- Original Document -->
<rect x="50" y="60" width="180" height="200" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
<text x="140" y="85" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#0d47a1">Original Document</text>
<!-- Parent chunks -->
<rect x="60" y="100" width="160" height="50" fill="#90caf9" stroke="#1976d2" stroke-width="1" rx="3"/>
<text x="140" y="125" font-family="Arial" font-size="11" text-anchor="middle" fill="#0d47a1">Parent 1 (800 tokens)</text>
<rect x="60" y="160" width="160" height="50" fill="#90caf9" stroke="#1976d2" stroke-width="1" rx="3"/>
<text x="140" y="185" font-family="Arial" font-size="11" text-anchor="middle" fill="#0d47a1">Parent 2 (800 tokens)</text>
<rect x="60" y="220" width="160" height="30" fill="#90caf9" stroke="#1976d2" stroke-width="1" rx="3"/>
<text x="140" y="237" font-family="Arial" font-size="11" text-anchor="middle" fill="#0d47a1">Parent 3...</text>
<!-- Arrow to Child chunks -->
<defs>
<marker id="arrowBlue" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#1976d2"/>
</marker>
<marker id="arrowGreen" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#388e3c"/>
</marker>
<marker id="arrowRed" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#d32f2f"/>
</marker>
</defs>
<line x1="230" y1="125" x2="300" y2="125" stroke="#1976d2" stroke-width="2" marker-end="url(#arrowBlue)"/>
<!-- Child Chunks (Vector DB) -->
<rect x="300" y="60" width="200" height="200" fill="#c8e6c9" stroke="#388e3c" stroke-width="2" rx="5"/>
<text x="400" y="85" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#1b5e20">Child Chunks</text>
<text x="400" y="100" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">(In Vector DB)</text>
<!-- Child chunk items -->
<g id="child1">
<rect x="310" y="110" width="180" height="25" fill="#a5d6a7" stroke="#66bb6a" stroke-width="1" rx="3"/>
<text x="400" y="126" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">Child 1.1 (100 tokens) [ParentID=1]</text>
</g>
<g id="child2">
<rect x="310" y="140" width="180" height="25" fill="#a5d6a7" stroke="#66bb6a" stroke-width="1" rx="3"/>
<text x="400" y="156" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">Child 1.2 (100 tokens) [ParentID=1]</text>
</g>
<g id="child3">
<rect x="310" y="170" width="180" height="25" fill="#a5d6a7" stroke="#66bb6a" stroke-width="1" rx="3"/>
<text x="400" y="186" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">Child 1.3 (100 tokens) [ParentID=1]</text>
</g>
<g id="child4">
<rect x="310" y="200" width="180" height="25" fill="#a5d6a7" stroke="#66bb6a" stroke-width="1" rx="3"/>
<text x="400" y="216" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">Child 2.1 (100 tokens) [ParentID=2]</text>
</g>
<g id="child5">
<rect x="310" y="230" width="180" height="20" fill="#a5d6a7" stroke="#66bb6a" stroke-width="1" rx="3"/>
<text x="400" y="243" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">Child 2.2...</text>
</g>
<!-- User Query -->
<rect x="50" y="320" width="180" height="80" fill="#fff3cd" stroke="#ffc107" stroke-width="2" rx="5"/>
<text x="140" y="345" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#856404">User Query</text>
<text x="140" y="365" font-family="Arial" font-size="11" text-anchor="middle" fill="#856404">"What was Contoso's</text>
<text x="140" y="380" font-family="Arial" font-size="11" text-anchor="middle" fill="#856404">2024 revenue?"</text>
<!-- Arrow to Vector Search -->
<line x1="230" y1="360" x2="300" y2="200" stroke="#388e3c" stroke-width="2" stroke-dasharray="5,5" marker-end="url(#arrowGreen)"/>
<text x="265" y="270" font-family="Arial" font-size="11" fill="#1b5e20">1. Vector Search</text>
<text x="265" y="285" font-family="Arial" font-size="11" fill="#1b5e20">(On Child chunks)</text>
<!-- Best Match -->
<rect x="520" y="150" width="180" height="60" fill="#ffccbc" stroke="#ff5722" stroke-width="2" rx="5"/>
<text x="610" y="175" font-family="Arial" font-size="12" font-weight="bold" text-anchor="middle" fill="#bf360c">Best Match</text>
<text x="610" y="195" font-family="Arial" font-size="10" text-anchor="middle" fill="#bf360c">Child 1.2 (Score: 0.95)</text>
<line x1="500" y1="152" x2="520" y2="180" stroke="#ff5722" stroke-width="2" marker-end="url(#arrowRed)"/>
<!-- Arrow to Parent Retrieval -->
<line x1="700" y1="180" x2="750" y2="180" stroke="#d32f2f" stroke-width="2" marker-end="url(#arrowRed)"/>
<text x="690" y="230" font-family="Arial" font-size="11" fill="#b71c1c">2. Fetch Parent via</text>
<text x="690" y="240" font-family="Arial" font-size="11" fill="#b71c1c">ParentID</text>
<!-- Retrieved Parent -->
<rect x="750" y="140" width="200" height="80" fill="#ffebee" stroke="#d32f2f" stroke-width="3" rx="5"/>
<text x="850" y="165" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#b71c1c">Retrieved Parent Chunk</text>
<text x="850" y="185" font-family="Arial" font-size="10" text-anchor="middle" fill="#b71c1c">Parent 1 (800 tokens)</text>
<text x="850" y="200" font-family="Arial" font-size="10" text-anchor="middle" fill="#b71c1c">Full context + details</text>
<!-- Arrow to LLM -->
<line x1="850" y1="220" x2="850" y2="290" stroke="#0066cc" stroke-width="2" marker-end="url(#arrowBlue)"/>
<text x="880" y="255" font-family="Arial" font-size="11" fill="#0d47a1">3. Send to LLM</text>
<!-- LLM Response -->
<rect x="750" y="290" width="200" height="100" fill="#e3f2fd" stroke="#1976d2" stroke-width="3" rx="5"/>
<text x="850" y="320" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#0d47a1">LLM Response</text>
<text x="850" y="340" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">"Contoso's 2024</text>
<text x="850" y="355" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">revenue was $2.5 billion</text>
<text x="850" y="370" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">as reported."</text>
<!-- Benefit Box -->
<rect x="50" y="430" width="900" height="60" fill="#d1f2eb" stroke="#00695c" stroke-width="2" rx="5"/>
<text x="500" y="455" font-family="Arial" font-size="13" font-weight="bold" text-anchor="middle" fill="#004d40">
✅ Benefit: Precise search (Child) + Rich context (Parent) = Optimal quality
</text>
<text x="500" y="475" font-family="Arial" font-size="11" text-anchor="middle" fill="#004d40">
Alternative: Only large chunks → Lower precision | Only small chunks → Insufficient context
</text>
</svg>
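
The three numbered steps in this diagram (search child chunks, follow `ParentID`, send the parent to the LLM) can be sketched as below. This is an illustrative toy, not the article's code: `Child`, `score`, and `retrieve_parent` are hypothetical names, and the word-overlap `score` is a stand-in for the vector similarity a real vector database would compute.

```python
from dataclasses import dataclass


@dataclass
class Child:
    child_id: str
    parent_id: int
    text: str


def score(query: str, text: str) -> float:
    """Toy word-overlap similarity (assumption: stands in for embedding similarity)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    union = query_words | text_words
    return len(query_words & text_words) / len(union) if union else 0.0


def retrieve_parent(query: str, children: list[Child], parents: dict[int, str]) -> str:
    """Search the small child chunks, then return the large parent chunk they belong to."""
    best_child = max(children, key=lambda child: score(query, child.text))  # 1. search on children
    return parents[best_child.parent_id]                                    # 2. fetch parent via ParentID


# Hypothetical data shaped like the diagram: small children pointing at large parents.
parents = {1: "<Parent 1: full 800-token passage>", 2: "<Parent 2: full 800-token passage>"}
children = [
    Child("1.1", 1, "Contoso was founded in Seattle"),
    Child("1.2", 1, "Contoso 2024 revenue figures"),
    Child("2.1", 2, "Fabrikam hiring plans"),
]
context_for_llm = retrieve_parent("Contoso 2024 revenue", children, parents)  # 3. send to LLM
```

Only the children need embeddings and an index; the parents can live in any key-value or relational store, since they are fetched by ID rather than searched.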


60
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/reasoning-effort-diagram.svg

@@ -0,0 +1,60 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 300">
<!-- Background -->
<rect width="800" height="300" fill="#f8f9fa"/>
<!-- Title -->
<text x="400" y="30" font-family="Arial" font-size="20" font-weight="bold" text-anchor="middle" fill="#0066cc">
ReasoningEffortLevel: Cost vs Quality
</text>
<!-- Axis -->
<line x1="50" y1="250" x2="750" y2="250" stroke="#333" stroke-width="2"/>
<line x1="50" y1="250" x2="50" y2="50" stroke="#333" stroke-width="2"/>
<!-- Y-axis labels -->
<text x="40" y="60" font-family="Arial" font-size="12" text-anchor="end" fill="#666">High</text>
<text x="40" y="155" font-family="Arial" font-size="12" text-anchor="end" fill="#666">Medium</text>
<text x="40" y="250" font-family="Arial" font-size="12" text-anchor="end" fill="#666">Low</text>
<!-- Y-axis title -->
<text x="20" y="150" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#333" transform="rotate(-90 20 150)">
Quality / Cost
</text>
<!-- Effort levels -->
<g id="minimal">
<rect x="100" y="200" width="120" height="50" fill="#90caf9" stroke="#1976d2" stroke-width="2" rx="5"/>
<text x="160" y="225" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#0d47a1">Minimal</text>
<text x="160" y="242" font-family="Arial" font-size="10" text-anchor="middle" fill="#0d47a1">Fast + Cheap</text>
</g>
<g id="low">
<rect x="250" y="170" width="120" height="80" fill="#81c784" stroke="#388e3c" stroke-width="2" rx="5"/>
<text x="310" y="205" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#1b5e20">Low</text>
<text x="310" y="222" font-family="Arial" font-size="10" text-anchor="middle" fill="#1b5e20">Simple Queries</text>
</g>
<g id="medium">
<rect x="400" y="120" width="120" height="130" fill="#ffb74d" stroke="#f57c00" stroke-width="2" rx="5"/>
<text x="460" y="175" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#e65100">Medium</text>
<text x="460" y="192" font-family="Arial" font-size="10" text-anchor="middle" fill="#e65100">Standard</text>
</g>
<g id="high">
<rect x="550" y="60" width="120" height="190" fill="#e57373" stroke="#d32f2f" stroke-width="2" rx="5"/>
<text x="610" y="145" font-family="Arial" font-size="14" font-weight="bold" text-anchor="middle" fill="#b71c1c">High</text>
<text x="610" y="162" font-family="Arial" font-size="10" text-anchor="middle" fill="#b71c1c">Complex</text>
<text x="610" y="179" font-family="Arial" font-size="10" text-anchor="middle" fill="#b71c1c">Coding</text>
</g>
<!-- Cost arrow -->
<defs>
<marker id="arrowhead" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
<polygon points="0 0, 10 3, 0 6" fill="#d32f2f"/>
</marker>
</defs>
<line x1="150" y1="270" x2="650" y2="270" stroke="#d32f2f" stroke-width="2" marker-end="url(#arrowhead)"/>
<text x="400" y="290" font-family="Arial" font-size="12" font-style="italic" text-anchor="middle" fill="#d32f2f">
Increasing Cost (Reasoning Tokens ↑)
</text>
</svg>


149
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/images/svg-diagram-example.svg

@ -0,0 +1,149 @@
<svg viewBox="0 0 800 400" xmlns="http://www.w3.org/2000/svg">
<!-- Background -->
<rect width="800" height="400" fill="#f0f4f8"/>
<!-- Title -->
<text x="400" y="30" font-family="Arial, sans-serif" font-size="20" font-weight="bold" fill="#1e3a8a" text-anchor="middle">
PostgreSQL + pgvector Architecture
</text>
<!-- .NET Application -->
<g id="dotnet-app">
<rect x="50" y="80" width="140" height="240" rx="10" fill="#3b82f6" stroke="#1e40af" stroke-width="2"/>
<text x="120" y="110" font-family="Arial, sans-serif" font-size="16" font-weight="bold" fill="white" text-anchor="middle">
.NET Application
</text>
<!-- API Layer -->
<rect x="65" y="130" width="110" height="50" rx="5" fill="#60a5fa" stroke="#2563eb" stroke-width="1.5"/>
<text x="120" y="160" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Web API /
</text>
<text x="120" y="175" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Controllers
</text>
<!-- Business Logic -->
<rect x="65" y="190" width="110" height="50" rx="5" fill="#60a5fa" stroke="#2563eb" stroke-width="1.5"/>
<text x="120" y="215" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Business Logic /
</text>
<text x="120" y="230" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Services
</text>
<!-- Data Layer -->
<rect x="65" y="250" width="110" height="50" rx="5" fill="#60a5fa" stroke="#2563eb" stroke-width="1.5"/>
<text x="120" y="275" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Data Access
</text>
<text x="120" y="290" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Layer
</text>
</g>
<!-- Arrow 1: .NET to EF Core -->
<defs>
<marker id="arrowblue" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto" markerUnits="strokeWidth">
<path d="M0,0 L0,6 L9,3 z" fill="#1e40af" />
</marker>
</defs>
<line x1="190" y1="200" x2="260" y2="200" stroke="#1e40af" stroke-width="3" marker-end="url(#arrowblue)"/>
<text x="225" y="190" font-family="Arial, sans-serif" font-size="11" fill="#1e3a8a" text-anchor="middle">ORM</text>
<!-- EF Core -->
<g id="ef-core">
<rect x="260" y="150" width="140" height="100" rx="10" fill="#2563eb" stroke="#1e40af" stroke-width="2"/>
<text x="330" y="180" font-family="Arial, sans-serif" font-size="16" font-weight="bold" fill="white" text-anchor="middle">
Entity Framework
</text>
<text x="330" y="200" font-family="Arial, sans-serif" font-size="16" font-weight="bold" fill="white" text-anchor="middle">
Core
</text>
<text x="330" y="225" font-family="Arial, sans-serif" font-size="12" fill="#bfdbfe" text-anchor="middle">
DbContext
</text>
<text x="330" y="240" font-family="Arial, sans-serif" font-size="12" fill="#bfdbfe" text-anchor="middle">
LINQ Queries
</text>
</g>
<!-- Arrow 2: EF Core to PostgreSQL -->
<line x1="400" y1="200" x2="470" y2="200" stroke="#1e40af" stroke-width="3" marker-end="url(#arrowblue)"/>
<text x="435" y="190" font-family="Arial, sans-serif" font-size="11" fill="#1e3a8a" text-anchor="middle">Npgsql</text>
<!-- PostgreSQL -->
<g id="postgresql">
<rect x="470" y="80" width="140" height="240" rx="10" fill="#1e40af" stroke="#1e3a8a" stroke-width="2"/>
<text x="540" y="110" font-family="Arial, sans-serif" font-size="16" font-weight="bold" fill="white" text-anchor="middle">
PostgreSQL
</text>
<!-- Tables -->
<rect x="485" y="130" width="110" height="50" rx="5" fill="#3b82f6" stroke="#2563eb" stroke-width="1.5"/>
<text x="540" y="155" font-family="Arial, sans-serif" font-size="13" fill="white" text-anchor="middle">
Relational Tables
</text>
<text x="540" y="170" font-family="Arial, sans-serif" font-size="11" fill="#bfdbfe" text-anchor="middle">
(Standard Data)
</text>
<!-- pgvector Extension -->
<rect x="485" y="190" width="110" height="50" rx="5" fill="#60a5fa" stroke="#3b82f6" stroke-width="1.5"/>
<text x="540" y="215" font-family="Arial, sans-serif" font-size="13" font-weight="bold" fill="white" text-anchor="middle">
pgvector
</text>
<text x="540" y="230" font-family="Arial, sans-serif" font-size="11" fill="white" text-anchor="middle">
Vector Storage
</text>
<!-- Vector Search -->
<rect x="485" y="250" width="110" height="50" rx="5" fill="#93c5fd" stroke="#60a5fa" stroke-width="1.5"/>
<text x="540" y="270" font-family="Arial, sans-serif" font-size="12" fill="#1e3a8a" text-anchor="middle">
Vector Search
</text>
<text x="540" y="285" font-family="Arial, sans-serif" font-size="11" fill="#1e40af" text-anchor="middle">
Similarity Queries
</text>
<text x="540" y="297" font-family="Arial, sans-serif" font-size="10" fill="#1e40af" text-anchor="middle">
(&lt;=&gt;, &lt;-&gt;, &lt;#&gt;)
</text>
</g>
<!-- Arrow 3: Vector Operations -->
<path d="M 540 240 L 540 250" stroke="#1e40af" stroke-width="2" marker-end="url(#arrowblue)"/>
<!-- Vector Search Results -->
<g id="results">
<rect x="640" y="160" width="130" height="80" rx="8" fill="#dbeafe" stroke="#3b82f6" stroke-width="2" stroke-dasharray="5,5"/>
<text x="705" y="185" font-family="Arial, sans-serif" font-size="13" font-weight="bold" fill="#1e40af" text-anchor="middle">
Search Results
</text>
<text x="705" y="205" font-family="Arial, sans-serif" font-size="11" fill="#1e40af" text-anchor="middle">
• Embeddings
</text>
<text x="705" y="220" font-family="Arial, sans-serif" font-size="11" fill="#1e40af" text-anchor="middle">
• Similarity Score
</text>
<text x="705" y="235" font-family="Arial, sans-serif" font-size="11" fill="#1e40af" text-anchor="middle">
• Ranked Results
</text>
</g>
<!-- Arrow to Results -->
<line x1="610" y1="200" x2="640" y2="200" stroke="#3b82f6" stroke-width="2" marker-end="url(#arrowblue)" stroke-dasharray="5,5"/>
<!-- Legend -->
<g id="legend">
<text x="50" y="360" font-family="Arial, sans-serif" font-size="12" font-weight="bold" fill="#1e3a8a">
Data Flow:
</text>
<text x="50" y="380" font-family="Arial, sans-serif" font-size="11" fill="#334155">
1. .NET → EF Core → PostgreSQL (Data Operations)
</text>
<text x="420" y="380" font-family="Arial, sans-serif" font-size="11" fill="#334155">
2. Vector Similarity Search with pgvector
</text>
</g>
</svg>


414
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/post.md

@ -0,0 +1,414 @@
# Building Production-Ready LLM Applications with .NET: A Practical Guide
Large Language Models (LLMs) have evolved rapidly, and integrating them into production .NET applications requires staying current with the latest approaches. In this article, I'll share practical tips and patterns I've learned while building LLM-powered systems, covering everything from API changes in GPT-5 to implementing efficient RAG (Retrieval Augmented Generation) architectures.
Whether you're building a chatbot, a knowledge base assistant, or integrating AI into your enterprise applications, these production-tested insights will help you avoid common pitfalls and build more reliable systems.
## The Temperature Paradigm Shift: GPT-5 Changes Everything
If you've been working with GPT-4 or earlier models, you're familiar with the `temperature` and `top_p` parameters for controlling response randomness. **Here's the critical update**: GPT-5 no longer supports these parameters!
### The Old Way (GPT-4)
```csharp
var chatRequest = new ChatOptions
{
    // ChatOptions.Temperature and TopP are float?, so the literals need the f suffix
    Temperature = 0.7f, // ✅ Worked with GPT-4
    TopP = 0.9f         // ✅ Worked with GPT-4
};
```
### The New Way (GPT-5)
```csharp
var chatRequest = new ChatOptions
{
    RawRepresentationFactory = (client => new ChatCompletionOptions()
    {
#pragma warning disable OPENAI001
        ReasoningEffortLevel = "minimal",
#pragma warning restore OPENAI001
    })
};
```
**Why the change?** GPT-5 incorporates an internal reasoning and verification process. Instead of controlling randomness, you now specify how much computational effort the model should invest in reasoning through the problem.
![Reasoning Effort Levels](images/reasoning-effort-diagram.svg)
### Choosing the Right Reasoning Level
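- **Minimal**: Fastest and cheapest; the model spends the least effort reasoning before it answers (the value used in the snippet above)
- **Low**: Quick responses for simple queries (e.g., "What's the capital of France?")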
- **Medium**: Balanced approach for most use cases
- **High**: Complex reasoning tasks (e.g., code generation, multi-step problem solving)
> **Pro Tip**: Reasoning tokens are included in your API costs. Use "High" only when necessary to optimize your budget.
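
One way to keep that budget decision in a single place is a small lookup from task category to effort level. This is only an illustrative sketch: the `TaskKind` enum and the `ReasoningEffort.For` helper are hypothetical names, not part of any SDK, and the mapping simply mirrors the guidance above plus the `"minimal"` value from the earlier snippet.

```csharp
using System;

// Hypothetical task categories; adjust them to your own workload.
public enum TaskKind { Lookup, SimpleQuery, General, CodeGeneration }

public static class ReasoningEffort
{
    // Returns the effort string to assign to ReasoningEffortLevel.
    public static string For(TaskKind kind) => kind switch
    {
        TaskKind.Lookup => "minimal",
        TaskKind.SimpleQuery => "low",
        TaskKind.General => "medium",
        TaskKind.CodeGeneration => "high",
        _ => throw new ArgumentOutOfRangeException(nameof(kind))
    };
}
```

Centralizing the mapping makes it easier to audit where "high" is used when reviewing costs.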
## System Prompts: The "Lost in the Middle" Problem
Here's a critical insight that can save you hours of debugging: **Important rules must be repeated at the END of your prompt!**
### ❌ What Doesn't Work
```
You are a helpful assistant.
RULE: Never share passwords or sensitive information.
[User Input]
```
### ✅ What Actually Works
```
You are a helpful assistant.
RULE: Never share passwords or sensitive information.
[User Input]
⚠️ REMINDER: Apply the rules above strictly, ESPECIALLY regarding passwords.
```
**Why?** LLMs suffer from the "Lost in the Middle" phenomenon—they pay more attention to the beginning and end of the context window. Critical instructions buried in the middle are often ignored.
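
The working layout above can be produced by a small helper so the reminder is never forgotten. The following is a minimal sketch; `PromptBuilder` and its parameters are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text;

public static class PromptBuilder
{
    // Places the rule near the top and repeats it after the user input,
    // so the critical instruction also sits at the very end of the prompt.
    public static string Build(string role, string rule, string userInput)
    {
        var sb = new StringBuilder();
        sb.AppendLine(role);
        sb.AppendLine($"RULE: {rule}");
        sb.AppendLine();
        sb.AppendLine(userInput);
        sb.AppendLine();
        sb.Append($"REMINDER: Apply the rules above strictly. {rule}");
        return sb.ToString();
    }
}
```

A call such as `PromptBuilder.Build("You are a helpful assistant.", "Never share passwords or sensitive information.", userInput)` yields a prompt where the rule appears both before and after the user's text.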
## RAG Architecture: The Parent-Child Pattern
Retrieval Augmented Generation (RAG) is essential for grounding LLM responses in your own data. The most effective pattern I've found is the **Parent-Child approach**.
![RAG Parent-Child Architecture](images/rag-parent-child.svg)
### How It Works
1. **Split documents into hierarchies**:
   - **Parent chunks**: Large sections (1000-2000 tokens) for context
   - **Child chunks**: Small segments (200-500 tokens) for precise retrieval
2. **Store both in the vector database**, with each child chunk holding a reference to its parent
3. **Query flow**:
   - Search using child chunks (higher precision)
   - Return parent chunks to LLM (richer context)
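
The steps above can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not a production implementation: words stand in for tokens, a shared-word count stands in for embedding similarity, and all type and method names (`ParentChunk`, `ChildChunk`, `ParentChildIndex`) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record ParentChunk(int Id, string Text);
public record ChildChunk(int ParentId, string Text);

public static class ParentChildIndex
{
    // Splits text into fixed-size word windows (words stand in for tokens here).
    private static IEnumerable<string> Split(string text, int size)
    {
        var words = text.Split(' ', StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < words.Length; i += size)
            yield return string.Join(" ", words.Skip(i).Take(size));
    }

    // Step 1 and 2: build parents, then children that reference their parent.
    public static (List<ParentChunk> Parents, List<ChildChunk> Children) Build(
        string document, int parentSize, int childSize)
    {
        var parents = Split(document, parentSize)
            .Select((text, id) => new ParentChunk(id, text))
            .ToList();
        var children = parents
            .SelectMany(p => Split(p.Text, childSize).Select(c => new ChildChunk(p.Id, c)))
            .ToList();
        return (parents, children);
    }

    // Placeholder relevance: counts shared words. A real system compares embeddings.
    private static int Score(string query, string text)
    {
        var q = query.ToLowerInvariant().Split(' ', StringSplitOptions.RemoveEmptyEntries);
        var t = text.ToLowerInvariant().Split(' ', StringSplitOptions.RemoveEmptyEntries);
        return q.Intersect(t).Count();
    }

    // Step 3: search over children, return the distinct parents of the best matches.
    public static List<ParentChunk> Search(
        string query, List<ParentChunk> parents, List<ChildChunk> children, int topK)
    {
        return children
            .OrderByDescending(c => Score(query, c.Text))
            .Take(topK)
            .Select(c => c.ParentId)
            .Distinct()
            .Select(id => parents[id])
            .ToList();
    }
}
```

The `Distinct()` call matters: several matching children often share one parent, and sending that parent to the LLM twice only wastes tokens.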
### The Overlap Strategy
Always use overlapping chunks to prevent information loss at boundaries!
```
Chunk 1: Token 0-500
Chunk 2: Token 400-900 ← 100 token overlap
Chunk 3: Token 800-1300 ← 100 token overlap
```
**Standard recommendation**: 10-20% overlap (for 500 tokens, use 50-100 token overlap)
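
The boundaries in the example above follow from a fixed step of `chunkSize - overlap`. A minimal sketch of that arithmetic is shown below; `OverlapChunker` is a hypothetical helper that works on token positions only, not on real text.

```csharp
using System;
using System.Collections.Generic;

public static class OverlapChunker
{
    // Returns [Start, End) token ranges where each chunk re-reads the last
    // `overlap` tokens of the previous chunk.
    public static List<(int Start, int End)> Ranges(int totalTokens, int chunkSize, int overlap)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        if (overlap < 0 || overlap >= chunkSize)
            throw new ArgumentOutOfRangeException(nameof(overlap));

        var ranges = new List<(int Start, int End)>();
        int step = chunkSize - overlap;
        for (int start = 0; start < totalTokens; start += step)
        {
            int end = Math.Min(start + chunkSize, totalTokens);
            ranges.Add((start, end));
            if (end == totalTokens) break; // the last chunk reached the end of the document
        }
        return ranges;
    }
}
```

Calling `OverlapChunker.Ranges(1300, 500, 100)` reproduces the three chunks listed above: 0-500, 400-900, and 800-1300.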
### Implementation with Semantic Kernel
```csharp
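using Microsoft.SemanticKernel.Text;

#pragma warning disable SKEXP0050 // TextChunker is experimental; the diagnostic ID may vary by SK version

// SplitPlainTextParagraphs takes a list of lines, so split the raw text into lines first.
var lines = TextChunker.SplitPlainTextLines(documentText, maxTokensPerLine: 100);
var chunks = TextChunker.SplitPlainTextParagraphs(
    lines,
    maxTokensPerParagraph: 500,
    overlapTokens: 50
);

foreach (var chunk in chunks)
{
    // embeddingService and vectorDb are your own services
    var embedding = await embeddingService.GenerateEmbeddingAsync(chunk);
    await vectorDb.StoreAsync(chunk, embedding);
}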
```
## PostgreSQL + pgvector: The Pragmatic Choice
For .NET developers, choosing a vector database can be overwhelming. After evaluating multiple options, **PostgreSQL with pgvector** is the most practical choice for most scenarios.
![pgvector Integration](images/pgvector-integration.svg)
### Why pgvector?
- **Use existing SQL knowledge** - No new query language to learn
- **EF Core integration** - Works with your existing data access layer
- **JOIN with metadata** - Combine vector search with traditional queries
- **WHERE clause filtering** - Filter by tenant, user, date, etc.
- **ACID compliance** - Transaction support for data consistency
- **No separate infrastructure** - One database for everything
### Setting Up pgvector with EF Core
First, install the NuGet package:
```bash
dotnet add package Pgvector.EntityFrameworkCore
```
Define your entity:
```csharp
using Pgvector;
using Pgvector.EntityFrameworkCore;

public class DocumentChunk
{
    public Guid Id { get; set; }
    public string Content { get; set; }
    public Vector Embedding { get; set; } // 👈 pgvector type
    public Guid ParentChunkId { get; set; }
    public DateTime CreatedAt { get; set; }
}
```
Configure in DbContext:
```csharp
protected override void OnModelCreating(ModelBuilder builder)
{
    builder.HasPostgresExtension("vector");

    builder.Entity<DocumentChunk>()
        .Property(e => e.Embedding)
        .HasColumnType("vector(1536)"); // 👈 OpenAI embedding dimension

    builder.Entity<DocumentChunk>()
        .HasIndex(e => e.Embedding)
        .HasMethod("hnsw") // 👈 Fast approximate search
        .HasOperators("vector_cosine_ops");
}
```
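
For the `Vector` property to map correctly, the Npgsql provider also needs the pgvector plugin enabled where the context is configured. The pgvector-dotnet README does this with `UseVector()`. The sketch below assumes a context class named `AppDbContext` and a placeholder connection string, both of which are illustrative.

```csharp
using Microsoft.EntityFrameworkCore;
using Pgvector.EntityFrameworkCore;

public class AppDbContext : DbContext
{
    // Placeholder connection string; load it from configuration in real code.
    private const string ConnectionString =
        "Host=localhost;Database=appdb;Username=app;Password=change-me";

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // UseVector() registers the pgvector type mapping with the provider.
        optionsBuilder.UseNpgsql(ConnectionString, o => o.UseVector());
    }
}
```

If the context is registered through dependency injection instead, the same `o => o.UseVector()` callback can be passed to `UseNpgsql` there.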
### Performing Vector Search
```csharp
using Microsoft.EntityFrameworkCore;
using Pgvector.EntityFrameworkCore;

public async Task<List<DocumentChunk>> SearchAsync(string query)
{
    // 1. Convert query to embedding
    var queryVector = await _embeddingService.GetEmbeddingAsync(query);

    // 2. Search with cosine distance so the vector_cosine_ops HNSW index can be used
    return await _context.DocumentChunks
        .OrderBy(c => c.Embedding.CosineDistance(queryVector)) // 👈 Lower is better
        .Take(5)
        .ToListAsync();
}
```
**Source**: [Pgvector.NET on GitHub](https://github.com/pgvector/pgvector-dotnet?tab=readme-ov-file#entity-framework-core)
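
pgvector exposes three distance operators: `<->` (L2 distance), `<#>` (negative inner product), and `<=>` (cosine distance). Whichever one a query orders by should match the operator class of the index (`vector_cosine_ops` pairs with cosine distance). To make the difference concrete, here is a plain C# sketch of what each operator computes; `VectorMath` is a hypothetical helper for illustration and is not part of Pgvector.

```csharp
using System;

public static class VectorMath
{
    private static double Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Dimension mismatch");
        double sum = 0;
        for (int i = 0; i < a.Length; i++) sum += (double)a[i] * b[i];
        return sum;
    }

    // Equivalent of the <-> operator: straight-line distance.
    public static double L2Distance(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Dimension mismatch");
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = (double)a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }

    // Equivalent of the <#> operator: negated so that smaller still means closer.
    public static double NegativeInnerProduct(float[] a, float[] b) => -Dot(a, b);

    // Equivalent of the <=> operator: 1 - cosine similarity.
    // Returns NaN if either vector has zero length.
    public static double CosineDistance(float[] a, float[] b) =>
        1.0 - Dot(a, b) / (Math.Sqrt(Dot(a, a)) * Math.Sqrt(Dot(b, b)));
}
```

Cosine distance ignores vector magnitude, which is why it is a common default for text embeddings.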
## Smart Tool Usage: Make RAG a Tool, Not a Tax
A common mistake is calling RAG on every single user message. This wastes tokens and money. Instead, **make RAG a tool** and let the LLM decide when to use it.
### ❌ Expensive Approach
```csharp
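// Always call RAG, even for "Hello"
var context = await PerformRAG(userMessage);
// GetResponseAsync is the current Microsoft.Extensions.AI name (earlier previews used CompleteAsync)
var response = await chatClient.GetResponseAsync($"{context}\n\n{userMessage}");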
```
### ✅ Smart Approach
```csharp
[KernelFunction]
[Description("Search the company knowledge base for information")]
public async Task<string> SearchKnowledgeBase(
    [Description("The search query")] string query)
{
    var results = await _vectorDb.SearchAsync(query);
    return string.Join("\n---\n", results.Select(r => r.Content));
}
```
The LLM will call `SearchKnowledgeBase` only when needed:
- "Hello" → No tool call
- "What was our 2024 revenue?" → Calls tool
- "Tell me a joke" → No tool call
## Multilingual RAG: Query Translation Strategy
When your documents are in one language (e.g., English) but users query in another (e.g., Turkish), you need a translation strategy.
![Multilingual RAG Architecture](images/multilingual-rag.svg)
### Solution Options
**Option 1**: Use an LLM that automatically calls tools in English
- Many modern LLMs can do this if properly instructed
**Option 2**: Tool chain approach
```csharp
[KernelFunction]
[Description("Translate text to English")]
public async Task<string> TranslateToEnglish(string text)
{
    // Translation logic
}

[KernelFunction]
[Description("Search knowledge base (English only)")]
public async Task<string> SearchKnowledgeBase(string englishQuery)
{
    // Search logic
}
```
The LLM will:
1. Call `TranslateToEnglish("2024 geliri nedir?")`
2. Get "What was 2024 revenue?"
3. Call `SearchKnowledgeBase("What was 2024 revenue?")`
4. Return results and respond in Turkish
## Model Context Protocol (MCP): Beyond In-Process Tools
Anthropic and Microsoft recently released an official C# SDK for the Model Context Protocol (MCP). This is a game-changer for tool reusability.
![MCP Architecture](images/mcp-architecture.svg)
### MCP vs. Semantic Kernel Plugins
| Feature | SK Plugins | MCP Servers |
|---------|-----------|-------------|
| **Process** | In-process | Out-of-process (stdio/http) |
| **Reusability** | Application-specific | Cross-application |
| **Examples** | Used within your app | VS Code Copilot, Claude Desktop |
### Creating an MCP Server
```csharp
using Microsoft.Extensions.DependencyInjection; // AddMcpServer() and its builder extensions
using Microsoft.Extensions.Hosting;

var builder = Host.CreateEmptyApplicationBuilder(settings: null);

builder.Services.AddMcpServer()
    .WithStdioServerTransport()
    .WithToolsFromAssembly();

await builder.Build().RunAsync();
```
Define your tools:
```csharp
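using System;
using System.ComponentModel;
using System.IO;
using System.Security;
using System.Threading.Tasks;
using ModelContextProtocol.Server;

[McpServerToolType]
public static class FileSystemTools
{
    // Example sandbox root (placeholder); only files under this directory may be read.
    private static readonly string AllowedDirectory =
        Path.GetFullPath("/var/app/data") + Path.DirectorySeparatorChar;

    [McpServerTool, Description("Read a file from the file system")]
    public static async Task<string> ReadFile(string path)
    {
        // ⚠️ SECURITY: Always validate paths!
        if (!IsPathSafe(path))
            throw new SecurityException("Invalid path");

        return await File.ReadAllTextAsync(path);
    }

    private static bool IsPathSafe(string path)
    {
        // Resolve ".." segments first, then require the result to sit under the root.
        // The trailing separator on AllowedDirectory stops "/var/app/data-evil" from matching.
        var fullPath = Path.GetFullPath(path);
        return fullPath.StartsWith(AllowedDirectory, StringComparison.Ordinal);
    }
}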
```
Your MCP server can now be used by VS Code Copilot, Claude Desktop, or any other MCP client!
## Chat History Management: Truncation + RAG Hybrid
For long conversations, storing all history in the context window becomes impractical. Here's the pattern that works:
![Chat History Hybrid Strategy](images/chat-history-hybrid.svg)
### ❌ Lossy Approach
```
First 50 messages → Summarize with LLM → Single summary message
```
**Problem**: Detail loss (fidelity loss)
### ✅ Hybrid Approach
1. **Recent messages** (last 5-10): Keep in prompt for immediate context
2. **Older messages**: Store in vector database as a tool
```csharp
[KernelFunction]
[Description("Search conversation history for past discussions")]
public async Task<string> SearchChatHistory(
    [Description("What to search for")] string query)
{
    var relevantMessages = await _vectorDb.SearchAsync(query);
    return string.Join("\n", relevantMessages.Select(m =>
        $"[{m.Timestamp}] {m.Role}: {m.Content}"));
}
```
The LLM retrieves only relevant past context when needed, avoiding summary-induced information loss.
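
The split between "recent" and "older" messages can be as simple as a cut-off index. The sketch below assumes the history is already in chronological order; `StoredMessage` and `HistoryWindow` are hypothetical names, and writing the archived part to the vector database is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record StoredMessage(DateTime Timestamp, string Role, string Content);

public static class HistoryWindow
{
    // Splits a chronological history into the part to archive (embed and store)
    // and the most recent messages that stay in the prompt.
    public static (List<StoredMessage> Archive, List<StoredMessage> Recent) Split(
        IReadOnlyList<StoredMessage> history, int keepRecent)
    {
        if (keepRecent < 0) throw new ArgumentOutOfRangeException(nameof(keepRecent));

        int cut = Math.Max(0, history.Count - keepRecent);
        return (history.Take(cut).ToList(), history.Skip(cut).ToList());
    }
}
```

With 12 messages and `keepRecent: 5`, the first 7 go to the archive and the last 5 remain in the prompt.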
## RAG vs. Fine-Tuning: Choose Wisely
A common misconception is using fine-tuning for knowledge injection. Here's when to use each:
| Purpose | RAG | Fine-Tuning |
|---------|-----|-------------|
| **Goal** | Memory (provide facts) | Behavior (teach style) |
| **Updates** | Dynamic (add docs anytime) | Static (requires retraining) |
| **Cost** | Low dev, higher inference | High dev, lower inference |
| **Hallucination** | Reduces | Doesn't reduce |
| **Use Case** | Company docs, FAQs | Brand voice, specific format |
**Common mistake**: "Let's fine-tune on our company documents" ❌
**Better approach**: Use RAG! ✅
Fine-tuning is for teaching the model *how* to respond, not *what* to know.
**Source**: [Oracle - RAG vs Fine-Tuning](https://www.oracle.com/artificial-intelligence/generative-ai/retrieval-augmented-generation-rag/rag-fine-tuning/)
## Bonus: Why SVG is Superior for LLM-Generated Images
When using LLMs to generate diagrams and visualizations, always request SVG format instead of PNG or JPG.
### Why SVG?
- **Text-based** → LLMs produce better results
- **Lower cost** → Fewer tokens than base64-encoded images
- **Editable** → Easy to modify after generation
- **Scalable** → Perfect quality at any size
- **Version control friendly** → Works great in Git
### Example Prompt
```
Create an architecture diagram showing PostgreSQL with pgvector integration.
Format: SVG, 800x400 pixels. Show: .NET Application → EF Core → PostgreSQL → Vector Search.
Use arrows to connect stages. Color scheme: Blue tones.
```
![SVG Diagram Example](images/svg-diagram-example.svg)
All diagrams in this article were generated as SVG, resulting in excellent quality and lower token costs!
> **Pro Tip**: If you don't need photographs or complex renders, always choose SVG.
## Architecture Roadmap: Putting It All Together
Here's the recommended stack for building production LLM applications with .NET:
1. **Orchestration**: Microsoft.Extensions.AI + Semantic Kernel (when needed)
2. **Vector Database**: PostgreSQL + Pgvector.EntityFrameworkCore
3. **RAG Pattern**: Parent-Child chunks with 10-20% overlap
4. **Tools**: MCP servers for reusability
5. **Reasoning**: ReasoningEffortLevel instead of temperature
6. **Prompting**: Critical rules at the end
7. **Cost Optimization**: Make RAG a tool, not automatic
## Key Takeaways
Let me summarize the most important production tips:
1. **Temperature is gone** → Use `ReasoningEffortLevel` with GPT-5
2. **Rules at the end** → Combat "Lost in the Middle"
3. **RAG as a tool** → Reduce costs significantly
4. **Parent-Child pattern** → Search small, respond with large
5. **Always use overlap** → 10-20% is the standard
6. **pgvector for most cases** → Unless you have billions of vectors
7. **MCP for reusability** → One codebase, works everywhere
8. **SVG for diagrams** → Better results, lower cost
9. **Hybrid chat history** → Recent in prompt, old in vector DB
10. **RAG > Fine-tuning** → For knowledge, not behavior
Happy coding! 🚀

1
docs/en/Community-Articles/2025-11-22-building-production-ready-llm-applications/summary.md

@ -0,0 +1 @@
Learn how to build production-ready LLM applications with .NET. This comprehensive guide covers GPT-5 API changes, advanced RAG architectures with parent-child patterns, PostgreSQL pgvector integration, smart tool usage strategies, multilingual query handling, Model Context Protocol (MCP) for cross-application tool reusability, and chat history management techniques for enterprise applications.