Technology & Architecture
Enterprise AI infrastructure for reliable, scalable and privacy-compliant form assistants.
Architecture Overview
The FINO Suite follows a modular cloud architecture with clear layer separation. Each component is independently scalable and replaceable.
Language Models (LLMs)
FINO is model-agnostic and supports various Large Language Models. The choice of model can be configured per tenant and use case.
| Provider | Models | Use Case | EU Hosting |
|---|---|---|---|
| Anthropic | Claude Sonnet family | Dialogue management, form assistance, complex follow-up questions | ✅ via EU infrastructure |
| Amazon | Nova Pro, Nova Lite, Titan | Document analysis, image processing, embeddings | ✅ EU (Frankfurt) |
| Amazon | Nova Sonic | Speech processing (FINO Voice) | ✅ EU (Stockholm) |
| Others | Configurable on request | Customer-specific requirements | Depends on provider |
RAG - Retrieval-Augmented Generation
FINO uses RAG to base AI responses on verified facts rather than general model knowledge. The result: professionally accurate, up-to-date and traceable answers.
Retrieval (Knowledge Retrieval)
For each user query, relevant information is retrieved from the connected knowledge bases.
- Vector search: Semantic matching via embeddings
- Hybrid search: Combination of semantic and keyword search
- Ranking: Relevance scoring and filtering of results
- Source references: Every piece of information is traceable to its source
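The retrieval steps above can be sketched as follows. This is a minimal illustration, not FINO's implementation: the documents, embeddings, and the `alpha` weighting are invented, and a real deployment would use model-generated embeddings and a proper index rather than a linear scan.

```python
from math import sqrt

# Toy knowledge base: each entry carries a precomputed embedding and its
# source id. In production the embeddings would come from an embedding
# model; here they are hand-made 3-d vectors for illustration only.
DOCS = [
    {"id": "faq-12", "text": "residence permit application fees", "vec": [0.9, 0.1, 0.0]},
    {"id": "faq-07", "text": "passport renewal processing time", "vec": [0.1, 0.8, 0.2]},
]

def cosine(a, b):
    """Semantic similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    """Fraction of query terms that literally appear in the document."""
    terms = set(query.lower().split())
    return len(terms & set(text.lower().split())) / max(len(terms), 1)

def hybrid_search(query, query_vec, alpha=0.7, top_k=3):
    """Blend semantic and keyword relevance; keep the source reference."""
    ranked = []
    for doc in DOCS:
        score = (alpha * cosine(query_vec, doc["vec"])
                 + (1 - alpha) * keyword_score(query, doc["text"]))
        ranked.append({"source": doc["id"], "text": doc["text"], "score": round(score, 3)})
    return sorted(ranked, key=lambda d: d["score"], reverse=True)[:top_k]

results = hybrid_search("residence permit fees", [0.85, 0.15, 0.05])
print(results[0]["source"])  # the fee FAQ ranks first: faq-12
```

Because every ranked result carries its `source` id, the answer shown to the user can cite exactly where each fact came from.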
Generation (Response Generation)
The language model generates a response based on the retrieved information and conversation context.
- Context window: Relevant documents are provided to the model
- Prompt engineering: Domain-specific instructions control tone and accuracy
- Validation: Responses are checked for consistency
- Multilingual: Responses in the user's language; the underlying form remains in German
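The generation step can be sketched as a prompt-assembly function: retrieved snippets are injected into the model's context window together with domain-specific instructions. The wording and function names below are illustrative assumptions, not FINO's actual prompts.

```python
# Hypothetical domain instructions; a real deployment would tune these
# per tenant and use case.
SYSTEM_INSTRUCTIONS = (
    "You are a form assistant. Answer only from the provided sources. "
    "If the sources do not cover the question, say so. "
    "Cite the source id after every statement."
)

def build_prompt(question, snippets, language="de"):
    """Assemble instructions, retrieved sources and the user question."""
    sources = "\n".join(f"[{s['source']}] {s['text']}" for s in snippets)
    return (
        f"{SYSTEM_INSTRUCTIONS}\n"
        f"Respond in language: {language}\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "How much does a residence permit cost?",
    [{"source": "faq-12", "text": "The residence permit fee is 100 EUR."}],
    language="en",
)
print(prompt)
```

Constraining the model to the listed sources is what makes the final answer traceable and reduces hallucination.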
Why RAG instead of pure LLM?
Without RAG (pure LLM):
- Responses based on static training data (potentially outdated)
- Hallucinations possible
- No source references
- Not tenant-specific
With RAG (FINO):
- Responses based on current, verified sources
- Fact-based and traceable
- Source references with every response
- Individual knowledge base per tenant
MCP - Model Context Protocol
FINO uses the Model Context Protocol (MCP) as an open standard for communication between AI systems and knowledge sources. This enables a flexible, extensible architecture.
Modularity
- Knowledge sources as independent MCP servers
- Easy addition and removal of data sources
- Independent scaling per source
- Standardised interfaces
Distributed Architecture
- Multiple knowledge sources queryable in parallel
- Multi-tenant configuration
- Real-time knowledge base updates
- Interoperability with various AI models
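At the wire level, MCP messages follow JSON-RPC 2.0 over a transport such as stdio or HTTP. The sketch below shows that message shape; the tool name `search_knowledge_base` and its arguments are hypothetical stand-ins, since real MCP servers advertise their actual tools via the standard `tools/list` method.

```python
import json

def make_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request invoking an MCP tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical query against one tenant's knowledge-source server.
request = make_tool_call(
    1, "search_knowledge_base",
    {"query": "residence permit fees", "tenant": "city-a"},
)
print(json.loads(request)["method"])  # tools/call
```

Because each knowledge source speaks this same protocol, sources can be added, removed, or queried in parallel without changing the assistant itself.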
Integration & Interfaces
Frontend Integration
FINO is provided as a Web Component - a single HTML tag is all you need for integration.
- No framework required
- Works in any website
- Responsive and accessible
- Customisable design
Backend Interfaces
Standardised APIs are available for deeper integrations.
- REST API: Standard HTTP interface for all products
- MCP Protocol: For knowledge base connectivity
- Webhooks: Event-based notifications
- Form mapping: Automatic mapping of AI responses to form fields
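Form mapping can be pictured as a translation table from extracted values to a form's field ids. The field names and mapping below are invented for illustration; in practice the mapping is configured per form.

```python
# Hypothetical mapping from the assistant's extracted keys to the
# target form's field ids.
FIELD_MAP = {
    "first_name": "applicant_firstname",
    "last_name": "applicant_lastname",
    "birth_date": "applicant_dob",
}

def map_to_form(extracted):
    """Translate extracted values to form field ids; skip unknown keys."""
    return {FIELD_MAP[k]: v for k, v in extracted.items() if k in FIELD_MAP}

filled = map_to_form({"first_name": "Ada", "birth_date": "1990-05-01", "note": "ignored"})
print(filled)  # {'applicant_firstname': 'Ada', 'applicant_dob': '1990-05-01'}
```

Keys the form does not know are silently dropped, so the assistant can extract more than a given form needs without breaking the submission.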
CMS Plugins
WordPress (available)
Ready-to-use plugin with graphical configuration interface. Installation via WordPress admin, branding customisation without code changes.
- All branding options (colours, texts, logos)
- Multilingual configuration
- Page-level visibility control
Other CMS (on request)
Integrations for Drupal, Joomla and Shopware are planned. Contact us for your specific use case.
Performance & Scalability
Performance
- Response times: Typically 3–5 seconds for complex queries
- Caching: Intelligent caching for frequent queries
- Streaming: Responses are streamed in real-time
- Availability: 24/7 operation with automatic failover
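The caching point above can be illustrated with a small time-to-live cache: repeated queries are served from memory until the entry expires. This is a sketch of the general technique; the TTL value and class are assumptions, not FINO's actual configuration.

```python
import time

class TTLCache:
    """Minimal TTL cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self._store.pop(key, None)  # expired or missing
        return None

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=60)
cache.set("residence permit fees", "cached answer")
print(cache.get("residence permit fees"))  # cached answer
```

Serving frequent queries from such a cache keeps typical response times well below the 3-5 seconds a full retrieval-and-generation pass needs.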
Scalability
- Horizontal: Automatic scaling during peak loads
- Multi-tenant: Hundreds of tenants on one infrastructure
- Modular: Individual components independently scalable
- From pilot to production: Same architecture, different sizing
Technical questions?
We are happy to explain the architecture in detail and show how FINO fits into your system landscape.