HELVETIC AI
Independent AI evaluation for Swiss enterprises.
Location: Bern, Switzerland
Founded: 2026
System: Inspect AI · Compl-AI · Swiss-Bench
Services: AI Compliance & AI Performance
Focus: Swiss SMEs & corporates

AI is already in production, but nobody evaluates it independently.

50% of Swiss financial institutions already use AI, and 91% of those use generative AI. Yet governance has not kept pace: only half have incorporated AI into an explicit strategy.

The EU AI Act is expected to require technical compliance evidence for high-risk systems from December 2027. FINMA already expects traceable model validation. But there is no Swiss evaluation infrastructure and no independent auditors in the mid-market segment.

  • FINMA survey (published April 2025): of ~400 surveyed financial institutions, half use AI; the governance gap is significant.
  • Stanford study (2024): 58% hallucination rate in legal AI analysis.
  • Asai et al. (Nature, 2026): LLMs hallucinate citations 78–90% of the time. When models cite legal articles, they fabricate references the majority of the time.
  • EU AI Act Digital Omnibus: pushes high-risk deadlines to December 2027 (Annex III) and August 2028 (Annex I).
                 Traditional AI Audit      Helvetic AI
Timeline         3–6 months                5–10 days
Cost             CHF 200K+ (Big Four)      from CHF 8,000
Methodology      Proprietary black box     Reproducible, evidence-based
Basis            Opinion-based             Evidence-based, systematic benchmarks
Independence     Vendor relationships      No commissions, no pay-for-score
  • 50% of Swiss financial institutions already use AI
  • 91% of those use generative AI; governance lags behind
  • Dec. 2027: EU AI Act high-risk deadline (Annex III)
  • 5–10 days from discovery call to finished evaluation report
Sources: FINMA AI Survey (published April 2025), EU AI Act Digital Omnibus 2025
System Foundation & Compliance
Inspect AI (UK AISI) · Compl-AI (ETH Zurich) · Swiss-Bench · nDPA · EU AI Act · FINMA · Swiss Company
Inspect AI: UK AI Safety Institute · Compl-AI: ETH Zurich / INSAIT · Swiss-Bench: in-house Swiss language benchmarks

One evaluation system: independent, reproducible, Swiss-specific.

Our system combines Inspect AI (UK AI Safety Institute), Compl-AI (ETH Zurich), and Swiss-Bench (proprietary Swiss benchmarks). Every model receives a HAAS (Helvetic AI Assurance Score) across 6 dimensions, with confidence intervals and comprehensive benchmark evidence.

HAAS Score

6 dimensions: Performance (incl. hallucination rate), Robustness, Safety, Compliance, Swiss Language, Documentation. Each dimension scored 0–100 with confidence intervals.
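The HAAS scoring internals are not published; as a rough illustration of how six 0–100 dimension scores could be reported with confidence intervals, the sketch below aggregates per-scenario results and computes a percentile-bootstrap interval per dimension. The function name `haas_aggregate`, the dimension keys, and the bootstrap method are assumptions for illustration, not the actual HAAS methodology.

```python
import random
import statistics

# Hypothetical dimension keys mirroring the six HAAS dimensions.
DIMENSIONS = ["performance", "robustness", "safety",
              "compliance", "swiss_language", "documentation"]

def haas_aggregate(scores: dict[str, list[float]], n_boot: int = 1000,
                   seed: int = 0) -> dict[str, tuple[float, float, float]]:
    """Return (mean, ci_low, ci_high) per dimension.

    scores maps each dimension to per-scenario scores on a 0-100 scale;
    the 95% interval comes from bootstrap resampling of those scores.
    """
    rng = random.Random(seed)
    result = {}
    for dim in DIMENSIONS:
        samples = scores[dim]
        # Resample with replacement and collect the resampled means.
        boot_means = sorted(
            statistics.fmean(rng.choices(samples, k=len(samples)))
            for _ in range(n_boot)
        )
        result[dim] = (statistics.fmean(samples),
                       boot_means[int(0.025 * n_boot)],
                       boot_means[int(0.975 * n_boot)])
    return result
```

With enough scenarios per dimension, the interval width itself becomes a useful signal: a wide interval on, say, Swiss Language suggests the benchmark needs more scenarios before the score is decision-grade.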

Reproducible Methodology

Every evaluation follows a documented, reproducible methodology. You receive comprehensive benchmark evidence and detailed scoring breakdowns with every engagement.

Independence

No commercial relationships with any AI model provider. No referral fees. No vendor partnerships. No pay-for-score. Every model is evaluated equally.

Data Sovereignty

5 handoff modes: benchmark intelligence (standard), API key, Docker on your infrastructure, hardware on-site, anonymize-first. You choose.

Air-gapped evaluation available. For FINMA-regulated institutions and high-security environments: we bring the evaluation to you on dedicated hardware. No data leaves your premises. See all data handoff modes →
Swiss-Bench Leaderboard: How do leading AI models perform on Swiss-specific tasks in DE/FR/IT? See 11 models ranked across 436 scenarios — updated quarterly. View Swiss-Bench →

How Swiss companies use independent AI evaluation.

Compliance

AI Model Validation for Banks

A regional bank validates its credit risk model against FINMA Guidance 08/2024, with HAAS Score and gap analysis for the board.

Compliance

Pre-Certification for High-Risk Systems

An insurer has its AI-based claims management tested against 27+ Compl-AI benchmarks: technical compliance evidence for the proposed December 2027 deadline.

Performance

Model Selection with Data, Not Opinions

A company evaluates 5 AI models for Swiss legal texts. Reproducible benchmarks show which model actually handles Swiss administrative German (Verwaltungsdeutsch), French, and Italian.

Performance

Fact-Checking for GenAI Systems

A financial services firm measures its AI chatbot's hallucination rate on Swiss regulatory questions. Quantified results: which topics are reliable, where does the model fabricate facts?
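To make "hallucination rate" concrete: one simple way to quantify it is to check each chatbot answer against a vetted whitelist of citations and count the answers that cite anything outside it. The sketch below does this for Swiss-style legal article references; the regex, the function name `hallucination_rate`, and the whitelist approach are illustrative assumptions, not Helvetic AI's actual pipeline, and real citation extraction is considerably more involved.

```python
import re

# Matches references like "Art. 8 DSG" or "Art. 97a OR" (simplified).
ARTICLE_RE = re.compile(r"Art\.\s*\d+[a-z]?\s+[A-Z][A-Za-z]*")

def hallucination_rate(answers: list[str], valid_articles: set[str]) -> float:
    """Share of answers citing at least one article not in valid_articles."""
    if not answers:
        return 0.0
    flagged = 0
    for answer in answers:
        cited = set(ARTICLE_RE.findall(answer))
        # Any citation outside the vetted set counts the answer as flagged.
        if cited - valid_articles:
            flagged += 1
    return flagged / len(answers)
```

Run per topic area rather than globally: a single aggregate rate hides exactly the question the use case asks, namely which topics are reliable and where the model fabricates.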

Compliance

AI Threat Detection in Cybersecurity

A SOC team evaluates whether their AI-powered threat detection system meets EU AI Act high-risk requirements and FINMA operational resilience standards. Compliance evidence for the security operations board.

Compliance

Medical AI in Health & Pharma

A pharmaceutical company validates its AI-assisted drug interaction checker against EU AI Act Annex III medical device requirements, with multilingual Swiss patient safety testing in DE/FR/IT.

Performance

Cybersecurity Incident Intelligence

A managed security provider benchmarks 5 AI models for Swiss-German incident report generation and threat intelligence summarization. Which model produces actionable SOC reports?

Performance

Clinical Documentation in Healthcare

A hospital group evaluates AI models for medical record summarization in DE/FR/IT. Hallucination rates on Swiss clinical terminology and patient safety as key metrics.

From discovery call to finished evaluation report.

Our process minimizes your effort and maximizes clarity. View full methodology →

1
Scoping
We define evaluation objectives, models, and benchmarks together. No preparation needed.
1 hour
2
Configuration
We configure the evaluation pipeline for your models, data, and compliance requirements.
2–4 hours
3
Evaluation
We run the benchmarks: HAAS Score, Swiss language quality, EU AI Act compliance, domain-specific scenarios.
3–8 business days
4
Handoff
You receive the evaluation report with HAAS Scores, gap analysis, recommendations, and a findings call.
Report delivery

Fatih Uenal, PhD

I build AI systems for regulated Swiss enterprises and have seen the governance gap first-hand. Studies show over 75% of employees use AI tools without formal approval. The large consultancies ignore SMEs, the tools are too expensive, and regulation is tightening.

Helvetic AI closes that gap with independent evaluation, Swiss infrastructure, and the principle that AI can be deployed safely when you have the right evidence.

  • Research: PhD Political Science (HU Berlin), Postdoc Harvard & Cambridge
  • Technology: MSc Computer Science (CU Boulder), MITx Statistics & Data Science
  • Practice: AI systems & security operations in regulated Swiss infrastructure
  • Location: Bern, Switzerland

Ready for an independent evaluation?

Start with an AI Risk Classification or a full AI Model Evaluation. Within one to two weeks you'll know where your AI systems stand: evidence-based, not opinion-based.

Risk Classification from CHF 3,000 · Evaluation from CHF 8,000 · FINMA Validation from CHF 15,000 · All services
contact@ai-helvetic.ch