Everyone is talking about agentic AI, AI-enabled underwriting, and intelligent workflows. But most of that conversation skips the elephant in the room: the data. Without clean, connected, well-structured data foundations, none of it works. That was the central message of a recent DataArt and AWS webinar, Data First: Building the Foundation AI Needs in Specialty Insurance, hosted by Ed Simmons of Branch Brook Advisors. The panel featured Khalid Desai, Chief Data Officer at Allianz Global Corporate & Specialty (AGCS), Arno de Wever, Head of Commercial P&C Insurance at Amazon Web Services, and Oliver Parker, CTO of Financial Services at DataArt, the practitioners who have spent years trying to make AI actually land in commercial and specialty insurance.
Key Takeaways
- Metadata and data lineage matter more than the latest AI acronym. Without them, your models are built on sand.
- A medallion-style architecture (bronze, silver, gold) brings order to underwriting and claims chaos by separating raw ingestion from governed, business-ready data.
- AI doesn’t fix broken processes; it accelerates them. The insurance data foundation must come first.
- Even organizations that consider themselves AI leaders still struggle when their data pipelines are clogged with legacy schemas and one-off fixes.
- Business glossaries and domain ownership are the unglamorous work that separates genuine AI leaders from the ones still chasing pilots.
- Start small, start with external data, and build incrementally. The future of insurance underwriting in the UK specialty market is earned through disciplined data management.
Why an AI-Ready Data Strategy is Non-Negotiable for Insurance Transformation
Start with a simple diagnostic. Ask your team to define “losses by broker” and pull the number from three different systems. You will get three different numbers, and all of them will be technically correct because they answer three different questions. That is the data problem in specialty insurance, right there.
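To make the diagnostic concrete, here is a minimal sketch of how one question can yield three defensible numbers. The claim records, field names, and the three definitions of "losses" (paid, gross incurred, net incurred) are illustrative assumptions, not figures or definitions from the webinar.

```python
from collections import defaultdict

# Hypothetical claim records; field names and amounts are invented for illustration.
CLAIMS = [
    {"broker": "Broker A", "paid": 120_000, "outstanding": 30_000, "ri_recovery": 50_000},
    {"broker": "Broker A", "paid": 40_000, "outstanding": 0, "ri_recovery": 0},
    {"broker": "Broker B", "paid": 10_000, "outstanding": 90_000, "ri_recovery": 20_000},
]


def paid_losses(claim):
    """Assumed definition 1: cash paid out so far."""
    return claim["paid"]


def incurred_losses(claim):
    """Assumed definition 2: paid plus outstanding reserves, gross of reinsurance."""
    return claim["paid"] + claim["outstanding"]


def net_incurred_losses(claim):
    """Assumed definition 3: incurred losses net of reinsurance recoveries."""
    return incurred_losses(claim) - claim["ri_recovery"]


def losses_by_broker(claims, definition):
    """Sum losses per broker under whichever definition the caller supplies."""
    totals = defaultdict(int)
    for claim in claims:
        totals[claim["broker"]] += definition(claim)
    return dict(totals)


# Same data, same question, three different answers.
for definition in (paid_losses, incurred_losses, net_incurred_losses):
    print(definition.__name__, losses_by_broker(CLAIMS, definition))
```

None of the three results is wrong. The disagreement comes entirely from which definition each system silently applies.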
Khalid Desai described exactly this scenario based on AGCS’s own experience building its “Talk to Your Data” natural language querying capability. When the claims team first tested it, the results were devastating, and not because the technology failed. The technology delivered, in fact. But when a claims handler asked for attritional losses for accident year 2025, the model hallucinated. It didn’t know what “attritional” meant in claims language. It didn’t understand that “AY” meant accident year. The gap wasn’t artificial intelligence; it was the absence of a shared, governed business vocabulary.
This is the core argument for an AI-ready data strategy: large language models are rather arrogant, as Desai noted during the webinar. They produce an answer with confidence, whether or not the underlying data or business context supports it. Feed them ambiguity, and they will resolve it — wrongly, and at scale.
The only solution is a data foundation that provides the model with the right context, definitions, and lineage to work from. Oliver Parker was direct about what happens when organizations skip this step: “Remarkable technology on top of fragmented data gives you faster, more confident wrong answers.” The pilots succeed in demos and die in production, not because AI doesn’t work, but because the data pipes underneath are clogged with legacy schemas, inconsistent source systems, and definitions that mean different things to different parts of the business.
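One way to picture "providing the model with the right context" is to attach governed glossary definitions to a question before it reaches a language model. The sketch below is a simplified assumption of how that might look, not a description of the AGCS capability: the glossary contents, the `matched_terms` and `build_prompt` helpers, and the prompt wording are all hypothetical.

```python
import re

# Hypothetical glossary. "AY" as accident year comes from the example above;
# the wording of both definitions is illustrative only.
GLOSSARY = {
    "attritional": (
        "Routine, high-frequency losses, excluding large and catastrophe losses "
        "(the exact threshold is organization-specific)."
    ),
    "AY": "Accident year: the year in which the loss event occurred.",
}


def matched_terms(question, glossary):
    """Return glossary terms that appear in the question as whole words."""
    found = []
    for term in glossary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, question, flags=re.IGNORECASE):
            found.append(term)
    return found


def build_prompt(question, glossary):
    """Prepend the governed definitions for every matched term to the question."""
    lines = ["Use only these governed business definitions:"]
    for term in matched_terms(question, glossary):
        lines.append(f"- {term}: {glossary[term]}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


print(build_prompt("What are the attritional losses for AY 2025?", GLOSSARY))
```

The lookup logic is trivial on purpose. What matters is that the definitions exist, are agreed, and are supplied explicitly rather than left for the model to guess.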
The Underwriting Transformation Challenge in the London Market
The London Market and UK specialty insurers face a data landscape that, to put it generously, is complex. London Market insurance technology has advanced considerably, yet the underlying data challenge remains stubborn.
Arno de Wever offered a statistic that stops most rooms cold: the average commercial insurer manages 23 policy administration systems. AGCS, operating at a global scale, works across more than 100 source systems: policy administration, claims platforms, reinsurance tools, financial reporting, and more.
AI in specialty insurance, from reinsurance underwriting AI to automated risk scoring, depends entirely on a solid insurance data foundation. The implications cascade quickly. Financial data tends to be well-curated because regulatory reporting demands it. Underwriting and claims data? Far less so. In some organizations, it is barely available. And because these systems are siloed and built at different times, for different purposes, by different teams, harmonizing them into something an AI model can use reliably is not a technical task. It is an organizational one.
De Wever described the challenge as a sequence problem. He noted that once you need to dip into internal systems to retrieve or validate data, that is where the difficulty starts.
External submission data is actually the easier starting point, precisely because you cannot change it. That gives teams a controlled data set to build against and an early win to demonstrate before touching the more politically charged territory of internal source systems.
What this means for chief data officers and heads of underwriting transformation:
- The order in which you pursue transformation is critical, not just the end goal itself.
- Jumping straight to AI-powered underwriting or automated risk scoring isn’t possible if your existing guidelines are a dense, poorly structured ruleset that models can’t interpret.
- A solid data foundation must be established before any AI strategy is layered on top — not in parallel, and definitely not as an afterthought.
- For underwriting program leads specifically, this means resisting the temptation to pilot AI on top of existing guideline documents before the underlying data has been structured and governed.
Oliver Parker distilled the failure pattern: training a model on dirty data produces confident but wrong answers. He added something that tends to get brushed under the carpet when organizations are chasing the next pilot: AI won’t fix a broken process. It will run that process ten times faster. You’ll get the wrong answer quicker.
Architecting Your Data Foundation on AWS for AI Success
AGCS is currently in the middle of a migration to a central data platform designed for operational and portfolio steering needs. Khalid Desai walked through the architecture they chose — a medallion approach built on AWS — and explained why each layer performs a specific job that the others cannot.
Bronze layer – system of record
- raw data lands here exactly as received — no transformation
- sources include policy data, claims, reinsurance treaties, and financial bordereaux files
- enables full auditability and historical reconstruction via change data capture
- a regulatory requirement in the London Market, not optional
Silver layer – data quality and integration
- owned by data engineers, this is where the unglamorous but critical work happens
- hard rules applied, schema conflicts resolved
- keys linking policies, premiums, and claims across source systems are identified and enforced
- makes it possible to connect systems that were never designed to integrate, which in turn makes the gold layer viable
Gold layer – business context and AI value
- where soft rules, business glossary definitions, and domain-specific logic are applied
- co-owned by data teams and business stakeholders
- defines org-specific terms (e.g., "loss adjusted expense," "attritional")
- the layer where AI extracts the most value
Think of it as cloud-native data architecture for AI insurance, stripped down to what actually works in practice. AWS handles the heavy lifting: scalable, secure infrastructure that doesn’t buckle under pressure.
The medallion pattern brings order to chaos, establishing a governance structure that keeps data clean, traceable, and trustworthy. Put the two together, and you get a data foundation built for real AI deployment in insurance, not a whitepaper concept, but something running live in production.
The same architecture that drives specialty insurance data analytics and enables insurance data management at scale also powers genuine insurance underwriting modernization.
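The division of labor between the bronze, silver, and gold layers can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the AGCS implementation: the record layouts, the key-normalization rule, and the choice of loss ratio as the gold-layer metric are all hypothetical, and no AWS service is modeled.

```python
# Bronze: raw records kept exactly as received (hypothetical sources and fields).
BRONZE_POLICIES = [
    {"source": "pas_uk", "policy_ref": " P-001 ", "premium": "1000"},
    {"source": "pas_de", "policy_ref": "p-002", "premium": "2500"},
]
BRONZE_CLAIMS = [
    {"source": "claims_a", "policy_ref": "P-001", "incurred": "400"},
    {"source": "claims_a", "policy_ref": "P-999", "incurred": "75"},  # no matching policy
]


def normalize_key(ref):
    """Hard rule (assumed): policy keys are trimmed and upper-cased."""
    return ref.strip().upper()


def to_silver(policies, claims):
    """Silver: cast types, normalize keys, and enforce the policy-to-claim link."""
    silver_policies = {
        normalize_key(p["policy_ref"]): float(p["premium"]) for p in policies
    }
    linked, rejected = [], []
    for claim in claims:
        row = {
            "policy_key": normalize_key(claim["policy_ref"]),
            "incurred": float(claim["incurred"]),
        }
        # Claims that cannot be tied to a known policy are set aside, not dropped silently.
        (linked if row["policy_key"] in silver_policies else rejected).append(row)
    return silver_policies, linked, rejected


def to_gold(silver_policies, linked_claims):
    """Gold: apply a business-owned definition, here loss ratio = incurred / premium."""
    incurred = {}
    for claim in linked_claims:
        key = claim["policy_key"]
        incurred[key] = incurred.get(key, 0.0) + claim["incurred"]
    return {
        key: incurred.get(key, 0.0) / premium
        for key, premium in silver_policies.items()
    }


policies, linked, rejected = to_silver(BRONZE_POLICIES, BRONZE_CLAIMS)
print("gold loss ratios:", to_gold(policies, linked))
print("rejected at silver:", rejected)
```

The bronze records are never modified, so the raw history stays available for audit. Each later layer is derived from the one before it.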
One pragmatic note from Arno de Wever on scope: resist the instinct to backfill 30 years of historical data before you begin. Most of that data is irrelevant to the processes you are running today. Start collecting data in the right structure now, and in six months you will have six months of clean, usable data.
From Data Silos to Intelligent Underwriting
The hardest part of insurance data management modernization is rarely technical. It is mostly cultural and organizational. Getting underwriters, claims handlers, and finance teams to agree on what “losses” means is, as Oliver Parker put it, the equivalent of agreeing on what “done” means before you start a project.
Key implications:
- The old “lock people in a room” or “last person standing” approach to reaching data consensus doesn’t scale. Neither do lengthy workshops that produce massive data dictionaries no one reads.
- Start with the 5 metrics the CFO uses to run the business, define them in plain language, get domain-owner sign-off, then build outward from there.
- A business glossary isn’t a one-time artifact. It needs assigned owners and regular updates, and it must keep pace with how the business actually operates.
- Companies that maintain it well end up with data foundations that stay grounded in reality and don’t drift over time.
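The points above about assigned owners and regular updates can be expressed as a small check. This is a minimal sketch under assumptions: the `GlossaryEntry` fields, the sample entries, and the 180-day review window are hypothetical choices for illustration, not a standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class GlossaryEntry:
    term: str
    definition: str
    owner: str           # accountable domain owner; empty string means unassigned
    last_reviewed: date


def needs_attention(entries, today, max_age_days=180):
    """Return terms that have no owner or were last reviewed before the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        entry.term
        for entry in entries
        if not entry.owner or entry.last_reviewed < cutoff
    ]


# Hypothetical entries; definitions are placeholders.
ENTRIES = [
    GlossaryEntry("gross written premium", "placeholder", "Finance", date(2025, 5, 1)),
    GlossaryEntry("attritional loss", "placeholder", "Claims", date(2024, 3, 1)),
    GlossaryEntry("losses by broker", "placeholder", "", date(2025, 6, 1)),
]

print(needs_attention(ENTRIES, today=date(2025, 6, 30)))
```

A check like this only surfaces drift. Resolving it still requires the domain owner to review and sign off on the definition.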
On data governance models, Parker offered a nuanced view of the centralized-versus-federated question. Federated governance is often right for specialty insurance: what matters for marine cargo is genuinely structurally different from personal lines, and centrally imposed definitions across those domains often fail. But federated governance can also become an excuse for avoiding the hard calls.
The model that works is hub-and-spoke: standards and minimum quality bars are set centrally, while definitions and metrics are owned at the business-line level.
Khalid Desai reinforced the point about business ownership from his experience at Allianz. The more successful data teams, he observed, are those that sit close to the core.
Data domain owners from each business function should define harmonization rules for their data, not the data team. Desai added one more piece of advice he would give his earlier self: start the business glossary at least a year before any data transformation initiative begins. It is the bottleneck that derails more initiatives than any technology constraint, and it takes longer than anyone expects.
Arno de Wever offered a practical answer to the question of where to start: begin with stakeholder alignment, not technology. The conversation should start with the CEO and CFO, who will see the first benefit from well-governed financial data, and then expand to the Chief Underwriting Officer, Chief Claims Officer, and COO. Technology selection comes after business ownership is established, not before.
Accelerating Your Underwriting Transformation Journey
The panel closed with a point that is easy to say and hard to internalize: the data foundation is not a project. It is an ongoing, evolving capability. The insurers that are furthest ahead are those who decided three or four years ago where they wanted to be, worked backwards, and started building. They are not waiting for the AI models to mature. They are making sure that when those models are ready, the data underneath is ready too.
AI hype comes and goes. Solid data foundations built on proven cloud infrastructure do not.
Specialty insurance market trends point consistently in one direction: the companies investing in data now will lead the next decade. Get the data architecture, data lineage, and data quality right, and the AI layer (agents, data-driven underwriting in the UK and beyond, intelligent claims triage) finally starts to work. Skip it, and you will keep running pilots that produce impressive demos and die in production.
DataArt partners with UK specialty insurers on what digital transformation in UK insurance demands: designing and building AI-ready data foundations on AWS that AI teams can build on, from medallion architecture and data governance frameworks for AI to business glossary development and the operating model changes that make underwriting transformation stick. A clear data strategy for insurance is the starting point. Whether you are still mapping your data estate or have pilots that are not making it to production, DataArt’s team of professionals can help you define a cloud strategy for insurers that works, implement AI solutions for insurers at scale, and navigate AI implementation in insurance end-to-end.