Our proof of concept with multi-party computation

Introduction
As insurers leverage cutting-edge technologies like machine learning, artificial intelligence, and third-party data integrations to improve their data-driven decision-making, they face the critical task of keeping sensitive data secure and under their control, which is essential for maintaining customer trust and complying with regulatory requirements. A promising solution to this challenge is a collaborative intelligence ecosystem, in which multiple parties form a shared knowledge pool and learn collectively and securely through multi-party computation. What's intriguing is that, despite the potentially large number of participants in this distributed learning ecosystem, each participant's sensitive data and information remain private and undisclosed. Fascinating, isn’t it?
The Integrated Analytics team at Munich Re Life North America investigated the capabilities of this cutting-edge technology. As part of this exploration, we partnered with an external vendor to conduct a proof of concept (POC) that evaluated the performance, accuracy, efficiency, and scalability of their multi-party computation platform. In this article, we will delve into the details of our POC experience, sharing key insights and takeaways from our assessment of this innovative technology. We believe it has the capacity to unlock significant opportunities in the realms of computing and analytics within the insurance industry.
Purpose
A vast amount of data is generated daily, with its growth accelerating exponentially due to technologies like social media, cloud services, and the Internet of Things (IoT). While this data may hold significant value for businesses, concerns about privacy, trust, and other risks often leave much of it inaccessible. This is true for many fields, and especially those such as insurance, where personally identifiable information (PII) is prevalent and preserving privacy is a top priority.
However, advancements in privacy-preserving technologies, such as multi-party computation, offer a promising solution to these challenges by providing a more secure framework for collaborative research. Multi-party computation is a cryptographic technique that allows multiple parties to jointly compute a function on their private inputs without revealing those inputs to each other. Our goal for this POC was to evaluate the technology’s potential to address challenges related to data transfers, collaboration, and vulnerability to bad actors in Munich Re insurance use cases.
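To make the idea concrete, here is a minimal sketch of one classic multi-party computation building block, additive secret sharing, written in Python. The modulus, party count, and input values are illustrative assumptions of ours, not part of the vendor's platform; production systems rely on vetted cryptographic libraries and protocols.

```python
import secrets

# Toy additive secret sharing: three parties learn the sum of their private
# inputs without any party revealing its own value. Illustrative only.
Q = 2**61 - 1  # toy modulus; real protocols use carefully chosen parameters

def share(value, n_parties=3):
    """Split a private integer into n random shares that sum to it mod Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

# Each party secret-shares its private input and distributes the shares,
# so that party i ends up holding the i-th share of every input.
private_inputs = [1200, 3400, 560]
all_shares = [share(v) for v in private_inputs]

# Each party locally adds the shares it holds; combining the partial sums
# reveals only the total, never the individual inputs.
partial_sums = [sum(column) % Q for column in zip(*all_shares)]
print(sum(partial_sums) % Q)  # -> 5160
```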

Innovative techniques such as multi-party computation can provide a reliable way to enhance business capabilities, enabling secure and private insight sharing and analysis across multiple stakeholders while maintaining data confidentiality and integrity.
How does the technology work?
The vendor we collaborated with for the POC is experienced in developing and deploying privacy-enhancing technologies for commercial use through encryption technologies and has pioneered cryptographic computation technology for secure, privacy-preserving analytics and machine learning. Their product enables analytics on sensitive data distributed across teams, organizations, and regulated jurisdictions by deploying privacy zones (on-premises or in the cloud) within the infrastructure and network of each party that wants to share insights, but not any data, with the others.
Through these privacy zones, the data stays local within each party’s respective network. A computation runs on each party’s local network against its own data and sends only updates to a central server, which aggregates those updates to compile results across parties, effectively keeping sensitive information private within each local network.
For example, if the goal is to train a global machine learning model using insights from different sources, each party/source would train a local model on its own data and only send the updated model parameters (such as the weight, or ‘importance’ of a particular variable) to the central server in order to improve a global model. The central server, in the case of our POC, was a part of a cloud-based Software as a Service (SaaS) framework provided by the vendor.
The previously mentioned example illustrates a concept called federated learning, a decentralized machine learning approach in which multiple parties collaboratively train a model while each party keeps its data local. Federated learning leverages multi-party computation principles to ensure privacy during model training: instead of sharing raw data, parties share model updates or encrypted data, allowing for secure computation and aggregation of model parameters.
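The sketch below illustrates this federated-averaging pattern in Python. The function names and the simple linear model are our own illustration, not the vendor's implementation: each party runs a few local training steps on its own data and returns only its parameters, which a coordinator averages into the global model.

```python
import numpy as np

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    """One party's step: gradient descent for a linear model, run entirely
    inside that party's privacy zone; only the weights are returned."""
    w = global_weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(parties, n_features, rounds=20):
    """Coordinator: send the global weights out, average the updates back."""
    w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(w, X, y) for X, y in parties]
        w = np.mean(updates, axis=0)  # raw X and y never reach this point
    return w

# Two synthetic parties holding separate rows of the same feature space.
rng = np.random.default_rng(0)
true_w = np.array([0.5, -1.0, 2.0])
parties = []
for _ in range(2):
    X = rng.normal(size=(200, 3))
    parties.append((X, X @ true_w + rng.normal(scale=0.1, size=200)))

print(federated_average(parties, n_features=3))  # close to true_w
```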
Method of evaluation
We ran the POC in two phases. Phase I assessed the basic functionality of multi-party computation within the vendor’s sandbox environment. Phase II involved running a synthetic two-party computation within Munich Re’s cloud environment. In the second phase, we trained a popular advanced machine learning algorithm on the multi-party computation platform. This allowed us to securely leverage insights from our internal historical data, hosted in one privacy zone, and enhance it with externally acquired sociodemographic data, hosted in a different privacy zone. Across the two phases, we were able to draw insights from both datasets while preserving privacy, since no data ever left its secure computation zone. Our goal was to evaluate the predictive accuracy of the resulting model, the lift gained from increasing sample size and features, and the ease of setup and efficacy of the privacy-preserving computation process.
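Conceptually, the lift comparison resembles the sketch below: train a gradient-boosted classifier first on one feature set, then on the enriched set, and compare the resulting AUC. The synthetic data and column split are hypothetical stand-ins for the internal and external feature sets; in the actual POC the joint model was trained across privacy zones rather than on a single combined table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins: the first five columns play the role of internal
# historical features, the last five the externally acquired
# sociodemographic features.
X, y = make_classification(n_samples=5000, n_features=10, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def auc_using(columns):
    """Fit on a subset of columns and score AUC on the held-out set."""
    model = GradientBoostingClassifier(random_state=0)
    model.fit(X_tr[:, columns], y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te[:, columns])[:, 1])

baseline = auc_using(list(range(5)))    # internal features only
enriched = auc_using(list(range(10)))   # internal + external features
print(f"AUC lift from external features: {enriched - baseline:.3f}")
```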
Our evaluation was split into four broad categories: (1) functionality, (2) accuracy, (3) efficiency, and (4) scalability. For each category, we assessed the following areas:
Functionality
Preservation of data privacy and security behind each player’s firewall
Dedicated functions to join private data (private set intersections; see the sketch after this list)
Granular data privacy controls
Availability of complex operations for data preparation
Functionality to run advanced algorithms
Accuracy
Accuracy of results from running statistical functions, algorithms, and data processing operations
Efficiency
Speed and quality of technical support
Speed and efficiency in running functions and algorithms
Extent of coordination required among players for data processing and modeling
Scalability
Potential to scale up for faster and optimized multi-thread processing
Level of data engineering expertise required for initial setup
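One of the functionality items above, private set intersection, can be illustrated with a toy Diffie-Hellman-style protocol in Python. This is our own simplified sketch (toy modulus, simplified hash-to-group, semi-honest parties, single process), not the vendor's implementation; it shows how two parties can discover which policy identifiers they share without exposing the rest of their sets.

```python
import hashlib
import secrets

# Toy parameters: a Mersenne prime modulus and SHA-256 as a simplified
# hash-to-group map. Real deployments use vetted libraries and parameters.
P = 2**127 - 1

def h(identifier: str) -> int:
    """Hash a record identifier to an integer modulo P."""
    return int.from_bytes(hashlib.sha256(identifier.encode()).digest(), "big") % P

def private_set_intersection(set_a, set_b):
    """Party A learns which identifiers both parties hold; neither side
    sees the other's full set (semi-honest sketch)."""
    a = secrets.randbelow(P - 2) + 1  # Party A's secret exponent
    b = secrets.randbelow(P - 2) + 1  # Party B's secret exponent

    # Party A blinds its identifiers and sends them to Party B.
    a_blinded = [pow(h(x), a, P) for x in set_a]
    # Party B raises A's values to its own exponent and returns them,
    # along with its own blinded identifiers.
    a_double = [pow(v, b, P) for v in a_blinded]
    b_blinded = [pow(h(y), b, P) for y in set_b]
    # Party A raises B's values to its exponent; equal values mark the
    # identifiers both parties hold.
    b_double = {pow(v, a, P) for v in b_blinded}
    return [x for x, v in zip(set_a, a_double) if v in b_double]

print(private_set_intersection(["policy-123", "policy-456"],
                               ["policy-456", "policy-789"]))
# -> ['policy-456']
```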
Potential use cases
A POC cannot be considered truly successful without identifying viable use cases. For multi-party computations, we categorize the potential use cases into two broad groups:
- Internally within larger companies: Utilize data across entities/departments within a company to increase data size and features for analytics or model development.
Benefit: This alleviates data residency requirements across business units and potentially borders, and helps meet existing and evolving regulations.
- Externally with partners and third-party data vendors: Fast and efficient third-party data evaluation potentially involving data overlap analysis, model improvement with new features, etc.
Benefit: Raw data is never shared, and sensitive information is protected, mitigating the risk of compromising data security and privacy.
Based on our experience with multi-party computation, we believe the technology holds significant potential. This approach offers an efficient way to meet existing privacy compliance requirements and data-sharing best practices while building collaborative intelligence ecosystems within, as well as across, companies. It's only a matter of time before companies use multi-party computation frameworks to enhance their informational edge.
For further reading:
Evans, David; Kolesnikov, Vladimir; Rosulek, Mike (2018). "A Pragmatic Introduction to Secure Multi-Party Computation." securecomputation.org.
World Economic Forum: The Next Generation of Data-Sharing in Financial Services: Using Privacy Enhancing Techniques to Unlock New Value. This report explores privacy enhancing techniques (PET) and their ability to unlock new value in the financial services industry by facilitating new forms of data sharing.
NVIDIA Blog: What Is Federated Learning?
Google Blog: Federated Learning: Collaborative Machine Learning without Centralized Training Data. https://ai.googleblog.com/2017/04/federated-learning-collaborative.html