
In a world where artificial intelligence is rapidly reshaping industries, one obstacle has consistently stood in the way of progress: access to the right training data.
Seeking a solution to this challenge, four entrepreneurs—Bobby Samuels, Travis May, Engy Ziedan, and Richard Ho—founded Protege. The company was born from a simple yet powerful belief: unlocking proprietary, high-quality data could ignite the next wave of AI breakthroughs.
When a Spark Turned into a Business
By early 2024, Bobby, Travis, Engy, and Richard had come together, determined to address a familiar challenge: they had each experienced how slow and complicated negotiations around training data could be. Travis, a serial entrepreneur known for co-founding LiveRamp and Datavant and leading them as CEO, and Bobby had both seen the urgency of the problem firsthand.
In May 2024, their vision took shape. Bobby and Travis joined forces with Engy Ziedan (Chief Scientific Officer) and Richard Ho (CTO) to define the problem and build the solution with Protege.
From this fierce spark emerged Protege’s inaugural mission: to build a privacy- and security-first platform that returns control to data holders, empowering them while enabling AI developers to innovate faster and more responsibly.
From Recognizing a Gap to Building a Solution
The founders realized that while algorithms and computing power had matured, data remained fragmented—locked behind legal, intellectual property, or privacy barriers. After recognizing this clear gap in the AI stack, they didn’t sit still.
Instead, they rolled up their sleeves and spent countless days and nights brainstorming and building a platform designed to help companies train artificial intelligence (AI) systems more effectively.
Protege offers the tools and services needed to collect, organize, and label data—essential steps for AI to learn and improve. By making high-quality training data easier to access and manage, Protege enables businesses to build smarter, more accurate AI models faster and more efficiently.
What Protege Actually Does
At its core, Protege functions as a “data layer for AI training,” combining a marketplace with governance infrastructure. It connects data holders, such as healthcare institutions, media broadcasters, and research organizations, with AI developers eager to build better models.
For data owners, Protege offers a secure and compliant way to license or share their proprietary datasets, complete with governance safeguards, intellectual property protections, and clear contractual terms.
Developers and model builders, in turn, gain curated access to hard-to-find, high-quality datasets, enabling faster discovery, licensing, and integration without the lengthy bureaucratic delays that typically accompany such negotiations.
Protege Expands Its Footprint
In September 2024, Protege raised a $10 million seed round led by CRV, with participation from SV Angel, Liquid 2 Ventures, Bloomberg Beta, Flex Capital, Adam D’Angelo, Travis May, and others.
Less than a year later, in August 2025, they secured a $25 million Series A, led by Footwork with continued support from early investors. This funding has fueled significant growth—deepening product capabilities, expanding into new verticals, and forging key partnerships across healthcare, media, and beyond.
Today, Protege’s footprint spans over 100 data partners and billions of data points—including more than 300,000 hours of video, 500,000 hours of audio, billions of clinical notes, and hundreds of millions of medical images.
Most recently, they’ve expanded into two new verticals: Audio & Speech and Motion Capture—further broadening the scope of what’s possible in AI training.
Envisioning a Future Where Data Powers the AI Era
The founders of Protege envision the company becoming the connective tissue between proprietary data and AI builders everywhere, an infrastructure layer as essential as computation or algorithms in the AI stack.
This isn’t just about making training data easier to access; it’s also about enabling thoughtful AI solutions by ensuring those data assets are used transparently, safely, and with respect for their origins.
Bridge Between Human Knowledge & Machine Learning
At its core, Protege is a story of people who see data not as a commodity, but as the key to unlocking better AI—for healthcare, media, safety, and beyond. Bobby Samuels, Travis May, Engy Ziedan, and Richard Ho—driven by experience, empathy, and enterprise—have built more than a platform.
The pioneers have built a bridge between human knowledge and machine learning, enabling AI that can truly understand—and serve—the human world.