DailyGlimpse

Fetch Builds In-House AI on AWS with Hugging Face, Slashing Dev Time by 30%

April 26, 2026 · 5:05 PM

Fetch, a consumer rewards company with over 18 million monthly active users, has overhauled its AI infrastructure by moving from third-party solutions to custom-built tools on AWS, guided by Hugging Face. The result: a 30% reduction in development time, 50% lower latency, and the ability to process 11 million receipts daily with richer insights.

Facing limitations with a third-party AI “black box” that provided minimal control or data granularity, Fetch hired computer vision scientist Boris Kogan to build an in-house machine learning team. The company’s entire infrastructure runs on AWS, and with Hugging Face’s Expert Acceleration Program on AWS Marketplace, Fetch gained hands-on guidance to train state-of-the-art document AI models using Hugging Face Transformers and Amazon SageMaker.

“Easy access to Transformers models is something that started with Hugging Face, and they’re great at that,” said Kogan. Hugging Face acted as an advisor, teaching Fetch engineers to use resources effectively. “They wanted to learn how to use Hugging Face to train the models they were building. We showed them how to use the resources, and they ran with it,” said Yifeng Yin, machine learning engineer at Hugging Face.

Before switching over, Fetch built a shadow pipeline that reprocessed all 11 million daily receipts in parallel with the old system, then audited the results thoroughly. The new system uses AWS Inferentia accelerators for low-cost deep learning inference. Fetch reports two key benefits: processing latency fell by 50%, and model training time dropped from days to hours.
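For readers unfamiliar with the shadow-pipeline pattern, here is a minimal Python sketch of the general idea: every input goes through both the existing system and the candidate system, only the existing system's output is served, and disagreements are logged for audit. This is an illustration under stated assumptions, not Fetch's actual implementation; the names `run_shadow`, `ShadowReport`, `legacy_extract`, and `candidate_extract`, along with the receipt fields, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical types: a receipt is a plain dict, an extractor maps it to fields.
Receipt = Dict[str, Any]
Extractor = Callable[[Receipt], Dict[str, Any]]


@dataclass
class ShadowReport:
    """Running tally of how often the candidate agrees with the legacy system."""
    total: int = 0
    matches: int = 0
    mismatches: List[Dict[str, Any]] = field(default_factory=list)

    @property
    def agreement_rate(self) -> float:
        return self.matches / self.total if self.total else 0.0


def run_shadow(receipts: Iterable[Receipt],
               legacy: Extractor,
               candidate: Extractor) -> ShadowReport:
    """Run both extractors on every receipt and record disagreements."""
    report = ShadowReport()
    for receipt in receipts:
        report.total += 1
        served = legacy(receipt)  # the legacy output is still what users would see
        try:
            shadow = candidate(receipt)
        except Exception as exc:
            # A crash in the shadow path must never affect the served result.
            shadow = {"error": repr(exc)}
        if shadow == served:
            report.matches += 1
        else:
            report.mismatches.append({
                "receipt_id": receipt.get("id"),
                "legacy": served,
                "candidate": shadow,
            })
    return report


# Toy stand-ins for the two systems (assumptions for illustration only).
def legacy_extract(receipt: Receipt) -> Dict[str, Any]:
    return {"total": receipt["raw_total"]}


def candidate_extract(receipt: Receipt) -> Dict[str, Any]:
    return {"total": round(receipt["raw_total"], 2)}


sample = [{"id": 1, "raw_total": 12.5}, {"id": 2, "raw_total": 3.999}]
report = run_shadow(sample, legacy_extract, candidate_extract)
print(report.agreement_rate, report.mismatches)
```

In a production setting the mismatch log would typically be sampled and reviewed by people before any cutover, since a disagreement can mean the new model is wrong or that it is catching something the old system missed.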

“We’ve improved responsiveness for customers with faster processing of uploads, cutting processing latency by 50%,” said Sam Corzine, lead machine learning engineer at Fetch. “By building our own models, we get details we never had before.”

The company allocated 12 months for the project but completed it in just eight months because resources were always available on demand through AWS.