
From Small-Town Punjab to YC: How Sanchit & Shubham Are Building RunAnywhere to Put AI on Every Device

By unifying fragmented on-device tools into a single open platform, Sanchit Monga and Shubham Malhotra are enabling fast, private AI to run on everything from phones and laptops to $20 edge devices

FOUNDERS
Sanchit Monga & Shubham Malhotra

Sanchit grew up in Talwandi Bhai, a tiny town in Punjab near the Pakistan border—no traffic lights, a couple of shops, and a rice mill his dad still runs today. Technology felt very far away.

His childhood was defined by motion. Because of family business needs, he switched schools around five times and moved across towns and even states. That instability forced him to adapt quickly, drop into new environments, and figure things out from scratch.

The entrepreneurial template came from his uncle—his dad’s younger brother—who left home with “literally $10 in his pocket,” moved to the city, and over years built a sizeable software company in India. He also sponsored Sanchit’s education.

That combination—small-town roots, a self-made operator in the family, and a front-row seat to what software can unlock—shaped Sanchit’s ambitions early. He didn’t want to be told what to work on; he wanted to build.

Originally, the track looked like the standard Indian “IIT or bust” grind. But a family friend suggested another path: apply abroad. Sanchit took the shot, landed at the Rochester Institute of Technology, and studied software engineering/computer science. From there he did the classic move: RIT → Silicon Valley → mobile engineer.

At Intuit, he was shipping SDKs used on millions of devices—exactly the vantage point where the limitations of today’s cloud‑centric AI became impossible to ignore.

He’s also been uncommonly persistent: by the time RunAnywhere got into YC, it was his fifth time applying.

Shubham’s path starts in New Delhi, but the pattern is similar: trying different tracks until he found where he could move fastest.

He began at the Vellore Institute of Technology in India, bouncing across majors—electrical and electronics → computer science → information technology → software engineering—while he figured out where his curiosity and aptitude actually intersected.

Midway through, he engineered a transfer to the US, landing at RIT (after also getting into Purdue) and finishing his degree in software engineering. Because he started earlier on the transfer path, he graduated ahead of Sanchit.

His early career looks like the archetypal high-caliber infra resume:

  • Microsoft – ~2.5 years

  • Amazon – cloud + distributed systems

Like Sanchit, Shubham came from a family business background; he was the outlier in his family who took the “safe” corporate route. Over time, it became clear that the learning speed—and upside—he wanted wouldn’t come from staying in FAANG.

Last year, he decided it was time to build.

How They Met & Became Co‑Founders

They didn’t meet through a matching platform.

They met at a poker night.

Back in 2019 at RIT, a mutual friend organized a poker game at Sanchit’s place. Shubham showed up, they hit it off, and then kept bumping into each other in classes. Over time, they started intentionally taking courses together and “pushing each other to our limits.”

They’ve now known each other for roughly seven years.

Fast-forward: in 2025, both were sniffing around startup ideas and co-founder matches. Shubham was on formal co‑founder matching platforms; Sanchit had already cycled through multiple ideas and partners, applying to YC over and over.

In July 2025, it finally clicked. Instead of trying to force-fit new co-founders, they looked at each other’s track records and skills and realized the obvious: their skill sets were deeply complementary, and they already had 6–7 years of trust.

There was one big constraint: visas. Both were employed in the US on work visas. Quitting without funding would have been risky at best, impossible at worst.

So they did it the hard way:

  • Built on nights and weekends

  • Applied to YC (got the interview but not the offer the first time)

  • Pushed through the rejection and focused on fundraising just enough to safely leave their jobs

Shubham describes a pivotal three-month stretch working with Sanchit part-time, when he realized just how much faster he could learn and operate if he went all in. They both took the leap—leaving stable big-tech jobs to go full‑time on RunAnywhere.

On their second YC attempt as a team, they got in. The current batch kicked off just days before this conversation.

COMPANY
RunAnywhere

RunAnywhere is building the infrastructure layer to run AI models on the edge—phones, laptops, embedded devices, and cheap hardware—without having to stand up a complex cloud stack or wrestle with fragmented vendor SDKs.

The seed of the idea started not in a research lab, but inside a very practical mobile engineering problem at Intuit.

Sanchit’s team was building an SDK to capture location data across millions of devices. As they looked at how to make that smarter, he started exploring on-device intelligence: could they use small models to make location tracking more adaptive and efficient?

Right as he was poking around, Apple’s WWDC keynote dropped with the Apple Foundation Models framework. Around the same time, Google started unveiling its own on-device stack. Open-source projects like llama.cpp and small models like Qwen 2.5 (0.5B params) were maturing.

All of it felt to Sanchit like another “GPT‑3 moment”—very early, very rough, but clearly the beginning of something that would become mainstream over a 2–3 year arc.

The glaring issue: fragmentation.

  • Apple has its APIs

  • Google has its own

  • Meta, Microsoft, and open-source projects each have their own expectations, formats, and runtimes

For a regular mobile or edge developer who just wants to “run a model,” the experience is painful:

Different tooling, bindings, optimizations, and deployment paths per platform. No unified way to say: “run this model on this device” in 5–10 lines of code.
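To ground that: on any single runtime, running a small model really is only a few lines, but those lines are runtime-specific. Here's a rough sketch using llama.cpp's Python bindings (the model file and settings are illustrative); the equivalent on Apple's or Google's stacks looks completely different:

```python
# Running a small GGUF model via llama.cpp's Python bindings (llama-cpp-python).
# A few lines on a laptop, but none of them carry over to iOS or Android stacks.
from llama_cpp import Llama

llm = Llama(model_path="qwen2.5-0.5b-instruct-q4_k_m.gguf", n_ctx=2048)
result = llm("Summarize why on-device AI helps privacy:", max_tokens=64)
print(result["choices"][0]["text"])
```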

That’s the wedge RunAnywhere is going after.

Problem & Solution

The problem:

If you want to run AI on the device today—LLMs, vision models, voice, multimodal—you’re stuck stitching together:

  • Vendor‑specific SDKs (Apple, Google, etc.)

  • Open-source runtimes (llama.cpp, GGUF ecosystems, etc.)

  • Model formats and optimizations that vary per use case and device

At small scale, a dedicated team can brute-force this. At millions of devices, it becomes a real infrastructure problem:

  • How do you abstract away device differences?

  • How do you update models reliably across a massive, heterogeneous fleet?

  • How do you keep latency low, costs down, and data private?

Meanwhile, the business upside is obvious:

  • Cost: Small, fine‑tuned models on device can be comparable to big cloud LLMs for niche tasks at a fraction of the cost, especially at scale (see the math after this list).

  • Privacy & compliance: Sensitive data (healthcare, finance, PII) never leaves the device.

  • Latency: Real-time interactions, even on sketchy network connections—or completely offline.
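On that cost point, a back-of-envelope calculation helps. All three inputs below are assumptions chosen for illustration, not vendor quotes:

```python
# Assumed inputs: a cloud LLM at $0.50 per 1M tokens, 2,000 tokens per user
# per day, and 1M daily active users. All three numbers are illustrative.
price_per_token = 0.50 / 1_000_000
tokens_per_user_per_day = 2_000
daily_active_users = 1_000_000

daily_cloud_bill = price_per_token * tokens_per_user_per_day * daily_active_users
print(f"${daily_cloud_bill:,.0f}/day, ~${daily_cloud_bill * 365:,.0f}/year")
# -> $1,000/day, ~$365,000/year, growing linearly with usage.
# A small on-device model has near-zero marginal cost once it ships.
```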

What RunAnywhere does:

RunAnywhere is building a single platform that lets you:

  • Run any model (LLM, VLM, speech‑to‑text, text‑to‑speech, multimodal) on edge devices

  • Integrate with a few lines of code

  • Deploy across a diverse fleet of hardware without bespoke per‑vendor work

Under the hood, they’re constantly dealing with:

  • Model formats and optimization

  • Device capabilities

  • Scheduling and orchestration of on-device inference

On the surface, they want it to feel like a simple, unified API.
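As a thought experiment, that surface could feel like the sketch below. Every name in it is invented (with stub implementations so the snippet runs); this is not RunAnywhere's actual SDK:

```python
# Hypothetical sketch of a unified edge-AI surface. Every class and method
# below is invented for illustration, NOT RunAnywhere's real API; stubs are
# included so the sketch runs.
from dataclasses import dataclass

@dataclass
class Device:
    ram_gb: int
    accelerator: str  # "npu" | "gpu" | "cpu"

    @staticmethod
    def current() -> "Device":
        # Stub: real code would probe the OS, RAM, and NPU/GPU/CPU here.
        return Device(ram_gb=8, accelerator="cpu")

@dataclass
class Model:
    name: str
    quant: str

    @staticmethod
    def load(name: str, target: Device) -> "Model":
        # Stub: pick a quantization the target can hold. Real logic would
        # also choose the model format and runtime for this hardware.
        return Model(name=name, quant="q4" if target.ram_gb < 16 else "q8")

    def generate(self, prompt: str, max_tokens: int = 64) -> str:
        return f"[{self.name}/{self.quant} would answer here]"  # stub

device = Device.current()
model = Model.load("qwen2.5-0.5b-instruct", target=device)
print(model.generate("Draft a polite reply to this email.", max_tokens=64))
```

The point is not the names; it's that device probing, format and quantization choice, and runtime selection all hide behind one load call.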

Over the last few months, conversations with early customers have surfaced especially strong pull in:

  • Voice AI agents that run fully on-device

  • On‑device RAG / knowledge retrieval where sensitive data stays local (sketched just after this list)

  • Use cases where the combination of cost + privacy + latency makes cloud‑only architectures a non‑starter
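The on-device RAG item above is worth sketching, since it is the clearest privacy win: documents, embeddings, and retrieval all stay local. A minimal retrieval loop might look like this, assuming a small local embedding model sits behind embed() (the stub below just fakes one so the snippet runs):

```python
import numpy as np

docs = [
    "Visit note: patient reported mild headaches, prescribed rest.",
    "Billing policy: claims must be filed within 90 days.",
    "Lab summary: all results within normal range.",
]

def embed(text: str) -> np.ndarray:
    # Stand-in for a small on-device embedding model; fake vectors so this
    # sketch runs. A real setup would call a local embedder here.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(384)

doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 2) -> list[str]:
    # Cosine similarity between the query and each local document vector.
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

# Retrieved chunks would then be passed to a local LLM as context; nothing
# leaves the device, which is the whole point for healthcare/finance data.
print(retrieve("What did the last visit note say?"))
```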

They’re building it fully open source at this stage—not as a marketing gimmick, but to maximize adoption, scrutiny, and contributions from the developer community.

ICP (Ideal Customer Profile)

RunAnywhere’s early focus is on teams where on‑device AI isn’t “nice-to-have”—it’s table stakes.

Three main ICP clusters:

  • Privacy & compliance-heavy enterprises

    • Industries like healthcare and finance

    • Strong requirements around data residency, PHI/PII handling, and regulatory constraints

    • Need to keep sensitive data off the cloud while still deploying intelligent systems at scale

  • AI & infra companies serving massive user bases

    • Teams that expect millions of end devices and can’t afford per‑request cloud inference forever

    • Products where latency and offline capability materially change the user experience

  • India at scale
    Both founders grew up in India and watched the country digitize essentially in the last decade. Their thesis:

    • With 1.3B+ people, routing all intelligence through cloud data centers is structurally expensive and brittle

    • A lot of real impact in India—especially in education for underprivileged communities—will come from cheap, local compute: $10–$20 devices running surprisingly capable models

    • On‑device infrastructure is the only realistic way to deliver “free intelligence” at that scale

Longer term, they see education as a major pillar: personalized, local AI tutors and assistants that don’t require always‑on connectivity or expensive per‑token billing.

GTM: Open Source + Top-Down Enterprise

Despite being very early, their go-to-market already has two clear tracks:

  1. Open Source Adoption (priority #1)

    • Make everything they reasonably can open source

    • Optimize for developer love and usage rather than early monetization

    • Learn from folks who built large OSS projects (Docker, llama.cpp, etc.): adoption is the main currency

  2. Top‑Down Enterprise Motion

    • Leverage a strong early investor network to get into C‑level and senior engineering conversations at big enterprises

    • Position RunAnywhere as the infrastructure layer for large, regulated companies exploring edge AI

    • Use those conversations to shape the roadmap and prioritize the most painful, high‑value problems

The fundraising story is non-trivial in itself:

  • After their first YC rejection, they couldn’t just “quit and figure it out” because of visa constraints.

  • They had to get funding before leaving their jobs to legally remain in the US.

They pieced that together through:

  • Warm intros from YC founder friends like Kevin Tang (Firebender), who helped bring in early angel checks

  • A key relationship with Sudarshan Kamath (Smallest AI CEO), who mentored them through the mechanics of fundraising and investor conversations

  • A surprisingly effective cold DM campaign on X, where Shubham in particular drove outreach that led to their first institutional check from Yohei at Untapped Capital

That early capital gave them the runway and legal safety to go full‑time, keep building, and re‑apply to YC—eventually getting into the batch they’re in now.

The Future of RunAnywhere

In the next 12 months, Sanchit and Shubham want RunAnywhere to become the default choice for running models on edge devices.

A few concrete pillars of that vision:

  • Be the “Linux of edge AI” – the obvious open foundation developers reach for when they want to run models on non‑cloud hardware

  • Support “any edge device” – from phones and laptops to $10–$20 commodity devices, not just premium hardware

  • Double down on voice and agentic use cases – where low latency, privacy, and offline behavior matter most

  • Invest in research at the model interface layer – they’re careful not to over‑promise specifics here because the space is moving quickly, but they know they’ll need to sit close to the model layer to make the platform truly seamless

Their ambition is straightforward:

If you’re building a product that needs intelligence at the edge—whether that’s voice agents, embedded AI, offline copilots, or privacy‑sensitive assistants—RunAnywhere should be the first platform you think of.

Why Sanchit & Shubham Stand Out

A few themes cut across their story:

  • Persistence over prestige

    • Sanchit applied to YC five times before getting in.

    • They kept building through rejection, visa constraints, and co‑founder reshuffles.

  • Deep, complementary skills

    • Sanchit: mobile, product, and an almost stubborn willingness to iterate on ideas until they stick.

    • Shubham: cloud infra, distributed systems, and experience inside some of the biggest-scale environments in tech.

  • Bias toward shipping and learning, not theorizing

    • From cold DMs to open-source releases to scrappy fundraising, they consistently choose actions that force fast feedback.

  • A global, grounded perspective on AI

    • They’re not just thinking about AI for Bay Area startups—they’re thinking about how to bring low‑cost intelligence to places like the small town in Punjab where Sanchit grew up, or across India’s billion‑plus population.

They’re not building another cloud LLM wrapper.

They’re building the infrastructure layer that enables cheap, private, low‑latency intelligence on every device—so that the next generation of AI products isn’t constrained by bandwidth, data centers, or compliance fears.

If you’re working on AI-native products, edge devices, or privacy‑sensitive intelligence and you believe your users shouldn’t need a 24/7 internet connection to access powerful models, RunAnywhere is a team—and a platform—worth paying attention to.

TL;DR

From a no‑traffic‑light town in Punjab and a family business in New Delhi to FAANG roles, multiple YC applications, and now a spot in the batch, Sanchit Monga and Shubham Malhotra are building RunAnywhere—an open, unified infrastructure layer for running AI models directly on edge devices.

Their bet: the future of AI isn’t just in the cloud. It’s in billions of cheap, distributed devices—and someone needs to make it dead simple to run powerful models on all of them.