Robotics foundation models (RFMs) have become a major topic of discussion in robotics circles this year, with massive amounts of capital flowing into RFM research and commercialization. And yes, this influx of cash and conversation constitutes a hype cycle, but that’s not necessarily a negative thing. After all, the promise of RFMs is compelling.
Let’s be clear about what an RFM is. Simply put, it’s a machine learning (ML) model trained in a self-supervised fashion over a huge corpus of data. Theoretically, these models can be iteratively adapted for specific use cases.
Putting aside the debate over whether generative AI models such as those underlying ChatGPT actually have reasoning capabilities, there is no denying that they have had a major impact on how people interact with information on the web. The idea of RFMs is simple: Replicate this impact in the physical world, focusing on mobility and manipulation rather than text and images. Proponents of RFMs believe that by leveraging these models, robots can work more efficiently, more effectively and with less oversight than in current automation-based systems.
These possibilities are exciting and worth exploring, but how soon can we expect RFMs to have a tangible impact on warehouse efficiency? More likely than not, “generalized capabilities” in warehouse robots won’t manifest productively until the distant future — say, 15 or more years down the line.
Until then, we must interrogate the practical implications of RFMs, especially in high-intensity fields like warehousing and logistics. Will robots powered by RFMs have a place on the warehouse floor? If so, when and under what conditions?
Timeline and the customers’ definition of success
What problems could systems using RFMs solve on the warehouse floor? From my perspective, warehousing leaders currently face three significant challenges:
- Staffing and retention
- Operational complexity
- High variability/seasonal throughput
RFMs hold the promise of impacting all three of these areas. Systems trained on diverse workflows may be able to predict performance and help optimize operations. Likewise, systems trained on enough input variation may handle seasonal changes more easily. These improvements could in turn lead to reduced churn in personnel, as staff migrate into higher-value and more rewarding work.
In some respects, these are the same promises of robotics and automation in general: Find the patterns, simplify them, automate them and then let humans get on to more interesting things.
The difference between RFM-centric thought and current robotic automation principles is one of scale. As RFMs ingest more data and connections, they may discover generalized features and patterns, such as those relating to package variance. These advances could bring us very close to a dream “universal model” that can handle any package in a warehouse — much the way a human can.
However, it’s unclear how much data we need or how many connections must be trained to achieve human-level performance in dealing with visual variation. The amount of training data necessary for other, more generic tasks, such as mobility around humans and the manipulation of deformable goods, also remains a question mark. Training may take five years or 50. While the prospect is exciting, the advent of RFM-based warehouse robotics systems isn’t something we can count on soon.
Hardware, compatibility and computational needs
Large language models (LLMs) and generative AI rely on foundation models to “create” new content. Robots, on the other hand, have comparatively less information to draw from. That’s because problem-solving in the warehouse requires real-world information about physics and motion, which traditional AI training sets, focused on text and images, haven’t included. Ironically, as RFMs generate more interest, we approach a different problem: too much data to affordably and realistically deploy in most warehouses, at least for now.
The concern here is computational power. Over the last decade, warehouse robotics providers have worked to minimize their cloud computing requirements by pushing capabilities like AI vision to the edge. However, RFMs are likely too large to sit on the edge, necessitating a cloud-based consumption model. The consequence would be a massive cloud bill for warehouses relying on RFMs.
The sheer size of RFMs could make them cost-prohibitive and turn them into a choke point for efficiency. If every robot on the warehouse floor needs an always-on connection to a cloud-hosted RFM, each robot adds a significant recurring cost to operate. Realistically, most warehouses may be unable to afford more than one to three RFM-powered robots at once.
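To make the scaling argument concrete, here is a minimal back-of-the-envelope sketch in Python. Every number and name in it is a hypothetical placeholder chosen for illustration, not vendor pricing or measured data; the only point is that a per-robot cloud cost multiplies across the fleet, so a fixed budget caps the fleet size quickly.

```python
from dataclasses import dataclass


@dataclass
class CloudCostAssumptions:
    """Hypothetical inputs; none of these values come from real pricing."""
    inference_calls_per_robot_hour: float  # how often a robot queries the cloud model
    cost_per_call_usd: float               # assumed price of one cloud inference call
    operating_hours_per_month: float       # hours each robot runs per month


def monthly_cost_per_robot(a: CloudCostAssumptions) -> float:
    """Recurring cloud cost for one always-connected robot."""
    return (a.inference_calls_per_robot_hour
            * a.cost_per_call_usd
            * a.operating_hours_per_month)


def max_affordable_robots(monthly_budget_usd: float, a: CloudCostAssumptions) -> int:
    """Largest whole number of robots whose combined cloud cost fits the budget."""
    per_robot = monthly_cost_per_robot(a)
    if per_robot <= 0:
        raise ValueError("per-robot cost must be positive")
    return int(monthly_budget_usd // per_robot)


# Placeholder scenario: one call per second, $0.002 per call, 500 hours a month.
assumptions = CloudCostAssumptions(
    inference_calls_per_robot_hour=3600,
    cost_per_call_usd=0.002,
    operating_hours_per_month=500,
)
budget = 10_000  # hypothetical monthly cloud budget in USD

print(f"Per-robot monthly cost: ${monthly_cost_per_robot(assumptions):,.2f}")
print(f"Robots affordable on budget: {max_affordable_robots(budget, assumptions)}")
```

Under these placeholder inputs, the budget covers only two robots. Real figures could be far higher or lower, but the linear relationship between fleet size and cloud spend is what a warehouse would need to examine.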
But let’s say a fully functional RFM-based system for warehouse logistics is eventually developed and productized. Now the question becomes how this model manifests itself in the warehouse. How interoperable will the RFM be with older hardware? Will clients need to purchase an entirely new fleet of robots to take advantage of RFMs? For most clients, that’s not in the cards, so determining the backward compatibility of RFM-enabled robots is critical to setting realistic expectations.
It may be that robots operating on an RFM will remain cost-prohibitive for most warehouses over the next 20-30 years before the costs associated with massive datasets and firmware normalize. If so, these expectations should be communicated upfront.
Are RFMs the future of robotics?
RFMs are undoubtedly exciting. They could advance warehouse efficiency within the right parameters, especially in applications that are well-represented in training data or that don’t deviate from expectations.
However, I question the practicality and necessity of such models under traditional warehouse conditions. Robots equipped with AI vision are already more than capable of increasing efficiency and handling unexpected situations on the floor, particularly within human-in-the-loop systems.
A cost-benefit analysis of RFMs is difficult to complete without more information about current progress. In particular, I’m eager to see a timeline for RFM development and deployment. When should warehouse leaders expect to see results in their operations, if ever?
Access to this information would transform an exciting but nebulous concept into a more tangible mile-marker to strive toward.