Why Thinking Machines’ API Is Both a Breakthrough and a Gamble

The arrival of Tinker

Another week, another AI announcement. This time it comes from Thinking Machines Lab, a new venture led by Mira Murati. Their first product is called Tinker, and it is being described as a Python-based API for fine-tuning large language models.

Simply put, Tinker is meant to let developers adjust and customise AI models more easily, without needing to build complicated infrastructure in the background. Instead of worrying about servers, clusters, and all the fiddly bits that usually get in the way, the promise is that you can focus on the model itself.

On paper, that sounds attractive. Fine-tuning has always been difficult and expensive, often needing specialist knowledge and costly equipment. If Tinker can truly make that easier, it feels like a breakthrough. But before we get carried away, it is worth noting that the picture may be more modest than the hype suggests.

What Tinker actually offers

Tinker is built around a method called LoRA, short for Low-Rank Adaptation. You don’t need to know the maths; the key point is that LoRA is a clever shortcut which:

- Adjusts only a small part of the model, rather than retraining from scratch
- Saves time and money
- Has made fine-tuning more practical in recent years
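
To make the idea concrete, here is a minimal, illustrative sketch of the LoRA pattern in PyTorch. It is not Tinker’s API; the class and parameter names are ours, chosen purely for illustration.

```python
# Minimal sketch of the LoRA idea (illustrative only, not Tinker's API):
# instead of updating a large weight matrix, we freeze it and learn two small
# low-rank matrices whose product acts as a trainable correction.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        # The original (pre-trained) layer stays frozen.
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad_(False)
        # Only these two small matrices are trained.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen path plus a low-rank, trainable correction.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling
```

The point to notice is that only lora_a and lora_b are updated during training, which is why the approach is so much cheaper than retraining the whole model.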

The company also says that Tinker uses shared compute pools. That means resources are managed centrally by Thinking Machines, rather than every user needing their own setup. Again, the appeal is clear: less hassle for researchers and developers.

Importantly, Tinker is still in beta. It is free to use for now, but the company has said it will move to usage-based pricing in the near future.

Early adopters include research groups at Princeton, Stanford, Berkeley, and Redwood Research. They are exploring Tinker in areas like reinforcement learning and formal theorem proving. These are very niche problems, but they do show the tool is already being tried out in academic circles.

Why it matters

At a high level, Tinker is about customisation. If you are a business or a research team, you don’t always want a generic AI that speaks in vague, general terms. You might want one that reflects your company’s tone of voice, uses your industry’s terminology, or sticks to a particular curriculum in education.

Until now, the main way to do this has been prompt engineering: giving the model detailed instructions every time you ask a question. That works, but it is clumsy. Fine-tuning bakes those instructions into the model itself, so you don’t have to constantly remind it what style or rules to follow.

This is where Tinker’s positioning makes sense. It gives developers a way to create models that feel more “theirs” without building the whole training system from scratch.

The catch: early days and heavy marketing

Here is where the caution comes in. Tinker is not free in the long run, despite the initial grace period. It is also in beta, which means quality and reliability are not guaranteed. And there is not yet much detail on how it fits with other tools or platforms, which matters a lot once you want to move from a demo to a production system.

On the surface, there is also little that feels unique. Distributed training, cost-saving techniques like LoRA, and scalable serving are already available in other frameworks that have been tested and matured over several years.

Enter Ray: a mature alternative

If you look at the bigger picture, Ray is one of the leading platforms that already does much of what Tinker is advertising. Ray started as an open-source project at UC Berkeley and is now widely used in industry.
What does Ray offer? Without diving into technical jargon:

- Helps with training models across many machines
- Helps with testing lots of different training options
- Helps with serving models at scale
- Helps with running reinforcement learning experiments

In other words, it is a complete toolkit for large-scale AI work.
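
To give a flavour of what that looks like in practice, here is a minimal Ray example. The function below is a stand-in for a real training or evaluation job; the point is that the same code runs locally or fans out across a cluster without changes.

```python
# Minimal illustration of Ray's core pattern: decorate a function, then launch
# many copies of it in parallel and let Ray schedule them across the cluster.
import ray

ray.init()  # starts Ray locally, or connects to a cluster if one is configured

@ray.remote
def evaluate_option(learning_rate):
    # Placeholder for a real training or evaluation step.
    return {"lr": learning_rate, "score": 1.0 / (1.0 + learning_rate)}

# Launch several trials in parallel; Ray distributes them across available workers.
futures = [evaluate_option.remote(lr) for lr in (1e-2, 1e-3, 1e-4)]
print(ray.get(futures))
```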

Ray is open-source, meaning it is free to use, and it integrates with popular platforms such as Databricks and Kubernetes. That makes it relatively easy for companies to plug it into existing setups, whether in the cloud or on their own servers.

It is not just theory either. Companies like Uber, Shopify, Spotify, Instacart, Intel, and Riot Games use Ray in their machine learning systems. It is proven at scale and across different industries.

So, what’s the difference?

Tinker and Ray share a lot of ground. Both aim to make distributed AI work easier. The difference lies in packaging and positioning. Ray is open-source and mature but requires more technical setup. Tinker is managed for you, at least in principle, so the promise is less about technical depth and more about convenience.

For non-technical users or small teams, that convenience might make Tinker attractive. But it is important to keep expectations in check. This is a beta product, built on methods that already exist elsewhere, and it is backed by heavy marketing.

Feature         | Tinker                  | Ray
Managed service | Yes                     | Yes (open-source)
Fine-tuning     | LoRA-based              | Multiple methods
Maturity        | Beta                    | Widely adopted
Pricing         | Free in beta, then paid | Free

Our experience

In our AI Labs, we have employed Ray on several projects.

In one project, during the fine-tuning phase, we integrated Ray to orchestrate distributed training across multiple GPUs. Ray’s scheduling capabilities allowed us to parallelise the fine-tuning workload, significantly improving hardware utilisation.

As a result, we were able to complete the training more quickly and efficiently, ultimately lowering the computational expense of the entire operation.
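
A simplified sketch of that pattern, using Ray Train’s TorchTrainer to spread a PyTorch training loop over several GPU workers, looks roughly like the following. The model, data, and settings are placeholders rather than anything from the actual project.

```python
# Illustrative only: distribute a PyTorch training loop across GPU workers with
# Ray Train. Each worker runs the same loop; Ray handles process setup and DDP.
import torch
import torch.nn as nn
from ray import train
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

def train_loop_per_worker(config):
    # Wrap the model so gradients are synchronised across workers.
    model = train.torch.prepare_model(nn.Linear(128, 2))
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])
    for _ in range(config["steps"]):
        batch = torch.randn(32, 128)  # stand-in for a real data loader
        loss = model(batch).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"lr": 1e-4, "steps": 10},
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),
)
trainer.fit()
```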

In another project, we leveraged Ray to dynamically fragment GPU and CPU assignments across multiple lightweight agents. By distributing tasks in this fine-grained manner, we maximised GPU utilisation. This strategy effectively doubled our processing capacity and led to a substantial reduction in overall compute cost.
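
The fine-grained assignment relies on Ray’s fractional resource requests. Illustratively (this is not the project’s actual code), several lightweight agents can be packed onto a single physical GPU like this:

```python
# Illustrative sketch of fractional GPU scheduling in Ray: each actor asks for a
# quarter of a GPU, so four of them share one card.
import ray

ray.init()

@ray.remote(num_gpus=0.25, num_cpus=1)
class LightweightAgent:
    def __init__(self, name):
        self.name = name

    def run(self, task):
        # Placeholder for the agent's GPU-backed work.
        return f"{self.name} processed {task}"

agents = [LightweightAgent.remote(f"agent-{i}") for i in range(4)]
print(ray.get([agent.run.remote("batch-0") for agent in agents]))
```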

If the project required it, we could replicate all of Ray’s distributed computing features on-premises by creating a Kubernetes cluster. This would allow us to run Ray services locally with high scalability, fault tolerance, and resource orchestration, essentially giving us cloud-grade parallelism and performance within our own infrastructure.
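
Pointing the workload at such a cluster would not require rewriting it. With Ray Client, the same code can connect to the cluster’s head node instead of starting Ray locally; the address below is a placeholder, not a real endpoint.

```python
# Illustrative only: connect to an existing on-premises Ray cluster (for example,
# one provisioned on Kubernetes) rather than starting Ray on the local machine.
import ray

ray.init(address="ray://ray-head.example.internal:10001")

@ray.remote
def ping():
    return "running on the cluster"

print(ray.get(ping.remote()))
```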

Closing reflection

Tinker is not a revolution, but it is a useful reminder of where the AI tools market is heading. Companies are trying to lower the barrier for fine-tuning so more people can create models that fit their own needs. That is a good direction of travel.

At the same time, the underlying technology is not new. Mature, free, and widely adopted platforms like Ray already provide these capabilities.

The real test for Tinker will be whether it can make the experience smoother without simply reinventing what others already do.

For now, the best approach is cautious optimism. Tinker may make some things easier, but it is not magic. The tools already exist, and in some cases, they are more robust than what this beta release can yet offer.

Filippo Sassi is Head of the AI Labs at Version 1.

For more insights on how our AI expertise is driving innovation and helping businesses around the world transform, click here.