Feasibility Study: How to Build an AI to Rival Claude and ChatGPT, and What It Costs

The honest answer from Origami's experts: building a frontier model from scratch at the scale of Claude or ChatGPT costs hundreds of millions to over a billion dollars, and is realistically within reach only of the mega-labs and nation-states. But "rival" can mean three very different things, each with a very different price tag, and for an ordinary company the smart path costs a tiny fraction of that. This study breaks down the three paths and the cost of each.
What does "rival Claude and ChatGPT" actually mean?
Before talking cost, you must define the ambition, because the paths differ in cost by factors of thousands:
- A frontier model from scratch: you build a new "brain" that rivals the world's latest models at everything.
- A sovereign or specialized model: you build a smaller model, or continue-train an open model, to serve a language, a sector, or a country.
- Building on an open model: you take a ready open model and tailor it to your data so it wins in your own domain.
Path one: a frontier model from scratch — an astronomical cost
This is the hardest and most expensive path. According to Epoch AI, which specializes in tracking training costs, the cost of training frontier models grows roughly 2.4x per year, and the largest training runs are projected to exceed one billion dollars by 2027, putting them "out of reach for all but the most well-funded organizations." The cost splits between hardware (47–67%), R&D staff salaries (29–49%), and energy (2–6%).
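To make the growth figure concrete, here is a minimal sketch of the arithmetic. It assumes the 2.4x yearly factor stays constant. The function names and the $100M starting cost are hypothetical, chosen only for illustration. At 2.4x per year, cost doubles roughly every nine and a half months.

```python
import math

GROWTH_PER_YEAR = 2.4  # yearly growth factor quoted from Epoch AI in the text


def doubling_time_years(growth_per_year: float) -> float:
    """Years needed for cost to double at a constant yearly growth factor."""
    return math.log(2) / math.log(growth_per_year)


def projected_cost(start_cost: float, years: float,
                   growth_per_year: float = GROWTH_PER_YEAR) -> float:
    """Cost after `years` of compounding growth from `start_cost`."""
    return start_cost * growth_per_year ** years


def cost_split(total: float) -> dict[str, tuple[float, float]]:
    """Low/high amount per category, using the percentage ranges in the text."""
    shares = {
        "hardware": (0.47, 0.67),
        "staff": (0.29, 0.49),
        "energy": (0.02, 0.06),
    }
    return {name: (total * lo, total * hi) for name, (lo, hi) in shares.items()}


if __name__ == "__main__":
    months = doubling_time_years(GROWTH_PER_YEAR) * 12
    print(f"Doubling time: {months:.1f} months")
    # Hypothetical $100M starting cost, for illustration only.
    print(f"After 2 years: ${projected_cost(100e6, 2) / 1e6:,.0f}M")
    for name, (lo, hi) in cost_split(1e9).items():
        print(f"{name}: ${lo / 1e6:,.0f}M to ${hi / 1e6:,.0f}M")
```

Because 2.4 squared is 5.76, a hypothetical $100M run becomes roughly $576M two years later under this assumption.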
In practice that means tens of thousands of GPUs (an H100 costs about $25,000–$40,000, a B200 about $30,000–$50,000) forming a cluster worth hundreds of millions, plus top-tier researchers earning over a million dollars each, enormous datasets, and years of work. The result: not an option for a company, but for labs like OpenAI, Anthropic, Google, and DeepSeek, or major national programs.
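The "hundreds of millions" figure follows from simple multiplication. This sketch uses the unit price ranges quoted above. The 20,000-GPU cluster size is a hypothetical example, and the result covers the GPUs alone, not networking, data centers, power, or staff.

```python
# Unit price ranges in USD, as quoted in the text (market estimates).
GPU_PRICE_USD = {
    "H100": (25_000, 40_000),
    "B200": (30_000, 50_000),
}


def cluster_hardware_cost(gpu_model: str, gpu_count: int) -> tuple[int, int]:
    """Return the (low, high) purchase cost in USD for the GPUs alone."""
    low, high = GPU_PRICE_USD[gpu_model]
    return gpu_count * low, gpu_count * high


if __name__ == "__main__":
    # 20,000 GPUs is a hypothetical cluster size used only for illustration.
    for model in GPU_PRICE_USD:
        low, high = cluster_hardware_cost(model, 20_000)
        print(f"{model}: ${low / 1e6:,.0f}M to ${high / 1e6:,.0f}M")
    # H100: $500M to $800M
    # B200: $600M to $1,000M
```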
Path two: a sovereign or specialized model — a large strategic investment
Here you don't compete globally at everything; you build a smaller model (or deeply continue-train an open one) to excel in a language or sector. Saudi Arabia is a clear example through its sovereign-AI push, such as the company HUMAIN and SDAIA's "ALLaM" model. This path costs between millions and tens of millions of dollars, needs a specialized research team, a compute cluster, clean data, and 6 to 18 months. It fits governments and large enterprises with a strategic objective.
Path three: building on an open model — the smartest for 99% of companies
This is where the practical answer lies. You take a strong open-weight model (DeepSeek V4, Qwen, Llama, or GLM) and tailor it to your data via fine-tuning and Retrieval-Augmented Generation (RAG). You don't beat ChatGPT at everything — you beat it in your domain, your language, and your data, which is what matters for your business. The cost here drops from "astronomical" to between tens of thousands and a few million riyals depending on ambition and infrastructure, in weeks to months. A bonus: you can self-host it so your data stays with you — important under the Personal Data Protection Law (PDPL).
The three paths compared
| Option | What it is | Estimated cost | Time | Who it fits |
|---|---|---|---|---|
| Frontier model from scratch | A Claude/GPT-scale model from the ground up | Hundreds of millions to over a billion dollars | Years | Mega-labs and nations |
| Sovereign or specialized model | A smaller model, or deep continue-training of an open one | Millions to tens of millions of dollars | 6–18 months | Governments and large enterprises |
| Building on an open model | Tailoring an open model to your data (fine-tuning + RAG) | Tens of thousands to a few million riyals | Weeks to months | Most companies |
You rarely need to build a new engine; usually you need to build the right car for your road around an existing one.
The open-source option in detail: which model, and how to build on it?
Since building on an open model is the right path for most companies, here is the practical detail: which models to choose, how to build on them, and what each approach costs.
The leading open models in 2026:
| Model | Maker | Known for |
|---|---|---|
| DeepSeek V4 | DeepSeek | The strongest open-source model for agentic coding, and the cheapest |
| Qwen3-Coder | Alibaba | Code-specialized and multilingual |
| GLM-5.2 | Z.ai | Long-horizon coding and a 1M-token context |
| Llama 4 | Meta | A huge tooling ecosystem and broad community support |
| Kimi K2 | Moonshot | Long context and strong general performance |
How do you build on them? Four approaches, in rising cost and effort:
- 1) Direct use + prompt engineering: run the model as-is (via a cloud API or self-hosted) and steer it with smart instructions, no training at all. The cheapest and fastest — days to weeks, from thousands to tens of thousands of riyals.
- 2) Retrieval-Augmented Generation (RAG): connect the model to your knowledge base and documents so it answers from your own data without training it — ideal for a knowledge assistant that knows your products and policies. Weeks, tens of thousands of riyals.
- 3) Fine-tuning / LoRA: train the model on examples from your domain and style so it masters your specific task (tone, classification, output format). Weeks to months, tens to hundreds of thousands of riyals.
- 4) Continued pretraining: feed the model large amounts of your domain or language data to deepen its knowledge fundamentally — the most powerful and most expensive, approaching the "sovereign path." Months, hundreds of thousands and up.
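The RAG approach in the list above can be sketched in a few lines. This is a toy illustration, not a production design: word-count cosine similarity stands in for a real embedding model and vector store, and all function names are hypothetical. The idea is to retrieve the most relevant passages and place them in the prompt, so the model answers from your data without being trained on it.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble the prompt that would be sent to the open model."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Hypothetical knowledge-base snippets, for illustration only.
    docs = [
        "Refund requests are accepted within 14 days of purchase.",
        "Standard shipping inside Riyadh takes two business days.",
        "Support is available by phone from 9am to 5pm.",
    ]
    print(build_prompt("What is the refund policy for a purchase?", docs))
```

The resulting prompt is what you would pass to whichever model you run, whether through a cloud API or a self-hosted server.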
Self-hosting and privacy: the big advantage of open models is that you run them on your own infrastructure (a private cloud or local servers), so your data never leaves your organization — decisive under the Personal Data Protection Law (PDPL). You need GPUs sized to the model (from a single card for small models to a cluster for large ones), and you balance fixed hosting cost against pay-as-you-go cloud API cost.
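The trade-off between fixed hosting cost and pay-as-you-go API cost comes down to a break-even volume. The sketch below uses hypothetical prices (15,000 riyals per month for hosting, 8 riyals per million tokens for the API) and hypothetical function names. It deliberately ignores operations staff, utilization, and the privacy benefit, which often decides the question on its own.

```python
def monthly_api_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Pay-as-you-go spend for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def break_even_tokens(fixed_monthly_hosting_cost: float,
                      price_per_million_tokens: float) -> float:
    """Monthly token volume at which fixed hosting cost equals API spend."""
    return fixed_monthly_hosting_cost / price_per_million_tokens * 1_000_000


def cheaper_option(tokens_per_month: float, fixed_monthly_hosting_cost: float,
                   price_per_million_tokens: float) -> str:
    """Return 'self-host' or 'api', comparing cost alone."""
    api = monthly_api_cost(tokens_per_month, price_per_million_tokens)
    return "self-host" if api > fixed_monthly_hosting_cost else "api"


if __name__ == "__main__":
    # Hypothetical prices in riyals, for illustration only.
    fixed, price = 15_000, 8
    print(f"Break-even: {break_even_tokens(fixed, price):,.0f} tokens per month")
    print(cheaper_option(500_000_000, fixed, price))    # low volume
    print(cheaper_option(5_000_000_000, fixed, price))  # high volume
```

Under these assumed prices the break-even is 1.875 billion tokens per month: below it the API is cheaper, above it self-hosting is.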
Origami's experts' recommendation
Don't try to out-train the mega-labs at their own game — that is a battle of billions. The smart move is to build a specialized solution on an open model that wins in your domain, your Arabic language, your data privacy, and your cost. This path is achievable, high-ROI, and gives you "your own AI" without the billion-dollar bill.
How Origami helps
At Origami we study feasibility first, then choose the right path and model for your goal and budget, tailor the model to your data (fine-tuning + RAG), self-host it for full privacy, and connect it to your systems and agents via MCP. The goal is an AI that competes in your domain, at a realistic, well-studied cost.
Sources
- Epoch AI — the cost of training frontier models: epoch.ai
- Saudi Data and AI Authority (SDAIA): sdaia.gov.sa
Hardware cost figures are market estimates for NVIDIA GPUs at publication time and are subject to change.
Frequently Asked Questions
Can I build a model like ChatGPT on a company budget?
Not from scratch — a frontier model costs hundreds of millions to over a billion dollars and is limited to mega-labs and nations. But you can build 'your own AI' that competes in your domain by tailoring an open model to your data, starting from tens of thousands of riyals.
What is the cheapest path to start?
Building on an open-weight model (such as DeepSeek or Qwen) and tailoring it with fine-tuning and RAG on your data. It starts from tens of thousands of riyals for a specialized assistant and rises with data volume, infrastructure, and self-hosting.
Does a tailored open model really compete?
Yes, but on the right battlefield: it won't beat ChatGPT at everything, but it beats it in your domain, your Arabic language, your own data, your privacy, and your cost — which is usually what actually matters for your business.
Exactly how much does a frontier model from scratch cost?
According to Epoch AI, the cost grows roughly 2.4x per year, and the largest training runs are projected to exceed a billion dollars by 2027. That is before R&D salaries (up to half the cost), infrastructure, and energy. In practice: hundreds of millions to billions, which is why it is limited to a few of the most well-funded organizations.
