AI Business Asia
GPT-4o Mini: the most cost-efficient model to date, about 29 times cheaper than GPT-4o, yet competitive with GPT-4.

Leaked: Nvidia working on AI chips specially for China

Leo Jiang & Lex Lee
July 23, 2024

In today’s newsletter:

Faster, cheaper and better: Introducing GPT-4o Mini
(AI) Chips Ahoy! 🍪
- OpenAI AI chip talks with Broadcom
- Nvidia’s back in the China market with new AI chips
4 AI tools in the spotlight
4 new investments

Read time: 4-6 mins

Sorry, we couldn’t resist the 🍪 cookie reference.

GPT-4o Mini: the most cost-efficient model to date, about 29 times cheaper than GPT-4o, yet competitive with GPT-4.

OpenAI has another exciting update: GPT-4o mini, launched last Thursday, is a faster, more cost-efficient model meant to broaden the range of use cases and applications built on its APIs.

Key points:

Superior performance:
- Outperforms GPT-4 on chat preferences on the LMSYS leaderboard.
- Surpasses GPT-3.5 Turbo and other small models on academic benchmarks for both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o.
- Beats Claude Haiku and Gemini Flash in maths and coding proficiency and in multimodal reasoning.
- Does better than GPT-3.5 Turbo on tasks such as extracting structured data from receipt files or generating high-quality email responses when given the thread history.
It will replace GPT-3.5 Turbo as the underlying model of ChatGPT’s free version.
Most cost-effective:
- GPT-4o costs $5 per 1M input tokens and $15 per 1M output tokens, while GPT-4o mini costs $0.15 and $0.60 respectively. That is about 33 times cheaper on input and 25 times cheaper on output, or roughly 29 times cheaper on average.
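
The multiples quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the GPT-4o and GPT-4o mini prices come from the bullet above, the GPT-4 prices from the comparison table in this section, and the `PRICES` dictionary and `price_ratios` helper are names made up for this example.

```python
# Prices in USD per 1M tokens, as quoted in this newsletter.
PRICES = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "gpt-4": {"input": 30.00, "output": 60.00},
}


def price_ratios(expensive, cheap):
    """Return (input ratio, output ratio, simple mean of the two ratios)."""
    input_ratio = PRICES[expensive]["input"] / PRICES[cheap]["input"]
    output_ratio = PRICES[expensive]["output"] / PRICES[cheap]["output"]
    return input_ratio, output_ratio, (input_ratio + output_ratio) / 2


for model in ("gpt-4o", "gpt-4"):
    inp, out, mean = price_ratios(model, "gpt-4o-mini")
    print(f"{model} vs gpt-4o-mini: input {inp:.1f}x, output {out:.1f}x, mean {mean:.1f}x")
# gpt-4o vs gpt-4o-mini: input 33.3x, output 25.0x, mean 29.2x
# gpt-4 vs gpt-4o-mini: input 200.0x, output 100.0x, mean 150.0x
```

On these figures, the 29x multiple is relative to GPT-4o. Against the GPT-4 prices listed in the table, the gap is 200x on input and 100x on output.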

Feature/Model	GPT-4o Mini	GPT-4o	GPT-4
Context Window	128K tokens	128K tokens	8,192 tokens
Max Output	16.4K tokens	8,192 tokens	8,192 tokens
Input cost	$0.15 per 1M tokens	$5 per 1M tokens	$30 per 1M tokens
Output cost	$0.60 per 1M tokens	$15 per 1M tokens	$60 per 1M tokens
MMLU Score	82.0%	88.7%	86.4%
MMMU Score	59.4%	69.1%	56%
MGSM Score	87%	90.5%	90.2%
HumanEval	87.2%	90.2%	87.1%

MMLU - Multitask accuracy; MMMU - Multimodal understanding & reasoning; MGSM - Maths reasoning; HumanEval - code generation

So what?

OpenAI is clearly stepping up its game to bring more accessible products to market, both to grow and to defend its market share. In less than 20 months, the cost of using its models has dropped by more than 26 times: in a medium-usage scenario of 10 million tokens per month, OpenAI’s text-davinci-003 would have cost $2,400 for the year, while GPT-4o mini costs $90.
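
For readers who want to reproduce the annual figures, here is a rough sketch. The per-token rates are assumptions rather than numbers stated in the paragraph: text-davinci-003 is taken at $20 per 1M tokens ($0.02 per 1K), and the $90 figure for GPT-4o mini is matched by adding its input and output rates ($0.15 + $0.60 = $0.75 per 1M). That second choice is an inference from the numbers, not something the text spells out. The `annual_cost` helper is a hypothetical name.

```python
TOKENS_PER_MONTH = 10_000_000  # the "medium usage" volume used in the text

# Assumed rates in USD per 1M tokens (see the caveats in the lead-in).
DAVINCI_003_RATE = 20.00  # assumption: $0.02 per 1K tokens, input and output alike
GPT_4O_MINI_RATE = 0.75   # assumption: $0.15 input + $0.60 output, summed


def annual_cost(usd_per_million_tokens, tokens_per_month=TOKENS_PER_MONTH):
    """Yearly spend in USD for a fixed monthly token volume."""
    monthly = usd_per_million_tokens * tokens_per_month / 1_000_000
    return monthly * 12


davinci = annual_cost(DAVINCI_003_RATE)
mini = annual_cost(GPT_4O_MINI_RATE)
print(f"text-davinci-003: ${davinci:,.0f} per year")  # $2,400 per year
print(f"GPT-4o mini: ${mini:,.0f} per year")          # $90 per year
print(f"ratio: {davinci / mini:.1f}x")                # 26.7x
```

A real bill depends on the actual split between input and output tokens, so treat the ratio as an order-of-magnitude guide.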

At the same time, the shift to small, task-specific models makes applications that need fast, real-time responses and that must work without a network connection much more feasible. It means we will see a proliferation of mobile apps powered by LLMs running at the edge, e.g. a true AI companion on your smartphone that requires no Internet connection, which is in line with what every phone maker is trying to build: AI smartphones.