Cost per model
Understanding Model Costs
Different AI models have varying pricing structures based on their capabilities and computational requirements. Here's a detailed breakdown of the costs for each available model. You'll see that the cost of any chat is very low: it is quite difficult to spend more than a few cents per chat, even with the most capable models.
Real Cost Examples
A single short question to Perplexity (Best):
- 363 total tokens
- $0.0005 per message
- $0.0068 total cost
A longer chat with Claude 3.5 Sonnet to learn about SEO best practices:
- 3,020 total tokens
- $0.05 total cost
Token-Based Pricing
Most models use token-based pricing, where costs are calculated separately for input (prompt) and output (completion) tokens. Prices are shown per 1 million tokens.
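As a rough illustration of the arithmetic, here is a minimal sketch in Python, assuming the GPT-4o rates listed just below; the function name and example token counts are made up for illustration.

```python
# Minimal sketch of token-based pricing (illustrative names and token counts).
def chat_cost(prompt_tokens, completion_tokens, prompt_price_per_m, completion_price_per_m):
    """Return the cost in dollars of a single model call."""
    return ((prompt_tokens / 1_000_000) * prompt_price_per_m
            + (completion_tokens / 1_000_000) * completion_price_per_m)

# Example: a 1,000-token prompt and a 500-token reply at GPT-4o rates
# ($2.50 per 1M prompt tokens, $10.00 per 1M completion tokens).
print(f"${chat_cost(1_000, 500, 2.50, 10.00):.4f}")  # roughly $0.0075 -- under a cent
```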
GPT-4o
- Prompt: $2.50 per 1M tokens
- Completion: $10.00 per 1M tokens
Claude 3.5 Sonnet
- Prompt: $3.00 per 1M tokens
- Completion: $15.00 per 1M tokens
Llama 3.3 70B (free for now)
- Prompt: $0 per 1M tokens
- Completion: $0 per 1M tokens
GPT-o1 (Premium Tier)
- Prompt: $15.00 per 1M tokens
- Completion: $60.00 per 1M tokens
GPT-o1 Mini
- Prompt: $3.00 per 1M tokens
- Completion: $12.00 per 1M tokens
DeepSeek R1
- Flat rate: $7.00 per 1M tokens (prompt, thinking, and completion)
Perplexity Models
Perplexity models have a unique pricing structure that combines token costs with per-request fees; the sketch after the tiers below shows how the two add up.
Perplexity Pro
- Prompt: $3.00 per 1M tokens
- Completion: $15.00 per 1M tokens
- Additional fee: $5.00 per 1,000 messages
Perplexity Basic
- Token costs: $1.00 per 1M tokens (both prompt and completion)
- Additional fee: $5.00 per 1,000 messages
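Here is a minimal sketch of how the per-request fee combines with token costs, using the Perplexity Basic rates above; the function name and the 400-token example are made up for illustration.

```python
# Minimal sketch of Perplexity-style pricing: token cost plus a flat per-message fee.
# Rates match the Basic tier listed above; names and token counts are illustrative.
PER_MESSAGE_FEE = 5.00 / 1_000      # $5.00 per 1,000 messages
TOKEN_PRICE_PER_M = 1.00            # $1.00 per 1M tokens (prompt and completion)

def perplexity_basic_cost(total_tokens, messages=1):
    """Return the cost in dollars for one or more Basic-tier requests."""
    return (total_tokens / 1_000_000) * TOKEN_PRICE_PER_M + messages * PER_MESSAGE_FEE

# A short 400-token exchange sent as a single message:
print(f"${perplexity_basic_cost(400):.4f}")  # $0.0004 in tokens + $0.0050 fee = $0.0054
```

Note that for short exchanges the per-message fee, not the token count, makes up the larger share of the cost.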
Cost Optimization Tips
- Choose the Right Model: For help selecting the most cost-effective model for your needs, check our model selection guide.
- Optimize Prompt Length: Since prompt tokens are charged, keep your inputs concise while maintaining clarity.
- Keep conversations short: This is the most important hack. To create the sense of an ongoing conversation that ChatGPT pioneered, every message you send re-sends the ENTIRE conversation so far to the model. The cost of each message therefore grows with the length of the chat, and the total cost grows roughly quadratically rather than linearly (see the sketch below).
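A minimal sketch of why this matters, assuming (as described above) that the full history is re-sent with every turn; the per-turn token counts are made up, and the rates are Claude 3.5 Sonnet's from earlier on this page.

```python
# Minimal sketch: re-sending the full history each turn makes the running cost
# grow roughly quadratically with conversation length (illustrative numbers).
PROMPT_PRICE, COMPLETION_PRICE = 3.00, 15.00   # Claude 3.5 Sonnet, dollars per 1M tokens
USER_TOKENS, REPLY_TOKENS = 200, 300           # assumed tokens per message and per reply

history_tokens = 0
total_cost = 0.0
for turn in range(1, 21):
    prompt_tokens = history_tokens + USER_TOKENS          # entire history + new message
    total_cost += (prompt_tokens * PROMPT_PRICE
                   + REPLY_TOKENS * COMPLETION_PRICE) / 1_000_000
    history_tokens += USER_TOKENS + REPLY_TOKENS          # the reply joins the history
    if turn in (5, 10, 20):
        print(f"after {turn:2d} turns: ${total_cost:.4f}")
# Roughly $0.04 after 5 turns, $0.12 after 10, and $0.39 after 20 --
# doubling the conversation length more than doubles the cost.
```

Starting a fresh chat for a new topic resets the history and keeps each message cheap.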