❓ AI Token Cost FAQ

New to AI API pricing? Here are the most frequently asked questions, explained in plain language.

📖 FAQ

What is a token?

Think of a token as the smallest unit of text a model processes, a bit like a "byte". But tokens do not map 1:1 to characters:

Chinese: 1 character ≈ 1-2 tokens ("你好世界" ≈ 4-6 tokens)
English: 1 word ≈ 1-1.5 tokens ("Hello World" ≈ 2 tokens)
Code: punctuation and keywords all count; one line is roughly 5-20 tokens

Quick rule: 1,000 tokens ≈ 750 English words ≈ 500 Chinese characters.
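
For a rough feel of these ratios, here is a minimal Python sketch that turns the per-character and per-word figures above into a low/high token range. It is only a heuristic under those assumptions: the function name and regexes are illustrative, and a provider's real tokenizer will give different counts.

```python
import math
import re

# Ratios from the list above (rough heuristics, not real tokenizer output).
TOKENS_PER_CJK_CHAR = (1.0, 2.0)   # 1 Chinese character ≈ 1-2 tokens
TOKENS_PER_EN_WORD = (1.0, 1.5)    # 1 English word ≈ 1-1.5 tokens

CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")          # common CJK ideographs only
EN_WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")  # plain English words


def estimate_token_range(text: str) -> tuple[int, int]:
    """Return a (low, high) token estimate for mixed Chinese/English text."""
    cjk_chars = len(CJK_CHAR.findall(text))
    en_words = len(EN_WORD.findall(text))
    low = cjk_chars * TOKENS_PER_CJK_CHAR[0] + en_words * TOKENS_PER_EN_WORD[0]
    high = cjk_chars * TOKENS_PER_CJK_CHAR[1] + en_words * TOKENS_PER_EN_WORD[1]
    return math.ceil(low), math.ceil(high)


print(estimate_token_range("Hello World"))  # (2, 3)
```

Use it only for ballpark budgeting; punctuation, digits, and code are ignored here, so real counts will usually be higher.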

How do I calculate token cost?

Cost = (input tokens / 1,000,000) × input price + (output tokens / 1,000,000) × output price, where both prices are quoted per million tokens.

Example: Claude Sonnet 4.6, sending a 1,000-token prompt and getting a 2,000-token reply:
Cost = (1,000 / 1M) × $3 + (2,000 / 1M) × $15 = $0.003 + $0.03 = $0.033

Yes, a single call costs only a few cents. The expense comes from many calls adding up.
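
The formula above is easy to script. A minimal Python sketch follows; the function name is made up for illustration, and the $3 / $15 per-million prices are simply the ones from the example, so check the current price list before relying on them.

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars, with both prices quoted per 1,000,000 tokens."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)


# The example above: 1,000 input tokens, 2,000 output tokens at $3 / $15.
single_call = token_cost(1000, 2000, 3.0, 15.0)
many_calls = single_call * 10_000  # the same call repeated 10,000 times

print(f"${single_call:.3f} per call, ${many_calls:.2f} for 10,000 calls")
# $0.033 per call, $330.00 for 10,000 calls
```

This is where the bill comes from: the per-call number looks tiny, but it scales linearly with call volume.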

What is the difference between input and output tokens?

Input tokens: the content you send to the AI, including the system prompt, chat history, and the new message. The longer it is, the more it costs.
Output tokens: the AI's reply. The output price is usually 3-5x the input price, because generating text consumes more compute.

Money-saving tip: controlling output length (for example, asking for a "one-sentence answer") is more effective than compressing the input.

Are there any free AI models?

Free options at the API level right now:

• Zhipu GLM-4-Flash: fully free, with rate limits
• Self-hosted Llama: the model is free, but you provide the GPU server
• Gemini Flash-Lite: free tier in Google AI Studio
• New-user signup bonuses: Anthropic gives $5, OpenAI gives a top-up bonus, and so on

Just want something free? GLM-4-Flash is enough.

What if I hit a rate limit?

The API returns a 429 error (Too Many Requests). You are not charged, but the request fails. Solutions:

• Lower your request frequency and add retry logic (exponential backoff)
• Upgrade your tier (usually requires a larger top-up)
• Switch to a model with looser rate limits (Chinese domestic models are usually fine)
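
Retry with exponential backoff can be sketched in Python as below. Everything named here is an assumption for illustration: `RateLimitError` is a hypothetical stand-in for whatever exception your client library raises on HTTP 429, and `send` is any zero-argument function that makes the request.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def call_with_backoff(send, max_retries=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call send(); on RateLimitError wait 1s, 2s, 4s, ... and try again."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, let the caller handle it
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Usage would look like `call_with_backoff(lambda: client.chat(...))`, where `client.chat` is your own request function. The `sleep` parameter is only there so the waiting can be swapped out in tests.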

How do I monitor API usage and cost?

Every platform has a usage dashboard:

• OpenAI: platform.openai.com/usage
• Anthropic: check usage at console.anthropic.com
• Google: AI Studio or Cloud Console
• Chinese domestic models: each provider's dashboard has usage statistics

Tip: set a spending limit and email alerts to avoid a surprise bill.

How does caching save money?

If you repeatedly send the same system prompt (such as "You are a translation assistant..."), enable caching: that part of the prompt is charged at full price once, and subsequent requests are charged the cache price (usually 10-25% of the original).

DeepSeek's caching stands out: a cache hit costs $0.028 per million input tokens versus $0.28 for a miss, a 10x difference. If your app has a long, fixed system prompt, enabling caching is a must.
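
To see what that means in dollars, here is a rough Python sketch. It assumes the simple model described above (the first request pays the full price for the fixed prompt, every later request pays the cache-hit price) and uses the $0.28 / $0.028 figures as per-million-token prices. The function name and the 2,000-token, 10,000-request scenario are made up for illustration; real billing can differ, for example when caches expire or a provider prices cache writes separately.

```python
def fixed_prompt_cost(prompt_tokens, requests, miss_price_per_m,
                      hit_price_per_m=None):
    """Dollar cost of sending the same fixed prompt `requests` times.

    Pass hit_price_per_m=None to model caching being turned off.
    """
    if requests <= 0:
        return 0.0
    miss_cost = prompt_tokens / 1_000_000 * miss_price_per_m
    if hit_price_per_m is None:
        return miss_cost * requests
    hit_cost = prompt_tokens / 1_000_000 * hit_price_per_m
    # First request misses the cache; the remaining ones hit it.
    return miss_cost + hit_cost * (requests - 1)


# Hypothetical workload: 2,000-token system prompt, 10,000 requests.
without_cache = fixed_prompt_cost(2000, 10_000, 0.28)
with_cache = fixed_prompt_cost(2000, 10_000, 0.28, 0.028)
print(f"no cache: ${without_cache:.2f}, with cache: ${with_cache:.2f}")
# no cache: $5.60, with cache: $0.56
```

Only the fixed prompt is counted here; the per-request user message and the output tokens are billed on top as usual.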

💡 Still have questions? Try the Universal Calculator to run the numbers yourself, or check the model-specific FAQ at the bottom of each model page.