LLM · Apache 2.0

gpt-oss-120b

by OpenAI Community

Oct 1, 2025 · 32K context · $0.02/M input · $0.09/M output

gpt-oss-120b is a community-developed model with a clean IP lineage. It offers high-throughput inference at 45-60K TPS and an excellent price/performance ratio.

View on HuggingFace
Fleek Pricing
$0.0025/GPU-second
Context: 32K tokens

Estimated Token Cost
Input: $0.02/M
Output: $0.09/M
Based on 23,000 tokens/sec
vs Competitors: Save 31%

Overview

Parameters

120B

Architecture

Dense Transformer

Context

32K

Provider

OpenAI Community

Best For

Enterprise coding · Code generation · Documentation · Testing

OpenAI Compatible

Drop-in replacement for OpenAI API. Just change the base URL.
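The swap can be sketched with nothing but the Python standard library; the base URL and API key below are hypothetical placeholders, not real Fleek values:

```python
import json
import urllib.request

# Hypothetical Fleek endpoint -- used in place of https://api.openai.com/v1.
BASE_URL = "https://api.fleek.example/v1"

payload = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Summarize this diff."}],
}

# Same request shape the OpenAI API expects, just pointed at a new host.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_FLEEK_API_KEY",  # placeholder key
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send it. With the official `openai`
# SDK, the same swap is just OpenAI(base_url=BASE_URL, api_key=...).
```

Existing client code keeps working because only the host changes; the request and response schemas stay the same.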

Pay Per Second

Only pay for actual GPU compute time. No idle costs.
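The per-second billing model reduces to simple arithmetic: GPU-seconds consumed is tokens divided by throughput, and cost is GPU-seconds times the rate. A minimal sketch, using the $0.0025/GPU-second rate and the 23,000 tokens/sec figure quoted on this page:

```python
GPU_SECOND_RATE = 0.0025   # $/GPU-second (Fleek rate quoted above)
THROUGHPUT_TPS = 23_000    # tokens/sec (figure quoted in the specs)

def request_cost(tokens: int) -> float:
    """Cost of a single request billed purely on GPU compute time."""
    gpu_seconds = tokens / THROUGHPUT_TPS
    return gpu_seconds * GPU_SECOND_RATE

# A 1,000-token generation occupies roughly 0.043 GPU-seconds,
# so its cost is a small fraction of a cent.
cost = request_cost(1_000)
```

Because billing stops when the GPU stops, idle time between requests costs nothing.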

Enterprise Ready

99.9% uptime SLA, SOC 2 compliant, dedicated support.

Auto Scaling

Scales from zero to thousands of requests automatically.

Compare Pricing

           Fleek    Fireworks  Together  Baseten
Input      $0.02    $0.15      $0.15     $0.10
Output     $0.09    $0.60      $0.60     $0.50
Savings    --       70%        70%       70%

Prices are per million tokens. Fleek pricing based on $0.0025/GPU-second.

Calculate Your Savings

See how much you'd save running gpt-oss-120b on Fleek

gpt-oss-120b
Your Fleek Cost: $9-14/mo (3.6K-5.6K GPU-sec × $0.0025)
Fireworks AI: $38/mo
Your Savings: 70%
Annual Savings: $319
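The calculator's arithmetic can be sketched directly from the figures above; taking the midpoint of the quoted GPU-second range, the savings rate lands at about 70% and annual savings at roughly $318, in line with the ~$319 shown:

```python
RATE = 0.0025                               # $/GPU-second
gpu_sec_low, gpu_sec_high = 3_600, 5_600    # monthly GPU-seconds (range above)

fleek_low = gpu_sec_low * RATE              # $9.00/mo
fleek_high = gpu_sec_high * RATE            # $14.00/mo
fleek_mid = (fleek_low + fleek_high) / 2    # midpoint of the quoted range

fireworks = 38.0                            # $/mo quoted above
monthly_savings = fireworks - fleek_mid
savings_pct = monthly_savings / fireworks   # about 0.70
annual_savings = monthly_savings * 12
```

Plugging in your own monthly GPU-seconds is enough to reproduce the estimate for any workload size.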

Technical Specifications

Model Name: gpt-oss-120b
Total Parameters: 120B
Active Parameters: N/A
Architecture: Dense Transformer
Context Length: 32K tokens
Inference Speed: 23,000 tokens/sec
Provider: OpenAI Community
Release Date: Oct 1, 2025
License: Apache 2.0
HuggingFace: https://huggingface.co/openai-community/gpt-oss-120b


Ready to run gpt-oss-120b?

Join the waitlist for early access. Start free with $5 in credits.

View Pricing