LLM · Apache 2.0

gpt-oss-120b

by OpenAI Community

Oct 1, 2025 · 32K context · $0.02/M input · $0.09/M output

gpt-oss-120b is a community-developed model with a clean IP lineage. It offers high-throughput inference at 45-60K TPS and an excellent price/performance ratio.

View on HuggingFace
Fleek Pricing
$0.0025/GPU-second
Context: 32K tokens

Estimated Token Cost
Input: $0.02/M
Output: $0.09/M
Based on 23,000 tokens/sec
vs Competitors: Save 31%

Overview

Parameters

120B

Architecture

Dense Transformer

Context

32K

Provider

OpenAI Community

Best For

Enterprise coding · Code generation · Documentation · Testing

OpenAI Compatible

Drop-in replacement for OpenAI API. Just change the base URL.
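The swap can be sketched with nothing but the Python standard library; the base URL and API key below are hypothetical placeholders, not real Fleek values:

```python
import json
import urllib.request

# Hypothetical Fleek endpoint -- used in place of https://api.openai.com/v1.
BASE_URL = "https://api.fleek.example/v1"

payload = {
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Summarize this diff."}],
}

# Same request shape the OpenAI API expects, just pointed at a new host.
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_FLEEK_API_KEY",  # placeholder key
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send it. With the official `openai`
# SDK, the same swap is just OpenAI(base_url=BASE_URL, api_key=...).
```

Existing client code keeps working because only the host changes; the request and response schemas stay the same.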

Pay Per Second

Only pay for actual GPU compute time. No idle costs.
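The per-second billing model reduces to simple arithmetic: GPU-seconds consumed is tokens divided by throughput, and cost is GPU-seconds times the rate. A minimal sketch, using the $0.0025/GPU-second rate and the 23,000 tokens/sec figure quoted on this page:

```python
GPU_SECOND_RATE = 0.0025   # $/GPU-second (Fleek rate quoted above)
THROUGHPUT_TPS = 23_000    # tokens/sec (figure quoted in the specs)

def request_cost(tokens: int) -> float:
    """Cost of a single request billed purely on GPU compute time."""
    gpu_seconds = tokens / THROUGHPUT_TPS
    return gpu_seconds * GPU_SECOND_RATE

# A 1,000-token generation occupies roughly 0.043 GPU-seconds,
# so its cost is a small fraction of a cent.
cost = request_cost(1_000)
```

Because billing stops when the GPU stops, idle time between requests costs nothing.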

Enterprise Ready

99.9% uptime SLA, SOC 2 compliant, dedicated support.

Auto Scaling

Scales from zero to thousands of requests automatically.

Compare Pricing

           Fleek    Fireworks  Together  Baseten
Input      $0.02    $0.15      $0.15     $0.10
Output     $0.09    $0.60      $0.60     $0.50
Savings    --       70%        70%       70%

Prices are per million tokens. Fleek pricing based on $0.0025/GPU-second.

Calculate Your Savings

See how much you'd save running gpt-oss-120b on Fleek

gpt-oss-120b
Your Fleek Cost: $9-14/mo (3.6K-5.6K GPU-sec × $0.0025)
Fireworks AI: $38/mo
Your Savings: 70%
Annual Savings: $319
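The calculator's arithmetic can be sketched directly from the figures above; taking the midpoint of the quoted GPU-second range, the savings rate lands at about 70% and annual savings at roughly $318, in line with the ~$319 shown:

```python
RATE = 0.0025                               # $/GPU-second
gpu_sec_low, gpu_sec_high = 3_600, 5_600    # monthly GPU-seconds (range above)

fleek_low = gpu_sec_low * RATE              # $9.00/mo
fleek_high = gpu_sec_high * RATE            # $14.00/mo
fleek_mid = (fleek_low + fleek_high) / 2    # midpoint of the quoted range

fireworks = 38.0                            # $/mo quoted above
monthly_savings = fireworks - fleek_mid
savings_pct = monthly_savings / fireworks   # about 0.70
annual_savings = monthly_savings * 12
```

Plugging in your own monthly GPU-seconds is enough to reproduce the estimate for any workload size.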

Technical Specifications

Model Name: gpt-oss-120b
Total Parameters: 120B
Active Parameters: N/A
Architecture: Dense Transformer
Context Length: 32K tokens
Inference Speed: 23,000 tokens/sec
Provider: OpenAI Community
Release Date: Oct 1, 2025
License: Apache 2.0
HuggingFace: https://huggingface.co/openai-community/gpt-oss-120b


Ready to run gpt-oss-120b?

Join the waitlist for early access. Start free with $5 in credits.

View Pricing