DeepSeek R1 32B

DeepSeek R1 32B is a state-of-the-art reasoning model from the DeepSeek series, specifically designed to excel at complex problem-solving and advanced reasoning tasks. It significantly outperforms conventional instruction-tuned models, particularly on hard problems that require multi-step reasoning.

Key Features

  • Parameters: 32 billion
  • Architecture: Transformers with RoPE and FlashAttention-2
  • Context Length: Up to 131,072 tokens
  • Key Capability: Enhanced reasoning through sophisticated step-by-step thinking
  • Performance: Competitive with other large reasoning models at similar parameter scales

API Reference

Endpoint

Our API follows the OpenAI-compatible format:

POST /v1/chat/completions

Model Name

Use the following model identifier in your requests:

deepseek-r1-32b

Parameters

  • model (string): The model identifier ("deepseek-r1-32b").
  • messages (array): An array of message objects that represent the conversation history, each with:
    • role (string): Either "system", "user", or "assistant".
    • content (string): The content of the message.
  • max_tokens (int): Max output tokens (default: 100, max: 32768).
  • temperature (float): Sampling temperature (recommended: 0.6).
  • top_p (float): Nucleus sampling parameter (recommended: 0.95).
  • top_k (int): Top-k sampling parameter (recommended: 20-40).
  • stream (boolean): Whether to stream the response (default: false).
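As a sketch, the parameters above map onto a JSON request body like the following (the payload shape follows the OpenAI-compatible format described earlier; the example values are illustrative, not defaults):

```python
import json

# Hypothetical request body assembled from the parameters listed above.
payload = {
    "model": "deepseek-r1-32b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant that thinks step by step."},
        {"role": "user", "content": "How many r's are in the word 'strawberry'?"},
    ],
    "max_tokens": 500,    # default: 100, max: 32768
    "temperature": 0.6,   # recommended
    "top_p": 0.95,        # recommended
    "stream": False,      # default
}

# This is the JSON a client would POST to /v1/chat/completions.
body = json.dumps(payload)
```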

Optimal Usage

For best results:

  • Enable the model’s thinking process with the <think> tag in system prompts
  • Use temperature=0.6 and top_p=0.95 instead of greedy decoding
  • For math problems, include “Please reason step by step” in your prompt
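Putting these tips together, the request arguments for a math problem might look like this sketch (the system-prompt wording is an illustrative assumption; the sampling values follow the recommendations above):

```python
# Sketch: recommended sampling settings plus a step-by-step math prompt.
request_kwargs = {
    "model": "deepseek-r1-32b",
    "messages": [
        # Hypothetical system prompt encouraging the <think> reasoning process.
        {"role": "system", "content": "You are a helpful assistant. Reason inside <think> tags before answering."},
        {"role": "user", "content": "Please reason step by step: what is 17 * 24?"},
    ],
    "temperature": 0.6,  # recommended over greedy decoding
    "top_p": 0.95,
    "max_tokens": 1024,
}
```

These keyword arguments would then be passed to client.chat.completions.create(**request_kwargs) as in the examples below.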

Code Examples

Choose your preferred programming language:

Standard Request

from openai import OpenAI

client = OpenAI(
    api_key="your-deeprequest-key",
    base_url="https://api.deeprequest.io/v1"
)

# Standard request
response = client.chat.completions.create(
    model="deepseek-r1-32b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that thinks step by step."},
        {"role": "user", "content": "How many r's are in the word 'strawberry'?"}
    ],
    temperature=0.6,
    max_tokens=500
)

# Print the response
print(response.choices[0].message.content)

Streaming Request

from openai import OpenAI

client = OpenAI(
    api_key="your-deeprequest-key",
    base_url="https://api.deeprequest.io/v1"
)

# Streaming response
stream = client.chat.completions.create(
    model="deepseek-r1-32b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that thinks step by step."},
        {"role": "user", "content": "How many r's are in the word 'strawberry'?"}
    ],
    temperature=0.6,
    max_tokens=500,
    stream=True  # Enable streaming
)

# Process the streaming response
print("Streaming response:")
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

Response Format

Here’s an example response from the model:

{
  "id": "chatcmpl-123456789",
  "object": "chat.completion",
  "created": 1704476253,
  "model": "deepseek-r1-32b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "<think>\nLet me analyze the word 'strawberry' letter by letter.\nS-t-r-a-w-b-e-r-r-y\n\nI can see that the letter 'r' appears 3 times in the word 'strawberry':\n1. The third letter: str\n2. The eighth letter: strawber\n3. The ninth letter: strawberr\n\nSo there are 3 r's in the word 'strawberry'.\n</think>\n\nTo count the number of r's in 'strawberry', I'll examine each letter:\n\ns-t-r-a-w-b-e-r-r-y\n\nI can see the letter 'r' appears three times in the word:\n- First 'r': 3rd position\n- Second 'r': 8th position \n- Third 'r': 9th position\n\nTherefore, there are 3 r's in the word 'strawberry'."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 215,
    "total_tokens": 239
  }
}
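Note that the model's reasoning appears inline in message.content between <think> tags, followed by the final answer. If you want to handle them separately, a small helper like this hypothetical one can split the two parts (the tag format matches the example response above):

```python
import re

def split_reasoning(content: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer) on the <think> tags."""
    match = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
    if match is None:
        return "", content.strip()          # no reasoning block present
    reasoning = match.group(1).strip()
    answer = content[match.end():].strip()  # everything after </think>
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>\nS-t-r-a-w-b-e-r-r-y has 3 r's.\n</think>\n\nThere are 3 r's."
)
```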

Additional Resources

For detailed API documentation, please visit our API Docs or ReDoc.

For pricing details, see our Pricing page.