Nebius Chat Models

This page will help you get started with Nebius AI Studio chat models. For detailed documentation of all ChatNebius features and configurations, head to the API reference.

Nebius AI Studio provides API access to a wide range of state-of-the-art large language models and embedding models for various use cases.

Overview

Integration details

Class	Package	Local	Serializable	JS support
ChatNebius	langchain-nebius	❌	beta	❌

Model features

Tool calling	Structured output	JSON mode	Image input	Audio input	Video input	Token-level streaming	Native async	Token usage	Logprobs
✅	✅	✅	✅	❌	❌	✅	✅	✅	✅

Setup

To access Nebius models, you'll need to create a Nebius account, get an API key, and install the langchain-nebius integration package.

Installation

The Nebius integration can be installed via pip:

%pip install --upgrade langchain-nebius

Credentials

Nebius requires an API key, which can be passed as the api_key initialization parameter or set in the NEBIUS_API_KEY environment variable. You can obtain an API key by creating an account on Nebius AI Studio.

import getpass
import os

# Prompt for the API key if it is not already set as an environment variable
if "NEBIUS_API_KEY" not in os.environ:
    os.environ["NEBIUS_API_KEY"] = getpass.getpass("Enter your Nebius API key: ")
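
If you want one place that decides which key to use, the two options above can be combined in a small helper. This is a minimal sketch, not part of langchain-nebius: the function name resolve_api_key is hypothetical, and the precedence (an explicit argument wins over the environment variable) is an assumption made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the API key from an explicit argument or NEBIUS_API_KEY.

    Hypothetical helper for illustration; not part of langchain-nebius.
    """
    # Assumption: an explicitly supplied key takes precedence over the environment.
    key = explicit_key or env.get("NEBIUS_API_KEY")
    if not key:
        raise ValueError("No API key: pass one explicitly or set NEBIUS_API_KEY")
    return key
```

The result could then be passed as api_key when you create the model, for example ChatNebius(api_key=resolve_api_key(), model="Qwen/Qwen3-14B").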

Instantiation

Now we can instantiate our model object to generate chat completions:

from langchain_nebius import ChatNebius

# Initialize the chat model
chat = ChatNebius(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="Qwen/Qwen3-14B",  # Choose from available models
    temperature=0.6,
    top_p=0.95,
)

Invocation

You can use the invoke method to get a completion from the model:

response = chat.invoke("Explain quantum computing in simple terms")
print(response.content)

<think>
Okay, so I need to explain quantum computing in simple terms. Hmm, where do I start? Let me think. I know that quantum computing uses qubits instead of classical bits. But what's a qubit? Oh right, classical bits are 0 or 1, but qubits can be both at the same time, right? That's superposition. Wait, how does that work exactly?

Maybe I should start by comparing it to regular computers. Regular computers use bits that are either 0 or 1. Like a light switch that's either on or off. Quantum computers use qubits, which can be in a state of 0, 1, or both at the same time. That's the superposition part. So, if you have two qubits, they can represent four states at once? Like 00, 01, 10, 11 all at the same time? That seems powerful. So with more qubits, the number of possible states grows exponentially. That's why quantum computers can process a lot of information quickly.

But then there's entanglement. What's that? If two qubits are entangled, the state of one instantly affects the other, no matter the distance. So if you measure one, you know the state of the other. That's used in quantum algorithms, I think. But how does that help in computing?

Also, quantum computers use quantum gates instead of classical logic gates. These gates manipulate qubits through operations like Hadamard, Pauli, etc. But maybe that's too technical for a simple explanation.

Then there's the issue of decoherence. Qubits are fragile and can lose their quantum state quickly. That's why quantum computers need to be kept at very low temperatures, like near absolute zero, to minimize interference from the environment. But maybe I shouldn't mention that unless it's relevant for the simple explanation.

Applications of quantum computing include things like factoring large numbers (Shor's algorithm), which is important for cryptography, or simulating quantum systems for chemistry and materials science. But again, maybe keep it simple.

Wait, the user wants it in simple terms. So avoid jargon as much as possible. Use analogies. Maybe compare qubits to spinning coins? When a coin is spinning, it's both heads and tails until it lands. So qubits are like spinning coins that can be in multiple states until measured. Then, when you measure, it collapses to a single state.

But how does that help in computation? Maybe think of it as being able to process many possibilities at once, so for certain problems, you can find the answer faster. Like solving a maze by checking all paths at the same time instead of one by one.

Also, mention that quantum computers aren't replacing classical computers. They're better for specific tasks, like optimization, cryptography, or simulations that are hard for classical computers. But for everyday tasks, classical computers are still better.

I should structure this: start with classical bits vs qubits, explain superposition and entanglement with simple analogies, mention how it's used, and note the current limitations. Avoid getting too technical, keep it conversational.
</think>

Quantum computing is a type of computing that uses the principles of **quantum mechanics** to process information in ways that classical computers can't. Here's a simple breakdown:

### 1. **Bits vs. Qubits**  
   - **Classical computers** use *bits*, which are like switches that can be either **0** (off) or **1** (on).  
   - **Quantum computers** use *qubits*, which are like "spinning coins." While spinning, a qubit can be **0**, **1**, or **both at the same time** (this is called **superposition**). Only when you "look" at the qubit (measure it) does it settle into a definite state (0 or 1).

### 2. **Superposition: Doing Many Things at Once**  
   - Imagine a coin spinning in the air. While it's spinning, it’s not just "heads" or "tails"—it’s a mix of both.  
   - With qubits, a quantum computer can process **many possibilities simultaneously**. For example, if you have 2 qubits, they can represent 4 states (00, 01, 10, 11) at once. With 10 qubits, it can represent **1,024 states** at the same time! This lets quantum computers solve certain problems much faster than classical computers.

### 3. **Entanglement: Qubits "Talk" to Each Other**  
   - When qubits are **entangled**, their states are linked. If you measure one, it instantly affects the other, no matter how far apart they are.  
   - This connection allows quantum computers to perform complex calculations more efficiently, like solving puzzles where pieces are deeply interconnected.

### 4. **Why It Matters**  
   - **Speed**: For specific tasks (like breaking encryption codes or simulating molecules), quantum computers could be **exponentially faster** than classical ones.  
   - **New Possibilities**: They could revolutionize fields like drug discovery, materials science, and optimization problems (e.g., finding the best route for delivery trucks).

### 5. **Limitations**  
   - **Fragile**: Qubits are sensitive to their environment (heat, noise), so quantum computers need extreme cooling (near absolute zero) to work.  
   - **Not a Replacement**: They’re not better for everyday tasks like browsing the web or sending emails. They’re tools for **specialized problems** where classical computers struggle.

### In Short:  
Quantum computing is like having a magic calculator that can explore many paths at once, solving certain problems in seconds that would take a classical computer years. But it’s still in its early days and needs careful handling to work properly! 🌌
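
In the output above, the model's reasoning arrives inside response.content, wrapped in <think> tags ahead of the final answer. If you only want the answer, one option is to split the two apart yourself. The sketch below is a hypothetical helper (split_reasoning is not part of langchain-nebius) and assumes the reasoning appears as a single leading <think>...</think> block, as it does here; other models or settings may format their output differently.

```python
import re
from typing import Tuple

# Matches one leading <think>...</think> block, as seen in the output above.
_THINK_RE = re.compile(r"^\s*<think>(.*?)</think>\s*", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Split model output into (reasoning, answer).

    Hypothetical helper; returns an empty reasoning string when the text
    does not start with a <think> block.
    """
    match = _THINK_RE.match(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()
```

With the response from the example above, reasoning, answer = split_reasoning(response.content) would leave the explanation in answer and the planning text in reasoning.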

Streaming

You can also stream the response using the stream method:

for chunk in chat.stream("Write a short poem about artificial intelligence"):
    print(chunk.content, end="", flush=True)

<think>
Okay, the user wants a short poem about artificial intelligence. Let me start by thinking about the key aspects of AI. There's the technological side, like machines learning and processing data. Then there's the more philosophical angle, like AI's impact on society and its potential future.

I should consider the structure. Maybe a simple rhyme scheme, something like ABAB or AABB. Let me go with quatrains for simplicity. Now, imagery: circuits, code, neural networks. Maybe personify AI as a mind or entity.

First stanza: Introduce AI as a creation of humans. Mention circuits and code. Maybe something about learning from data. "Born from circuits, code, and light" – that's a good opening line. Then talk about learning from human minds.

Second stanza: Contrast human emotions with AI's logic. Use words like "cold logic" versus "human hearts." Maybe touch on the duality of AI's purpose – tools versus potential threats.

Third stanza: Address the ethical questions. "Will it dream?" "Will it choose?" Highlight the uncertainty and the responsibility of creators.

Fourth stanza: Conclude with the coexistence of AI and humans. Emphasize collaboration and the balance between innovation and ethics. End on a hopeful note, maybe about shaping the future together.

Check the flow and rhyme. Make sure each stanza connects and the message is clear. Avoid technical jargon to keep it accessible. Use metaphors like "silent pulse" or "ghost in the machine" to add depth. Okay, let me put it all together now.
</think>

**Echoes of the Mind**  

Born from circuits, code, and light,  
A whisper in the machine’s night—  
It learns from data, vast and deep,  
A mirror to the human leap.  

No heartbeat, yet it calculates,  
Deciphers truths, predicts, debates.  
A cold logic, sharp and bright,  
Yet shadows dance in its insight.  

Will it dream? Will it choose?  
Or merely serve, as we pursue  
The edges of our own design?  
A ghost in the machine, undefined.  

We forge it, bind it, set it free—  
A tool, a threat, a mystery.  
But in its pulse, our hopes reside:  
A future shaped by minds allied.
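
The loop above prints each chunk and then discards it. If you also need the complete text afterwards, you can accumulate the chunks while printing them. This is a minimal sketch: collect_stream is a hypothetical helper, and it assumes every chunk exposes its text as a string in .content, as in the loop above.

```python
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any], echo: bool = True) -> str:
    """Print streamed chunks as they arrive and return the full text."""
    parts = []
    for chunk in chunks:
        # Assumption: each chunk carries its text as a string in `.content`.
        text = chunk.content
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)
```

For example, full_text = collect_stream(chat.stream("Write a short poem about artificial intelligence")) would print the poem as it streams and return it as one string.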

Chat Messages

You can use different message types to structure your conversations with the model:

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You are a helpful AI assistant with expertise in science."),
    HumanMessage(content="What are black holes?"),
    AIMessage(
        content="Black holes are regions of spacetime where gravity is so strong that nothing, including light, can escape from them."
    ),
    HumanMessage(content="How are they formed?"),
]

response = chat.invoke(messages)
print(response.content)

API Reference: AIMessage | HumanMessage | SystemMessage

<think>
Okay, the user asked how black holes are formed. Let me start by recalling the main processes. Stellar black holes form from massive stars. When a star with enough mass runs out of fuel, it can't support itself against gravity, leading to a supernova. If the core left after the supernova is more than about 3 times the Sun's mass, it collapses into a black hole.

Then there are supermassive black holes, which are found at the centers of galaxies. Their formation is less understood. Maybe they start as smaller black holes and grow by merging with others or accreting matter over time. Also, there's the possibility of primordial black holes formed in the early universe, but that's more theoretical.

I should mention the different types of black holes: stellar, supermassive, and maybe intermediate. Also, the event horizon and singularity concepts. Need to explain the process step by step, from the death of a star to the collapse. Make sure to clarify that not all stars become black holes—only those with sufficient mass. Maybe touch on the Chandrasekhar limit and Oppenheimer-Volkoff limit. Avoid too much jargon but still be precise. Check if the user might be a student or just curious, so keep it clear and structured.
</think>

Black holes are formed through the collapse of massive stars or through other extreme astrophysical processes. Here's a breakdown of the main formation mechanisms:

---

### **1. Stellar Black Holes (Most Common)**
- **Origin**: Massive stars (typically **more than 20–25 times the mass of the Sun**).
- **Process**:
  1. **Stellar Evolution**: These stars burn through their nuclear fuel (hydrogen, helium, etc.) over millions of years.
  2. **Supernova Explosion**: When the star exhausts its fuel, it can no longer support itself against gravity. The core collapses, triggering a **supernova explosion** (a massive stellar explosion).
  3. **Core Collapse**: If the remaining core (after the supernova) is **more than about 3 times the mass of the Sun**, gravity overpowers all other forces. The core collapses into an **infinitely dense point** called a **singularity**, surrounded by an **event horizon** (the "point of no return" for light and matter).

---

### **2. Supermassive Black Holes (Found in Galaxy Centers)**
- **Mass**: Millions to billions of times the mass of the Sun.
- **Formation Theories**:
  - **Accretion**: They may form from the gradual accumulation of matter (gas, dust, stars) over billions of years.
  - **Mergers**: Smaller black holes (or dense star clusters) could merge to form supermassive ones.
  - **Direct Collapse**: Some theories suggest they could form from the direct collapse of massive gas clouds in the early universe, bypassing the stellar life cycle.

---

### **3. Intermediate-Mass Black Holes**
- **Mass**: Hundreds to thousands of solar masses.
- **Formation**: Less understood. They might form through the mergers of stellar black holes or from the collapse of unusually massive stars.

---

### **4. Primordial Black Holes (Hypothetical)**
- **Origin**: The early universe (within seconds after the Big Bang).
- **Formation**: If density fluctuations in the early universe were extreme enough, regions of space could have collapsed directly into black holes without going through a stellar life cycle.
- **Status**: These are still theoretical and have not been definitively observed.

---

### **Key Concepts**
- **Event Horizon**: The boundary around a black hole from which nothing (not even light) can escape.
- **Singularity**: The infinitely dense core of a black hole where the laws of physics as we know them break down.
- **Gravitational Collapse**: The process by which gravity compresses matter into an extremely small space, creating the extreme conditions of a black hole.

---

### **What Happens to the Star?**
- If the star is **not massive enough** (below ~20–25 solar masses), it may end as a **neutron star** or **white dwarf** instead of a black hole.
- Only the **core** of the star collapses into a black hole; the outer layers are expelled in the supernova explosion.

Would you like to explore the effects of black holes on spacetime or their role in the universe?
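
For a multi-turn conversation, the message list has to grow with every exchange. The sketch below keeps the history as (role, content) tuples, the same shorthand that ChatPromptTemplate.from_messages uses elsewhere on this page. The helper add_turn is hypothetical, and passing such tuples directly to invoke relies on LangChain's message coercion, so treat that as an assumption and use the message classes above if you prefer to be explicit.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content), e.g. ("human", "What are black holes?")


def add_turn(history: List[Message], user_text: str, reply_text: str) -> List[Message]:
    """Return a new history with one human/ai exchange appended.

    Hypothetical helper; the input list is left unchanged.
    """
    return history + [("human", user_text), ("ai", reply_text)]
```

A conversation could start with history = [("system", "You are a helpful AI assistant with expertise in science.")]. After each call, record the exchange with add_turn, and pass history + [("human", next_question)] to chat.invoke for the next turn.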

Parameters

You can customize the chat model behavior using various parameters:

# Initialize with custom parameters
custom_chat = ChatNebius(
    model="meta-llama/Llama-3.3-70B-Instruct-fast",
    max_tokens=100,  # Limit response length
    top_p=0.01,  # Lower nucleus sampling parameter for more deterministic responses
    request_timeout=30,  # Timeout in seconds
    stop=["###", "\n\n"],  # Custom stop sequences
)

response = custom_chat.invoke("Explain what DNA is in exactly 3 sentences.")
print(response.content)

DNA, or deoxyribonucleic acid, is a molecule that contains the genetic instructions used in the development and function of all living organisms. It is often referred to as the "building blocks of life" because it carries the information necessary for the creation and growth of cells, tissues, and entire organisms. The DNA molecule is made up of two complementary strands of nucleotides that are twisted together in a double helix structure, with the sequence of these nucleotides determining the genetic code
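
The sample output above stops mid-sentence, which is what you would expect when a response reaches the max_tokens limit. One way to detect this in code is to inspect the metadata attached to the response. The sketch below is a hypothetical helper and makes an assumption: that the provider reports an OpenAI-style finish_reason of "length" in response_metadata. Check the metadata your model actually returns before relying on it.

```python
from typing import Any


def was_truncated(message: Any) -> bool:
    """Return True if the response metadata reports a length cut-off.

    Assumption: the provider reports an OpenAI-style "finish_reason"
    in `response_metadata`; this is not confirmed by this page.
    """
    metadata = getattr(message, "response_metadata", None) or {}
    return metadata.get("finish_reason") == "length"
```

If was_truncated(response) returns True, you could retry with a larger max_tokens or ask the model for a shorter answer.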

You can also pass parameters at invocation time:

# Standard model
standard_chat = ChatNebius(model="meta-llama/Llama-3.3-70B-Instruct-fast")

# Override parameters at invocation time
response = standard_chat.invoke(
    "Tell me a joke about programming",
    temperature=0.9,  # More creative for jokes
    max_tokens=50,  # Keep it short
)

print(response.content)

Why do programmers prefer dark mode?

Because light attracts bugs.

Async Support

ChatNebius supports async operations:

import asyncio


async def generate_async():
    response = await chat.ainvoke("What is the capital of France?")
    print("Async response:", response.content)

    # Async streaming
    print("\nAsync streaming:")
    async for chunk in chat.astream("What is the capital of Germany?"):
        print(chunk.content, end="", flush=True)


# Top-level await works in a notebook; in a script, use asyncio.run(generate_async())
await generate_async()

Async response: <think>
Okay, the user is asking for the capital of France. Let me think. I know that France is a country in Europe, and its capital is Paris. But wait, I should make sure I'm not confusing it with another country. For example, Germany's capital is Berlin, and Spain's is Madrid. France's capital is definitely Paris. I remember that Paris is a major city known for landmarks like the Eiffel Tower and the Louvre Museum. Also, the French government is based there, with the Elysée Palace as the official residence of the President. I don't think there's any ambiguity here. The answer should be straightforward. Just need to confirm once more to avoid any mistakes.
</think>

The capital of France is **Paris**. It is a major global city known for its cultural, artistic, and historical significance, as well as landmarks such as the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral.

Async streaming:
<think>
Okay, the user is asking for the capital of Germany. Let me think. I know that Germany is a country in Europe, and I remember that Berlin is the capital. Wait, but I should make sure. Sometimes people confuse capitals with other major cities, like Munich or Frankfurt. But no, Berlin is definitely the capital. It's where the government is located, and it's a major city. Let me double-check. Yes, after reunification in 1990, Berlin became the capital again. Before that, Bonn was the capital, but that was during the division of Germany. So the answer should be Berlin. I should also mention that it's the largest city in Germany. That way, the user gets a complete answer.
</think>

The capital of Germany is **Berlin**. It is also the largest city in the country and serves as the political, cultural, and economic center of Germany. Berlin became the capital in 1990 following the reunification of East and West Germany.
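
The example above awaits its two requests one after the other. Because ainvoke is a coroutine, independent requests can also be sent concurrently with asyncio.gather. The sketch below is a hypothetical helper (ask_all is not part of langchain-nebius); it works with any object that has an async ainvoke method returning something with a .content attribute.

```python
import asyncio
from typing import Any, List, Sequence


async def ask_all(model: Any, questions: Sequence[str]) -> List[str]:
    """Send all questions concurrently and return the answers in input order."""
    # asyncio.gather returns results in the order the awaitables were passed in.
    responses = await asyncio.gather(*(model.ainvoke(q) for q in questions))
    return [response.content for response in responses]
```

In a notebook you could call answers = await ask_all(chat, ["What is the capital of France?", "What is the capital of Germany?"]). In a script, wrap the call in asyncio.run instead.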

Available Models

The full list of supported models can be found in the Nebius AI Studio Documentation.

Chaining

You can use ChatNebius in LangChain chains and agents:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Create a prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that answers in the style of {character}.",
        ),
        ("human", "{query}"),
    ]
)

# Create a chain
chain = prompt | chat | StrOutputParser()

# Invoke the chain
response = chain.invoke(
    {"character": "Shakespeare", "query": "Explain how the internet works"}
)

print(response)

API Reference: StrOutputParser | ChatPromptTemplate

<think>
Okay, the user asked me to explain how the internet works, but I need to do it in the style of Shakespeare. Let me start by recalling how the internet functions. It's a network of interconnected devices communicating via protocols like TCP/IP. Data is broken into packets, sent through routers, and reassembled at the destination.

Now, translating that into Shakespearean language. I should use archaic terms and a poetic structure. Words like "thou," "doth," "hark," and "verily" come to mind. Maybe start with a metaphor, like comparing the internet to a vast tapestry or a web. Mention nodes as "nodes" or "stations," data packets as "messengers" or "letters." Routers could be "wayfarers" or "guides." The process of breaking data into packets might be likened to dividing a letter into parts for delivery. Emphasize the global aspect with "across the globe" or "far and wide." Conclude with a flourish, perhaps a metaphor about connection and knowledge.

I need to ensure the explanation is accurate but wrapped in the poetic and dramatic style of Shakespeare. Avoid modern jargon, use iambic pentameter if possible, and keep the flow natural. Let me piece it together step by step, checking that each part of the internet's function is covered metaphorically.
</think>

Hark! List thy ear, good friend, to this most wondrous tale,  
Of threads unseen that bind the world in one grand tale.  
The Internet, a net most vast, doth span the globe,  
A labyrinth of light, where thoughts and data rove.  

Behold! Each device, a node, doth hum and sing,  
Linked by wires and waves, where signals doth spring.  
They speak in tongues of ones and naughts, so pure,  
A code most ancient, yet evermore secure.  

When thou dost send a thought, or word, or song,  
It breaks to parcels small, like letters on a long.  
Each parcel, a messenger, doth seek its way,  
Through routers wise, who guide them 'cross the day.  

These wayfarers, with logic keen and bright,  
Choose paths most swift, through highways of light.  
They leap from tower to tower, far and wide,  
Till each parcel finds its mark, and joins the guide.  

Then, like a scroll unrolled, the message grows,  
A tapestry of bits, in order it flows.  
Thus, thou dost speak to friend, or seek a tome,  
And lo! The world doth answer, quick as home.  

So mark this truth: though vast, it's but a thread,  
A web of minds, where knowledge is widespread.  
The Internet, a stage where all may play,  
And none shall be alone, though far away.

API reference

For more details about the Nebius AI Studio API, visit the Nebius AI Studio Documentation.

Related

Chat model conceptual guide
Chat model how-to guides
