Elena's AI Blog

DeepSeek R1 With Ollama

28 Jan 2025 / 40 minutes to read

Elena Daehnhardt


Midjourney 6.1. AI-art: The White Lama rests on a green field with rose, camomile, and tulip flowers in separate baskets
I am still working on this post, which is mostly complete. Thanks for your visit!


Introduction

Large Language Models (LLMs) are becoming more popular, especially as people want to run AI tools on their devices. This can protect your privacy, reduce wait times, and lower costs.

Two tools making this possible are:

  • Ollama: an open-source command-line tool for downloading and running LLMs, such as Llama and DeepSeek, locally.
  • DeepSeek R1: a new reasoning-focused language model from the Chinese company DeepSeek that is quickly gaining attention.

What is DeepSeek R1, and is it better than ChatGPT and other AI models?

According to a TechCrunch article by Maxwell Zeff, DeepSeek has surpassed popular AI models like ChatGPT in downloads and usage, thanks to its open models that compete at a lower cost. The app has seen over 300% more downloads than Perplexity in just a week [1]!

As a notorious GPT user, I found DeepSeek's open-source models very interesting to investigate and try out :)

This post will explore how to use Ollama and DeepSeek R1 together. We’ll walk through their installation and basic usage.

I will use Ollama to download and run the DeepSeek R1 model and briefly compare it with the Llama3.2 model.

What Is Ollama?

Ollama is an open-source app that allows you to run large language models like Llama on your own machine.

It provides:

  • Local Inference: Your data stays on your machine when you run the model.
  • Command-Line Interface: Easily interact with the model directly from your terminal.
  • Easy Model Management: Simple downloads and updates for different language models.

Ollama aims to make experimenting with LLMs easy without needing cloud services.

Installing Ollama

To install Ollama:

  1. Visit the Download Ollama page or the Ollama GitHub.
  2. Download the latest installer for your platform.
  3. Open the installer and follow the on-screen instructions.

You can check if Ollama is installed successfully by running the command:

ollama -v

It should display Ollama’s version:

ollama version is 0.5.7

Using Homebrew

If you prefer using Homebrew, ensure you have it installed.

You can use the following command to install Homebrew with curl:

   /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Then, install Ollama with this command:

   brew install ollama

Once the installation completes, verify:

   ollama -v

Ollama models

You can easily download models using Ollama’s built-in model registry. Check the available models on the Ollama library webpage.

Ollama Library

Here’s how you get the models using the ollama CLI:

  1. Open your terminal or command prompt.

  2. Use the ollama pull command followed by the model name you want to download.

    ollama pull <model_name>
    

    For example, to download the llama2 model, you would run:

    ollama pull llama2
    

    This will download and store the model in your local Ollama model directory.

  3. You may specify a model variant. Some models have different variants with varying sizes and capabilities. You can specify the variant you want by appending it to the model name with a colon.

    ollama pull llama2:7b
    

    This will pull the 7 billion parameter variant of the llama2 model.

Where are the models stored?

By default, Ollama stores models in the following directories:

  • macOS: ~/.ollama/models
  • Linux: /usr/share/ollama/.ollama/models (or ~/.ollama/models when Ollama runs under your user account)
  • Windows: C:\Users\<username>\.ollama\models

You can change the default model directory by setting the OLLAMA_MODELS environment variable.

Note that download time will vary depending on the model size and your internet speed. Once a model is downloaded, you can use it offline.

You can list your locally available models using: ollama list.

To delete a model, use: ollama rm <model_name>.

Following these steps, you can easily download and manage large language models within Ollama and use them for various tasks.
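
As a quick illustration (not part of Ollama itself), the tabular output of ollama list can be parsed in Python, for example to script cleanup of large local models. The helper name parse_ollama_list and the assumed column layout (NAME, ID, SIZE, MODIFIED) are my own sketch based on the listing shown later in this post:

```python
def parse_ollama_list(output: str):
    """Parse the tabular output of `ollama list` into a list of dicts.

    Assumes the default column layout: NAME, ID, SIZE, MODIFIED.
    """
    lines = output.strip().splitlines()
    models = []
    for line in lines[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 4:
            continue
        models.append({
            "name": parts[0],
            "id": parts[1],
            "size": " ".join(parts[2:4]),     # e.g. "1.1 GB"
            "modified": " ".join(parts[4:]),  # e.g. "7 hours ago"
        })
    return models


# Sample output, as produced by `ollama list` on my machine:
sample = """NAME                ID              SIZE      MODIFIED
deepseek-r1:1.5b    a42b25d8c10a    1.1 GB    7 hours ago
llama3.2:latest     a80c4f17acd5    2.0 GB    27 hours ago"""

for model in parse_ollama_list(sample):
    print(model["name"], model["size"])
```

In a real script, you would feed the function the output of subprocess.run(["ollama", "list"], capture_output=True) instead of the hard-coded sample.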

Running models

This command will download and run the llama3.2 model:

ollama run llama3.2
pulling manifest 
pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB                         
pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB                         
pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB                         
pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB                         
pulling 56bb8bd477a5... 100% ▕████████████████▏   96 B                         
pulling 34bb5ab01051... 100% ▕████████████████▏  561 B                         
verifying sha256 digest 
writing manifest 
success 
>>> Send a message (/? for help)

We can see the available commands in the Terminal:

>>> /?
Available Commands:
  /set            Set session variables
  /show           Show model information
  /load <model>   Load a session or model
  /save <model>   Save your current session
  /clear          Clear session context
  /bye            Exit
  /?, /help       Help for a command
  /? shortcuts    Help for keyboard shortcuts

Use """ to begin a multi-line message.

Model parameters

When operating a model, it may be necessary to adjust its parameters to enhance performance.

Ollama offers several parameters that allow you to fine-tune the behaviour of large language models (LLMs) during text generation. These parameters can be specified in a Modelfile or passed as options when using the ollama run command or the Ollama API.

These parameters are found in the GitHub repository Ollama Model File.

Here are some of the key Ollama parameters:

  • num_predict: sets the maximum number of tokens to predict when generating text. The default value is -1 for infinite generation. PARAMETER num_predict 100 (limits output to 100 tokens)
  • num_ctx: sets the size of the context window, which determines how many tokens the LLM can use as context for generating the next token. Larger values allow the model to “remember” more of the conversation history. PARAMETER num_ctx 4096 (sets the context window to 4096 tokens)
  • temperature: controls the “creativity” of the model. Higher values make the output more random and creative, while lower values make it more focused and deterministic. PARAMETER temperature 0.7 (sets the temperature to 0.7)
  • top_k: limits token selection to the k most likely tokens, which helps prevent nonsensical or irrelevant outputs. PARAMETER top_k 40 (considers only the top 40 most likely tokens)
  • top_p: limits token selection to a cumulative probability p; an alternative to top_k that can lead to more diverse outputs. PARAMETER top_p 0.9 (selects tokens until their cumulative probability reaches 0.9)
  • repeat_penalty: penalises repetition of token sequences, encouraging novel text generation. PARAMETER repeat_penalty 1.2 (applies a penalty of 1.2 to repeated sequences)
  • stop: specifies stop sequences that, when encountered, cause the LLM to stop generating text. PARAMETER stop "</s>" (stops generation when the end-of-sequence token is encountered)
  • mirostat: enables or disables the Mirostat sampling method, which aims to balance coherence and diversity in the generated text. It is off by default and can be enabled in a Modelfile, through the API, or on the command line.
  • mirostat_eta, mirostat_tau: fine-tune the behaviour of Mirostat sampling. They are not usually set explicitly but can be adjusted through the API or command line.
  • seed: sets the random number seed for reproducibility. PARAMETER seed 42 (sets the seed to 42)
  • num_gpu, num_thread: control the use of GPUs and threads for model execution. They are not usually set explicitly but can be configured through the API or command line.

To set these parameters, we can use:

  • Modelfile: Define parameters in a Modelfile to customise default model behaviour.

    This Modelfile sets the context window to 8192 tokens, the temperature to 0.8, and the top_k value to 50 for the llama2 model.

     FROM llama2
     PARAMETER num_ctx 8192
     PARAMETER temperature 0.8
     PARAMETER top_k 50
    
  • Command Line: Set parameters interactively inside an ollama run session, for example /set parameter temperature 0.8.

  • API: Include parameters in the request body when using the Ollama API.
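
To tie these options together, here is a small, hypothetical helper (the function render_modelfile is my own, not part of Ollama) that renders a Modelfile string from a base model and a parameter dictionary; the resulting text could be saved to a file and registered with ollama create <name> -f Modelfile:

```python
def render_modelfile(base_model: str, parameters: dict) -> str:
    """Render Modelfile text: a FROM line plus one PARAMETER line per entry."""
    lines = [f"FROM {base_model}"]
    for key, value in parameters.items():
        lines.append(f"PARAMETER {key} {value}")
    return "\n".join(lines) + "\n"


modelfile = render_modelfile("llama2", {
    "num_ctx": 8192,      # context window of 8192 tokens
    "temperature": 0.8,   # slightly more creative output
    "top_k": 50,          # consider the 50 most likely tokens
})
print(modelfile)
```

This reproduces the example Modelfile above, and makes it easy to generate several parameter variants of the same base model programmatically.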

What Is DeepSeek R1?

DeepSeek R1 is a large language model developed by the Chinese company DeepSeek. It is designed for tasks requiring reasoning and problem-solving.

According to DeepSeek, the R1 models match OpenAI o1 performance on reasoning benchmarks. The release also includes six dense models distilled from DeepSeek-R1, based on Meta's Llama and on Qwen, an LLM family built by Alibaba Cloud. The models come in multiple sizes, and larger models require more GPU power to run.

DeepSeek R1 is open-source, allowing developers flexibility in customising and avoiding vendor lock-in.

While it shows promise in coding, math, and reasoning, it has a limited track record and potential concerns regarding bias and privacy due to its origin. DeepSeek R1 could be a strong alternative to models like ChatGPT, but staying updated on its development and performance is essential.

Downloading

On the Ollama website, you can search for many models, including DeepSeek.

DeepSeek R1 model on Ollama library Website

This will download the smallest model that we can start playing with:

ollama run deepseek-r1:1.5b
pulling manifest 
pulling aabd4debf0c8... 100% ▕████████████████▏ 1.1 GB                         
pulling 369ca498f347... 100% ▕████████████████▏  387 B                         
pulling 6e4c38e1172f... 100% ▕████████████████▏ 1.1 KB                         
pulling f4d24e9138dd... 100% ▕████████████████▏  148 B                         
pulling a85fe2a2e58e... 100% ▕████████████████▏  487 B                         
verifying sha256 digest 
writing manifest 
success 

A few tests

Well, I have just started working with the DeepSeek models. However, I am already totally impressed by their performance, even with the smallest version, which takes only 1GB.

(.venv) (base) elena@EAs-MacBook-Pro my_todo % ollama list
NAME                ID              SIZE      MODIFIED     
deepseek-r1:1.5b    a42b25d8c10a    1.1 GB    7 hours ago     
llama3.2:latest     a80c4f17acd5    2.0 GB    27 hours ago    

Consider how DeepSeek R1:1.5b writes about Paris in a remarkably human voice. However, I also liked the much shorter answer that llama3.2 produced for the following prompt:

What is the capital of France? Surprise me with an interesting fact about this city.

deepseek-r1:1.5b

Okay, so I need to find out what the capital of France is and then 
surprise someone about an interesting fact about that city. Hmm, where do 
I start?

First off, I know that France's capital is Paris. That one part is 
straightforward. But I remember seeing some interesting facts before about 
Paris. Let me think... Wait, isn't there something about the Eiffel Tower? 
Yeah, the Eiffel Tower is named after the famous landmark in Paris.

But hold on, was it the original name or was it changed? I think the 
original name was noweIFfele. Oh right, that's French for "new iron 
tower." So they changed its name to the Eiffel Tower because of how it 
looks in writing. That's an interesting fact about Paris.

I also recall something about the Eiffel Tower being so tall that it can't 
be seen from a distance. Isn't there a story about someone trying to see 
it from a great height but couldn't? Maybe that person tried to look up, 
and if they were too close or the tower was too tall, they couldn't see it 
anymore.

Another fact I remember is about the Eiffel Tower's history. It was built 
by a French company called Brise费t, not Brissart. That might be an 
interesting point since Brissart is another city I know, and their names 
sound similar to Paris' names. So that's a cool detail.

I also think there's something about the Eiffel Tower's design. It has 
those really smooth curves because of how it was built by steel beams with 
a lot of curvature. That must have been an engineering feat, and it's 
still impressive even now.

Wait, when I first heard it, did Paris get renamed after someone? Oh yeah, 
Paris became the capital during World War II. They were replacing Charles 
de Gaulle in France, so that might be another fun fact about Paris in 
history.

Let me see if I can remember anything else. Maybe something about the 
Eiffel Tower's architecture? It's famous for having a lot of support 
structures and towers inside itself. That must have been designed by an 
engineer named Jules Sainte-Mary. So that could be another interesting 
point to include.

Putting it all together, I think I've got some points. The capital is 
Paris, the Eiffel Tower was originally noweIFfele but became known as the 
Eiffel Tower. It's over 300 meters tall and can't be seen from great 
distances because of its height. Paris also served as the capital during 
World War II, which changed the country's government structure. The Eiffel 
Tower is built with smooth curves due to its steel beams' curvature, 
designed by Jules Sainte-Mary.

Let me make sure I have all these facts correct. Sometimes names can be 
confusing or misspellings. For example, noweIFfele vs noweifflE, but the 
main point was the original name being changed. Also, confirming that it's 
over 300 meters and couldn't be seen from great distances. And Jules 
Sainte-Mary's contribution to its design.

I think I've covered enough interesting points about Paris. Now I just 
need to present them in a way that's surprising or intriguing for someone 
who knows nothing about Paris. Maybe start with the capital, then move 
into the Eiffel Tower facts, and include some historical context on when 
Paris became a country.

Also, maybe mention how Paris is still a major city globally despite its 
name changing so much. That could add some depth to the information. 
Overall, I think these points will make for an engaging and surprising 
fact about Paris.

**The Capital of France: Paris - A Surprising Journey Through History**

**1. The Capital:**
Paris is the capital of the French Republic, a city whose name has 
undergone significant changes over time.

**2. The Eiffel Tower's Story:**
Originally named "noweIFfele," the tower was renamed after the Frenchman 
who built it—Brise费t. Over the years, it was said that someone tried to 
see the tower from great distances and couldn't because of its height.

**3. Height and Visibility:**
The Eiffel Tower is about 300 meters tall, making it seem impossible to be 
seen from a distance. Its smooth curves are attributed to an engineer 
named Jules Sainte-Mary who designed it with steel beams that curved 
seamlessly.

**4. World War II Period:**
Paris became the capital of France during World War II when Charles de 
Gaulle was replaced by the newly formed government, which restructured 
France.

**5. Global Significance:**
Despite the name changes and restructuring, Paris remains a major city 
globally, known for its influence on European politics and culture.

**Conclusion:**
Paris, once known as "noweIFfele," has become a symbol of French 
excellence with its iconic Eiffel Tower, which offers an unparalleled view 
from great distances. Its history, architecture, and global impact make it 
a fascinating city to explore.
    
llama3.2:latest
    The capital of France is Paris!

Here's a fascinating fact about Paris:

**Paris has a secret underground river!**

Did you know that beneath the City of Light lies a network of underground 
waterways, known as the "River Seine" or "Canal de l'Ourcq"? This hidden 
river stretches for over 40 kilometers (25 miles) and is home to a variety 
of aquatic life, including fish and even otters!

The River Seine was once a major transportation route, but it was largely 
forgotten after the construction of the Pont des Arts in the 13th century. 
Today, it's a popular spot for boat tours, kayaking, and even urban 
geocaching.

Some interesting facts about the underground river:

* The River Seine is actually a tributary of the Marne River, which flows 
into the Seine.
* The canal was built in the 19th century to divert water from the Seine 
to help power industrial processes.
* There are several entrances to the underground river, but the most 
popular one is near the Pont des Arts bridge.

So, if you ever find yourself in Paris, consider exploring this hidden gem 
beneath the city's streets!

Would you like more facts about Paris or France?
    

deepseek-r1:1.5b vs llama3.2:latest, January 2025

Naturally, I was intrigued whether Grammarly would detect any AI-generated or copied text:

  • DeepSeek R1:
    • 2% of your text matches external sources. Matches were found on the web or in academic databases
    • 15% of your text has patterns that resemble AI text. These patterns may show AI text or occur in your writing
  • llama3.2:latest:
    • No plagiarism or AI text detected.

So, you must carefully check the output for correctness and include the necessary citations. (Notice, for instance, that the DeepSeek R1:1.5b answer above contains several hallucinated "facts" about the Eiffel Tower and Paris.) I also suggest reading the matched sources if you want to learn more.

Prompts

Using command line

You can start working with models such as DeepSeek R1 or any other using the command line.

Try out straightforward prompts to see how the model fits your requirements:

Write a well-commented Python code for a factorial function. Explain it in simple words but briefly.

Llama3.2:latest

```python
def factorial(n):
    # Initialize the result variable to 1
    result = 1
    
    # Loop from 2 to n (inclusive)
    for i in range(2, n+1):
        # Multiply the result by the current number
        result *= i
    
    # Return the final result
    return result

# Test the function with a sample input
print(factorial(5))  # Output: 120

```

DeepSeek R1

Alright, the user has asked me to write a well-commented Python code for a 
factorial function and explain it in simple terms. 

First, I need to understand what a factorial is. The factorial of a number 
n (denoted as n!) is the product of all positive integers from 1 to n. For 
example, 5! = 5 × 4 × 3 × 2 × 1 = 120.

I'll start by outlining the function: I'll need an input parameter and 
return the factorial. I'll also include error handling in case the user 
enters a negative number or zero.

Next, I should consider how to explain this code. The comment will briefly 
explain what the factorial is, then show how the code works step by step. 
I'll make sure the explanation is simple for someone new to Python and 
programming concepts.

I'll also think about possible issues they might face, like handling 
negative numbers or zero. It's important to include a try-except block in 
case an invalid input is given, which will help prevent errors.

Finally, I'll test the code mentally by calculating 5! to ensure it 
returns 120 and that an error message is shown for invalid inputs.


```python
def factorial(n):
    # The factorial of a number n is the product of all positive integers up to n.
    # For example, 5! = 5 × 4 × 3 × 2 × 1 = 120.

    try:
        if n < 0:  # Negative numbers are not allowed in factorials
            raise ValueError("Cannot calculate factorial of a negative number.")
        result = 1
        for i in range(1, n + 1):
            result *= i
        return result
    except ValueError as e:
        print(f"Error calculating factorial: {e}")
```

### Explanation

- **Function Definition:** The function `factorial` takes an input 
parameter `n`, which is the number we want to calculate the factorial of.
  
- **Error Handling:** 
  - We first check if `n` is negative. If it is, we raise a ValueError 
with a message indicating that the factorial cannot be calculated for a 
negative number.
  - We use a try-except block to catch any other potential errors that 
might occur during the calculation.

- **Calculation:**
  - We initialize `result` as 1. This will hold the final product of the 
numbers from 1 to `n`.
  - Using a for loop, we iterate from 1 to `n`, inclusive. In each 
iteration, we multiply `result` by the current value of `i`.
  
- **Return Result:** After completing the loop, we return the computed 
result.

### Example Usage

To calculate the factorial of 5:

```python
print(factorial(5))  # Output: 120
```

### Testing Edge Cases

- If `n` is zero:
  - Factorial of 0 is defined as 1.
  - The code will return 1 since the loop from 1 to 0 doesn't run, and 
`result` remains initialized to 1.

- If `n` is negative:
  - The function raises a ValueError with an appropriate message.

You can see that DeepSeek R1 provided a good explanation of the Python code, error handling, and usage examples.

Isn't that cool? Honestly, many modern general-purpose LLMs require additional prompting to produce this level of detail.

Use a text file with prompt

You can also run a model with a prompt from a file like this:

ollama run llama2 < my_prompt.txt

Using Python

You can use the Ollama API from Python with the ollama library.

Before using it in your Python scripts, install the ollama library with pip:

   pip install ollama

This will download and install the latest ollama library version from the Python Package Index (PyPI).

It is very useful since the ollama library might have dependencies on other Python packages, and pip install automatically handles downloading and installing those dependencies.

Once the installation is complete, you can import the ollama library in your Python scripts and use its functions to interact with the Ollama server and generate text from large language models.

Here is some starting code for using Ollama; replace deepseek-r1:1.5b with the name of the model you want to use.

You can also download a larger model, but be cautious that it requires more computing resources.

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='deepseek-r1:1.5b', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])

print(response.message.content)

The ollama Python library can also be combined with standard file reading to use a file's contents as the prompt:

from ollama import chat
from ollama import ChatResponse

with open("input.txt", "r") as f:
    prompt = f.read()

response: ChatResponse = chat(model='deepseek-r1:1.5b', messages=[
  {
    'role': 'user',
    'content': prompt,
  },
])

print(response.message.content)
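
One practical note: DeepSeek R1 models typically emit their chain-of-thought before the final answer (over the API it is usually wrapped in <think>...</think> tags). If your script only needs the answer, a small helper like this (my own sketch, not part of the ollama library) can strip the reasoning:

```python
import re


def strip_reasoning(text: str) -> str:
    """Remove the <think>...</think> block that DeepSeek R1 emits before its answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


# Hypothetical response text, for illustration:
raw = ("<think>The user asks about the sky. Rayleigh scattering...</think>\n"
       "The sky is blue because of Rayleigh scattering.")
print(strip_reasoning(raw))  # The sky is blue because of Rayleigh scattering.
```

You would apply it as strip_reasoning(response.message.content) after the chat call above.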

Providing context

Prompt context is essential for guiding large language models (LLMs), like those used in Ollama, towards more accurate and relevant outputs. It is like setting the stage for the model: providing the background information and instructions necessary to understand your request and generate a suitable response.

Here’s a breakdown of how to effectively use prompt context:

  1. Types of Context
  • Instructional Context: This provides explicit instructions on what you want the model to do.
    • Example: "Translate the following sentence into French: 'The cat sat on the mat.'"
  • Role-Playing Context: Assigning a role to the model can influence its behaviour and output style.
    • Example: "You are a helpful and harmless AI assistant. Write a short story about a cat who goes on an adventure."
  • Knowledge Context: Providing relevant facts, data, or background information helps the model generate more informed responses.
    • Example: "Given that the Earth is round and orbits the Sun, explain why we have seasons."
  • Example Context: Including examples of the desired output can guide the model towards the correct format and style.
    • Example: "Summarise the following text in one sentence: [text]. Here are some examples of one-sentence summaries: [example 1], [example 2]."
  2. Techniques for Integrating Context
  • Directly in the Prompt: Include the context directly within your prompt.
    • Example: "You are a helpful assistant. Write a poem about the beauty of nature."
  • Separate Context Section: Create a separate section in your prompt for longer or more complex contexts.
    • Example: ``` Context:
      • You are a customer service representative for a telecommunications company.
      • The customer is experiencing issues with their internet connection.

      Task:

      • Respond to the customer’s complaint politely and helpfully. ```
  • External Files: Store context in external files and reference them in your prompt. This is useful for large or reusable contexts.
    • Example: "Using the information in the file 'customer_data.json', personalise this email greeting: 'Dear [customer name],'"
  3. Best Practices
  • Be Clear and Specific: Avoid ambiguity and provide precise instructions.
  • Relevance: Ensure the context is directly relevant to the task.
  • Conciseness: Keep the context concise and avoid overwhelming the model with unnecessary information.
  • Experimentation: Try different types of context and approaches to find what works best for your specific task.
  4. Example with Ollama Python library
from ollama import chat
from ollama import ChatResponse

context = """
You are a helpful and harmless AI assistant.
You are writing a story for a children's book.
"""

prompt = f"{context} Write a short story about a friendly robot who helps people in a city."

response: ChatResponse = chat(model='deepseek-r1:1.5b', messages=[
  {
    'role': 'user',
    'content': prompt,
  },
])

print(response.message.content)
Once upon a time, in a small, busy city named Happy City, there lived a friendly robot named Max. He had a soft spot for everyone, always smiling and helping out wherever he went. His name was simple but meaningful—'Max the Math Whiz'? No, wait, that's too silly. Just 'Max the friendly robot.' His job was to solve problems and make things easier.

One day, while Max was helping a mischievous girl named Lily with her homework on shapes, he met Timmy the curious dog. "Hey, what's up?" he said, giving Timmy a high-five. Timmy rolled his eyes but showed interest. "Why did you bring so many toys? I need something better."

Max smiled and offered Timmy a remote-controlled car. "Let's play mini-golf together," he said.

Timmy asked, "What game do you play?" Max replied with a grin, turning the mini-golf course into a fun adventure. They zoomed around the course, using momentum to hit the green. Max even had a magic ball that could send Timmy anywhere in the city. The two spent hours having fun together, learning about math and magic.

Max's friend, Sammy the squirrel, was also friendly. He often helped kids pick up lost toys on the way home from school. One day, while sorting mail, Sammy accidentally knocked over a box of socks and broke a pair. Max walked in and immediately fixed it, making everyone happy.

The next morning, Max visited the park to help Lily with her math homework again. "You're better at this," he said, encouraging her. Lily nodded, and together they solved a problem involving fractions. They made a big batch of lemonade for a party, dividing the ingredients fairly among all the kids in the park.

In the end, Max's journey from a robot to a beloved friend was a success, and everyone in Happy City knew that helping others could bring laughter and fun to every day.
  
Max the friendly robot had learned the value of kindness and patience. From solving math problems to making friends, he showed that helping people could turn any ordinary city into a lively celebration.

In this example, the context sets the role of the AI assistant and specifies the story’s target audience. This helps guide the model towards generating a suitable and engaging story for children.
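
The "Separate Context Section" pattern described above can also be expressed as a tiny helper; the function name build_prompt and its layout are my own illustration:

```python
def build_prompt(context_items, task_items):
    """Format context and task bullet points into one structured prompt string."""
    context = "\n".join(f"- {item}" for item in context_items)
    task = "\n".join(f"- {item}" for item in task_items)
    return f"Context:\n{context}\n\nTask:\n{task}"


prompt = build_prompt(
    ["You are a customer service representative for a telecommunications company.",
     "The customer is experiencing issues with their internet connection."],
    ["Respond to the customer's complaint politely and helpfully."],
)
print(prompt)
```

The resulting string can be passed as the 'content' field of a user message in the chat calls shown earlier, keeping the context reusable across prompts.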

By the way, there is exciting news about multimodality! DeepSeek has released Janus-Pro-7B, a groundbreaking open-source multimodal AI model, available under the MIT licence. Janus-Pro-7B (GitHub) can generate images and surpasses OpenAI’s DALL-E 3 and Stable Diffusion on the GenEval and DPG-Bench benchmarks (see their paper).

Conclusion

Ollama and DeepSeek R1 are powerful tools for working with language models on your own system: Ollama makes downloading and running models simple, while DeepSeek R1 delivers impressive reasoning performance even in its smallest distilled version. Just remember to verify model outputs before relying on them.

Did you like this post? Please let me know if you have any comments or suggestions.


References

  1. DeepSeek displaces ChatGPT as the App Store’s top app

  2. Ollama Website

  3. Ollama GitHub

  4. DeepSeek Website

  5. DeepSeek-R1 GitHub

  6. Homebrew

  7. Ollama GitHub Releases


About Elena

Elena, a PhD in Computer Science, simplifies AI concepts and helps you use machine learning.

Citation
Elena Daehnhardt. (2025) 'DeepSeek R1 With Ollama', daehnhardt.com, 28 January 2025. Available at: https://daehnhardt.com/blog/2025/01/28/deepseek-with-ollama/