Technology Exploration Jan '25

Let's kick the tires on DeepSeek-R1.

Overview

DeepSeek-R1

DeepSeek-R1 is an open-source model, just released in January 2025, that is competitive with OpenAI's o1 model.

Installation

The first step is to download Ollama, which is among the easiest ways to explore different models. As a Linux guy, I navigate to the download page and install as follows:

$ curl -fsSL https://ollama.com/install.sh | sh

in my terminal and then scold myself for blindly running a shell script directly off the internet.

Ok, just one more step.

Time to download the model. There is an inherent tradeoff between quality and processing demands: larger models require more computation but yield better results. The 32b and larger variants are where this model shines, but unless you have a very powerful server/desktop, I recommend the 8b version for exploration.

$ ollama run deepseek-r1:8b

Grab a beverage, take a stretch break and wait until you see a prompt: >>>.
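
If your machine can handle it, the same command with a different tag pulls a larger variant. Check the Ollama library page for the exact tags on offer; the one below is an assumption about what's listed:

$ ollama run deepseek-r1:32b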

Showtime. Feed it a prompt as you would with any other model. I offered up a softball question to start:

>>> What is the capital of Spain?

Observations

  1. Like asking my 10-year-old a question, the answer is often more long-winded than it needs to be. This is due to the model's Chain-of-Thought being included in the output. Be sure to check out the linked primer if you're curious to learn more.

  2. Generally, responses were solid, albeit with the caveats that apply to anything GenAI-related: don't assume it's right just because it presents itself as authoritative.

  3. Anything China-related is clearly sensitive. But US-based models aren't without bias either.

  4. Code-wise, the 8b model struggled with some relatively easy Rust questions. I suspect the 32b variants and above would fare better.

I shared this progress with my wife, who responded with a "meh, it's not pretty, I'll stick with ChatGPT". If you want a nicer UI (that is blazing fast), stick around.

API

Did I mention that Ollama has a built-in API?

$ curl http://localhost:11434/api/generate -d '{
      "model": "deepseek-r1:8b",
      "prompt": "What is the capital of Spain?",
      "stream": false }'

Make it pretty

TL;DR Code

Just clone the linked repo above, follow the README instructions and have fun.

Let's build.

We're going to use Rust and HTMX to build a chat-like UI that interacts with the Ollama API and the DeepSeek-R1 model.

Build

Setup

If you don't have Rust installed already, follow these instructions.

Why Rust? Performance and safety. If it compiles, it generally just works.

$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Using Cargo, Rust's package manager, let's scaffold a basic project.

$ cargo new chat-deepseek
$ cd chat-deepseek

Let's add the basic dependencies, including Axum (web server) and Tokio (async runtime):

$ cargo add axum tokio -F tokio/full

Edit src/main.rs and replace the existing code with the following:

use axum::{response::Html, routing::get, Router};

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(handler));

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .unwrap();
    println!("listening on {}", listener.local_addr().unwrap());
    axum::serve(listener, app).await.unwrap();
}

async fn handler() -> Html<&'static str> {
    Html("<h1>Hello, World!</h1>")
}

Now, back in the terminal, let's compile and make sure we have a basic Hello, World! rendering in the browser:

$ cargo run

Open a browser and navigate to http://localhost:3000. Hopefully you see Hello, World!
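
Or sanity-check it from the terminal; the handler returns the markup directly:

$ curl http://localhost:3000
<h1>Hello, World!</h1>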

At this point, we've got a working Rust install and a basic Axum web server, and we're ready to wire up the Ollama API.

Wire up the API

Let's define the user interaction.

  1. We need a form input to capture the user's prompt
  2. A button to submit it
  3. An ability to call the Ollama API
  4. See the formatted output from the API
  5. Reset the form to submit another prompt

First, head back to the terminal and stop the server. Anytime you make code changes, you have to stop, recompile, and restart the server.

Nicer Rust developer ergonomics are available (auto-reloading), but they're outside the scope of this tutorial.
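
For the curious, cargo-watch is one popular option for that auto-reloading (a hypothetical setup; nothing in this tutorial depends on it):

$ cargo install cargo-watch
$ cargo watch -x run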

Press CTRL+C to stop the server.

Another way to add Rust project dependencies is to edit the Cargo.toml file directly (in the root folder of the project).

Make yours match mine; edit and save.

[package]
name = "chat-deepseek"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = "0.8.1"
reqwest = { version = "0.12.12", features = ["json"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = "1.0.137"
tokio = { version = "1.43.0", features = ["full"] }

We've added a few more dependencies, notably reqwest for calling the API and serde for JSON processing.

We also need to edit src/main.rs. For simplicity, remove all the prior code and replace with this updated code:

use axum::{response::Html, routing::get, Form, Router};
use serde::{Deserialize, Serialize};
use reqwest;

#[derive(Deserialize, Debug)]
#[allow(dead_code)]
struct Prompt {
    prompt: String,
}

#[derive(Serialize, Debug)]
struct OllamaRequest {
    model: String,
    prompt: String,
    stream: bool,
}

#[derive(Deserialize, Debug)]
struct OllamaResponse {
    response: String,
}

#[tokio::main]
async fn main() {
    // GET renders the prompt form; POST handles the submitted prompt.
    let app = Router::new().route("/", get(display_prompt).post(display_result));

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .unwrap();
    println!("listening on {}", listener.local_addr().unwrap());
    axum::serve(listener, app).await.unwrap();
}

async fn display_prompt() -> Html<&'static str> {
    Html(
        r#"
        <!doctype html>
        <html>
            <head>
            <script src="https://unpkg.com/htmx.org@2.0.4" integrity="sha384-HGfztofotfshcF7+8n44JQL2oJmowVChPTg48S+jvZoztPfvwD79OC/LTtG6dMp+" crossorigin="anonymous"></script>
            </head>
            <body>
                <form hx-post="/" hx-target="this" hx-swap="outerHTML">
                    <h1>Prompt</h1>
                    <div><textarea name="prompt" rows="4" cols="50"></textarea></div>
                    <button type="submit" class="btn primary">
                    Ask Deepseek R-1
                    </button>
                </form>
            </body>
        </html>
        "#,
    )
}

async fn display_result(Form(input): Form<Prompt>) -> Html<String> {
    let client = reqwest::Client::new();
    
    let ollama_request = OllamaRequest {
        model: "deepseek-r1:8b".to_string(),
        prompt: input.prompt,
        stream: false,
    };

    // Forward the prompt to the local Ollama instance and wait for the full reply.
    match client.post("http://localhost:11434/api/generate")
        .json(&ollama_request)
        .send()
        .await {
            Ok(response) => {
                match response.json::<OllamaResponse>().await {
                    Ok(result) => Html(format!(
                        r#"
                        <div>
                            <h2>Result</h2>
                            <p>{}</p>
                            <button hx-get="/" hx-target="body" hx-swap="innerHTML">
                                Back to Prompt
                            </button>
                        </div>
                        "#, 
                        result.response
                    )),
                    Err(_) => Html("<div>Failed to parse API response</div>".to_string())
                }
            },
            Err(_) => Html("<div>Failed to call Ollama API</div>".to_string())
        }
}

Finally, let's take it for a spin. Make sure all files are saved. Head back to your terminal and make sure it compiles as expected.

$ cargo run

Open a web browser to http://localhost:3000. Enter a question into the prompt, click the button and wait patiently for the results. Nice.
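
You can also drive the endpoint straight from the terminal; --data-urlencode sends a form-encoded POST, which is exactly what the Axum Form extractor expects:

$ curl http://localhost:3000 --data-urlencode "prompt=What is the capital of Spain?"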

For the sake of focusing on the API interaction, I've implemented a simple form interaction using HTMX. A more robust implementation would use templates, log requests/errors, and avoid hard-coding. Remember, this is a weekend project and simplicity is the goal.

Better User Interface

We're on the home stretch.

Let's apply enhancements to make this a little nicer to use:

  1. Tidy up the UI with TailwindCSS
  2. Differentiate the Chain-of-Thought (conveniently wrapped in <think>...</think> tags, as in the sample below) from the answer provided in the API output
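
To make that second item concrete, the raw response field typically looks something like this (illustrative output, not verbatim):

<think>
The user is asking for the capital of Spain. Spain's capital is Madrid.
</think>

The capital of Spain is Madrid.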

If you haven't done so already, stop the server. Time to add one final dependency, regex, which we'll use to process the text.

$ cargo add regex
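
Before wiring it in, here's a minimal, throwaway sketch of the pattern we're about to use (try it in a scratch project if you're curious). The (?s) flag lets . match newlines, which matters because the model's reasoning usually spans several lines:

use regex::Regex;

fn main() {
    let re = Regex::new(r"(?s)<think>(.*?)</think>").unwrap();
    // Without (?s), `.` would stop at the embedded newline and nothing would match.
    let text = "<think>step one\nstep two</think>Madrid.";
    // `$1` re-inserts the captured reasoning inside the replacement markup.
    println!("{}", re.replace(text, "<em>$1</em>"));
}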

The final copy/paste of src/main.rs:

use axum::{response::Html, routing::get, Form, Router};
use regex::Regex;
use reqwest;
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Debug)]
#[allow(dead_code)]
struct Prompt {
    prompt: String,
}

#[derive(Serialize, Debug)]
struct OllamaRequest {
    model: String,
    prompt: String,
    stream: bool,
}

#[derive(Deserialize, Debug)]
struct OllamaResponse {
    response: String,
}

#[tokio::main]
async fn main() {
    // GET renders the prompt form; POST handles the submitted prompt.
    let app = Router::new().route("/", get(display_prompt).post(display_result));

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .unwrap();
    println!("listening on {}", listener.local_addr().unwrap());
    axum::serve(listener, app).await.unwrap();
}

async fn display_prompt() -> Html<&'static str> {
    Html(
        r#"
        <!doctype html>
        <html>
            <head>
            <script src="https://unpkg.com/@tailwindcss/browser@4"></script>
            <script src="https://unpkg.com/htmx.org@2.0.4" integrity="sha384-HGfztofotfshcF7+8n44JQL2oJmowVChPTg48S+jvZoztPfvwD79OC/LTtG6dMp+" crossorigin="anonymous"></script>
            </head>
            <body class="bg-gray-300 m-8">
                <form hx-post="/" hx-target="this" hx-swap="outerHTML">
                    <h1 class="mb-4 text-4xl font-extrabold leading-none tracking-tight text-gray-900 md:text-5xl lg:text-6xl dark:text-white">Prompt</h1>
                    <div><textarea class="block p-2.5 w-96 text-sm text-gray-900 bg-gray-50 rounded-lg border border-gray-300 focus:ring-blue-500 focus:border-blue-500 dark:bg-gray-700 dark:border-gray-600 dark:placeholder-gray-400 dark:text-white dark:focus:ring-blue-500 dark:focus:border-blue-500" name="prompt" rows="4" cols="50"></textarea></div>
                    <button type="submit" class="focus:outline-none text-white bg-red-700 hover:bg-red-800 focus:ring-4 focus:ring-red-300 font-medium rounded-lg text-sm px-5 py-2.5 me-2 mb-2 dark:bg-red-600 dark:hover:bg-red-700 dark:focus:ring-red-900 m-2">Ask Deepseek R-1</button>
                </form>
            </body>
        </html>
        "#,
    )
}

async fn display_result(Form(input): Form<Prompt>) -> Html<String> {
    let client = reqwest::Client::new();

    let ollama_request = OllamaRequest {
        model: "deepseek-r1:8b".to_string(),
        prompt: input.prompt,
        stream: false,
    };

    // Forward the prompt to the local Ollama instance and wait for the full reply.
    match client
        .post("http://localhost:11434/api/generate")
        .json(&ollama_request)
        .send()
        .await
    {
        Ok(response) => match response.json::<OllamaResponse>().await {
            Ok(result) => {
                // Capture everything between <think> tags; (?s) lets `.` span newlines.
                let think_regex = Regex::new(r"(?s)<think>(.*?)</think>").unwrap();

                // Italicize the Chain-of-Thought so it reads differently from the answer.
                let processed_text =
                    think_regex.replace(&result.response, |caps: &regex::Captures| {
                        format!("<think class=\"italic\">{}</think>", &caps[1])
                    });

                Html(format!(
                    r#"
                            <div>
                                <h2 class="mb-4 text-2xl font-bold leading-none tracking-tight text-gray-900 md:text-2xl lg:text-3xl dark:text-white">Result</h2>
                                <p class="text-gray-500 bg-white rounded-lg whitespace-pre-line">{}</p>
                                <button hx-get="/" hx-target="body" hx-swap="innerHTML" class="focus:outline-none text-white bg-red-700 hover:bg-red-800 focus:ring-4 focus:ring-red-300 font-medium rounded-lg text-sm px-5 py-2.5 me-2 mb-2 dark:bg-red-600 dark:hover:bg-red-700 dark:focus:ring-red-900 m-2">Back to Prompt</button>
                            </div>
                            "#,
                    processed_text
                ))
            }
            Err(_) => Html("<div>Failed to parse API response</div>".to_string()),
        },
        Err(_) => Html("<div>Failed to call Ollama API</div>".to_string()),
    }
}

Wrap up

That's all, folks. We've built a simple Rust front end that talks via API to DeepSeek-R1 hosted on Ollama.

Thanks

If you want to see more content of this nature, follow me on X.