DeepSeek R1
DeepSeek R1 is an open-source model, just released in Jan '25, that is competitive with OpenAI's o1 model.
- All the kids are talking about it because it potentially resets cost dynamics
- You can run it locally on a variety of hardware
- Like most, I was curious about the quality and whether it would be censored
Installation
The first step is to download Ollama, which is among the easiest ways to explore different models. As a Linux guy, I navigate to the download page and install as follows:
$ curl -fsSL https://ollama.com/install.sh | sh
in my terminal and then scold myself for blindly running a shell script directly off the internet.
- Mac/Win users should follow their instructions
Ok, just one more step.
Time to download the model. There is an inherent tradeoff between quality and processing demands across model sizes - e.g., larger models require more computation but yield better results. 32b and above is where this model shines, but unless you have a very powerful server/desktop, I recommend using the 8b version for exploration.
$ ollama run deepseek-r1:8b
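If your hardware can handle it, the larger variants are pulled the same way with a different tag (assuming the tag is published in the Ollama library), for example:
$ ollama run deepseek-r1:32b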
Grab a beverage, take a stretch break and wait until you see a prompt: >>>
Showtime. Feed it a prompt as you would with any other model. I offered up a softball question to start:
>>> What is the capital of Spain?
Observations
- Like asking my 10-year-old a question, the answer is often more long-winded than it needs to be. This is due to the model's Chain-of-Thought being included in the output. Be sure to check out the linked primer if you're curious to learn more.
- Generally, responses were solid, albeit with the caveats that apply to anything GenAI-related... don't assume it's right even though it presents itself as authoritative.
- Anything China-related is clearly sensitive. But US-based models aren't without bias either.
- Code-wise, the 8b model struggled with some relatively easy Rust questions. I suspect the 32b variants and above would fare better.
I shared this progress with my wife who responded with a "meh, it's not pretty, I'll stick with ChatGPT". If you want a nicer UI (that is blazing fast) stick around.
API
Did I mention that Ollama has a built-in API?
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:8b",
  "prompt": "What is the capital of Spain?",
  "stream": false
}'
- Check out the API docs
- The model and prompt parameters are hopefully obvious
- Set stream to false to receive the payload (output) consolidated, versus one token at a time
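For reference, a non-streaming call returns one JSON object. The generated text lives in the response field; the payload also carries metadata such as the model name and a done flag. An abridged sketch of roughly what comes back (from memory - check the API docs for the exact schema):
{
  "model": "deepseek-r1:8b",
  "response": "<think>...</think>The capital of Spain is Madrid.",
  "done": true
}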
Make it pretty
TL;DR Code
Just clone the linked repo above, follow the README instructions and have fun.
Let's build.
We're going to use Rust and HTMX to build a chat-like UI that interacts with the Ollama API and the DeepSeek R1 model.
- This is a weekend project, so it's not production ready
- I'm a developer, not a designer, so I'm using the term pretty liberally
Build
Setup
If you don't have Rust installed already, follow these instructions.
Why Rust? Performance and safety. If it compiles, it generally just works.
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Using Cargo, Rust's package manager, let's scaffold a basic solution.
$ cargo new chat-deepseek
$ cd chat-deepseek
Let's add basic dependencies, including Axum (web server) and Tokio (async runtime):
$ cargo add axum tokio -F tokio/full
Edit src/main.rs and replace the existing code with this code:
use axum::{response::Html, routing::get, Router};
#[tokio::main]
async fn main() {
let app = Router::new().route("/", get(handler));
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.unwrap();
println!("listening on {}", listener.local_addr().unwrap());
axum::serve(listener, app).await.unwrap();
}
async fn handler() -> Html<&'static str> {
Html("<h1>Hello, World!</h1>")
}
Now, back to the terminal to compile and make sure we have a basic Hello, World! rendering in the browser:
$ cargo run
Open a browser and navigate to http://localhost:3000. Hopefully you see Hello, World!
At this point, we've got a basic working install of Rust and Axum (web server), and we're ready to wire up the Ollama API.
Wire up the API
Let's define the user interaction.
- We need a form input to capture the user's prompt
- A button to submit it
- An ability to call the Ollama API
- See the formatted output from the API
- Reset the form to submit another prompt
First, back to the terminal to stop the server from running. Anytime you make code changes you have to stop, recompile and restart the server.
Nicer Rust developer ergonomics are available (auto reloading), but outside the scope of this tutorial.
Press CTRL+C to stop the server from running.
Another way to add Rust project dependencies is to edit the Cargo.toml file directly (in the root folder of the solution). Make yours match mine; edit and save.
[package]
name = "chat-deepseek"
version = "0.1.0"
edition = "2021"
[dependencies]
axum = "0.8.1"
reqwest = { version = "0.12.12", features =["json"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = "1.0.137"
tokio = { version = "1.43.0", features = ["full"] }
We've added a few more dependencies, notably reqwest for calling the API and serde for JSON processing.
We also need to edit src/main.rs. For simplicity, remove all the prior code and replace it with this updated code:
use axum::{response::Html, routing::get, Form, Router};
use serde::{Deserialize, Serialize};
use reqwest;
#[derive(Deserialize, Debug)]
#[allow(dead_code)]
struct Prompt {
prompt: String,
}
#[derive(Serialize, Debug)]
struct OllamaRequest {
model: String,
prompt: String,
stream: bool,
}
#[derive(Deserialize, Debug)]
struct OllamaResponse {
response: String,
}
#[tokio::main]
async fn main() {
let app = Router::new().route("/", get(display_prompt).post(display_result));
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.unwrap();
println!("listening on {}", listener.local_addr().unwrap());
axum::serve(listener, app).await.unwrap();
}
async fn display_prompt() -> Html<&'static str> {
Html(
r#"
<!doctype html>
<html>
<head>
<script src="https://unpkg.com/htmx.org@2.0.4" integrity="sha384-HGfztofotfshcF7+8n44JQL2oJmowVChPTg48S+jvZoztPfvwD79OC/LTtG6dMp+" crossorigin="anonymous"></script>
</head>
<body>
<form hx-post="/" hx-target="this" hx-swap="outerHTML">
<h1>Prompt</h1>
<div><textarea name="prompt" rows="4" cols="50"></textarea></div>
<button type="submit" class="btn primary">
Ask Deepseek R-1
</button>
</form>
</body>
</html>
"#,
)
}
async fn display_result(Form(input): Form<Prompt>) -> Html<String> {
let client = reqwest::Client::new();
let ollama_request = OllamaRequest {
model: "deepseek-r1:8b".to_string(),
prompt: input.prompt,
stream: false,
};
match client.post("http://localhost:11434/api/generate")
.json(&ollama_request)
.send()
.await {
Ok(response) => {
match response.json::<OllamaResponse>().await {
Ok(result) => Html(format!(
r#"
<div>
<h2>Result</h2>
<p>{}</p>
<button hx-get="/" hx-target="body" hx-swap="innerHTML">
Back to Prompt
</button>
</div>
"#,
result.response
)),
Err(_) => Html("<div>Failed to parse API response</div>".to_string())
}
},
Err(_) => Html("<div>Failed to call Ollama API</div>".to_string())
}
}
Finally, let's take it for a spin. Make sure all files are saved. Head back to your terminal and make sure it compiles as expected.
$ cargo run
Open a web browser to http://localhost:3000. Enter a question into the prompt, click the button and wait patiently for the results. Nice.
- Compilation errors? Double-check that you copied/pasted both the Cargo.toml and src/main.rs code properly and saved.
- Did you get an error message about the API? You need to have Ollama running in a separate terminal tab/window ($ ollama run deepseek-r1:8b for reference).
For the sake of focusing on the API interaction, I've implemented a simple form interaction by utilizing HTMX. A more robust implementation would use templates, log requests/errors and avoid hard-coding. Remember, this is a weekend project and simplicity is the goal.
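As one small taste of what "avoid hard-coding" could look like, here's a minimal sketch of reading the model name from an environment variable instead of a string literal. OLLAMA_MODEL is a name I made up for illustration (not something Ollama itself reads); this would replace the hard-coded model line inside display_result:
// Sketch: pull the model tag from a (hypothetical) OLLAMA_MODEL env var,
// falling back to the 8b tag used throughout this post.
let model = std::env::var("OLLAMA_MODEL").unwrap_or_else(|_| "deepseek-r1:8b".to_string());

let ollama_request = OllamaRequest {
    model,
    prompt: input.prompt,
    stream: false,
};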
Better User Interface
We're on the home stretch.
Let's apply enhancements to make this a little nicer to use:
- Tidy up the UI with TailwindCSS
- Differentiate the Chain-of-Thought (conveniently wrapped in <think>...</think> tags) from the answer provided in the API output
If you haven't done so already, stop the solution. Time to add one final dependency - regex - which is used to process the text.
$ cargo add regex
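Before dropping it into the handler, here's a tiny standalone sketch of the regex trick we're about to use (the sample string is made up; R1's real output is much longer):
use regex::Regex;

fn main() {
    // Made-up sample of model output: reasoning wrapped in <think> tags, then the answer.
    let raw = "<think>The user wants Spain's capital. That's Madrid.</think>The capital of Spain is Madrid.";

    // (?s) lets `.` match newlines, since the chain-of-thought usually spans several lines.
    let think_regex = Regex::new(r"(?s)<think>(.*?)</think>").unwrap();

    // Re-wrap the captured reasoning so it can be styled differently from the answer.
    let processed = think_regex.replace(raw, |caps: &regex::Captures| {
        format!("<think class=\"italic\">{}</think>", &caps[1])
    });

    println!("{}", processed);
}
Running it prints the same text with the reasoning re-wrapped in <think class="italic">, which TailwindCSS will render in italics.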
The final copy/paste of src/main.rs:
use axum::{response::Html, routing::get, Form, Router};
use regex::Regex;
use reqwest;
use serde::{Deserialize, Serialize};
#[derive(Deserialize, Debug)]
#[allow(dead_code)]
struct Prompt {
prompt: String,
}
#[derive(Serialize, Debug)]
struct OllamaRequest {
model: String,
prompt: String,
stream: bool,
}
#[derive(Deserialize, Debug)]
struct OllamaResponse {
response: String,
}
#[tokio::main]
async fn main() {
let app = Router::new().route("/", get(display_prompt).post(display_result));
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.unwrap();
println!("listening on {}", listener.local_addr().unwrap());
axum::serve(listener, app).await.unwrap();
}
async fn display_prompt() -> Html<&'static str> {
Html(
r#"
<!doctype html>
<html>
<head>
<script src="https://unpkg.com/@tailwindcss/browser@4"></script>
<script src="https://unpkg.com/htmx.org@2.0.4" integrity="sha384-HGfztofotfshcF7+8n44JQL2oJmowVChPTg48S+jvZoztPfvwD79OC/LTtG6dMp+" crossorigin="anonymous"></script>
</head>
<body class="bg-gray-300 m-8">
<form hx-post="/" hx-target="this" hx-swap="outerHTML">
<h1 class="mb-4 text-4xl font-extrabold leading-none tracking-tight text-gray-900 md:text-5xl lg:text-6xl dark:text-white">Prompt</h1>
<div><textarea class="block p-2.5 w-96 text-sm text-gray-900 bg-gray-50 rounded-lg border border-gray-300 focus:ring-blue-500 focus:border-blue-500 dark:bg-gray-700 dark:border-gray-600 dark:placeholder-gray-400 dark:text-white dark:focus:ring-blue-500 dark:focus:border-blue-500" name="prompt" rows="4" cols="50"></textarea></div>
<button type="submit" class="focus:outline-none text-white bg-red-700 hover:bg-red-800 focus:ring-4 focus:ring-red-300 font-medium rounded-lg text-sm px-5 py-2.5 me-2 mb-2 dark:bg-red-600 dark:hover:bg-red-700 dark:focus:ring-red-900 m-2">Ask Deepseek R-1</button>
</form>
</body>
</html>
"#,
)
}
async fn display_result(Form(input): Form<Prompt>) -> Html<String> {
let client = reqwest::Client::new();
let ollama_request = OllamaRequest {
model: "deepseek-r1:8b".to_string(),
prompt: input.prompt,
stream: false,
};
match client
.post("http://localhost:11434/api/generate")
.json(&ollama_request)
.send()
.await
{
Ok(response) => match response.json::<OllamaResponse>().await {
Ok(result) => {
let think_regex = Regex::new(r"(?s)<think>(.*?)</think>").unwrap();
let processed_text =
think_regex.replace(&result.response, |caps: &regex::Captures| {
format!("<think class=\"italic\">{}</think>", &caps[1])
});
Html(format!(
r#"
<div>
<h2 class="mb-4 text-2xl font-bold leading-none tracking-tight text-gray-900 md:text-2xl lg:text-3xl dark:text-white">Result</h2>
<p class="text-gray-500 bg-white rounded-lg whitespace-pre-line">{}</p>
<button hx-get="/" hx-target="body" hx-swap="innerHTML" class="focus:outline-none text-white bg-red-700 hover:bg-red-800 focus:ring-4 focus:ring-red-300 font-medium rounded-lg text-sm px-5 py-2.5 me-2 mb-2 dark:bg-red-600 dark:hover:bg-red-700 dark:focus:ring-red-900 m-2">Back to Prompt</button>
</div>
"#,
processed_text
))
}
Err(_) => Html("<div>Failed to parse API response</div>".to_string()),
},
Err(_) => Html("<div>Failed to call Ollama API</div>".to_string()),
}
}
- Included TailwindCSS (via CDN) with this line: <script src="https://unpkg.com/@tailwindcss/browser@4"></script>
- Applied TailwindCSS classes to the various HTML elements (body, p, button, textarea) to improve the rendering
- Used the regex library to find the <think> tag and style its contents as italicized text
Wrap up
That's all folks. We've built a simple Rust front-end that interacts with DeepSeek R1, hosted on Ollama, via its API.
Thanks
If you want to see more content of this nature, follow me on X.