Introduction: The Why and What of Self-Hosting Ollama
Thinking about tapping into AI power locally without the cloud overhead? You’re part of a growing trend! Self-hosting Ollama not only cuts down on cloud costs but also enhances privacy and control. This guide will walk you through setting up Ollama for local and remote access via Tailscale, enabling you to leverage AI from anywhere, securely and conveniently.
Use Case Ideas for Self-Hosted Ollama
Self-hosting Ollama opens up a plethora of possibilities. Here are some practical use cases to consider:
Development Side Projects
Utilize the Ollama API for your development projects to avoid the expense of cloud services. This setup is perfect for testing new ideas or building personal projects without the fear of racking up a hefty bill.
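For example, once Ollama is running (setup is covered below), a side project can hit the API’s /api/generate endpoint directly. A minimal sketch, assuming the llama2 model is installed and Ollama is listening on its default port:
curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "Write a haiku about local AI", "stream": false}'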
Private LLM Chats
Create a private, AI-powered chat environment that you can access from any device. This ensures your conversations are not only smart but also secure and confidential.
Continue in VSCode
Integrate Ollama with the Continue extension in VSCode for real-time code suggestions. By using your self-hosted model, you sidestep the costs and security concerns associated with external large language models, making your coding sessions both efficient and secure.
Getting Ollama Up with Multiple Models
Ollama is an open-source platform that enables you to run and manage large language models (LLMs) locally on your hardware. It’s designed for those who prefer to keep their AI activities private and cost-effective.
To get started with Ollama, you’ll need to set it up on your machine and then install the models you want to use. Check out the Downloads page to install it, then browse the Models page for information on all the available models.
As an example, to install the llama2 model, you would use the following command:
ollama pull llama2
Once a model is pulled, you can test it out in the console with ollama run llama2 (exit with Ctrl + d once you are done talking to it).
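To see which models you have pulled so far, list them with:
ollama list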
Exposing Ollama on Your Local Network
To allow Ollama to be accessible on your local network, you need to change its default host setting from localhost to 0.0.0.0. This tells Ollama to listen on all network interfaces, making it reachable from other devices on your network.
Set the OLLAMA_HOST environment variable to 0.0.0.0 with the following command in a command prompt run as Administrator:
setx OLLAMA_HOST "0.0.0.0"
This sets the environment variable permanently. Once complete, either restart Ollama or reboot.
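To double-check that it took effect, open a new command prompt (setx only applies to newly started sessions) and run:
echo %OLLAMA_HOST%
It should print 0.0.0.0.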
Testing Ollama on Your Local Network
Once you’ve configured Ollama to be accessible over the local network, it’s time to test it. Use the curl command to send a request to Ollama’s server and check if you get the expected response.
Replace 192.168.0.2 with the IP address of the machine where Ollama is running on your local network:
curl http://192.168.0.2:11434
If everything is set up correctly, you should see a response saying “Ollama is running”. This confirms that Ollama is accessible from other devices on your network. If local access is all you need, you can stop here and keep using Ollama via this IP address (or the PC’s hostname) on your local network.
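Beyond the health check, you can confirm that inference itself works over the network. A minimal sketch, assuming you pulled llama2 earlier and your Ollama machine is at 192.168.0.2:
curl http://192.168.0.2:11434/api/generate -d '{"model": "llama2", "prompt": "Why is the sky blue?", "stream": false}'
A JSON reply containing a “response” field means the API is fully usable from other devices on your LAN.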
Tailscale Serve: Remote Access with a Cherry on Top
So, you’ve got Ollama cozying up on your local network, but what about when you’re out and about? Enter Tailscale, a sleek VPN service that hooks up your devices no matter where they roam. It’s like a secure, invisible bridge for your digital traffic.
First, ensure Tailscale is installed and set up on the same machine as Ollama. You’ll also want it on any devices from which you plan to access Ollama remotely. Get the lowdown on setting up Tailscale at Tailscale quickstart. Once installed, see Enabling HTTPS in Tailscale and MagicDNS for how to enable both HTTPS and MagicDNS on your Tailnet.
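A quick way to confirm Tailscale is up on the Ollama machine is to check its status, which also shows the machine’s Tailscale name and IP:
tailscale status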
Now’s a good time to think about your Tailnet name because once we dive further, it’s going to be out in the public domain. Fancy a change? Check out how at Changing Your Tailnet Name. Who says network names can’t be fun, right?
On your Ollama-hosting device, pop open a terminal and run tailscale cert to get a shiny new certificate. Run on its own, this command reveals your machine’s name in Tailscale’s world and will tell you to run something similar to tailscale cert hostname.tailnet-name.ts.net. You’ll know you’re golden when you see “Wrote public cert” and “Wrote private key” on your screen. From here on, every time you see ‘hostname.tailnet-name.ts.net’, replace it with your machine’s unique address.
Time to serve up some Ollama over HTTPS! Execute:
tailscale serve --https 11434 localhost:11434
This command starts a temporary HTTPS server for Ollama, now reachable at https://hostname.tailnet-name.ts.net:11434. Test this out from any device (outside your local network, but on Tailscale) with the good old curl:
curl https://hostname.tailnet-name.ts.net:11434
If you get a cheerful “Ollama is running” or similar, you’ve nailed remote access. Congrats, your Ollama is now a globe-trotter, securely accessible from anywhere on your Tailnet!
If you are happy with how it’s set up, you can cancel the tailscale serve command and replace it with tailscale serve --bg --https 11434 localhost:11434 to run serve in the background without having to keep the command prompt open. You can always revert this with tailscale serve --https=11434 off. Check out the serve documentation in case you want to use it to expose anything else you run.
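At any point you can review what is currently being served with:
tailscale serve status
This should list the proxy from https://hostname.tailnet-name.ts.net:11434 to your local Ollama instance, which is handy for confirming the background serve is still active.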
Ollama Anywhere, Anytime
Congrats on making it this far! You’ve turned Ollama into a globe-trotting, network-hopping, AI powerhouse. Now, you can connect to your Ollama instance from pretty much anywhere using https://hostname.tailnet-name.ts.net:11434.
Whether you’re using ChatBox, another app, or a custom platform, as long as you’re on your Tailnet, you’re good to go. This setup gives you the freedom to interact with your self-hosted Ollama from any device, be it your phone while lounging at the beach or your laptop in a café halfway across the world.
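As a final sanity check from a remote device on your Tailnet, you can hit the chat endpoint directly over HTTPS. A minimal sketch, again assuming the llama2 model (if your Ollama version predates /api/chat, the /api/generate call from earlier works just as well):
curl https://hostname.tailnet-name.ts.net:11434/api/chat -d '{"model": "llama2", "messages": [{"role": "user", "content": "Hello from the road!"}], "stream": false}'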