📖 Tutorial

How to Harness Local AI on Ubuntu Without the Cloud Hassle

Last updated: 2026-05-01 | Difficulty: Intermediate

Introduction

Ubuntu is embracing AI in a way that keeps your data local and your workflow simple. Instead of forcing you into cloud subscriptions or complex setups, Canonical's new approach uses inference snaps: sandboxed, hardware-optimised packages that run open-weight models directly on your machine. This guide walks you through getting started with both implicit AI (like speech-to-text in the background) and explicit AI (like automated troubleshooting). By the end, you'll be able to ask your Ubuntu system to solve problems hands-free, all without sending a single prompt to the cloud.

Source: itsfoss.com

What You Need

  • Ubuntu 22.04 LTS or later (24.04 LTS recommended for best compatibility)
  • Snapd installed and updated (pre‑installed on most Ubuntu flavors)
  • A modern CPU with at least 4 cores (8+ GB RAM for larger models)
  • Optional: A dedicated GPU (NVIDIA or AMD) if you want faster inference
  • Internet connection only for downloading the snaps and initial model setup
  • Basic terminal familiarity (we'll keep commands simple)
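
The checklist above can be sanity-checked from the terminal. This is a minimal sketch assuming a standard Linux /proc layout and GNU coreutils; it only reports, it changes nothing:

```shell
# Quick sanity check of the hardware prerequisites
# (assumes a standard Linux /proc layout and GNU coreutils)
cores=$(nproc)
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
mem_gb=$((mem_kb / 1024 / 1024))
echo "CPU cores: $cores"
echo "RAM: ${mem_gb} GB"
[ "$cores" -ge 4 ] || echo "note: fewer than 4 cores -- expect slow inference"
[ "$mem_gb" -ge 8 ] || echo "note: under 8 GB RAM -- stick to small models"
if command -v snap >/dev/null 2>&1; then echo "snapd: present"; else echo "snapd: missing"; fi
```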

Step‑by‑Step Guide

Step 1: Update Snapd and Enable Snap Support

Open a terminal and ensure your snap environment is up‑to‑date. Run:

sudo snap install core
sudo snap refresh core

If snapd is not installed (rare), install it with:

sudo apt update
sudo apt install snapd

After installation, log out and back in so that /snap/bin is added to your PATH and the /snap directory is properly integrated.
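
To confirm the environment is ready before moving on, a short conditional check works well. This sketch just reports the snapd status and prints the install hint from above when snapd is missing:

```shell
# Verify the snap environment; fall back to a hint rather than failing
if command -v snap >/dev/null 2>&1; then
    snap version || true
    snapd_ok=yes
else
    echo "snapd missing -- install it with: sudo apt update && sudo apt install snapd"
    snapd_ok=no
fi
echo "snapd_ok=$snapd_ok"
```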

Step 2: Install the Ubuntu AI Inference Snap

Canonical provides a dedicated snap that bundles open‑weight models and inference engines. Install it with:

sudo snap install ubuntu-ai-inference

This snap includes Whisper.cpp for speech‑to‑text, text‑to‑speech capabilities, and a lightweight agent for explicit tasks. The snap automatically selects the best model for your hardware; if you have a GPU, it will use CUDA‑ or ROCm‑optimised builds.
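
You can verify the installation landed before continuing. The snap name `ubuntu-ai-inference` is taken from this guide; check the Snap Store listing if it differs on your release. A sketch:

```shell
# Confirm the snap is installed and show a summary of it
# (snap name 'ubuntu-ai-inference' is taken from this guide)
SNAP=ubuntu-ai-inference
if snap list "$SNAP" >/dev/null 2>&1; then
    installed=yes
    snap info "$SNAP" | head -5
else
    installed=no
    echo "$SNAP not found -- run: sudo snap install $SNAP"
fi
echo "installed=$installed"
```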

Step 3: Activate Implicit AI Features (Background Intelligence)

Implicit AI makes existing OS features smarter without extra user interaction. For example, enable local speech‑to‑text by running:

snap set ubuntu-ai-inference speech-to-text=true

Now, when you use any application that supports dictation (like the system text input), the transcription happens entirely on your machine. To see the model in action, run:

snap run ubuntu-ai-inference.demo-speech
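
Since snap configuration is readable as well as writable, you can flip the feature and read the setting back in one go. This sketch uses the `speech-to-text` key named above (hypothetical to this guide; `snap set` requires root):

```shell
# Enable speech-to-text and read the setting back with 'snap get'
# ('speech-to-text' is the config key named in this guide)
SNAP=ubuntu-ai-inference
if snap list "$SNAP" >/dev/null 2>&1; then
    sudo snap set "$SNAP" speech-to-text=true
    value=$(snap get "$SNAP" speech-to-text)
else
    value="(snap not installed)"
fi
echo "speech-to-text: $value"
```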

Step 4: Explore Explicit AI Features (Agentic Workflows)

Explicit AI gives you a command‑style interface. To troubleshoot a network issue, for instance, invoke the agent:

snap run ubuntu-ai-inference.agent --task "troubleshoot Wi-Fi connection"

The agent will analyse your systemd logs, check network settings, and output a diagnostic report along with proposed fixes. You can also ask it to set up a secure web application (e.g., a forge) by saying:

snap run ubuntu-ai-inference.agent --task "deploy a secure software forge with TLS"

All actions are confined by the snap sandbox—the model cannot read arbitrary files or send data outside your machine.
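
Because diagnostic reports are most useful when you can compare runs, it's worth saving each one with a timestamp. A sketch, assuming the `ubuntu-ai-inference.agent` command from this guide:

```shell
# Keep a timestamped copy of each diagnostic report
# ('ubuntu-ai-inference.agent' is the command named in this guide)
report="wifi-diagnosis-$(date +%Y%m%d-%H%M%S).txt"
if command -v ubuntu-ai-inference.agent >/dev/null 2>&1; then
    snap run ubuntu-ai-inference.agent --task "troubleshoot Wi-Fi connection" | tee "$report"
else
    echo "agent not installed; placeholder report" > "$report"
fi
echo "saved: $report"
```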

Step 5: Optimise for Your Hardware

To squeeze out maximum performance, check which hardware version the snap detected:

snap run ubuntu-ai-inference.info

If you have an NVIDIA GPU and the snap doesn't automatically use it, install the NVIDIA container toolkit and restart the snap service:

sudo snap set ubuntu-ai-inference cuda=true
sudo snap restart ubuntu-ai-inference

For AMD GPUs, set rocm=true instead. The snap uses the same confinement rules as other snaps, so models cannot escape their sandbox.
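
Choosing between `cuda=true` and `rocm=true` can be scripted with a simple heuristic. The config keys come from this guide; the detection commands below are a rough assumption and vary by driver setup:

```shell
# Heuristic: pick the acceleration flag based on detected hardware
# (cuda/rocm config keys come from this guide; detection is approximate)
if command -v nvidia-smi >/dev/null 2>&1; then
    backend=cuda
elif command -v rocminfo >/dev/null 2>&1 || [ -d /opt/rocm ]; then
    backend=rocm
else
    backend=cpu
fi
echo "detected backend: $backend"
# Apply it only when a GPU was found and the snap is present
if [ "$backend" != "cpu" ] && snap list ubuntu-ai-inference >/dev/null 2>&1; then
    sudo snap set ubuntu-ai-inference "$backend"=true
    sudo snap restart ubuntu-ai-inference
fi
```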

Step 6: Manage Models and Storage

By default, the snap downloads one or two small models (~500 MB each). You can list installed models with:

snap run ubuntu-ai-inference.models list

To add a larger model (e.g., an 8B parameter LLM for richer agent responses), use:

snap run ubuntu-ai-inference.models add --model gemma-2-8b-it

The snap supports models from Hugging Face and automatically converts them to the correct format. Remove old models to free space:

snap run ubuntu-ai-inference.models remove --model old-model-name
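
Before pulling a larger model, it helps to check free disk space. The ~6 GB figure below is a rough assumption for an 8B model in a quantised format, not a number from this guide:

```shell
# Check free disk space before adding a larger model
# (~6 GB is a rough assumption for a quantised 8B model)
need_gb=6
free_kb=$(df -Pk / | awk 'NR==2 {print $4}')
free_gb=$((free_kb / 1024 / 1024))
echo "free on /: ${free_gb} GB (want ~${need_gb} GB)"
if [ "$free_gb" -ge "$need_gb" ]; then
    echo "enough room: snap run ubuntu-ai-inference.models add --model <name>"
else
    echo "low on space: remove an unused model first"
fi
```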

Step 7: (Optional) Connect to Cloud Services

If you ever need extra horsepower or external data, you can enable cloud inference by setting an API endpoint. This is optional and off by default. To allow a specific cloud service, configure the snap:

snap set ubuntu-ai-inference cloud-api-key=your-key
snap set ubuntu-ai-inference cloud-url=https://api.example.com/inference

Be aware that once you enable cloud mode, your prompts are no longer local-only. Use this only when necessary.
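
If privacy matters to you, it's worth auditing this setting periodically. A sketch using the `cloud-url` key from this guide (`snap get` fails when a key was never set, which is what makes the check work):

```shell
# Audit whether cloud mode is active
# ('cloud-url' is the config key from this guide)
if snap get ubuntu-ai-inference cloud-url >/dev/null 2>&1; then
    cloud=on
    echo "cloud endpoint configured -- prompts may leave this machine"
else
    cloud=off
    echo "no cloud endpoint -- inference stays local"
fi
echo "cloud=$cloud"
```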

Tips for a Smooth Experience

  • Start small: Begin with the default models – they are fast and consume little RAM. Upgrade to larger models only if you need deeper reasoning.
  • Check your GPU drivers: NVIDIA users should have the proprietary driver (≥525) installed. AMD users need ROCm packages for acceleration.
  • Use voice commands: Pair the implicit speech‑to‑text with the explicit agent to run commands entirely by voice – great for hands‑free troubleshooting or server management.
  • Monitor logs: If the agent misbehaves, inspect logs via snap logs ubuntu-ai-inference.
  • Stay updated: Canonical pushes new models and features through snap refreshes. Run sudo snap refresh ubuntu-ai-inference regularly.
  • Sandbox security: Remember that even the local models are confined. The snap cannot access your home files unless you grant permission explicitly via snap connect.
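
The refresh and log tips above can be rolled into one small maintenance routine you run weekly. The snap name comes from this guide; every step is skipped if it isn't installed:

```shell
# A small weekly maintenance routine combining the tips above
# (snap name from this guide; all steps skipped if it is absent)
SNAP=ubuntu-ai-inference
if snap list "$SNAP" >/dev/null 2>&1; then
    sudo snap refresh "$SNAP"
    snap logs "$SNAP" -n 20
    maintained=yes
else
    echo "$SNAP not installed; nothing to maintain"
    maintained=no
fi
echo "maintained=$maintained"
```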

With this setup, you can enjoy Ubuntu’s growing AI capabilities – from smarter dictation to automated system management – without ever touching a cloud API. Everything runs locally, stays private, and gets out of your way.