Running Immich with AI-Powered Image Search on Raspberry Pi 5 + AXera NPU
TL;DR: Got Immich running with CLIP-based semantic search on a Raspberry Pi 5 using the AXera AX8850 NPU. Chinese language search works surprisingly well thanks to the ViT-L-14-336-CN model. Setup took about 30 minutes once I figured out the ML server configuration.
What is Immich?
Immich is an open-source, self-hosted photo and video management platform. Think Google Photos, but you control the data. It supports automatic backup, intelligent search, and cross-device access.
Why This Setup?
I wanted to test AI-accelerated image search on edge hardware. The AXera AX8850 NPU on our M5Stack development board provides hardware acceleration for the CLIP models, making semantic search actually usable on a Pi.
Hardware Setup
- Raspberry Pi 5
- M5Stack AX8850 AI Module (provides NPU acceleration)
- Standard Pi power supply and storage
Step-by-Step Deployment
1. Download the Pre-built Package
Grab the optimized Immich build from Hugging Face:
```bash
git clone https://huggingface.co/AXERA-TECH/immich
```
Note: You'll need `git lfs` installed. If you don't have it, install it first.
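On Debian-based systems such as Raspberry Pi OS, that typically means:
```bash
sudo apt-get install -y git-lfs
git lfs install   # registers the LFS hooks for your user
```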
What you get:
```text
m5stack@raspberrypi:~/rsp/immich $ ls -lh
total 421M
drwxrwxr-x 2 m5stack m5stack 4.0K Oct 10 09:12 asset
-rw-rw-r-- 1 m5stack m5stack 421M Oct 10 09:20 ax-immich-server-aarch64.tar.gz
-rw-rw-r-- 1 m5stack m5stack    0 Oct 10 09:12 config.json
-rw-rw-r-- 1 m5stack m5stack 7.6K Oct 10 09:12 docker-deploy.zip
-rw-rw-r-- 1 m5stack m5stack 104K Oct 10 09:12 immich_ml-1.129.0-py3-none-any.whl
-rw-rw-r-- 1 m5stack m5stack 9.4K Oct 10 09:12 README.md
-rw-rw-r-- 1 m5stack m5stack  177 Oct 10 09:12 requirements.txt
```
2. Load the Docker Image
```bash
cd immich
docker load -i ax-immich-server-aarch64.tar.gz
```
If Docker isn't installed, you'll need to set that up first (see below).
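The quickest route on Raspberry Pi OS is Docker's official convenience script (review scripts before piping them into a shell):
```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER   # log out and back in for the group change to apply
```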
3. Configure the Environment
```bash
unzip docker-deploy.zip
cp example.env .env
```
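It's worth skimming `.env` before starting the stack. Upstream Immich's example env file carries keys such as `UPLOAD_LOCATION` and `DB_PASSWORD`; the names in this package's file may differ, so treat these as illustrative:
```bash
# Illustrative keys from upstream Immich's example.env -- verify against
# the file shipped in docker-deploy.zip:
#   UPLOAD_LOCATION=./library   # where uploaded originals are stored
#   DB_PASSWORD=postgres        # change it if the Pi is reachable by others
nano .env
```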
4. Start the Core Services
```bash
docker compose -f docker-compose.yml -f docker-compose.override.yml up -d
```
Success looks like this:
```text
[+] Running 3/3
 ✔ Container immich_postgres  Started  1.0s
 ✔ Container immich_redis     Started  0.9s
 ✔ Container immich_server    Started  0.9s
```
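If anything looks off, the standard Docker commands tell you more:
```bash
docker compose ps                      # all three containers should be "running"
docker compose logs -f immich_server   # follow the server logs
```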
5. Set Up the ML Service (The Interesting Part)
The ML service handles the AI-powered image search. It runs separately, outside Docker, so it can leverage the NPU.
Create and activate a virtual environment:
```bash
python -m venv mich
source mich/bin/activate
```
Install dependencies:
```bash
pip install https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc2/axengine-0.1.3-py3-none-any.whl
pip install -r requirements.txt
pip install immich_ml-1.129.0-py3-none-any.whl
```
Launch the ML server:
```bash
python -m immich_ml
```
You should see:
```text
[10/10/25 09:50:12] INFO     Listening at: http://[::]:3003 (8698)
[INFO] Available providers: ['AXCLRTExecutionProvider']
[10/10/25 09:50:16] INFO     Application startup complete.
```
The `AXCLRTExecutionProvider` entry confirms the NPU is being used.
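From another shell, you can sanity-check the service; recent immich_ml builds expose a simple health endpoint (treat the exact route as an assumption for this build):
```bash
# Should print "pong" if the ML server is reachable.
curl http://localhost:3003/ping
```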
Web Interface Configuration
Initial Setup
- Navigate to `http://<your-pi-ip>:2283`, Immich's default web port (e.g., `192.168.20.27:2283`); port 3003 is the ML service, not the web UI
- The first visit requires creating an admin account; credentials are stored locally
<img src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/linux/ax8850_card/images/immich1.png" width="95%" />
Configure the ML Server
This is critical - the web interface needs to know where your ML service is running.
- Go to Settings → Machine Learning
- Set the URL to your Pi's IP and port 3003: `http://192.168.20.27:3003`
- Choose your CLIP model based on language:
  - Chinese search: `ViT-L-14-336-CN__axera`
  - English search: `ViT-L-14-336__axera`
<img src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/linux/ax8850_card/images/immich4.png" width="95%" />
First-Time Index
Important: You need to manually trigger the initial indexing.
- Go to Administration → Jobs
- Find "SMART SEARCH"
- Click "Run Job" to process your uploaded images
<img src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/linux/ax8850_card/images/immich6.png" width="95%" />
Testing Image Search
Upload some photos, wait for indexing to complete, then try semantic searches:
<img src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/linux/ax8850_card/images/immich7.png" width="95%" />
The search is genuinely semantic: you can search for "sunset" or "dogs playing" and it will find relevant images even if those exact words appear nowhere in the filenames.
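The same search is scriptable. A sketch assuming the `POST /api/search/smart` route (present in recent Immich releases) and an API key:
```bash
# Returns JSON with the assets that best match the text query.
curl -X POST "http://192.168.20.27:2283/api/search/smart" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "sunset over water"}'
```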
Technical Notes
- The NPU acceleration makes CLIP inference fast enough for interactive search
- Chinese language support is genuinely good with the CN model
- The ML server runs independently, so you can restart it without affecting the main Immich service (see the systemd sketch below)
- Docker handles PostgreSQL and Redis automatically
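To keep the ML server alive across reboots, a minimal systemd unit works. This sketch assumes the paths from this walkthrough (`~/rsp/immich`, the `mich` venv, user `m5stack`); adjust them to your layout:
```bash
sudo tee /etc/systemd/system/immich-ml.service > /dev/null <<'EOF'
[Unit]
Description=Immich ML server (AXera NPU)
After=network.target

[Service]
User=m5stack
WorkingDirectory=/home/m5stack/rsp/immich
ExecStart=/home/m5stack/rsp/immich/mich/bin/python -m immich_ml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl enable --now immich-ml
```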
Why M5Stack in This Stack?
The AX8850 NPU module provides the hardware acceleration that makes this practical on a Pi. Without it, running CLIP inference would be too slow for interactive use. We're working on more edge AI applications that leverage this acceleration - this Immich setup is a good real-world test case.
Questions about the setup or the NPU integration? Happy to dig into specifics.