Building a cyber oracle using a local LLM on a Raspberry Pi 5

For a few years now, I have been a regular attendee of the Chaos Communication Congress, whose edition this year is called #38C3. But the congress is more fun when you bring your own project instead of only attending talks.

The idea: Let’s build a cyber oracle: A small machine where visitors can collect a prophecy in the style of a cyberpunk fortune cookie. In 2024, we should be able to run an LLM (a large-ish language model) on a small computer and let it do its thing, right? As it turns out: Yes, pretty much.

Who worked on the project: Mainly Pablo (hardware) and myself (software). We received help from Kim (3D-printed case) and Markus (last-minute debugging and bug fixing).

Brainstorming possible directions

Thanks to Leonardo.ai for the concept art, but we needed to build something more pragmatic, and in about a week’s time. So instead of an ambitious cyberpunk robot, we took a simpler path.

The hardware

Pablo took ownership of researching, purchasing and assembling all hardware components, as well as figuring out how to drive the cheap printer from a Python script (a rough sketch of that part follows after the parts list). Parts needed:

  • Raspberry Pi 5
  • Raspberry Pi Touch Display 2
  • Thermal printer (Amazon link) along with some paper (Amazon link)
  • A fitting power supply
  • A custom case, aptly sized for the printer and display. Luckily, Kim volunteered to design and 3d-print the enclosure. He has also published the case on Makerworld.
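
As for driving the printer from Python: the real code is in the repo linked at the end of this post. To give an idea, a minimal sketch using the python-escpos library could look like the following; the USB vendor/product IDs are placeholders for whatever `lsusb` reports for your printer, not the device we actually used.

```python
# Minimal sketch: print a fortune on a generic ESC/POS thermal printer
# via python-escpos. The USB vendor/product IDs below are placeholders;
# look up your printer's IDs with `lsusb` on the Pi.
from escpos.printer import Usb

def print_fortune(text: str) -> None:
    printer = Usb(0x0416, 0x5011)  # placeholder IDs, not our actual printer
    printer.text("CYBER ORAKEL\n\n")
    printer.text(text + "\n\n")
    printer.cut()

if __name__ == "__main__":
    print_fortune("Alles verschlüsseln. Saal 1.")
```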

Building a cyber-ish UI for the touch screen

While Pablo and Kim handled the hardware and the case, I took care of building the software. First, the UI: it had to work on a touch screen and let visitors pick one out of nine “cyber zodiacs”, which we imagined based on common tropes and terms in the German hacker community.

The tech stack is plain: A local Python web server using FastAPI and a frontend built using some CSS and some JavaScript (no frameworks were harmed in the making of the project).

Generating the prophecies (i.e. the fortune messages) kept the backend busy for about 30 seconds (more on that step in the next section).

Keeping the user waiting for 30 seconds felt broken, so we added a second step in the UI simply to bridge the waiting period: While the machine is already doing its thing in the background, the user can select additional terms to leave as “entropy” for the person using the cyber oracle after them. Born out of necessity, this “feature” actually made a lot of sense, because it connected each visitor of our machine to the person who came before and the one who came after.
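
Under the hood, that bridge can be as simple as a FastAPI background task plus a polling endpoint. Here is a minimal sketch of the pattern; the endpoint names and the generate_prophecy placeholder are mine for illustration, not the actual project code.

```python
# Sketch of the two-step flow: kick off generation in the background,
# let the frontend show the "entropy" step, and poll until the text is ready.
# Endpoint names and the generate_prophecy placeholder are illustrative only.
import uuid
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
jobs: dict[str, str | None] = {}  # job id -> finished prophecy (None while pending)

def generate_prophecy(job_id: str, zodiac: str) -> None:
    # stand-in for the ~30 second LLM call described in the next section
    jobs[job_id] = f"Prophecy for {zodiac}"

@app.post("/start")
def start(zodiac: str, background_tasks: BackgroundTasks) -> dict:
    job_id = uuid.uuid4().hex
    jobs[job_id] = None
    background_tasks.add_task(generate_prophecy, job_id, zodiac)
    return {"job_id": job_id}

@app.get("/result/{job_id}")
def result(job_id: str) -> dict:
    text = jobs.get(job_id)
    return {"ready": text is not None, "prophecy": text}
```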

Generating prophecies

The core of the cyber oracle was a small LLM (that is, a small large language model, so maybe a… language model?). I did some research and picked Gemma 2 in the 2B variant. It runs super fast on my MacBook, produces decent results, and runs reasonably fast on the Raspberry Pi 5.

I was impressed by that. We’ve come a long way in the last two years!

Pablo and I hand-crafted nine prompts for the different zodiacs the user could pick. The results were super fun. To increase variation, we enriched the prompt with the “entropy words” picked by the previous user, and in fact, across 900+ generated prophecies, we saw a lot of different versions and I never noticed an exact duplicate.

One additional input was a sentiment (optimistic, neutral, dire, and so on) to vary the tone across the prophecies.

Here’s a typical prompt (the full prompt generation logic is on Github).

You are a fortune teller in a cyberpunk story. 
Write a fortune cookie message for the cyber 
zodiac "Cryptogeek" with a sentiment of "neutral". 
The message should be exactly 2 lines long.
Write in German. Do not explain your answer. 
Be short and concise. Add no special characters.

The following terms and phrases are typical for 
the cyber zodiac Cryptogeek. Use them as inspiration
for the message but don't just copy them verbatim:
- Public Key
- Private Key
- Alles verschlüsseln
- Blockchain
- Keysigning Party
- GPG Key
- https everywhere
- Private Daten schützen, öffentliche Daten nützen
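
The full prompt-building logic lives in the Github repo. Below is only a rough reconstruction of the idea, and it assumes a local Ollama server with the gemma2:2b model pulled; the runtime the project actually uses may differ.

```python
# Rough reconstruction of the prompt shown above, plus a call to a local model.
# Assumes an Ollama server on localhost with gemma2:2b pulled; the actual
# project may use a different runtime (see the Github repo).
import requests

def build_prompt(zodiac: str, sentiment: str, terms: list[str]) -> str:
    term_lines = "\n".join(f"- {t}" for t in terms)
    return (
        "You are a fortune teller in a cyberpunk story. "
        f'Write a fortune cookie message for the cyber zodiac "{zodiac}" '
        f'with a sentiment of "{sentiment}". '
        "The message should be exactly 2 lines long. "
        "Write in German. Do not explain your answer. "
        "Be short and concise. Add no special characters.\n\n"
        "The following terms and phrases are typical for the cyber zodiac "
        f"{zodiac}. Use them as inspiration for the message but don't just "
        f"copy them verbatim:\n{term_lines}"
    )

def generate(prompt: str) -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "gemma2:2b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    return resp.json()["response"]

print(generate(build_prompt("Cryptogeek", "neutral", ["Public Key", "Blockchain"])))
```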

One artifact of using a small model like Gemma 2 2B in a non-English language (in our case German): some phrases came out a little off and stilted. Ironically, this went really well with the idea of a text from a fortune cookie, so we didn’t even bother improving this part.

This is what the finished project looked like

And here’s what it looked like in the end.

Some prophecies

See the images for some examples from the congress. You can also find all of them on Mastodon, because the script posted each prophecy as it was printed.
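
Auto-posting each fortune boils down to one authenticated call to the Mastodon statuses endpoint. A sketch of that step; the instance URL and token are placeholders, not the actual setup.

```python
# Sketch of auto-posting a printed prophecy to Mastodon.
# Instance URL and access token are placeholders.
import requests

def post_to_mastodon(text: str) -> None:
    resp = requests.post(
        "https://example.social/api/v1/statuses",              # placeholder instance
        headers={"Authorization": "Bearer YOUR_ACCESS_TOKEN"},  # placeholder token
        data={"status": text, "visibility": "unlisted"},
    )
    resp.raise_for_status()
```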

Hackers loved it

The little machine attracted a lot of attention and we had close to a thousand prophecies printed over the course of 4 days. The hacker community is naturally very sceptical about “AI”. But local inference (“look mum, no cloud!”) and completely pointless output just for the fun of it seemed to warm people’s hearts to the curious little box.

Below are some stats about the usage and the most popular zodiacs.

  • Total printed: 995 fortunes
  • Most popular zodiac: “Einhorn” (Unicorn), 203 fortunes
  • Least popular zodiac: “Cryptogeek”, 41 fortunes
  • Shortest fortune (30 characters): “Alles verschlüsseln. Saal 1.” (roughly: “Encrypt everything. Hall 1.”)
  • Longest fortune (268 characters): “Mit deinem Einhorn-Glitzer wird deine magische kreative Energie in Freifunk-Chaos zu einem wundervollen Regenbogen transformiert. Liebe und Freundschaft verbinden dich mit deiner digitalen Gemeinschaft, während du deine Kreativität mit 5G-Technologie verwirklichst.” (roughly: “With your unicorn glitter, your magical creative energy is transformed into a wonderful rainbow amid Freifunk chaos. Love and friendship connect you with your digital community while you realize your creativity with 5G technology.”)
Some statistics on the output of the cyber oracle

Build your own oracle using our source code

This was a fun and not-so-serious little AI project completed in about 3 or 4 evenings of work. The code is messy, but that’s in the spirit of projects like these.

When (if) we bring the oracle to the event next year, I hope to clean up the front-end code and add some more gimmicks in the prompts. Find the project on Github and let us know if you use it to build a similar little machine!

New publication: A paper about situated interaction published at NeurIPS in 2024

I have added a new entry to the “publications” section of my About page for a paper that I’ve been a co-author of.

This contribution dates back to my earlier work at Twenty Billion Neurons in Berlin, the AI company that has since been acquired by Qualcomm.

In that role, I worked on the Python source code of the real-time inference stack that is still in use at Qualcomm, and I was involved in early work that went into the dataset which is now being made public as part of the paper.

I am happy about this publication in particular because it was presented at NeurIPS, the prestigious machine learning and AI conference.

The paper is titled “What to Say and When to Say it: Live Fitness Coaching as a Testbed for Situated Interaction”.

It introduces a new dataset of exercise videos and presents a reference model to perform video stream analysis and provide corrective feedback to the user. The work presents one possible method to combine language models with real-time video processing, something that we will for sure see more of in the next couple of years.

Full citation: Panchal, S., Bhattacharyya, A., Berger, G., Mercier, A., Bohm, C., Dietrichkeit, F., … & Memisevic, R. (2024). What to Say and When to Say it: Live Fitness Coaching as a Testbed for Situated Interaction.

You can read the PDF in full online: https://arxiv.org/pdf/2407.08101

Building a tiny sports guessing game using Go

A few things have coincided this summer: I had some time on my hands, I wanted to play around with the programming language Go, and Euro 2024 has been happening in Germany (I’m not a big football fan, but I do enjoy following the big tournaments every 2 years).

Long story short, I’ve worked through the (excellent!) book “Let’s Go” and have built a very rough web app just in time for the kickoff match in June.

The project is called “Go Tipp” (after the German word “Tipp” for a bet or guess); the source is on Github (linked at the end of this post).

16 of my friends have joined the game since, and we have been actively guessing and (more often) mis-guessing the outcomes of the Euro 2024 matches.

The features I’ve built

Here’s a quick rundown of the main project features:

  • Signup using invite code that puts you into a private group
  • Enter guesses for the outcomes of all matches
  • Once a match is finished, points are given for all correct guesses
  • Custom scoring rules per phase of the event: during the knockout phase, correct guesses yield more points than during the group phase
  • Leaderboard to show the ranking of best guessers in the group
  • Profile page for each user that shows their match history compared against your own
  • Live updates of match scores while a match is running, using the free API from https://openligadb.de/ (see the sketch below)
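
To give an idea of that last point: OpenLigaDB exposes a free JSON API. The Python sketch below illustrates the kind of request involved (the project itself does this in Go); the league shortcut and field names are my assumptions, so check https://api.openligadb.de for the exact schema.

```python
# Illustrative Python sketch of polling OpenLigaDB for current match data.
# The Go project does the equivalent; the league shortcut and JSON field
# names here are assumptions, check the OpenLigaDB docs for the exact schema.
import requests

def fetch_matches(league: str = "em", season: int = 2024) -> list[dict]:
    url = f"https://api.openligadb.de/getmatchdata/{league}/{season}"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()

for match in fetch_matches():
    print(match.get("team1", {}).get("teamName"),
          "vs",
          match.get("team2", {}).get("teamName"),
          "finished:", match.get("matchIsFinished"))
```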

A simple tech stack

The tech stack is simple:

  • Backend: Go
  • Database: MySQL
  • Frontend: No JS framework and mostly custom CSS (with some base styling using Pure)
  • Hosting: Uberspace, simple deployment via ssh and supervisord

A rewarding side project

So, I “learned” Go, but really more as a side effect of building a game that 17 people have been using actively almost daily for about 6 weeks.

It feels good when people use the thing you’ve built.

MVP gone right

I took a very deliberate approach when deciding which features to build first and which to leave out. I think it’s the first time I went ultra “MVP” with a project of mine and it worked out really well.

For an embarrassingly long time, the site didn’t work properly on mobile, yet it was still used daily.

Even now, you can’t change your password, there’s no admin interface for me to edit match data (I do it in the database by hand) and there’s no “proper” frontend build pipeline.

My co-developer: AI

For this project, I’ve made heavy use of Copilot, GPT-4 and Claude. It’s like directing a skilled developer who can code at 10x my own speed.

Find the code online

It’s a plain monolith, yet the code base is structured enough that making the code public doesn’t feel too embarrassing.

It’s not meant to be used out of the box, because many things have been hardcoded for Euro 2024. Reusing it for another tournament should be possible with some adjustments.

In any case, for someone’s education or entertainment, you can find the repository online: https://github.com/floriandotpy/go-tipp

Learning something new: React Native

Over the end-of-year slowdown and the holidays, I’ve started to learn something new: React Native (and TypeScript along with it). It’s refreshing to approach a technology I haven’t actively used with a beginner’s mindset. Plus, it’s fun to build stuff.

A new tech stack for me: React Native and TypeScript

React Native is a framework to build mobile apps for both iOS and Android using the same codebase (which is either JavaScript or TypeScript).

You can do much more with React Native, but this is what it’s mostly used for.

Why React Native?

First, professional relevance: I work as an AI and Machine Learning Engineer, so I usually work in the Python ecosystem. However, ML software doesn’t live in isolation and we are often building web or mobile applications, either as internal tools or for product integration of the machine learning systems. To be able to build web and mobile applications, better knowledge of React and the ecosystem makes a lot of sense to me. In fact, my whole team has recently decided to up-skill in this direction.

Second, personal interest: Since I stopped working as a web developer in 2017, I haven’t really followed the changes in the web and JS space. I’ve remained curious about web technology and have always wanted to be able to build mobile apps for my personal use and potential side projects. React Native offers both, plus a lot of the knowledge will transfer easily to vanilla React for the web.

How I am learning

I like reading traditional paper books when learning something new because I can focus better when I look at printed paper rather than a digital screen.

  • Book 1: Professional React Native by Alexander Kuttig. A compact overview of the important elements of React Native projects and a collection of best practices. The book is not comprehensive in listing the available API methods, but I like this style: It’s a fast-paced guide that I can use to start building my own projects. The book has many pointers on important packages. There are some mistakes in the code listings and the code formatting is sometimes broken, so the whole thing feels a little rushed. Still, I’d recommend it if you have previous programming experience.
  • Book 2: Learning TypeScript by Josh Goldberg. A compact, but detailed look at the TypeScript language. I have only covered the basics of the language to get me started on my own projects, but I will continue reading this book because I want to make use of the full power of TypeScript in my projects. It’s very well explained and has clearly gone through a better editing process than Book 1 (which is what I would expect from an O’Reilly publication). Clear recommendation.
  • Learning by doing: As I am working through these books (and googling anything I don’t know), I am building my first project, see below.

The two books I am currently reading to learn React Native and TypeScript.

My first project: A Mastodon client

Having looked at the Mastodon API in a previous (Python) project, I decided to build a Mastodon mobile app for my personal use – or rather my learning experience.

I have worked on the project for a few days now, and it is almost at MVP-level, meaning it provides some value to the user (i.e. to me).

My first project: A Mastodon client. I split the first feature (the Home timeline) into 3 steps to start with something simple and slowly build on top as I learn new concepts.

What I’ve implemented and learned so far:

Project setup of a React Native app

This took longer than expected because I needed to update the Node and Ruby versions on my Mac. It reminded me of the frustration I felt as a web developer 5+ years ago, when every few weeks the community moved to a new build tool and all dependencies had to remain compatible. The setup took around 2 hours, but I’m happy I came out on the other side: since then, the dev experience with React Native and hot-reloading of the app in the phone simulator has been pleasant.

Fetching the personal home timeline

I decided not to use any Mastodon API wrappers but to use the REST API directly. It helps me learn what’s actually going on. This is straightforward using fetch() and casting the result to a matching type definition in TypeScript. Reading the home timeline requires authentication. I haven’t built a UI-based login flow yet, but I am simply passing the auth token associated with my Mastodon account.

Display of the home timeline

This is the only real feature I’ve implemented, but it helped me to learn quite a bit:

  • Building and structuring React components
  • Using React hooks
  • Styling React Native views
  • Rendering the HTML content of the posts as native views
  • Implementing pull-to-refresh and infinite scrolling of a list view

What is still missing

For a full-fledged Mastodon client, I’ve maybe implemented 2% and the remaining 98% is still missing. Even for an MVP “read-only” app, I am still missing some crucial pieces:

  • Login flow
  • Display attachments (images, videos, …)
  • Detail view of a single toot (replies, like count, …)

I need to learn a few more core concepts to be able to implement these features, most notably navigation of multiple views and storing data on the device.

My plan is to build out this MVP version to continue learning the core concepts.

Afterwards, I will probably look for another project idea, one that is uniquely “my project”.

Ambitious ideas for this project

If I do end up working on the Mastodon app longer term, there are some ideas that would be fun to implement. In particular, I’d love to bring some of my Data Science / ML experience over to a mobile app. How about these ideas:

  • Detect the language of posts and split your timeline into localized versions
  • Detect the sentiment of posts and let the app know if you want to filter out clickbaity posts today
  • Summarize today’s posts in a short text (possible GPT3/ChatGPT integration)
  • Cluster posts into topics (like “news”, “meme”, “personal” or “cat content”) so that you can decide if you’re in the mood to explore or simply want to focus on what’s relevant today
  • Include tools to explore your Mastodon instance or the whole fediverse: Find accounts you would like, and find accounts that are popular outside your own circles. Some inspiration is in my previous post on exploring the Fediverse.

Follow along

If you want to follow along, you can find my current project progress on Github. Remember that this isn’t meant as an actual Mastodon client, but as an educational exercise for myself. Use at your own risk.

Github for the project source: https://github.com/floriandotpy/rn-mastodon

Exploring the Fediverse

Like many, I have been looking for a new digital community in the past few weeks (the old one is on fire) and have found a place on Mastodon.

You can find and follow me at https://sigmoid.social/@florian

I’ve picked the Mastodon instance sigmoid.social, an AI-related instance that is only 3 months old but already has close to 7000 users.

Machines talking to each other

Each Mastodon instance has a public API so it’s straightforward to fetch some basic statistics even without any authentication. I wrote some simple Python scripts to fetch basic info about my home instance.

You can find my scripts on Github if you’re interested in doing something similar (very rough code): https://github.com/floriandotpy/mastodon-stats
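
For reference, fetching such basic numbers is a single unauthenticated request per instance. A minimal sketch (not the exact code from the repo):

```python
# Fetch basic public statistics of a Mastodon instance, no authentication needed.
import requests

def instance_stats(domain: str) -> dict:
    resp = requests.get(f"https://{domain}/api/v1/instance", timeout=10)
    resp.raise_for_status()
    return resp.json().get("stats", {})  # user_count, status_count, domain_count

print(instance_stats("sigmoid.social"))
```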

Who else is on my home instance?

I wondered: Who are the other users on sigmoid.social? To gain an overview, I fetched the profiles of all user accounts that are discoverable (which at the time of writing means 1300 accounts out of 6700).

Most profiles have a personal description text, typically a short bio. I plotted these as an old-fashioned word cloud.
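
Roughly, this can be done with the instance’s public directory endpoint plus the wordcloud package. The sketch below illustrates the idea; it is not the exact script from my repo.

```python
# Collect bios of discoverable accounts via the public directory endpoint
# and render a word cloud. A sketch, not the exact script from the repo.
import re
import requests
from wordcloud import WordCloud

def fetch_bios(domain: str, pages: int = 20, page_size: int = 80) -> list[str]:
    bios = []
    for offset in range(0, pages * page_size, page_size):
        resp = requests.get(
            f"https://{domain}/api/v1/directory",
            params={"local": "true", "limit": page_size, "offset": offset},
            timeout=10,
        )
        for account in resp.json():
            bios.append(re.sub(r"<[^>]+>", " ", account.get("note", "")))  # strip HTML tags
    return bios

cloud = WordCloud(width=1200, height=600).generate(" ".join(fetch_bios("sigmoid.social")))
cloud.to_file("wordcloud.png")
```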

The insight isn’t that surprising: The place is swarming with ML researchers and research scientists, both from universities and commercial research labs.

Who is present on sigmoid.social? Getting an overview from this word cloud generated from user profile bios.

A stroll through the neighborhood

You don’t want to have an account surrounded by AI folk? No problem, there are more than 12,000 instances to choose from (according to a recent number I found). And they can all talk to each other.

I wanted to see how connected the instance sigmoid.social is and plotted its neighborhood.

This is the method I used to generate the neighborhood graph:

  1. Fetch the 1000 most recent posts present on the instance (which can originate from any other Mastodon instance).
  2. Identify all instances that occur among these posts, and fetch their respective recent posts.
  3. With all these posts of a few hundred instances, create a graph: Each instance becomes a node. Two nodes are connected by an edge if at least five of the recent posts connect the two instances.

My method is naive, but it works sufficiently well to create a simple undirected graph.
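
To illustrate the counting and threshold part of the method, here is a sketch using networkx. It only handles the home instance’s own public timeline and assumes each post’s origin instance can be read from the author’s account URL; the real script repeats the fetch for every discovered neighbor.

```python
# Sketch of the neighborhood graph: instances become nodes, and two instances
# get an edge once at least five fetched posts connect them. Assumes each
# post's origin instance can be read from the author's account URL.
from collections import Counter
from urllib.parse import urlparse

import networkx as nx
import requests

def recent_posts(domain: str, limit: int = 40) -> list[dict]:
    # /api/v1/timelines/public returns at most 40 posts per request;
    # reaching ~1000 posts requires paging with max_id.
    resp = requests.get(f"https://{domain}/api/v1/timelines/public",
                        params={"limit": limit}, timeout=10)
    return resp.json()

def origin_instance(post: dict) -> str:
    return urlparse(post["account"]["url"]).netloc

home = "sigmoid.social"
pair_counts: Counter[tuple[str, str]] = Counter()
for post in recent_posts(home):
    other = origin_instance(post)
    if other != home:
        pair_counts[tuple(sorted((home, other)))] += 1

graph = nx.Graph()
graph.add_edges_from(pair for pair, n in pair_counts.items() if n >= 5)
print(graph.number_of_nodes(), "instances,", graph.number_of_edges(), "edges")
```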

The graph yields another unsurprising insight: All roads lead to mastodon.social, the largest and most well-known instance (as far as I know).

Neighboring instances (based on their most recent 1000 toots).

Join us on Mastodon?

I may or may not become more active as a poster myself. In any case, feel free to come over and say Hi: https://sigmoid.social/@florian

To see how these figures were created, find the scripts on Github (very rough code): https://github.com/floriandotpy/mastodon-stats