Happy New Year, 2024
In late 2023, I worked on a bunch of small AI projects:
- osai-ubuntu configures Ubuntu 22.04 to run docker, nvidia-driver, and nvidia-container-toolkit.
- osai-apps is a collection of scripts and config files I wrote in order to run open source AI apps in GPU-accelerated docker containers.
- browserlab is a set of projects I wrote that use the browser as a front-end for AI-powered backend apps.
- jupyter-notebooks are a bunch of very small notebooks that I wrote as I learned about neural networks, LLMs, and diffusion models.
- memory-cache is a browser plugin that connects your local content store with a RAG-enabled LLM.
- AI Guide is a blog covering various AI-related topics. I submitted an article about image generation, which is also published on my personal blog.
I learned something useful from each of these projects. But none of these projects required me to actually ship software in a way that other people (especially non-developers) could use or run.
With my first project of 2024, I aim to change that.
OSAI Kube
osai-kube is a set of scripts and config files for setting up a kubernetes cluster for running AI apps.
With osai-kube, my goal is to ship software that regular people can use to explore the capabilities of open source AI models. This will allow me to:
- Learn about development, distribution, cost and maintenance of cloud-based AI apps.
- Share my work and the work of other app developers.
- Get rapid feedback on small product prototypes and ideas.
- Solve real problems related to accounts, auth, security, scaling, storage, and metrics.
Here is the core functionality / set of features that I am aiming for:
- Each application specifies its hardware requirements (e.g. GPUs) in a declarative, version-controlled config file (i.e. kubernetes manifests).
- GPU nodes auto-scale in app-specific node pools. A special process called the supervisor monitors the Kubernetes cluster and the backing GCP nodes to determine when it can scale up or scale down.
- Accounts and auth are implemented once, for all applications.
- Authentication and authorization are powered by keycloak and keycloak-gatekeeper, which means...
  - Authentication is flexible. OpenID Connect, SAML 2.0, and various social network identity providers are supported.
  - Authorization is uniform and flexible. Permissioning is organized with roles, sub-roles and groups.
  - Individual applications do not need to implement auth. Requests are routed through gatekeepers, which run as sidecar containers in each application's pod.
- SSL termination and routing are handled by a traefik reverse proxy.
- Applications share object storage, so that users can bring their data (images, documents, etc.) with them between applications. This will enable workflows involving several "tools" (similar to Runway's online editing platform).
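To make the first feature in the list concrete, here is a minimal sketch of how an app's GPU requirement could be declared. It builds a Kubernetes Deployment manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. This is not the actual osai-kube config: the function name, app name, image, and node pool name are hypothetical, and the cloud.google.com/gke-nodepool node label assumes the GCP nodes are managed by GKE. nvidia.com/gpu is the resource name exposed by the NVIDIA device plugin.

```python
import json


def gpu_deployment(name, image, gpus=1, node_pool=None):
    """Return a Deployment manifest (as a dict) whose container requests GPUs.

    Hypothetical helper for illustration only.
    """
    pod_spec = {
        "containers": [
            {
                "name": name,
                "image": image,
                # GPUs are requested through resource limits.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }
        ]
    }
    if node_pool is not None:
        # Assumption: GKE, which labels every node with its node pool name.
        pod_spec["nodeSelector"] = {"cloud.google.com/gke-nodepool": node_pool}

    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": pod_spec,
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical app, image, and pool names.
    manifest = gpu_deployment(
        "diffusion-demo",
        "example.com/diffusion-demo:latest",
        gpus=1,
        node_pool="diffusion-demo-pool",
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file, checked into version control, and applied with kubectl apply -f.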
Getting to version 1.0
I have a minimum-viable version of osai-kube that I'm aiming to complete by the end of January. That version includes:
- A few small, simple AI-powered programs for interacting with diffusion models and LLMs.
- User accounts and roles that grant access to these programs.
- Shared object storage for generated/uploaded images.
- At least one human user who is not me using the site.
- GPU node auto-scaling and a "loading" screen for the five minutes it takes to spin up a node.
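The scaling decision behind that last item can be sketched as a pair of small pure functions. This is only an illustration of the idea, not the actual supervisor: the function names, the one-pod-per-node assumption, the 15-minute idle timeout, and the default pool size are all hypothetical.

```python
def desired_gpu_nodes(current_nodes, pending_pods, idle_seconds,
                      max_nodes=1, idle_timeout=900.0):
    """Return how many GPU nodes an app's node pool should have.

    Hypothetical policy:
    - scale up while pods are waiting for a GPU (assumes one pod per node),
      but never beyond max_nodes;
    - scale down to zero once the pool has been idle for idle_timeout seconds;
    - otherwise leave the pool as it is.
    """
    if pending_pods > 0:
        return min(max_nodes, current_nodes + pending_pods)
    if current_nodes > 0 and idle_seconds >= idle_timeout:
        return 0
    return current_nodes


def show_loading_screen(ready_nodes, pending_pods):
    """Show the "loading" page while no GPU node is ready or pods are waiting."""
    return ready_nodes == 0 or pending_pods > 0


if __name__ == "__main__":
    # A user opens an app while the pool is scaled to zero.
    print(desired_gpu_nodes(current_nodes=0, pending_pods=1, idle_seconds=0.0))
    print(show_loading_screen(ready_nodes=0, pending_pods=1))
```

Keeping the policy free of side effects means it can be tested without a cluster; the code that actually resizes a node pool would live elsewhere.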
Version 1.0 probably does not include:
- Graceful recovery from failures. For example, my deployments sometimes seem to fail to update because GPU nodes have been exhausted. If the Kubernetes "Recreate" deployment strategy (which terminates the old pods before starting new ones, so that extra resources do not need to be spun up whenever I want to update the pods) does not work for some reason, then I'll just have to debug it manually to get things running.
- Database backups, replicas, caching or anything fancy that you'd want for a reliable production service. Same for object storage.
- An actually good / useful / well-designed application. My goal is to set up the cluster so that as I am working on new apps, I have a way to ship and share them. I have not yet built a specific app that I am trying to get people to use.
- Custom themes for keycloak or an overarching user interface (e.g. webpage headers/footers for navigation) for the various apps running in the cluster. For now the goal is to run apps with little to no custom code related to their runtime environment (osai-kube). Perhaps I'll explore running the apps in iframes so that I can add some platform chrome while keeping the apps as simple as possible. (After all, I want to run these same apps unchanged on local devices without all these infrastructural dependencies.)
- A home page. This blog post and a conversation with a person might be the replacement for any landing page meant to inform a user about what to expect from the website.
- A mailing list or contact forms. I'll talk to the few people I share this with directly.