Installation

Installing OME to a Kubernetes Cluster

Before you begin

OME supports three deployment modes, RawDeployment, Serverless, and MultiNodeRayVLLM, which deploy an InferenceService using the Kubernetes resources Deployment, Service, Ingress, and Horizontal Pod Autoscaler.

  • RawDeployment mode, compared with Serverless mode, removes Knative limitations such as the inability to mount multiple volumes. On the other hand, scaling down to and up from zero is not supported in RawDeployment mode. RawDeployment mode also supports scaling based on custom metrics with KEDA and Prometheus.
  • Serverless mode enables autoscaling based on request volume and supports scaling down to and up from zero. It also supports revision management and canary rollouts based on revisions.
  • MultiNodeRayVLLM mode deploys a multi-node Ray cluster and serves a vLLM model through an InferenceService. This mode does not support autoscaling or canary deployment.

The minimum required Kubernetes version is 1.27.1. If you plan to use Istio, also check which Istio versions are recommended for your Kubernetes version.
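The version requirement can be checked with a small script. The following is a minimal sketch, assuming GNU `sort` (for its `-V` version sort) is available. `version_ge` is a hypothetical helper name, and the sample version `v1.28.3` is a placeholder for whatever server version `kubectl version` reports for your cluster.

```shell
#!/usr/bin/env bash
# Minimum Kubernetes version stated in this guide.
MIN_K8S_VERSION="1.27.1"

# version_ge A B: succeeds when version A is greater than or equal to version B.
# A leading "v" (as in "v1.28.3") is stripped before comparing.
# Assumes GNU sort, which provides -V (version sort).
version_ge() {
  local a="${1#v}"
  local b="${2#v}"
  [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# Placeholder input: replace "v1.28.3" with your cluster's server version.
if version_ge "v1.28.3" "$MIN_K8S_VERSION"; then
  echo "Kubernetes version is supported"
else
  echo "Kubernetes version is too old"
fi
```

Versions with vendor suffixes (as some managed services report) may not sort as expected, so treat the result as a quick sanity check only.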

Make sure the following conditions are met:

  • A Kubernetes cluster with version 1.27 or newer is running. Learn how to install the Kubernetes tools.
  • The kubectl command-line tool is configured to communicate with your cluster.
  • The cluster has cert-manager installed.
  • The cluster has Knative Serving and Istio installed (Serverless mode only).
  • The cluster has KEDA and Prometheus installed (custom metrics scaling only).
  • The cluster has Ray installed (MultiNodeRayVLLM mode only).
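The conditions above can be summarized per mode. The sketch below only encodes that list as a lookup; `prereqs_for_mode` is a hypothetical helper name, and the component labels it prints are informal names chosen for this example, not resource names in the cluster.

```shell
#!/usr/bin/env bash
# prereqs_for_mode MODE: prints the components this guide lists for a mode.
# KEDA and Prometheus are additionally needed for custom metrics scaling.
prereqs_for_mode() {
  case "$1" in
    RawDeployment)    echo "cert-manager" ;;
    Serverless)       echo "cert-manager istio knative-serving" ;;
    MultiNodeRayVLLM) echo "cert-manager ray" ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}

# Print the list for every mode described in this guide.
for mode in RawDeployment Serverless MultiNodeRayVLLM; do
  echo "$mode: $(prereqs_for_mode "$mode")"
done
```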

1. Install Istio

Istio is required for Serverless mode. The minimum required Istio version is 1.19. Refer to the Istio install guide for installation steps.

Once Istio is installed, create an IngressClass resource for Istio:

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: istio
spec:
  controller: istio.io/ingress-controller
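One way to create this resource is to save the manifest to a file and apply it with kubectl. The following is a sketch under that assumption. The file location is an arbitrary choice for this example, and the apply step is skipped when kubectl is not on the PATH.

```shell
#!/usr/bin/env bash
# Write the IngressClass manifest from this section to a file.
# The location is an arbitrary choice for this example.
MANIFEST="${TMPDIR:-/tmp}/istio-ingressclass.yaml"
cat > "$MANIFEST" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: istio
spec:
  controller: istio.io/ingress-controller
EOF

# Apply the manifest only when kubectl is available on this machine.
if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f "$MANIFEST" || echo "kubectl apply failed; check cluster access" >&2
  # Confirm that the resource now exists.
  kubectl get ingressclass istio || true
else
  echo "kubectl not found; manifest written to $MANIFEST"
fi
```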

!!! Note If you are running on a managed Kubernetes service, you can use the managed Istio service provided by the cloud provider.

!!! Note Istio ingress is recommended, but you can install a different Ingress controller and create an IngressClass resource for the controller you choose.

2. Install Cert Manager

The minimum required Cert Manager version is 1.9.0. Refer to the Cert Manager installation guide for installation steps.

!!! Note Cert Manager is required to provision webhook certificates for a production-grade installation. Alternatively, you can run a script that generates self-signed certificates.

3. Install Knative Serving

Please refer to the Knative Serving install guide. This is required for Serverless mode.

!!! note If you want to use PodSpec fields such as nodeSelector, affinity, or tolerations, which are now supported in the v1beta1 API spec, you need to turn on the corresponding feature flags in your Knative configuration.
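As an illustration of turning on those feature flags, the sketch below patches the `config-features` ConfigMap in the `knative-serving` namespace. The flag names (`kubernetes.podspec-nodeselector`, `kubernetes.podspec-affinity`, `kubernetes.podspec-tolerations`) are taken from the Knative Serving feature-flag documentation rather than from this guide, so verify them against the Knative version you installed. `enable_podspec_features` is a hypothetical helper, and it only prints the command unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Merge patch that sets the three PodSpec feature flags to "enabled".
# Flag names are assumptions; check them against your Knative version.
FEATURES_PATCH='{"data":{"kubernetes.podspec-nodeselector":"enabled","kubernetes.podspec-affinity":"enabled","kubernetes.podspec-tolerations":"enabled"}}'

# enable_podspec_features: patches config-features in knative-serving.
# With DRY_RUN=1 (the default here) it only prints the command.
enable_podspec_features() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "kubectl -n knative-serving patch configmap config-features --type merge -p '$FEATURES_PATCH'"
  else
    kubectl -n knative-serving patch configmap config-features --type merge -p "$FEATURES_PATCH"
  fi
}

enable_podspec_features
```

Run it with `DRY_RUN=0` to apply the patch to the cluster that kubectl currently points at.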

!!! note If you are using a private registry for your images, you need to configure Knative to skip image digest resolution for that registry. Open the config-deployment ConfigMap for editing:

kubectl -n knative-serving edit configmap config-deployment

Then add the following to the data section:

data:
  registriesSkippingTagResolving: ko.local, dev.local, ghcr.io
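As a non-interactive alternative to `kubectl edit`, the same setting can be applied with `kubectl patch`. This is a sketch that reuses the key and registry list shown above. The patch is applied only when kubectl is on the PATH.

```shell
#!/usr/bin/env bash
# Registry list copied from the ConfigMap snippet in this section.
REGISTRIES="ko.local, dev.local, ghcr.io"

# Build a JSON merge patch for the data section of config-deployment.
PATCH="$(printf '{"data":{"registriesSkippingTagResolving":"%s"}}' "$REGISTRIES")"
echo "$PATCH"

# Apply the patch only when kubectl is available on this machine.
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n knative-serving patch configmap config-deployment \
    --type merge -p "$PATCH" || echo "patch failed; check cluster access" >&2
else
  echo "kubectl not found; patch not applied"
fi
```

A merge patch replaces the value of the key rather than appending to it, so include any registries that are already listed in your ConfigMap.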

4. Install KEDA through Helm

Please refer to the KEDA install guide. KEDA is needed only if you want to scale on custom metrics.
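For orientation, a Helm-based install typically looks like the sketch below. The repository URL, chart name, release name, and the `keda` namespace follow the KEDA project's Helm instructions and are not specified by this guide, so confirm them in the KEDA install guide. `run` and `install_keda` are hypothetical helpers, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# run CMD...: prints the command when DRY_RUN=1 (the default here),
# otherwise executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# install_keda: adds the KEDA chart repository and installs the chart.
# Repository URL, release name, and namespace are assumptions taken
# from the KEDA Helm instructions.
install_keda() {
  run helm repo add kedacore https://kedacore.github.io/charts
  run helm repo update
  run helm install keda kedacore/keda --namespace keda --create-namespace
}

install_keda
```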

5. Install Prometheus

  1. Get Helm Repository Information
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
  2. Install kube-prometheus-stack
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack

6. Clone OME repository

The Go tools require that you clone the repository to the src/github.com/sgl-project/ome directory in your GOPATH.

To check out this repository:

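  1. Create your own fork of this repo if you plan to contribute changes.
  2. Clone it to your machine. The commands below clone the upstream repository; substitute the URL of your fork if you created one: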
mkdir -p ${GOPATH}/src/github.com/sgl-project
cd ${GOPATH}/src/github.com/sgl-project
git clone https://github.com/sgl-project/ome.git
cd ome
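The commands above rely on `GOPATH` being set in your shell. If it is not, the target directory can be derived as in the following sketch. `resolve_gopath` is a hypothetical helper that falls back to `go env GOPATH`, and then to `$HOME/go`, which is the default Go uses when the variable is unset.

```shell
#!/usr/bin/env bash
# resolve_gopath: prints the GOPATH to use.
# Order: the GOPATH environment variable, then `go env GOPATH`,
# then $HOME/go (Go's default when GOPATH is unset).
resolve_gopath() {
  if [ -n "${GOPATH:-}" ]; then
    echo "$GOPATH"
  elif command -v go >/dev/null 2>&1; then
    go env GOPATH
  else
    echo "$HOME/go"
  fi
}

# Directory under which this guide expects the ome repository.
OME_PARENT_DIR="$(resolve_gopath)/src/github.com/sgl-project"
echo "OME should be cloned under: $OME_PARENT_DIR"
```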

Once you reach this point, you are ready to do a full build and deploy as described below.

Install the latest development version

To install the latest development version of OME in your cluster, run the following command:

make install

The controller runs in the ome namespace.
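To confirm that the installation succeeded, you can inspect the pods in that namespace. The sketch below assumes only the `ome` namespace named above; `run` and `check_ome` are hypothetical helpers, the 300-second timeout is an arbitrary choice, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Namespace in which the OME controller runs, as stated in this guide.
OME_NAMESPACE="ome"

# run CMD...: prints the command when DRY_RUN=1 (the default here),
# otherwise executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# check_ome: lists the pods in the namespace, then waits until all of
# them report the Ready condition (the timeout is an arbitrary choice).
check_ome() {
  run kubectl get pods -n "$OME_NAMESPACE"
  run kubectl wait --for=condition=Ready pods --all -n "$OME_NAMESPACE" --timeout=300s
}

check_ome
```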

Uninstall

To uninstall OME, run the following command:

make uninstall