Installation
Before you begin
OME supports RawDeployment, Serverless, and MultiNodeRayVLLM modes to enable InferenceService deployment with the Kubernetes resources Deployment, Service, Ingress, and Horizontal Pod Autoscaler.
RawDeployment, compared with Serverless deployment, removes Knative limitations such as the restriction on mounting multiple volumes. On the other hand, scaling down to and from zero is not supported in RawDeployment mode. RawDeployment mode also supports scaling based on custom metrics with KEDA and Prometheus.
Serverless installation enables autoscaling based on request volume and supports scaling down to and from zero. It also supports revision management and canary rollout based on revisions.
MultiNodeRayVLLM mode enables deploying a Ray cluster with multiple nodes and vLLM model serving with InferenceService. This mode does not support autoscaling or canary deployment.
Kubernetes 1.27.1 is the minimum required version. Check which Istio versions are recommended for your Kubernetes version.
Make sure the following conditions are met:
- A Kubernetes cluster with version 1.27 or newer is running. Learn how to install the Kubernetes tools.
- The kubectl command-line tool can communicate with your cluster.
- The cluster has cert-manager installed.
- The cluster has Knative Serving and Istio installed for Serverless mode.
- The cluster has KEDA and Prometheus installed for custom metrics scaling.
- The cluster has Ray installed for MultiNodeRayVLLM mode.
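The first few conditions above can be checked from a shell before installing anything. The following is a minimal sketch, not part of OME: the function names are hypothetical, the version comparison relies on `sort -V` from GNU coreutils, and the `cert-manager` namespace is an assumption about where cert-manager was installed.

```shell
# Preflight sketch. Function names are illustrative, not part of OME.
MIN_K8S_VERSION="1.27.1"   # minimum version stated on this page

# Succeeds when version $1 is greater than or equal to version $2.
# Uses `sort -V` (version sort, GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Reads the JSON printed by `kubectl version -o json` on stdin and
# prints the server version without the leading "v", e.g. "1.27.3".
parse_server_version() {
  sed -n '/"serverVersion"/,/}/s/.*"gitVersion": *"v\([0-9.]*\).*/\1/p'
}

# Checks connectivity, the Kubernetes version, and cert-manager.
preflight() {
  kubectl cluster-info >/dev/null || {
    echo "kubectl cannot reach the cluster" >&2; return 1; }
  server="$(kubectl version -o json | parse_server_version)"
  version_ge "$server" "$MIN_K8S_VERSION" || {
    echo "Kubernetes $server is older than $MIN_K8S_VERSION" >&2; return 1; }
  # Assumption: cert-manager lives in its default "cert-manager" namespace.
  kubectl get deployments -n cert-manager
}
```

Source the snippet and run `preflight`. The mode-specific prerequisites (Knative Serving, Istio, KEDA, Prometheus, Ray) are not covered by this sketch.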
1. Install Istio
This is required for Serverless mode.
The minimum required Istio version is 1.19; refer to the Istio install guide.
Once Istio is installed, create an IngressClass resource for Istio.
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: istio
spec:
  controller: istio.io/ingress-controller
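One way to create this resource is to save the manifest to a file and apply it with kubectl. The sketch below writes the manifest shown above; the function name and the file name are arbitrary choices, not something OME prescribes.

```shell
# Writes the IngressClass manifest from this page to the path given as $1.
# The function name is illustrative.
write_istio_ingressclass() {
  cat > "$1" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: istio
spec:
  controller: istio.io/ingress-controller
EOF
}

# Example usage (requires access to the cluster):
#   write_istio_ingressclass istio-ingressclass.yaml
#   kubectl apply -f istio-ingressclass.yaml
#   kubectl get ingressclass istio
```

IngressClass is a cluster-scoped resource, so no namespace flag is needed when applying it.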
!!! Note If you are running on a managed Kubernetes service, you can use the managed Istio service provided by the cloud provider.
!!! Note Istio ingress is recommended, but you can install other Ingress controllers and create an IngressClass resource for your Ingress option.
2. Install Cert Manager
The minimum required Cert Manager version is 1.9.0; refer to the Cert Manager installation guide.
!!! Note Cert Manager is required to provision webhook certificates for a production-grade installation. Alternatively, you can run a self-signed certificate generation script.
3. Install Knative Serving
Please refer to the Knative Serving install guide. This is required for Serverless mode.
!!! Note If you want to use PodSpec fields such as nodeSelector, affinity, or tolerations, which are now supported in the v1beta1 API spec, you need to turn on the corresponding feature flags in your Knative configuration.
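Knative Serving reads its feature flags from the `config-features` ConfigMap in the `knative-serving` namespace, where the PodSpec flags are named `kubernetes.podspec-<field>`. A minimal sketch of turning them on with a merge patch follows; the helper names are hypothetical, and you should confirm the flag names against the Knative version you run.

```shell
# Builds a JSON merge patch that sets kubernetes.podspec-<name> to
# "enabled" for every name passed as an argument.
feature_flags_patch() {
  patch=""
  for flag in "$@"; do
    patch="${patch:+$patch,}\"kubernetes.podspec-$flag\":\"enabled\""
  done
  printf '{"data":{%s}}' "$patch"
}

# Applies the patch to the config-features ConfigMap.
enable_podspec_flags() {
  kubectl -n knative-serving patch configmap config-features \
    --type merge -p "$(feature_flags_patch "$@")"
}

# Example usage:
#   enable_podspec_flags affinity nodeselector tolerations
```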
!!! Note If you are using a private registry for your images, you need to configure Knative to skip resolving image digests.
kubectl -n knative-serving edit configmap config-deployment
Add the following to the data section:
data:
  registriesSkippingTagResolving: ko.local, dev.local, ghcr.io
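`kubectl edit` opens an interactive editor. For scripted installs, the same change can be sketched as a merge patch. The ConfigMap name and key below are the ones shown above; the helper names are hypothetical.

```shell
# Builds a JSON merge patch that sets registriesSkippingTagResolving
# to the registry list passed as $1.
skip_tag_patch() {
  printf '{"data":{"registriesSkippingTagResolving":"%s"}}' "$1"
}

# Applies the patch to the config-deployment ConfigMap in knative-serving.
patch_knative_registries() {
  kubectl -n knative-serving patch configmap config-deployment \
    --type merge -p "$(skip_tag_patch "$1")"
}

# Example usage, with the same list as above:
#   patch_knative_registries "ko.local, dev.local, ghcr.io"
```

A merge patch only touches the named key, so other entries in the ConfigMap's data section are left unchanged.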
4. Install KEDA through Helm
Please refer to the KEDA install guide.
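For orientation, the Helm commands from the KEDA install guide look roughly like the sketch below. The `keda` release name and namespace follow the KEDA documentation; the `run` wrapper and the `DRY_RUN` variable are hypothetical conveniences added here so the commands can be previewed before they are executed.

```shell
# Runs a command, or only prints it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Adds the KEDA chart repository and installs the chart into the
# "keda" namespace, creating the namespace if it does not exist.
install_keda() {
  run helm repo add kedacore https://kedacore.github.io/charts
  run helm repo update
  run helm install keda kedacore/keda --namespace keda --create-namespace
}

# Example usage:
#   DRY_RUN=1 install_keda   # print the commands only
#   install_keda             # run them against the current cluster
```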
5. Install Prometheus
- Get Helm Repository Information
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
- Install kube-prometheus-stack
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack
6. Clone OME repository
The Go tools require that you clone the repository to the src/github.com/sgl-project/ome directory in your GOPATH.
To check out this repository:
- Create your own fork of this repo
- Clone it to your machine:
mkdir -p ${GOPATH}/src/github.com/sgl-project
cd ${GOPATH}/src/github.com/sgl-project
git clone https://github.com/sgl-project/ome.git
cd ome
Once you reach this point, you are ready to do a full build and deploy as described below.
Install the latest development version
To install the latest development version of OME in your cluster, run the following command:
make install
The controller runs in the ome namespace.
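After the install finishes, a quick check that the controller pods came up can be sketched as follows. The `all_running` helper is hypothetical; it only inspects the STATUS column of `kubectl get pods --no-headers`, so a pod that is Running but not yet Ready still passes, and a pod that has finished with status Completed does not.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and succeeds
# only when at least one pod is listed and every pod's STATUS column
# (third field) is "Running".
all_running() {
  awk '$3 != "Running" { bad = 1 } END { exit (NR == 0 || bad) }'
}

# Example usage:
#   kubectl get pods -n ome --no-headers | all_running \
#     && echo "OME pods are running"
```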
Uninstall
To uninstall OME, run the following command:
make uninstall