OME API

Generated API reference documentation for ome.io/v1beta1.

Package v1beta1 contains API Schema definitions for the serving v1beta1 API group

Resource Types

BaseModel

BaseModel is the Schema for the basemodels API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
BaseModel
spec [Required]
BaseModelSpec
No description provided.
status [Required]
ModelStatusSpec
No description provided.

BenchmarkJob

BenchmarkJob is the Schema for the BenchmarkJobs API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
BenchmarkJob
spec [Required]
BenchmarkJobSpec
No description provided.
status [Required]
BenchmarkJobStatus
No description provided.

ClusterBaseModel

ClusterBaseModel is the Schema for the basemodels API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
ClusterBaseModel
spec [Required]
BaseModelSpec
No description provided.
status [Required]
ModelStatusSpec
No description provided.

ClusterServingRuntime

ClusterServingRuntime is the Schema for the servingruntimes API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
ClusterServingRuntime
spec [Required]
ServingRuntimeSpec
No description provided.
status [Required]
ServingRuntimeStatus
No description provided.

FineTunedWeight

FineTunedWeight is the Schema for the finetunedweights API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
FineTunedWeight
spec [Required]
FineTunedWeightSpec
No description provided.
status [Required]
ModelStatusSpec
No description provided.

InferenceService

InferenceService is the Schema for the InferenceServices API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
InferenceService
spec [Required]
InferenceServiceSpec
No description provided.
status [Required]
InferenceServiceStatus
No description provided.

ServingRuntime

ServingRuntime is the Schema for the servingruntimes API

Field | Description
apiVersion
string
ome.io/v1beta1
kind
string
ServingRuntime
spec [Required]
ServingRuntimeSpec
No description provided.
status [Required]
ServingRuntimeStatus
No description provided.

BaseModelSpec

BaseModelSpec defines the desired state of BaseModel

Field | Description
modelFormat
ModelFormat
No description provided.
modelType
string

ModelType defines the architecture family of the model (e.g., "bert", "gpt2", "llama"). This value typically corresponds to the "model_type" field in a Hugging Face model's config.json. It is used to identify the transformer architecture and inform runtime selection and tokenizer behavior.

modelFramework
ModelFrameworkSpec

ModelFramework specifies the underlying framework used by the model, such as "ONNX", "TensorFlow", "PyTorch", "Transformer", or "TensorRTLLM". This value helps determine the appropriate runtime for model serving.

modelArchitecture
string

ModelArchitecture specifies the concrete model implementation or head, such as "LlamaForCausalLM", "GemmaForCausalLM", or "MixtralForCausalLM". This is often derived from the "architectures" field in Hugging Face config.json.

quantization
ModelQuantization

Quantization defines the quantization scheme applied to the model weights, such as "fp8", "fbgemm_fp8", or "int4". This influences runtime compatibility and performance.

modelParameterSize
string

ModelParameterSize indicates the total number of parameters in the model, expressed in human-readable form such as "7B", "13B", or "175B". This can be used for scheduling or runtime selection.

modelCapabilities
[]string

ModelCapabilities of the model, e.g., "TEXT_GENERATION", "TEXT_SUMMARIZATION", "TEXT_EMBEDDINGS"

modelConfiguration
k8s.io/apimachinery/pkg/runtime.RawExtension

Configuration of the model, stored as generic JSON for flexibility.

storage [Required]
StorageSpec

Storage configuration for the model

ModelExtensionSpec [Required]
ModelExtensionSpec
(Members of ModelExtensionSpec are embedded into this type.)

ModelExtension is the common extension of the model

servingMode [Required]
[]string
No description provided.
maxTokens
int32

MaxTokens is the maximum number of tokens that can be processed by the model

additionalMetadata
map[string]string

Additional metadata for the model
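
Combining the fields above, a BaseModel manifest might look like the sketch below. The model name, parameter size, and storage URI are hypothetical, and the storage block assumes a StorageSpec with a storageUri field (StorageSpec itself is defined elsewhere).

```yaml
# Illustrative BaseModel; names and values are placeholders.
apiVersion: ome.io/v1beta1
kind: BaseModel
metadata:
  name: llama-3-8b-instruct
  namespace: default
spec:
  modelType: llama                      # from config.json "model_type"
  modelArchitecture: LlamaForCausalLM   # from config.json "architectures"
  modelParameterSize: "8B"
  modelFormat:
    name: SafeTensors
  modelFramework:
    name: Transformer
  modelCapabilities:
    - TEXT_GENERATION
  maxTokens: 8192
  storage:
    storageUri: oci://models/llama-3-8b-instruct   # assumed StorageSpec field
```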

BenchmarkJobSpec

BenchmarkJobSpec defines the specification for a benchmark job. All fields within this specification collectively represent the desired state and configuration of a BenchmarkJob.

Field | Description
huggingFaceSecretReference
HuggingFaceSecretReference

HuggingFaceSecretReference is a reference to a Kubernetes Secret containing the Hugging Face API key. The referenced Secret must reside in the same namespace as the BenchmarkJob. This field replaces the raw HuggingFaceAPIKey field for improved security.

endpoint [Required]
EndpointSpec

Endpoint is the reference to the inference service to benchmark.

serviceMetadata
ServiceMetadata

ServiceMetadata records metadata about the backend model server or service being benchmarked. This includes details such as server engine, version, and GPU configuration for filtering experiments.

task [Required]
string

Task specifies the task to benchmark, in the pattern <input>-to-<output> (e.g., "text-to-text", "image-to-text").

trafficScenarios
[]string

TrafficScenarios contains a list of traffic scenarios to simulate during the benchmark. If not provided, defaults will be assigned via genai-bench.

numConcurrency
[]int

NumConcurrency defines a list of concurrency levels to test during the benchmark. If not provided, defaults will be assigned via genai-bench.

maxTimePerIteration [Required]
int

MaxTimePerIteration specifies the maximum time (in minutes) for a single iteration. Each iteration runs for a specific combination of TrafficScenarios and NumConcurrency.

maxRequestsPerIteration [Required]
int

MaxRequestsPerIteration specifies the maximum number of requests for a single iteration. Each iteration runs for a specific combination of TrafficScenarios and NumConcurrency.

additionalRequestParams
map[string]string

AdditionalRequestParams contains additional request parameters as a map.

dataset
StorageSpec

Dataset is the dataset used for benchmarking. It is optional and only required for tasks other than "text-to-<output>" tasks.

outputLocation [Required]
StorageSpec

OutputLocation specifies where the benchmark results will be stored (e.g., object storage).

resultFolderName
string

ResultFolderName specifies the name of the folder that stores the benchmark result. A default name will be assigned if not specified.

podOverride
PodOverride

PodOverride defines the pod configuration for the benchmark job. This is optional; if not provided, default values will be used.
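
Putting the fields above together, a BenchmarkJob might look like the sketch below. The service reference, bucket path, secret name, and iteration limits are all hypothetical, and the StorageSpec storageUri field is assumed from the referenced StorageSpec type rather than defined in this section.

```yaml
# Illustrative BenchmarkJob; names and values are placeholders.
apiVersion: ome.io/v1beta1
kind: BenchmarkJob
metadata:
  name: llama-bench
  namespace: default
spec:
  endpoint:
    inferenceService:
      name: llama-3-8b-instruct   # hypothetical InferenceService
      namespace: default
  task: text-to-text
  numConcurrency: [1, 4, 16]
  maxTimePerIteration: 15         # minutes per scenario/concurrency pair
  maxRequestsPerIteration: 1000
  outputLocation:
    storageUri: oci://bench-results/llama   # assumed StorageSpec field
  huggingFaceSecretReference:
    name: hf-api-key              # Secret in the same namespace
```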

BenchmarkJobStatus

BenchmarkJobStatus reflects the state and results of the benchmark job. It will be set and updated by the controller.

Field | Description
state [Required]
string

State represents the current state of the benchmark job: "Pending", "Running", "Completed", "Failed".

startTime
k8s.io/apimachinery/pkg/apis/meta/v1.Time

StartTime is the timestamp for when the benchmark job started.

completionTime
k8s.io/apimachinery/pkg/apis/meta/v1.Time

CompletionTime is the timestamp for when the benchmark job completed, either successfully or unsuccessfully.

lastReconcileTime
k8s.io/apimachinery/pkg/apis/meta/v1.Time

LastReconcileTime is the timestamp for the last time the job was reconciled by the controller.

failureMessage
string

FailureMessage contains any error messages if the benchmark job failed.

details
string

Details provide additional information or metadata about the benchmark job.

ComponentExtensionSpec

ComponentExtensionSpec defines the deployment configuration for a given InferenceService component

Field | Description
minReplicas
int

Minimum number of replicas, defaults to 1 but can be set to 0 to enable scale-to-zero.

maxReplicas
int

Maximum number of replicas for autoscaling.

scaleTarget
int

ScaleTarget specifies the integer target value of the metric type the Autoscaler watches for. concurrency and rps targets are supported by Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-targets/).

scaleMetric
ScaleMetric

ScaleMetric defines the scaling metric type watched by the autoscaler. Possible values are concurrency, rps, cpu, and memory; concurrency and rps are supported via the Knative Pod Autoscaler (https://knative.dev/docs/serving/autoscaling/autoscaling-metrics).

containerConcurrency
int64

ContainerConcurrency specifies how many requests can be processed concurrently; this sets the hard limit of the container concurrency (https://knative.dev/docs/serving/autoscaling/concurrency).

timeoutSeconds
int64

TimeoutSeconds specifies the number of seconds to wait before timing out a request to the component.

canaryTrafficPercent
int64

CanaryTrafficPercent defines the traffic split percentage between the candidate revision and the last ready revision

labels
map[string]string

Labels that will be added to the component pod. More info: http://kubernetes.io/docs/user-guide/labels

annotations
map[string]string

Annotations that will be added to the component pod. More info: http://kubernetes.io/docs/user-guide/annotations

deploymentStrategy
k8s.io/api/apps/v1.DeploymentStrategy

The deployment strategy to use to replace existing pods with new ones. Only applicable for raw deployment mode.

kedaConfig [Required]
KedaConfig
No description provided.
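
Since ComponentExtensionSpec is embedded into component specs (engine, decoder, etc.), its fields appear inline. A sketch of how these fields might look inside a component block, with illustrative values:

```yaml
# ComponentExtensionSpec fields inlined in a component spec (values illustrative):
minReplicas: 1            # 0 would enable scale-to-zero
maxReplicas: 4
scaleMetric: concurrency
scaleTarget: 10           # target concurrent requests per replica
containerConcurrency: 16  # hard concurrency limit per container
timeoutSeconds: 300
canaryTrafficPercent: 10
```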

ComponentStatusSpec

ComponentStatusSpec describes the state of the component

Field | Description
latestReadyRevision
string

Latest revision name that is in ready state

latestCreatedRevision
string

Latest revision name that is created

previousRolledoutRevision
string

Previous revision name that is rolled out with 100 percent traffic

latestRolledoutRevision
string

Latest revision name that is rolled out with 100 percent traffic

traffic
[]knative.dev/serving/pkg/apis/serving/v1.TrafficTarget

Traffic holds the configured traffic distribution for latest ready revision and previous rolled out revision.

url
knative.dev/pkg/apis.URL

URL holds the primary URL that will distribute traffic over the provided traffic targets. This will be one of the REST or gRPC endpoints that are available. It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}

restURL
knative.dev/pkg/apis.URL

REST endpoint of the component if available.

address
knative.dev/pkg/apis/duck/v1.Addressable

Addressable endpoint for the InferenceService

DecoderSpec

DecoderSpec defines the configuration for the Decoder component (token generation in PD-disaggregated deployments). It is used specifically in prefill-decode disaggregated deployments to handle the token generation phase. Similar to EngineSpec in structure, it allows detailed pod and container configuration, but applies only to the decode phase when separating prefill and decode processes.

Field | Description
PodSpec
PodSpec
(Members of PodSpec are embedded into this type.)

This spec provides a full PodSpec for the decoder component. It allows complete customization of the Kubernetes Pod configuration, including containers, volumes, security contexts, affinity rules, and other pod settings.

ComponentExtensionSpec [Required]
ComponentExtensionSpec
(Members of ComponentExtensionSpec are embedded into this type.)

ComponentExtensionSpec defines deployment configuration like min/max replicas, scaling metrics, etc. Controls scaling behavior and resource allocation for the decoder component.

runner
RunnerSpec

Runner container override for customizing the main container. This is essentially a container spec that can override the default container. It defines the main decoder container configuration, including image, resource requests/limits, environment variables, and command.

leader
LeaderSpec

Leader node configuration (only used for MultiNode deployment). Defines the pod and container spec for the leader node that coordinates distributed token generation in multi-node deployments.

worker
WorkerSpec

Worker nodes configuration (only used for MultiNode deployment). Defines the pod and container spec for worker nodes that perform distributed token generation tasks as directed by the leader.

Endpoint

Endpoint defines a direct URL-based inference service with additional API configuration.

Field | Description
url [Required]
string

URL represents the endpoint URL for the inference service.

apiFormat [Required]
string

APIFormat specifies the type of API, such as "openai" or "oci-cohere".

modelName [Required]
string

ModelName specifies the name of the model being served at the endpoint. Useful for endpoints that require model-specific configuration; for instance, for the openai API format, this is a required field in the request payload.

EndpointSpec

EndpointSpec defines a reference to an inference service. It supports either a Kubernetes-style reference (InferenceService) or an Endpoint struct for a direct URL. Cross-namespace references are supported for InferenceService but require appropriate RBAC permissions to access resources in the target namespace.

Field | Description
inferenceService
InferenceServiceReference

InferenceService holds a Kubernetes reference to an internal inference service.

endpoint
Endpoint

Endpoint holds the details of a direct endpoint for an external inference service, including URL and API details.
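
The two reference styles above are alternatives. A sketch of both, with hypothetical names and URLs, as they might appear under a BenchmarkJob's endpoint field:

```yaml
# 1. Kubernetes-style reference to an internal InferenceService:
endpoint:
  inferenceService:
    name: llama-3-8b-instruct   # hypothetical service
    namespace: default

# 2. Direct URL to an external service, with API details:
endpoint:
  endpoint:                     # EndpointSpec.endpoint holds an Endpoint struct
    url: https://api.example.com/v1
    apiFormat: openai
    modelName: example-model    # placeholder
```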

EngineSpec

EngineSpec defines the configuration for the Engine component (usable for both single-node and multi-node deployments). It provides a comprehensive specification for deploying model serving containers and pods, allowing complete Kubernetes pod configuration including main containers, init containers, sidecars, volumes, and other pod-level settings. For distributed deployments, it supports leader-worker architecture configuration.

Field | Description
PodSpec
PodSpec
(Members of PodSpec are embedded into this type.)

This spec provides a full PodSpec for the engine component. It allows complete customization of the Kubernetes Pod configuration, including containers, volumes, security contexts, affinity rules, and other pod settings.

ComponentExtensionSpec [Required]
ComponentExtensionSpec
(Members of ComponentExtensionSpec are embedded into this type.)

ComponentExtensionSpec defines deployment configuration like min/max replicas, scaling metrics, etc. Controls scaling behavior and resource allocation for the engine component.

runner
RunnerSpec

Runner container override for customizing the engine container. This is essentially a container spec that can override the default container. It defines the main model runner container configuration, including image, resource requests/limits, environment variables, and command.

leader
LeaderSpec

Leader node configuration (only used for MultiNode deployment). Defines the pod and container spec for the leader node that coordinates distributed inference in multi-node deployments.

worker
WorkerSpec

Worker nodes configuration (only used for MultiNode deployment). Defines the pod and container spec for worker nodes that perform distributed processing tasks as directed by the leader.

FailureInfo

Field | Description
location
string

Name of component to which the failure relates (usually Pod name)

reason
FailureReason

High level class of failure

message
string

Detailed error message

modelRevisionName
string

Internal Revision/ID of model, tied to specific Spec contents

time
k8s.io/apimachinery/pkg/apis/meta/v1.Time

Time failure occurred or was discovered

exitCode
int32

Exit status from the last termination of the container

FailureReason

(Alias of string)

FailureReason enum

FineTunedWeightSpec

FineTunedWeightSpec defines the desired state of FineTunedWeight

Field | Description
baseModelRef [Required]
ObjectReference

Reference to the base model that this weight is fine-tuned from

modelType [Required]
string

ModelType of the fine-tuned weight, e.g., "Distillation", "Adapter", "Tfew"

hyperParameters [Required]
k8s.io/apimachinery/pkg/runtime.RawExtension

HyperParameters used for fine-tuning, stored as generic JSON for flexibility

ModelExtensionSpec [Required]
ModelExtensionSpec
(Members of ModelExtensionSpec are embedded into this type.)

ModelExtension is the common extension of the model

configuration
k8s.io/apimachinery/pkg/runtime.RawExtension

Configuration of the fine-tuned weight, stored as generic JSON for flexibility

storage [Required]
StorageSpec

Storage configuration for the fine-tuned weight

trainingJobRef
ObjectReference

TrainingJobRef is a reference to the training job that produced this weight
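
A FineTunedWeight manifest combining the fields above might look like the following sketch. The names, the hyperparameter payload (stored as generic JSON), and the StorageSpec storageUri field are all assumptions for illustration.

```yaml
# Illustrative FineTunedWeight; names and values are placeholders.
apiVersion: ome.io/v1beta1
kind: FineTunedWeight
metadata:
  name: llama-3-8b-support-adapter
spec:
  baseModelRef:
    name: llama-3-8b-instruct    # hypothetical BaseModel
    namespace: default
  modelType: Adapter
  hyperParameters:               # generic JSON; keys are hypothetical
    rank: 16
    alpha: 32
  storage:
    storageUri: oci://fine-tuned-weights/support-adapter   # assumed field
```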

HuggingFaceSecretReference

HuggingFaceSecretReference defines a reference to a Kubernetes Secret containing the Hugging Face API key. This secret must reside in the same namespace as the BenchmarkJob. Cross-namespace references are not allowed for security and simplicity.

Field | Description
name [Required]
string

Name of the secret containing the Hugging Face API key. The secret must reside in the same namespace as the BenchmarkJob.

InferenceServiceReference

InferenceServiceReference defines the reference to a Kubernetes inference service.

Field | Description
name [Required]
string

Name specifies the name of the inference service to benchmark.

namespace [Required]
string

Namespace specifies the Kubernetes namespace where the inference service is deployed. Cross-namespace references are allowed but require appropriate RBAC permissions.

InferenceServiceSpec

InferenceServiceSpec is the top level type for this resource

Field | Description
predictor
PredictorSpec

Predictor defines the model serving spec. It specifies how the model should be deployed and served, handling inference requests. Deprecated: Predictor is deprecated and will be removed in a future release. Please use Engine and Model fields instead.

engine
EngineSpec

Engine defines the serving engine spec. This provides detailed container and pod specifications for model serving. It allows defining the model runner (container spec), as well as complete pod specifications including init containers, sidecar containers, and other pod-level configurations. Engine can also be configured for multi-node deployments using leader and worker specifications.

decoder
DecoderSpec

Decoder defines the decoder spec. This is specifically used for PD (Prefill-Decode) disaggregated serving deployments. Similar to Engine in structure, it allows for container and pod specifications, but is only utilized when implementing the disaggregated serving pattern to separate the prefill and decode phases of inference.

model
ModelRef

Model defines the model to be used for inference, referencing either a BaseModel or a custom model. This allows models to be managed independently of the serving configuration.

runtime
ServingRuntimeRef

Runtime defines the serving runtime environment that will be used to execute the model. It is an inference service spec template that determines how the service should be deployed. Runtime is optional - if not defined, the operator will automatically select the best runtime based on the model's size, architecture, format, quantization, and framework.

router
RouterSpec

Router defines the router spec

kedaConfig [Required]
KedaConfig

KedaConfig defines the autoscaling configuration for KEDA. It provides settings for event-driven autoscaling using KEDA (Kubernetes Event-driven Autoscaling), allowing the service to scale based on custom metrics or event sources.
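
Putting the model/engine/runtime fields together, a minimal InferenceService might look like the sketch below. The model name is hypothetical, runtime is omitted so the operator auto-selects one, and the engine block assumes the embedded PodSpec/ComponentExtensionSpec fields appear inline as described above.

```yaml
# Illustrative InferenceService; names and resource values are placeholders.
apiVersion: ome.io/v1beta1
kind: InferenceService
metadata:
  name: llama-3-8b-instruct
  namespace: default
spec:
  model:
    name: llama-3-8b-instruct   # hypothetical ClusterBaseModel
    kind: ClusterBaseModel      # default kind
  # runtime omitted: the operator selects a runtime based on model
  # size, architecture, format, quantization, and framework.
  engine:
    minReplicas: 1
    maxReplicas: 4
    runner:                     # container-spec-style override
      resources:
        limits:
          nvidia.com/gpu: "1"
```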

InferenceServiceStatus

InferenceServiceStatus defines the observed state of InferenceService

Field | Description
Status [Required]
knative.dev/pkg/apis/duck/v1.Status
(Members of Status are embedded into this type.)

Conditions for the InferenceService

  • EngineRouteReady: engine route readiness condition;
  • DecoderRouteReady: decoder route readiness condition;
  • PredictorReady: predictor readiness condition;
  • RoutesReady (serverless mode only): aggregated routing condition, i.e. endpoint readiness condition;
  • LatestDeploymentReady (serverless mode only): aggregated configuration condition, i.e. latest deployment readiness condition;
  • Ready: aggregated condition;
address
knative.dev/pkg/apis/duck/v1.Addressable

Addressable endpoint for the InferenceService

url
knative.dev/pkg/apis.URL

URL holds the url that will distribute traffic over the provided traffic targets. It generally has the form http[s]://{route-name}.{route-namespace}.{cluster-level-suffix}

components [Required]
map[ComponentType]ComponentStatusSpec

Statuses for the components of the InferenceService

modelStatus [Required]
ModelStatus

Model related statuses

KedaConfig

KedaConfig stores the configuration settings for KEDA autoscaling within the InferenceService. It includes fields like the Prometheus server address, custom query, scaling threshold, and operator.

Field | Description
enableKeda [Required]
bool

EnableKeda determines whether KEDA autoscaling is enabled for the InferenceService.

  • true: KEDA will manage the autoscaling based on the provided configuration.
  • false: KEDA will not be used, and autoscaling will rely on other mechanisms (e.g., HPA).
promServerAddress [Required]
string

PromServerAddress specifies the address of the Prometheus server that KEDA will query to retrieve metrics for autoscaling decisions. This should be a fully qualified URL, including the protocol and port number.

Example: http://prometheus-operated.monitoring.svc.cluster.local:9090

customPromQuery [Required]
string

CustomPromQuery defines a custom Prometheus query that KEDA will execute to evaluate the desired metric for scaling. This query should return a single numerical value that represents the metric to be monitored.

Example: avg_over_time(http_requests_total{service="llama"}[5m])

scalingThreshold [Required]
string

ScalingThreshold sets the numerical threshold against which the result of the Prometheus query will be compared. Depending on the ScalingOperator, this threshold determines when to scale the number of replicas up or down.

Example: "10" - The Autoscaler will compare the metric value to 10.

scalingOperator [Required]
string

ScalingOperator specifies the comparison operator used by KEDA to decide whether to scale the Deployment. Common operators include:

  • "GreaterThanOrEqual": Scale up when the metric is >= ScalingThreshold.
  • "LessThanOrEqual": Scale down when the metric is <= ScalingThreshold.

This operator defines the condition under which scaling actions are triggered based on the evaluated metric.

Example: "GreaterThanOrEqual"
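
Combining the examples above, a complete kedaConfig block might look like this (the Prometheus address and query are taken from the field examples and remain placeholders for a real deployment):

```yaml
kedaConfig:
  enableKeda: true
  promServerAddress: http://prometheus-operated.monitoring.svc.cluster.local:9090
  customPromQuery: avg_over_time(http_requests_total{service="llama"}[5m])
  scalingThreshold: "10"
  scalingOperator: GreaterThanOrEqual
```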

LeaderSpec

LeaderSpec defines the configuration for a leader node in a multi-node component. The leader node coordinates the activities of worker nodes in distributed inference or token generation setups, handling task distribution and result aggregation.

Field | Description
PodSpec
PodSpec
(Members of PodSpec are embedded into this type.)

Pod specification for the leader node. This overrides the main PodSpec when specified, allowing customization of the Kubernetes Pod configuration specifically for the leader node.

runner
RunnerSpec

Runner container override for customizing the main container. This is essentially a container spec that can override the default container, providing fine-grained control over the container that executes the leader node's coordination logic.

LifeCycleState

(Alias of string)

LifeCycleState enum

ModelCopies

Field | Description
failedCopies [Required]
int

How many copies of this predictor's models failed to load recently

totalCopies
int

Total number of copies of this predictor's models that are currently loaded

ModelExtensionSpec

Field | Description
displayName
string

DisplayName is the user-friendly name of the model

version
string
No description provided.
disabled
bool

Disabled indicates whether the model is disabled

vendor
string

Vendor of the model, e.g., "NVIDIA", "Meta", "HuggingFace"

compartmentID
string

CompartmentID is the compartment ID of the model

ModelFormat

Field | Description
name [Required]
string

Name of the format in which the model is stored, e.g., "ONNX", "TensorFlow SavedModel", "PyTorch", "SafeTensors"

version
string

Version of the model format. Used in validating that a runtime supports a predictor. It can be "major", "major.minor" or "major.minor.patch".

ModelFrameworkSpec

Field | Description
name [Required]
string

Name of the library in which the model is stored, e.g., "ONNXRuntime", "TensorFlow", "PyTorch", "Transformer", "TensorRTLLM"

version
string

Version of the library. Used in validating that a runtime supports a predictor. It can be "major", "major.minor" or "major.minor.patch".

ModelQuantization

(Alias of string)

ModelRef

Field | Description
name [Required]
string

Name of the model being referenced. Identifies the specific model to be used for inference.

kind [Required]
string

Kind of the model being referenced. Defaults to ClusterBaseModel. Specifies the Kubernetes resource kind of the referenced model.

apiGroup [Required]
string

APIGroup of the resource being referenced. Defaults to ome.io. Specifies the Kubernetes API group of the referenced model.

fineTunedWeights
[]string

Optional references to fine-tuned weights that should be applied to the base model.

ModelRevisionStates

Field | Description
activeModelState [Required]
ModelState

High level state string: Pending, Standby, Loading, Loaded, FailedToLoad

targetModelState [Required]
ModelState
No description provided.

ModelSizeRangeSpec

ModelSizeRangeSpec defines the range of model sizes supported by this runtime

Field | Description
min
string

Minimum size of the model in bytes

max
string

Maximum size of the model in bytes

ModelSpec

Field | Description
runtime
string

Specific ClusterServingRuntime/ServingRuntime name to use for deployment.

PredictorExtensionSpec [Required]
PredictorExtensionSpec
(Members of PredictorExtensionSpec are embedded into this type.)

No description provided.
baseModel [Required]
string
No description provided.
fineTunedWeights [Required]
[]string
No description provided.

ModelState

(Alias of string)

ModelState enum

ModelStatus

Field | Description
transitionStatus [Required]
TransitionStatus

Whether the available predictor endpoints reflect the current Spec or are in transition

modelRevisionStates
ModelRevisionStates

State information of the predictor's model.

lastFailureInfo
FailureInfo

Details of last failure, when load of target model is failed or blocked.

modelCopies
ModelCopies

Model copy information of the predictor's model.

ModelStatusSpec

ModelStatusSpec defines the observed state of Model weight

Field | Description
lifecycle [Required]
string

LifeCycle is an enum of Deprecated, Experiment, Public, Internal

state [Required]
LifeCycleState

Status of the model weight

nodesReady [Required]
[]string
No description provided.
nodesFailed [Required]
[]string
No description provided.

ObjectReference

ObjectReference contains enough information to let you inspect or modify the referred object.

Field | Description
name [Required]
string

Name of the referenced object

namespace [Required]
string

Namespace of the referenced object

PodOverride

Field | Description
image
string

Image specifies the container image to use for the benchmark job.

env
[]k8s.io/api/core/v1.EnvVar

List of environment variables to set in the container.

envFrom
[]k8s.io/api/core/v1.EnvFromSource

List of sources to populate environment variables in the container.

volumeMounts
[]k8s.io/api/core/v1.VolumeMount

Pod volumes to mount into the container's filesystem.

resources
k8s.io/api/core/v1.ResourceRequirements

Compute Resources required by this container. Cannot be updated. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

tolerations
[]k8s.io/api/core/v1.Toleration

If specified, the pod's tolerations.

nodeSelector
map[string]string

NodeSelector is a selector which must be true for the pod to fit on a node: it must match the node's labels for the pod to be scheduled on that node. More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

affinity
k8s.io/api/core/v1.Affinity

If specified, the pod's scheduling constraints

volumes
[]k8s.io/api/core/v1.Volume

List of volumes that can be mounted by containers belonging to the pod. More info: https://kubernetes.io/docs/concepts/storage/volumes
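
A podOverride block for a BenchmarkJob, combining the fields above, might look like this sketch. The image, environment variable, and node selector values are hypothetical.

```yaml
# Illustrative podOverride; image and values are placeholders.
podOverride:
  image: ghcr.io/example/genai-bench:latest   # hypothetical image
  env:
    - name: LOG_LEVEL
      value: debug
  resources:
    requests:
      cpu: "2"
      memory: 4Gi
  nodeSelector:
    node.kubernetes.io/instance-type: example-type   # placeholder label value
```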

PodSpec

PodSpec is a description of a pod.

Field | Description
volumes
[]k8s.io/api/core/v1.Volume

List of volumes that can be mounted by containers belonging to the pod. More info: https://kubernetes.io/docs/concepts/storage/volumes

initContainers [Required]
[]k8s.io/api/core/v1.Container

List of initialization containers belonging to the pod. Init containers are executed in order prior to containers being started. If any init container fails, the pod is considered to have failed and is handled according to its restartPolicy. The name for an init container or normal container must be unique among all containers. Init containers may not have Lifecycle actions, Readiness probes, Liveness probes, or Startup probes. The resourceRequirements of an init container are taken into account during scheduling by finding the highest request/limit for each resource type, and then using the max of that value or the sum of the normal containers. Limits are applied to init containers in a similar fashion. Init containers cannot currently be added or removed. Cannot be updated. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

containers [Required]
[]k8s.io/api/core/v1.Container

List of containers belonging to the pod. Containers cannot currently be added or removed. There must be at least one container in a Pod. Cannot be updated.

ephemeralContainers
[]k8s.io/api/core/v1.EphemeralContainer

List of ephemeral containers run in this pod. Ephemeral containers may be run in an existing pod to perform user-initiated actions such as debugging. This list cannot be specified when creating a pod, and it cannot be modified by updating the pod spec. In order to add an ephemeral container to an existing pod, use the pod's ephemeralcontainers subresource.

restartPolicy
k8s.io/api/core/v1.RestartPolicy

Restart policy for all containers within the pod. One of Always, OnFailure, Never. In some contexts, only a subset of those values may be permitted. Default to Always. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

terminationGracePeriodSeconds
int64

Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). If this value is nil, the default grace period will be used instead. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. Defaults to 30 seconds.

activeDeadlineSeconds
int64

Optional duration in seconds the pod may be active on the node relative to StartTime before the system will actively try to mark it failed and kill associated containers. Value must be a positive integer.

dnsPolicy
k8s.io/api/core/v1.DNSPolicy

Set DNS policy for the pod. Defaults to "ClusterFirst". Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'. DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy. To have DNS options set along with hostNetwork, you have to specify DNS policy explicitly to 'ClusterFirstWithHostNet'.

nodeSelector
map[string]string

NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node. More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

serviceAccountName
string

ServiceAccountName is the name of the ServiceAccount to use to run this pod. More info: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/

serviceAccount
string

DeprecatedServiceAccount is a deprecated alias for ServiceAccountName. Deprecated: Use serviceAccountName instead.

automountServiceAccountToken
bool

AutomountServiceAccountToken indicates whether a service account token should be automatically mounted.

nodeName
string

NodeName indicates in which node this pod is scheduled. If empty, this pod is a candidate for scheduling by the scheduler defined in schedulerName. Once this field is set, the kubelet for this node becomes responsible for the lifecycle of this pod. This field should not be used to express a desire for the pod to be scheduled on a specific node. https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

hostNetwork
bool

Host networking requested for this pod. Use the host's network namespace. If this option is set, the ports that will be used must be specified. Default to false.

hostPID
bool

Use the host's pid namespace. Optional: Default to false.

hostIPC
bool

Use the host's ipc namespace. Optional: Default to false.

shareProcessNamespace
bool

Share a single process namespace between all of the containers in a pod. When this is set containers will be able to view and signal processes from other containers in the same pod, and the first process in each container will not be assigned PID 1. HostPID and ShareProcessNamespace cannot both be set. Optional: Default to false.

securityContext
k8s.io/api/core/v1.PodSecurityContext

SecurityContext holds pod-level security attributes and common container settings. Optional: Defaults to empty. See type description for default values of each field.

imagePullSecrets
[]k8s.io/api/core/v1.LocalObjectReference

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

hostname
string

Specifies the hostname of the Pod. If not specified, the pod's hostname will be set to a system-defined value.

subdomain
string

If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>". If not specified, the pod will not have a domainname at all.

affinity
k8s.io/api/core/v1.Affinity

If specified, the pod's scheduling constraints

schedulerName
string

If specified, the pod will be dispatched by specified scheduler. If not specified, the pod will be dispatched by default scheduler.

tolerations
[]k8s.io/api/core/v1.Toleration

If specified, the pod's tolerations.

hostAliases
[]k8s.io/api/core/v1.HostAlias

HostAliases is an optional list of hosts and IPs that will be injected into the pod's hosts file if specified.

priorityClassName
string

If specified, indicates the pod's priority. "system-node-critical" and "system-cluster-critical" are two special keywords which indicate the highest priorities with the former being the highest priority. Any other name must be defined by creating a PriorityClass object with that name. If not specified, the pod priority will be default or zero if there is no default.

priority
int32

The priority value. Various system components use this field to find the priority of the pod. When Priority Admission Controller is enabled, it prevents users from setting this field. The admission controller populates this field from PriorityClassName. The higher the value, the higher the priority.

dnsConfig
k8s.io/api/core/v1.PodDNSConfig

Specifies the DNS parameters of a pod. Parameters specified here will be merged to the generated DNS configuration based on DNSPolicy.

readinessGates
[]k8s.io/api/core/v1.PodReadinessGate

If specified, all readiness gates will be evaluated for pod readiness. A pod is ready when all its containers are ready AND all conditions specified in the readiness gates have status equal to "True" More info: https://git.k8s.io/enhancements/keps/sig-network/580-pod-readiness-gates

runtimeClassName
string

RuntimeClassName refers to a RuntimeClass object in the node.k8s.io group, which should be used to run this pod. If no RuntimeClass resource matches the named class, the pod will not be run. If unset or empty, the "legacy" RuntimeClass will be used, which is an implicit class with an empty definition that uses the default runtime handler. More info: https://git.k8s.io/enhancements/keps/sig-node/585-runtime-class

enableServiceLinks
bool

EnableServiceLinks indicates whether information about services should be injected into pod's environment variables, matching the syntax of Docker links. Optional: Defaults to true.

preemptionPolicy
k8s.io/api/core/v1.PreemptionPolicy

PreemptionPolicy is the Policy for preempting pods with lower priority. One of Never, PreemptLowerPriority. Defaults to PreemptLowerPriority if unset.

overhead
k8s.io/api/core/v1.ResourceList

Overhead represents the resource overhead associated with running a pod for a given RuntimeClass. This field will be autopopulated at admission time by the RuntimeClass admission controller. If the RuntimeClass admission controller is enabled, overhead must not be set in Pod create requests. The RuntimeClass admission controller will reject Pod create requests which have the overhead already set. If RuntimeClass is configured and selected in the PodSpec, Overhead will be set to the value defined in the corresponding RuntimeClass, otherwise it will remain unset and treated as zero. More info: https://git.k8s.io/enhancements/keps/sig-node/688-pod-overhead/README.md

topologySpreadConstraints
[]k8s.io/api/core/v1.TopologySpreadConstraint

TopologySpreadConstraints describes how a group of pods ought to spread across topology domains. Scheduler will schedule pods in a way which abides by the constraints. All topologySpreadConstraints are ANDed.

setHostnameAsFQDN
bool

If true the pod's hostname will be configured as the pod's FQDN, rather than the leaf name (the default). In Linux containers, this means setting the FQDN in the hostname field of the kernel (the nodename field of struct utsname). In Windows containers, this means setting the registry value of hostname for the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to FQDN. If a pod does not have FQDN, this has no effect. Default to false.

os
k8s.io/api/core/v1.PodOS

Specifies the OS of the containers in the pod. Some pod and container fields are restricted if this is set.

If the OS field is set to linux, the following fields must be unset:

  • securityContext.windowsOptions

If the OS field is set to windows, the following fields must be unset:

  • spec.hostPID
  • spec.hostIPC
  • spec.hostUsers
  • spec.securityContext.appArmorProfile
  • spec.securityContext.seLinuxOptions
  • spec.securityContext.seccompProfile
  • spec.securityContext.fsGroup
  • spec.securityContext.fsGroupChangePolicy
  • spec.securityContext.sysctls
  • spec.shareProcessNamespace
  • spec.securityContext.runAsUser
  • spec.securityContext.runAsGroup
  • spec.securityContext.supplementalGroups
  • spec.securityContext.supplementalGroupsPolicy
  • spec.containers[*].securityContext.appArmorProfile
  • spec.containers[*].securityContext.seLinuxOptions
  • spec.containers[*].securityContext.seccompProfile
  • spec.containers[*].securityContext.capabilities
  • spec.containers[*].securityContext.readOnlyRootFilesystem
  • spec.containers[*].securityContext.privileged
  • spec.containers[*].securityContext.allowPrivilegeEscalation
  • spec.containers[*].securityContext.procMount
  • spec.containers[*].securityContext.runAsUser
  • spec.containers[*].securityContext.runAsGroup

hostUsers
bool

Use the host's user namespace. Optional: Default to true. If set to true or not present, the pod will be run in the host user namespace, useful for when the pod needs a feature only available to the host user namespace, such as loading a kernel module with CAP_SYS_MODULE. When set to false, a new userns is created for the pod. Setting false is useful for mitigating container breakout vulnerabilities while still allowing users to run their containers as root without actually having root privileges on the host. This field is alpha-level and is only honored by servers that enable the UserNamespacesSupport feature.

schedulingGates
[]k8s.io/api/core/v1.PodSchedulingGate

SchedulingGates is an opaque list of values that if specified will block scheduling the pod. If schedulingGates is not empty, the pod will stay in the SchedulingGated state and the scheduler will not attempt to schedule the pod.

SchedulingGates can only be set at pod creation time, and may only be removed afterwards.

resourceClaims
[]k8s.io/api/core/v1.PodResourceClaim

ResourceClaims defines which ResourceClaims must be allocated and reserved before the Pod is allowed to start. The resources will be made available to those containers which consume them by name.

This is an alpha field and requires enabling the DynamicResourceAllocation feature gate.

This field is immutable.
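
The pod-level fields above are embedded directly into component specs. A minimal, hypothetical sketch of common scheduling-related settings; selector keys, toleration values, and names are illustrative, not defaults:

```yaml
# Hypothetical pod-level settings embedded in a component spec.
nodeSelector:
  nvidia.com/gpu.product: A100       # illustrative label
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
terminationGracePeriodSeconds: 120   # longer than expected cleanup time
serviceAccountName: model-serving-sa # illustrative ServiceAccount
imagePullSecrets:
  - name: registry-credentials       # illustrative Secret reference
```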

PredictorExtensionSpec

Appears in:

PredictorExtensionSpec defines configuration shared across all predictor frameworks

FieldDescription
storageUri
string

This field points to the location of the model which is mounted onto the pod.

runtimeVersion
string

Runtime version of the predictor docker image

protocolVersion
github.com/sgl-project/ome/pkg/constants.InferenceServiceProtocol

Protocol version to be used by the predictor (e.g., v1, v2, grpc-v1, or grpc-v2)

Container
k8s.io/api/core/v1.Container
(Members of Container are embedded into this type.)

Container enables overrides for the predictor. Each framework will have different defaults that are populated in the underlying container spec.
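
Because the Container members are embedded, they sit at the same level as the extension fields. A hypothetical sketch; the URI, version, and image are illustrative values, not defaults:

```yaml
# Hypothetical PredictorExtensionSpec usage; values are illustrative.
storageUri: "oci://n/my-namespace/b/models/o/llama-3-8b"
runtimeVersion: "0.5.3"
protocolVersion: v2
# Embedded Container members (e.g. image, env) appear inline:
image: ghcr.io/example/predictor:latest
```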

PredictorSpec

Appears in:

PredictorSpec defines the configuration for a predictor. The following fields follow a "1-of" semantic: users must specify exactly one spec.

FieldDescription
model [Required]
ModelSpec

Model spec for any arbitrary framework.

PodSpec [Required]
PodSpec
(Members of PodSpec are embedded into this type.)

This spec is dual purpose.

  1. Provide a full PodSpec for a custom predictor. The field PodSpec.Containers is mutually exclusive with other predictors (i.e. TFServing).
  2. Provide a predictor (i.e. TFServing) and specify PodSpec overrides; in this case you must not provide PodSpec.Containers.
ComponentExtensionSpec [Required]
ComponentExtensionSpec
(Members of ComponentExtensionSpec are embedded into this type.)

Component extension defines the deployment configurations for a predictor

workerSpec
WorkerSpec

WorkerSpec for the predictor; this is used for multi-node serving without a Ray Cluster
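
As a sketch, a PredictorSpec in its custom-PodSpec form: PodSpec members are embedded, so containers sits directly under the predictor, and the model field is left unset per the "1-of" rule. All names and images below are illustrative, not defaults:

```yaml
# Hypothetical predictor using the custom-PodSpec form ("1-of": containers
# instead of a model spec). Container name and image are illustrative.
predictor:
  containers:
    - name: ome-container
      image: ghcr.io/example/custom-server:latest
  workerSpec:
    size: 2        # multi-node serving without a Ray Cluster
```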

RouterSpec

Appears in:

RouterSpec defines the configuration for the Router component, which handles request routing

FieldDescription
PodSpec [Required]
PodSpec
(Members of PodSpec are embedded into this type.)

PodSpec defines the container configuration for the router

ComponentExtensionSpec [Required]
ComponentExtensionSpec
(Members of ComponentExtensionSpec are embedded into this type.)

ComponentExtensionSpec defines deployment configuration like min/max replicas, scaling metrics, etc.

runner
RunnerSpec

This is essentially a container spec that can override the default container

config
map[string]string

Additional configuration parameters for the runner. This can include framework-specific settings
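
A hypothetical router section tying these fields together: runner overrides the default container, and config carries free-form framework-specific settings. The field names come from this reference; the router key placement, image, and config keys are illustrative assumptions:

```yaml
# Hypothetical RouterSpec usage; image and config keys are illustrative.
router:
  runner:
    image: ghcr.io/example/router:latest   # overrides the default container
  config:
    routing-strategy: least-request        # framework-specific setting (example)
```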

RunnerSpec

Appears in:

RunnerSpec defines container configuration plus additional config settings. The Runner is the primary container that executes the model serving or token generation logic.

FieldDescription
Container
k8s.io/api/core/v1.Container
(Members of Container are embedded into this type.)

Container spec for the runner. Provides complete Kubernetes container configuration for the primary execution container.

ScaleMetric

(Alias of string)

Appears in:

ScaleMetric enum

ServiceMetadata

Appears in:

ServiceMetadata contains metadata fields for recording the backend model server's configuration and version details. This information helps track experiment context, enabling users to filter and query experiments based on server properties.

FieldDescription
engine [Required]
string

Engine specifies the backend model server engine. Supported values: "vLLM", "SGLang", "TGI".

version [Required]
string

Version specifies the version of the model server (e.g., "0.5.3").

gpuType [Required]
string

GpuType specifies the type of GPU used by the model server. Supported values: "H100", "A100", "MI300", "A10".

gpuCount [Required]
int

GpuCount indicates the number of GPU cards available on the model server.
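
A minimal sketch of a ServiceMetadata block, assuming it is set under a serviceMetadata key in the enclosing spec (the key placement is an assumption; the field names and allowed values come from this reference):

```yaml
# Hypothetical ServiceMetadata recording the backend server's configuration.
serviceMetadata:
  engine: SGLang     # one of "vLLM", "SGLang", "TGI"
  version: "0.5.3"
  gpuType: H100      # one of "H100", "A100", "MI300", "A10"
  gpuCount: 8
```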

ServingRuntimePodSpec

Appears in:

FieldDescription
containers
[]k8s.io/api/core/v1.Container

List of containers belonging to the pod. Containers cannot currently be added or removed. Cannot be updated.

volumes
[]k8s.io/api/core/v1.Volume

List of volumes that can be mounted by containers belonging to the pod. More info: https://kubernetes.io/docs/concepts/storage/volumes

nodeSelector
map[string]string

NodeSelector is a selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node. More info: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

affinity
k8s.io/api/core/v1.Affinity

If specified, the pod's scheduling constraints

tolerations
[]k8s.io/api/core/v1.Toleration

If specified, the pod's tolerations.

labels
map[string]string

Labels that will be added to the pod. More info: http://kubernetes.io/docs/user-guide/labels

annotations
map[string]string

Annotations that will be added to the pod. More info: http://kubernetes.io/docs/user-guide/annotations

imagePullSecrets
[]k8s.io/api/core/v1.LocalObjectReference

ImagePullSecrets is an optional list of references to secrets in the same namespace to use for pulling any of the images used by this PodSpec. If specified, these secrets will be passed to individual puller implementations for them to use. More info: https://kubernetes.io/docs/concepts/containers/images#specifying-imagepullsecrets-on-a-pod

schedulerName
string

If specified, the pod will be dispatched by specified scheduler. If not specified, the pod will be dispatched by default scheduler.

hostIPC
bool

Use the host's ipc namespace. Optional: Default to false.

dnsPolicy
k8s.io/api/core/v1.DNSPolicy

Set DNS policy for the pod. Defaults to "ClusterFirst". Valid values are 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'. DNS parameters given in DNSConfig will be merged with the policy selected with DNSPolicy. To have DNS options set along with hostNetwork, you have to specify DNS policy explicitly to 'ClusterFirstWithHostNet'.

hostNetwork
bool

Host networking requested for this pod. Use the host's network namespace. If this option is set, the ports that will be used must be specified. Default to false.

ServingRuntimeRef

Appears in:

FieldDescription
name [Required]
string

Name of the runtime being referenced. Identifies the specific runtime environment to be used for model execution.

kind [Required]
string

Kind of the runtime being referenced. Defaults to ClusterServingRuntime. Specifies the Kubernetes resource kind of the referenced runtime: ClusterServingRuntime is a cluster-wide runtime, while ServingRuntime is namespace-scoped.

apiGroup [Required]
string

APIGroup of the resource being referenced. Defaults to ome.io. Specifies the Kubernetes API group of the referenced runtime.
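
A minimal sketch of a runtime reference, assuming it appears under a runtime key in the referencing spec (the key placement and runtime name are assumptions; kind and apiGroup defaults are as documented above):

```yaml
# Hypothetical ServingRuntimeRef; name is illustrative. kind defaults to
# ClusterServingRuntime and apiGroup defaults to ome.io when omitted.
runtime:
  name: srt-llama-runtime
  kind: ServingRuntime      # namespace-scoped alternative
  apiGroup: ome.io
```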

ServingRuntimeSpec

Appears in:

ServingRuntimeSpec defines the desired state of ServingRuntime. This spec is currently provisional and is subject to change as details regarding single-model serving and multi-model serving are hammered out.

FieldDescription
supportedModelFormats [Required]
[]SupportedModelFormat

Model formats and versions supported by this runtime

modelSizeRange
ModelSizeRangeSpec

ModelSizeRange is the range of model sizes supported by this runtime

disabled
bool

Set to true to disable use of this runtime

routerConfig
RouterSpec

Router configuration for this runtime

engineConfig
EngineSpec

Engine configuration for this runtime

decoderConfig
DecoderSpec

Decoder configuration for this runtime

protocolVersions
[]github.com/sgl-project/ome/pkg/constants.InferenceServiceProtocol

Supported protocol versions (e.g., openAI, cohere, openInference-v1, or openInference-v2)

ServingRuntimePodSpec [Required]
ServingRuntimePodSpec
(Members of ServingRuntimePodSpec are embedded into this type.)

PodSpec for the serving runtime

workers
WorkerPodSpec

WorkerPodSpec for the serving runtime; this is used for multi-node serving without a Ray Cluster
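
Putting the ServingRuntimeSpec fields together, a hypothetical ClusterServingRuntime manifest. Field names follow this reference; the metadata name, model format contents, image, and sizes are illustrative assumptions:

```yaml
# Hypothetical ClusterServingRuntime; all values are illustrative.
apiVersion: ome.io/v1beta1
kind: ClusterServingRuntime
metadata:
  name: example-sglang-runtime
spec:
  supportedModelFormats:
    - modelFormat:
        name: safetensors      # ModelFormat contents illustrative
      autoSelect: true
      priority: 1              # must be greater than zero
  protocolVersions:
    - openAI
  disabled: false
  containers:                  # embedded ServingRuntimePodSpec member
    - name: ome-container
      image: ghcr.io/example/sglang:latest
  workers:
    size: 2                    # multi-node serving without a Ray Cluster
```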

ServingRuntimeStatus

Appears in:

ServingRuntimeStatus defines the observed state of ServingRuntime

StorageSpec

Appears in:

FieldDescription
path
string

Path is the absolute path where the model will be downloaded and stored on the node.

schemaPath
string

SchemaPath is the path to the model schema or configuration file within the storage system. This can be used to validate the model or customize how it's loaded.

parameters
map[string]string

Parameters contain key-value pairs to override default storage credentials or configuration. These values are typically used to configure access to object storage or mount options.

key
string

StorageKey is the name of the key in a Kubernetes Secret used to authenticate access to the model storage. This key will be used to fetch credentials during model download or access.

storageUri [Required]
string

StorageUri specifies the source URI of the model in a supported storage backend. Supported formats:

  • OCI Object Storage: oci://n/{namespace}/b/{bucket}/o/{object_path}
  • Persistent Volume: pvc://{pvc-name}/{sub-path}
  • Vendor-specific: vendor://{vendor-name}/{resource-type}/{resource-path}

This field is required.

nodeSelector
map[string]string

NodeSelector defines a set of key-value label pairs that must be present on a node for the model to be scheduled and downloaded onto that node.

nodeAffinity
k8s.io/api/core/v1.NodeAffinity

NodeAffinity describes the node affinity rules that further constrain which nodes are eligible to download and store this model, based on advanced scheduling policies.
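
A sketch of a StorageSpec using the documented URI format and node targeting, assuming it sits under a storage key in the enclosing model spec (the key placement, secret key name, parameter keys, and node labels are illustrative assumptions):

```yaml
# Hypothetical StorageSpec; all values are illustrative.
storage:
  storageUri: "oci://n/my-namespace/b/model-bucket/o/llama-3-8b"
  path: /mnt/models/llama-3-8b          # absolute path on the node
  key: my-storage-secret-key            # key in a Kubernetes Secret
  parameters:
    region: us-ashburn-1                # credential/config override (example)
  nodeSelector:
    models.example.com/llama-3: "true"  # labels required on eligible nodes
```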

SupportedModelFormat

Appears in:

FieldDescription
name
string

Name of the model. TODO: this field is being used as the model format name, which is not correct; we should deprecate this and use Name from ModelFormat.

modelFormat [Required]
ModelFormat

ModelFormat of the model, e.g., "PyTorch", "TensorFlow", "ONNX", "SafeTensors"

modelType
string

DEPRECATED: This field is deprecated and will be removed in future releases.

version
string

Version of the model format. Used in validating that a runtime supports a predictor. It can be "major", "major.minor" or "major.minor.patch".

modelFramework [Required]
ModelFrameworkSpec

ModelFramework of the model, e.g., "PyTorch", "TensorFlow", "ONNX", "Transformers"

modelArchitecture
string

ModelArchitecture of the model, e.g., "LlamaForCausalLM", "GemmaForCausalLM", "MixtralForCausalLM"

quantization
ModelQuantization

Quantization of the model, e.g., "fp8", "fbgemm_fp8", "int4"

autoSelect
bool

Set to true to allow the ServingRuntime to be used for automatic model placement if this model format is specified with no explicit runtime.

priority
int32

Priority of this serving runtime for auto selection. This is used to select the serving runtime if more than one serving runtime supports the same model format. The value should be greater than zero. The higher the value, the higher the priority. Priority is not considered if AutoSelect is either false or not specified. Priority can be overridden by specifying the runtime in the InferenceService.
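
A hypothetical supportedModelFormats entry wired for auto selection. Field names and value shapes follow this reference; the format name, architecture, and quantization values are illustrative:

```yaml
# Hypothetical SupportedModelFormat list entry; values are illustrative.
- modelFormat:
    name: safetensors               # ModelFormat contents illustrative
  version: "1"                      # "major", "major.minor", or "major.minor.patch"
  modelArchitecture: LlamaForCausalLM
  quantization: fp8
  autoSelect: true                  # eligible for automatic model placement
  priority: 2                       # must be greater than zero
```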

TransitionStatus

(Alias of string)

Appears in:

TransitionStatus enum

WorkerPodSpec

Appears in:

FieldDescription
size
int

Size of the worker; this is the number of pods in the worker.

ServingRuntimePodSpec
ServingRuntimePodSpec
(Members of ServingRuntimePodSpec are embedded into this type.)

PodSpec for the worker

WorkerSpec

Appears in:

WorkerSpec defines the configuration for worker nodes in a multi-node component. Worker nodes perform the distributed processing tasks assigned by the leader node, enabling horizontal scaling for compute-intensive workloads.

FieldDescription
PodSpec
PodSpec
(Members of PodSpec are embedded into this type.)

PodSpec for the worker. Allows customization of the Kubernetes Pod configuration specifically for worker nodes.

size
int

Size of the worker; this is the number of pods in the worker. Controls how many worker pod instances will be deployed for horizontal scaling.

runner
RunnerSpec

Runner container override for customizing the main container. This is essentially a container spec that can override the default container. Provides fine-grained control over the container that executes the worker node's processing logic.
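
A minimal sketch of a WorkerSpec, assuming it appears under a worker key in the enclosing component spec (the key placement and image are illustrative assumptions; size and runner are documented above):

```yaml
# Hypothetical WorkerSpec; image is illustrative.
worker:
  size: 4                 # number of worker pods to deploy
  runner:
    image: ghcr.io/example/sglang-worker:latest   # overrides default container
  nodeSelector:           # embedded PodSpec member (example)
    nvidia.com/gpu.product: H100
```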