Controller for ModelMesh


ModelMesh Serving

ModelMesh Serving is the Controller for managing ModelMesh, a general-purpose model serving management/routing layer.

Getting Started

To quickly get started with ModelMesh Serving, check out the Quick Start Guide.

For help, please open an issue in this repository.

Components and their Repositories

ModelMesh Serving currently comprises components spread over a number of repositories. The supported versions for the latest release are documented here.

(Architecture diagram)

Issues across all components are tracked centrally in this repo.

Core Components

Runtime Adapters

  • modelmesh-runtime-adapter - the containers which run in each model serving pod and act as an intermediary between ModelMesh and third-party model-server containers. Its build produces a single "multi-purpose" image which can be used as an adapter to work with each of the out-of-the-box supported model servers. It also incorporates the "puller" logic which is responsible for retrieving the models from storage before handing over to the respective adapter logic to load the model (and to delete after unloading). This image is also used for a container in the load/unload path of custom ServingRuntime Pods, as a "standalone" puller.
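The pull-then-load flow described above can be sketched conceptually. This is a hypothetical illustration only: the function names and storage interface here are made up for the sketch and are not the adapter's actual API.

```python
import os

def pull_model(model_id, storage_get, local_root):
    """Hypothetical puller step: fetch model bytes from storage to local disk."""
    local_dir = os.path.join(local_root, model_id)
    os.makedirs(local_dir, exist_ok=True)
    with open(os.path.join(local_dir, "model.bin"), "wb") as f:
        f.write(storage_get(model_id))
    return local_dir

def load_model(model_id, storage_get, local_root, server_load):
    """Hypothetical adapter load path: pull first, then hand off to the model server."""
    local_dir = pull_model(model_id, storage_get, local_root)
    server_load(model_id, local_dir)
    return local_dir
```

Unloading would mirror this: the server is asked to unload the model, then the puller deletes the local copy.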

Model Serving runtimes

ModelMesh Serving provides out-of-the-box integration with the following model servers.

ServingRuntime custom resources can be used to add support for other existing or custom-built model servers; see the docs on implementing a custom Serving Runtime.
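As a rough sketch only (the names and image are illustrative placeholders, not a tested runtime definition), a custom ServingRuntime resource has approximately this shape:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: my-custom-runtime          # hypothetical name
spec:
  supportedModelFormats:
    - name: custom-format          # hypothetical model format
      autoSelect: true
  multiModel: true                 # required for ModelMesh compatibility
  grpcDataEndpoint: port:8085      # where the runtime serves inference gRPC
  containers:
    - name: my-server              # hypothetical server container
      image: example.com/my-server:latest
```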


  • KServe V2 REST Proxy - a reverse-proxy server which translates a RESTful HTTP API into gRPC. This allows sending inference requests using the KServe V2 REST Predict Protocol to ModelMesh models which currently only support the V2 gRPC Predict Protocol.
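For illustration, a V2 REST request body has the following shape; the input name, tensor values, and helper function are placeholders, and a real deployment would POST this JSON to the proxy's /v2/models/<model-name>/infer path:

```python
import json

def build_v2_infer_request(name, shape, datatype, data):
    # Build a KServe V2 REST Predict Protocol request body.
    return {"inputs": [{"name": name, "shape": shape, "datatype": datatype, "data": data}]}

body = build_v2_infer_request("predict", [1, 2], "FP32", [0.1, 0.2])
print(json.dumps(body))
```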


Libraries

These are helper Java libraries used by the ModelMesh component.

  • kv-utils - Useful KV store recipes abstracted over etcd and Zookeeper
  • litelinks-core - RPC/service discovery library based on Apache Thrift, used only for communications internal to ModelMesh.


Contributing

Please read our contributing guide for details.

Building Images

# Build develop image
make build.develop

# After building the develop image, build the runtime image
make build
  • "code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"

    Getting {"code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"} when sending an inference request. Code:

    import pandas as pd
    import requests
    from mlserver.codecs.pandas import PandasCodec  # requires mlserver>=1.1.0

    payloads = "./IDA_en.ndjson"
    df = pd.read_json(payloads, lines=True)
    df = df.fillna('')
    payload = PandasCodec.encode_request(df, use_bytes=False)
    response = requests.post("http://localhost:8008/v2/models/muc-en-aa-predictor/infer", json=payload.json())

    I have never encountered this error before. What's wrong?

    opened by MLHafizur 13
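One likely cause (an assumption, not confirmed in the thread above): the json= argument of requests serializes its value, so passing the already-serialized string from payload.json() sends a double-encoded JSON *string* rather than a JSON object, which the Go proxy cannot unmarshal into its request struct. A minimal stdlib demonstration:

```python
import json

body = {"inputs": []}                          # an object, as the proxy expects
double_encoded = json.dumps(json.dumps(body))  # what json=<already-a-string> produces
print(double_encoded)                          # a quoted JSON string, not an object

# The usual fix is one of (hypothetical usage, matching the snippet above):
#   requests.post(url, data=payload.json(), headers={"Content-Type": "application/json"})
#   requests.post(url, json=payload.dict())
```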
  • Higher payload size not working as described in doc

    Higher payload size not working as described in doc

    I have deployed a model using custom MLServer runtime. The gRPC inferencing is working as expected with small size payload.

    I modified the configuration to make it work with large payload sizes:

    1. In the runtime configuration:

       value: "300000000"

    2. In the global configmap:

       grpcMaxMessageSizeBytes: 300000000

    But it still gives an error showing that the default limit is exceeded:

    io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 16777216: 65844251
        at io.grpc.Status.asRuntimeException(...)
        at io.grpc.internal.MessageDeframer.processHeader(...)
        at io.grpc.internal.MessageDeframer.deliver(...)
        at io.grpc.internal.MessageDeframer.request(...)
        at io.grpc.internal.AbstractStream$TransportState...
        at io.grpc.netty.NettyServerStream$TransportState...
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(...)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(...)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(...)
        at io.netty.util.concurrent.SingleThreadEventExecutor...
        at io.netty.util.internal.ThreadExecutorMap...
        at java.base/...

    Client side error:

    Traceback (most recent call last):
      File "/temp/docker/", line 66, in <module>
        response = grpc_stub.ModelInfer(inference_request_g)
      File "/usr/local/lib/python3.9/site-packages/grpc/", line 946, in __call__
        return _end_unary_response_blocking(state, call, False, None)
      File "/usr/local/lib/python3.9/site-packages/grpc/", line 849, in _end_unary_response_blocking
        raise _InactiveRpcError(state)
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
            status = StatusCode.CANCELLED
            details = "Received RST_STREAM with error code 8"
            debug_error_string = "UNKNOWN:Error received from peer ipv4: {grpc_message:"Received RST_STREAM with error code 8", grpc_status:1, created_time:"2022-11-03T18:47:46.142301977+00:00"}"

    How can I overcome this situation?

    opened by MLHafizur 9
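For reference, the global setting mentioned above lives in the model-serving-config ConfigMap. A sketch, with the size value taken from the issue and everything else (namespace, surrounding structure) assumed:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: model-serving-config
  namespace: modelmesh-serving   # assumed namespace
data:
  config.yaml: |
    grpcMaxMessageSizeBytes: 300000000
```

Note that the client must also raise its own gRPC maximum message size; the error above shows the default 16 MiB limit (16777216 bytes) still being enforced somewhere on the request path.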
  • feat: TorchServe support

    feat: TorchServe support


    The Triton runtime can be used with model-mesh to serve PyTorch torchscript models, but it does not support arbitrary PyTorch models i.e. eager mode. KServe "classic" has integration with TorchServe but it would be good to have integration with model-mesh too so that these kinds of models can be used in distributed multi-model serving contexts.


    The bulk of the required changes are to the adapter image, covered by PR

    This PR contains the minimal controller changes needed to enable the support:

    • TorchServe ServingRuntime spec
    • Add "torchserve" to the list of supported built-in runtime types
    • Add "ID extraction" entry for TorchServe's gRPC Predictions RPC so that model-mesh will automatically extract the model name from corresponding request messages

    Note the supported model format is advertised as "pytorch-mar" to distinguish from the existing "pytorch" format that refers to raw TorchScript .pt files as supported by Triton.


    TorchServe can be used seamlessly with ModelMesh Serving to serve PyTorch models, including eager mode.

    Resolves #63

    opened by njhill 9
  • feat: storage phase 1 for inference service reconciler

    feat: storage phase 1 for inference service reconciler


    Rebase #32 onto the new InferenceService reconciler for ModelMesh.

    For Storage Spec details, please refer to the design doc:

    Support for additional storage types/parameters will come in phase 2.



    lgtm approved 
    opened by Tomcli 9
  • chore: Automatically set kube context in development container

    chore: Automatically set kube context in development container


    When using the containerized development environment (make develop) to run FVT tests, one needs to configure access to a Kubernetes or OpenShift cluster from inside the container, and this has to be done for every make develop session. It can be tricky when cloud-provider-specific CLI tools are needed to connect and authenticate to a cluster.

    Currently there is a short paragraph in the FVT README about how to export a minified kubeconfig file and create it inside the container. It is tedious to repeat those steps for each make develop session and, depending on OS, shell environment, editors, and possible text encoding issues, it is also error-prone.


    This PR proposes to automatically create the kubeconfig file in a local and git-ignored directory inside the local project and automatically mount it to the develop container. All the user then has to do is connect and authenticate to the cluster in the shell that will be running make develop.


    Kubernetes context is ready inside the development container.

    # shell environment, outside the develop container has access to K8s cluster
    [modelmesh-serving_ckadner]$ kubectl get pods
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          17m
    pod/minio                                   1/1     Running   0          17m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          17m
    [modelmesh-serving_ckadner]$ make develop
    Pulling dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Building dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Image kserve/modelmesh-controller-develop:6be58b09c25833c1 has 14 layers
    Tagging dev image kserve/modelmesh-controller-develop:6be58b09c25833c1 as latest
    [root@17c121286549 workspace]# kubectl get pods
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          18m
    pod/minio                                   1/1     Running   0          18m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          18m
    [root@17c121286549 workspace]# 

    /cc @njhill

    lgtm approved test development 
    opened by ckadner 8
  • chore: Update GH Action workflows

    chore: Update GH Action workflows

    Both workflows for PRs and pushes were cleaned up. On a push (which includes when PRs merge), the code base is linted and tested first before building and publishing.

    lgtm approved 
    opened by pvaneck 8
  • fix: Ensure script works properly with implicit namespace

    fix: Ensure script works properly with implicit namespace


    The cleanup script currently doesn't require a namespace to be provided via the -n option and uses the current kubectl context's namespace otherwise. However, the $namespace variable wasn't set in this case, meaning later parts of the script might not work as intended.


    Ensure $namespace variable is set correctly either way.

    Result: the script works properly when the -n option isn't used.

    lgtm approved 
    opened by njhill 7
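The shape of the fix can be sketched in isolation; the kubectl lookup is stubbed out here, and the real script would query the current context instead:

```shell
# Ensure $namespace is always set: use -n's value when given, otherwise
# fall back to the current context's namespace (stubbed as "default" here).
namespace=""   # would be populated by option parsing when -n is passed

current_context_namespace() {
  # stand-in for: kubectl config view --minify -o jsonpath='{..namespace}'
  echo "default"
}

if [ -z "$namespace" ]; then
  namespace="$(current_context_namespace)"
fi
echo "namespace=$namespace"
```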
  • fvt: a bunch of FVT improvements

    fvt: a bunch of FVT improvements


    A bunch of improvements to the FVT framework coming from our internal fork.

    After parallelizing and expanding the FVT suite to support testing the REST proxy internally, we had issues with the consistency of the FVTs. It took a few iterations of improvements to get them back to stable while we continued to add support for more tests. With the "fixes" and "features" coupled in a few different PRs the changes cannot be easily disentangled. This is a big mess of a PR, but the final result should be in a good place with all of our internal improvements.


    Test Parallelization:

    • split FVTs into separate suites (go packages)
      • ginkgo can parallelize within a suite, but runs the suites sequentially
    • refactors to enable sharing of code across the FVT suites
    • support parallelization by using ginkgo CLI to execute the tests instead of go test
    • use Ordered/Serial decorators on groups of tests that require it
      • this can help to speed up "inference" tests by creating the predictor once and using it across multiple specs, but it does mean some specs are not independent
      • TLS tests are marked as Serial because they require roll outs of the runtime pods
    • to help debugging, print inference services on failure in Predictor FVTs
    • avoid a nil pointer dereference that can occur if FVTs error during initialization while running in parallel
    • remove sleep in AfterEach of TLS tests
    • update port-forwards to select a pod directly from the Endpoints object corresponding to the service
      • when port-forwarding to a Service, there is no guard against selecting a Terminating pod

    Config and Secrets:

    • specify the full DefaultConfig in code instead of in the user-configmap.yaml file
    • allow TLS config maps to be overlayed on the base config (instead of template string in YAML)
    • generate TLS certificates for each run of the FVTs instead of using hard-coded certs

    REST Proxy Tests:

    • enable the REST proxy for FVTs and add inference tests using proxy
    • have the FVT Client manage port-forwards for each of REST and gRPC


    • a faster and more efficient FVT suite with parallelization
    • improved FVT stability and extensibility to support future changes
    lgtm approved 
    opened by tjohnson31415 7
  • feat: Update InferenceService reconciliation logic

    feat: Update InferenceService reconciliation logic

    The ModelMesh controller can now reconcile InferenceServices using the new Model Spec in the predictor. Also fixed: a leading slash in the model path when an InferenceService StorageURI was parsed, which was causing issues in the adapter and preventing models from being loaded, as seen in #97.

    Closes: #96, #97 Related: #90


    Users can now successfully deploy an InferenceService using the predictor Model Spec such as the following:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-sklearn-isvc
      annotations:
        serving.kserve.io/deploymentMode: ModelMesh
        serving.kserve.io/secretKey: localMinIO
    spec:
      predictor:
        model:
          modelFormat:
            name: sklearn
          storageUri: s3://modelmesh-example-models/sklearn/mnist-svm.joblib
    lgtm approved 
    opened by pvaneck 7
  • ModelMesh Release Tracker for KServe v0.7.0

    ModelMesh Release Tracker for KServe v0.7.0

    The plan is to cut the KServe 0.7 release mid next week. For this release, ModelMesh will be loosely integrated with KServe.

    Action Items:

    • [x] ModelMesh InferenceService CRD Support
      • [x]
      • [x] Documentation on using InferenceService CR with ModelMesh
    • [x] ModelMesh REST Proxy Sidecar Support
      • [x]
      • [x] Documentation on using REST inferencing
    • [x] Add KServe ModelMesh DeploymentMode Annotation checker
    • [ ] Update KServe hack/ to include ModelMesh-Serving as part of installation.
    • [x] Documentation updates
      • [x] Refresh External ModelMesh Documentation
      • [x] Update KServe Website with ModelMesh Documentation
        • Can view website here:
    • [x] Assemble release process items
      • Tag release for version v0.7.0 to follow suit with KServe.
      • [x] GitHub workflow for tagged release
      • [x] Release process documentation
        • Create a document outlining the process of creating a release branch and tagging from a commit in that branch. KServe should already have a document like this.
    opened by pvaneck 7
  • Adjust FVT GH-Actions workflow

    Adjust FVT GH-Actions workflow


    Decrease the flakiness of FVT runs that occur when certain tests are run back to back.


    The rollingUpdate strategy is adjusted in a preprocessing step of the FVT GitHub Actions workflow to allow better stability in low-resource environments. The defaultTimeout was increased to account for the changes in strategy, since we ran into intermittent failures due to timeouts when the deployment doesn't become ready in time.


    Less flakiness in FVT runs.

    lgtm approved 
    opened by pvaneck 7
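A sketch of the kind of rollingUpdate adjustment described above; the exact values used by the workflow are assumptions here, not taken from the PR:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0         # assumed: don't spin up extra pods in low-resource CI
      maxUnavailable: 1   # assumed: replace pods one at a time in place
```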
  • test: Add TorchServe FVT

    test: Add TorchServe FVT


    Support for TorchServe was added in #250, and a test should be added for it as well.


    • Adds basic FVT for load/inference with a TorchServe MAR model using the native TorchServe gRPC API


    Closes #280

    opened by rafvasq 1
  • Remove residual

    Remove residual "Watson" references

    Model-mesh was originally developed as part of IBM Watson. Now that it is part of KServe we should scrub any remaining places that "watson" is used in the codebase, at least starting with those that are straightforward to change.

    opened by njhill 0
  • Update Dockerfile packages

    Update Dockerfile packages


    Fix vulnerabilities in the ModelMesh Controller image.


    1. Removed dependencies on any packages in the runtime.
    2. Updated the Go version to the latest release.


    The change is functionally working; I have tested it on my own cluster.

    opened by JasmondL 2
  • OutOfDirectMemoryError on setting higher grpc input size

    OutOfDirectMemoryError on setting higher grpc input size

    Describe the bug

    Followed this doc and set grpcMaxMessageSizeBytes to 400000000.
    Here's my config.yaml from the model-serving-config ConfigMap:

    podsPerRuntime: 1
      enabled: true
    grpcMaxMessageSizeBytes: 400000000

    When I make a grpc call, it throws

    Dec 01, 2022 8:15:15 PM io.grpc.netty.NettyServerTransport notifyTerminated
    INFO: Transport failed
    io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 4194304 byte(s) of direct memory (used: 75497758, max: 76546048)
        at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(...)
        at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(...)
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(...)
        at io.netty.buffer.PoolArena$DirectArena.newChunk(...)
        at io.netty.buffer.PoolArena.allocateNormal(...)
        at io.netty.buffer.PoolArena.tcacheAllocateNormal(...)
        at io.netty.buffer.PoolArena.allocate(...)
        at io.netty.buffer.PoolArena.allocate(...)
        at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(...)
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(...)
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(...)
        at io.netty.buffer.AbstractByteBufAllocator.buffer(...)
        at io.netty.handler.codec.ByteToMessageDecoder.expandCumulation(...)
        at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(...)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(...)

    In Triton, I can see that the model is working fine; the logs below also show the input shape and that the model released the response:

    I am not sure what other changes I need to make to get this to work.
    I suppose this is where it's being set; I tried to backtrack it to some setting but couldn't.

    I1201 20:15:13.885755 1] Process for ModelInferHandler, rpc_ok=1, 15 step START
    I1201 20:15:13.885773 1] New request handler for ModelInferHandler, 17
    I1201 20:15:13.885778 1] GetModel() '6230834ea7f575001e824ce9__isvc-14826e7e9a' version -1
    I1201 20:15:13.885783 1] GetModel() '6230834ea7f575001e824ce9__isvc-14826e7e9a' version -1
    I1201 20:15:13.885792 1] prepared: [0x0x7f3bc0007b50] request id: , model: 6230834ea7f575001e824ce9__isvc-14826e7e9a, requested version: -1, actual version: 1, flags: 0x0, correlation id: 0, batch size: 1, priority: 0, timeout (us): 0
    original inputs:
    [0x0x7f3bc0007e58] input: CLIP, type: UINT8, original shape: [1,32,589,617,3], batch + shape: [1,32,589,617,3], shape: [32,589,617,3]
    override inputs:
    [0x0x7f3bc0007e58] input: CLIP, type: UINT8, original shape: [1,32,589,617,3], batch + shape: [1,32,589,617,3], shape: [32,589,617,3]
    original requested outputs:
    requested outputs:
    I1201 20:15:13.885874 1] model 6230834ea7f575001e824ce9__isvc-14826e7e9a, instance 6230834ea7f575001e824ce9__isvc-14826e7e9a, executing 1 requests
    I1201 20:15:13.947333 1] add response output: output: classes, type: BYTES, shape: [1,1]
    I1201 20:15:13.947355 1] GRPC: using buffer for 'classes', size: 18, addr: 0x7f3aec004b90
    I1201 20:15:13.947360 1] add response output: output: scores, type: FP32, shape: [1,1]
    I1201 20:15:13.947363 1] GRPC: using buffer for 'scores', size: 4, addr: 0x7f3aec004d70
    I1201 20:15:13.947367 1] ModelInferHandler::InferResponseComplete, 15 step ISSUED
    I1201 20:15:13.947375 1] GRPC free: size 18, addr 0x7f3aec004b90
    I1201 20:15:13.947379 1] GRPC free: size 4, addr 0x7f3aec004d70
    I1201 20:15:13.947442 1] ModelInferHandler::InferRequestComplete
    I1201 20:15:13.947451 1] TRITONBACKEND_ModelInstanceExecute: model instance name 6230834ea7f575001e824ce9__isvc-14826e7e9a released 1 requests
    I1201 20:15:13.947455 1] Process for ModelInferHandler, rpc_ok=1, 15 step COMPLETE

    Additional context

    The model being used is a video sequence classification model and its input is a sequence of 32 cropped frames, hence the huge input size. I did try encoding the cropped sequence into h.264 and decoding it before inference, but that adds a lot of overhead on inference speed, hence I am trying to infer using the large input tensor.

    opened by dumbPy 1
  • Payload logging/events

    Payload logging/events

    For various reasons including monitoring by external system for things like drift / outlier detection etc.

    It should support CloudEvents and be compatible with the logger in KServe "classic", so that it can be used in a similar way, as illustrated in the KServe logger samples.

    Some considerations / possible complications:

    • In KServe the logger can be configured per InferenceService. We need to decide whether we support this with model-mesh, or a simpler global configuration, or both. Another possibility could be allowing a logging destination to be configured globally and enabled/disabled per model.
    • Model-mesh doesn't really touch the payloads currently, and in any case it only routes gRPC/protobuf. So we could emit the raw protobuf messages, but this would differ from the existing KServe case and so would not necessarily be compatible with the same integrations. We could transcode to JSON on the fly, but this would introduce processing overhead that may be undesirable and affect data-path performance.
    • The KServe examples are based on the V1 API; we should check whether the existing logger works with the V2 API, since the runtimes supported by model-mesh are primarily V2-based.

    cc @rafvasq

    opened by njhill 1
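For context, a CloudEvents v1.0 structured-mode envelope has the following shape. The event type string and the data fields below are illustrative assumptions, not the logger's confirmed schema:

```python
import json
import uuid

def make_cloudevent(event_type, source, data):
    # CloudEvents v1.0 required attributes plus a JSON data payload.
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "datacontenttype": "application/json",
        "data": data,
    }

event = make_cloudevent(
    "org.kubeflow.serving.inference.request",  # assumed to match KServe's logger
    "modelmesh-serving",
    {"model": "example-model", "payload_bytes": 1024},
)
print(json.dumps(event))
```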
  • Create isolation between serving runtimes

    Create isolation between serving runtimes

    Is your feature request related to a problem? If so, please describe.

    In my team’s use case, we are currently using the KServe V2 Inference Protocol REST API for sending inference requests. On top of this protocol, we also make use of multiple virtual services to direct traffic to the modelmesh-serving service such that each serving runtime should be mapped to one virtual service.

    Our use case does not allow us to match the /v2/models/<model-id>/infer in our virtual service, and this creates a problem for us because requests sent to the virtual service for serving runtime X can end up reaching models loaded in serving runtime Y due to the fact that:

    • All serving runtimes share the same modelmesh-serving service
    • Users can set the model id to any existing inference service name in the request path

    Describe your proposed solution

    Since this is an unwanted behaviour for my team, we have two possible solutions.

    1. Support mm-vmodel-id header in the REST API and allow it to take precedence over the model id specified in the V2 inference path
    2. Create a dedicated service per serving runtime instead of having all serving runtimes share the same service


    opened by xvnyv 1
  • v0.9.0(Jul 21, 2022)

    :warning: What's Changed

    • ModelMesh Serving now directly imports KServe types for ServingRuntimes and InferenceServices. (#140, #146)
    • InferenceService CRD now copied from KServe and included as part of standalone ModelMesh Serving installation by default.
    • Renamed role/rolebinding names to include a modelmesh prefix. (#181)
    • ModelMesh now uses Java 17 (kserve/modelmesh#33) and G1 garbage collector. (kserve/modelmesh#41)
    • ModelMesh logging improvements. (kserve/modelmesh#41)
    • InferenceService CRD now included in default standalone mm-serving installation. (#166)
    • Many dependencies including etcd (updated to v3.5.3) were bumped. (#145)

    :rainbow: What's New?

    • Added support for OpenVINO Model Server ServingRuntime. (#141)
    • OpenVINO Model Server adapter implemented. (kserve/modelmesh-runtime-adapter#18)
    • TotalCopies is now available in the Predictor and InferenceService statuses. (#142)
    • Users can now set labels and annotations for ServingRuntime pods via the model-serving-config ConfigMap. (#144)
    • Users can override adapter environment variables added by the controller. (#149)
    • ServingRuntime matching based on protocolVersion is now supported. (#154)
    • ModelMetadata endpoint now enabled for Triton and MLServer ServingRuntimes. (#164)
    • Azure Blob Storage now added as a supported storage provider. (#174, kserve/modelmesh-runtime-adapter#23)
    • Add ModelMesh metrics for inference request/response payload sizes. (kserve/modelmesh#37)

    :lady_beetle: Fixes

    • Fixed possible nil pointer dereferences and minor log improvements. (#160)
    • Fixed potential eviction deadlock in ModelMesh. (kserve/modelmesh#25)
    • Disabled FIPS for Java in ModelMesh. (kserve/modelmesh#35)
    • Repair invalid ModelRecord lastUsed values in registry. (kserve/modelmesh#36)
    • Quickstart minio and etcd pods were converted to Deployment resources. (#157)

    :page_facing_up: Documentation

    • OpenVINO ServingRuntime documentation added. (#167)
    • Rest proxy documentation added. (#177)
    • Monitoring and metrics documentation added. (#175)
    • TLS configuration documentation added. (#176)
    • InferenceService CRD now documented as the primary interface for interacting with ModelMesh. (#190)

    :otter: Other

    • Upgrade tests to use Ginkgo V2. (#133)
    • Add performance test to E2E toolchain. (#139)
    • Quickstart etcd version updated to v3.5.4. (#151)

    Full Changelog:

    Source code(tar.gz)
    Source code(zip)
    config-v0.9.0.tar.gz(45.76 KB)
    modelmesh-quickstart-dependencies.yaml(2.76 KB)
    modelmesh-runtimes.yaml(4.11 KB)
    modelmesh.yaml(648.97 KB)
  • v0.8.0(Feb 12, 2022)

    :warning: What's Changed

    • Removed support for KServe TrainedModel CRD (#54)
    • MLServer ServingRuntime updated to use 0.5.2 (#61)
    • Go version updated to 1.17 along with other tooling updates
    • MLServer ServingRuntime now has an increased gRPC max message size (#85)
    • In the ServingRuntime CRD, SupportedModelTypes now goes by SupportedModelFormats (#100)
    • The max gRPC response message size via the REST-proxy has been increased to 16MiB

    :rainbow: What's New?

    • Multi-namespace support for the ModelMesh controller was introduced (#84)
      • Kube resolver can now work with multiple namespaces for multi-namespace capability (#73)
      • ModelMeshEventStream component can now support multiple namespaces (#76)
      • ServingRuntime controller now works across multiple namespaces (#77)
      • Service Controller is now namespace-aware (#82)
    • Default RBAC is now cluster-scoped instead of namespace-scoped (#88)
    • Users can now configure environment variables for the model-mesh containers in ServingRuntime deployments
    • Reconciliation logic added for new storage spec in InferenceServices and Predictors (#56, #83)
    • A multiModel field added to the ServingRuntime spec for denoting if a ServingRuntime is compatible with ModelMesh or not (#89)
    • The controller can now reconcile InferenceServices using the new Model Spec in the predictor (#101)
    • autoSelect field introduced to ServingRuntime CRD supportedModelTypes spec (#100)
    • Logic was added to have ModelMesh only consider ServingRuntimes whose supported model format has autoSelect set to true when finding compatible runtimes (#108)
    • Install script now allows passing in a URL to a config archive (#118)
    • Models hosted using GCS or HTTP(S) can now be used with ModelMesh through InferenceServices (#121)
    • REST input payloads through the REST-proxy can now be multi-dimensional

    :lady_beetle: Fixes

    • Fix code errors reported by golangci-lint (#57)
    • Fixed a bug where invalid vModel specs led to a nil pointer dereference
    • Fixed a bug where the ServingRuntime controller would loop over empty reconcile events
    • Events from plugged-in Predictor sources are now transformed properly when setting up the ServingRuntime controller
    • Fixed install issues on Mac (#114, #119)

    :page_facing_up: Documentation

    • Added developer documentation (#59)
    • Added notes about debug flags in custom MLServer runtimes
    • Added Keras docs and example (#109)
    • Change install instructions to install from a release branch (#117)

    :otter: Other

    • Some controller code was cleaned up and optimized
    • Script for setting up a user namespace for ModelMesh was added (#112)

    Full Changelog:

    Source code(tar.gz)
    Source code(zip)
    config-v0.8.0.tar.gz(28.16 KB)
    modelmesh-quickstart-dependencies.yaml(2.40 KB)
    modelmesh-runtimes.yaml(3.01 KB)
    modelmesh.yaml(105.73 KB)
  • v0.7.0(Oct 12, 2021)
