Controller for ModelMesh

Overview

ModelMesh Serving

ModelMesh Serving is the controller for managing ModelMesh, a general-purpose model serving management/routing layer.

Getting Started

To quickly get started with ModelMesh Serving, check out the Quick Start Guide.

For help, please open an issue in this repository.

Components and their Repositories

ModelMesh Serving currently comprises components spread over a number of repositories. The supported versions for the latest release are documented here.

Architecture Image

Issues across all components are tracked centrally in this repo.

Core Components

Runtime Adapters

  • modelmesh-runtime-adapter - the containers which run in each model serving pod and act as an intermediary between ModelMesh and third-party model-server containers. Its build produces a single "multi-purpose" image which can be used as an adapter to work with each of the out-of-the-box supported model servers. It also incorporates the "puller" logic which is responsible for retrieving the models from storage before handing over to the respective adapter logic to load the model (and to delete after unloading). This image is also used for a container in the load/unload path of custom ServingRuntime Pods, as a "standalone" puller.

Model Serving Runtimes

ModelMesh Serving provides out-of-the-box integration with the following model servers.

ServingRuntime custom resources can be used to add support for other existing or custom-built model servers; see the docs on implementing a custom Serving Runtime.
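As an illustrative sketch (the names, port, and image below are placeholders, not a supported runtime), a minimal ServingRuntime for a custom model server might look like:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: example-custom-runtime      # placeholder name
spec:
  supportedModelFormats:
    - name: example-format          # placeholder model format
      autoSelect: true
  multiModel: true                  # marks the runtime as ModelMesh-compatible
  grpcDataEndpoint: port:8085       # where the model server listens (assumed)
  containers:
    - name: example-server          # placeholder custom model-server container
      image: example.org/example-model-server:latest
```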

Supplementary

  • KServe V2 REST Proxy - a reverse-proxy server which translates a RESTful HTTP API into gRPC. This allows sending inference requests using the KServe V2 REST Predict Protocol to ModelMesh models which currently only support the V2 gRPC Predict Protocol.
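As a hedged sketch of what a request through the proxy looks like (the model name, tensor name, and port-forward target are assumptions, not part of the proxy's docs):

```python
# Minimal sketch of the KServe V2 REST Predict Protocol, assuming the
# REST proxy is port-forwarded to localhost:8008 and a hypothetical
# model named "example-model" expects a single FP32 tensor.

def build_v2_request(values):
    """Wrap a flat list of floats as a single V2 input tensor."""
    return {
        "inputs": [
            {
                "name": "input-0",           # tensor name (assumed)
                "shape": [1, len(values)],
                "datatype": "FP32",
                "data": values,
            }
        ]
    }

def infer(values, model="example-model", host="http://localhost:8008"):
    """POST the request body to the proxy's V2 inference endpoint."""
    import requests  # third-party: pip install requests
    url = f"{host}/v2/models/{model}/infer"
    response = requests.post(url, json=build_v2_request(values))
    response.raise_for_status()
    return response.json()
```

The proxy translates this HTTP call into the equivalent gRPC request against the model.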

Libraries

These are helper Java libraries used by the ModelMesh component.

  • kv-utils - Useful KV store recipes abstracted over etcd and Zookeeper
  • litelinks-core - RPC/service discovery library based on Apache Thrift, used only for communications internal to ModelMesh.

Contributing

Please read our contributing guide for details on contributing.

Building Images

# Build develop image
make build.develop

# After building the develop image, build the runtime image
make build
Comments
  • "code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"

    Getting {"code":3,"message":"json: cannot unmarshal string into Go value of type main.RESTRequest"} when sending an inference request. Code:

    import pandas as pd
    import requests
    from mlserver.codecs.pandas import PandasCodec  # requires mlserver>=1.1.0

    payloads = "./IDA_en.ndjson"
    df = pd.read_json(payloads, lines=True)
    df = df.fillna('')
    payload = PandasCodec.encode_request(df, use_bytes=False)
    # payload.json() returns a JSON string, which requests serializes again
    response = requests.post("http://localhost:8008/v2/models/muc-en-aa-predictor/infer", json=payload.json())
    

    I have never seen this error before. What is wrong?

    opened by MLHafizur 12
  • Higher payload size not working as described in doc

    I have deployed a model using custom MLServer runtime. The gRPC inferencing is working as expected with small size payload.

    I modified the configuration to make it work with a large payload size:

    1. In the runtime configuration:

       - name: MLSERVER_GRPC_MAX_MESSAGE_LENGTH
         value: "300000000"

    2. In the global configmap:

       grpcMaxMessageSizeBytes: 300000000

    But it still gives an error showing that the default limit is exceeded:

    io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 16777216: 65844251
            at io.grpc.Status.asRuntimeException(Status.java:530)
            at io.grpc.internal.MessageDeframer.processHeader(MessageDeframer.java:392)
            at io.grpc.internal.MessageDeframer.deliver(MessageDeframer.java:272)
            at io.grpc.internal.MessageDeframer.request(MessageDeframer.java:162)
            at io.grpc.internal.AbstractStream$TransportState$1RequestRunnable.run(AbstractStream.java:236)
            at io.grpc.netty.NettyServerStream$TransportState$1.run(NettyServerStream.java:198)
            at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
            at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
            at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
            at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:403)
            at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
            at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
            at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
            at java.base/java.lang.Thread.run(Thread.java:833)
    

    Client side error:

    Traceback (most recent call last):
      File "/temp/docker/grpc_call.py", line 66, in <module>
        response = grpc_stub.ModelInfer(inference_request_g)
      File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
        return _end_unary_response_blocking(state, call, False, None)
      File "/usr/local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
        raise _InactiveRpcError(state)
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
            status = StatusCode.CANCELLED
            details = "Received RST_STREAM with error code 8"
            debug_error_string = "UNKNOWN:Error received from peer ipv4:10.244.8.5:8033 {grpc_message:"Received RST_STREAM with error code 8", grpc_status:1, created_time:"2022-11-03T18:47:46.142301977+00:00"}"
    

    How can I overcome this situation?
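Worth noting for this thread: the Java stack trace shows the model-mesh container still enforcing the 16 MiB default (16777216), so the grpcMaxMessageSizeBytes setting had apparently not taken effect there, and the client channel enforces its own limits as well. A client-side sketch, assuming the python grpcio package (the target address is a placeholder):

```python
# Sketch: raise the client-side gRPC message-size limits to match the
# server-side configuration. 300_000_000 mirrors the values set above.
MAX_MESSAGE_BYTES = 300_000_000

CHANNEL_OPTIONS = [
    ("grpc.max_send_message_length", MAX_MESSAGE_BYTES),
    ("grpc.max_receive_message_length", MAX_MESSAGE_BYTES),
]

def make_channel(target="localhost:8033"):
    """Create an insecure channel with raised message-size limits."""
    import grpc  # third-party: pip install grpcio
    return grpc.insecure_channel(target, options=CHANNEL_OPTIONS)
```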

    bug 
    opened by MLHafizur 9
  • feat: TorchServe support

    Motivation

    The Triton runtime can be used with model-mesh to serve PyTorch torchscript models, but it does not support arbitrary PyTorch models i.e. eager mode. KServe "classic" has integration with TorchServe but it would be good to have integration with model-mesh too so that these kinds of models can be used in distributed multi-model serving contexts.

    Modifications

    The bulk of the required changes are to the adapter image, covered by PR https://github.com/kserve/modelmesh-runtime-adapter/pull/34.

    This PR contains the minimal controller changes needed to enable the support:

    • TorchServe ServingRuntime spec
    • Add "torchserve" to the list of supported built-in runtime types
    • Add "ID extraction" entry for TorchServe's gRPC Predictions RPC so that model-mesh will automatically extract the model name from corresponding request messages

    Note the supported model format is advertised as "pytorch-mar" to distinguish from the existing "pytorch" format that refers to raw TorchScript .pt files as supported by Triton.
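For illustration, a predictor referencing the new format could look like this (the name and storage URI are placeholders):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-torchserve-isvc     # placeholder name
  annotations:
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch-mar           # selects the TorchServe runtime
      storageUri: s3://example-bucket/torchserve/model-store  # placeholder
```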

    Result

    TorchServe can be used seamlessly with ModelMesh Serving to serve PyTorch models, including eager mode.

    Resolves #63

    approved 
    opened by njhill 9
  • feat: storage phase 1 for inference service reconciler

    Motivation

    rebase #32 to the new inference service reconciler for model mesh

    For Storage Spec details, please refer to the design doc: https://docs.google.com/document/d/1rYNh93XNMRp8YaR56m-_zK3bqpZhhGKgpVbLPPsi-Ow/edit#

    Additional storages/parameters support will come in phase 2.

    Modifications

    Result

    lgtm approved 
    opened by Tomcli 9
  • chore: Automatically set kube context in development container

    Motivation

    When using the containerized development environment (make develop) to run FVT tests, one needs to configure access to a Kubernetes or OpenShift cluster from inside the container, which has to be done for every make develop session. This can be tricky when cloud-provider-specific CLI tools are needed to connect and authenticate to a cluster.

    Currently there is a short paragraph in the FVT README about how to export a minified kubeconfig file and create it inside the container. It is tedious to repeat those steps for each make develop session, and depending on the OS, shell environment, editors, and possible text encoding issues, it is also error-prone.

    Modifications

    This PR proposes to automatically create the kubeconfig file in a local and git-ignored directory inside the local project and automatically mount it to the develop container. All the user then has to do is connect and authenticate to the cluster in the shell that will be running make develop.

    Result

    Kubernetes context is ready inside the development container.

    # shell environment, outside the develop container has access to K8s cluster
    [modelmesh-serving_ckadner]$ kubectl get pods
    
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          17m
    pod/minio                                   1/1     Running   0          17m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          17m
    
    [modelmesh-serving_ckadner]$ make develop
    
    ./scripts/build_devimage.sh
    Pulling dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Building dev image kserve/modelmesh-controller-develop:6be58b09c25833c1...
    Image kserve/modelmesh-controller-develop:6be58b09c25833c1 has 14 layers
    Tagging dev image kserve/modelmesh-controller-develop:6be58b09c25833c1 as latest
    ./scripts/develop.sh
    [[email protected] workspace]# kubectl get pods
    NAME                                        READY   STATUS    RESTARTS   AGE
    pod/etcd                                    1/1     Running   0          18m
    pod/minio                                   1/1     Running   0          18m
    pod/modelmesh-controller-387aef25be-ftyqu   1/1     Running   0          18m
    [[email protected] workspace]# 
    

    /cc @njhill

    lgtm approved test development 
    opened by ckadner 8
  • chore: Update GH Action workflows

    Both workflows for PRs and pushes were cleaned up. On a push (which includes when PRs merge), the code base is linted and tested first before building and publishing.

    lgtm approved 
    opened by pvaneck 8
  • fix: Ensure delete.sh script works properly with implicit namespace

    Motivation

    The delete.sh cleanup script currently doesn't require a namespace to be provided via the -n option and otherwise uses the current kubectl context's namespace. However, the $namespace variable wasn't set in this case, meaning later parts of the script might not work as intended.

    Modifications

    Ensure $namespace variable is set correctly either way.

    Result

    delete.sh script works properly when -n option isn't used.

    lgtm approved 
    opened by njhill 7
  • fvt: a bunch of FVT improvements

    Motivation

    A bunch of improvements to the FVT framework coming from our internal fork.

    After parallelizing and expanding the FVT suite to support testing the REST proxy internally, we had issues with the consistency of the FVTs. It took a few iterations of improvements to get them back to stable while we continued to add support for more tests. With the "fixes" and "features" coupled in a few different PRs the changes cannot be easily disentangled. This is a big mess of a PR, but the final result should be in a good place with all of our internal improvements.

    Modifications

    Test Parallelization:

    • split FVTs into separate suites (go packages)
      • ginkgo can parallelize within a suite, but runs the suites sequentially
    • refactors to enable sharing of code across the FVT suites
    • support parallelization by using ginkgo CLI to execute the tests instead of go test
    • use Ordered/Serial decorators on groups of tests that require it
      • this can help to speed up "inference" tests by creating the predictor once and using it across multiple specs, but it does mean some specs are not independent
      • TLS tests are marked as Serial because they require roll outs of the runtime pods
    • to help debugging, print inference services on failure in Predictor FVTs
    • avoid a nil pointer dereference that can occur if FVTs error during initialization while running in parallel
    • remove sleep in AfterEach of TLS tests
    • update port-forwards to select a pod directly from the Endpoints object corresponding to the service
      • when port-forwarding to a Service, there is no guard against selecting a Terminating pod

    Config and Secrets:

    • specify the full DefaultConfig in code instead of in the user-configmap.yaml file
    • allow TLS config maps to be overlayed on the base config (instead of template string in YAML)
    • generate TLS certificates for each run of the FVTs instead of using hard-coded certs

    REST Proxy Tests:

    • enable the REST proxy for FVTs and add inference tests using proxy
    • have the FVT Client manage port-forwards for each of REST and gRPC

    Result

    • a faster and more efficient FVT suite with parallelization
    • improved FVT stability and extensibility to support future changes
    lgtm approved 
    opened by tjohnson31415 7
  • feat: Update InferenceService reconciliation logic

    The ModelMesh controller can now reconcile InferenceServices using the new Model Spec in the predictor. Also fixed: parsing an InferenceService StorageURI left a leading slash in the model path, which caused issues in the adapter and prevented models from being loaded, as seen in #97.

    Closes: #96, #97 Related: #90

    Result

    Users can now successfully deploy an InferenceService using the predictor Model Spec such as the following:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: example-sklearn-isvc
      annotations:
        serving.kserve.io/deploymentMode: ModelMesh
        serving.kserve.io/secretKey: localMinIO
    spec:
      predictor:
        model:
          modelFormat:
            name: sklearn 
          storageUri: s3://modelmesh-example-models/sklearn/mnist-svm.joblib
    
    lgtm approved 
    opened by pvaneck 7
  • ModelMesh Release Tracker for KServe v0.7.0

    The plan is to cut the KServe 0.7 release mid next week. For this release, ModelMesh will be loosely integrated with KServe.

    Action Items:

    • [x] ModelMesh InferenceService CRD Support
      • [x] https://github.com/kserve/modelmesh-serving/pull/34
      • [x] Documentation on using InferenceService CR with ModelMesh
        • https://github.com/kserve/modelmesh-serving/pull/47
    • [x] ModelMesh REST Proxy Sidecar Support
      • [x] https://github.com/kserve/modelmesh-serving/pull/27
      • [x] Documentation on using REST inferencing
    • [x] Add KServe ModelMesh DeploymentMode Annotation checker
      • https://github.com/kserve/kserve/pull/1851
    • [ ] Update KServe hack/quick_install.sh to include ModelMesh-Serving as part of installation.
      • https://github.com/kserve/kserve/pull/1844
    • [x] Documentation updates
      • [x] Refresh External ModelMesh Documentation
        • https://github.com/kserve/modelmesh-serving/pull/39
      • [x] Update KServe Website with ModelMesh Documentation
        • https://github.com/kserve/website/issues/17
        • Can view website here: https://kserve.github.io/website/
        • https://github.com/kserve/website/pull/32
        • https://github.com/kserve/website/pull/37
    • [x] Assemble release process items
      • Tag release for version v0.7.0 to follow suit with KServe.
      • [x] GitHub workflow for tagged release
      • [x] Release process documentation
        • Create a document outlining the process of creating a release branch and tagging from a commit in that branch. KServe should already have a document like this.
        • https://github.com/kserve/modelmesh-serving/pull/40
        • https://github.com/kserve/modelmesh/pull/7
        • https://github.com/kserve/modelmesh-runtime-adapter/pull/7
    opened by pvaneck 7
  • Adjust FVT GH-Actions workflow

    Motivation

    Decrease the flakiness of FVT runs that occur when certain tests are run back to back.

    Modifications

    The rollingUpdate strategy is adjusted in a preprocessing step of the FVT GitHub Actions workflow to allow better stability in low-resource environments. The defaultTimeout was increased to account for the changes in strategy, since we ran into intermittent failures due to timeouts when the deployment didn't become ready in time.

    Result

    Less flakiness in FVT runs.

    lgtm approved 
    opened by pvaneck 7
  • Payload logging/events

    For various reasons, including monitoring by external systems for things like drift / outlier detection.

    It should support CloudEvents and be compatible with the logger in KServe "classic", so that it can be used in a similar way, as illustrated in these samples:

    • https://github.com/kserve/kserve/tree/master/docs/samples/logger/basic
    • https://github.com/kserve/kserve/tree/master/docs/samples/drift-detection/alibi-detect/cifar10 / https://github.com/kserve/kserve/tree/master/docs/samples/outlier-detection/alibi-detect/cifar10

    Some considerations / possible complications:

    • In KServe the logger can be configured per InferenceService. We need to decide whether we support this with model-mesh, or a simpler global configuration, or both. Another possibility could be allowing a logging destination to be configured globally and enabled/disabled per model.
    • Model-mesh doesn't currently touch the payloads; it only routes gRPC/protobuf. So we could emit the raw protobuf messages, but this would differ from the existing KServe case and so would not necessarily be compatible with the same integrations. We could transcode to JSON on the fly, but this would introduce processing overhead that may be undesirable and affect data path performance.
    • The KServe examples are based on the V1 API; we should check whether the existing logger works with the V2 API, since the runtimes supported by model-mesh are primarily V2-based.

    cc @rafvasq

    enhancement 
    opened by njhill 1
  • Create isolation between serving runtimes

    Is your feature request related to a problem? If so, please describe.

    In my team’s use case, we are currently using the KServe V2 Inference Protocol REST API for sending inference requests. On top of this protocol, we also make use of multiple virtual services to direct traffic to the modelmesh-serving service such that each serving runtime should be mapped to one virtual service.

    Our use case does not allow us to match the /v2/models/<model-id>/infer in our virtual service, and this creates a problem for us because requests sent to the virtual service for serving runtime X can end up reaching models loaded in serving runtime Y due to the fact that:

    • All serving runtimes share the same modelmesh-serving service
    • Users can set the model id to any existing inference service name in the request path

    Describe your proposed solution

    Since this is unwanted behaviour for my team, we see two possible solutions.

    1. Support mm-vmodel-id header in the REST API and allow it to take precedence over the model id specified in the V2 inference path
    2. Create a dedicated service per serving runtime instead of having all serving runtimes share the same service
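For reference, model-mesh already accepts an mm-vmodel-id metadata header on the gRPC path; option 1 would bring the equivalent to REST. A sketch of the gRPC usage (the stub and request are placeholders for generated V2 client code, and the service name is an assumption):

```python
# Sketch: attach the mm-vmodel-id routing header as gRPC metadata so the
# request is routed by InferenceService name. Names below are assumptions.
GRPC_METADATA = [("mm-vmodel-id", "my-inference-service")]

def infer_with_routing(stub, request, metadata=None):
    """Invoke ModelInfer with the routing header attached."""
    return stub.ModelInfer(request, metadata=metadata or GRPC_METADATA)
```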


    opened by xvnyv 1
  • Add FVT for TorchServe runtime

    Support for TorchServe has been added in https://github.com/kserve/modelmesh-serving/pull/250 and https://github.com/kserve/modelmesh-runtime-adapter/pull/34, but we should also add a test for this in our CI.

    Related to https://github.com/kserve/modelmesh-runtime-adapter/pull/34

    opened by njhill 0
  • InferenceService status is not updated correctly

    Describe the bug

    If I have some InferenceService resources running on a ServingRuntime and I delete the ServingRuntime, the status of the InferenceService is still Ready.

    Check the pod log:

    {"level":"info","ts":1668185801.959513,"logger":"controllers.Predictor","msg":"InferenceService Status updated","namespacedName":"mesh-test/example-onnx-mnist","source":"InferenceService","newStatus":{"available":true,"transitionStatus":"UpToDate","activeModelState":"Loaded","targetModelState":"","lastFailureInfo":{"reason":"RuntimeUnhealthy","message":"Waiting for runtime Pod to become available","modelId":"example-onnx-mnist__isvc-82e2bf7ea4"},"httpEndpoint":"http://modelmesh-serving.mesh-test:8008/","grpcEndpoint":"grpc://modelmesh-serving.mesh-test:8033","totalCopies":1,"failedCopies":0}}
    

    The available field is still true.

    To Reproduce

    Steps to reproduce the behavior:

    1. Create a ServingRuntime
    2. Create an InferenceService running on the server
    3. Delete ServingRuntime
    4. Check the condition of the InferenceService

    Expected behavior

    As soon as the ServingRuntime is deleted, the InferenceService should be unavailable.

    bug 
    opened by DaoDaoNoCode 1
  • GPU use in custom MLServer runtime

    I have deployed a PyTorch model using a custom MLServer runtime on CPU and am making gRPC inference requests. As our payloads are large, we need to speed up inference, so we are looking into using a GPU. Can you please suggest the most efficient way to use a GPU on ModelMesh with a custom MLServer runtime? Is there any documentation or resource on how to use GPUs with ModelMesh?

    @njhill @lizzzcai

    opened by MLHafizur 1
  • tls key and cert for etcd should be copied to user namespace along with etcd config

    Is your feature request related to a problem? If so, please describe.

    When TLS is enabled for etcd, the user needs to provide a model-serving-etcd secret like the following.

    apiVersion: v1
    kind: Secret
    metadata:
      name: model-serving-etcd
      namespace: modelmesh-serving
    type: Opaque
    data:
      etcd_connection: xxx
      ca.crt: xxxxxxxx
      tls.crt: xxxxxxxxx
      tls.key: xxxxxxxxx
    

    where etcd_connection contains:

    # json string
    {
      "endpoints": "https://etcd.modelmesh-serving.svc.cluster.local:2379",
      "root_prefix": "modelmesh-serving",
      "userid": "root",
      "password": "xxxxx",
      "certificate_file": "ca.crt",
      "client_key_file": "tls.key",
      "client_certificate_file": "tls.crt"
    }
    

    However, the model-serving-etcd secret in the user namespace only contains etcd_connection (ref.). To make it work, the TLS cert and key and etcd_connection have to be in a single model-serving-etcd secret.

    Another limitation of model-serving-etcd is that etcd_connection is a JSON string, so it is hard to reference values (like the user ID and password) via valueFrom, and I have to maintain two sets of secrets: one for modelmesh, another for etcd.

    Describe your proposed solution

    Options:

    1. Copy the whole model-serving-etcd secret (including the TLS cert, key, CA, and others) to the user namespace. (However, it seems like the current modelmesh controller will not update the secret in the user namespace when the secret is updated in the root namespace.)
    2. Separate the TLS secret from model-serving-etcd and only keep the TLS secret name in model-serving-etcd for reference; the user then has to sync the TLS secret to the user namespace manually.

    Personally I prefer option 1 if the model mesh controller is able to sync the updated secret to the user namespace automatically, and unpack the etcd_connection as key-value pairs under data.
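The "unpack etcd_connection as key-value pairs" part of option 1 can be sketched as follows (a hypothetical helper, not existing controller code; field names follow the example above):

```python
# Sketch: flatten the etcd_connection JSON string into individual
# key/value pairs so fields like userid can be referenced via valueFrom.
import json

def unpack_etcd_connection(raw):
    """Return the etcd_connection fields as flat string key/value pairs."""
    return {key: str(value) for key, value in json.loads(raw).items()}
```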


    opened by lizzzcai 2
Releases
  • v0.9.0 (Jul 21, 2022)

    :warning: What's Changed

    • ModelMesh Serving now directly imports KServe types for ServingRuntimes and InferenceServices. (#140, #146)
    • InferenceService CRD now copied from KServe and included as part of standalone ModelMesh Serving installation by default.
    • Renamed role/rolebinding names to include a modelmesh prefix. (#181)
    • ModelMesh now uses Java 17 (kserve/modelmesh#33) and G1 garbage collector. (kserve/modelmesh#41)
    • ModelMesh logging improvements. (kserve/modelmesh#41)
    • InferenceService CRD now included in default standalone mm-serving installation. (#166)
    • Many dependencies including etcd (updated to v3.5.3) were bumped. (#145)

    :rainbow: What's New?

    • Added support for OpenVINO Model Server ServingRuntime. (#141)
    • OpenVINO Model Server adapter implemented. (kserve/modelmesh-runtime-adapter#18)
    • TotalCopies is now available in the Predictor and InferenceService statuses. (#142)
    • Users can now set labels and annotations for ServingRuntime pods via the model-serving-config ConfigMap. (#144)
    • Users can override adapter environment variables added by the controller. (#149)
    • ServingRuntime matching based on protocolVersion is now supported. (#154)
    • ModelMetadata endpoint now enabled for Triton and MLServer ServingRuntimes. (#164)
    • Azure Blob Storage now added as a supported storage provider. (#174, kserve/modelmesh-runtime-adapter#23)
    • Add ModelMesh metrics for inference request/response payload sizes. (kserve/modelmesh#37)

    :lady_beetle: Fixes

    • Fixed possible nil pointer dereferences and minor log improvements. (#160)
    • Fixed potential eviction deadlock in ModelMesh. (kserve/modelmesh#25)
    • Disabled FIPS for Java in ModelMesh. (kserve/modelmesh#35)
    • Repair invalid ModelRecord lastUsed values in registry. (kserve/modelmesh#36)
    • Quickstart minio and etcd pods were converted to Deployment resources. (#157)

    :page_facing_up: Documentation

    • OpenVINO ServingRuntime documentation added. (#167)
    • Rest proxy documentation added. (#177)
    • Monitoring and metrics documentation added. (#175)
    • TLS configuration documentation added. (#176)
    • InferenceService CRD now documented as the primary interface for interacting with ModelMesh. (#190)

    :otter: Other

    • Upgrade tests to use Ginkgo V2. (#133)
    • Add performance test to E2E toolchain. (#139)
    • Quickstart etcd version updated to v3.5.4. (#151)

    Full Changelog: https://github.com/kserve/modelmesh-serving/compare/v0.8.0...v0.9.0

  • v0.8.0(Feb 12, 2022)

    :warning: What's Changed

    • Removed support for KServe TrainedModel CRD (#54)
    • MLServer ServingRuntime updated to use 0.5.2 (#61)
    • Go version updated to 1.17 along with other tooling updates (https://github.com/kserve/modelmesh-serving/commit/5355eb7249c483c54c5f800b61d32877e9dde980)
    • MLServer ServingRuntime now has an increased gRPC max message size (#85)
    • In the ServingRuntime CRD, SupportedModelTypes now goes by SupportedModelFormats (#100)
    • The max gRPC response message size via the REST-proxy has been increased to 16MiB (https://github.com/kserve/rest-proxy/pull/11)

    :rainbow: What's New?

    • Multi-namespace support for the ModelMesh controller was introduced (#84)
      • Kube resolver can now work with multiple namespaces for multi-namespace capability (#73)
      • ModelMeshEventStream component can now support multiple namespaces (#76)
      • ServingRuntime controller now works across multiple namespaces (#77)
      • Service Controller is now namespace-aware (#82)
    • Default RBAC is now cluster-scoped instead of namespace-scoped (#88)
    • Users can now configure environment variables for the model-mesh containers in ServingRuntime deployments (https://github.com/kserve/modelmesh-serving/commit/98eea5570b45f59149afb05ac8cc72045b98cf54)
    • Reconciliation logic added for new storage spec in InferenceServices and Predictors (#56, #83)
    • A multiModel field added to the ServingRuntime spec for denoting if a ServingRuntime is compatible with ModelMesh or not (#89)
    • The controller can now reconcile InferenceServices using the new Model Spec in the predictor (#101)
    • autoSelect field introduced to ServingRuntime CRD supportedModelTypes spec (#100)
    • Logic was added to have MM only consider SRs with model format containing autoSelect as true when finding compatible runtimes (#108)
    • Install script now allows passing in a URL to a config archive (#118)
    • Models hosted using GCS or HTTP(S) can now be used with ModelMesh through InferenceServices (#121)
    • REST input payloads through the REST-proxy can now be multi-dimensional (https://github.com/kserve/rest-proxy/pull/6)

    :lady_beetle: Fixes

    • Fix code errors reported by golangci-lint (#57)
    • Fixed a bug where invalid vModel specs led to a nil pointer dereference (https://github.com/kserve/modelmesh-serving/commit/1bea19895d128bb8bd52e931bee5c8a03295418f)
    • Fixed a bug where ServingRuntime controller would loop over empty reconcile events (https://github.com/kserve/modelmesh-serving/commit/2063f7353d300b4707b636b9778fa7c42d6ff004)
    • Events from plugged-in Predictor sources are now transformed properly when setting up ServingRuntime controller (https://github.com/kserve/modelmesh-serving/commit/d6f5c5dbaf30fd2a73611812aa81e756fd9cee72)
    • Fixed install issues on Mac (#114, #119)

    :page_facing_up: Documentation

    • Added developer documentation (#59)
    • Added notes about debug flags in custom MLServer runtimes (https://github.com/kserve/modelmesh-serving/commit/314761f8693213296b464d5b63a878d299bb3355)
    • Added Keras docs and example (https://github.com/kserve/modelmesh-serving/commit/54311bb26ec9e8f6176538ec9224afbb61f94d0a, #109)
    • Change install instructions to install from a release branch (#117)

    :otter: Other

    • Some controller code was cleaned up and optimized (https://github.com/kserve/modelmesh-serving/commit/f380a278099422d39fcdbf37ab0c9e47dc4966c7)
    • Script for setting up a user namespace for ModelMesh was added (#112)

    Full Changelog: https://github.com/kserve/modelmesh-serving/compare/v0.7.0...v0.8.0

  • v0.7.0 (Oct 12, 2021)

Owner

KSERVE
Highly scalable and standards based Model Inference Platform on Kubernetes for Trusted AI

Darren Shepherd 50 Dec 10, 2021
The k8s-generic-webhook is a library to simplify the implementation of webhooks for arbitrary customer resources (CR) in the operator-sdk or controller-runtime.

k8s-generic-webhook The k8s-generic-webhook is a library to simplify the implementation of webhooks for arbitrary customer resources (CR) in the opera

Norwin Schnyder 8 Nov 16, 2022
the simplest testing framework for Kubernetes controller.

KET(Kind E2e Test framework) KET is the simplest testing framework for Kubernetes controller. KET is available as open source software, and we look fo

Riita 36 Nov 9, 2022
pubsub controller using kafka and base on sarama. Easy controll flow for actions streamming, event driven.

Psub helper for create system using kafka to streaming and events driven base. Install go get github.com/teng231/psub have 3 env variables for config

Te Nguyen 6 Sep 26, 2022
Kubernetes workload controller for container image deployment

kube-image-deployer kube-image-deployer는 Docker Registry의 Image:Tag를 감시하는 Kubernetes Controller입니다. Keel과 유사하지만 단일 태그만 감시하며 더 간결하게 동작합니다. Container, I

PUBG Corporation 2 Mar 8, 2022
Raspberry pi GPIO controller package(CGO)

GOPIO A simple gpio controller package for raspberrypi. Documentation Examples Installation sudo apt-get install wiringpi go get github.com/polarspet

arian firoozfar 14 Nov 24, 2022
Knative Controller which emits cloud events when Knative Resources change state

Knative Sample Controller Knative sample-controller defines a few simple resources that are validated by webhook and managed by a controller to demons

salaboy 2 Oct 2, 2021
A controller managing namespaces deployments, statefulsets and cronjobs objects. Inspired by kube-downscaler.

kube-ns-suspender Kubernetes controller managing namespaces life cycle. kube-ns-suspender Goal Usage Internals The watcher The suspender Flags Resourc

Virtuo 59 Nov 15, 2022
K8s controller implementing Multi-Cluster Services API based on AWS Cloud Map.

AWS Cloud Map MCS Controller for K8s Introduction AWS Cloud Map multi-cluster service discovery for Kubernetes (K8s) is a controller that implements e

Amazon Web Services 64 Nov 14, 2022
A Pulumi NGINX Ingress Controller component

Pulumi NGINX Ingress Controller Component This repo contains the Pulumi NGINX Ingress Controller component for Kubernetes. This ingress controller use

Pulumi 8 Aug 10, 2022
Create cluster to run ingress controller and set the dns resolver

kubebuilder-crd-dep-svc-ing create cluster to run ingress controller and set the dns resolver $ kind create cluster --config clust.yaml $ sudo

null 1 Nov 15, 2021