Prophecis is a one-stop machine learning platform developed by WeBank

Overview

Prophecis

Introduction

Prophecis is a one-stop machine learning platform developed by WeBank. It integrates multiple open-source machine learning frameworks, provides multi-tenant management of machine learning compute clusters, and offers full-stack container deployment and management services for production environments.

Architecture

  • Overall Structure

    Five key services in Prophecis:

  • Prophecis Machine Learning Flow: A distributed machine learning modeling tool with stand-alone and distributed model training modes. It supports TensorFlow, Python, XGBoost, and other machine learning frameworks, and covers the complete pipeline from modeling to deployment;

  • Prophecis MLLabis: A machine learning development and exploration tool. MLLabis is an online IDE based on JupyterLab. It supports machine learning tasks on GPU and Hadoop clusters, supports Python, R, and Julia, and integrates Debug and TensorBoard plugins;

  • Prophecis Model Factory: MF provides machine learning model storage, model deployment, A/B testing, model management, and other services;

  • Prophecis Data Factory: DF provides feature engineering tools, data labeling tools and material management services;

  • Prophecis Application Factory: AF is jointly developed by the big data platform team and the AI Department of WeBank. It is a custom development based on KubeSphere, QingCloud's open-source system, and provides CI/CD and DevOps tools as well as GPU cluster monitoring and alerting.

  • Features

  • Whole Machine Learning Life Cycle Support: Prophecis's MLFlow can be embedded into DataSphere Studio workflows through AppJoint. It supports the entire machine learning process, from data upload, data preprocessing, feature engineering, model training, and model evaluation to model release and model deployment.

  • One-Click Model Deployment Service: Prophecis MF supports deploying models generated by Prophecis Machine Learning Flow and Prophecis MLLabis as a RESTful API or RPC interface with one click, connecting models to business systems seamlessly.

  • Comprehensive Management Platform: Built by customizing community open-source programs, Prophecis provides complete, reliable, and highly flexible enterprise-level tools for machine learning application release, monitoring, service management, and log collection and query. These give full control over machine learning applications and cover the work required to run them in an online production environment.

Quick Start Guide

Developing

  • Read the Develop Guide to learn how to develop Prophecis.

Roadmap

  • See our Roadmap for what's coming soon in Prophecis.

Contributing

Contributions are warmly welcomed and greatly appreciated.

Communication

For a quick response, please raise an issue or scan the QR code below with WeChat or QQ to join our group:

Communication

License

Prophecis is under the Apache 2.0 license. See the LICENSE file for details.

Comments
  • Notebook stays in the waiting state after creation

    • helm version:version.BuildInfo{Version:"v3.2.1", GitCommit:"fe51cd1e31e6a202cba7dead9552a6d418ded79a", GitTreeState:"clean", GoVersion:"go1.13.10"}
    • docker version:
    Client: Docker Engine - Community
     Version:           19.03.9
     API version:       1.40
     Go version:        go1.13.10
     Git commit:        9d988398e7
     Built:             Fri May 15 00:25:27 2020
     OS/Arch:           linux/amd64
     Experimental:      false
    
    Server: Docker Engine - Community
     Engine:
      Version:          19.03.9
      API version:      1.40 (minimum version 1.12)
      Go version:       go1.13.10
      Git commit:       9d988398e7
      Built:            Fri May 15 00:24:05 2020
      OS/Arch:          linux/amd64
      Experimental:     false
     containerd:
      Version:          1.4.13
      GitCommit:        9cc61520f4cd876b86e77edfeb88fbcd536d1f9d
     runc:
      Version:          1.0.3
      GitCommit:        v1.0.3-0-gf46b6ba
     docker-init:
      Version:          0.18.0
      GitCommit:        fec3683
    

    • k8s version:

    Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:38:05Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.20", GitCommit:"1f3e19b7beb1cc0110255668c4238ed63dadb7ad", GitTreeState:"clean", BuildDate:"2021-06-16T12:51:17Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
    

    After creating a Notebook, its status stays at waiting. How should this be handled, and where can the logs be seen?

    opened by marstalk 5
  • Images cannot be pulled!!!

    Hello! The images wedatasphere/prophecis:metrics-0.2.0, wedatasphere/prophecis:jobmonitor-0.2.0, wedataspere/prophecis:minio-2020-06-14, wedatasphere/prophecis:lcm-0.2.0, and wedatasphere/prophecis:trainer-0.2.0 cannot be pulled. Do these images exist in the repository?

    opened by gugumituo 3
  • I encountered the following problem: nginx: [emerg] host not found in resolver "kube-dns.kube-system.svc.cluster.local" in /etc/nginx/conf.d/ui.conf:46

    [[email protected] Prophecis]# kubectl logs -f bdap-ui-deployment-595f6c44bf-jmkb5 -n prophecis
    /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
    /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
    /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
    10-listen-on-ipv6-by-default.sh: info: /etc/nginx/conf.d/default.conf is not a file or does not exist
    /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
    /docker-entrypoint.sh: Configuration complete; ready for start up
    nginx: [emerg] host not found in resolver "kube-dns.kube-system.svc.cluster.local" in /etc/nginx/conf.d/ui.conf:46
    [[email protected] Prophecis]#
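
One way to narrow this error down is to check whether the hostname passed to nginx's `resolver` directive can be resolved at all from inside the cluster. The sketch below uses only Python's standard library; `can_resolve` is a hypothetical helper (not part of Prophecis), and it assumes Python is available in a pod running in the same cluster.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver can map hostname to an address."""
    try:
        # getaddrinfo uses the same system resolver configuration
        # (for example /etc/resolv.conf) as other processes in the pod.
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False
```

If `can_resolve("kube-dns.kube-system.svc.cluster.local")` returns False inside the cluster, that would suggest the cluster's DNS Service has a different name or cluster domain than the one written in ui.conf.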

    opened by ATM006 1
  • Which versions of k8s are supported?

    I tested Kubernetes v1.20.0 and v1.18.6, and both show errors. Details below.

    Running helm install notebook-controller . in the folder Prophecis/helm-charts/k8s 1.18.6/notebook-controller shows:

    Error: template: MLSS/templates/notebook-controller-0.5.1.yaml:115:24: executing "MLSS/templates/notebook-controller-0.5.1.yaml" at <.Values.aide.controller.notebook.repository>: nil pointer evaluating interface {}.controller

    Running helm install notebook-controller . in the folder Prophecis/helm-charts/notebook-controller shows:

    Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Deployment" in version "apps/v1beta1", unable to recognize "": no matches for kind "StatefulSet" in version "apps/v1beta2"
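
The "no matches for kind" error is consistent with the removal of the `apps/v1beta1` and `apps/v1beta2` API groups in Kubernetes 1.16, after which Deployment and StatefulSet are served from `apps/v1`. As a rough sketch only (not the project's fix), rendered manifests could be rewritten as follows. `migrate_api_versions` and the mapping table are illustrative names, and the function assumes each `apiVersion:` field sits on its own line.

```python
import re

# Assumption: only these two removed API groups need to be migrated.
REMOVED_API_VERSIONS = {
    "apps/v1beta1": "apps/v1",
    "apps/v1beta2": "apps/v1",
}

# Captures the "apiVersion: " prefix (with indentation) and the value.
_API_VERSION_LINE = re.compile(r"^(\s*apiVersion:\s*)(\S+)\s*$")


def migrate_api_versions(manifest: str) -> str:
    """Replace removed apiVersion values in a YAML manifest given as text."""
    migrated = []
    for line in manifest.splitlines():
        match = _API_VERSION_LINE.match(line)
        if match:
            value = match.group(2).strip("'\"")
            if value in REMOVED_API_VERSIONS:
                line = match.group(1) + REMOVED_API_VERSIONS[value]
        migrated.append(line)
    return "\n".join(migrated)
```

Note that `apps/v1` also requires an explicit `spec.selector`, so a manifest written for the older API groups may need changes beyond the version string.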

    opened by xueyoo 1
  • Prophecis v0.3.2 release

    [1] Update the MLFlow module, adapt to DataSphere Studio 1.1.0 version.

    [2] Update the Prophecis appconn, adapt to DataSphere Studio 1.1.0 version.

    [3] Update MLFlow user_manual.

    [4] Fix the error of model list interface.

    [5] Fix some wrong variables in helm chart installation.

    opened by alexzyWu 0
  • feat(mllabis): Add resource modification and release in notebook server

    1. Add resource modification and resource release in notebook server.
    2. Optimize log and status viewing function.
    3. Add notebook controller (from Kubeflow Controller).

    opened by alexzyWu 0
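  • Frequent "network service exception" errors: are they related to LDAP authentication?

    Deployed Prophecis version: v0.3.0. Kubernetes version: 1.18.6. All pods are in the Running state.

    1. The deployment document says that Prophecis uses LDAP for unified authentication, but it does not require an LDAP directory service to be installed. Must any particular users be created in LDAP?
    2. The super administrator and user password that the deployment document asks for correspond to the t_superadmin table. That table has a name field; is no field needed to store the password? Do users created in LDAP have to match the super administrator users in the t_superadmin table?
    3. An error occurs at login. Is the cause that user authentication against the LDAP directory server failed?
    4. After clearing the browser cache and reopening the login page, login succeeds, but clicking through many pages produces the "network service exception" error. Is this related to Auth_type: LDAP?
    5. Is LDAP unified authentication configurable, so that it can be turned off or on?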
    opened by larypython 0
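  • Appconn plugin deployment package and initialization SQL for the DSS/Linkis platform are missing for Prophecis 0.3.2

    While preparing the Appconn plugin deployment package and initialization SQL for the DSS/Linkis platform as required by the installation and deployment document, I found several problems. Release 0.3.2 does not provide the corresponding plugin package, and the appconn pom in the 0.3.2 source depends on DSS 1.0.1 and Linkis 1.0.3. After replacing these with the DSS version released in July (DSS 1.1.0, Linkis 1.1.1), compilation fails with: MLSSOpenRequestRef.java[XX,XX] error: cannot find symbol. In addition, the dss_application metadata table that the appconn initialization SQL operates on no longer exists in the metadata tables of DSS 1.1.0 / Linkis 1.1.1.

    [Screenshot: compilation error after replacing the DSS 1.0.1 / Linkis 1.0.3 dependencies in the source package]

    [Screenshot: the mlss initialization SQL references the table dss_application, which does not exist in DSS]

    [Screenshot: the DSS metadata database does not contain the dss_application table]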

    opened by AlexHuyc 0
  • docker hub missing images of some v0.3.2 components

    The missing images in Prophecis/install/value.yaml:

    wedatasphere/prophecis:mllabis-v0.3.2 --> wedatasphere/prophecis:mllabis-v0.3.0
    wedatasphere/prophecis:metrics-v0.3.2 --> wedatasphere/prophecis:metrics-v0.3.0
    wedatasphere/prophecis:mf-server-v0.3.2 --> wedatasphere/prophecis:mf-server-v0.3.0

    opened by ResultLv 0
  • save model error, cannot connect to di-storage-rpc service

    di-storage-rpc is running, and DNS is: di-storage-rpc.prophecis, but the address in log is: di-storage-rpc..svc.cluster.local:80, it looks like "prophecis" is missing in DNS
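
Kubernetes Service DNS names follow the pattern `<service>.<namespace>.svc.<cluster-domain>`, so an address such as `di-storage-rpc..svc.cluster.local` is what results when the namespace part is empty. The sketch below illustrates that failure mode and a guard against it. `service_fqdn` is a hypothetical helper, not Prophecis code, and where the real service reads its namespace from is not shown in this report.

```python
def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name of a Kubernetes Service.

    Raises ValueError instead of silently producing a name such as
    "di-storage-rpc..svc.cluster.local" when a part is empty.
    """
    if not service or not namespace:
        raise ValueError(
            f"service and namespace must be non-empty, got "
            f"service={service!r}, namespace={namespace!r}"
        )
    return f"{service}.{namespace}.svc.{cluster_domain}"


# With the namespace from the report, the expected address is produced.
print(service_fqdn("di-storage-rpc", "prophecis"))
```

Failing fast on an empty namespace makes a missing configuration value visible at startup rather than as a connection error later.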

    opened by ResultLv 0
  • version 0.3.2, di-storage server start, but has rpc error

    storage-deployment.yaml

    kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: di-storage
      namespace: prophecis
      labels:
        app.kubernetes.io/managed-by: Helm
        environment: prophecis
        service: di-storage
      annotations:
        deployment.kubernetes.io/revision: '1'
        meta.helm.sh/release-name: prophecis
        meta.helm.sh/release-namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          environment: prophecis
          service: di-storage
      template:
        metadata:
          creationTimestamp: null
          labels:
            environment: prophecis
            service: di-storage
            version: storage-v0.3.2
        spec:
          volumes:
            - name: di-config
              configMap:
                name: di-config
                defaultMode: 420
            - name: timezone-volume
              hostPath:
                path: /usr/share/zoneinfo/Asia/Shanghai
                type: File
            - name: oss-storage
              hostPath:
                path: tmp
                type: Directory
          containers:
            - name: di-storage-rpc-server
              image: 'wedatasphere/prophecis:storage-v0.3.2'
              command:
                - /bin/sh
                - '-c'
              args:
                - DLAAS_PORT=8443 /main
              ports:
                - containerPort: 8443
                  protocol: TCP
              env:
                - name: DLAAS_POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.namespace
                - name: DLAAS_ENV
                  value: prophecis
                - name: DLAAS_LOGLEVEL
                  value: DEBUG
                - name: DLAAS_PUSH_METRICS_ENABLED
                  value: 'true'
                - name: LINKIS_ADDRESS
                  value: '127.0.0.1:8088'
                - name: LINKIS_TOKEN_CODE
                  value: BML-AUTH
                - name: MONGO_ADDRESS
                  value: mongo.prophecis.svc.cluster.local
                - name: MONGO_USERNAME
                  value: mlssopr
                - name: MONGO_PASSWORD
                  value: mlssopr
                - name: MONGO_DATABASE
                  value: mlsstest
                - name: MONGO_Authentication_Database
                  value: admin
                - name: DLAAS_OBJECTSTORE_TYPE
                  valueFrom:
                    secretKeyRef:
                      name: storage-secrets
                      key: DLAAS_OBJECTSTORE_TYPE
                - name: DLAAS_OBJECTSTORE_AUTH_URL
                  valueFrom:
                    secretKeyRef:
                      name: storage-secrets
                      key: DLAAS_OBJECTSTORE_AUTH_URL
                - name: DLAAS_OBJECTSTORE_USER_NAME
                  valueFrom:
                    secretKeyRef:
                      name: storage-secrets
                      key: DLAAS_OBJECTSTORE_USER_NAME
                - name: DLAAS_OBJECTSTORE_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: storage-secrets
                      key: DLAAS_OBJECTSTORE_PASSWORD
                - name: DLAAS_ELASTICSEARCH_SCHEME
                  value: http
                - name: DLAAS_ELASTICSEARCH_ADDRESS
                  value: 'http://elasticsearch.prophecis.svc.cluster.local:9200'
                - name: DLAAS_ELASTICSEARCH_ADDRESS
                  valueFrom:
                    secretKeyRef:
                      name: trainingdata-secrets
                      key: DLAAS_ELASTICSEARCH_ADDRESS
                - name: DLAAS_ELASTICSEARCH_USERNAME
                  valueFrom:
                    secretKeyRef:
                      name: trainingdata-secrets
                      key: DLAAS_ELASTICSEARCH_USERNAME
                - name: DLAAS_ELASTICSEARCH_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: trainingdata-secrets
                      key: DLAAS_ELASTICSEARCH_PASSWORD
              resources:
                limits:
                  cpu: 500m
                  memory: 1Gi
              volumeMounts:
                - name: di-config
                  mountPath: /etc/mlss/
                - name: timezone-volume
                  mountPath: /etc/localtime
                - name: oss-storage
                  mountPath: /data/oss-storage
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              imagePullPolicy: Always
          restartPolicy: Always
          terminationGracePeriodSeconds: 30
          dnsPolicy: ClusterFirst
          nodeSelector:
            mlss-node-role: platform
          securityContext: {}
          imagePullSecrets:
            - name: hubsecret
          schedulerName: default-scheduler
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 25%
          maxSurge: 25%
      revisionHistoryLimit: 10
      progressDeadlineSeconds: 600
    
    opened by ResultLv 1
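
One detail of a manifest like the one above that can be checked mechanically is whether an environment variable is defined more than once (`DLAAS_ELASTICSEARCH_ADDRESS` appears twice there). Whether that is related to the reported RPC error is unknown. The sketch below is a hypothetical helper, `duplicate_env_names`, that scans block-style YAML by indentation using only the standard library; it is an illustration under that assumption, not a full YAML parser.

```python
import re
from collections import Counter


def duplicate_env_names(manifest: str) -> list:
    """Return sorted env variable names that occur more than once.

    Only "- name:" items inside an "env:" block are counted, so names of
    volumes, containers, and volumeMounts are ignored.
    """
    counts = Counter()
    env_indent = None  # indentation of the current "env:" key, if any
    for line in manifest.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        if env_indent is not None:
            # The block continues while lines are indented deeper, or are
            # list items at the same level as the "env:" key.
            in_block = indent > env_indent or (
                indent == env_indent and stripped.startswith("- ")
            )
            if not in_block:
                env_indent = None
        if env_indent is None:
            if stripped == "env:":
                env_indent = indent
            continue
        match = re.match(r"-\s+name:\s*(\S+)", stripped)
        if match:
            counts[match.group(1)] += 1
    return sorted(name for name, n in counts.items() if n > 1)
```

Nested keys such as the `name:` under `secretKeyRef` are not counted because they do not start a list item.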
Releases(v0.3.2)
  • v0.3.2(Jul 8, 2022)

    Enhancement

    [1] Update the MLFlow module, adapt to DataSphere Studio 1.1.0 version. #69

    [2] Update the Prophecis appconn , adapt to DataSphere Studio 1.1.0 version. #69

    [3] Update MLFlow user_manual. #69

    Bugfix

    [1] Fix the error of model list interface. #64

    [2] Fix some wrong variables in helm chart installation. #59

  • v0.3.0(Apr 7, 2022)

    Prophecis v0.3.0 mainly publishes the MLFlow module and the Model Factory module.

    Enhancement

    [1] Add mlflow module to support the workflow management of machine learning experiments; #42 #43 #48 #51 #53

    [2] Add mlflow appconn module to support embedding machine learning workflows into DataSphere Studio workflows; #44

    [3] Add a new Databricks MLflow server to support the storage and tracking of the recorded data of modeling experiments; #48

    [4] Add a new model factory service to support model registration, deployment and image construction; #45

    [5] Optimize the notebook instance management capability of mllabis, and add log viewing, resource modification, resource release and other functions; #47 #50

    [6] Optimize the notebook sparkclient of mllabis and support the creation of multiple sparkclient instances; #47

    [7] Optimize doc module, add appconn compilation and deployment documents, and update relevant documents of user_manual; #50

    [8] Optimize the install module and add a model factory deployment script (helm chart). #46

    mlflow-appconn-lib.zip(58.85 MB)
    prophecis-appconn-lib.zip(59.66 MB)
  • v0.2.2(Aug 10, 2021)

  • v0.2.1(Jul 29, 2021)

    Prophecis 0.2.1 release

    Prophecis v0.2.1 mainly updates the installation, development and use documents, while optimizing the Helm Chart.

    Enhancement

    [1] Add CC user document, DI user document, development document and configuration document, modify the deployment document. #28

    [2] Move the LogCollectorDS component to the Prophecis component. #27

    [3] Reduce the variables exposed by values.yaml in the Prophecis component. #27

  • v0.2.0(Mar 15, 2021)

    Prophecis 0.2.0 release

    Prophecis v0.2.0 mainly publishes the distributed modeling module (DI). This module is based on FfDL and mainly provides single-machine modeling and distributed TensorFlow tasks.

    Enhancement

    [1] Add Prophecis-DI Rest Module. #12

    [2] Add Prophecis-DI Trainer & JobMonitor Module, which is responsible for managing the task lifecycle. #13

    [3] Add Prophecis-DI LCM Module, which is responsible for task scheduling and for building single-machine and distributed tasks. #8

    [4] Add Prophecis-DI Storage Module, which is responsible for operating storage backends such as Minio, ES, and Mongo. #14

    [5] Add Log in CLI Program, a command-line interface tool. #11

    Bugfix

    [1] Fix Helm Chart Setting Error. #16

    Prophecis-DI module is built based on the FfDL. The main modifications are as follows:

    [1] Integrate Kubeflow Arena to provide distributed TensorFlow task capability.

    [2] Modify the creation mode of single-machine modeling tasks: remove helper and job jobmonitor in task, and change deploy pod to deploy job.

    [3] The log collection service is changed to a daemonset, and the collection tool is changed to Fluent Bit.

    [4] The task status update mode is changed to an independent service, job monitor.

    [5] Add user GUID control in container data directory.

    [6] Enhance the CLI: add parameter replacement for the yaml template, and change the train command to use a websocket connection that provides log and state.

    [7] The code file storage server is changed to Minio.

  • v0.1.1(Dec 21, 2020)

    Prophecis 0.1.1 release

    Prophecis v0.1.1 is mainly a bug fix version. It fixes some login errors and adds a helm chart for notebook controller deployment.

    Enhancement

    [1] User login add password RSA encryption verification. #1

    [2] Add configurable administrator users to the system. #1

    [3] Add Notebook Controller Deploy Helm Chart. #1

    Bugfix

    [1] Fix LDAP bug: add BaseDN config. #1

    [2] Fix Helm Chart bug: wrong repository and variable. #1
