Instant Kubernetes-Native Application Observability

Overview

What is Pixie?

Pixie gives you instant visibility into your Kubernetes applications by providing access to metrics, events, traces, and logs without requiring code changes.

We're building Pixie for broad use by the end of 2020. If you are interested, feel free to try our community beta and join our community on Slack.


Quick Start

Review Pixie's requirements to make sure that your Kubernetes cluster is supported.

Signup

Visit our product page and sign up with your Google account.

Install CLI

Run the command below:

bash -c "$(curl -fsSL https://withpixie.ai/install.sh)"

Or see our Installation Docs to install Pixie using Docker, Debian, RPM, or the latest binary.

(optional) Set up a sandbox

If you don't already have a K8s cluster available, you can use Minikube to set up a local environment:

  • On Linux, run minikube start --cpus=4 --memory=6000 --driver=kvm2 -p=<cluster-name>. The default docker driver is not currently supported, so using the kvm2 driver is important.

  • On Mac, run minikube start --cpus=4 --memory=6000 -p=<cluster-name>.

More detailed instructions are available here.
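
Before deploying, you can sanity-check that the sandbox cluster is up and reachable with plain kubectl (nothing Pixie-specific):

kubectl get nodes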

Start a demo app (a sketch follows):
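
If you need some traffic to look at, the CLI can deploy a demo application for you. A minimal sketch, assuming the demo subcommand and the px-sock-shop demo are available in your CLI version:

px demo list
px demo deploy px-sock-shop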

🚀 Deploy Pixie

Use the CLI to deploy the Pixie Platform in your K8s cluster by running:

px deploy

Alternatively, you can deploy with YAML or Helm.
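
As a rough sketch of the Helm path (the chart repo URL, chart name, and deployKey flag below are assumptions based on docs from this era, not canonical; follow the install guides linked below for the exact commands):

helm repo add pixie https://pixie-helm-charts.storage.googleapis.com
helm install pixie pixie/pixie-chart --namespace pl --set deployKey=<deploy-key>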


Check out our install guides and walkthrough videos for alternate install schemes.

Get Instant Auto-Telemetry

Run scripts with px CLI


Service SLA:

px run px/service_stats


Node health:

px run px/node_stats


MySQL metrics:

px run px/mysql_stats


Explore more scripts by running:

px scripts list


Check out our pxl_scripts repo for more examples.


View machine generated dashboards with Live views

The Pixie Platform auto-generates "Live View" dashboards to visualize script results.

You can view them by clicking the URLs printed by px or by visiting:

https://work.withpixie.ai/live


Pipe Pixie dust into any tool

You can transform and pipe your script results into any other system or workflow by consuming px results with tools like jq.

Example with http_data:

px run px/http_data -o json | jq -r .

More examples here
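
For instance, a hypothetical filter that keeps only error responses (the resp_status field name is an assumption; inspect the JSON from the command above to see the actual field names in your output):

px run px/http_data -o json | jq 'select(.resp_status >= 400)'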


To see more script examples and learn how to write your own, check out our docs for more guides.
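
For a flavor of the language, here is a minimal PxL sketch (the http_events table and time_ column follow the conventions used in the community scripts, but check the docs for the exact schema):

import px

# Load the last 5 minutes of HTTP traces collected by Pixie.
df = px.DataFrame(table='http_events', start_time='-5m')

# Attach the service name from the metadata context and count requests per service.
df.service = df.ctx['service']
df = df.groupby(['service']).agg(num_requests=('time_', px.count))

px.display(df)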


Contributing

We're excited to have you contribute to Pixie. Our community has adopted the Contributor Covenant as its code of conduct, and we expect all participants to adhere to it. Please report any violations to [email protected]. All code contributions require the Contributor License Agreement. The CLA can be signed when creating your first PR.

There are many ways to contribute to Pixie:

  • Bugs: Something not working as expected? Send a bug report.
  • Features: Need new Pixie capabilities? Send a feature request.
  • Views & Scripts Requests: Need help building a live view or PxL scripts? Send a live view request.
  • PxL Scripts: PxL scripts are used to extend Pixie functionality. They are an excellent way to contribute to golden debugging workflows. Look here for more information.
  • Pixienaut Community: Interested in becoming a Pixienaut and in helping shape our community? Apply here.

Open Source

Along with building Pixie as a freemium SaaS product, contributing open and accessible projects to the broader developer community is integral to our roadmap.

We plan to contribute in two ways:

  • Open Sourced Pixie Platform Primitives: We plan to open-source components of the Pixie Platform which can be independently useful to developers after our Beta. These include our Community PxL Scripts, Pixie CLI, eBPF Collectors, etc. If you are interested in contributing during our Beta, email us.
  • Unlimited Pixie Community Access: Our Pixie Community product is a completely free offering with all core features of the Pixie developer experience. We will invest in this offering for the long term to give developers across the world an easy and zero cost way to use Pixie.

Under the Hood

Three fundamental innovations enable Pixie's magical developer experience:

Progressive Instrumentation: Pixie Edge Modules (“PEMs”) collect full-body request traces (via eBPF), system metrics & K8s events without the need for code changes and at less than 5% overhead. Custom metrics, traces & logs can be integrated into the Pixie Command Module.

In-Cluster Edge Compute: The Pixie Command Module is deployed in your K8s cluster to isolate data storage and computation within your environment for drastically better intelligence, performance & security.

Command Driven Interfaces: Programmatically access data via the Pixie CLI and Pixie UI, which are designed from the ground up to let you run analyses & debug scenarios faster than any other developer tool.

For more information on the Pixie Platform's architecture, check out our docs or overview deck.

About Us

Founded in late 2018, we are a San Francisco-based stealth machine intelligence startup. Our north star is to build a new generation of intelligent products which empower developers to engineer the future.

We're heads down building Pixie and excited to share it broadly with the community later this year. If you're interested in learning more about us or in our current job openings, we'd love to hear from you.

License

Pixie Community is the free offering of Pixie's proprietary SaaS product catalogue.

Our PxL Scripts are licensed under Apache License, Version 2.0.

Other Pixie Platform components such as Pixie CLI and eBPF based Data Collectors will also be licensed under the Apache License, Version 2.0. Contribution of these components is planned for Oct 2020.

Issues
  • Compile error, missing HTTP Tables.

Describe the bug Cannot run any scripts because the http_events table is not found:

Script compilation failed: L222 : C22  Table 'http_events' not found.
    

    To Reproduce Steps to reproduce the behavior: Install fresh version of Pixie on Minikube Cluster

    Expected behavior Pixie scripts to execute

Screenshots (attached)

    Logs Please attach the logs by running the following command:

    ./px collect-logs (See Zip File) 
    

    pixie_logs_20210505024739.zip App information (please complete the following information):

    • Pixie version: 0.5.3+Distribution.0ff53f6.20210503183144.1
    • K8s cluster version: v1.20.2
    opened by WarpWing 12
  • [Doc issue] no ingress installed so dev_dns_updater did nothing

Describe the bug I followed the documentation to deploy Pixie Cloud, and the setup-dns section is supposed to update /etc/hosts if there are any ingress rules in the K8s cluster. But there weren't any!

    ➜  pixie git:(main) ✗ kubectl get ing
    No resources found in default namespace.
    ➜  pixie git:(main) ✗ kubectl get ing -n plc
    No resources found in plc namespace.
    

And, of course, the updater doesn't change anything:

    ➜  pixie git:(main) ✗ ./dev_dns_updater --domain-name="dev.withpixie.dev"  --kubeconfig=$HOME/.kube/config --n=plc
    INFO[0000] DNS Entries                                   entries="dev.withpixie.dev, work.dev.withpixie.dev, segment.dev.withpixie.dev, docs.dev.withpixie.dev" service=cloud-proxy-service
    INFO[0000] DNS Entries                                   entries=cloud.dev.withpixie.dev service=vzconn-service
    

It didn't change the /etc/hosts file!

    To Reproduce

Expected behavior It should update /etc/hosts so that we can visit dev.withpixie.dev in the browser.

    Screenshots

    Logs

    App information (please complete the following information):

    • Pixie version: master branch
    • K8s cluster version: minikube on macOS 10.15.7 k8s version v1.22.2

    Additional context

    opened by Colstuwjx 11
  • Add bpftrace_pxls part 1 - Issue #291

    As described in Issue #291 the first part of bpftrace pxl scripts.

Thanks to @oazizi000 for the support.

    opened by avwsolutions 10
  • px deploy failed flatcar linux kubernetes cluster

Describe the bug $ px deploy fails on a Flatcar Linux Kubernetes cluster.

To Reproduce Run px deploy and observe the error: fatal failed to fetch vizier versions error=open /home/core/.pixie/auth.json: no such file or directory

Expected behavior Pixie should deploy and run properly.

Logs Please attach the logs by running the following command: ./px collect-logs

    
App information (please complete the following information):

    • Pixie version:
    • K8s cluster version: v1.19.2

    Please help.
    opened by 4ss3g4f 9
  • Add more detailed instructions to the dev docs

    Improve the DEVELOPMENT.md documentation.

    • Include prerequisites.
    • Add an example to Vizier section of running unit tests.
    • Link to instructions for spinning up a Minikube cluster to deploy onto.
    • Clarify the differences between Vizier and Pixie Cloud.
    • Add workaround instructions for failed px deploy.
    • Note when and where various commands in the instructions should be run and explain what they do in greater detail.
    opened by hmstepanek 7
  • Unable to deploy on minikube with k8s v1.11

    // filed on behalf of @XaF

Describe the bug After deployment, the UI does not refresh to the console view.

    To Reproduce Steps to reproduce the behavior:

    1. Download and authorize CLI in minikube
    2. Run deployment command
    3. Pods in the pl namespace start running with flakiness
    4. UI does not refresh to console view.

    Expected behavior After deployment, UI should refresh to console view to execute queries

    Screenshots Shared via zoom call

    Logs Please attach the logs by running the following command:

    pixie_logs_20200221141224 (1).zip

    App information (please complete the following information):

    • Pixie version: v0.1.16
    • K8s cluster version: v1.11

    Additional context n/a

    opened by ishanmkh 6
  • Auto-close the "you can now close this window" browser page

    Is your feature request related to a problem? Please describe. No

    Describe the solution you'd like Add a javascript tag or something to automatically close the window after a short delay, so I don't have to actually interact with my browser during the setup. Shouldn't be complicated to add as you already control the page that's shown after the authorization :)

    Describe alternatives you've considered None

    opened by XaF 6
  • Add mux stitcher implementation

    This is the next step in supporting the mux protocol whose parser was added in 93a23272d2b78cce9fb8acc835e5f3ad263aa45c (#327).

    Testing

    • [x] New stitcher tests pass which verify the following cases:
      • When there are requests with missing responses those frames are not consumed
      • When there are responses with missing requests those frames are consumed
      • Consumes frames that have matching request and response pairs
    opened by ddelnano 5
  • pxtrace: "Struct/union of type 'struct _tracepoint_sched_sched_wakeup' does not contain a field named 'pid'"

    Describe the bug We have created the following pixie script to get scheduling latencies using bpftrace:

    import pxtrace
    import px
    
    
    program = """
    #include <uapi/linux/ptrace.h>
    #include <linux/sched.h>
    #include <linux/nsproxy.h>
    #include <linux/pid_namespace.h>
    
    tracepoint:sched:sched_wakeup,
    tracepoint:sched:sched_wakeup_new 
    {
        @qtime[args->pid] = nsecs;
    }
    
    tracepoint:sched:sched_switch {
        
        if (args->prev_state == TASK_RUNNING) {
            if (args->prev_pid != 0) {
                @qtime[args->prev_pid] = nsecs;
            }
        }
        $ns = @qtime[args->next_pid];
        $latency = (nsecs - $ns)/1000;
        if($latency != 0 && args->next_pid != 0 && args->prev_pid != 0){
            printf(\"time_:%d oproc:%s opid:%d lat:%lld nproc:%s npid:%d\", nsecs, args->prev_comm, args->prev_pid, $latency, args->next_comm, args->next_pid);
        }
        delete(@qtime[args->next_pid]);    
    }
    """
    
    def demo_func():
        table_name = 'latencies_table'
        pxtrace.UpsertTracepoint('latencies_probe_2111',
                                 table_name,
                                 program,
                                 pxtrace.kprobe(),
                                 "10m")
        # Rename columns
        df = px.DataFrame(table=table_name)
    
        return df   
    
    df = demo_func()
    px.display(df)
    

    On running the above script, we get the following errors:

    Semantic analyser failed with message: stdin:9:5-21: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_wakeup' does not contain a field named 'pid'
        @qtime[args->pid] = nsecs;
        ~~~~~~~~~~~~~~~~
    stdin:9:5-21: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_wakeup_new' does not contain a field named 'pid'
        @qtime[args->pid] = nsecs;
        ~~~~~~~~~~~~~~~~
    stdin:13:8-25: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'prev_state'
        if (args->prev_state == TASK_RUNNING) {
           ~~~~~~~~~~~~~~~~~
    stdin:13:29-41: ERROR: Unknown identifier: 'TASK_RUNNING'
        if (args->prev_state == TASK_RUNNING) {
                                ~~~~~~~~~~~~
    stdin:14:12-27: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'prev_pid'
            if (args->prev_pid != 0) {
               ~~~~~~~~~~~~~~~
    stdin:15:13-34: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'prev_pid'
                @qtime[args->prev_pid] = nsecs;
                ~~~~~~~~~~~~~~~~~~~~~
    stdin:18:11-32: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'next_pid'
        $ns = @qtime[args->next_pid];
              ~~~~~~~~~~~~~~~~~~~~~
    stdin:20:25-39: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'next_pid'
        if($latency != 0 && args->next_pid != 0 && args->prev_pid != 0){
                            ~~~~~~~~~~~~~~
    stdin:20:48-62: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'prev_pid'
        if($latency != 0 && args->next_pid != 0 && args->prev_pid != 0){
                                                   ~~~~~~~~~~~~~~
    stdin:21:78-93: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'prev_comm'
            printf("time_:%d oproc:%s opid:%d lat:%lld nproc:%s npid:%d", nsecs, args->prev_comm, args->prev_pid, $latency, args->next_comm, args->next_pid);
                                                                                 ~~~~~~~~~~~~~~~
    stdin:21:95-109: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'prev_pid'
            printf("time_:%d oproc:%s opid:%d lat:%lld nproc:%s npid:%d", nsecs, args->prev_comm, args->prev_pid, $latency, args->next_comm, args->next_pid);
                                                                                                  ~~~~~~~~~~~~~~
    stdin:21:121-136: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'next_comm'
            printf("time_:%d oproc:%s opid:%d lat:%lld nproc:%s npid:%d", nsecs, args->prev_comm, args->prev_pid, $latency, args->next_comm, args->next_pid);
                                                                                                                            ~~~~~~~~~~~~~~~
    stdin:21:138-152: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'next_pid'
            printf("time_:%d oproc:%s opid:%d lat:%lld nproc:%s npid:%d", nsecs, args->prev_comm, args->prev_pid, $latency, args->next_comm, args->next_pid);
                                                                                                                                             ~~~~~~~~~~~~~~
    stdin:23:5-33: ERROR: Struct/union of type 'struct _tracepoint_sched_sched_switch' does not contain a field named 'next_pid'
        delete(@qtime[args->next_pid]);    
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    

    The above errors shouldn't be occurring as the structures in question do in fact contain these fields:

    //// struct _tracepoint_sched_sched_wakeup
    
    $ sudo cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/format
    name: sched_wakeup
    ID: 283
    format:
            field:unsigned short common_type;       offset:0;       size:2; signed:0;
            field:unsigned char common_flags;       offset:2;       size:1; signed:0;
            field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
            field:int common_pid;   offset:4;       size:4; signed:1;
    
            field:char comm[16];    offset:8;       size:16;        signed:1;
            field:pid_t pid;        offset:24;      size:4; signed:1;
            field:int prio; offset:28;      size:4; signed:1;
            field:int success;      offset:32;      size:4; signed:1;
            field:int target_cpu;   offset:36;      size:4; signed:1;
    
    print fmt: "comm=%s pid=%d prio=%d target_cpu=%03d", REC->comm, REC->pid, REC->prio, REC->target_cpu
    
    //// struct _tracepoint_sched_sched_switch
    
    $ sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
    name: sched_switch
    ID: 281
    format:
            field:unsigned short common_type;       offset:0;       size:2; signed:0;
            field:unsigned char common_flags;       offset:2;       size:1; signed:0;
            field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
            field:int common_pid;   offset:4;       size:4; signed:1;
    
            field:char prev_comm[16];       offset:8;       size:16;        signed:1;
            field:pid_t prev_pid;   offset:24;      size:4; signed:1;
            field:int prev_prio;    offset:28;      size:4; signed:1;
            field:long prev_state;  offset:32;      size:8; signed:1;
            field:char next_comm[16];       offset:40;      size:16;        signed:1;
            field:pid_t next_pid;   offset:56;      size:4; signed:1;
            field:int next_prio;    offset:60;      size:4; signed:1;
    
    print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, REC->prev_state, REC->next_comm, REC->next_pid, REC->next_prio
    

I can also confirm that our script works when we run it using the bpftrace tool. Any clue what the issue could be? Thanks in advance!

    opened by pranav-bhatt 5
  • Query Broker crashing when running a custom script

    Describe the bug

    When running the attached script using local dev mode, the query broker crashes.

    To Reproduce

I added a simple function to the px/nodes script that returns a test value and stores it in a new column called wavelength_zone.

    def getWavelength(node: str):
        return "test"
    
    def nodes(start_time: str):
        df = px.DataFrame(table='process_stats', start_time=start_time)
        df.node = df.ctx['node_name']
        df['wavelength_zone'] = getWavelength(df.node)
        return df.groupby(['node','wavelength_zone']).agg()
    

    Expected behavior I would expect either a syntax error or the script to run successfully.

    Screenshots N/A

    Logs query broker pod error.log attached.

    App information (please complete the following information):

    • Pixie version: ??
    • K8s cluster version: v1.21.2-eks-55daa9d
    • Node Kernel version: 5.4.141-67.229.amzn2.x86_64
    • Browser version: Chrome 95.0.4638.69

    Additional context N/A

    opened by bpschmitt 5
  • Cannot get valid information when running "px run px/namespaces"

Describe the bug I get nothing when running any script except bpftrace ones. The Vizier and agents seem healthy. Self-hosted Pixie was installed with "px deploy --dev_cloud_namespace plc --deploy_olm=false".

    Screenshots

    [[email protected] pixie_cli]# px run px/namespaces
    Pixie CLI
    *******************************
    * IN TESTING MODE
    * 	 PL_CLOUD_ADDR=dev.withpixie.dev
    *******************************
    Table ID: Namespaces
      NAMESPACE  POD COUNT  SERVICE COUNT  
    Table ID: Process Stats Overview by Namespace
      NAMESPACE  AVG VSIZE  AVG RSS  ACTUAL DISK READ THROUGHPUT  ACTUAL DISK WRITE THROUGHPUT  TOTAL DISK READ THROUGHPUT  TOTAL DISK WRITE THROUGHPUT  
    
    
    ==>  Live UI: https://work.dev.withpixie.dev:443/live/clusters/[email protected]?script=px%2Fnamespaces&start_time=-5m
    [[email protected] pixie_cli]# px run px/agent_status
    Pixie CLI
    *******************************
    * IN TESTING MODE
    * 	 PL_CLOUD_ADDR=dev.withpixie.dev
    *******************************
    Table ID: output
      AGENT ID                              ASID  HOSTNAME                 IP ADDRESS          AGENT STATE          CREATE TIME                              LAST HEARTBEAT NS  
      172d7fa8-f824-4cb1-89eb-dcdbfd952756  3     vm183184                                     AGENT_STATE_HEALTHY  2021-12-01 01:53:11.10538375 +0000 GMT   2390995972         
      865624ac-f170-4e79-a748-37d8cc02ee2c  1     kelvin-745846bfc6-bglsx  10.168.16.39:59300  AGENT_STATE_HEALTHY  2021-12-01 01:52:27.294379787 +0000 GMT  1623646045         
      a58193bf-4f55-4853-9132-f0bec7519821  4     vm183186                                     AGENT_STATE_HEALTHY  2021-12-01 01:53:15.766434811 +0000 GMT  2588635798         
      bb23c1fe-c274-46a2-ad16-abfe088155b3  2     vm183185                                     AGENT_STATE_HEALTHY  2021-12-01 01:53:04.856278598 +0000 GMT  3641795561         
    
    
    ==>  Live UI: https://work.dev.withpixie.dev:443/live/clusters/[email protected]?script=px%2Fagent_status
    [[email protected] pixie_cli]# px get viziers
    Pixie CLI
    *******************************
    * IN TESTING MODE
    * 	 PL_CLOUD_ADDR=dev.withpixie.dev
    *******************************
    Table ID: viziers
      CLUSTERNAME                  ID                                    K8S VERSION  VIZIER VERSION  LAST HEARTBEAT  PASSTHROUGH  STATUS      STATUS MESSAGE  
      [email protected]  8bd86533-078f-4932-a2f1-8c071c57946a  v1.22.4      0.9.11          1 second ago    true         CS_HEALTHY 
    [[email protected] pixie_cli]# kubectl logs vizier-pem-27qt7 -n pl
    I1201 07:27:18.627548 84708 exec.cc:91] Queries in flight: 0
    I1201 07:27:18.627647  7701 exec.cc:51] Executing query: id=e707aeca-26a1-4dfe-bd2d-147c8dc8fe52
    I1201 07:27:18.645769  7701 exec.cc:63] Completed query: id=e707aeca-26a1-4dfe-bd2d-147c8dc8fe52
    W1201 07:27:23.586112 84708 state_manager.cc:262] Failed to read PID info for pod=6a668241-b2aa-46aa-8626-07a7c6cd0f54, cid= [msg=Failed to open file /sys/fs/cgroup/pids/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod6a668241_b2aa_46aa_8626_07a7c6cd0f54.slice/docker-.scope/cgroup.procs]
    

    Logs pixie_logs_20211201072955.zip

    App information (please complete the following information):

    • Pixie version
    • K8s cluster version v1.22.4
    • Node Kernel version 4.18.0
    • Browser version

Additional context I found the health check gets stuck here when deploying self-hosted Pixie. It looks like the cloudAddr (api-service.plc.svc.cluster.local:51200) is not accessible, so I commented out the code, and then the deploy succeeded. But other pods can access this domain name normally.

    sh-4.2$ curl -v https://api-service.plc.svc.cluster.local:51200/healthz -k
    * About to connect() to api-service.plc.svc.cluster.local port 51200 (#0)
    *   Trying 10.100.228.181...
    * Connected to api-service.plc.svc.cluster.local (10.100.228.181) port 51200 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * skipping SSL peer certificate verification
    * SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    * Server certificate:
    * 	subject: CN=pixie.local,O=Pixie
    * 	start date: Dec 01 01:24:44 2021 GMT
    * 	expire date: Dec 01 01:24:44 2022 GMT
    * 	common name: pixie.local
    * 	issuer: CN=pixie.local,O=Pixie
    > GET /healthz HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: api-service.plc.svc.cluster.local:51200
    > Accept: */*
    > 
    < HTTP/1.1 200 OK
    < Date: Thu, 02 Dec 2021 02:17:43 GMT
    < Content-Length: 35
    < Content-Type: text/plain; charset=utf-8
    < 
    OK
    [+]ping OK
    healthz check passed
    * Connection #0 to host api-service.plc.svc.cluster.local left intact
    sh-4.2$ exit
    
    

I am not sure if this workaround caused the current problem. I redeployed Pixie with an unmodified Pixie CLI; the health check failed, and running scripts still gets nothing.

    opened by Emin3mU 0
  • Integrate mux parser into stirling

    This is my second attempt at integrating the mux parser into stirling. I'm opening this preemptively to show that the testing for #366 was successful, but we need to merge that PR first and rebase to pull out the docker image related changes.

    Todo

    • [x] Verified bazel built thriftmux image works for the mux_trace_bpf_test (added separately in #366)
    • [x] Existing test from #351 passes
    • [ ] Fix issue that is preventing a subset of mux data from being collected
    opened by ddelnano 1
  • Add the base for thriftmux scala docker images (layers) used for end to end bpf testing

    This PR pulls out the thriftmux container changes necessary for #351. It also switches the build process to use bazel entirely rather than using docker directly.

    Testing

    • [x] Used this change in #367 to verify that it works properly
    opened by ddelnano 1
  • add support for pod nodeSelector in helm and operator

Vizier only runs on Linux nodes. Add support for nodeSelector so consumers can restrict the nodes where Vizier pods can run. Signed-off-by: smcavallo [email protected]

    (@aimichelle - replaces https://github.com/pixie-io/pixie/pull/359)

    opened by smcavallo 1
  • UDF to identify SQL injections in PostgreSQL via regex rules

    In order to identify SQL injection attacks inside SQL queries we need to implement a UDF that returns whether or not the input string contains a SQL injection attack.

Describe the solution you'd like Implement a UDF that takes a string as input. It should test the string against a list of regular expressions and, on the first match, return a string naming the regex rule that matched, identifying the input as a SQL injection. If none of the regular expressions match, it should return an empty string, indicating the input was not a SQL injection attack. This function would be called as part of a PxL script that passes it SQL query data. Note this function is a first pass at identifying SQL injections and will eventually be replaced by an ML model.

    Regular expression rules:

    "comment_dash_rule": ".*--.*"
    "comment_slash_rule": ".*\/\*.*"
    "semicolon_rule": ".*;.+.*"
    "unmatched_single_quotes_rule": "^([^']*'([^']*'[^']*')*[^']*')[^']*'[^']*$"
    "union_rule": "(?i).*UNION.*"
    "char_casting_rule": "(?i).*chr(\(|%28).*"
    "system_catalog_access_rule": "(?i).*from\s+pg_.*"
    "always_true_rule": "(?i).*OR\s+(['\w]+)=\1.*"
    
    

Pseudocode (as a runnable Python sketch):

    import re

    def matches_postgresql_injection_rule(sql_query, regular_expression_rules):
        # regular_expression_rules maps rule name -> regex pattern (see the rules above).
        for rule, pattern in regular_expression_rules.items():
            if re.match(pattern, sql_query):
                return rule
        return ""
    

    Describe alternatives you've considered One alternative is to use a generic UDF that takes in a list of regular expression rules as opposed to making this function SQL injection specific and hard coding the regex rules inside it.

    opened by hmstepanek 0
  • UDF to identify SQL injections in MySQL via regex rules

    In order to identify SQL injection attacks inside SQL queries we need to implement a UDF that returns whether or not the input string contains a SQL injection attack.

Describe the solution you'd like Implement a UDF that takes a string as input. It should test the string against a list of regular expressions and, on the first match, return a string naming the regex rule that matched, identifying the input as a SQL injection. If none of the regular expressions match, it should return an empty string, indicating the input was not a SQL injection attack. This function would be called as part of a PxL script that passes it SQL query data. Note this function is a first pass at identifying SQL injections and will eventually be replaced by an ML model.

    Regular expression rules:

    "comment_dash_rule": ".*--.*"
    "comment_hashtag_rule": ".*#.*"
    "comment_slash_rule": ".*\/\*.*"
    "semicolon_rule": ".*;.+.*"
    "unmatched_single_quotes_rule": "^([^']*'([^']*'[^']*')*[^']*')[^']*'[^']*$"
    "unmatched_double_quotes_rule": '^([^"]*"([^"]*"[^"]*")*[^"]*")[^"]*"[^"]*$'
    "union_rule": "(?i).*UNION.*"
    "char_casting_rule": "(?i).*chr(\(|%28).*"
    "system_catalog_access_rule": "(?i).*from\s+mysql.*"
    "always_true_rule": "(?i).*OR\s+(['\w]+)=\1.*"
    
    

Pseudocode (as a runnable Python sketch):

    import re

    def matches_mysql_injection_rule(sql_query, regular_expression_rules):
        # regular_expression_rules maps rule name -> regex pattern (see the rules above).
        for rule, pattern in regular_expression_rules.items():
            if re.match(pattern, sql_query):
                return rule
        return ""
    

    Describe alternatives you've considered One alternative is to use a generic UDF that takes in a list of regular expression rules as opposed to making this function SQL injection specific and hard coding the regex rules inside it.

    opened by hmstepanek 0
  • UDF to identify XSS attacks via regex rules

    In order to identify reflected and stored Cross Site Scripting (XSS) attacks inside HTTP requests and SQL queries we need to implement a UDF that returns whether or not the input string contains a XSS attack.

Describe the solution you'd like Implement a UDF that takes a string as input. It should test the string against a list of regular expressions and, on the first match, return a string naming the regex rule that matched, identifying the input as an XSS attack. If none of the regular expressions match, it should return an empty string, indicating the input was not an XSS attack. This function would be called as part of a PxL script that passes it both HTTP request and response data as well as SQL query data.

    Regular expression rules:

    "img_tag": "(?i).*(<|%3C)\s*img.*"
    "iframe_tag": "(?i).*(<|%3C)\s*iframe.*"
    "object_tag": "(?i).*(<|%3C)\s*object.*"
    "embed_tag": "(?i).*(<|%3C)\s*embed.*"
    "script_tag": "(?i).*(<|%3C)\s*script.*"
    "alert_event": "(?i).*[\s\"\'`;\/0-9=\x0B\x09\x0C\x3B\x2C\x28\x3B]alert(.*"
    "href_property": "(?i).*[\s\"\'`;\/0-9=\x0B\x09\x0C\x3B\x2C\x28\x3B]href[\s\x0B\x09\x0C\x3B\x2C\x28\x3B]*?=[^=].*"
    "src_property": "(?i).*[\s\"\'`;\/0-9=\x0B\x09\x0C\x3B\x2C\x28\x3B]src[\s\x0B\x09\x0C\x3B\x2C\x28\x3B]*?=[^=].*"
    "flash_command_event": "(?i).*i[\s\"\'`;\/0-9=\x0B\x09\x0C\x3B\x2C\x28\x3B]fscommand[\s\x0B\x09\x0C\x3B\x2C\x28\x3B]*?=[^=].*"
    # Pulled from https://github.com/coreruleset/coreruleset/blob/v3.4/dev/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf.
    "event": "(?i).*[\s\"\'`;\/0-9=\x0B\x09\x0C\x3B\x2C\x28\x3B]on[a-zA-Z]{3,25}[\s\x0B\x09\x0C\x3B\x2C\x28\x3B]*?=[^=].*"
    "attribute_vector": "(?i).*[\s\S](?:\b(?:x(?:link:href|html|mlns)|data:text\/html|pattern\b.*?=|formaction)|!ENTITY\s+(?:\S+|%\s+\S+)\s+(?:PUBLIC|SYSTEM)|;base64|@import)\b.*"
    "javascript_uri_and_tags": "(?i).*[a-z]+=(?:[^:=]+:.+;)*?[^:=]+:url\(javascript.*"
    

Pseudocode (as a runnable Python sketch):

    import re

    def matches_xss_rule(string, regular_expression_rules):
        # regular_expression_rules maps rule name -> regex pattern (see the rules above).
        for rule, pattern in regular_expression_rules.items():
            if re.match(pattern, string):
                return rule
        return ""
    

    Describe alternatives you've considered One alternative is to use a generic UDF that takes in a list of regular expression rules as opposed to making this function XSS specific and hard coding the regex rules inside it.

    opened by hmstepanek 0
  • Networking Charts Not Working with End to End Encryption in AWS AppMesh

Is your feature request related to a problem? Please describe. We are currently using end-to-end encryption with AWS App Mesh, and I believe it is causing an issue viewing network flows in Pixie. For example, the flows in charts like HTTP Service Map, DNS Flow Chart, and TCP Drops do not show up. We can see each of the services, pods, etc., but the network flow that we'd expect to see is not present.

Describe the solution you'd like I need to know whether this is a known issue and, if so, what the workaround or fix is. Is Pixie supported in environments with full end-to-end encryption? Is Pixie supported in AWS App Mesh? What about if both are enabled?

    opened by mrytower05 0
  • Kafka Consumer lag with high-throughput topics

    Describe the bug

    When using the Kafka Consumer lag script, if there is too much throughput, most of the data is discarded and the result is empty. The rest of Kafka stats work as usual.

The size of the data buffer in Pixie is currently limited to 1 megabyte per connection, but the perf tests are sending well over that amount of data, so most of it gets dropped. After bumping up the buffer_size and table_store_size_limit, the consumer_latency script should work.

    To Reproduce

    Launch the consumer-perf/producer-perf test and load the kafka_producer_consumer_latency script.

    Expected behavior

    We should be able to see the latency in the graphs.

    App information (please complete the following information):

    • Pixie version: latest
    • K8s cluster version: GKE
    opened by antonmry 0
  • How to deploy self hosted pixie cloud?

I tried deploying the self-hosted Pixie Cloud into a GKE cluster following the guide in the Pixie docs. By default, the Pixie services create internal LoadBalancers for the domain names dev.withpixie.dev, work.dev.withpixie.dev, segment.dev.withpixie.dev, and docs.dev.withpixie.dev, which I'm not able to access from outside the network. So I manually changed the two services cloud-proxy-service and vzconn-service to external LBs, but the domain names are still not accessible.

cloud-proxy-service LoadBalancer 10.116.14.68 35.247.191.178 56000:31810/TCP,56004:30847/TCP - this service in my cluster resolves the work.dev.* domain name, which gives a DNS_PROBE_FINISHED_NXDOMAIN error in my browser.

    Can anyone help with this issue?

    opened by ark-xs 0