Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: master-20211129-cb952d6
- Deployment mode (standalone or cluster): cluster
- SDK version (e.g. pymilvus v2.0.0rc2): pymilvus-2.0.0rc9.dev7
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
client pod: benchmark-tag-8k2zj-989751322
client logs:
[2021-11-30 05:05:01,227] [ DEBUG] - Row count: 999875199 in collection: <sift_1b_128_l2> (milvus_benchmark.client:416)
[2021-11-30 05:05:01,228] [ DEBUG] - 999875199 (milvus_benchmark.runners.base:89)
[2021-11-30 05:05:01,229] [ INFO] - {'total_time': 49243.11, 'rps': 20307.41, 'ni_time': 2.46} (milvus_benchmark.runners.base:151)
[2021-11-30 05:05:01,346] [ DEBUG] - Start flush. (milvus_benchmark.runners.locust:428)
[2021-11-30 05:05:04,305] [ DEBUG] - Milvus flush run in 2.96s (milvus_benchmark.client:52)
[2021-11-30 05:05:04,305] [ DEBUG] - Fulsh done, during time: 2.96 (milvus_benchmark.runners.locust:431)
[2021-11-30 05:05:04,311] [ DEBUG] - Row count: 999925199 in collection: <sift_1b_128_l2> (milvus_benchmark.client:416)
[2021-11-30 05:05:04,311] [ DEBUG] - 999925199 (milvus_benchmark.runners.locust:432)
[2021-11-30 05:05:04,312] [ DEBUG] - Start build index for last file (milvus_benchmark.runners.locust:434)
[2021-11-30 05:05:04,313] [ INFO] - Building index start, collection_name: sift_1b_128_l2, index_type: IVF_SQ8, metric_type: L2 (milvus_benchmark.client:273)
[2021-11-30 05:05:04,313] [ INFO] - {'nlist': 1024} (milvus_benchmark.client:275)
[2021-11-30 05:05:04,314] [ DEBUG] - collection: sift_1b_128_l2 Index params: {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 1024}} (milvus_benchmark.client:281)
[2021-11-30 05:05:52,106] [ DEBUG] - Building index done, collection_name: sift_1b_128_l2, response: Status(code=0, message='') (milvus_benchmark.client:283)
[2021-11-30 05:05:52,107] [ DEBUG] - Milvus create_index run in 47.79s (milvus_benchmark.client:52)
[2021-11-30 05:05:52,107] [ DEBUG] - {'flush_time': 2.96, 'build_time': 47.79} (milvus_benchmark.runners.locust:438)
[2021-11-30 05:05:52,112] [ DEBUG] - Row count: 999925199 in collection: <sift_1b_128_l2> (milvus_benchmark.client:416)
[2021-11-30 05:05:52,112] [ INFO] - 999925199 (milvus_benchmark.runners.locust:439)
[2021-11-30 05:05:52,113] [ INFO] - Start load collection (milvus_benchmark.runners.locust:440)
[2021-11-30 07:13:25,695] [ ERROR] - Error: <BaseException: (code=1, message=err: rpc error: code = Unknown desc = collection 429437365324811585 has not been loaded to memory or load failed
, /usr/local/go/src/runtime/extern.go:216 runtime.Callers
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:25 github.com/milvus-io/milvus/internal/util/trace.StackTraceMsg
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:43 github.com/milvus-io/milvus/internal/util/trace.StackTrace
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:215 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).recall
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:297 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).ShowCollections
/go/src/github.com/milvus-io/milvus/internal/proxy/task.go:2836 github.com/milvus-io/milvus/internal/proxy.(*showCollectionsTask).Execute
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:458 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:486 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).definitionLoop
/usr/local/go/src/runtime/asm_amd64.s:1374 runtime.goexit
)> (pymilvus.client.grpc_handler:69)
[2021-11-30 07:13:25,734] [ ERROR] - Error: <BaseException: (code=1, message=err: rpc error: code = Unknown desc = collection 429437365324811585 has not been loaded to memory or load failed
, /usr/local/go/src/runtime/extern.go:216 runtime.Callers
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:25 github.com/milvus-io/milvus/internal/util/trace.StackTraceMsg
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:43 github.com/milvus-io/milvus/internal/util/trace.StackTrace
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:215 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).recall
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:297 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).ShowCollections
/go/src/github.com/milvus-io/milvus/internal/proxy/task.go:2836 github.com/milvus-io/milvus/internal/proxy.(*showCollectionsTask).Execute
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:458 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:486 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).definitionLoop
/usr/local/go/src/runtime/asm_amd64.s:1374 runtime.goexit
)> (pymilvus.client.grpc_handler:69)
[2021-11-30 07:13:25,735] [ ERROR] - <BaseException: (code=1, message=err: rpc error: code = Unknown desc = collection 429437365324811585 has not been loaded to memory or load failed
, /usr/local/go/src/runtime/extern.go:216 runtime.Callers
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:25 github.com/milvus-io/milvus/internal/util/trace.StackTraceMsg
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:43 github.com/milvus-io/milvus/internal/util/trace.StackTrace
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:215 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).recall
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:297 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).ShowCollections
/go/src/github.com/milvus-io/milvus/internal/proxy/task.go:2836 github.com/milvus-io/milvus/internal/proxy.(*showCollectionsTask).Execute
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:458 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:486 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).definitionLoop
/usr/local/go/src/runtime/asm_amd64.s:1374 runtime.goexit
)> (milvus_benchmark.main:117)
[2021-11-30 07:13:25,742] [ ERROR] - Traceback (most recent call last):
File "main.py", line 86, in run_suite
runner.prepare(**cases[0])
File "/src/milvus_benchmark/runners/locust.py", line 442, in prepare
self.milvus.load_collection()
File "/src/milvus_benchmark/client.py", line 48, in wrapper
result = func(*args, **kwargs)
File "/src/milvus_benchmark/client.py", line 478, in load_collection
return self._milvus.load_collection(collection_name, timeout=timeout)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/stub.py", line 58, in handler
raise e
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/stub.py", line 42, in handler
return func(self, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/stub.py", line 322, in load_collection
return handler.load_collection("", collection_name=collection_name, timeout=timeout, **kwargs)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 75, in handler
raise e
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 67, in handler
return func(self, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 823, in load_collection
self.wait_for_loading_collection(collection_name, timeout)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 75, in handler
raise e
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 67, in handler
return func(self, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 841, in wait_for_loading_collection
return self._wait_for_loading_collection_v2(collection_name, timeout)
File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 868, in _wait_for_loading_collection_v2
raise BaseException(response.status.error_code, response.status.reason)
pymilvus.client.exceptions.BaseException: <BaseException: (code=1, message=err: rpc error: code = Unknown desc = collection 429437365324811585 has not been loaded to memory or load failed
, /usr/local/go/src/runtime/extern.go:216 runtime.Callers
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:25 github.com/milvus-io/milvus/internal/util/trace.StackTraceMsg
/go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:43 github.com/milvus-io/milvus/internal/util/trace.StackTrace
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:215 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).recall
/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:297 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).ShowCollections
/go/src/github.com/milvus-io/milvus/internal/proxy/task.go:2836 github.com/milvus-io/milvus/internal/proxy.(*showCollectionsTask).Execute
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:458 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).processTask
/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:486 github.com/milvus-io/milvus/internal/proxy.(*taskScheduler).definitionLoop
/usr/local/go/src/runtime/asm_amd64.s:1374 runtime.goexit
)>
(milvus_benchmark.main:118)
[2021-11-30 07:13:25,748] [ DEBUG] - {'_version': '0.1', '_type': 'metric', 'run_id': 1638173528, 'mode': 'local', 'server': <milvus_benchmark.metrics.models.server.Server object at 0x7f000ea82358>, 'hardware': <milvus_benchmark.metrics.models.hardware.Hardware object at 0x7f000ea82208>, 'env': <milvus_benchmark.metrics.models.env.Env object at 0x7f000ea82128>, 'status': 'RUN_FAILED', 'err_message': '', 'collection': {'dimension': 128, 'metric_type': 'l2', 'dataset_name': 'sift_1b_128_l2', 'collection_size': 1000000000, 'other_fields': None, 'ni_per': 50000, 'shards_num': None}, 'index': {'index_type': 'ivf_sq8', 'index_param': {'nlist': 1024}}, 'search': None, 'run_params': {'task': {'types': [{'type': 'query', 'weight': 20, 'params': {'top_k': 10, 'nq': 10, 'search_param': {'nprobe': 16}}}, {'type': 'load', 'weight': 1}, {'type': 'get', 'weight': 2, 'params': {'ids_length': 10}}], 'connection_num': 1, 'clients_num': 20, 'spawn_rate': 2, 'during_time': 864000}, 'connection_type': 'single'}, 'metrics': {'type': 'locust_random_performance', 'value': {}}, 'datetime': '2021-11-29 08:12:08.166657', 'type': 'metric'} (milvus_benchmark.metric.api:29)
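To put numbers on the log above, here is a minimal sketch that computes how long `load_collection` waited before the error was raised, plus two row-count differences. All values are copied from the log lines; the constant and function names are hypothetical, chosen only for this illustration, and the sketch draws no conclusion about the root cause.

```python
from datetime import datetime

# Timestamps copied verbatim from the client log.
LOAD_START = "2021-11-30 05:05:52,113"  # "Start load collection"
LOAD_ERROR = "2021-11-30 07:13:25,695"  # first ERROR from pymilvus.client.grpc_handler

# The log uses a comma before the millisecond part.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Row counts and sizes reported in the log and in the final metric record.
ROWS_BEFORE_FLUSH = 999875199
ROWS_AFTER_FLUSH = 999925199
COLLECTION_SIZE = 1000000000  # 'collection_size' in the metric record
NI_PER = 50000                # 'ni_per' in the metric record

load_wait = elapsed_seconds(LOAD_START, LOAD_ERROR)
flush_delta = ROWS_AFTER_FLUSH - ROWS_BEFORE_FLUSH
shortfall = COLLECTION_SIZE - ROWS_AFTER_FLUSH

print(f"load_collection waited {load_wait:.3f} s ({load_wait / 3600:.2f} h) before failing")
print(f"rows that became visible after the last flush: {flush_delta} (ni_per = {NI_PER})")
print(f"rows short of collection_size: {shortfall}")
```

On these values the wait comes out at about 2 hours 7 minutes, the last flush added exactly one `ni_per` batch to the visible row count, and the reported row count stays 74801 below `collection_size`.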
Expected Behavior
No response
Steps To Reproduce
No response
Anything else?
argo task: benchmark-tag-8k2zj
test yaml:
client-configmap: client-random-locust-search-84h-1b
server-configmap: server-cluster-8c64m-datanode2-indexnode4-querynode6-nocompaction
server:
NAME                                                         READY   STATUS      RESTARTS   AGE   IP             NODE                      NOMINATED NODE   READINESS GATES
benchmark-tag-8k2zj-1-etcd-0                                 1/1     Running     0          23h   10.97.17.186   qa-node014.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-etcd-1                                 1/1     Running     0          23h   10.97.17.187   qa-node014.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-etcd-2                                 1/1     Running     0          23h   10.97.17.185   qa-node014.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-datacoord-599c7f4cc8-ghg4z      1/1     Running     0          23h   10.97.8.105    qa-node006.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-datanode-d865f756f-9x4bq        1/1     Running     0          23h   10.97.11.185   qa-node009.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-datanode-d865f756f-lnlnv        1/1     Running     0          23h   10.97.11.184   qa-node009.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-indexcoord-8547476c8f-wvfrp     1/1     Running     0          23h   10.97.9.58     qa-node007.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-indexnode-6b5d94dc7d-8d8cs      1/1     Running     0          23h   10.97.5.172    qa-node003.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-indexnode-6b5d94dc7d-cxt5m      1/1     Running     0          23h   10.97.10.101   qa-node008.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-indexnode-6b5d94dc7d-f6s6k      1/1     Running     0          23h   10.97.15.124   qa-node012.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-indexnode-6b5d94dc7d-lh785      1/1     Running     0          23h   10.97.13.106   qa-node010.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-proxy-79bdcb98bd-fww7f          1/1     Running     0          23h   10.97.9.54     qa-node007.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querycoord-6fdc5c6566-9z29v     1/1     Running     0          23h   10.97.8.104    qa-node006.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querynode-98996b9b8-4mbhq       1/1     Running     0          23h   10.97.12.141   qa-node015.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querynode-98996b9b8-brhtn       1/1     Running     0          23h   10.97.12.142   qa-node015.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querynode-98996b9b8-cmzmm       1/1     Running     0          23h   10.97.14.236   qa-node011.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querynode-98996b9b8-fxprl       1/1     Running     0          23h   10.97.15.125   qa-node012.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querynode-98996b9b8-l8lm8       1/1     Running     0          23h   10.97.10.102   qa-node008.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-querynode-98996b9b8-lqq78       1/1     Running     0          23h   10.97.3.114    qa-node001.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-milvus-rootcoord-7f4d679bc4-8klcn      1/1     Running     0          23h   10.97.9.55     qa-node007.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-minio-0                                1/1     Running     0          23h   10.97.9.60     qa-node007.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-minio-1                                1/1     Running     0          23h   10.97.4.101    qa-node002.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-minio-2                                1/1     Running     0          23h   10.97.6.85     qa-node004.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-minio-3                                1/1     Running     0          23h   10.97.6.86     qa-node004.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-autorecovery-58f6b6bbd6-hlpgd   1/1     Running     0          23h   10.97.8.106    qa-node006.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-bastion-55b7db56-cd664          1/1     Running     0          23h   10.97.15.123   qa-node012.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-bookkeeper-0                    1/1     Running     0          23h   10.97.9.62     qa-node007.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-bookkeeper-1                    1/1     Running     0          23h   10.97.13.107   qa-node010.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-broker-5f7c58cc86-9zwx9         1/1     Running     0          23h   10.97.14.235   qa-node011.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-proxy-7cb5577568-w6v9p          2/2     Running     0          23h   10.97.7.247    qa-node005.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-zookeeper-0                     1/1     Running     0          23h   10.97.9.61     qa-node007.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-zookeeper-1                     1/1     Running     0          23h   10.97.6.87     qa-node004.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-zookeeper-2                     1/1     Running     0          23h   10.97.6.88     qa-node004.zilliz.local   <none>           <none>
benchmark-tag-8k2zj-1-pulsar-zookeeper-metadata-dq9xb        0/1     Completed   0          23h   10.97.3.113    qa-node001.zilliz.local   <none>           <none>
Labels: kind/bug, triage/accepted, stale, test/benchmark