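1. Problem description
Yesterday, after installing Neutron by following the official installation guide, I set out to launch my first VM on OpenStack. After running nova boot xxxxxxx, I checked the VM state and found that its status was ERROR: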
controller# nova list
+--------------------------------------+----------------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+----------------+--------+------------+-------------+----------+
| b7eb4c0f-ef39-4034-b4a9-1d9f7f90b553 | demo-instance1 | ERROR | - | NOSTATE | |
+--------------------------------------+----------------+--------+------------+-------------+----------+
2. Investigation
2.1 Checking the controller node
2.1.1 Query the nova error message:
nova show demo-instance1
+--------------------------------------+------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------+------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2014-10-04T13:56:37Z |
| fault | {"message": "No valid host was found.", "code": 500, "created": "2014-10-03T09:50:40Z"} |
| flavor | m1.tiny (1) |
| hostId | |
| id | b7eb4c0f-ef39-4034-b4a9-1d9f7f90b553 |
| image | cirros-0.3.3-x86_64 (77c0d5f8-1bcc-4937-932c-72f4b0eccbc3) |
| key_name | demo-key |
| metadata | {} |
| name | demo-instance1 |
| os-extended-volumes:volumes_attached | [] |
| status | ERROR |
| tenant_id | 7539436331ca4f9783bf93163e2a2e0f |
| updated | 2014-10-04T22:44:43Z |
| user_id | bc1ae50e167f45edb064e582702c5792 |
+--------------------------------------+------------------------------------------------------------------------------------------------------------------------+
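The only diagnostic in the table above is the fault row, which holds a small JSON document. As a minimal sketch, assuming the cell has been copied out of the nova show output by hand, it can be decoded like this (the helper name summarize_fault is made up for illustration):

```python
import json

# The `fault` cell copied verbatim from the `nova show` table above.
fault_cell = (
    '{"message": "No valid host was found.", "code": 500, '
    '"created": "2014-10-03T09:50:40Z"}'
)


def summarize_fault(cell):
    """Decode a Nova fault JSON string into a one-line summary."""
    fault = json.loads(cell)
    return "{code}: {message} (at {created})".format(**fault)


print(summarize_fault(fault_cell))
```

The message only says that scheduling failed; it does not say why, which is why the logs and the database are checked next.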
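2.1.2 Check /etc/nova/nova-api.log
The following messages appear (on a default package install, the Nova logs are normally found under /var/log/nova/ rather than /etc/nova/):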
2014-10-04 13:56:38.074 1347 INFO nova.osapi_compute.wsgi.server [req-18fd79ad-0a01-48a0-b8f0-29fac56a5c09 bc1ae50e167f45edb064e582702c5792 7539436331ca4f9783bf93163e2a2e0f] 10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/images/77c0d5f8-1bcc-4937-932c-72f4b0eccbc3 HTTP/1.1" status: 200 len: 894 time: 0.4527042
2014-10-04 13:56:38.092 1347 INFO nova.api.openstack.wsgi [req-d508d2a8-4a70-46d6-b11e-e21da95224be bc1ae50e167f45edb064e582702c5792 7539436331ca4f9783bf93163e2a2e0f] HTTP exception thrown: The resource could not be found.
2014-10-04 13:56:38.094 1347 INFO nova.osapi_compute.wsgi.server [req-d508d2a8-4a70-46d6-b11e-e21da95224be bc1ae50e167f45edb064e582702c5792 7539436331ca4f9783bf93163e2a2e0f] 10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/flavors/m1.tiny HTTP/1.1" status: 404 len: 272 time: 0.0192289
2014-10-04 13:56:38.106 1347 INFO nova.osapi_compute.wsgi.server [req-9b6159e7-67ff-454e-a711-46c626354c7d bc1ae50e167f45edb064e582702c5792 7539436331ca4f9783bf93163e2a2e0f] 10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/flavors HTTP/1.1" status: 200 len: 1383 time: 0.0111101
2014-10-04 13:56:38.117 1347 INFO nova.osapi_compute.wsgi.server [req-a84e3545-5d20-456f-abad-5e8d5f7bc634 bc1ae50e167f45edb064e582702c5792 7539436331ca4f9783bf93163e2a2e0f] 10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/flavors HTTP/1.1" status: 200 len: 1383 time: 0.0103061
2014-10-04 13:56:38.130 1347 INFO nova.osapi_compute.wsgi.server [req-f5888737-0e2a-4c43-bffa-7f61479f3844 bc1ae50e167f45edb064e582702c5792 7539436331ca4f9783bf93163e2a2e0f] 10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/flavors/1 HTTP/1.1" status: 200 len: 591 time: 0.0125630
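To confirm that nothing in this log points at the real fault, a rough sketch like the following pulls out only the requests that did not return a 2xx status. The regular expression and the name failed_requests are my own, and the sample lines are the tails of three log entries above, shortened to the part after the request ID. It assumes the request is wrapped in straight double quotes.

```python
import re

# Request path and status as they appear in nova-api wsgi.server lines.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" status: (?P<status>\d{3})'
)


def failed_requests(lines):
    """Return (method, path, status) for every request that is not 2xx."""
    failures = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and not match.group("status").startswith("2"):
            failures.append(
                (match.group("method"), match.group("path"), int(match.group("status")))
            )
    return failures


sample = [
    '10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/images/77c0d5f8-1bcc-4937-932c-72f4b0eccbc3 HTTP/1.1" status: 200 len: 894 time: 0.4527042',
    '10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/flavors/m1.tiny HTTP/1.1" status: 404 len: 272 time: 0.0192289',
    '10.0.0.11 "GET /v2/7539436331ca4f9783bf93163e2a2e0f/flavors/1 HTTP/1.1" status: 200 len: 591 time: 0.0125630',
]

print(failed_requests(sample))
```

The single 404 is for flavors/m1.tiny and is immediately followed by a flavor listing and a successful lookup of flavors/1. My reading is that this is the client resolving the flavor name to its ID, not an error related to the failed boot.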
2.1.3 Query the Nova DB
The error detail recorded in nova.instance_faults reads:
File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", line 108, in schedule_run_instance
raise exception.NoValidHost(reason="")
From this we can infer that the nova scheduler could not find a suitable compute node to act as the host.
2.2 Checking the compute node
2.2.1 Check /etc/nova/nova-compute.log
The following messages appear:
2014-10-04 13:56:11.489 18675 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on controller:5672
2014-10-04 13:56:11.489 18675 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds…
2014-10-04 13:56:15.501 18675 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: Socket closed. Trying again in 5 seconds.
2014-10-04 13:56:20.504 18675 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on controller:5672
2014-10-04 13:56:20.505 18675 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds…
2014-10-04 13:56:24.520 18675 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: Socket closed. Trying again in 7 seconds.
2014-10-04 13:56:31.525 18675 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on controller:5672
2014-10-04 13:56:31.525 18675 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds…
2014-10-04 13:56:35.536 18675 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: Socket closed. Trying again in 9 seconds.
... (remainder omitted)
The log shows that the compute node cannot communicate with the RabbitMQ service.
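The repeated reconnect attempts can be summarized with a small script. This is only a sketch: the regular expression and the function name amqp_failures are my own, and the sample lines are the message parts of three entries from the log above.

```python
import re

# Matches the ERROR lines that oslo.messaging wrote in the log above.
UNREACHABLE_RE = re.compile(
    r"AMQP server on (?P<host>[\w.-]+):(?P<port>\d+) is unreachable: "
    r"(?P<reason>.+?)\. Trying again"
)


def amqp_failures(lines):
    """Return (host, port, reason) for each 'unreachable' log line."""
    failures = []
    for line in lines:
        match = UNREACHABLE_RE.search(line)
        if match:
            failures.append(
                (match.group("host"), int(match.group("port")), match.group("reason"))
            )
    return failures


sample = [
    "ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: Socket closed. Trying again in 5 seconds.",
    "INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on controller:5672",
    "ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on controller:5672 is unreachable: Socket closed. Trying again in 7 seconds.",
]

for host, port, reason in amqp_failures(sample):
    print(host, port, reason)
```

The reason is worth noting. "Socket closed" suggests that the TCP connection to controller:5672 was opened and then dropped by the broker, which in my experience often points to rejected credentials and not to a network or firewall problem. That is a hint, not proof.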
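I also found a diagram on the official site that illustrates the complete flow Nova follows when launching a VM instance:
[Figure: Nova VM instance launch flow, from the official documentation]
The problem lies in steps 4 to 8 of that flow. The compute node cannot talk to the queue (the RabbitMQ service), so it cannot report to the controller that a compute node is available. As a result, the nova scheduler finds no suitable compute node and never dispatches a deployment message to the queue.
Because this environment has only one compute node, the Nova scheduler is left with no compute node on which to deploy the VM instance, which produces the error.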
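The reasoning above can be illustrated with a toy model. This is not Nova's scheduler code: the NoValidHost class here is a stand-in for nova.exception.NoValidHost, pick_host is a hypothetical helper, and the RAM figures are placeholders.

```python
class NoValidHost(Exception):
    """Stand-in for nova.exception.NoValidHost; not the real class."""


def pick_host(registered_hosts, required_ram_mb):
    """Toy scheduler: pick a host that has enough free RAM.

    registered_hosts maps host name -> free RAM in MB and contains only
    the hosts that managed to report in over the message queue.
    """
    candidates = [
        name for name, free_mb in registered_hosts.items()
        if free_mb >= required_ram_mb
    ]
    if not candidates:
        raise NoValidHost("No valid host was found.")
    return sorted(candidates)[0]


# With the only compute node cut off from RabbitMQ, nothing is registered.
try:
    pick_host({}, 512)
except NoValidHost as exc:
    print("scheduling failed:", exc)
```

Once the single compute node can report in again, the same call with a non-empty dictionary returns that host.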
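3. Solution
Having confirmed that the compute node could not communicate with RabbitMQ, I first checked whether the RabbitMQ username and password in /etc/nova/nova.conf were correct.
It turned out the password was wrong, which explains why the compute node could never reach RabbitMQ. After correcting it and restarting the nova-compute service, VMs deployed normally.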
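For the configuration check, a short script can print the RabbitMQ settings that nova-compute will use. This is a sketch under assumptions: the option names (rabbit_host, rabbit_port, rabbit_userid, rabbit_password under [DEFAULT]) follow the layout I understand releases of that era to use, the fallback values are what I believe the library defaults to be, and every value in the sample file is a placeholder.

```python
import configparser

# Hypothetical nova.conf excerpt; all values are placeholders.
SAMPLE_NOVA_CONF = """
[DEFAULT]
rpc_backend = rabbit
rabbit_host = controller
rabbit_userid = guest
rabbit_password = RABBIT_PASS
"""


def rabbit_settings(conf_text):
    """Extract the RabbitMQ connection settings from nova.conf text."""
    # interpolation=None keeps '%' in passwords literal; strict=False
    # tolerates duplicate keys in hand-edited files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    defaults = parser.defaults()
    return {
        "rabbit_host": defaults.get("rabbit_host", "localhost"),
        "rabbit_port": int(defaults.get("rabbit_port", "5672")),
        "rabbit_userid": defaults.get("rabbit_userid", "guest"),
        # Report only whether a password is present, never its value.
        "password_set": "rabbit_password" in defaults,
    }


print(rabbit_settings(SAMPLE_NOVA_CONF))
```

The script deliberately does not print the password. The value in nova.conf still has to be compared by hand with the one configured for that user on the RabbitMQ server.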