Nova 启动虚拟机流程解析

前言

Nova 启动虚拟机的东西太多，持续更新…

从请求说起

无论是通过 Dashboard 还是 CLI 启动一个虚拟机，发送的是 POST /servers请求，改与该请求的 Body 详情，可以浏览官方文档 Create server。

nova-api service 阶段

Nova API Service 本质是一个 WSGI Application，采用了 Paste + PasteDeploy + Routes + WebOb 框架，简称 PPRW。关于架构的实现不属于本文范畴，所以直接看接收到 POST /servers 请求之后的处理函数（View Method）create。

NOTE：首先需要说明的是，下文中所有的代码解析均直接通过注释的方式呈现完成。

# File: /opt/stack/nova/nova/api/openstack/compute/servers.py

    @wsgi.response(202)
    @wsgi.expected_errors((400, 403, 409))
    @validation.schema(schema_servers.base_create_v20, '2.0', '2.0')
    @validation.schema(schema_servers.base_create, '2.1', '2.18')
    @validation.schema(schema_servers.base_create_v219, '2.19', '2.31')
    @validation.schema(schema_servers.base_create_v232, '2.32', '2.32')
    @validation.schema(schema_servers.base_create_v233, '2.33', '2.36')
    @validation.schema(schema_servers.base_create_v237, '2.37', '2.41')
    @validation.schema(schema_servers.base_create_v242, '2.42', '2.51')
    @validation.schema(schema_servers.base_create_v252, '2.52', '2.56')
    @validation.schema(schema_servers.base_create_v257, '2.57', '2.62')
    @validation.schema(schema_servers.base_create_v263, '2.63', '2.66')
    @validation.schema(schema_servers.base_create_v267, '2.67')
    def create(self, req, body):
        """Creates a new server for a given user."""
        context = req.environ['nova.context']
        server_dict = body['server']
        
        # 用户设置的虚拟机密码
        password = self._get_server_admin_password(server_dict)
        
        name = common.normalize_name(server_dict['name'])
        description = name
        if api_version_request.is_supported(req, min_version='2.19'):
            description = server_dict.get('description')

        # Arguments to be passed to instance create function
        create_kwargs = {}

        # 需要注入到虚拟机的 user_data 数据（Configuration information or scripts to use upon launch.）
        create_kwargs['user_data'] = server_dict.get('user_data')
        
        # NOTE(alex_xu): The v2.1 API compat mode, we strip the spaces for
        # keypair create. But we didn't strip spaces at here for
        # backward-compatible some users already created keypair and name with
        # leading/trailing spaces by legacy v2 API.
      
        # keypair 名称
        create_kwargs['key_name'] = server_dict.get('key_name')

        # 是否启用 config_drive（布尔值）
        create_kwargs['config_drive'] = server_dict.get('config_drive')

        # 虚拟机安全组
        security_groups = server_dict.get('security_groups')
        if security_groups is not None:
            create_kwargs['security_groups'] = [
                sg['name'] for sg in security_groups if sg.get('name')]
            create_kwargs['security_groups'] = list(
                set(create_kwargs['security_groups']))

        # 虚拟机调度提示信息，是一种高级的调度因子
        scheduler_hints = {}
        if 'os:scheduler_hints' in body:
            scheduler_hints = body['os:scheduler_hints']
        elif 'OS-SCH-HNT:scheduler_hints' in body:
            scheduler_hints = body['OS-SCH-HNT:scheduler_hints']
        create_kwargs['scheduler_hints'] = scheduler_hints

        # min_count and max_count are optional.  If they exist, they may come
        # in as strings.  Verify that they are valid integers and > 0.
        # Also, we want to default 'min_count' to 1, and default
        # 'max_count' to be 'min_count'.
        min_count = int(server_dict.get('min_count', 1))
        max_count = int(server_dict.get('max_count', min_count))
        return_id = server_dict.get('return_reservation_id', False)
        if min_count > max_count:
            msg = _('min_count must be <= max_count')
            raise exc.HTTPBadRequest(explanation=msg)
        create_kwargs['min_count'] = min_count
        create_kwargs['max_count'] = max_count
        create_kwargs['return_reservation_id'] = return_id

        # 指定可用域
        availability_zone = server_dict.pop("availability_zone", None)

        # 虚拟机 tags 信息
        if api_version_request.is_supported(req, min_version='2.52'):
            create_kwargs['tags'] = server_dict.get('tags')

        helpers.translate_attributes(helpers.CREATE,
                                     server_dict, create_kwargs)

        target = {
            'project_id': context.project_id,
            'user_id': context.user_id,
            'availability_zone': availability_zone}
        # 验证 target 是否支持 create 操作
        context.can(server_policies.SERVERS % 'create', target)

        # Skip policy check for 'create:trusted_certs' if no trusted
        # certificate IDs were provided.
        
        # 镜像证书
        trusted_certs = server_dict.get('trusted_image_certificates', None)
        if trusted_certs:
            create_kwargs['trusted_certs'] = trusted_certs
            context.can(server_policies.SERVERS % 'create:trusted_certs',
                        target=target)

        # TODO(Shao He, Feng) move this policy check to os-availability-zone
        # extension after refactor it.
        parse_az = self.compute_api.parse_availability_zone
        try:
            # 解析 --availability-zone AZ1:Compute1:Hypervisor1 参数
            availability_zone, host, node = parse_az(context,
                                                     availability_zone)
        except exception.InvalidInput as err:
            raise exc.HTTPBadRequest(explanation=six.text_type(err))
        if host or node:
            context.can(server_policies.SERVERS % 'create:forced_host', {})

        # NOTE(danms): Don't require an answer from all cells here, as
        # we assume that if a cell isn't reporting we won't schedule into
        # it anyway. A bit of a gamble, but a reasonable one.
        min_compute_version = service_obj.get_minimum_version_all_cells(
            nova_context.get_admin_context(), ['nova-compute'])
        # 是否支持 device tagging 功能
        supports_device_tagging = (min_compute_version >=
                                   DEVICE_TAGGING_MIN_COMPUTE_VERSION)

        # 两个 Boot from volume 的 block device mapping 版本
        block_device_mapping_legacy = server_dict.get('block_device_mapping',
                                                      [])
        block_device_mapping_v2 = server_dict.get('block_device_mapping_v2',
                                                  [])

        if block_device_mapping_legacy and block_device_mapping_v2:
            expl = _('Using different block_device_mapping syntaxes '
                     'is not allowed in the same request.')
            raise exc.HTTPBadRequest(explanation=expl)

        if block_device_mapping_legacy:
            for bdm in block_device_mapping_legacy:
                if 'delete_on_termination' in bdm:
                    bdm['delete_on_termination'] = strutils.bool_from_string(
                        bdm['delete_on_termination'])
            create_kwargs[
                'block_device_mapping'] = block_device_mapping_legacy
            # Sets the legacy_bdm flag if we got a legacy block device mapping.
            create_kwargs['legacy_bdm'] = True
        elif block_device_mapping_v2:
            # Have to check whether --image is given, see bug 1433609
           
            image_href = server_dict.get('imageRef')
            image_uuid_specified = image_href is not None
            
            try:
                # Transform the API format of data to the internally used one.
                # block_device_mapping_v2 的数据结构：
                # "block_device_mapping_v2": [{
                #     "boot_index": "0",
                #     "uuid": "ac408821-c95a-448f-9292-73986c790911",
                #     "source_type": "image",
                #     "volume_size": "25",
                #     "destination_type": "volume",
                #     "delete_on_termination": true,
                #     "tag": "disk1",
                #     "disk_bus": "scsi"}]
                block_device_mapping = [
                    block_device.BlockDeviceDict.from_api(bdm_dict,
                        image_uuid_specified)
                    for bdm_dict in block_device_mapping_v2]
            except exception.InvalidBDMFormat as e:
                raise exc.HTTPBadRequest(explanation=e.format_message())
            create_kwargs['block_device_mapping'] = block_device_mapping
            # Unset the legacy_bdm flag if we got a block device mapping.
            create_kwargs['legacy_bdm'] = False

        block_device_mapping = create_kwargs.get("block_device_mapping")
        if block_device_mapping:
            # 检查 target 是否支持 create:attach_volume 操作
            context.can(server_policies.SERVERS % 'create:attach_volume',
                        target)
            for bdm in block_device_mapping:
                if bdm.get('tag', None) and not supports_device_tagging:
                    msg = _('Block device tags are not yet supported.')
                    raise exc.HTTPBadRequest(explanation=msg)

        # 获取指定的 image uuid
        # 如果没有 image_href 并且存在 block_device_mapping 则返回 '' 空字符串，Boot from volume 就不需要指定 Image 了
        # 如果有 image_href，则 image_uuid = image_href
        image_uuid = self._image_from_req_data(server_dict, create_kwargs)

        # NOTE(cyeoh): Although upper layer can set the value of
        # return_reservation_id in order to request that a reservation
        # id be returned to the client instead of the newly created
        # instance information we do not want to pass this parameter
        # to the compute create call which always returns both. We use
        # this flag after the instance create call to determine what
        # to return to the client
        return_reservation_id = create_kwargs.pop('return_reservation_id',
                                                  False)
                                                  
        # 通过 networks attribute 创建一个 list of requested networks
        requested_networks = server_dict.get('networks', None)
        if requested_networks is not None:
            requested_networks = self._get_requested_networks(
                requested_networks, supports_device_tagging)

        # Skip policy check for 'create:attach_network' if there is no
        # network allocation request.
        if requested_networks and len(requested_networks) and 
                not requested_networks.no_allocate:
            context.can(server_policies.SERVERS % 'create:attach_network',
                        target)

        # 获取 flavor object（DB）
        flavor_id = self._flavor_id_from_req_data(body)
        try:
            inst_type = flavors.get_flavor_by_flavor_id(
                    flavor_id, ctxt=context, read_deleted="no")

            # 是否支持 Cinder Mulit-Attach
            supports_multiattach = common.supports_multiattach_volume(req)
            # 跳转到 Compute API 处理，实际上后来是继续跳转到 Conductor 了。
            (instances, resv_id) = self.compute_api.create(context,
                            inst_type,
                            image_uuid,
                            display_name=name,
                            display_description=description,
                            availability_zone=availability_zone,
                            forced_host=host, forced_node=node,
                            metadata=server_dict.get('metadata', {}),
                            admin_password=password,
                            requested_networks=requested_networks,
                            check_server_group_quota=True,
                            supports_multiattach=supports_multiattach,
                            **create_kwargs)
        except (exception.QuotaError,
                exception.PortLimitExceeded) as error:
            raise exc.HTTPForbidden(
                explanation=error.format_message())
        except exception.ImageNotFound:
            msg = _("Can not find requested image")
            raise exc.HTTPBadRequest(explanation=msg)
        except exception.KeypairNotFound:
            msg = _("Invalid key_name provided.")
            raise exc.HTTPBadRequest(explanation=msg)
        except exception.ConfigDriveInvalidValue:
            msg = _("Invalid config_drive provided.")
            raise exc.HTTPBadRequest(explanation=msg)
        except (exception.BootFromVolumeRequiredForZeroDiskFlavor,
                exception.ExternalNetworkAttachForbidden) as error:
            raise exc.HTTPForbidden(explanation=error.format_message())
        except messaging.RemoteError as err:
            msg = "%(err_type)s: %(err_msg)s" % {'err_type': err.exc_type,
                                                 'err_msg': err.value}
            raise exc.HTTPBadRequest(explanation=msg)
        except UnicodeDecodeError as error:
            msg = "UnicodeError: %s" % error
            raise exc.HTTPBadRequest(explanation=msg)
        except (exception.CPUThreadPolicyConfigurationInvalid,
                exception.ImageNotActive,
                exception.ImageBadRequest,
                exception.ImageNotAuthorized,
                exception.FixedIpNotFoundForAddress,
                exception.FlavorNotFound,
                exception.FlavorDiskTooSmall,
                exception.FlavorMemoryTooSmall,
                exception.InvalidMetadata,
                exception.InvalidRequest,
                exception.InvalidVolume,
                exception.MultiplePortsNotApplicable,
                exception.InvalidFixedIpAndMaxCountRequest,
                exception.InstanceUserDataMalformed,
                exception.PortNotFound,
                exception.FixedIpAlreadyInUse,
                exception.SecurityGroupNotFound,
                exception.PortRequiresFixedIP,
                exception.NetworkRequiresSubnet,
                exception.NetworkNotFound,
                exception.InvalidBDM,
                exception.InvalidBDMSnapshot,
                exception.InvalidBDMVolume,
                exception.InvalidBDMImage,
                exception.InvalidBDMBootSequence,
                exception.InvalidBDMLocalsLimit,
                exception.InvalidBDMVolumeNotBootable,
                exception.InvalidBDMEphemeralSize,
                exception.InvalidBDMFormat,
                exception.InvalidBDMSwapSize,
                exception.VolumeTypeNotFound,
                exception.AutoDiskConfigDisabledByImage,
                exception.ImageCPUPinningForbidden,
                exception.ImageCPUThreadPolicyForbidden,
                exception.ImageNUMATopologyIncomplete,
                exception.ImageNUMATopologyForbidden,
                exception.ImageNUMATopologyAsymmetric,
                exception.ImageNUMATopologyCPUOutOfRange,
                exception.ImageNUMATopologyCPUDuplicates,
                exception.ImageNUMATopologyCPUsUnassigned,
                exception.ImageNUMATopologyMemoryOutOfRange,
                exception.InvalidNUMANodesNumber,
                exception.InstanceGroupNotFound,
                exception.MemoryPageSizeInvalid,
                exception.MemoryPageSizeForbidden,
                exception.PciRequestAliasNotDefined,
                exception.RealtimeConfigurationInvalid,
                exception.RealtimeMaskNotFoundOrInvalid,
                exception.SnapshotNotFound,
                exception.UnableToAutoAllocateNetwork,
                exception.MultiattachNotSupportedOldMicroversion,
                exception.CertificateValidationFailed) as error:
            raise exc.HTTPBadRequest(explanation=error.format_message())
        except (exception.PortInUse,
                exception.InstanceExists,
                exception.NetworkAmbiguous,
                exception.NoUniqueMatch,
                exception.MultiattachSupportNotYetAvailable,
                exception.VolumeTypeSupportNotYetAvailable,
                exception.CertificateValidationNotYetAvailable) as error:
            raise exc.HTTPConflict(explanation=error.format_message())

        # If the caller wanted a reservation_id, return it
        if return_reservation_id:
            return wsgi.ResponseObject({'reservation_id': resv_id})

        server = self._view_builder.create(req, instances[0])

        # Enables returning of the instance password by the relevant server API calls
        # such as create, rebuild, evacuate, or rescue. 
        if CONF.api.enable_instance_password:
            server['server']['adminPass'] = password

        robj = wsgi.ResponseObject(server)

        return self._add_location(robj)

上述的参数基本上可以与 Dashboard 上的 Form 表单一一对应：
在这里插入图片描述

# File: /opt/stack/nova/nova/compute/api.py

    @hooks.add_hook("create_instance")
    def create(self, context, instance_type,
               image_href, kernel_id=None, ramdisk_id=None,
               min_count=None, max_count=None,
               display_name=None, display_description=None,
               key_name=None, key_data=None, security_groups=None,
               availability_zone=None, forced_host=None, forced_node=None,
               user_data=None, metadata=None, injected_files=None,
               admin_password=None, block_device_mapping=None,
               access_ip_v4=None, access_ip_v6=None, requested_networks=None,
               config_drive=None, auto_disk_config=None, scheduler_hints=None,
               legacy_bdm=True, shutdown_terminate=False,
               check_server_group_quota=False, tags=None,
               supports_multiattach=False, trusted_certs=None):
        """Provision instances, sending instance information to the
        scheduler.  The scheduler will determine where the instance(s)
        go and will handle creating the DB entries.

        Returns a tuple of (instances, reservation_id)
        """
        
        if requested_networks and max_count is not None and max_count > 1:
            # 验证不能指定一个 IP 地址来创建多个虚拟机
            self._check_multiple_instances_with_specified_ip(
                requested_networks)
            if utils.is_neutron():
                # # 验证不能指定一个 Port 来创建多个虚拟机
                self._check_multiple_instances_with_neutron_ports(
                    requested_networks)

        # 验证指定的 AZ 是否为可用域
        if availability_zone:
            available_zones = availability_zones.
                get_availability_zones(context.elevated(), True)
            if forced_host is None and availability_zone not in 
                    available_zones:
                msg = _('The requested availability zone is not available')
                raise exception.InvalidRequest(msg)

        # filter_properties 就是一个 Dict 类型对象，现在包含了实参的列表的内容。
        filter_properties = scheduler_utils.build_filter_properties(
                scheduler_hints, forced_host, forced_node, instance_type)

        return self._create_instance(
                       context, instance_type,
                       image_href, kernel_id, ramdisk_id,
                       min_count, max_count,
                       display_name, display_description,
                       key_name, key_data, security_groups,
                       availability_zone, user_data, metadata,
                       injected_files, admin_password,
                       access_ip_v4, access_ip_v6,
                       requested_networks, config_drive,
                       block_device_mapping, auto_disk_config,
                       filter_properties=filter_properties,
                       legacy_bdm=legacy_bdm,
                       shutdown_terminate=shutdown_terminate,
                       check_server_group_quota=check_server_group_quota,
                       tags=tags, supports_multiattach=supports_multiattach,
                       trusted_certs=trusted_certs)

    def _create_instance(self, context, instance_type,
               image_href, kernel_id, ramdisk_id,
               min_count, max_count,
               display_name, display_description,
               key_name, key_data, security_groups,
               availability_zone, user_data, metadata, injected_files,
               admin_password, access_ip_v4, access_ip_v6,
               requested_networks, config_drive,
               block_device_mapping, auto_disk_config, filter_properties,
               reservation_id=None, legacy_bdm=True, shutdown_terminate=False,
               check_server_group_quota=False, tags=None,
               supports_multiattach=False, trusted_certs=None):
        """Verify all the input parameters regardless of the provisioning
        strategy being performed and schedule the instance(s) for
        creation.
        """

        # Normalize and setup some parameters
        if reservation_id is None:
            reservation_id = utils.generate_uid('r')
        security_groups = security_groups or ['default']
        min_count = min_count or 1
        max_count = max_count or min_count
        block_device_mapping = block_device_mapping or []
        tags = tags or []

        if image_href:
            # 获取 Image Id 和 Image Metadata
            image_id, boot_meta = self._get_image(context, image_href)
        else:
            # This is similar to the logic in _retrieve_trusted_certs_object.
            if (trusted_certs or
                (CONF.glance.verify_glance_signatures and
                 CONF.glance.enable_certificate_validation and
                 CONF.glance.default_trusted_certificate_ids)):
                msg = _("Image certificate validation is not supported "
                        "when booting from volume")
                raise exception.CertificateValidationFailed(message=msg)
            image_id = None
            # 如果没有指定 Image 的话就获取 block device 的 Metadata
            boot_meta = self._get_bdm_image_metadata(
                context, block_device_mapping, legacy_bdm)

        # ？？？
        self._check_auto_disk_config(image=boot_meta,
                                     auto_disk_config=auto_disk_config)

        # 进一步验证和转换虚拟机创建参数
        base_options, max_net_count, key_pair, security_groups, 
            network_metadata = self._validate_and_build_base_options(
                    context, instance_type, boot_meta, image_href, image_id,
                    kernel_id, ramdisk_id, display_name, display_description,
                    key_name, key_data, security_groups, availability_zone,
                    user_data, metadata, access_ip_v4, access_ip_v6,
                    requested_networks, config_drive, auto_disk_config,
                    reservation_id, max_count)
                    
        # max_net_count is the maximum number of instances requested by the
        # user adjusted for any network quota constraints, including
        # consideration of connections to each requested network
        # 如果最大的 network 数量小于最小的虚拟机数量，则触发异常
        if max_net_count < min_count:
            raise exception.PortLimitExceeded()
        elif max_net_count < max_count:
            LOG.info("max count reduced from %(max_count)d to "
                     "%(max_net_count)d due to network port quota",
                     {'max_count': max_count,
                      'max_net_count': max_net_count})
            max_count = max_net_count

        # 进一步检查 Boot from volume 的 block device 的可用性
        block_device_mapping = self._check_and_transform_bdm(context,
            base_options, instance_type, boot_meta, min_count, max_count,
            block_device_mapping, legacy_bdm)

        # We can't do this check earlier because we need bdms from all sources
        # to have been merged in order to get the root bdm.
        
        # 进一步检查 quota 和 image 的可用性
        self._checks_for_create_and_rebuild(context, image_id, boot_meta,
                instance_type, metadata, injected_files,
                block_device_mapping.root_bdm())

        # 转换为多个虚拟机为 InstanceGroup 数据结构
        instance_group = self._get_requested_instance_group(context,
                                   filter_properties)
        # 转换为 TagList 数据结构
        tags = self._create_tag_list_obj(context, tags)

        # 将诸多创建参数封装到 instances_to_build 列表类型对象中
        instances_to_build = self._provision_instances(
            context, instance_type, min_count, max_count, base_options,
            boot_meta, security_groups, block_device_mapping,
            shutdown_terminate, instance_group, check_server_group_quota,
            filter_properties, key_pair, tags, trusted_certs,
            supports_multiattach, network_metadata)

        instances = []
        request_specs = []
        build_requests = []
        for rs, build_request, im in instances_to_build:
            build_requests.append(build_request)
            instance = build_request.get_new_instance(context)
            instances.append(instance)
            request_specs.append(rs)

        if CONF.cells.enable:
            # NOTE(danms): CellsV1 can't do the new thing, so we
            # do the old thing here. We can remove this path once
            # we stop supporting v1.
            for instance in instances:
                # 创建 Instance 数据库对象
                instance.create()
            # NOTE(melwitt): We recheck the quota after creating the objects
            # to prevent users from allocating more resources than their
            # allowed quota in the event of a race. This is configurable
            # because it can be expensive if strict quota limits are not
            # required in a deployment.
            # 再一次检查资源配额
            if CONF.quota.recheck_quota:
                try:
                    compute_utils.check_num_instances_quota(
                        context, instance_type, 0, 0,
                        orig_num_req=len(instances))
                except exception.TooManyInstances:
                    with excutils.save_and_reraise_exception():
                        # Need to clean up all the instances we created
                        # along with the build requests, request specs,
                        # and instance mappings.
                        self._cleanup_build_artifacts(instances,
                                                      instances_to_build)
 
            self.compute_task_api.build_instances(context,
                instances=instances, image=boot_meta,
                filter_properties=filter_properties,
                admin_password=admin_password,
                injected_files=injected_files,
                requested_networks=requested_networks,
                security_groups=security_groups,
                block_device_mapping=block_device_mapping,
                legacy_bdm=False)
        else:
            self.compute_task_api.schedule_and_build_instances(
                context,
                build_requests=build_requests,
                request_spec=request_specs,
                image=boot_meta,
                admin_password=admin_password,
                injected_files=injected_files,
                requested_networks=requested_networks,
                block_device_mapping=block_device_mapping,
                tags=tags)

        return instances, reservation_id

    def _validate_and_build_base_options(self, context, instance_type,
                                         boot_meta, image_href, image_id,
                                         kernel_id, ramdisk_id, display_name,
                                         display_description, key_name,
                                         key_data, security_groups,
                                         availability_zone, user_data,
                                         metadata, access_ip_v4, access_ip_v6,
                                         requested_networks, config_drive,
                                         auto_disk_config, reservation_id,
                                         max_count):
        """Verify all the input parameters regardless of the provisioning
        strategy being performed.
        """
        # 确定 Flavor 可用
        if instance_type['disabled']:
            raise exception.FlavorNotFound(flavor_id=instance_type['id'])

        # 确定 user_data 数据编码格式为 Base64
        if user_data:
            try:
                base64utils.decode_as_bytes(user_data)
            except TypeError:
                raise exception.InstanceUserDataMalformed()

        # When using Neutron, _check_requested_secgroups will translate and
        # return any requested security group names to uuids.
        security_groups = (
            self._check_requested_secgroups(context, security_groups))

        # Note:  max_count is the number of instances requested by the user,
        # max_network_count is the maximum number of instances taking into
        # account any network quotas
        max_network_count = self._check_requested_networks(context,
                                     requested_networks, max_count)

        # Choose kernel and ramdisk appropriate for the instance.
        # ramdisk（虚拟内存盘）是一种将内存模拟成硬盘来使用的技术，可以在内存启动虚拟机。
        kernel_id, ramdisk_id = self._handle_kernel_and_ramdisk(
                context, kernel_id, ramdisk_id, boot_meta)

        # 依旧返回是布尔值类型
        config_drive = self._check_config_drive(config_drive)

        # 通过 keypair name 获取 keypair data
        if key_data is None and key_name is not None:
            key_pair = objects.KeyPair.get_by_name(context,
                                                   context.user_id,
                                                   key_name)
            key_data = key_pair.public_key
        else:
            key_pair = None

        # 获取系统盘设备的名称（可能从 Image 获取或者从 Volume 获取）
        root_device_name = block_device.prepend_dev(
                block_device.properties_root_device_name(
                    boot_meta.get('properties', {})))

        try:
            # 此处的 image 是一个抽象概念，表示启动操作系统的系统盘。将 Image File 和 Volume 统一为 ImageMeta 对象。
            image_meta = objects.ImageMeta.from_dict(boot_meta)
        except ValueError as e:
            # there must be invalid values in the image meta properties so
            # consider this an invalid request
            msg = _('Invalid image metadata. Error: %s') % six.text_type(e)
            raise exception.InvalidRequest(msg)

        # 从 Flavor properties 和 Image properties 获取 NUMA 属性
        numa_topology = hardware.numa_get_constraints(
                instance_type, image_meta)

        system_metadata = {}

        # PCI requests come from two sources: instance flavor and
        # requested_networks. The first call in below returns an
        # InstancePCIRequests object which is a list of InstancePCIRequest
        # objects. The second call in below creates an InstancePCIRequest
        # object for each SR-IOV port, and append it to the list in the
        # InstancePCIRequests object

        # PCI requests come from two sources: instance flavor and requested_networks. 
        # 从 Flavor 获取 PCI 设备
        pci_request_info = pci_request.get_pci_requests_from_flavor(
            instance_type)
        # ？？？
        network_metadata = self.network_api.create_resource_requests(
            context, requested_networks, pci_request_info)

        # base_options 的数据基本就是 instances 数据库表的属性。
        base_options = {
            'reservation_id': reservation_id,
            'image_ref': image_href,
            'kernel_id': kernel_id or '',
            'ramdisk_id': ramdisk_id or '',
            'power_state': power_state.NOSTATE,
            'vm_state': vm_states.BUILDING,
            'config_drive': config_drive,
            'user_id': context.user_id,
            'project_id': context.project_id,
            'instance_type_id': instance_type['id'],
            'memory_mb': instance_type['memory_mb'],
            'vcpus': instance_type['vcpus'],
            'root_gb': instance_type['root_gb'],
            'ephemeral_gb': instance_type['ephemeral_gb'],
            'display_name': display_name,
            'display_description': display_description,
            'user_data': user_data,
            'key_name': key_name,
            'key_data': key_data,
            'locked': False,
            'metadata': metadata or {},
            'access_ip_v4': access_ip_v4,
            'access_ip_v6': access_ip_v6,
            'availability_zone': availability_zone,
            'root_device_name': root_device_name,
            'progress': 0,
            'pci_requests': pci_request_info,
            'numa_topology': numa_topology,
            'system_metadata': system_metadata}

        options_from_image = self._inherit_properties_from_image(
                boot_meta, auto_disk_config)

        base_options.update(options_from_image)

        # return the validated options and maximum number of instances allowed
        # by the network quotas
        return (base_options, max_network_count, key_pair, security_groups,
                network_metadata)

    def _provision_instances(self, context, instance_type, min_count,
            max_count, base_options, boot_meta, security_groups,
            block_device_mapping, shutdown_terminate,
            instance_group, check_server_group_quota, filter_properties,
            key_pair, tags, trusted_certs, supports_multiattach,
            network_metadata=None):
        # Check quotas
        num_instances = compute_utils.check_num_instances_quota(
                context, instance_type, min_count, max_count)
        security_groups = self.security_group_api.populate_security_groups(
                security_groups)
        self.security_group_api.ensure_default(context)
        LOG.debug("Going to run %s instances...", num_instances)
        instances_to_build = []
        try:
            for i in range(num_instances):
                # Create a uuid for the instance so we can store the
                # RequestSpec before the instance is created.
                instance_uuid = uuidutils.generate_uuid()
                # Store the RequestSpec that will be used for scheduling.
                # 将于 scheduling 相关的参数都封装到 RequestSpec 对象，便于在 Nova Scheduler 中应用。
                req_spec = objects.RequestSpec.from_components(context,
                        instance_uuid, boot_meta, instance_type,
                        base_options['numa_topology'],
                        base_options['pci_requests'], filter_properties,
                        instance_group, base_options['availability_zone'],
                        security_groups=security_groups)

                # 如果是 Boot from Volume，则使用系统盘作为操作系统的根盘
                if block_device_mapping:
                    # Record whether or not we are a BFV instance
                    root = block_device_mapping.root_bdm()
                    # 标记为 Boot from volume
                    req_spec.is_bfv = bool(root and root.is_volume)
                else:
                    # If we have no BDMs, we're clearly not BFV
                    req_spec.is_bfv = False

                # NOTE(danms): We need to record num_instances on the request
                # spec as this is how the conductor knows how many were in this
                # batch.
                req_spec.num_instances = num_instances
                # 创建 RequestSpec 数据库记录
                req_spec.create()

                # NOTE(stephenfin): The network_metadata field is not persisted
                # and is therefore set after 'create' is called.
                if network_metadata:
                    req_spec.network_metadata = network_metadata

                # Create an instance object, but do not store in db yet.
                instance = objects.Instance(context=context)
                instance.uuid = instance_uuid
                instance.update(base_options)
                instance.keypairs = objects.KeyPairList(objects=[])
                if key_pair:
                    instance.keypairs.objects.append(key_pair)

                instance.trusted_certs = self._retrieve_trusted_certs_object(
                    context, trusted_certs)

                instance = self.create_db_entry_for_new_instance(context,
                        instance_type, boot_meta, instance, security_groups,
                        block_device_mapping, num_instances, i,
                        shutdown_terminate, create_instance=False)
                        
                # 确定 Block Device 时可用的，并设定其 Size 且与 Instance 关联起来
                block_device_mapping = (
                    self._bdm_validate_set_size_and_instance(context,
                        instance, instance_type, block_device_mapping,
                        supports_multiattach))
                instance_tags = self._transform_tags(tags, instance.uuid)

                # 将于虚拟机创建（启动）相关的参数封装到 BuildRequest 对象
                build_request = objects.BuildRequest(context,
                        instance=instance, instance_uuid=instance.uuid,
                        project_id=instance.project_id,
                        block_device_mappings=block_device_mapping,
                        tags=instance_tags)
                build_request.create()

                # Create an instance_mapping.  The null cell_mapping indicates
                # that the instance doesn't yet exist in a cell, and lookups
                # for it need to instead look for the RequestSpec.
                # cell_mapping will be populated after scheduling, with a
                # scheduling failure using the cell_mapping for the special
                # cell0.
                
                # 将于虚拟机定位（cells）相关的参数封装到 InstanceMapping 对象
                inst_mapping = objects.InstanceMapping(context=context)
                inst_mapping.instance_uuid = instance_uuid
                inst_mapping.project_id = context.project_id
                inst_mapping.cell_mapping = None
                inst_mapping.create()

                instances_to_build.append(
                    (req_spec, build_request, inst_mapping))

                if instance_group:
                    if check_server_group_quota:
                        try:
                            objects.Quotas.check_deltas(
                                context, {'server_group_members': 1},
                                instance_group, context.user_id)
                        except exception.OverQuota:
                            msg = _("Quota exceeded, too many servers in "
                                    "group")
                            raise exception.QuotaError(msg)

                    members = objects.InstanceGroup.add_members(
                        context, instance_group.uuid, [instance.uuid])

                    # NOTE(melwitt): We recheck the quota after creating the
                    # object to prevent users from allocating more resources
                    # than their allowed quota in the event of a race. This is
                    # configurable because it can be expensive if strict quota
                    # limits are not required in a deployment.
                    if CONF.quota.recheck_quota and check_server_group_quota:
                        try:
                            objects.Quotas.check_deltas(
                                context, {'server_group_members': 0},
                                instance_group, context.user_id)
                        except exception.OverQuota:
                            objects.InstanceGroup._remove_members_in_db(
                                context, instance_group.id, [instance.uuid])
                            msg = _("Quota exceeded, too many servers in "
                                    "group")
                            raise exception.QuotaError(msg)
                    # list of members added to servers group in this iteration
                    # is needed to check quota of server group during add next
                    # instance
                    instance_group.members.extend(members)

        # In the case of any exceptions, attempt DB cleanup
        except Exception:
            with excutils.save_and_reraise_exception():
                self._cleanup_build_artifacts(None, instances_to_build)

        return instances_to_build

至此，POST /servers 在 nova-api service 的工作流程基本就完成了，总的来说主要做了两件事情：校验和转换。

API 接口的版本控制、数据类型校验（e.g. validation.schema）
用户操作意图许可校验（e.g. context.can、supports_device_tagging、supports_multiattach）
虚拟机关联资源的可用性校验
用户输入数据整理、归纳、分类并转化为程序内部使用的数据结构（对象）

后面通过 self.compute_task_api 调用进入到 nova-conductor service 的工作流程。

Nova 启动虚拟机流程解析

目录

文章目录

前言

从请求说起

nova-api service 阶段