Enhanced VNF placement

URL of the Launchpad blueprint:

https://blueprints.launchpad.net/tacker/+spec/enhanced-vnf-placement

This spec proposes a declarative way to place the VDUs of a VNF efficiently.

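Problem description

The VDUs of a VNF are placed in the same way as ordinary VMs. This cannot meet the performance requirements of VNFs that are:

IO intensive
Compute intensive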

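Proposed change

Introduce new host properties in the VNFD template that allow specifying CPU pinning, huge pages, NUMA placement and vCPU topology per VDU. In addition, allow specifying SR-IOV NICs for VDU network interfaces.

CPU pinning improves the performance of applications running in the guest by pinning guest vCPUs to host CPUs, which avoids unpredictable latency and host CPU overcommit.

Huge pages help ensure that the guest has 100% dedicated RAM that is never swapped out.

NUMA placement reduces latency by avoiding cross-node memory and I/O device access by the guest.

Assigning SR-IOV ports to the guest improves performance by letting network traffic bypass the hypervisor's software layer and flow directly between the SR-IOV NIC and the guest.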

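VNFD host properties schema

topology_template:
  node_templates:
    vdu1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            disk_size: {get_input: dsize} # Disk size of the VM, in GB
            num_cpus: {get_input: cpu_count} # Number of CPUs for the VM
            mem_size: {get_input: msize} # Memory size of the VM, in MB

            cpu_allocation:
              cpu_affinity: {get_input: affinity}
              # The only supported value is 'dedicated'. It ensures that the
              # guest vCPUs associated with the VDU are strictly pinned to a
              # set of host pCPUs. If any other value or no value is given,
              # the guest vCPUs are allowed to float freely across host pCPUs.

              thread_allocation: {get_input: threadalloc}
              # Valid values are 'avoid', 'separate', 'isolate' and 'prefer'.
              # They apply only when 'cpu_affinity' is set to 'dedicated'.
              # 'avoid': do not place the guest on a host with hyperthreads.
              # 'separate': place each vCPU on a different core (if the host
              # has threads).
              # 'isolate': place each vCPU on a different core, and place no
              # vCPUs from other guests on the same core.
              # 'prefer': if the host has threads, place vCPUs on the same
              # core so that they are thread siblings.

              socket_count: {get_input: sock_cnt}
              # Preferred number of sockets to expose to the guest. A socket
              # count greater than 1 lets the VM spread across NUMA nodes.
              # Note: although the template specifies exact socket, core and
              # thread counts, the underlying IaaS system (Nova in this case)
              # may choose a slightly different combination of sockets, cores
              # and threads.

              core_count: {get_input: core_cnt}
              # Number of cores per socket to expose to the guest.

              thread_count: {get_input: thrdcnt}
              # Number of threads per core to expose to the guest.

            mem_page_size: {get_input: mem_pg_sz}
            # Page size to use with huge pages. Allowed values are 'small',
            # 'large', 'any' and a custom page size in MB. 'small' usually
            # maps to 4K pages on x86, 'large' maps to 2 MB or 1 GB on x86,
            # and 'any' leaves the choice to the driver implementation.

            numa_node_count: {get_input: numa_count}
            # Number of NUMA nodes to expose to the guest. When
            # numa_node_count is given, the guest's CPU and memory resources
            # are split symmetrically across the NUMA nodes. Specify only one
            # of numa_node_count or numa_nodes; if both are given,
            # numa_node_count takes precedence.

            numa_nodes:
            # Allows asymmetrical allocation of CPUs and RAM across NUMA
            # nodes. For this to take effect, at least 2 nodes with unique
            # node labels should be defined.

              <node_label>:
              # A unique name for the node.

                id: {get_input: numa_id}
                # NUMA node ID

                vcpus: {get_input: vcpu_list}
                # List of vCPUs mapped to this NUMA node

                memory: {get_input: mem_size}
                # Amount of RAM, in MB, mapped to this NUMA node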

For SR-IOV support, a new property named 'type' is introduced on the tosca.nodes.nfv.CP type. It accepts the value 'sriov'.

Sample VNFD template schemas

1. CPU Pinning

The following example pins guest vCPUs to host pCPUs:

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            num_cpus: 8
            mem_size: 4096 # Memory Size in MB
            disk_size: 8 # Value in GB

            cpu_allocation:
              cpu_affinity: dedicated
              thread_allocation: isolate
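
To make the effect of the pinning properties more concrete, the sketch below translates a cpu_allocation mapping into Nova flavor extra specs. It is an illustration only, not Tacker's actual translation code: the function name is hypothetical, and it assumes that thread_allocation is passed through unchanged to hw:cpu_thread_policy, although the set of values Nova accepts for that key may differ from the values listed in this spec.

```python
def cpu_pinning_extra_specs(cpu_allocation):
    """Translate pinning-related VDU properties into Nova flavor extra specs.

    Hypothetical helper for illustration; not part of Tacker.
    """
    specs = {}
    # Only 'dedicated' requests pinning; any other value leaves vCPUs floating.
    if cpu_allocation.get("cpu_affinity") == "dedicated":
        specs["hw:cpu_policy"] = "dedicated"
        # thread_allocation is meaningful only for pinned guests.
        # Assumption: the value is forwarded as-is.
        thread_allocation = cpu_allocation.get("thread_allocation")
        if thread_allocation is not None:
            specs["hw:cpu_thread_policy"] = thread_allocation
    return specs


example = {"cpu_affinity": "dedicated", "thread_allocation": "isolate"}
print(cpu_pinning_extra_specs(example))
```

With no cpu_affinity, or a value other than 'dedicated', the function returns an empty mapping, which matches the schema's statement that vCPUs then float freely.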

2. Huge Pages

The following example requests huge pages for the guest VM:

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            num_cpus: 8
            mem_size: 4096 # Memory Size in MB
            disk_size: 8 # Value in GB
            mem_page_size: large
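
As a rough illustration of how mem_page_size could reach the compute layer, the following sketch maps it to Nova's hw:mem_page_size extra spec. The function name is hypothetical. It assumes that a bare number in the template means MB, as this spec states, and converts it to KiB, which is the unit Nova applies to a page size given without a suffix. Any other string is forwarded untouched for the IaaS layer to interpret.

```python
def mem_page_size_extra_spec(mem_page_size):
    """Map the VDU mem_page_size property to a Nova flavor extra spec.

    Hypothetical helper for illustration; not part of Tacker.
    """
    value = str(mem_page_size).strip()
    if value in ("small", "large", "any"):
        # Named sizes are understood by Nova directly.
        return {"hw:mem_page_size": value}
    if value.isdigit():
        # Assumption: a bare number is a page size in MB; convert to KiB.
        return {"hw:mem_page_size": str(int(value) * 1024)}
    # Anything else (for example a value with a unit suffix) is passed through.
    return {"hw:mem_page_size": value}


print(mem_page_size_extra_spec("large"))
print(mem_page_size_extra_spec(2))
```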

3. Asymmetrical NUMA placement

The following example allocates CPUs and RAM asymmetrically across NUMA nodes:

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            num_cpus: 6
            mem_size: 6144
            disk_size: 8
            numa_nodes:

              node1:
                id: 0
                vcpus: [ 0,1 ]
                mem_size: 2048
              node2:
                id: 1
                vcpus: [ 2, 3, 4, 5]
                mem_size: 4096
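
An asymmetrical layout is easy to get wrong, so a template processor would likely want to validate it before use. The sketch below checks a numa_nodes mapping and turns it into Nova's hw:numa_nodes, hw:numa_cpus.N and hw:numa_mem.N extra specs. The function name is hypothetical, the per-node memory key follows the mem_size spelling used in the example, and the consistency rules (every vCPU assigned exactly once, per-node memory adding up to the total) are assumptions about what the compute layer expects.

```python
def numa_nodes_extra_specs(num_cpus, mem_size, numa_nodes):
    """Validate an asymmetrical NUMA layout and build Nova flavor extra specs.

    Hypothetical helper for illustration; not part of Tacker.
    """
    nodes = sorted(numa_nodes.values(), key=lambda node: node["id"])
    if len(nodes) < 2:
        raise ValueError("at least 2 NUMA nodes must be defined")

    # Assumption: every guest vCPU must appear in exactly one node.
    assigned = sorted(vcpu for node in nodes for vcpu in node["vcpus"])
    if assigned != list(range(num_cpus)):
        raise ValueError("vcpus lists must cover vCPUs 0..%d exactly once"
                         % (num_cpus - 1))

    # Assumption: per-node memory must add up to the VDU's mem_size.
    if sum(node["mem_size"] for node in nodes) != mem_size:
        raise ValueError("per-node mem_size values must add up to %d" % mem_size)

    specs = {"hw:numa_nodes": str(len(nodes))}
    for node in nodes:
        specs["hw:numa_cpus.%d" % node["id"]] = ",".join(
            str(vcpu) for vcpu in node["vcpus"])
        specs["hw:numa_mem.%d" % node["id"]] = str(node["mem_size"])
    return specs


layout = {
    "node1": {"id": 0, "vcpus": [0, 1], "mem_size": 2048},
    "node2": {"id": 1, "vcpus": [2, 3, 4, 5], "mem_size": 4096},
}
print(numa_nodes_extra_specs(6, 6144, layout))
```

Calling the function with a vCPU count that the two vcpus lists do not fully cover raises ValueError instead of producing a partial layout.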

4. Symmetrical NUMA placement

The following example allocates CPUs and RAM symmetrically across NUMA nodes:

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            num_cpus: 8
            mem_size: 6144
            disk_size: 8
            numa_node_count: 2
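
To show what a symmetrical split means in numbers, the sketch below divides a VDU's vCPUs and memory evenly over numa_node_count nodes. The function name is hypothetical, and the layout assumes that vCPUs are handed out in contiguous, equal-sized blocks; the compute layer makes the real assignment.

```python
def symmetric_numa_layout(num_cpus, mem_size, numa_node_count):
    """Split vCPUs and memory (MB) evenly across NUMA nodes.

    Hypothetical helper for illustration; not part of Tacker.
    """
    if numa_node_count < 1:
        raise ValueError("numa_node_count must be at least 1")
    if num_cpus % numa_node_count or mem_size % numa_node_count:
        raise ValueError("num_cpus and mem_size must divide evenly "
                         "across the NUMA nodes")
    cpus_per_node = num_cpus // numa_node_count
    mem_per_node = mem_size // numa_node_count
    return [
        {
            "id": node_id,
            # Assumption: contiguous blocks of vCPUs per node.
            "vcpus": list(range(node_id * cpus_per_node,
                                (node_id + 1) * cpus_per_node)),
            "mem_size": mem_per_node,
        }
        for node_id in range(numa_node_count)
    ]


for node in symmetric_numa_layout(8, 6144, 2):
    print(node)
```

For the template above (8 vCPUs, 6144 MB, 2 nodes), each node receives 4 vCPUs and 3072 MB.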

5. Combined example

The following example combines huge pages, CPU pinning and NUMA placement, avoids hosts with hyperthreading, and exposes socket, core and thread counts to the guest:

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            num_cpus: 8
            mem_size: 4096
            disk_size: 80
            mem_page_size: 1G
            cpu_allocation:

              cpu_affinity: dedicated
              thread_allocation: avoid
              socket_count: 2
              core_count: 2
              thread_count: 2

            numa_node_count: 2
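
When socket, core and thread counts are given together with num_cpus, it helps to check that they describe the same number of vCPUs. The sketch below maps the three counts to Nova's hw:cpu_sockets, hw:cpu_cores and hw:cpu_threads extra specs and computes the vCPU count they imply. Both function names are hypothetical, and treating a missing count as 1 is an assumption made for the arithmetic only.

```python
def topology_extra_specs(cpu_allocation):
    """Map guest topology counts to Nova 'hw:cpu_*' flavor extra specs.

    Hypothetical helper for illustration; not part of Tacker.
    """
    mapping = (
        ("socket_count", "hw:cpu_sockets"),
        ("core_count", "hw:cpu_cores"),
        ("thread_count", "hw:cpu_threads"),
    )
    return {key: str(cpu_allocation[prop])
            for prop, key in mapping if prop in cpu_allocation}


def topology_vcpus(cpu_allocation):
    """Return sockets x cores x threads; a missing count is treated as 1."""
    return (cpu_allocation.get("socket_count", 1)
            * cpu_allocation.get("core_count", 1)
            * cpu_allocation.get("thread_count", 1))


combined = {"cpu_affinity": "dedicated", "thread_allocation": "avoid",
            "socket_count": 2, "core_count": 2, "thread_count": 2}
print(topology_extra_specs(combined))
print("matches num_cpus:", topology_vcpus(combined) == 8)
```

In the combined example the topology works out to 2 x 2 x 2 = 8 vCPUs, which agrees with num_cpus: 8.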

6. Network interfaces example

The following example defines multiple network interfaces, two of them of the sriov NIC type:

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU

      capabilities:
        nfv_compute:
          properties:
            num_cpus: 8
            mem_size: 4096 MB
            disk_size: 8 GB
            mem_page_size: 1G

            cpu_allocation:
              cpu_affinity: dedicated
              thread_allocation: isolate
              socket_count: 2
              core_count: 8
              thread_count: 4

            numa_node_count: 2

    CP11:
      type: tosca.nodes.nfv.CP

      requirements:
        - virtualbinding: VDU1
        - virtualLink: net_mgmt

    CP12:
      type: tosca.nodes.nfv.CP

      properties:
        anti_spoof_protection: false
        type: sriov

      requirements:
        - virtualbinding: VDU1
        - virtualLink: net_ingress

    CP13:
      type: tosca.nodes.nfv.CP

      properties:
        anti_spoof_protection: false
        type: sriov

      requirements:
        - virtualbinding: VDU1
        - virtualLink: net_egress

    net_mgmt:
      type: tosca.nodes.nfv.VL.ELAN

    net_ingress:
      type: tosca.nodes.nfv.VL.ELAN

    net_egress:
      type: tosca.nodes.nfv.VL.ELAN
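
One plausible way to realize a connection point of type sriov is as a Neutron port whose binding:vnic_type is direct. The sketch below builds a Heat OS::Neutron::Port resource from a CP's properties. The function name is hypothetical, and the mapping of anti_spoof_protection: false to port_security_enabled: false is an assumption about how the property would be translated, not something this spec states.

```python
def cp_to_heat_port(cp_properties, network):
    """Build a Heat port resource for a tosca.nodes.nfv.CP node.

    Hypothetical helper for illustration; not part of Tacker.
    """
    properties = {"network": network}
    if cp_properties.get("type") == "sriov":
        # 'direct' asks Neutron for an SR-IOV virtual function.
        properties["binding:vnic_type"] = "direct"
    if cp_properties.get("anti_spoof_protection") is False:
        # Assumption: disabling anti-spoofing maps to disabling port security.
        properties["port_security_enabled"] = False
    return {"type": "OS::Neutron::Port", "properties": properties}


cp12 = {"anti_spoof_protection": False, "type": "sriov"}
print(cp_to_heat_port(cp12, "net_ingress"))
print(cp_to_heat_port({}, "net_mgmt"))
```

A CP without the type property, such as CP11, yields an ordinary port with only the network set.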

Alternatives

An alternative is to create the flavor in advance and reference that flavor in the VNFD template.

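Data model impact

None

REST API impact

Security impact

Other end user impact

Performance impact

Other deployer impact

Deployers are expected to prepare the host OS on compute nodes (grub changes) to reserve huge pages, isolate CPUs and enable SR-IOV. Configuration changes are also expected in the nova and neutron configuration files.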
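
As a small aid for checking the host preparation, the sketch below inspects a kernel command line for the boot parameters typically involved: hugepages for reserved huge pages, isolcpus for isolated CPUs, and intel_iommu=on for the IOMMU that SR-IOV passthrough relies on. The function name is hypothetical, the check assumes an Intel host, and it covers only the grub side, not the nova or neutron configuration.

```python
def check_kernel_cmdline(cmdline):
    """Report which host-preparation boot parameters are present.

    Hypothetical helper for illustration; assumes an Intel host.
    """
    tokens = cmdline.split()
    # Parameter names without their values, e.g. 'hugepages' for 'hugepages=16'.
    names = {token.split("=", 1)[0] for token in tokens}
    return {
        "huge_pages_reserved": "hugepages" in names,
        "cpus_isolated": "isolcpus" in names,
        "iommu_enabled": "intel_iommu=on" in tokens,
    }


sample = ("BOOT_IMAGE=/vmlinuz root=/dev/sda1 default_hugepagesz=1G "
          "hugepagesz=1G hugepages=16 isolcpus=2-15 intel_iommu=on")
print(check_kernel_cmdline(sample))
```

On a running Linux host the string to check can be read from /proc/cmdline.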

Developer impact

Implementation

Assignee(s)

Primary assignee: 龚永胜 (Gong Yongsheng) <gong.yongsheng@99cloud.net>
Other contributors: Vishwanath Jayaraman <vishwanathj@hotmail.com>

Work items

NUMA support
SR-IOV support

Dependencies

https://blueprints.launchpad.net/tacker/+spec/automatic-resource-creation

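Testing

Testing NUMA, SR-IOV and PCI passthrough requires special hardware that the standard OpenStack CI environment does not provide.

Manual testing is therefore necessary, ideally with someone offering hosts from their own lab for third-party testing.

Other options are:

Contact the openstack-infra / -qa teams and request that compute resources be added to the gate to test the features in this spec.
Have a vendor support a third-party CI job that runs against the features called out in this spec.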

Documentation impact

The documentation will be updated to explain how to use this feature.
