OpenStack Notes

Cloud Architecture

Includes explanation of services and how they relate to each other.

RabbitMQ: A messaging system that implements AMQP. Basically, it's a server that passes messages around between the other components that make up Nova.

nova-api: This is the API server, naturally. It implements a subset of the Amazon EC2 API. We're working on adding additional APIs, but it takes time. It also implements a subset of the Rackspace API.

nova-objectstore: This service stores objects. It implements the S3 API, but it's rather crude. If you're serious about storing objects, Swift is what you want.

nova-compute: The component that runs virtual machines.

nova-network: The network worker. Depending on configuration, it may just assign IPs, or it could work as the gateway for a bunch of NAT'ed VMs.

nova-scheduler: The scheduler decides which host gets to run the VM. When a user wants to run a virtual machine, they send a request to the API server. The API server asks the network worker for an IP and then passes off handling to the scheduler.

Architectural Overview for OpenStack Compute

 

== Architectural Overview ==
Start with the fundamental differences between Files, Servers, and *other bits* (auth, monitoring, etc.).
Supporting infrastructure:
  * authentication
  * logging
  * monitoring
Maybe start from here: http://wiki.openstack.org/Overview
Start a glossary of terms so we're all on the same page (huddle vs pod vs cluster vs node).
Define the problem we're trying to solve at a high level and introduce a couple of high-level architectures that we can drill down into.
Core principles/concerns - horizontal scale-out, async operations, distributed, deterministic operations, share-nothing architecture, modular.
What is already complete in Nova, what needs to change? High level list that we can drill down into during sessions.
How is the integration between Nova and Cloud Files going?
=== OpenStack Compute (Nova) ===
MISSION:
1,000,000 hosts
60,000,000 guests
Will support multiple apis at the same time.
A native OpenStack API (based on the Rackspace API) and EC2 are already supported.
PROPOSAL: Taxonomy discussion, is multi-tenancy inherently supported?
=== OpenStack ObjectStore (Swift) ===
100 Petabytes per cluster
100,000 requests / second
Hundreds of billions of objects

OVERVIEW
This document contains the basic technical overview of the OpenStack Compute technology offering. It is intended as an introductory technical document and is not targeted at developers or architects but rather at a technical audience looking to understand the basic structure and components of Compute. Also, this document is concerned with the simple components of the computing system and should not be used as a guide to building Compute.
INTRODUCTION
OpenStack Compute is the underlying cloud computing fabric controller for the OpenStack cloud. This means that all activities within the OpenStack cloud, such as starting and stopping virtual machines, are handled by this product. In effect, OpenStack Compute is the "operating system" for the OpenStack cloud and manages all resources, security issues, and scalability needs for the cloud.
COMPONENTS
Messaging Server
The entire Compute infrastructure communicates via an asynchronous message queue system running RabbitMQ (www.rabbitmq.com). It is critical that any task waiting for a response to a message not block other tasks, so an asynchronous messaging solution was chosen.
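As a rough illustration of the fire-and-forget style this implies, the sketch below publishes a task message to a RabbitMQ queue using the third-party pika client; the queue name, method, and arguments are invented for the example and are not Nova's actual wire format.

    import json

    import pika  # third-party RabbitMQ client, used here purely for illustration

    # Connect to a local RabbitMQ broker and declare a work queue for the
    # hypothetical "compute" topic.
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="compute")

    # Publish the message and return immediately -- the caller never blocks
    # waiting for the compute worker to finish the task.
    message = {"method": "run_instance", "args": {"instance_id": "i-00000001"}}
    channel.basic_publish(exchange="", routing_key="compute", body=json.dumps(message))
    connection.close()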
User Data Server
This server uses OpenLDAP to store all information about the users, projects, and roles that are authorized to use the Compute infrastructure.
  • You can use Redis as storage in fake LDAP mode. This is the only way that Redis is used right now.
  • Users, Projects, and Roles can also now be stored in a SQL DB
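For a feel of the kind of lookup the auth layer performs against the directory, here is a minimal sketch using the python-ldap library; the server URL, DN layout, and attribute names are hypothetical and do not reflect Nova's real schema.

    import ldap  # python-ldap; the DN layout below is a made-up example

    # Bind to the directory and look up a user record.  This only illustrates
    # the kind of query the auth layer performs.
    conn = ldap.initialize("ldap://localhost")
    conn.simple_bind_s("cn=admin,dc=example,dc=com", "secret")

    results = conn.search_s(
        "ou=Users,dc=example,dc=com",
        ldap.SCOPE_SUBTREE,
        "(uid=alice)",
        ["uid", "accessKey", "secretKey"],  # attribute names are hypothetical
    )
    for dn, attrs in results:
        print(dn, attrs)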
State Server
This server uses Redis (http://code.google.com/p/redis) to provide atomic database operations for data structures, providing a fast, shared storage tool to the platform. As a distributed key/value database, this service stores xxx
  • Now, state is stored in a SQL DB (via SQLAlchemy, but that is a development detail not needed here). By default SQLite is used, which works only for a single server. On a multi-node install, a real SQL server is needed (MySQL, PostgreSQL).
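A minimal sketch of the difference, using SQLAlchemy engine URLs (connection details are placeholders): the SQLite file lives on one host, so other nodes cannot share it, whereas a MySQL or PostgreSQL server can be reached by every node.

    from sqlalchemy import create_engine, text

    # Single-node default: a local sqlite file.  Fine for development, but the
    # file lives on one host, so other nodes cannot share the same state.
    engine = create_engine("sqlite:///nova.sqlite")

    # Multi-node install: point every service at a real SQL server instead.
    # Connection details are placeholders.
    # engine = create_engine("mysql://nova:password@db-host/nova")
    # engine = create_engine("postgresql://nova:password@db-host/nova")

    with engine.connect() as conn:
        print(conn.execute(text("SELECT 1")).scalar())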
HTTP Server
This server handles large files by running the nginx (http://nginx.org/en) HTTP server. <*** What are these large files? ***> Large files are images of systems. Tornado cannot serve such big files without problems; nginx can. But in the future, we might not need nginx anymore (once we get rid of Tornado).
The file serving is now done through Twisted. The objectstore is a simple file-based storage for images, and it will be replaced by the Glance project in the future.
Cloud Controller
=nova :)  -- maybe you mean 'Public API Server'
Volume Node
User Manager
Network Controller

Live Notes may be taken for this topic at: http://etherpad.openstack.org/Architecture and http://etherpad.openstack.org/nova-archdoc  

[Figure: overview.png]

 

“Small” components, loosely coupled

 

  • Queue based (currently AMQP/RabbitMQ)
  • Flexible schema for datastore (currently Redis)
  • LDAP (allows for integration with MS Active Directory via translucent proxy)
  • Workers & Web hooks (be of the web)

  • Asynchronous everything (don't block)
  • Components (queue, datastore, HTTP endpoints, ...) should scale independently and allow visibility into internal state (for the pretty charts/operations)

 

Development goals

  • Testing & Continuous Integration

  • Fakes (allows development on a laptop)
  • Adaptable (goal is to make integration with existing resources at organization easier)

 

Queue

  • Each worker/agent listens on a general topic, and a subtopic for that node. An example would be "compute" & "compute:hostname"

  • Messages in the queue are currently Topic, Method, Arguments - which maps to a method in the python class for the worker

  • exposed via method calls
    • rpc.cast to broadcast the message and not wait for a response

    • rpc.call to send a message and wait for the response
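A minimal sketch of the "message maps to a method on the worker" idea; the worker class and message below are illustrative, not Nova's actual code.

    # The worker exposes plain methods; the queue consumer looks the method
    # named in the message up on the worker and calls it with the arguments.

    class ComputeWorker:
        """Pretend worker listening on the "compute" topic."""

        def run_instance(self, instance_id):
            print("starting instance", instance_id)

        def terminate_instance(self, instance_id):
            print("terminating instance", instance_id)

    def dispatch(worker, message):
        # A queue message carries a method name and its keyword arguments.
        method = getattr(worker, message["method"])
        return method(**message["args"])

    # rpc.cast-style usage: fire the message and do not wait on a result.
    dispatch(ComputeWorker(),
             {"method": "run_instance", "args": {"instance_id": "i-00000001"}})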

 

Datastore

  • Pre-Austin, data is stored in Redis 2.0 (RC)
  • Do the work on write - make reads FAST
    • maintain indexes / lists of common subsets
    • use pools (SETs in redis) that are drained for IPs instead of tracking what is allocated
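A sketch of the pool idea using the redis-py client; the key name and addresses are made up for the example.

    import redis  # redis-py client

    r = redis.Redis(host="localhost")

    # Do the work on write: pre-populate a pool of free addresses as a SET...
    r.sadd("network:10.0.0.0/24:free_ips", "10.0.0.3", "10.0.0.4", "10.0.0.5")

    # ...so allocation is a single atomic pop instead of scanning for what is
    # already allocated.
    allocated = r.spop("network:10.0.0.0/24:free_ips")
    print("allocated", allocated)

    # Releasing an address just puts it back into the pool.
    r.sadd("network:10.0.0.0/24:free_ips", allocated)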

 

Delta

  • Scheduler does not exist (instances are distributed via the queue to the first worker that consumes the message)
  • Objectstore in Nova is a naive stub which would be replaced with Cloud Files in production (a simple object store that mimics Cloud Files might be good for development)
  • Tornado should be phased out in favor of a WSGI-based web framework

 

Networking

Currently, there are three strategies for networking, implemented by different managers:

  • FlatManager -- IP addresses are grabbed from a network and injected into the image on launch. All instances are attached to the same manually configured bridge.

  • FlatDHCPManager -- IP addresses are grabbed from a network, and a single bridge is created for all instances. A DHCP server is started to pass out addresses.
  • VlanManager -- each project gets its own VLAN, bridge, and network. A DHCP server is started for each VLAN, and all instances are bridged into that VLAN.

The implementation of creating bridges, VLANs, DHCP servers, and firewall rules is done by the linux_net driver. This layer of abstraction exists so that we can at some point support configuring hardware switches etc. using the same managers.
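As a rough idea of what such a driver does, the sketch below shells out to standard Linux tools to bring up a per-project VLAN and bridge; the interface names and VLAN id are placeholders, and the real linux_net driver differs in detail.

    import subprocess

    def execute(*cmd):
        """Run a host command; a stand-in for the driver's command runner."""
        subprocess.check_call(cmd)

    # Create a vlan interface on the physical NIC, bridge it, and bring both up.
    execute("ip", "link", "add", "link", "eth0", "name", "vlan100",
            "type", "vlan", "id", "100")
    execute("ip", "link", "set", "vlan100", "up")
    execute("brctl", "addbr", "br100")
    execute("brctl", "addif", "br100", "vlan100")
    execute("ip", "link", "set", "br100", "up")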

Networking Overview

  

Live notes may be taken for this topic at: http://etherpad.openstack.org/Networking  

 

Nova Implementation

Current implementation

  • Private networking and VPNs
    • Instances attached to separated VLAN tagged bridges
  • IP Address allocation handled by API
  • DHCP Server assigns addresses

[Figure: NovaNetworkingDiagram.png]

Instance launch network steps

  • On Network Node
    • If vlan doesn't exist:
      • create vlan and bridge for project
      • run dhcp server bridged into vlan
    • generate mac address
    • if cloudpipe instance:
      • give specific ip to instance
    • else:
      • find free private ip
    • configure dhcp server with mac and ip
  • On Compute Node
    • If vlan doesn't exist:
      • create vlan and bridge for project
    • Spawn vm and nic with specified mac address
    • Bridge the vm nic into the project vlan
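A high-level sketch of the network-node half of these steps; all helper names and data are illustrative, not Nova's actual API.

    import random

    def generate_mac():
        # Locally administered MAC with random lower bytes.
        tail = [random.randint(0x00, 0xFF) for _ in range(3)]
        return "02:16:3e:" + ":".join("%02x" % b for b in tail)

    def ensure_vlan_and_bridge(project):
        print("ensuring vlan/bridge (and dhcp server) for", project["name"])

    def configure_dhcp(project, mac, ip):
        print("dhcp entry:", mac, "->", ip)

    def setup_instance_network(project, free_ips, is_cloudpipe=False):
        ensure_vlan_and_bridge(project)       # create vlan/bridge if missing
        mac = generate_mac()
        if is_cloudpipe:
            ip = project["cloudpipe_ip"]      # cloudpipe gets a specific address
        else:
            ip = free_ips.pop()               # otherwise take any free private ip
        configure_dhcp(project, mac, ip)      # register the mac/ip pair
        return mac, ip

    mac, ip = setup_instance_network({"name": "proj1", "cloudpipe_ip": None},
                                     free_ips=["10.0.0.3", "10.0.0.4"])
    print(mac, ip)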

Volume creation network steps

  • volume node creates lvm
  • volume node exposes lvm using vblade-persist
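Illustrative volume-node commands, wrapped in Python the way a worker might run them: carve a logical volume out of a volume group and export it over ATA-over-Ethernet with vblade-persist. The volume group, shelf, slot, and interface are placeholders.

    import subprocess

    def execute(*cmd):
        subprocess.check_call(cmd)

    # Create the logical volume, then export it as AoE shelf 0, slot 1 on eth1.
    execute("lvcreate", "-L", "10G", "-n", "volume-00000001", "nova-volumes")
    execute("vblade-persist", "setup", "0", "1", "eth1",
            "/dev/nova-volumes/volume-00000001")
    execute("vblade-persist", "start", "0", "1")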

Volume attach network steps

  • compute node discovers volume
  • compute node attaches volume to vm as pci device
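A corresponding compute-node sketch: rescan for exported AoE devices, then hand the discovered block device to the guest (shown here with virsh; the domain name, shelf/slot device, and target are placeholders).

    import subprocess

    def execute(*cmd):
        subprocess.check_call(cmd)

    # Discover exported AoE volumes, then attach the block device to the guest.
    execute("aoe-discover")
    execute("virsh", "attach-disk", "instance-00000001",
            "/dev/etherd/e0.1", "vdb")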

IP association

  • Find free public ip
  • Associate the ip with public interface
  • Set up iptables rules to forward to private ip
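A sketch of the commands behind these three steps; the addresses and interface are examples only.

    import subprocess

    def execute(*cmd):
        subprocess.check_call(cmd)

    public_ip = "203.0.113.10"    # example addresses only
    private_ip = "10.0.0.3"

    # Bind the public address to the host's public interface, then forward
    # traffic between it and the instance's private address.
    execute("ip", "addr", "add", public_ip + "/32", "dev", "eth0")
    execute("iptables", "-t", "nat", "-A", "PREROUTING",
            "-d", public_ip, "-j", "DNAT", "--to-destination", private_ip)
    execute("iptables", "-t", "nat", "-A", "POSTROUTING",
            "-s", private_ip, "-j", "SNAT", "--to-source", public_ip)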

Future implementation

  • Pluggable Network Implementation
    • Support for flat networking model
    • Support for IP injection
  • Network is its own worker process and uses queue
  • Support for dedicated network hardware

 

Rackspace Implementation

Current implementation

  • Flat Network Design
  • Networking configurations injected into instances, or pulled via a Guest Agent
  • IPs pulled from Cluster Controller per network group.
  • Instances protected by various IPTables, Ebtables, Arptables rules
    • Protects instances from IP/MAC Address Spoofing
    • Protects instances from ARP Poisoning Attacks
  • Host machines connect to three datacenter networks: public, service-net, management-net
    • Management-net is used for communication from controllers to host.
  • Instances are connected to a single bridge for each network (public, service-net)
  • Bandwidth throttling.

Future implementation

  • Addition of host-net bridge for internal communication from Instances.
    • Needed for hypervisor-agnostic communication between host and guest. (We can't rely only on XenStore.)

    • IPs assigned via DHCP over local host network.
    • Is this an additional guest network interface, or does it piggyback on an existing one?
  • Open vSwitch

    • Instance networking protection rules could be pushed into the vSwitch.

 

IPv6

IPv6 should have first-class support; we can derive IPv4 address binding from the IPv4-to-IPv6 mapping space and configuration options.

IPv4 Countdown Clock

For more discussion of network architecture, see Networking.


Original source: https://www.cnblogs.com/chinacloud/p/1917654.html