[AWS DA Guru] Kinesis

Overview

Kinesis Streams

Streaming data and video in real-time

Kinesis Data Firehose

Load streaming data into data stores for analytics with existing BI tools

Kinesis Data Analytics

Real-time data analytics with SQL

Kinesis Streams

  • Has Producers and Consumers
  • Has Shards
  • Has Retention period

Kinesis Shard

The data capacity of a stream is determined by its number of shards. If the data rate increases, you can add capacity to the stream by increasing the number of shards.

  • Kinesis streams are made up of shards
  • Each shard is a sequence of one or more data records and provides a fixed unit of capacity
  • Reads: up to 5 read transactions per second per shard. Max total read rate is 2 MB per second
  • Writes: up to 1,000 records per second per shard. Max total write rate is 1 MB per second
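
The per-shard limits above can be turned into a rough sizing calculation. The following is a minimal sketch, not an official AWS formula: the function name `shards_needed` and the example workload figures are assumptions made for illustration, and it only considers the write rate, record count, and read rate listed in these notes.

```python
import math

# Per-shard limits as listed in the notes above.
WRITE_MB_PER_SEC = 1.0        # max total write rate per shard
WRITE_RECORDS_PER_SEC = 1000  # max records written per second per shard
READ_MB_PER_SEC = 2.0         # max total read rate per shard


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Estimate the shard count from whichever limit is hit first."""
    by_write_bytes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC)
    by_write_count = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    by_read_bytes = math.ceil(read_mb_per_sec / READ_MB_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_write_bytes, by_write_count, by_read_bytes)


# Hypothetical workload: 3.5 MB/s in, 2,500 records/s, 7 MB/s out.
print(shards_needed(3.5, 2500, 7.0))  # prints 4
```

In this example the write and read byte rates are the binding limits (4 shards each), while the record count alone would only require 3.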

Kinesis Firehose

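  • No shards to manage
  • No consumers to write; Firehose delivers the data to its destination
  • Analyse the delivered data with existing BI tools
  • Stores data in a destination such as Amazon S3 or Amazon Redshift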

Kinesis Data Analytics

  • Real-time
  • SQL

Kinesis Client Library

  • The KCL ensures that for every shard there is a record processor.
  • If you have only one consumer, the KCL will create all the record processors on that single consumer.
  • If you have two consumers, it will load balance and create half the processors on one instance and half on the other.
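
The balancing behaviour described above can be pictured with a small simulation. This is only an illustrative sketch: the real KCL coordinates workers through leases rather than a simple round-robin, and the function `assign_shards` and the worker names are hypothetical.

```python
def assign_shards(shard_ids, worker_ids):
    """Spread one record processor per shard evenly across workers.

    Illustration only: simple round-robin, not the KCL's actual
    lease-based coordination.
    """
    assignment = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(shard_ids):
        worker = worker_ids[index % len(worker_ids)]
        assignment[worker].append(shard)
    return assignment


# Four shards, named in the usual shardId-<12 digits> style.
shards = [f"shardId-{i:012d}" for i in range(4)]

# One consumer: all four record processors land on it.
print(assign_shards(shards, ["worker-a"]))

# Two consumers: two record processors each.
print(assign_shards(shards, ["worker-a", "worker-b"]))
```

With a single worker the mapping contains all four shards under `worker-a`; with two workers each receives two, matching the half-and-half split described in the list.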

Scaling Out Consumers

  • With the KCL, the number of instances should not exceed the number of shards.
  • You never need multiple instances to handle the processing load of one shard.
  • However, one worker can process multiple shards.
  • It is fine for the number of shards to exceed the number of instances.
  • Resharding does not by itself mean you need more instances.
  • Instead, CPU utilisation is what should drive the number of consumer instances you have, NOT the number of shards in your Kinesis stream.
  • Use an Auto Scaling group, and base scaling decisions on the CPU load of your consumers.
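
One way to base scaling on consumer CPU load is a target tracking policy on the Auto Scaling group. The sketch below only builds the request parameters, so it runs without AWS access. The group name `kinesis-consumers`, the policy name, and the 60% target are assumptions chosen for illustration, not values from these notes.

```python
def cpu_target_tracking_policy(group_name, target_cpu_percent=60.0):
    """Build keyword arguments for the Auto Scaling PutScalingPolicy API.

    The group name and CPU target are illustrative assumptions.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU utilisation across the group's instances.
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


params = cpu_target_tracking_policy("kinesis-consumers")
print(params["PolicyType"])

# With boto3 installed and credentials configured, the parameters
# could be applied like this:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**params)
```

Keeping the policy keyed on CPU rather than shard count reflects the point above: a reshard changes how many record processors each instance runs, not necessarily how many instances are needed.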

Original article: https://www.cnblogs.com/Answer1215/p/14732642.html