Messaging at Scale at Instagram
https://www.youtube.com/watch?v=E708csv4XgY
Chained tasks (sketch below)
Batch of 10,000 followers per task
Tasks yield successive tasks
much finer-grained load balancing
Failure/Reload penalty low
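My own sketch of this pattern, written as Celery-style tasks (Celery comes up later in these notes); the task and helper names (fan_out_to_followers, get_follower_batch, push_to_feed) are made up, not Instagram's actual code.

    from celery import Celery

    app = Celery("feeds", broker="amqp://guest@localhost//")  # placeholder broker

    BATCH_SIZE = 10000  # "batch of 10,000 followers per task"

    @app.task
    def fan_out_to_followers(user_id, media_id, cursor=0):
        # Process one batch of followers, then yield a successive task.
        follower_ids = get_follower_batch(user_id, cursor, BATCH_SIZE)
        for follower_id in follower_ids:
            push_to_feed(follower_id, media_id)
        # Small batches keep load balancing fine-grained and the
        # failure/reload penalty low: a crash replays one batch, not millions.
        if len(follower_ids) == BATCH_SIZE:
            fan_out_to_followers.delay(user_id, media_id, cursor + BATCH_SIZE)

    def get_follower_batch(user_id, cursor, limit):
        return []  # placeholder: fetch `limit` follower IDs starting at `cursor`

    def push_to_feed(follower_id, media_id):
        pass  # placeholder: write one entry into the follower's feed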
Other Async tasks
Cross-posting to other networks
search indexing
spam analysis
account deletion
API hook
Gearman framework - load balancer
Gearman in production
Persistence horrifically slow, complex
So we ran out of memory and crashed, with no recovery
Single core, didn't scale well
60ms mean submission time for us
Probably should have just used Redis
(Gearman vs Redis)
Celery
Distributed task framework
Highly extensible, pluggable
Mature, feature-rich
Great tooling
Excellent Django support
celeryd runs the workers (basic shape sketched below)
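Minimal shape of a Celery task, as I understand it (broker URL and task name are placeholders, not Instagram's):

    from celery import Celery

    app = Celery("instagram_notes", broker="amqp://guest@localhost//")  # placeholder

    @app.task
    def search_index(media_id):
        pass  # illustrative body only

    # Callers enqueue work instead of running it inline:
    #   search_index.delay(media_id=1234)
    # and a celeryd worker process picks the task up and executes it.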
Which broker?
Redis
We already use it
Very fast, efficient
Polling for task distribution
Messy, non-synchronous replication
Memory limits task capacity
Beanstalk
Purpose-built task queue
Very fast, efficient
Pushes to Consumers
Spills to disk
No replication
Useless for anything else
RabbitMQ
Reasonably fast, efficient
Spill-to-disk
Low-maintenance synchronous replication
Excellent celery compatibility
Supports other use cases
We don't know Erlang
Our RabbitMQ Setup
RabbitMQ 3.0
Clusters of two broker nodes, mirrored (config sketch below)
Scale out by adding broker clusters
EC2 c1.xlarge, RAID instance storage
Way overprovisioned
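My guess at the Celery-facing side of one such cluster (hostnames and credentials are invented); the mirroring itself is a RabbitMQ policy, not a Celery setting.

    # Hypothetical celeryconfig.py for one two-node mirrored cluster.
    BROKER_URL = "amqp://ig:secret@rabbit1a.example.internal:5672//"

    # Mirroring is applied on the RabbitMQ side (3.0+ policies), roughly:
    #   rabbitmqctl set_policy ha-all "" '{"ha-mode":"all"}'
    # Scaling out means adding another independent two-node cluster and
    # spreading queues across clusters (see the kombu-multibroker notes below).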
Alerting
We use Sensu
Monitors & alerts on a queue-length threshold
Uses rabbitmqctl list_queues (example check below)
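A Sensu check is just a script that exits 0 (OK), 1 (warning), or 2 (critical). A minimal queue-length check built on rabbitmqctl list_queues might look like this; the thresholds are made up, not Instagram's.

    #!/usr/bin/env python
    # Sensu/Nagios-style check: alert when any queue length crosses a threshold.
    import subprocess
    import sys

    WARN, CRIT = 10000, 50000  # made-up thresholds

    def queue_lengths():
        out = subprocess.check_output(
            ["rabbitmqctl", "list_queues", "name", "messages"])
        for line in out.decode().splitlines():
            parts = line.split()
            if len(parts) == 2 and parts[1].isdigit():
                yield parts[0], int(parts[1])

    def main():
        status = 0
        for name, depth in queue_lengths():
            if depth >= CRIT:
                print("CRITICAL: %s has %d messages" % (name, depth))
                status = max(status, 2)
            elif depth >= WARN:
                print("WARNING: %s has %d messages" % (name, depth))
                status = max(status, 1)
        if status == 0:
            print("OK: all queues below threshold")
        sys.exit(status)

    if __name__ == "__main__":
        main()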
Scaling out
Celery only supported a single broker host when we started last year
Created a kombu-multibroker "shim" (rough idea below)
Multiple brokers used in a round-robin fashion
Breaks some Celery management tools :(
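I haven't seen the kombu-multibroker source; the idea as described is roughly to keep a connection per broker cluster and rotate through them when publishing. A rough, hypothetical sketch (URLs are placeholders, queue/exchange declaration omitted):

    from itertools import cycle
    from kombu import Connection

    BROKER_URLS = [
        "amqp://ig:secret@rabbit1a.example.internal//",
        "amqp://ig:secret@rabbit2a.example.internal//",
    ]

    _connections = cycle([Connection(url) for url in BROKER_URLS])

    def publish(queue_name, body):
        # Each publish goes to the next broker cluster in the rotation,
        # which is why cluster-spanning tools (inspect, events) get confused.
        conn = next(_connections)
        producer = conn.Producer()
        producer.publish(body, routing_key=queue_name)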
Concurrency models
multiprocessing (pre-fork)
eventlet
gevent
threads
Problem:
Network-bound tasks sometimes take a long time
Run higher concurrency?
Inefficient :(
Lower batch (prefetch) size?
Minimum is the concurrency count, still inefficient :(
Separate slow & fast tasks :)
Our concurrency levels (routing sketch below)
fast (14)
feed (12)
default (6)
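A sketch of how that separation maps onto Celery routing; the queue and task names are examples, not Instagram's, and concurrency is set per worker when it starts.

    # Hypothetical celeryconfig.py excerpt: route slow and fast work to
    # separate queues so slow tasks can't monopolize the fast workers.
    CELERY_ROUTES = {
        "tasks.deliver_feed": {"queue": "feed"},
        "tasks.cross_post": {"queue": "default"},
        "tasks.bump_counter": {"queue": "fast"},
    }

    # Each worker is then started against one queue with its own concurrency,
    # something like: celeryd -Q fast -c 14, celeryd -Q feed -c 12,
    # celeryd -Q default -c 6 (matching the levels noted above).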
Problem
Tasks fail sometimes
Worker crashes still lose tasks
Problem:
Slow tasks monopolize workers
Idempotent tasks give us choices: to retry or not to retry (retry sketch below)
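Celery makes the retry decision explicit per task. A sketch of mine (task body, exception, and delay are illustrative); acks_late is the standard Celery option for not losing a task when a worker dies mid-execution, and it only helps if the task is safe to run twice.

    from celery import Celery

    app = Celery("tasks", broker="amqp://guest@localhost//")  # placeholder

    class TransientNetworkError(Exception):
        """Placeholder for whatever transient failure the real call raises."""

    def twitter_api_call(media_id):
        pass  # placeholder for the real network call

    # acks_late=True: the broker message is only acknowledged after the task
    # body finishes, so a worker crash mid-task leaves it to be redelivered.
    @app.task(bind=True, acks_late=True, max_retries=3, default_retry_delay=30)
    def post_to_twitter(self, media_id):
        try:
            twitter_api_call(media_id)
        except TransientNetworkError as exc:
            raise self.retry(exc=exc)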
Problem:
Early on, dropped tasks
Publisher confirms (pika sketch below)
Broker confirms receipt of each task
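Publisher confirms are a RabbitMQ feature: the broker acknowledges each published message, so the publisher knows nothing was silently dropped. A minimal sketch with a recent pika (connection details are placeholders; this is the raw AMQP view, not Celery's API):

    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="tasks", durable=True)

    # Turn on publisher confirms: the broker now acks (or nacks) each publish.
    channel.confirm_delivery()

    try:
        channel.basic_publish(
            exchange="",
            routing_key="tasks",
            body=b"do-something",
            mandatory=True,
        )
        print("broker confirmed the message")
    except pika.exceptions.UnroutableError:
        print("message could not be routed to any queue")
    except pika.exceptions.NackError:
        print("broker refused (nacked) the message")

    connection.close()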
Avoid using async tasks as a "backup" mechanism that only runs during failures; it'll probably break.
Better grip on RabbitMQ performance
Utilize result storage
Single cluster for control queues
Eliminate kombu-multibroker
-- study topics --
Sensu
https://docs.sensu.io/sensu-core/1.4/reference/checks/#what-is-a-sensu-check
----------------------------------
Study one more topic
AWS Elastic Beanstalk
https://aws.amazon.com/elasticbeanstalk/
Handles capacity provisioning, load balancing, auto scaling, and application health monitoring
Removes the need to manage and configure servers, databases, load balancers, firewalls, and networks yourself