Building Adaptive Systems
Chris Keathley / @ChrisKeathley / [email protected]
Server  Server
Server  Server  “I have a request”
Server  Server  “No problem!”
Server  Server  “Thanks!”
Server  Server  “I’m a little busy”
Server  Server  “I’m a little busy”  “I have more requests!”
Server  Server  “I don’t feel so good”
Server
Server  “Welp”
All services have objectives
A resilient service should be able to withstand a 10x traffic spike and continue to meet those objectives
Let’s Talk About…
Queues
Overload Mitigation
Adaptive Concurrency
What causes overload?
What causes overload?
Queue → Server
Overload happens when requests arrive faster than they can be processed: Arrival Rate > Service Rate
Little’s Law
Elements in the queue = Arrival Rate * Processing Time
Little’s Law
Server at 100 ms:  1 request  = 10 rps * 100 ms
Server at 200 ms:  2 requests = 10 rps * 200 ms  (BEAM processes pile up, CPU pressure builds)
Server at 300 ms:  3 requests = 10 rps * 300 ms
Server at 3000 ms: 30 requests = 10 rps * 3000 ms
As processing time heads to ∞: ∞ requests = 10 rps * ∞ ms
This is bad
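The arithmetic on these slides is easy to check directly. A minimal sketch of Little’s Law (the function name `in_flight` is my own, not from the talk):

```elixir
# Little's Law: elements in the system = arrival rate * processing time.
# arrival_rate is in requests/second, processing_time_ms in milliseconds,
# so we divide by 1000 to keep the units consistent.
in_flight = fn arrival_rate, processing_time_ms ->
  arrival_rate * processing_time_ms / 1000
end

in_flight.(10, 100)   # 10 rps * 100 ms  -> 1.0 request in flight
in_flight.(10, 3000)  # 10 rps * 3000 ms -> 30.0 requests in flight
```

As processing time grows without bound, so does the number of queued requests, which is exactly the failure mode the slides describe.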
Overload: Arrival Rate > Service Rate
We need to get these under control
Load Shedding
Server → Queue → Server
Receiver: drop requests
Sender: stop sending
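A bounded queue is the simplest form of load shedding: past the bound, new work is rejected instead of queued. A minimal sketch (the module name `Shedder` and the fixed limit are my own, not from the talk):

```elixir
defmodule Shedder do
  # Admit a request only while the number of in-flight requests
  # is below the configured bound; otherwise shed it immediately
  # rather than letting the queue grow without limit.
  def admit(in_flight, limit) when in_flight < limit, do: :ok
  def admit(_in_flight, _limit), do: {:error, :overloaded}
end

Shedder.admit(5, 10)   # :ok — under the bound
Shedder.admit(10, 10)  # {:error, :overloaded} — shed the request
```

Rejecting fast keeps processing time (and therefore, by Little’s Law, queue depth) bounded.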
Autoscaling
Server + Server → DB
Requests start queueing
Add another Server
Now it’s worse: the new server just adds load to the already-saturated DB
Autoscaling needs to be in response to load shedding
Circuit Breakers
Server → Server
Shut off traffic
“I’m not quite dead yet”
Circuit breakers are your last line of defense
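The “shut off traffic” behavior can be sketched as a tiny state machine: after a run of failures the breaker opens and callers stop sending. The module name, field names, and threshold below are my own illustration, not a real library:

```elixir
defmodule Breaker do
  defstruct state: :closed, failures: 0, threshold: 3

  # A success while closed resets the consecutive-failure count.
  def record(%Breaker{state: :closed} = b, :ok), do: %{b | failures: 0}

  # Enough consecutive failures trip the breaker open.
  def record(%Breaker{state: :closed, failures: f, threshold: t} = b, :error)
      when f + 1 >= t,
      do: %{b | state: :open, failures: f + 1}

  def record(%Breaker{state: :closed, failures: f} = b, :error),
    do: %{b | failures: f + 1}

  # Once open, stop sending traffic downstream.
  def allow?(%Breaker{state: :open}), do: false
  def allow?(%Breaker{}), do: true
end
```

A real breaker would also probe periodically (a half-open state) so traffic can resume once the downstream recovers.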
We want to allow as many requests as we can actually handle
Adaptive Limits
[chart: concurrency over time; the actual limit moves, and dynamic discovery tracks it]
Load Shedding
Server → Server
Before sending: are we at the limit?
On each response: am I still healthy?
Update limits
Adaptive Limits
[chart: concurrency over time; increased latency signals that the limit should back off]
Signals for Adjusting Limits
Latency
Successful vs. failed requests
Additive Increase Multiplicative Decrease (AIMD)
Success state: limit + 1
Backoff state: limit * 0.95
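The AIMD rule on this slide is a one-line update in each state. A minimal sketch (the module name and the clamp that keeps the limit at or above 1 are my additions):

```elixir
defmodule AIMD do
  # Additive increase: each success grows the limit by 1.
  def update(limit, :success), do: limit + 1

  # Multiplicative decrease: each backoff shrinks the limit by 5%,
  # clamped so the limit never drops below 1.
  def update(limit, :backoff), do: max(1, floor(limit * 0.95))
end

AIMD.update(100, :success)  # 101
AIMD.update(100, :backoff)  # 95
```

Growing slowly and shrinking quickly is what lets the limiter probe upward toward the true capacity while backing off sharply the moment the downstream shows distress.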
Prior Art / Alternatives
https://github.com/ferd/pobox/
https://github.com/fishcakez/sbroker/
https://github.com/heroku/canal_lock
https://github.com/jlouis/safetyvalve
https://github.com/jlouis/fuse
Regulator
https://github.com/keathley/regulator

Regulator.install(:service, [limit: {Regulator.Limit.AIMD, [timeout: 500]}])

Regulator.ask(:service, fn ->
  {:ok, Finch.request(:get, "https://keathley.io")}
end)
Conclusion
Queues are everywhere
Those queues need to be bounded to avoid overload
If your system is dynamic, your solution will also need to be dynamic
Go and build awesome stuff
Thanks
Chris Keathley / @ChrisKeathley / [email protected]