Kafka Summit SF 2018 Keynote by Martin Kleppmann (Researcher, University of Cambridge). Martin Kleppmann is a distributed systems researcher at the University of Cambridge, and author of the acclaimed O’Reilly book “Designing Data-Intensive Applications” (http://dataintensive.net/). Previously he was a software engineer and entrepreneur, co-founding and selling two startups, and working on large-scale data infrastructure at LinkedIn.
ACID - Atomicity, Consistency, Isolation, Durability
Durability
3:05 Kafka provides durability.
What counts as durable has shifted over the years: writing to disk (which the OS buffers asynchronously, so you must fsync), archiving to tape, and now copying the data to several machines, so that if you lose one machine you still have the data. Kafka takes the replication approach (producer sketch below).
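A minimal producer sketch of leaning on that replication (the broker address localhost:9092 and the topic name transactions are assumptions, not from the talk): acks=all tells the broker to acknowledge a write only once all in-sync replicas have it.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class DurableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all");                // wait for all in-sync replicas
        props.put("enable.idempotence", "true"); // retries don't duplicate the record

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // once replicated, the record survives the loss of a single broker's disk
            producer.send(new ProducerRecord<>("transactions", "abcdef", "transfer ..."));
        }
    }
}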
Atomicity - Handling faults (crashes)
Atomicity deals with faults; concurrency is handled by Isolation, below.
A web app needs to write the same data to several places (e.g. a database, a search index, a cache). Don't let the web app write to each of them directly; have it write a single event to Kafka. Appending one event to the Kafka log is atomic: it is either in the log or it isn't. The downstream systems then independently consume the log and apply the change (consumer sketch below).
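A sketch of the consuming side (the group id search-indexer and the indexing step are assumptions): each downstream system subscribes with its own consumer group, so each one independently reads the whole log at its own pace.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SearchIndexUpdater {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "search-indexer"); // each downstream system gets its own group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("transactions"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // apply the event to this system's own state (here: a stand-in for indexing)
                    System.out.printf("indexing %s%n", record.value());
                }
            }
        }
    }
}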
How to do it in Kafka?
Put an event into Kafka: a JSON document describing the transaction.

{"eventType": "transfer", "fromAccount": 12345, "toAccount": 54321, "amount": 100.0, "eventId": "abcdef"}
A stream processor consumes the transfer event and splits it into two events: take 100 out of the source account, and put 100 into the destination account (see the sketch below). A second stream processor can then consume those events to maintain each account's balance. Either the transfer event is in the log (and both derived events eventually follow) or it isn't: that is the atomicity.
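A minimal Kafka Streams sketch of that split (the topic names transfers and account-movements, the flat JSON shape above, and Jackson for parsing are all assumptions): one transfer record fans out into a debit and a credit, keyed by account, and exactly-once processing keeps the emitted pair atomic with respect to the consumed offset.

import java.util.List;
import java.util.Properties;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class TransferSplitter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "transfer-splitter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // exactly-once: the two output events commit atomically with the input offset
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);

        ObjectMapper mapper = new ObjectMapper();
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("transfers", Consumed.with(Serdes.String(), Serdes.String()))
            .flatMap((key, value) -> {
                try {
                    JsonNode t = mapper.readTree(value);
                    String amount = t.get("amount").asText();
                    // one transfer fans out into a debit and a credit, keyed by account
                    return List.of(
                        KeyValue.pair(t.get("fromAccount").asText(), "-" + amount),
                        KeyValue.pair(t.get("toAccount").asText(), "+" + amount));
                } catch (Exception e) {
                    throw new RuntimeException("bad transfer event", e);
                }
            })
            .to("account-movements", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }
}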
Isolation - ACID
Serializable: transactions behave as if they had run one at a time, in some serial order, even though they actually run concurrently.
Example in a relational database, checking that a username is free before registering it:

BEGIN TRANSACTION;
SELECT COUNT(*) FROM user_account WHERE user = 'jane';
-- if the count is 0, the name is free, so insert it
INSERT INTO user_account (user) VALUES ('jane');
COMMIT;

Non-serializable execution: two concurrent transactions can both see a count of 0 and both insert 'jane', breaking uniqueness. A log-based way to enforce the same constraint is sketched below.
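Key every claim by the username, so all claims for one name land in the same partition and are processed in a total order by a single thread; then exactly one claimant can win. A sketch, where the topic names username-claims / username-responses, the requester id carried in the record value, and the in-memory set are all assumptions (a real processor would keep its state in a persistent, replicated store):

import java.time.Duration;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class UsernameClaimProcessor {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put("bootstrap.servers", "localhost:9092");
        cProps.put("group.id", "username-claims-processor");
        cProps.put("key.deserializer", StringDeserializer.class.getName());
        cProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "localhost:9092");
        pProps.put("key.serializer", StringSerializer.class.getName());
        pProps.put("value.serializer", StringSerializer.class.getName());

        Set<String> taken = new HashSet<>(); // in-memory for the sketch only
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            consumer.subscribe(List.of("username-claims")); // key = username
            while (true) {
                for (ConsumerRecord<String, String> claim : consumer.poll(Duration.ofSeconds(1))) {
                    // Single-threaded per partition: claims for the same username arrive
                    // in a total order, so two "count = 0" races cannot happen.
                    String outcome = taken.add(claim.key()) ? "granted" : "rejected";
                    producer.send(new ProducerRecord<>("username-responses", claim.value(), outcome));
                }
            }
        }
    }
}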
Consistency - ACID
Enforcing invariants: the integrity constraints the application cares about, e.g. every username is unique, or money taken out of one account equals money put into the other.
Sooo... is this a database?
No ad-hoc queries, for sure ...
26:50 Transactions are broken down into multi-stage stream pipelines.
But: stronger consistency properties than many distributed datastores...