Saturday, August 10, 2019

Martin Kleppmann | Kafka Summit SF 2018 Keynote (Is Kafka a Database?)

Here is the link.

Kafka Summit SF 2018 Keynote by Martin Kleppmann (Researcher, University of Cambridge). Martin Kleppmann is a distributed systems researcher at the University of Cambridge, and author of the acclaimed O’Reilly book “Designing Data-Intensive Applications” (http://dataintensive.net/). Previously he was a software engineer and entrepreneur, co-founding and selling two startups, and working on large-scale data infrastructure at LinkedIn.


ACID - Atomicity, Consistency, Isolation, Durability

Durability

3:05 Kafka provides durability
Write to disk, async
Archive tape, fsync to disk
Copy to several machines: if you lose a machine, you still have the data.
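
As a rough illustration of the "fsync to disk" step in the notes above, here is a minimal Python sketch of a durable append to a local file. The helper names (durable_append, read_records) and the file name are made up for this example; it says nothing about how Kafka itself flushes or replicates data.

```python
import os
import tempfile

def durable_append(path, record):
    """Append one record and force it to stable storage before returning."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()              # push Python's buffer to the operating system
        os.fsync(f.fileno())   # ask the OS to flush its cache to the disk

def read_records(path):
    """Read back every record, one per line."""
    with open(path, "rb") as f:
        return f.read().splitlines()

# Hypothetical log file in a fresh temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "events.log")
durable_append(log_path, b"event-1")
durable_append(log_path, b"event-2")
records = read_records(log_path)
```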

Atomicity - Handling faults (crashes)

Concurrency

When data has to be written to different places, do not let the web app write to those places directly. Write to Kafka instead.

The Kafka log, guarantee now ..., writing a single event to a stream is easy to make atomic.

Each of those places independently consumes the log.
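
To make the idea above concrete, here is a small in-memory Python sketch, not real Kafka client code. The Log and Consumer classes, and the two "places" (a dict standing in for a database and a set standing in for a search index), are assumptions chosen for illustration. The app performs a single append, and each consumer applies the events at its own pace from its own offset.

```python
class Log:
    """Append-only list standing in for a single Kafka topic partition."""
    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)     # the only write the app performs
        return len(self._events) - 1   # offset of the new event

    def read_from(self, offset):
        return self._events[offset:]

class Consumer:
    """Tracks its own offset, so it can lag or restart independently."""
    def __init__(self, log, apply):
        self.log, self.apply, self.offset = log, apply, 0

    def poll(self):
        for event in self.log.read_from(self.offset):
            self.apply(event)
            self.offset += 1

log = Log()
database, search_index = {}, set()   # hypothetical downstream systems

db_consumer = Consumer(log, lambda e: database.__setitem__(e["id"], e["text"]))
index_consumer = Consumer(log, lambda e: search_index.update(e["text"].split()))

log.append({"id": 1, "text": "hello kafka"})
log.append({"id": 2, "text": "hello log"})

db_consumer.poll()      # the database catches up first
index_consumer.poll()   # the index catches up later, from its own offset
```

Because the event is either in the log or not, a crash of one consumer does not leave the other places half-updated; the crashed consumer can resume from its last offset.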

13:57/ 28:14

How to do it in Kafka?

Put an event in Kafka, using JSON to document the transaction.

{"eventType": "transfer", "fromAccount": 12345, "toAccount": 54321, "amount": 100.0, "eventId": "abcdef"}

Two events: debit 100 from the source account, credit 100 to the destination account
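
The split into two events can be sketched in plain Python, with no Kafka involved. The function split_transfer, the class BalanceProcessor, and the "delta" field are names invented for this example. Using the event ID to skip redelivered events is an assumption about what the ID is for, not something these notes spell out.

```python
def split_transfer(event):
    """Stage 1: turn one transfer event into a debit and a credit event."""
    return [
        {"eventId": event["eventId"], "account": event["fromAccount"],
         "delta": -event["amount"]},
        {"eventId": event["eventId"], "account": event["toAccount"],
         "delta": event["amount"]},
    ]

class BalanceProcessor:
    """Stage 2 (assumed): apply debit/credit events to account balances."""
    def __init__(self):
        self.balances = {}
        self.seen = set()   # (eventId, account) pairs already applied

    def apply(self, event):
        key = (event["eventId"], event["account"])
        if key in self.seen:   # redelivered event: skip it
            return
        self.seen.add(key)
        account = event["account"]
        self.balances[account] = self.balances.get(account, 0.0) + event["delta"]

transfer = {"eventType": "transfer", "fromAccount": 12345,
            "toAccount": 54321, "amount": 100.0, "eventId": "abcdef"}

processor = BalanceProcessor()
derived = split_transfer(transfer)
for e in derived + derived:   # replay every event once to simulate redelivery
    processor.apply(e)
```

Even though each derived event is delivered twice here, every balance changes only once, and the two changes cancel out.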

Second stream process -



Isolation - ACID

Serializable - as if transactions ran one at a time, in some serial order, ...

relational database:

start transaction;

select count(*)
from user_account
where user = 'jane';

Non-serializable execution
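
A minimal Python simulation of why the check above can go wrong, and of one way a log can avoid it. Both functions and the request IDs are hypothetical. The first function interleaves two check-then-insert transactions so that both checks run before either insert. The second assumes all claims for a username go through one log and are processed strictly in order.

```python
def claim_interleaved(accounts, requests):
    """Check-then-insert where every check runs before any insert."""
    counts = [sum(1 for a in accounts if a == user) for _, user in requests]
    for (_, user), count in zip(requests, counts):
        if count == 0:              # each transaction saw "name is free"
            accounts.append(user)
    return accounts

def claim_from_log(requests):
    """Process claims one at a time, in log order."""
    owners, outcomes = {}, {}
    for request_id, user in requests:
        if user in owners:
            outcomes[request_id] = "rejected"
        else:
            owners[user] = request_id
            outcomes[request_id] = "accepted"
    return outcomes

requests = [("req-1", "jane"), ("req-2", "jane")]
racy = claim_interleaved([], requests)   # both inserts succeed
serial = claim_from_log(requests)        # only the first claim wins
```

In the interleaved run the username ends up registered twice, which no serial order of the two transactions could produce. Processing the claims sequentially gives the same result as running them one after another.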

Consistency - ACID

Enforcing invariants
Integrity ...

Sooo... is this a database?
No ad-hoc queries, for sure ...

26:50
Transactions broken down into multi-stage stream pipelines

But: stronger consistency properties than many distributed datastores...



