Spark magic: How high-level pipelines become distributed hardcore
Day 2
RU
Spark is one of the most popular tools for building data pipelines. Every data engineer knows Spark, blah-blah-blah… OK, but Spark is just distributed Java Streams, right? Then how does it actually work? It turns out you can't just call "flatMap" or "groupBy" on a remote machine. Codegen! Interested? Come and find out more!
Speakers
Invited experts
Evgeny Mandrikov
SonarSource