Frank McSherry(Microsoft Research Silicon Valley)
"Naiad: a system for iterative, incremental, and interactive distributed dataflow"
In this talk I'll describe the Naiad system, based on a new model for low-latency incremental and iterative dataflow. Naiad is designed to provide three properties we do not think yet exist in a single system: the expressive power of loops, concurrent vertex execution, and fine-grained edge completion. Removing any one of these requirements yields an existing class of solutions (respectively: streaming systems like StreamInsight, iterative incremental systems like Nephele, and callback systems like Percolator), but all three together appear to require a new system design. We will describe Naiad's structured cyclic dataflow model and protocol for tracking and coordinating outstanding work, more closely resembling memory fences than traditional distributed systems barriers. We give several examples of how Naiad can be used to efficiently implement many of the currently popular "big data" programming patterns, as well as several new ones, and experimental results indicating that Naiad's relative performance ranges from "as good as" to "much better than" existing systems.
This is joint work with Derek G. Murray, Rebecca Isaacs, Michael Isard, Paul Barham, and Martin Abadi.
Bio: Frank joined the MSR Silicon Valley lab in 2002,
immediately after completing his graduate degree with Anna Karlin
at the University of Washington. His interests are in Privacy and
large scale Data Mining and Analysis, and the theoretical and practical
issues surrounding them. In particular, he helped to develop the
recent definition of Differential Privacy, and has been designing
and implementing the Privacy Integrated Queries data
analysis platform providing these guarantees.
|Zeit:||Montag, 24.06.2013, 14.00 Uhr|
|Ort:||MPI-SWS Gebäude Kaiserslautern, Paul Ehrlich Str. Gebäude G26, Raum 113|
|Hinweis:||Der Vortrag wird live zum MPI-SWS Gebäude Saarbrücken, Wartburg, 5. Etage übertragen.|