Modularis: Modular Data Analytics for Hardware, Software, and Platform Heterogeneity

04/07/2020
by   Dimitrios Koutsoukos, et al.

Today's data analytics displays an overwhelming diversity along many dimensions: data types, platforms, hardware acceleration, etc. As a result, system design often has to choose between depth and breadth: high efficiency for a narrow set of use cases, or generality at lower performance. In this paper, we pave the way to get the best of both worlds: we present Modularis, an execution layer for data analytics based on fine-grained, composable building blocks that are as generic and simple as possible. These building blocks are similar to traditional database operators, but at a finer granularity, so we call them sub-operators. Sub-operators can be freely and easily combined. As we demonstrate with concrete examples in the context of RDMA-based databases, Modularis' sub-operators can be combined to perform the same task as a complex, monolithic operator. Unlike a monolithic operator, however, sub-operators can be reused, can be offloaded to different layers or accelerators, and can be customized to specialized hardware. In the use cases we have tested so far, sub-operators reduce the amount of code significantly (for a distributed, RDMA-based join, for example, by a factor of four) while incurring minimal performance overhead. Modularis is an order of magnitude faster on SQL-style analytics than a commonly used framework for generic data processing (Presto) and on par with a commercial cluster database (MemSQL).
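
To make the idea of composing sub-operators concrete, the following is a minimal, single-process sketch in Python. It is not Modularis code: the function names (`scan`, `partition`, `build`, `probe`, `partitioned_hash_join`) and their signatures are hypothetical, chosen only to illustrate how small, generic building blocks might be chained to do the work of one monolithic join operator. Nothing here involves RDMA or distribution.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]
Key = Callable[[Row], int]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical sub-operator: emit rows one at a time."""
    yield from rows


def partition(rows: Iterable[Row], key: Key, n: int) -> List[List[Row]]:
    """Hypothetical sub-operator: split rows into n buckets by key hash."""
    buckets: List[List[Row]] = [[] for _ in range(n)]
    for row in rows:
        buckets[hash(key(row)) % n].append(row)
    return buckets


def build(rows: Iterable[Row], key: Key) -> Dict[int, List[Row]]:
    """Hypothetical sub-operator: build a hash table over one input."""
    table: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        table[key(row)].append(row)
    return table


def probe(rows: Iterable[Row], table: Dict[int, List[Row]],
          key: Key) -> Iterator[Tuple[Row, Row]]:
    """Hypothetical sub-operator: look up each row in the hash table."""
    for row in rows:
        for match in table.get(key(row), []):
            yield match, row


def partitioned_hash_join(left: Iterable[Row], right: Iterable[Row],
                          key: Key, n: int = 4) -> Iterator[Tuple[Row, Row]]:
    """A join expressed purely as a composition of the pieces above."""
    left_parts = partition(scan(left), key, n)
    right_parts = partition(scan(right), key, n)
    # Matching keys always land in the same bucket index on both sides.
    for lp, rp in zip(left_parts, right_parts):
        yield from probe(scan(rp), build(scan(lp), key), key)


if __name__ == "__main__":
    left = [(1, "a"), (2, "b"), (3, "c")]
    right = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    for pair in sorted(partitioned_hash_join(left, right, key=lambda r: r[0])):
        print(pair)
```

In this sketch, each piece could in principle be swapped independently (for instance, a different `partition` for a different platform) without touching the others, which is the kind of reuse the abstract attributes to sub-operators.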
