G-Research Distinguished Speaker Series: Apache Arrow - High Performance Columnar Data Framework

In the latest edition of the G-Research Distinguished Speaker Series, Wes McKinney, co-creator of Apache Arrow and creator of the pandas library for Python, discusses the latest developments in Apache Arrow, a multi-language toolbox for accelerated data interchange and in-memory processing.

Learn more about G-Research (https://www.gresearch.co.uk/) and view talks from speakers such as Martin Hairer and Shoshana Zuboff in our Distinguished Speaker Series playlist.

In this talk, Wes McKinney covers the following topics:

Compute and Data Silos
Trends in Hardware
Apache Arrow and where it’s up to
Defragmentation Interoperability Standard In-Memory Format Goals
Apache Arrow Data Types
Apache Arrow Streaming Binary Protocol
Zero-copy data interchange
High Performance Bridge to Storage
Arrow Flight and fast data sharing with Arrow Flight
Parallel Data Access with Arrow Flight
Arrow Flight SQL
Query Engines for Arrow
Near future: Modular Arrow Computing
Substrait: Serialised Relational Algebra
Portable Query Plans / Substrait in perspective
Analytics database Architecture
Analytics database, deconstructed
Apache Arrow 7.0.0
Coming soon with Apache Arrow Engine Interfaces in Python