Presto On Spark: A Unified SQL Experience

  3,116 views

Databricks

1 year ago

Presto was originally designed to run interactive queries against data warehouses, but it has since evolved into a unified SQL engine for open data lake analytics, serving both interactive and batch workloads. However, Presto doesn't scale to very large and complex batch pipelines. Presto Unlimited was designed to address these scalability challenges, but it didn't fully solve fault tolerance, isolation, and resource management.
Spark is the tool of choice across the industry for running large-scale, complex batch ETL pipelines. This motivated the development of Presto On Spark.
Presto On Spark runs Presto as a library that is submitted with spark-submit to a Spark cluster. It leverages Spark for scaling shuffle, worker execution, and resource management. Because the same Presto SQL runs in both modes, no query conversion is needed between interactive and batch use cases. The result is a performant, scalable platform with a seamless end-to-end experience for exploring and processing data.
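As a rough illustration of what "submitted with spark-submit" can look like, the sketch below assembles a launch command in Python. The talk description does not show a command, so everything here is an assumption: the launcher class name and the `--package`, `--config`, `--catalogs`, `--catalog`, `--schema`, and `--file` options follow the open source Presto documentation as best recalled, and should be checked against the Presto version in use. All file names, the master URL, and the helper function itself are hypothetical placeholders.

```python
import shlex


def build_presto_on_spark_command(
    launcher_jar,
    package_tarball,
    config_path,
    catalogs_dir,
    query_file,
    master="spark://spark-master:7077",  # placeholder master URL
    catalog="hive",
    schema="default",
    executor_cores=4,
):
    """Return a spark-submit argument list for one Presto On Spark query.

    Hypothetical helper: it only builds the argument list and runs nothing.
    """
    return [
        "spark-submit",
        # Options before the jar are consumed by Spark itself.
        "--master", master,
        "--executor-cores", str(executor_cores),
        "--conf", f"spark.task.cpus={executor_cores}",
        # Assumed launcher entry point; verify against your Presto release.
        "--class", "com.facebook.presto.spark.launcher.PrestoSparkLauncher",
        launcher_jar,
        # Options after the jar are passed to the Presto launcher.
        "--package", package_tarball,
        "--config", config_path,
        "--catalogs", catalogs_dir,
        "--catalog", catalog,
        "--schema", schema,
        "--file", query_file,
    ]


command = build_presto_on_spark_command(
    launcher_jar="presto-spark-launcher.jar",        # placeholder file name
    package_tarball="presto-spark-package.tar.gz",   # placeholder file name
    config_path="/presto/etc/config.properties",     # placeholder path
    catalogs_dir="/presto/etc/catalogs",             # placeholder path
    query_file="query.sql",                          # the unchanged Presto SQL
)
print(shlex.join(command))
```

The point of the sketch is the split in the argument list: Spark receives the scheduling and resource options, while the Presto library receives the catalog, schema, and the query file, which holds the same SQL an analyst would run interactively.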
Many analysts at Intuit use Presto to explore data in the Data Lake/S3 and use Spark for batch processing. Previously, these analysts would spend several hours converting exploratory SQL queries written for Presto into Spark SQL in order to operationalize and schedule them as data pipelines.
Presto On Spark is now used by analysts at Intuit to run thousands of critical jobs. Because no query conversion is required, it has improved analysts' productivity and empowered them to deliver insights at high speed.
Benefits from session:
Attendees will learn about the Presto On Spark architecture
Attendees will learn when to use Spark's execution engine with Presto
Attendees will learn how Intuit runs thousands of Presto jobs daily on the Databricks platform, and how to apply that approach to their own work

COMMENTS: 2
@andrescmarin, 19 days ago
Is there something like this for Trino (formerly Presto SQL)?
@hasanmougharbel8030, 1 year ago
Hey there, God bless your efforts on this channel. As a new SQL learner, I have only a few questions. I have made up my mind to work on database systems that are designed for data analytics and not merely transactional functions. Is it a good idea to start by learning SQL Server, or should I consider other software? Also, are there any ETL tools that I should use right from the beginning to ease my learning process? I aim to start with open source software or inexpensive solutions throughout my learning process. Thanks for taking care of this. Looking forward to learning from you.