Emily's Parkay butter pics made me laugh. Really enjoyed this. Great job Emily!!
@flwi 6 years ago
Wow, great presentation!
@manjunath15 5 years ago
Very informative and nicely articulated.
@maa1dz1333q2eqER 5 years ago
Great presentation, touched a lot of important areas, thanks
@HasanAmmori 2 years ago
Fantastic talk! I wish there was a little more info on the format spec itself.
@Tomracc 2 years ago
this is wonderful, enjoyed start to end :)
@TheAjit1111 4 years ago
Great talk, Thank you
@tianzhang3120 3 years ago
Awesome presentation!
@gmetrofun 4 years ago
AWS S3 supports random-access reads (via the HTTP Range header), so pushdown is also supported on AWS S3.
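For anyone wondering why ranged reads matter here: a Parquet file ends with its metadata, then a 4-byte little-endian footer length, then the magic bytes `PAR1`, so a reader can locate the footer with one small read and fetch only the parts it needs. Below is a minimal sketch of that first step. The `read_range` callable and the fake in-memory file are assumptions made for illustration; against S3, `read_range` would issue a GET with a `Range: bytes=start-(end-1)` header.

```python
import struct

MAGIC = b"PAR1"  # magic bytes at the start and end of every Parquet file


def footer_byte_range(file_size, read_range):
    """Return (start, end) offsets of the footer metadata.

    read_range(start, end) is a hypothetical callable that returns the
    bytes in [start, end), standing in for a ranged GET request.
    """
    # The last 8 bytes hold the footer length (uint32, little-endian) + magic.
    tail = read_range(file_size - 8, file_size)
    footer_len, magic = struct.unpack("<I4s", tail)
    if magic != MAGIC:
        raise ValueError("not a Parquet file")
    return file_size - 8 - footer_len, file_size - 8


# In-memory stand-in for a remote object (the contents are fake placeholders).
fake_footer = b"fake-footer-metadata"
blob = MAGIC + b"row-group-bytes" + fake_footer + struct.pack("<I", len(fake_footer)) + MAGIC

requests = []  # records which byte ranges were actually fetched


def read_range(start, end):
    requests.append((start, end))
    return blob[start:end]


start, end = footer_byte_range(len(blob), read_range)
footer = read_range(start, end)
```

Only two small ranges are fetched here and the row-group bytes are never touched. A real reader would then decode the footer and request only the column chunks whose statistics can match the filter.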
@bnsagar90 3 years ago
Can you please share some text or a link where I can read more about this? Thanks.
@betterwithrum 5 years ago
Where are the slides?
@bogdandubas3978 3 years ago
Amazing speaker!
@amitbhattacharyya5925 2 years ago
Good explanations. It would be great if they could mention some Git code.
@djibb.7876 6 years ago
Great talk!!! I set up a Spark cluster with 2 workers. I save a DataFrame using partitionBy("column x") in Parquet format to some path on each worker. The problem is that I am able to save it, but when I try to read it back I get these errors: - Could not read footer for file file´status ...... - unable to specify Schema ... Any suggestions?
@clray123 6 years ago
Eh so basically any sort of growing data can be only partitioned in one way (along the dimension of the growth - which for many use cases will be some meaningless "autoincrement" id). Which then defeats all the push-down filtering for any other dimension. Not to mention that if your data keeps growing in small increments and you need access to latest of it, you will have to jump through hoops to somehow integrate all those small increments into bigger files - because scanning 20000 tiny files ain't gonna be efficient (and this means lots of constant rewriting - that's why write speed DOES matter and it's not "write-once", but write-many)...
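To make the "integrate small increments into bigger files" step concrete, here is a rough sketch of how a compaction job might plan which small files to merge. It only groups file names by size; the rewrite itself is left out. The function name, file names, and target size are all hypothetical, and this is not a method described in the talk.

```python
def plan_compaction(file_sizes, target_bytes):
    """Group small files into batches of at most roughly target_bytes each.

    file_sizes maps a file name to its size in bytes (hypothetical input).
    Each returned batch is a list of names to rewrite as one larger file.
    """
    batches, current, current_size = [], [], 0
    for name, size in sorted(file_sizes.items()):
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Ten 1 MB increments, merged toward 4 MB output files.
sizes = {f"part-{i:05d}.parquet": 1_000_000 for i in range(10)}
batches = plan_compaction(sizes, target_bytes=4_000_000)
```

Each batch still has to be read and rewritten, which is the repeated write cost the comment above is pointing at.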
@HughMcBrideDonegalFlyer 7 years ago
Great talk on a very important (and too often overlooked) topic
@ardenjar7942 7 years ago
Awesome thanks!
@thomasgong5538 4 years ago
It has some value as a learning guide.
@deenadayalmuli2756 6 years ago
In my experience, ORC supports nesting...
@pradeep422 5 years ago
The only thing I liked is the way Emily executed it.
@mikecmw8492 5 years ago
Why is everyone a "spark expert"?? Get real and just show us how to do it...
@betterwithrum 5 years ago
There are Spark experts, just few and far between. I've hired a few, but they were unicorns.