
Definitely mirrors my experience too. The vast majority of Spark jobs were easily ported to SQL/dbt, and the remaining ones are in PySpark. I used to use a lot of Scala Spark in backend data processing in 2016, but now it's almost down to zero.

Scala is a real impediment to making data processing accessible to the general public in your company. The order of preference now at my company is:

1. SQL 2. PySpark 3. Java Spark 4. Scala Spark

E.g., Shopify found that 70% of their PySpark could be converted to plain SQL: https://shopify.engineering/build-production-grade-workflow-...



I've rolled out Scala-based Spark interfaces to non-programmers in Databricks notebooks, so it's definitely possible, but only if you stick to the basic language features.

Here's a more detailed PySpark vs Scala comparison in case folks are interested: https://mungingdata.com/apache-spark/python-pyspark-scala-wh...

I think Scala Spark (using 10% of the language features) is the better technical decision, because it provides huge benefits like fat JARs, shading, and better text editor support, but it's the worse overall choice for most organizations because people are generally terrified of Scala.

They'd rather do nothing than write Scala code. I can empathize with their position.


Even when Scala is used more or less like python?


> scala is real big impediment to making data processing accessible to general public in your company

Ding ding ding! Presto/Athena is becoming huge in the BI ecosystem. We don't really use Spark for ad-hoc BI anymore; we use it for data science and large, repetitive workloads.



