Experience working in large Python, Java, Kotlin, or Go codebases and running cloud-native Spark systems (e.g. AWS EMR, Databricks, GCP Dataproc)
Experience in performance tuning of Spark, Ray, Maestro, or Airflow jobs
Knowledge of data formats such as Parquet, Avro, Arrow, Iceberg, or Delta Lake, and object storage (e.g. S3, GCS)
Expertise in cloud-scale query performance, including query optimization, query planning, heuristic execution techniques, and cost-based optimization
Experience with internals of distributed systems, SQL/NoSQL databases, data lakes, or data warehouses
Strong communication skills and the ability to write detailed technical specifications
Excitement about coaching and mentoring junior engineers
BS, MS, or PhD in Computer Science or a related field
8+ years of experience building production software systems