Senior Software Engineer - Distributed Data Systems


San Francisco, CA, US
  • Job Type: Full-Time
  • Function: Engineering Software
  • Post Date: 06/16/2021
  • Company Address: San Francisco, CA

About Databricks

Databricks is focused on making Big Data easier than ever before by delivering a turnkey cloud platform built around Spark, the lightning-fast open source cluster computing framework.

Job Description

Our engineering teams build highly technical products that fulfill real, important needs in the world. We constantly push the boundaries of data and AI technology while operating with the resilience, security, and scale that are critical to making customers successful on our platform.

We develop and operate one of the largest-scale software platforms. The fleet consists of millions of virtual machines, generating terabytes of logs and processing exabytes of data per day. At our scale, we regularly observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from all of them.

Modern data analysis employs sophisticated methods such as machine learning that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will build the next-generation distributed data storage and processing systems that can outperform specialized SQL query engines in relational query performance, yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below are some example projects:

Apache Spark: Develop the de facto open source standard framework for big data.

Data Plane Storage: Deliver reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g., AWS S3 and Azure Blob Storage.

Delta Lake: A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, greatly reduce the complexity of real-world data engineering architectures.

Delta Pipelines: It’s difficult to manage even a single data engineering pipeline. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines, enables customers to deploy, test, and upgrade them, and removes the operational burden of building and managing high-quality data pipelines.

Performance Engineering: Build the next-generation query optimizer and execution engine to be fast, tuning-free, scalable, and robust.

Qualifications


  • BS in Computer Science or a related technical field, or equivalent practical experience.
  • Optional: MS or PhD in databases or distributed systems.
  • Comfortable working towards a multi-year vision with incremental deliverables.
  • Driven by delivering customer value and impact.
  • 5+ years of production-level experience in Java, Scala, or C++.
  • Strong foundation in algorithms and data structures and their real-world use cases.
  • Experience with distributed systems, databases, and big data systems (Spark, Hadoop).

Benefits


  • Comprehensive health coverage including medical, dental, and vision
  • 401(k) Plan
  • Equity awards
  • Flexible time off
  • Paid parental leave
  • Family Planning
  • Gym reimbursement
  • Annual personal development fund
  • Employee Assistance Program (EAP) 

Disclaimer: Local Candidates Only
This company does NOT accept candidates from outside recruiting firms. Agency contacts are not welcome.