Software Engineer - Replication Manager

Cloudera

Santa Clara, CA, US
  • Job Type: Full-Time
  • Function: Engineering Software
  • Post Date: 01/20/2021
  • Website: cloudera.com
  • Company Address: 395 Page Mill Rd, Palo Alto, CA, 94304

About Cloudera

Cloudera advances digital transformation with an enterprise data cloud from the Edge to AI.

Job Description

The Replication Manager team is looking for passionate developers to join our growing engineering team. The team is responsible for building out data, metadata, and permissions replication support for the Hadoop ecosystem. Its goal is to give customers a seamless experience when moving data, and all entities associated with it, for migration as well as disaster recovery use cases.

Replication Manager enables you to replicate data across data centers or to and from the cloud for disaster recovery and migration scenarios. Replications can include data stored in HDFS, data stored in Hive tables, Hive metastore data, and Impala metadata (catalog server metadata) associated with Impala tables registered in the Hive metastore. Replication Manager not only replicates data and metadata but also translates security and governance policies as part of the move. Datasets can range from terabytes to petabytes, with additional challenges such as millions of directories and individual files that are gigabytes in size.

Key Responsibilities:

  • Build and maintain large-scale replication systems on top of Hadoop

  • Work with a team of engineers to design cloud-based replication architectures with low RPO and RTO

  • Support replication across multiple Hadoop components such as HDFS, Hive, HBase, and Kudu

  • Mentor junior engineers

  • Work with product management to formulate a product roadmap

Requirements:

  • 8+ years of experience building complex systems that handle “big data”

  • Strong proficiency in one JVM language such as Java or Scala

  • Familiarity with cloud-based systems

  • Strong understanding of systems, databases, networks, and the web

  • Systems experience

Preferred:

  • Experience with scalable systems (petabytes and beyond)

  • Prior replication experience

  • Experience with AWS, Azure, or GCP

  • Current expertise with Java/Scala developer ecosystems

Why Cloudera?

  • Amazing people - We are a fun and smart team, including many of the top luminaries in Hadoop and related open source communities. We frequently interact with the research community, collaborate with engineers at other top companies, and host cutting-edge researchers for tech talks.

  • Innovative work - Cloudera pushes the frontier of big data and distributed computing, as our track record shows. We test and deploy our code on clusters with hundreds of nodes, terabytes of RAM, and petabytes of storage. We work on high-profile open source projects, interact daily with engineers at other exciting companies, and speak at meet-ups.

Disclaimer: Local Candidates Only
This company does NOT accept candidates from outside recruiting firms. Agency contacts are not welcome.