Data Engineer


Pune, IN / Maharashtra, IN
  • Job Type: Full-Time
  • Function: Data Science
  • Post Date: 03/30/2021
  • Company Address: 1314 7th St. Floor 5, Santa Monica, CA, 90401

About GumGum

GumGum is an artificial intelligence company with deep expertise in computer vision and natural language processing. Its mission is to solve hard problems across media by teaching machines to see and understand the world. Since 2008, the company has applied its patented capabilities to serving media-related industries, including advertising and professional sports.

Job Description

At GumGum, our ad servers produce over 50 TB of new raw data every day, which amounts to roughly 100 billion events processed daily. Dealing with data at this scale is challenging in a number of ways. We work with a number of off-the-shelf frameworks, including Spark, Kafka, Cassandra, DynamoDB and Redshift, but often push them past their limits. This team is responsible for providing critical ad reporting data for GumGum’s internal and external customers.

As a Data Engineer, you will build and maintain exciting systems, services and data tools. You’ll bring your experience with complex distributed systems, a passion for performance and optimization, and the ability to write highly scalable and fault-tolerant code.

Responsibilities

  • Refine our data infrastructure technologies such as Kafka, Spark, Snowflake and Druid to support real-time analysis of data.
  • Own the core data pipelines and scale our data processing flow.
  • Build scalable systems with various AWS and big data technologies, participate in code reviews and promote engineering best practices. You must be able to write quality code and build secure, highly available systems.
  • Work on GumGum’s proprietary Reporting Server.
  • Work on various reports using Groovy, SQL and Java.
  • Work on GumGum’s proprietary forecasting system.

Requirements

  • At least 2 years of Apache Spark experience 
  • At least a Bachelor's degree in Computer Science or equivalent
  • 3+ years of Software Engineering experience (Java/Scala/Python)
  • Experience with large-scale distributed real-time systems with tools such as AWS, Spark, Kafka, Hadoop
  • Familiarity with various AWS services, serverless architecture and containers
  • Experience with high-volume, high-availability production systems
  • Strong problem-solving skills and strong verbal and written communication skills

Disclaimer: Local Candidates Only
This company does NOT accept candidates from outside recruiting firms. Agency contacts are not welcome.