Job Description:
- Work with structured and unstructured real-world medical data
- Design, build and launch efficient and reliable data pipelines to move complex data
- Write high-quality, efficient, testable code in Python, C++, or Go
- Build data expertise and own data quality
- Collaborate with software and AI engineers to design and implement data architecture
- Integrate with third-party analytics tools and APIs (Mixpanel, Google Analytics)
- Build horizontally scalable infrastructure to support ML training and data mining research
Qualifications and Skills:
- B.E. in Computer Science or a related discipline, or related practical experience
- Minimum 4 years of experience in data engineering and data architecture
- Experience using advanced SQL and databases in a business environment with large-scale datasets (Hadoop, Hive, Presto)
- Experience with statistical modeling and analyzing large data sets
- Experience with product analytics tools and APIs (Mixpanel, Google Analytics)
- AWS expertise (S3, EC2, Lambda, Redshift, Athena) is a plus
- Experience developing scalable microservices is also a plus
- Familiarity with Kimball's data warehouse lifecycle