Data Engineer (Life Sciences)

Health Catalyst

United States of America / Remote
  • Job Type: Full-Time
  • Function: Data Science
  • Post Date: 06/02/2021
  • Company Address: 3165 East Millrock Drive Suite 400, Salt Lake City, UT, 84121

About Health Catalyst

Comprehensive performance management and predictive analytics platform for health systems

Job Description

Our mission is to be the catalyst for massive, measurable, data-informed healthcare improvement through:

    Data: integrate data in a flexible, open & scalable platform to power healthcare’s digital transformation
    Analytics: deliver analytic applications & services that generate insight on how to measurably improve
    Expertise: provide clinical, financial & operational experts who enable & accelerate improvement
    Engagement: attract, develop and retain world-class team members by being a best place to work

Role: Data Engineer

Team: Life Sciences

Location: US Remote

Travel: <10-15%, US

The Data Engineer for the Life Sciences Business will be responsible for going into client environments to augment existing professional services resources in building out client data, including the acquisition of source marts and harmonized data marts (DOS Marts). The Data Engineer will help the Life Sciences business achieve the goal of having a homogeneous set of data elements across clients that will enable cross-client data analytics. Additionally, the Data Engineer will assist clients in extending the standard set of data available across each client, for instance by integrating clinical text data and data from patient registries.

This role is a great fit for someone who has significant data management and acquisition experience in the healthcare space. The work that the Data Engineer performs will have an immediate impact on healthcare providers and will also contribute to the mission of accelerating clinical innovation and precision medicine through novel Life Sciences partnerships.

Duties & Responsibilities

The role calls for a Big Data Engineer with extensive experience building end-to-end data platforms, data pipelines, and data flows. This experience should include data ingestion/integration, data storage, data transformation, data processing, data deployment, data operations, and data cataloging.

The Engineer should be able to design solutions and work closely with other big data architects, big data developers, data scientists, and DevOps and DataOps engineers to develop a platform that can execute operational, analytic, and predictive workloads, serve thousands of applications, and support machine learning deployment and inference.

Required Skills

    Extensive experience as a data engineer and database developer, and in building data-driven applications
    Good understanding of distributed systems and distributed databases
    Experience with ETL/ELT development, batch processing, and stream processing
    Familiarity with frameworks like Spark and Kafka and tools around them
    Extensive experience with cloud data warehouses - Redshift / BigQuery / Snowflake / Azure Synapse
    Experience with the Azure ecosystem - Data Lakes - ADLS Gen2, Azure SQL Data Warehouse
    Understanding of Big Data Ingestion/Integration/Storage/Processing, transformation/ETL tools and data formats for storage
    Ability to debug, troubleshoot and optimize data solutions in the Big Data Ecosystem with tools like Spark, Presto, Hive, Kafka and NoSQL & relational Databases and Data Warehouses
    Experience working with SQL Engines on large data - Presto, Impala, Dremio, SparkSQL, Hive, Drill, Druid and others
    Knowledge of and experience working with DevOps and DataOps teams, collaborating with them to develop processes and automate deployment
    Programming experience with one or more - Python, Java, Scala
    Expertise in intermediate and advanced SQL query development
    Ability to understand and work with complex datasets and build solutions around them with data modeling
    Ability to work with other team members (business analysts, data analysts, and data stewards) to understand the requirements and build solutions
    Ability, passion, and aptitude to learn new programming and querying languages and apply them to build data solutions
    Good understanding of tools in the DevOps ecosystem, with a basic understanding of Docker and CI/CD processes
    Good level of expertise working with Git
    Experience working with DevOps teams
    Experience with data orchestration tools - dbt, Prefect, Airflow

Preferred Skills

    Experience with data warehousing, data modeling, data marts, data virtualization, and MPP-based engines like Azure Synapse, Redshift, Vertica, BigQuery, Snowflake, etc.
    Experience with relational databases such as SQL Server, Postgres, MySQL, MariaDB, Oracle, etc.
    Experience working with at least one NoSQL database, and the ability to develop a data model for at least one of the main types of NoSQL databases, such as:
        Key-value data stores - Redis, DynamoDB, Riak
        Document databases - MongoDB, CouchDB, Couchbase
        Graph databases - Neo4j
        Wide-column databases - Cassandra, HBase, Scylla
        Time series databases - InfluxDB, TimescaleDB
        Search engines and databases - Elasticsearch, Solr
        In-memory databases or in-memory grids - Apache Ignite, GridGain, etc.

Education & Relevant Experience

    BS in Computer Science, Health Informatics, or related field
    2+ years of experience working in a SQL-based data engineering role
    2+ years of experience with clinical/healthcare data
    2+ years of experience of direct exposure to data from a leading EMR (Epic, Cerner, Meditech, etc.)
    Experience with programming/scripting languages such as Python is a plus

The above statements describe the general nature and level of work being performed in this job function.  They are not intended to be an exhaustive list of all duties, and indeed additional responsibilities may be assigned by Health Catalyst.

At Health Catalyst, we appreciate the opportunity to benefit from the diverse backgrounds and experiences of others. Because of our deep commitment to respect every individual, Health Catalyst is an equal opportunity employer.

Disclaimer: Local Candidates Only
This company does NOT accept candidates from outside recruiting firms. Agency contacts are not welcome.