Lead Data Architect

Health Catalyst

United States of America / Remote
  • Job Type: Full-Time
  • Function: Data Science
  • Post Date: 06/09/2021
  • Website: healthcatalyst.com
  • Company Address: 3165 East Millrock Drive Suite 400, Salt Lake City, UT, 84121

About Health Catalyst

Comprehensive performance management and predictive analytics platform for health systems

Job Description

Our mission is to be the catalyst for massive, measurable, data-informed healthcare improvement through:

    Data: integrate data in a flexible, open & scalable platform to power healthcare’s digital transformation
    Analytics: deliver analytic applications & services that generate insight on how to measurably improve
    Expertise: provide clinical, financial & operational experts who enable & accelerate improvement
    Engagement: attract, develop and retain world-class team members by being a best place to work

Team: Life Sciences

Travel: <5%, US

Job Profile Summary

The Data Architect Lead will have extensive experience with all aspects of building end-to-end data platforms and architecture, including data ingestion and integration, data storage, data transformation, data processing, data deployment, data operations and data cataloging. The Architect should be able to design, and help big data engineers develop, a platform capable of executing operational, analytic and predictive workloads that serve thousands of applications and support machine learning deployment and inferencing.

    Duties & Responsibilities

        Manage the Data Engineering team in building Data Products, and provide career development support and mentorship
        Develop and enforce engineering best practices according to the standards and procedures set forth by the Data Products team
        Ensure Data Engineering tasks are completed with high quality and in a timely manner according to the Data Products Roadmap
        Work closely with the Sr Director of Data Products on the design and strategy behind Data Products
        Work closely with the VP of Real World Evidence and Insights to ensure Data Products meet research requirements

    Important Soft & Technical Skills

        Extensive experience as a data architect or data engineer, with database internals, and with building data-intensive architectures and applications
        Deep understanding of distributed systems and distributed databases
        Experience with Azure Cloud and Azure ecosystem - ADLS Gen2, Azure SQL Ecosystem
        Extensive experience with ETL, batch processing and stream processing
        Extensive experience with data warehouse concepts and products such as Azure Synapse, Redshift, Snowflake and BigQuery
        Deep expertise with frameworks like Spark and Kafka and ecosystems around them
        Deep understanding of Big Data Ingestion/Integration/Storage/Processing, transformation/ETL tools and technologies and understanding of related concepts (such as data cataloging and curation, etc.)
        Track record of implementing Big Data solutions with large enterprises and start-ups
        Deep knowledge of foundational infrastructure requirements such as networking, storage and hardware optimization, with hands-on experience with Amazon Web Services (AWS)
        Design, implementation and tuning experience in the Big Data ecosystem (such as Databricks, Hadoop, Spark, Presto, Hive), databases (such as Oracle, MySQL, PostgreSQL, MS SQL Server) and data warehouses (such as Redshift, Teradata, Vertica)
        SQL Engines on large data - Presto, Impala, Dremio, SparkSQL
        Good understanding of Data Governance, encompassing data catalogs, data auditing, lineage, metadata and master data management
        Experience with DataOps processes and tools
        Programming experience with one or more - Python, Java, Scala
        Experience with data orchestration systems such as dbt, Airflow and Prefect
        Deep experience ensuring non-functional requirements (NFRs) on the platform, such as scalability, performance, availability, reliability and fault tolerance

    Nice to Have Soft & Technical Skills
        Deep, wide-scale experience with relational databases such as MySQL, Postgres and SQL Server
        NoSQL database architectures and data modeling with one or more of the main types of NoSQL databases, such as:
            Key-value data stores - Redis, DynamoDB, Riak
            Document databases - MongoDB, CouchDB, Couchbase
            Graph databases - Neo4j
            Wide-column databases - Cassandra, HBase, Scylla
            Time series databases - InfluxDB, TimescaleDB
            Search engines and databases - Elasticsearch, Solr
            In-memory databases or in-memory grids - Apache Ignite, GridGain, etc.

    Relevant Experience & Education (Optional)
        BS in Computer Science, Health Informatics, or related field
        6+ years of experience working in a data engineering role
        3+ years of experience with clinical/healthcare data
        1+ years of experience managing a data engineering team

The above statements describe the general nature and level of work being performed in this job function. They are not intended to be an exhaustive list of all duties, and additional responsibilities may be assigned by Health Catalyst.

At Health Catalyst, we appreciate the opportunity to benefit from the diverse backgrounds and experiences of others. Because of our deep commitment to respect every individual, Health Catalyst is an equal opportunity employer.

Disclaimer: Local Candidates Only
This company does NOT accept candidates from outside recruiting firms. Agency contacts are not welcome.