Data Engineer

Posted 20 days ago

100% Remote opportunity!

Data is integral to nearly everything that we do, and as a result, the Data Engineering team plays a key role at our company. You should be excited to work with data, have the curiosity to dive deeply into issues, and feel empowered to make a meaningful impact at a mission-driven company. The Data Engineering team is responsible for capturing, storing and organizing key business data assets, supporting many departments, including the data scientists who build algorithms for patient identification, engagement and treatment, and enabling useful and actionable reporting for the business.

The Senior Data Engineer should have a passion for developing scalable data-based solutions. 

The Primary Responsibilities:

  • Design, build, implement and maintain large-scale batch (ELT/ETL) data pipelines and near real-time streaming processes on Google Cloud Platform (e.g., BigQuery, GCS, Composer) leveraging tools such as Informatica (IICS / CDI / CAI) and Python
  • Design, build, implement and maintain data-based APIs and services leveraging tools such as Python, Informatica (IICS / CAI) and Apigee
  • Collaborate with Data Engineers, Internal teams and Clients to ensure the high availability, integrity and confidentiality of healthcare data
  • Implement processes to monitor data quality and operations, ensuring production data is accurate and available for the business processes that depend on it
  • Write unit and integration tests, contribute to the engineering wiki, and document work
  • Support requirements gathering with the Data Product Owner and Data Architect, including breaking down development work into manageable work items
  • Support ongoing operations of the data engineering processes, and perform the data analysis required to troubleshoot and help resolve data-related issues
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Participate in daily team standups and other Agile ceremonies

Key Initiatives over the next 12 months

  • Modernization of the data ingestion processes to make them more efficient and to adhere to industry standards
  • Build out of several new Data Lake sources
  • Build out of a new Data Warehouse supporting critical functions

Position Requirements:

Experience:

  • 8+ years of experience in Data Engineering / Data Warehousing / ETL Development
  • 5+ years of SQL experience
  • 5+ years of Python experience
  • 2+ years of Informatica Cloud (IICS) experience
  • 5+ years of working knowledge of data technologies, such as MongoDB, BigQuery, MySQL, Firestore and Postgres
  • 5+ years of experience with data pipeline and workflow management tools: Airflow, etc.
  • 3+ years of experience with RESTful services
  • Self-starter with experience independently breaking down large projects into actionable steps
  • BA in Information Systems, Computer Science, or other related discipline preferred
  • Healthcare domain experience is a plus
  • Experience with working on Agile teams and leveraging Jira is a plus
  • Experience with event-based architectures (e.g., Kafka, pub/sub) is a plus
  • Experience with PySpark is a plus
  • Experience working on Google Cloud Platform is a plus
  • Experience working with Batch and Real-time data integration is a plus

Competencies:

  • Proven problem solver and team player
  • Demonstrated emotional intelligence
  • Excellent written and oral communicator

Who You Are:

  • You are a collaborator. You build and maintain strong, productive working relationships with internal stakeholders and external customers.
  • You are an engineer. You have a strong sense of architectural patterns (and anti-patterns) and an intuition of what it takes to get complex projects shipped.
  • You hustle. You don't need to wait for people to tell you what to work on.
  • You are empathetic. You seek to understand each individual’s diversity of background and thought.

Quality Control:

  • Collaborate on the definition and design of detailed acceptance test scenarios.
  • Support the delivery of high-quality test-driven solutions that pass automated tests.