Work with the brightest minds at one of the largest financial institutions in the world. This is a long-term contract opportunity that includes a competitive benefits package!
Our client has been around for over 150 years and is continuously innovating in today's digital age. If you want to work for a company that is not only a household name but also one that truly cares about satisfying customers' financial needs and helping people succeed financially, apply today.
Position: Big Data Engineer
Location: Philadelphia, Pennsylvania 19130
Term: 12 months
Day-to-Day Responsibilities:
- Lead complex initiatives on selected domains.
- Ensure systems are monitored to increase operational efficiency and managed to mitigate risk.
- Define opportunities to maximize resource utilization and improve processes while reducing cost.
- Lead the design, development, testing, and implementation of applications and system components, tools and utilities, models, simulations, and analytics to manage complex business functions using sophisticated technologies.
- Resolve coding, testing and escalated platform issues of a technically challenging nature.
- Lead the team to ensure compliance and risk management requirements for the supported area are met, and work with other stakeholders to implement key risk initiatives.
- Mentor less experienced software engineers.
- Collaborate with and influence professionals at all levels, including managers.
- Lead the team to achieve objectives, and partner effectively with production support and platform engineering teams.
Is this a good fit? (Requirements):
- 7+ years of Big Data platform (data lake) and data warehouse engineering experience demonstrated through prior work experience. Preferably with the Hadoop stack: HDFS, Hive, SQL, Spark, Spark Streaming, Spark SQL, HBase, Kafka, Sqoop, Atlas, Flink, Cloudera Manager, Airflow, Impala, Tez, Hue, and a variety of source data connectors.
- 3+ years of hands-on experience building modern, resilient, and secure data pipelines, including movement, collection, integration, and transformation of structured/unstructured data, with built-in automated data controls, built-in logging/monitoring/alerting, and pipeline orchestration managed to operational SLAs. Preferably using Airflow, DAGs, and connector plugins.
- 1+ years of experience with Google Cloud data services such as Cloud Storage, Dataproc, Dataflow, and BigQuery.
- 5+ years of strong Python and other functional programming skills.
- 5+ years of Specialty Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
Desired Qualifications:
- Hands-on experience developing and managing technical and business metadata
- Experience creating/managing Time-Series data from full data snapshots or incremental data changes
- Hands-on experience implementing fine-grained access controls, such as attribute-based access control (ABAC), using Apache Ranger
- Experience automating data quality (DQ) validation in data pipelines
- Experience implementing automated data change management, including code and schema versioning, QA, CI/CD, and rollback processing
- Experience automating the end-to-end data lifecycle on the big data ecosystem
- Experience with managing automated schema evolution within data pipelines
- Experience implementing masking and/or other forms of data obfuscation
- Experience designing and building microservices, APIs, and MySQL databases
- Advanced understanding of SQL and NoSQL DB schemas
- Advanced understanding of partitioned Parquet, ORC, Avro, and various compression formats
- Experience developing containerized microservices and APIs
- Familiarity with key concepts implemented by Apache Hudi, Apache Iceberg, or Databricks Delta Lake (bonus)