
Big Data Engineer

  • Location: Philadelphia, 19130
  • Salary: 92.89
  • Job Type: Contract

Posted 5 days ago

Work with the brightest minds at one of the largest financial institutions in the world. This is a long-term contract opportunity that includes a competitive benefits package!

Our client has been around for over 150 years and is continuously innovating in today's digital age. If you want to work for a company that is not only a household name but one that truly cares about satisfying customers' financial needs and helping people succeed financially, apply today.

Position: Big Data Engineer
Location: Philadelphia, Pennsylvania, 19130
Term: 12 months

Day-to-Day Responsibilities:

  • Lead the design and development of sophisticated, resilient, and secure engineering solutions to modernize our data ecosystem. This work typically involves multiple disciplines, including big data architecture, data pipelines, data management, and data modeling specific to consumer use cases.
  • Provide technical expertise for the design, implementation, maintenance, and control of data management services – especially end-to-end, scale-out data pipelines.
  • Develop self-service, multitenant capabilities on the cyber security data lake, including custom and off-the-shelf services integrated with the Hadoop platform and Google Cloud; use APIs and messaging to communicate across services; integrate with distributed data processing frameworks and data access engines built on the cluster; integrate with enterprise services for security, data governance, and automated data controls; and implement policies to enforce fine-grained data access.
  • Build, certify, and deploy highly automated services and features for data management (registering, classifying, collecting, loading, formatting, cleansing, structuring, transforming, reformatting, distributing, and archiving/purging) through the Data Ingestion, Processing, and Consumption stages of the analytical data lifecycle (a minimal example of such a pipeline step is sketched after this list).
  • Provide the highest level of technical leadership in the design, engineering, deployment, and maintenance of solutions through collaborative efforts with the team and third-party vendors.
  • Design, code, test, debug, and document programs using Agile development practices.
  • Review and analyze complex data management technologies that require in-depth evaluation of multiple factors, including intangible or unprecedented ones.
  • Assist in production deployments, including troubleshooting and problem resolution.
  • Collaborate with enterprise, data platform, data delivery, and other product teams to provide strategic solutions, influencing long-range internal and enterprise-level data architecture and change management strategies.
  • Provide technical leadership and recommendations on the future direction of data management technology and custom engineering designs.
  • Collaborate and consult with peers, colleagues, and managers to resolve issues and achieve goals.
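
For illustration only, here is a minimal sketch of the kind of pipeline step described above: a small PySpark job that ingests raw data, applies basic cleansing, and writes a curated, partitioned output. The paths, column names, and job name are hypothetical placeholders, not part of this role's actual codebase.

# Illustrative PySpark batch step: ingest -> cleanse -> publish.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("consumer-data-curation")  # hypothetical job name
    .getOrCreate()
)

# Ingestion: read raw landing-zone data (hypothetical HDFS path)
raw = spark.read.parquet("hdfs:///data/landing/consumer_events/")

# Processing: cleanse and standardize before publishing to the curated zone
curated = (
    raw.dropDuplicates(["event_id"])                   # drop replayed events
    .filter(F.col("event_ts").isNotNull())             # drop malformed records
    .withColumn("event_date", F.to_date("event_ts"))   # derive a partition column
)

# Consumption: write partitioned output for downstream access engines (e.g. Hive/Impala)
(
    curated.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("hdfs:///data/curated/consumer_events/")
)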


Required Qualifications:

  • 7+ years of Big Data Platform (data lake) and data warehouse engineering experience demonstrated through prior work experience, preferably with the Hadoop stack: HDFS, Hive, SQL, Spark, Spark Streaming, Spark SQL, HBase, Kafka, Sqoop, Atlas, Flink, Cloudera Manager, Airflow, Impala, Tez, Hue, and a variety of source data connectors.
  • 3+ years of hands-on experience building modern, resilient, and secure data pipelines, including the movement, collection, integration, and transformation of structured/unstructured data with built-in automated data controls, built-in logging/monitoring/alerting, and pipeline orchestration managed to operational SLAs. Preferably using Airflow, DAGs, and connector plugins (a minimal DAG sketch follows this list).
  • 1+ years of experience with Google Cloud data services such as Cloud Storage, Cloud Dataproc, Cloud Dataflow, and BigQuery.
  • 5+ years of strong Python and other functional programming skills.
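
For illustration only, the following is a minimal Airflow DAG sketch of the orchestration style referenced above (ingest, transform, validate, with retries and an SLA). The DAG id, schedule, and task callables are hypothetical placeholders, not a prescribed design.

# Minimal, illustrative Airflow 2.x DAG: ingest -> transform -> validate.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():        # placeholder for a real ingestion task
    pass


def transform():     # placeholder for a real transformation task
    pass


def validate():      # placeholder for an automated data-quality check
    pass


default_args = {
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
    "sla": timedelta(hours=4),          # operational SLA on task completion
}

with DAG(
    dag_id="consumer_events_daily",     # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    t1 = PythonOperator(task_id="ingest", python_callable=ingest)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="validate", python_callable=validate)

    t1 >> t2 >> t3   # orchestrate the stages in order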


Desired Qualifications:

  • Hands-on experience developing and managing technical and business metadata
  • Experience creating/managing time-series data from full data snapshots or incremental data changes
  • Hands-on experience implementing fine-grained access controls, such as attribute-based access controls, using Apache Ranger
  • Experience automating data quality (DQ) validation in data pipelines
  • Experience implementing automated data change management, including code and schema versioning, QA, CI/CD, and rollback processing
  • Experience automating the end-to-end data lifecycle on the big data ecosystem
  • Experience with managing automated schema evolution within data pipelines
  • Experience implementing masking and/or other forms of data obfuscation (a brief masking sketch appears after this list)
  • Experience designing and building microservices, APIs, and MySQL
  • Advanced understanding of SQL and NoSQL DB schemas
  • Advanced understanding of partitioned Parquet, ORC, Avro, and various compression formats
  • Experience developing containerized microservices and APIs
  • Familiarity with key concepts implemented by Apache Hudi or Iceberg, or Databricks Delta Lake (bonus)
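
For illustration only, here is one simple way the masking/obfuscation mentioned above might look in a PySpark pipeline. The column names and the specific hashing/redaction choices are hypothetical, not a prescribed approach.

# Illustrative column masking in PySpark; data and choices are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("masking-example").getOrCreate()

df = spark.createDataFrame(
    [("123-45-6789", "alice@example.com", 42.0)],
    ["ssn", "email", "balance"],
)

masked = (
    df
    # Irreversibly hash direct identifiers before they reach analytic zones
    .withColumn("ssn", F.sha2(F.col("ssn"), 256))
    # Partially redact the email, keeping only the domain for analysis
    .withColumn("email", F.regexp_replace("email", r"^[^@]+", "***"))
)

masked.show(truncate=False)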