Summary:
We’re looking for an experienced Data Engineer with deep expertise in the AWS big data ecosystem. The ideal candidate will have a strong background in designing and building data pipelines, managing streaming data platforms, and working with various data formats. Experience with Spark (Scala/Python), real-time analytics, and cloud-native solutions is essential.
Key Skills:
- Technologies: AWS (EMR, Lambda, Glue, Step Functions), Scala, Hadoop, Spark, Hive, Shell Scripting, Sqoop, Flume/Kafka
- Databases: Amazon Redshift, Oracle, SQL Server
- Data Handling: Expertise in ingesting both structured and semi-structured data
- Streaming & Real-Time Analytics: Hands-on experience with streaming data platforms and real-time data processing
- Data Modeling: Proficient in dimensional/star schema modeling
- Big Data Stack: Strong experience across Hadoop ecosystem tools and frameworks
- Pipeline Development: Proven track record of building robust, scalable data pipelines in AWS cloud environments
- Scripting: Strong Spark (Scala/Python) and shell scripting skills, with excellent debugging ability