Lead Data Engineer

Full Time
Employment Information

The Role

Data is essential for all our clients' decision-making needs, whether that means measuring franchisee advertising effectiveness, helping key stakeholders understand business KPIs, or building new products in emerging markets. This data is deeply valuable and gives us insight into how we can keep improving our service for our team, advertisers, and partners. Comma8 is seeking a highly motivated Data Engineer with a strong technical background and a passion for diving deeper into Big Data to develop state-of-the-art data solutions.

“About Us” sections always feel like a forced dating bio. So we’ll cut right to it: we solve complex problems for some of the biggest brands in fit-tech. We’ve collected the expertise needed to help them find and amaze their customers with top-notch digital fitness experiences.

And we work with some unbiasedly awesome people, who believe in deep diving on new challenges, personal growth, and having a life. Since we know you’ll deliver, we’re a Results-Only Workplace Environment, where we offer as many vacation, sick and mental-health days as you need to get the job done.

With projects ranging from high-impact, cutting-edge solutions to complete strategic overhauls, we can promise you one thing: your work will have a meaningful impact on an active and thriving customer base.

Benefits

Though we have an office in Irvine, CA and co-working spaces in Los Angeles, we've become a remote-first company through the pandemic. When it's safe to do so, we'd love to meet in person and host retreats. Candidates in Southern California and the Pacific time zone are therefore preferred, but we're happy to consider any candidates in the US.

We're a Results-Only Workplace Environment (ROWE), which allows you as many vacation, sick, and mental-health days as you need, as long as you are meeting your performance goals and completing your assigned tasks and projects.

Other benefits include: Health, Dental, Vision and Life Insurance, 401K Plan, Various Fitness and Wellness Benefits (Group Classes, Free Training, Products, etc.), Continued Education Courses, and Conference Attendance.

The responsibilities and job scope discussed above may change as necessitated by business demands.

Responsibilities

  • Lead the design and growth of our products and data warehouses around our clients' analytics
  • Design and develop scalable data warehousing solutions, building ETL pipelines in Big Data environments (cloud, on-prem, hybrid)
  • Manage the transformation of large daily batch data volumes in the cloud using Apache Spark, EMR, and Glue, ensuring streamlined processing and cost savings
  • Construct and maintain high-throughput streaming data pipelines using technologies like Kinesis, Spark Streaming, and Elasticsearch, while minimizing response lag
  • Automate and orchestrate complex data workflows using Python, Apache Airflow, and Step Functions to eliminate bottlenecks in data pipelines
  • Mentor and guide a team, providing technical expertise in SQL query execution, data manipulation, data visualization, and performance optimization
  • Develop, test, and deploy scalable reverse ETL solutions using API Gateway, Python (Flask), and Lambda, achieving near-zero latency and high scalability
  • Help architect data solutions/frameworks and define data models for the underlying data warehouse and data lakes
  • Collaborate with key stakeholders to map, implement, and deliver successful data solutions
  • Maintain detailed documentation of your work and changes to support data quality and data governance
  • Ensure high operational efficiency and quality of your solutions to meet SLAs and support our commitments to our clients
  • Be an active participant in and advocate for agile/scrum practices to keep your team healthy and its processes improving

Qualifications

  • 3-5 years of data engineering experience developing large data pipelines
  • Strong SQL skills and ability to create queries to extract data and build performant datasets
  • Hands-on experience with data integration tools (e.g. Apache Spark, Apache Kafka)
  • Hands-on experience with cloud-based data services (e.g., AWS Glue, Azure Data Factory, Google Cloud Dataflow)
  • Experience with version control systems (e.g., Git) and collaborative development practices
  • Strong programming skills in Python
  • Experience with at least one major MPP or cloud database technology (Snowflake, Redshift, BigQuery)
  • Solid experience with data integration toolsets (e.g., Airflow) and with writing and maintaining data pipelines
  • Strong in Data Modeling techniques and Data Warehousing standard methodologies and practices
  • Familiar with Scrum and Agile methodologies
  • You are a problem solver with strong attention to detail and excellent analytical and communication skills
  • Nice to have: experience with cloud technologies such as AWS (S3, EMR, EC2)

 

Skills
Python, AWS, Apache Spark
Startup Jobs - Your source for tech startup jobs
© Copyright 2023, Startup Jobs, a service of TechStartups.com