Data Engineer (Apache / Lakehouse) – AI Project

Full-Time, Remote

Are you a seasoned Data Engineer ready to build innovative data pipelines that power AI-driven insights?

Join Camplight, where your expertise will help create a cutting-edge data fusion platform that transforms how organizations leverage public and private data sources.

What will you be working on?

We are partnering on an exciting project to develop an AI-powered data fusion and enrichment platform that seamlessly integrates public and private data sources with WebINT capabilities.

This platform will enable organizations to collect, process, and analyze massive datasets from diverse sources, creating enriched data products that drive intelligent decision-making. You’ll be working with modern data lakehouse architecture to build scalable ETL pipelines that transform raw data into valuable insights.

Your Role

Your role will involve designing and implementing robust data pipelines, optimizing data processing workflows, and ensuring data quality across the platform. You’ll architect scalable solutions using cutting-edge technologies like Apache Spark, Airflow, and NiFi while collaborating with data scientists and ML engineers to support AI model development. Your expertise in data lakehouse architecture and ETL processes will be crucial for the platform’s success.

You’ll work with our modern data stack including:

  • Unity Catalog for metadata management and data governance
  • MinIO as our object storage solution
  • Delta Lake as our open table format
  • Parquet as our primary file format
  • Trino as our distributed SQL query engine
  • Apache Superset for data exploration and visualization

About Camplight

We build self-organizing technical teams, offer software development services, and work with businesses and entrepreneurs to create new products.

With over 300 successful software projects, some ongoing for over 8 years, we strive for long-term success for our partners.

By following the principles of self-management and organizing as a cooperative, we achieve 95% satisfaction among our partners.

We seek the best talents to join us and value transparency, collaboration, trust, responsibility, and innovation.

When joining Camplight, you can become a co-owner of the cooperative, allowing you to steer the business and share in the rewards of our collective success.

What are we looking for?

  • Ownership mindset: We want individuals who care about their work, take pride in being professionals, set high standards, and deliver on them. This is the main way Camplight stands out from its competitors.
  • Technical expertise: We expect you to know your tools and be able to write high-quality software efficiently.
  • Communication skills: You’ll be frequently communicating with our partners and other team members. That’s why it’s important to have clear written and verbal communication, outstanding English, emotional intelligence, and a desire to understand the person in front of you. Knowledge of how to distill requirements and manage stakeholder expectations will be a big plus.

Requirements

  • Strong experience with data lakehouse architecture and Apache Spark
  • Proficiency with Apache Airflow and NiFi for workflow orchestration and data ingestion
  • Experience building and optimizing ETL processes and Spark pipelines
  • Familiarity with query engines like Presto/Trino
  • Cloud-based data platform experience (AWS, Azure, or GCP)
  • Knowledge of streaming technologies (Kafka, Spark Streaming) is beneficial
  • Experience with WebINT and open-source intelligence gathering is advantageous
  • Python or Scala programming skills preferred

What do we offer?

We focus on health, wealth, and empowering relationships:

  • Fully remote work with flexible work hours
  • Competitive salary
  • Opportunity to become a co-owner of the cooperative
  • Individual career development plan
  • Friendly team and company culture
  • A workplace that prioritizes mental and physical health, where you have the freedom to make decisions for yourself and are supported by peers committed to a healthy lifestyle.
  • Empowering relationships with engineering colleagues who value a growth mindset, in a unique environment that blends service and product craftsmanship.

What does the interview process look like?

  1. Initial Interview: We’ll start with a friendly 45-minute cultural and technical interview. Two members of our team will assess your cultural fit, past experience, and engineering expertise, ask about the major challenges you’ve tackled, and discuss your ideal workspace.
  2. You can choose between two Technical Deep Dive options:
    1. Homework Assignment: If there’s a match, we’ll provide a brief homework assignment designed to take around 2 hours to complete. This will be followed by a 1-hour technical interview to discuss the homework and conduct a technical deep dive.
    2. Pair Programming: If you prefer not to do a homework assignment, we’ll have a 2-hour technical deep dive session, primarily focused on pair programming.

Regardless of the outcome, we will provide you with constructive feedback to help you grow.