Compensation:
- Equity: 0.5% - 2% (depending on experience and contribution)
- Salary (post-funding): $80,000 - $120,000
- Deferred Salary Agreement: Payment will begin once pre-seed funding is finalized (by March), with retroactive compensation.
About InsightBox
InsightBox is revolutionizing marketing with the first AI Audience Intelligence platform that identifies the most valuable audiences based on specific business context. Sports and event marketers have all the data but struggle to answer the “so what.” InsightBox AI tells you which offers to send, to whom, when, and with what message.
Our platform empowers marketers with a level of precision and efficiency that wasn’t possible before. By leveraging generative AI and a proprietary library of over 100 AI modules, we translate complex business objectives into clear, data-driven marketing actions. Our technology streamlines audience targeting, reducing workflows that once took months to just weeks.
Trusted by industry leaders, including a top MLB team, InsightBox is already driving real-world impact in sports and entertainment, with fashion and travel coming soon. Our machine-learning architecture is designed to cut through marketing noise, helping brands connect with their audiences in a more relevant and effective way.
We are part of Alchemist, Silicon Valley’s premier B2B accelerator, and ranked in the top 25% of the program based on investor interest and potential. We are in the final stages of closing our pre-seed round, with strong traction and interest from top venture capital firms. This is an exciting time to join as we gear up for rapid growth and expansion.
What’s it like to work with us?
At InsightBox, we are a team of builders, problem-solvers, and industry experts who saw a gap in marketing intelligence and decided to fix it. We believe in fostering an environment where curiosity and collaboration thrive. If you join us, you’ll work at the forefront of AI-driven marketing, shaping how businesses connect with their audiences.
We value:
- Humility – Recognizing that learning never stops.
- Fast learning – Staying ahead in AI and marketing requires adaptability.
- Transparency – We support each other and tackle all challenges together as a team. Honesty is one of our core values.
- Proactivity – Thinking ahead, taking initiative, and pushing boundaries.
- Authenticity – Early-stage startups are fast-paced; we want you to be yourself and enjoy the journey with all of us. It has been a fun ride!
About the Role
As a Founding Data Engineer at InsightBox, you will play a pivotal role in building the foundation of our AI-driven audience intelligence platform. Your work will enable our machine learning engineers and statisticians to develop high-impact AI models by ensuring clean, structured, and accessible data.
This is a hands-on role where you’ll design and optimize data pipelines, develop robust ETL processes, and ensure data quality across our infrastructure. You’ll work with large-scale datasets, real-time data processing, and cloud-based data architectures to support AI-driven marketing intelligence in industries like sports, entertainment, fashion, and travel.
If you’re excited about solving complex data challenges, building scalable infrastructure, and working in a high-growth AI startup, we’d love to hear from you.
Responsibilities
- Design, build, and maintain scalable data pipelines and ETL/ELT processes using Google Cloud Platform (GCP) services.
- Develop robust data storage solutions with BigQuery, Cloud Storage, and Cloud SQL for AI model training and analytics.
- Ensure data integrity, validation, and governance using Data Catalog, Dataform, and Cloud DLP.
- Optimize data transformation workflows with dbt, Apache Beam, and Google Dataflow for real-time and batch processing.
- Implement data orchestration and workflow automation with Cloud Composer (Apache Airflow).
- Collaborate with ML engineers and statisticians to prepare structured datasets for AI model development.
- Develop and maintain monitoring and observability for data pipelines using Google Cloud Logging and Monitoring.
- Ensure compliance with data security and governance using IAM roles, VPC Service Controls, and encryption best practices.
- Optimize data processing and query performance to handle large-scale structured and unstructured datasets efficiently.
Requirements
- Proficiency in Python and SQL for data processing and pipeline automation
- Strong experience with Google Cloud Platform (GCP), including BigQuery, Cloud Storage, Cloud SQL, Dataflow (Apache Beam), Cloud Composer (Apache Airflow), Data Catalog, and Dataform
- Experience in ETL/ELT development, data transformation, and pipeline automation
- Understanding of data modeling, schema design, and query optimization for AI and analytics applications
- Knowledge of distributed data processing frameworks such as Apache Spark or Flink (on GCP)
- Experience in API integrations to extract data from Google Ads, Google Analytics, and CRM tools
- Understanding of data security, governance, and compliance in cloud environments
- Experience in AI/ML data preparation workflows and working with ML teams
- Familiarity with event-driven architectures using Pub/Sub and Cloud Functions
- Knowledge of real-time data streaming and analytics with BigQuery Streaming API
- Experience with Looker or Looker Studio (formerly Data Studio) for data visualization and insights
- Ability to thrive in a fast-paced startup environment, balancing innovation with execution