Data Engineering at SpotHero

Team enjoying food

At SpotHero, as we grow from a start-up into a scale-up, our internal platforms and teams are becoming much more performant, stable, and resilient. One of the pillars of this journey is Data Engineering, which serves as a platform both for other squads within Engineering and for other departments, e.g. Marketing and Finance.

What do we do?

The Data Engineering team’s primary mission is enablement. We make it easy for our stakeholders to get the data and machine learning services they need to do their jobs. Our direct stakeholders at SpotHero include Executives, Analysts, Data Scientists, and Software Engineers, but the work we do powers data-driven decisions in every corner of SpotHero. Our work also powers our partners, whether they are Parking Operators or Technology partners.

Some of the challenges we tackle include:
  • Problems that arise from growing data volumes in our two-tier marketplace and from external data sources
  • Forecasting the growth rate of that data and scaling accordingly
  • Managing infrastructure and tools that constantly change to adapt to these growing data needs
  • Cost analysis of adding features to internal tooling versus buying competing external tools
  • Keeping up with industry gold standards on security and features, and with API upgrades for data sources
  • Focusing primarily on business needs, whether they come directly from business units or indirectly through the use cases of Analysts and Data Scientists

So what does this actually look like? Although the Data Engineering team spends a portion of our time and effort making custom data pipelines, a much larger portion is spent developing, testing, and maintaining a collection of services that allow our users to build data pipelines and machine learning projects by themselves. Collectively, we refer to these services as the Data Platform™.

On any given day, a member of the Data Engineering team may work with any of our many services, which are built on technologies like:

  • Python / Kotlin + Java (to write application code)
  • Docker (to containerize our code)
  • Apache Spark (to handle memory-intensive workloads)
  • Kafka, Kafka Streams, and Kafka Connect (to build streaming services)
  • Apache Airflow (to orchestrate data pipelines)
  • Trino and Hive/Iceberg (Data Lake)
  • AWS Redshift (Data Warehouse)

For our services, we rely heavily on Terraform to provision and manage infrastructure as code, and Kubernetes to deploy and scale our applications in AWS.

To be clear, we do not measure success on our team by the sprawl of our technology footprint. In fact, we try to do just the opposite. We highly prioritize reducing technical debt as much as possible and are incredibly intentional about only expanding our tech stack when doing so provides a clear benefit to our team and the organization.

The breadth of our stack comes from the number of features and the level of performance we try to provide to stakeholders. Whether it be generating batch or streaming data pipelines, transforming unstructured datasets in the data lake into structured data in the data warehouse, or provisioning machine learning POC clusters dynamically through a web interface, the services we provide enable users of the data platform to do their work robustly and efficiently.
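To make the "unstructured to structured" idea a little more concrete, here is a minimal Python sketch of that kind of transformation. It is not SpotHero's actual pipeline code: the event shape, the field names (`id`, `type`, `payload`, `facility_id`, `amount_cents`), and the function names are all hypothetical, chosen only for illustration. It reads loosely structured JSON lines, projects each one onto a fixed set of columns, and sets aside anything that cannot be parsed.

```python
import json


def flatten_event(raw_line):
    """Parse one JSON line and project it onto a fixed set of columns.

    The field names below are hypothetical, not a real schema.
    Fields missing from the input become None.
    """
    record = json.loads(raw_line)
    payload = record.get("payload", {})
    return {
        "event_id": record.get("id"),
        "event_type": record.get("type"),
        "facility_id": payload.get("facility_id"),
        "amount_cents": payload.get("amount_cents"),
    }


def structure_events(raw_lines):
    """Split raw lines into structured rows and rejected lines."""
    rows = []
    rejects = []
    for line in raw_lines:
        try:
            rows.append(flatten_event(line))
        except (json.JSONDecodeError, AttributeError):
            # Not valid JSON, or not shaped like an object with a dict payload.
            rejects.append(line)
    return rows, rejects


if __name__ == "__main__":
    raw = [
        '{"id": 1, "type": "reservation", "payload": {"facility_id": 42, "amount_cents": 1500}}',
        "not json",
        '{"id": 2, "type": "search"}',
    ]
    rows, rejects = structure_events(raw)
    print(rows)
    print(rejects)
```

In a real platform, this step would more likely run inside an engine such as Spark or Trino and write to warehouse tables. The sketch only shows the shape of the work: a fixed schema on the way out, and an explicit place for bad records to go.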

While the large number of domains in which the data engineering team operates does sometimes present unique challenges for the team, it’s certainly never boring! Working on the data team is also a great way to quickly learn a bunch of skills, both technical and ones that involve collaborating effectively with other teams. Moreover, you get to help some pretty awesome people do some pretty cool work!

How do we socialize with tech communities?

The Data Engineering team at SpotHero works in multiple programming languages (e.g. bash, Python, Kotlin, several flavors of SQL), deploys a wide range of open source technologies (e.g. Apache Airflow, Apache Spark, Trino), and manages our own infrastructure supporting applications and technologies.

As a result, we’re fairly active in the communities of practice around these technologies.

Internally, the team contributes presentations, feedback, and guidance to several “guild” groups:

  • Backend Unite – all backend engineering concerns at SpotHero
  • Kotlin Guild – community of practice for users of the Kotlin programming language
  • Ops Guild – SpotHero-wide concerns related to networking, security, infrastructure, and developer experience

Externally, we have made open source code contributions, participated in the Slack communities around the technologies we use, and hosted some local meetup groups in our Chicago office.

This year, the Data Engineering squad has directly participated in hosting two meetup groups:

MLOps Community Chicago (September 2022)
James Lamb presenting at Chicago Python Meetup (March 2022)

One of our team members has also spoken publicly about his experiences as a machine learning engineer at SpotHero. See “Building for small data science teams”, James Lamb’s interview with the MLOps Community podcast (video link).

Who do we work with?

The Data Engineering Squad works with every team at SpotHero that works with data. Yup, that covers a very wide audience. It includes all engineering stakeholders: Business Analytics, Data Scientists, the Precision Pricing team, the Platform Engineering team, and all of the Operator- and Driver-supporting teams. We also work with teams outside of Engineering that drive SpotHero’s growth, such as Marketing, Business, Finance, and Revenue. Our work with stakeholders covers a wide range of tasks, such as recommending stable, long-term data landing patterns based on each stakeholder’s use case.

Who is on the Data Engineering Squad?

The Data Engineering Squad is made up of product-focused engineers with a wide range of talents: a mixture of Data Engineers, Machine Learning Engineers, and a Product Manager.

Meet the Data Engineering Squad 👋

DE Squad at a team dinner!

Thinking of joining us?

Are you looking for your next big data challenge, or looking to learn and grow as a Data Engineer? We want to hear from you! Take a look at our job postings: at the time of writing this blog, we still have 2 open Data Engineering positions – please refer to our careers page here for more details.

Team gatherings through 2022:

Gaming away
Sit out at stand up!
Chatting Away

More on the team

If we had to pick a word to describe our team, we would choose “High-Performing”. Yes, those are two words, but we cannot avoid them when describing the Data Engineering Squad at SpotHero. We attribute this to having a clear scope, automating where appropriate, engaging with stakeholders, handling tech debt, maintaining appropriate observability, dealing with ambiguity, and really being Agile.

Featured Profiles

Zack Lawson – Squad Lead, DE Squad

Zack Lawson
  • About me: I got my start in technology doing data analysis, and from there I’ve free-fallen into all things data! I’ve bipped and bopped around the West Coast and am currently located in Seattle. When I’m not doing computer things for work, I’m doing computer things with games and cooking food. I think you’re cool and it’s just the best that you’re reading this biography.
  • Hometown: Berkeley, CA
  • Current town: Seattle, WA
  • Parking Style: With as minimal contact with other cars as possible.
  • Favorite Movies: Shaun of the Dead, Kung Fu Hustle, Mad Max: Fury Road, Pontypool
  • Random Personal Fact: I’ve been cooking for myself since I was 6 years old.

James Salvatore – Sr MLE, DE Squad

James Salvatore
  • About me: James was born in New Jersey but moved to Chicago to do undergrad in mathematics, then moved to New Orleans to do graduate work. After a whole ton of academia, James decided to transition into the industry and did QA, Data Analysis, Data Engineering, and Data Science before coming over to SpotHero to do Data Engineering and Machine Learning Engineering.
  • Hometown: Central New Jersey.
  • Current town: Chicago, Illinois.
  • Parking Style: Bike parking only. :'[
  • Favorite Movies: The Silence of the Lambs (1991), Metropolis (1927), The Grand Budapest Hotel (2014).
  • Random Personal Fact: I took piano lessons for a year so that I could beat an old-school Nintendo piano-controlled game.

Surya Dutta – Sr Data Engineer, DE Squad

Surya Dutta
  • About me: I was born in Kolkata, India, and moved to the US when I was four. In undergrad, my research work in experimental particle physics introduced me to programming, data analytics, and machine learning. I love that Data Engineering sits in the intersection of all of these fields. Outside of work, I usually spend my time perfecting my espresso shots, trying new recipes from Serious Eats, playing video games (mostly Formula 1 sim racing), and going on long walks with Luna, my rescue pup.
  • Hometown: Johns Creek, GA
  • Current town: Chicago, IL
  • Parking Style: Anywhere there’s EV charging! Bonus if it’s free.
  • Favorite Movies: Nope, The Prestige, Lord of the Rings Trilogy (extended editions)
  • Random Personal Fact: I spent 6 months competing on my college’s ballroom dancing team

James Lamb – Staff MLE, DE Squad

James Lamb
  • About me: I’m a machine learning engineer at SpotHero, and have worked in the past as a backend engineer, data scientist, economic analyst, teaching assistant, marketing intern, custodian, and little league umpire. I’m an active open source contributor, and have been a contributor, maintainer, and conference/meetup speaker for the last 6 years (see my GitHub).
  • Hometown: Chicago Heights, IL
  • Current town: Oak Lawn, IL
  • Parking Style: low-skill, low-confidence (if we might have to parallel park somewhere, my wife drives)
  • Favorite Movies: V for Vendetta, The Sting, Boo! a Madea Halloween
  • Random Personal Fact: For 2 years in college, I ran a hip-hop radio show where we interviewed small-time rappers from around the country.

Megha Lakshmi Narayanan – PM, DE Squad

  • About me: I am the Product Manager for both the Data Engineering & Data Science squads at SpotHero. My journey to data product management came by way of past stints as a Data Scientist and a Data Engineer in AdTech and FinTech spaces. Outside of work, I’m usually watching a lot of TV shows, and doomscrolling on Twitter.
  • Hometown: Bangalore, India
  • Current town: Toronto, Canada
  • Parking Style: Nonexistent since I do not drive 😄
  • Favorite Movies: Prefer TV shows over movies – Psych, Breaking Bad+Better Call Saul, Rectify, Reply 1988
  • Random Personal Fact: Have appeared on both a TV program and the newspaper (for different reasons) in India

Todd Schreiber – Agile Coach, DE Squad

Todd Schreiber
  • About me: I was born and raised in Evanston, Illinois and currently reside there with my wife of nearly 20 years and my three children (boys 16, 13; girl 11). I am an ambivert who can operate in a group environment, but I don’t actively seek out plans. I use a lighthearted, but serious-when-needed, approach to coaching the teams I work with: let’s dig into the numbers and our approach and see how we can make things better or more efficient.
  • Hometown: Evanston, IL
  • Current town: Evanston, IL
  • Parking Style: Parallel with a Minivan
  • Favorite Movies: The Usual Suspects, Shawshank Redemption, The Other Guys
  • Random Personal Fact(s): Got run over by a car at age 6 / Had about 8 three-pointers rained on me by Tony Romo in an intramural basketball game

Sunny Gurm – EM, SpotHero IQ

Sunny Gurm
  • About me: I was born in India, and immigrated with my family to the US and eventually Canada where I did a majority of my education (highlight: doing an internship abroad in Karlsruhe, Germany!). I worked for 10+ years in the medtech/biotech industries focusing on R&D, machine learning, and computer vision applications. I am a sports fanatic, avid reader, traveller, and amateur beer brewer.
  • Hometown: Kitchener, ON, Canada
  • Current Town: Ottawa, ON, Canada
  • Parking Style: If nobody is watching, I’ll (try to) parallel park
  • Favourite Movies: Trading Places, Pulp Fiction, The Great Escape
  • Random Personal Fact: I genuinely think Malort tastes good

Sudhir Vissa – EM, DE Squad

Sudhir Vissa
  • About me: Born and raised in south India with a culture of festivities and good food, I traveled to New Jersey for my education. After graduation, I found myself modding mobile phones in my first job at Motorola in NJ. Fast forward 16 years, and I find myself at SpotHero working on big data. Along the way I have worked in research and on product releases across hardware, software, and SaaS. I am passionate about empowering businesses with game-changing technologies and processes.
  • Hometown: Hyderabad, India
  • Current town: Chicago, IL
  • Parking Style: 360, if I could!
  • Favorite Movies: The Matrix, Love, Death & Robots, 12 Angry Men.
  • Random Personal Fact: I love indoor plants and encourage growing plants from scratch!