Data Engineer (Remote)

  • Full Time
  • Remote
  • Mid Level

Website: Stash

Invest in Yourself

Want to help everyday Americans invest and build wealth? Financial inequality is increasing, and too many people are getting left behind. At Stash, we are passionate about democratizing wealth creation through education, advice, and products that help customers achieve greater financial freedom.

At Stash, data is at the core of how we make decisions and build great products for millions of users. As a Data Engineer, you will join our Data Platform Team, which leads the architectural design and implementation of a modern data infrastructure at scale. You will build distributed services and large-scale processing systems that help teams across the company work faster and smarter. You will also partner with Data Science to productionize machine learning models and algorithms into data-driven products for our users.

Tools and technologies in our tech stack (evolving):

  • Hadoop, YARN, Spark, MongoDB, Hive
  • AWS (EMR, EC2, Lambda, Kinesis, S3, Glue, DynamoDB, API Gateway, Redshift)
  • Elasticsearch, Airflow, Terraform
  • Scala, Python

What you’ll do:

  • Build core components of the data platform that serve a range of consumers, including data science, engineering, product, and QA
  • Build data ingestion and transformation jobs as they are needed
  • Productionize our machine learning models and algorithms into data-driven feature MVPs that scale
  • Leverage best practices in continuous integration and deployment to our cloud-based infrastructure
  • Build scalable data services to bridge the gap between analytics and application space
  • Optimize data access and consumption for our business and product colleagues
  • Develop an understanding of key product, user, and business questions

 

Who we’re looking for:

  • 3+ years of professional experience working in data engineering
  • BS/MS in Computer Science, Engineering, Mathematics, or a related field
  • You have built large-scale data products and understand the tradeoffs made when building these features
  • You have a deep understanding of system design, data structures, and algorithms
  • Experience with (or a strong interest in) Python or Scala
  • Experience working with a cluster manager (YARN, Mesos, or Kubernetes)
  • Experience with distributed computing and with Spark, Hadoop, or the MapReduce framework
  • Experience working on a cloud platform such as AWS
  • General experience building and maintaining ETL pipelines

Gold stars:

  • Experience working with Apache Airflow
  • Experience working with AWS Glue
  • Experience in Machine Learning and Information Retrieval

**no recruiters please**


To apply for this job please visit grnh.se.