We are looking for a Data Engineer who will be working as part of a team
responsible for fully automating the execution of campaigns and optimising towards
requested KPIs and objectives.
Search on its own is a highly complex dataset. Captify’s technologies have been
built to extract maximum value from search for brands, partners and businesses all
over the world and also to innovate & improve the consumer experience.
Captify’s world-class engineers, semantic specialists and product teams are building
the future of Search and as part of our Engineering Team you will play a key part in
developing our offering.
What you’ll be doing:
You will clean, transform, and analyse vast amounts of raw data from various systems using Spark
You will also train and deploy ML models in production
Create PySpark/Scala jobs for data transformation and aggregation
Build ML pipelines
Produce unit tests
What you need to be successful:
Proficiency in Python/PySpark and Apache Spark 2.x: RDD, SQL DataFrame, MLlib, Streaming
Spark query tuning and performance
PSQL database integration
Some experience working with HDFS,
Captify is the largest holder of first-party consumer search data outside of Google, and its unique technology understands the intent of consumers across all channels, including voice search, desktop on-site search and in-app search.