#req11156
· Extract and manipulate data from multiple sources.
· Automate data workflows such as data ingestion, aggregation, and ETL processing.
· Transform raw data in data warehouses into consumable datasets for both technical and non-technical stakeholders.
· Partner with data scientists and functional leaders in sales, marketing, and product to deploy machine learning models in production.
· Build, maintain, and deploy data products for analytics and data science teams on cloud platforms (e.g., AWS, Azure, GCP).
· Ensure data accuracy, integrity, privacy, security, and compliance through quality control procedures.
· Monitor the performance of data systems and implement optimization strategies.
· Leverage data controls to maintain data privacy, security, compliance, and quality for allocated areas of ownership.
You have what it takes if you have...
· 3+ years of SQL experience, including relational databases and database design.
· Experience working with cloud data warehouse solutions such as Databricks or Apache Spark.
· Experience working with data ingestion tools such as Fivetran, Stitch, or Matillion.
· Working knowledge of cloud-based solutions (e.g., AWS, Azure, GCP).
· Experience building and deploying machine learning models in production.
· Strong proficiency in object-oriented languages such as Python, Java, C++, or Scala.
· Strong proficiency in scripting languages like Bash.
· Strong proficiency in data pipeline and workflow management tools (e.g., Airflow).
· Strong project management skills.
· Excellent problem-solving, communication, and organizational skills.
· Proven ability to work both independently and as part of a team.
Extra dose of awesome if you have...
· Good understanding of NoSQL databases like CrateDB, Redis, Cassandra, MongoDB, or Neo4j.
· Experience working with large data sets and distributed computing frameworks (e.g., Hive, Hadoop, Spark, Presto, MapReduce).