#1905
This role focuses on enabling data science and search-based applications on large, low-latency data sets in both batch and streaming processing contexts. To that end, this role will engage with team counterparts in exploring, developing, and deploying technologies for creating data sets using a combination of batch and streaming transformation processes. These data sets support both offline and inline machine learning training and model execution; other data sets support search engine based analytics. Exploration and deployment activities include identifying opportunities that impact business strategy, selecting data solutions software, and defining hardware requirements based on business requirements. Responsibilities also include coding, testing, and documenting new or modified scalable analytic data systems, including automation for deployment and monitoring. This role works with team counterparts to architect an end-to-end framework built on a group of core data technologies, and also develops standards and processes for data engineering projects and initiatives.
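To give a concrete, purely illustrative sense of the streaming transformation work described above, the sketch below shows a minimal Spark Structured Streaming job in Scala that reads raw events from Kafka and writes a derived data set for downstream model training. The broker address, topic, field parsing, and output paths are hypothetical placeholders, not a description of this team's actual pipelines.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Minimal, hypothetical streaming feature-extraction job.
// Requires the spark-sql-kafka connector on the classpath.
object StreamingFeatureJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-feature-extraction")
      .getOrCreate()

    // Read a raw event stream from Kafka (broker and topic are placeholders).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092")
      .option("subscribe", "raw-events")
      .load()
      .selectExpr("CAST(value AS STRING) AS payload", "timestamp")

    // A trivial transformation standing in for real feature extraction:
    // count events per entity in 5-minute event-time windows.
    val features = events
      .withColumn("entity_id", substring(col("payload"), 1, 8))
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window(col("timestamp"), "5 minutes"), col("entity_id"))
      .count()

    // Persist the derived data set for offline training or search indexing.
    features.writeStream
      .outputMode("append")
      .format("parquet")
      .option("path", "/data/features/event_counts")
      .option("checkpointLocation", "/data/checkpoints/event_counts")
      .start()
      .awaitTermination()
  }
}

The same transformation logic can usually be reused in a batch backfill over historical data, which is the combined batch-and-streaming pattern the role description refers to.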
Evaluate, research, and experiment with data engineering technologies in a lab to keep pace with industry innovation while assessing business impact and viability for use cases associated with efforts at hand
Work with data engineering related groups to inform on and showcase capabilities of emerging technologies and to enable the adoption of these new technologies and associated techniques
Define and refine processes and procedures for the data engineering practice
Work closely with data scientists, data architects, ETL developers, other IT counterparts, and business partners to identify, capture, collect, and format data from external sources, internal systems, and the data warehouse to extract features of interest
Code, test, deploy, monitor, document, and troubleshoot data engineering processing and associated automation
Define data engineering architecture both hardware and software reflective of business requirements to be included in end-to-end solution architecture
Educate and develop ETL developers on data engineering to enable their transition into the data engineering practice
Conduct code reviews, suggest improvements, and support technology upgrades for the common libraries; hand them over to the corresponding development teams for quality checks and support them through deployment into production
Support ETL developers and Operations teams in troubleshooting incidents, performing root cause analysis, and developing solutions to meet service level agreements
Work with Operations teams in Big Data, IT, and Information Security on monitoring and troubleshooting incidents to maintain service levels
Contribute to the evolving distributed systems architecture to meet changing requirements for scaling, reliability, performance, manageability, and cost
Report utilization and performance metrics to user communities
Contribute to planning and implementation of new/upgraded hardware and software releases
Monitor the Linux, Hadoop, and Spark communities and vendors, and report important defects, feature changes, and/or enhancements to the team
Research and recommend innovative and, where possible, automated approaches to administration tasks
Identify approaches that improve resource utilization efficiency, provide economies of scale, and simplify support issues
QUALIFICATIONS
What Makes You A Dream Candidate?
Strong working knowledge of Hadoop and Spark cluster security, network connectivity, and I/O throughput, along with other factors that affect distributed system performance
Strong working knowledge of disaster recovery, incident management, and security best practices
Working knowledge of containers (e.g., Docker) and major orchestrators (e.g., Mesos, Kubernetes, Docker Datacenter)
Working knowledge of automation tools (e.g., Puppet, Chef, Ansible)
Working knowledge of software defined networking
Working knowledge of parcel-based upgrades with Hadoop (i.e., Cloudera)
Working knowledge of hardening Hadoop with Kerberos, TLS, and HDFS encryption
Working knowledge of directed acyclic graph (DAG) stream processing using Beam, Flink, NiFi, and/or Samza
Excellent knowledge of Linux, AIX, or other Unix flavors
Working knowledge of cloud-based implementations (e.g., Microsoft Azure) with emphasis on security using ACLs and Artifactory Groups
Ability to accept change and to adapt to shifting organizational challenges and priorities
Ability to coach, develop and lead others
Ability to evaluate problems and issues quickly, and to make recommendations for courses of action
Ability to make independent decisions and use sound judgment in relation to the management of team members
Ability to prioritize tasks and ensure their completion in a timely manner
Excellent analytical and troubleshooting skills
Strong interpersonal, verbal and written skills
Experience and Education:
5-7 years of software engineering experience, including Java, Scala, and Python, required
5-7 years of proficiency processing large data sets with Kafka, RabbitMQ, Flume, Hadoop, HBase, Cassandra, and/or Spark, or similar distributed systems, required
3-5 years of hands-on scripting experience with Bash, Perl, or Ruby required
3-5 years of hands-on development and processing experience with Kafka, HBase, Solr, and Hue required
2-4 years hands-on experience with ETL and Business Intelligence technologies such as Informatica, DataStage, Ab Initio, Cognos, BusinessObjects, or Oracle Business Intelligence required
2-3 years hands-on experience with SQL, data modeling, and relational databases such as Oracle, DB2, and Postgres required
Proven track record with NoSQL data stores such as MongoDB, Cassandra, HBase, Redis, Riak or other technologies that embed NoSQL with search such as MarkLogic or Lily Enterprise required
0-2 years of management experience with a data engineering team preferred
High School Diploma or equivalent required
Bachelor's Degree in related field or equivalent work or military experience required
What We Offer: Generous benefits package available on day one to include: 401K matching, bonding leave for new parents (12 weeks, 100% paid), tuition assistance, training, GM employee auto discount, community service pay and nine company holidays.
Our Culture: Our team members define and shape our culture - an environment that welcomes innovative ideas, fosters integrity, and creates a sense of community and belonging. Here we do more than work - we thrive.
Compensation: Competitive pay and bonus eligibility
Work Life Balance: Flexible hybrid work environment, 3 days a week in office
#LI-hybrid
#LI-KC1
#GMFjobs