Cloud Operations Engineer
Honeywell changes the way the world works.
Honeywell is charging into the IoT revolution with the establishment of Honeywell Connected Enterprise (HCE), building on our heritage of invention and deep, on-the-ground industry expertise. HCE is the leading disruptor, building and connecting software solutions to streamline and centralize the assets, people and processes that help our customers make smarter, more accurate business decisions.
HCE provides cloud-based Enterprise Performance Management offerings in domains such as Buildings, Industrial Plants and Aero, enabling customers to innovate faster, operate more efficiently and maximize the value of their assets. Powering all the HCE offerings is the underlying platform, Honeywell Forge, which provides an open and extensible foundation to unify operations and apply its capabilities consistently across customer business units.
Be part of a team that designs, develops and integrates highly complex software functions within Honeywell HCE. You will use your experience and judgment to plan and accomplish goals. You will also generate innovative solutions, trying different and novel ways to deal with problems and opportunities.
Hands-on design, analysis, development and troubleshooting of highly-distributed large-scale production systems and event-driven, cloud-based services
Administer systems, primarily Linux, managing a fleet of Linux and Windows VMs as part of the application solutions
Participate in pull requests that advance site reliability goals
Advocate IaC (Infrastructure as Code) and CaC (Configuration as Code) practices within Honeywell HCE
Own reliability, uptime, system security, cost, operations, capacity and performance analysis
Monitor and report on service level objectives for a given application's services. Work with business, technology teams and product owners to establish key service level indicators.
Ensure the repeatability, traceability, and transparency of our infrastructure automation
Support on-call rotations for operational duties that have not been addressed with automation
Support healthy software development practices, including complying with the chosen software development methodology (Agile, or alternatives), building standards for code reviews, work packaging, etc.
Create and maintain monitoring technologies and processes that improve visibility into our applications’ performance and business metrics and keep operational workload in check.
Partner with security engineers to develop plans and automation that respond aggressively and safely to new risks and vulnerabilities.
Develop, communicate, and monitor standard processes, collaborating across teams to promote the long-term health and sustainability of operational development tasks.
Participate in technical training events, game day scenarios, and professional conferences
YOU MUST HAVE
3 years of mastery of infrastructure automation technologies (like Terraform, CodeDeploy, Puppet, Ansible, Chef)
3 years of expertise in container and container-fleet-orchestration technologies (like Kubernetes, OpenShift, AKS, EKS, Docker, Vagrant, ZooKeeper, etc.)
5 years of cloud- and container-native Linux administration, build and management skills
5 years of experience in system administration, application development, infrastructure development or related areas
3 years of experience reading, understanding and writing code in the same areas
Versatility with troubleshooting diverse sets of hosting technologies strongly desired. These include web server platforms, application platforms, operating systems, network components, virtualization technologies, storage, and database platforms
Expertise with cloud-based, continuous-deployment software development lifecycles (e.g. CI/CD)
Cloud database operations and deployment experience (RDS MySQL/Postgres/Aurora); caching operations and deployment experience (memcache, Redis)
Expertise with Lean/Agile deployment processes (blue/green, zero-downtime (ZDT), canary, load balancer/DNS strategies, A/B testing, feature flagging methodologies)
Familiarity with site and infrastructure monitoring systems (like ELK, Datadog, AppDynamics, New Relic, Splunk, Sumo Logic, Grafana)
Ability to design and manage escalation response plans driven by monitoring, and to react, respond, remediate and run retrospectives in culturally aligned (proactive, customer-focused, collaborative, data-driven) ways
Demonstrated expertise building and managing highly scaled production infrastructure in the cloud (Azure required; GCP, AWS, OpenStack a plus)
Expertise with SDLC branching, SCM, and code deployment systems (Bitbucket, Git/Gitflow, Jenkins, CircleCI, Travis CI, etc.)
To apply for this job please visit careers.honeywell.com.