DevOps Lead (remote work)
Our client has built the leading cloud-based online scheduling, point of sale, and automated marketing suite to power the operations of local service businesses from salons and spas to pet groomers. It's used at over 14,000 locations across the US with sizes ranging from single-location sole proprietors to large chains and franchises.
The product is mission-critical software that customers use for over 10 hours per day. It is an all-in-one business solution built around a world-class scheduling platform that can grow revenue by as much as 30% through automated marketing and online distribution.
What makes this product unique is that it's a platform that allows many third-party partners to integrate the online booking functionality into their own products.
The longer-term vision is to become the largest platform in the world for online scheduling and local commerce.
This is a hybrid role combining ownership of System Administration and DevOps. You will be responsible for ensuring our services and infrastructure are fast, stable, and scalable. You will also build out services and tooling that are not already available as open-source software. Operational tasks such as infrastructure, build/release, and systems administration will also fall within your responsibilities.
As our primary SysAdmin, you will lead the effort to build solutions that provide high availability, routine security updates, and stability across our infrastructure. Wherever possible, you will seek out and drive cost reductions through service optimizations and demand-based auto-scaling, while working with IT, engineering, and business groups to understand the functionality, scalability, performance, security, and integration requirements.
From a DevOps perspective, you will be responsible for configuration management and the build and release lifecycle.
The ideal candidate will need a strong software development background along with a solid understanding of systems, database architecture, and data integrity. We are looking for someone with a passion for programming and automation, but who can also think about business needs and how to improve the current state of our infrastructure and fulfill those needs.
System troubleshooting and problem solving across platform and application domains will be a part of the job, but we are primarily looking for someone who can proactively suggest architecture improvements to our engineering process and system design in general.
If you have a passion for programming and automation, and actively look for opportunities to develop tools to streamline and simplify the development and delivery process, we would love to talk!
Responsibilities
Oversee and manage the release process
Investigate and recommend best practices for maintaining code quality, including development of code metrics, code review workflows, code coverage measurement and the efficient use of static and dynamic analysis
Build and maintain tools for release, infrastructure and application monitoring and operations
Maintain appropriate technical documentation regarding configurations, operations and troubleshooting procedures
Monitor Linux-based web servers, database servers, application servers, and Elasticsearch clusters
Provision and manage cloud infrastructure using tooling such as Terraform
Ensure critical system security in compliance with company security policy through the use of best-in-class cloud security solutions
Help guide our engineering team by providing better insights into our response and availability metrics
Accept on-call rotations for emergency situations (resolving network, storage, DB, or memory issues)
Coordinate with the appropriate teams for incident resolution for high severity or escalated incidents
Manage Backup and Recovery procedures, in accordance with our Disaster Recovery and Continuity policies
Actively mentor junior developers and train experienced engineers, improving their skills, knowledge of our systems, and their ability to get things done!
Evaluate new technology options and vendor products
Requirements
Minimum of 5 years of infrastructure operations experience, including architecting databases and web servers for scalability and high availability
Familiarity with systems, networking, and software development (OS, firewalls, load balancers, web servers, application servers, etc.)
Familiarity with software development lifecycle (requirements gathering, design, implementing, testing, and production support)
Experience with Amazon Web Services, MySQL and NoSQL databases, Docker containers, Elasticsearch clusters, nginx web servers, and HAProxy load balancers strongly preferred
Familiarity with tools for monitoring (esp. CloudWatch, Grafana) and logging (esp. Kibana, Logstash) strongly preferred
Some knowledge of Ruby on Rails is a plus
Experience with agile software development environments
Excellent communication skills, fluent in English, and eager to learn new technologies and solutions
Compensation & Benefits
- Competitive salary. Equity. Opportunity to make a tremendous impact.
- Full Benefits Package
- The chance to actually make a difference in a growing startup that is solving a big problem