At a Glance
We were approached by the sales and customer service team for a medical device company to help them manage their complex technology stack. Over several years, they had built an ecosystem of tools to manage the complex pipeline of data from marketing through to the customer support efforts. This ecosystem had developed organically over time through a mix of commercial, open-source, and custom tools in response to evolving business needs. Convertiv was asked to perform a deep dive into the toolchain and re-engineer it to pay down accumulated technical debt and improve performance across the pipeline.
Our customer found themselves managing a sprawling web of data services, including a central custom CMS and a collection of API-driven microservices. This network of tools, built partly by third-party agencies and partly by in-house engineers, had expanded piecemeal to support the changing needs of the marketing, sales, and customer support teams.
That piecemeal growth, driven by emerging business requirements, produced a system that was both critical to the company’s success and a significant liability. As usage climbed, load increased and outages became common. New features were bolted on to meet urgent business needs, resulting in a fractured, confusing user experience. This friction consumed both technical and administrative resources, creating serious drag on growth.
Our mandate is to partner with stakeholders, understand their mission, and identify the ways in which technology is creating drag on that mission. In this case, with the stability of the system at risk, we divided our efforts. Our operational engineers immediately linked the existing infrastructure to our instrumentation and tooling to help us visualize the structural problems. While that work was underway, our UX team began modeling the complete user journey for each set of stakeholders. Together, these two efforts produced a clear picture of the liabilities and systemic friction that were creating drag.
Define system performance indicators and instrument the critical system to provide improved visibility. Systems of this nature are often a black box, even to people who work closely with them. Introducing instrumentation, analytics, logging, and data visualization tools gave us visibility into the system.
Have rigorous, clear, well-documented guidelines and patterns for technical processes. In a busy company, it’s easy to reach for any tool that solves the immediate problem. Even temporary solutions can quickly become part of mission-critical infrastructure. We worked with our customer to develop and document clear technical guidelines at each level. These documents and patterns serve as a check on future debt accumulation, ensuring that new processes and tools fit well in the ecosystem and advance the entire stack. During this project, we included the following to help manage long-term debt:
Consolidate the toolchain to focus on mission-critical goals, and build on those tools. Once we had a clear set of standards and visibility into the process, we focused on finding the tools and processes that were creating choke points. In this case, we found a combination of issues. Some systems were poorly designed, ill-defined, or inadequately maintained. In particular, the proliferation of API microservices, combined with a complex, poorly written CMS, contributed significantly to the problems. By eliminating these tools and consolidating the processes onto a single platform, we significantly improved the stakeholder experience while reducing cost, operational liability, and security exposure.
We refocused staff away from operations and training and back onto the core business mission. With improved uptime and a consistent, intuitive UI, the teams could concentrate their efforts where they mattered most.
We implemented a faster development cycle for new custom tools. By building on a solid foundation and following well-established patterns, the customer is now able to respond more quickly to emerging business needs and bring new systems online at a fraction of the cost and time.
We reduced cloud costs by 60 percent. By eliminating unnecessary microservices and performance bottlenecks, we were able to reduce cloud costs dramatically.
We achieved 99.9968% critical system uptime. Before remediation, the legacy system was experiencing several hours of downtime per month. Our instrumentation gave us deep visibility and enabled proactive improvements, eliminating the performance bottlenecks. Since the launch of phase 1, the new platform has recorded 99.9968% uptime.
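To put the uptime figure in concrete terms, the short Python sketch below converts an uptime percentage into the downtime it implies. The function name, the 30-day month, and the 365-day year are illustrative assumptions rather than part of the project's tooling, and the calculation treats the percentage as measured over continuous wall-clock time.

```python
def implied_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Return the minutes of downtime implied by an uptime percentage.

    Assumes the percentage is measured over continuous wall-clock time
    for the whole period (a simplifying assumption for illustration).
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Hypothetical reporting windows: a 30-day month and a 365-day year.
monthly = implied_downtime_minutes(99.9968, 30)
yearly = implied_downtime_minutes(99.9968, 365)

print(f"{monthly:.2f} minutes per 30-day month")  # about 1.38
print(f"{yearly:.2f} minutes per 365-day year")   # about 16.82
```

Under these assumptions, 99.9968% uptime works out to roughly a minute and a half of downtime per month, compared with the several hours per month the legacy system was experiencing.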