Modernizing Data Sources Using Shims

Business intelligence services must keep adapting to meet the needs of partners and team members, which calls for steady technological advancement. Yet even though technologists work in an environment that is always evolving and improving, it can take a long time (sometimes years) to reach the point where legacy systems can be retired.

IT IS COMPLEX TO TRANSITION DATA SOURCES

It can be challenging when cutting-edge solutions and antiquated systems must coexist. Creating a new greenfield system from scratch, with all the engineering bells and whistles, is relatively straightforward. Managing a rollout from the legacy system to the modern solution, even one under your direct control, becomes more difficult when you work with integrated systems, especially when you’re the data provider for other systems.

Complicating factors to consider:

Maintaining the Data’s Accuracy. Because most new system implementations do not occur in a “big bang” overnight, modern systems might not have all the data on hand on day one. Depending on the stakeholders’ level of readiness, the rollout could take months or even years to finish.

Redefining the Schemas for New Data Sources. While implementing new systems can be exciting, changing data relationships or formats can have a negative impact on other systems and business operations.

Adapting Integration Strategies for Systems. We can attest that this is a real factor. In the heyday of the mainframe and monolith, point-to-point integrations and direct database access served a useful purpose, but a services-first modular architecture has since taken their place.

Encouraging Consuming Systems to Change. If they are also trying to modernize, they may not be eager to. Who wants to invest in an outdated codebase that is about to disappear?

Making Assumptions About Timing. Timelines are frequently off, even when teams make a good-faith effort to stick to them.

What then can you do to successfully navigate a landscape of systems, deliverables, and data requirements that is constantly changing without completely upsetting all the business processes that rely on your data?

Try shimming

Shimming, while not always the best option, can be a great solution in some circumstances.

A shim is a temporary patch of data used during the period of coexistence. It lets you present one “whole” data set rather than two separate sets of data. Shims have two main uses:

Shim back is the process of replicating data from a modern or future system into an old or legacy data source. The objective is to keep existing consumers “whole.”

Shim forward is the process of replicating data from an outdated or legacy system into a current or future data source. The objective is to make the new data source “complete” so that customers can switch over as soon as possible.
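The two directions can be sketched in a few lines. This is a minimal illustration, not a real replication pipeline: each data source is modeled as an in-memory dict keyed by record id, and the function names and sample records are made up for the example.

```python
def shim(source: dict, target: dict) -> int:
    """Copy every record the target lacks from the source; return how many were copied."""
    copied = 0
    for key, record in source.items():
        if key not in target:
            target[key] = dict(record)  # copy so the two stores never share a mutable record
            copied += 1
    return copied


def shim_back(modern: dict, legacy: dict) -> int:
    """Replicate modern records into the legacy source so legacy consumers stay whole."""
    return shim(modern, legacy)


def shim_forward(legacy: dict, modern: dict) -> int:
    """Replicate legacy records into the modern source so it is complete for new consumers."""
    return shim(legacy, modern)


# Hypothetical sample data: two records live only in legacy, one only in modern.
legacy = {"store-1": {"city": "Austin"}, "store-2": {"city": "Boise"}}
modern = {"store-3": {"city": "Chicago"}}

legacy_view = dict(legacy)
print(shim_back(modern, legacy_view), sorted(legacy_view))
# 1 ['store-1', 'store-2', 'store-3']
```

The only difference between the two functions is which source is treated as the origin and which is made whole. A production shim would also have to handle updates and deletes, which this sketch ignores.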

When a rollout strategy gradually moves from 100% shim forward to 100% shim back over time, a bi-directional shim that combines shim back and shim forward may be necessary. In that case, take care not to duplicate transactions in either data source.
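The duplication risk in a bi-directional shim can be sketched as follows. This is one possible approach under stated assumptions, not a description of any particular implementation: every transaction is assumed to carry a unique id and an `origin` field naming the system where it was first written. Both fields, and all the names below, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    txn_id: str
    origin: str  # "legacy" or "modern": the system where the transaction was first written
    amount: int


def replicate(source: dict, target: dict, source_name: str) -> int:
    """Copy transactions that originated in source_name and are not yet in target."""
    copied = 0
    for txn_id, txn in source.items():
        if txn.origin != source_name:
            continue  # this record was itself shimmed in; do not echo it back
        if txn_id in target:
            continue  # already replicated; keeps repeated runs idempotent
        target[txn_id] = txn
        copied += 1
    return copied


def bidirectional_shim(legacy: dict, modern: dict) -> tuple:
    """Run shim forward, then shim back; return (forward_count, back_count)."""
    forward = replicate(legacy, modern, "legacy")
    back = replicate(modern, legacy, "modern")
    return forward, back


legacy = {"t1": Transaction("t1", "legacy", 10)}
modern = {"t2": Transaction("t2", "modern", 20)}
print(bidirectional_shim(legacy, modern))  # (1, 1)
print(bidirectional_shim(legacy, modern))  # (0, 0)
```

Tagging the origin keeps a shimmed record from bouncing back to the system it came from, and the id check makes a repeated run a no-op, so neither source ends up with the same transaction twice.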

Shim construction takes time; in fact, it takes so much time that it may be tempting to think it would be simpler to just “move fast” and omit the step altogether. While tempting, skipping shims in a technology environment the size and scope of BizBrolly is simply unrealistic given the amount of technical debt being addressed.

That said, shims are not right for every situation. We established guidelines for when to use them, which allowed us to embrace shims as a tool.

Here are a few illustrations based on typical criteria we used to determine whether or not to construct shims.

Master Location Data

This information includes the address, business hours, and capabilities for all our locations, including stores, distribution centers, and headquarters. The legacy source is stored in DB2 in our mainframe environment. The modern source would be service-enabled via APIs and change-event Kafka topics.

It’s a big decision whether or not to use shims. Here are some factors that we take into account before making a choice:

Consideration: If we shim forward, can we significantly speed up the adoption of the new data source?

Answer: No. There is no lengthy rollout period; the conversion to modern can be completed in a single day.

Consideration: If we shim back, do we actually cause more issues than we solve?

Answer: No. The mapping is actually quite simple.

Consideration: How long will this really have to go on for?

Answer: Years. Modernizing every legacy consumer will take that long.

Consideration: If we don’t shim back, how many consumers will be affected?

Answer: Tens of thousands of programs, most of which access the data through hardcoded SQL queries and joins.

VERDICT: SHIM BACK

WHY: Because of the large number of essentially hardcoded legacy consumers, we decided it would be better for the team that owns the location data to shim back to the legacy data source, even if that meant the shim would need to last for a very long time. Because the rollout itself was brief, there was no need to shim forward.
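To picture what a shim back for location data might involve, here is a minimal sketch that translates a modern change event into a legacy-style row. Everything specific in it is an assumption made for illustration: the event shape, the column names (`LOC_ID`, `LOC_TYPE`, `ADDR_LINE`, `OPEN_HRS`), and the dict that stands in for the DB2 table. No real Kafka or DB2 calls are made.

```python
def to_legacy_row(event: dict) -> dict:
    """Translate a modern location change event into a legacy-style row (hypothetical columns)."""
    location = event["location"]
    return {
        "LOC_ID": location["id"],
        "LOC_TYPE": location["type"].upper(),
        "ADDR_LINE": location["address"],
        "OPEN_HRS": f'{location["opens"]}-{location["closes"]}',
    }


def apply_shim_back(event: dict, legacy_table: dict) -> None:
    """Apply one change event to legacy_table, a dict standing in for the legacy table."""
    if event["change"] == "delete":
        legacy_table.pop(event["location"]["id"], None)  # deleting a missing row is a no-op
    else:
        row = to_legacy_row(event)
        legacy_table[row["LOC_ID"]] = row  # upsert keyed by location id


# Hypothetical event, as it might arrive from a change-event topic.
event = {
    "change": "upsert",
    "location": {
        "id": "S100",
        "type": "store",
        "address": "1 Main St",
        "opens": "08:00",
        "closes": "22:00",
    },
}

legacy_table = {}
apply_shim_back(event, legacy_table)
print(legacy_table["S100"]["OPEN_HRS"])  # 08:00-22:00
```

Keeping the translation in one pure function means the hardcoded legacy queries continue to see the row shape they expect, while the modern source is free to evolve its own schema.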