The painful cost of overestimating data quality in migration

Given the critical role of data in high-priority digital transformation programs (something Ian Crone blogged about recently), it’s surprising – alarming, even – how much is assumed about it as companies go into new technology projects. Until they know differently, IT project teams and their business leads tend to operate on the premise that the chosen new Regulatory Information Management (RIM) system will transform the way they work and deliver the desired outcomes. The corporate data it draws on is more of an afterthought. No one doubts its quality and complexity – until it’s too late.

It’s this risky assumption that must be challenged, and it should be challenged much earlier in the project timeline – long before any system/platform implementation has been set in motion, and before any deadlines have been agreed. The inherent risk, otherwise, is that the implementation project will have to stop partway through, when it becomes obvious that the data sets involved are not all they could or should be.

Identifying discrepancies

Often, it’s the smallest discrepancies that cause the biggest hiccups in data consistency. Simple inconsistencies, such as using different abbreviations for the same compound or misspelling a product name, can result in duplicates appearing in a system. More complex issues arise when the project is linked to IDMP preparations, whether as a primary focus or a secondary benefit: there may be fields that have yet to be completed, or that require content from other sources. Multiply this by potentially millions of data points and the risk becomes clear.
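As a rough illustration of how such near-duplicates can be surfaced before migration, here is a minimal sketch using only Python’s standard library. The record fields, the abbreviation table and the similarity threshold are all assumptions made for the example; a real project would derive them from a proper data-analysis step.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical legacy records; "name" holds the product/compound label.
records = [
    {"id": "REG-001", "name": "Acetylsalicylic Acid 100 mg"},
    {"id": "REG-002", "name": "ASA 100mg"},                     # abbreviation of the same compound
    {"id": "REG-003", "name": "Acetylsalicyclic Acid 100 mg"},  # misspelling
]

ABBREVIATIONS = {"asa": "acetylsalicylic acid"}  # assumed mapping table

def normalize(name: str) -> str:
    """Lower-case the name, expand known abbreviations and smooth spacing quirks."""
    tokens = name.lower().replace("mg", " mg").split()
    tokens = [ABBREVIATIONS.get(token, token) for token in tokens]
    return " ".join(tokens)

def suspicious_pairs(rows, threshold=0.85):
    """Yield pairs of records whose normalized names look like duplicates."""
    for a, b in combinations(rows, 2):
        score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
        if score >= threshold:
            yield a["id"], b["id"], round(score, 2)

for pair in suspicious_pairs(records):
    print("possible duplicate:", pair)
```

Flagged pairs like these would then go to a business reviewer for a decision, rather than being merged automatically.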

It could be that multiple different RIM systems are being consolidated into one new one. As each system is trawled for its constituent information, consideration needs to be given to differing formatting, data duplication and variability in data quality. A myriad of dependencies and checkpoints between systems must be managed to ensure the success of the content migration project.

Inevitably there will be interdependencies between the data in different systems too, and links between content (between source RIM data and stored market authorisation documents, for instance). All of this needs to be assessed, so that project teams understand the work that will be involved in consolidating, cleaning and enriching all of the data before it can be transferred to and rolled up in any new system.

The sobering costs of project recalibration

If a system implementation is already underway when data issues are identified, project teams must recalculate and recalibrate, which can incur significant cost and effort. Before they know it, a project that was scheduled to take a year needs an additional three months to clean up and enhance the data.

Processing change requests will require key resources that are now committed elsewhere – not to mention additional budget that hasn’t been provided for (a 25%+ hike in costs is not unusual as data quality issues are identified). Meanwhile, there are users waiting for capabilities that now aren’t going to materialize as expected: delays are inevitable. Data is critical to the business, and without the right quality of data, a new system cannot go live.

All of this could be avoided if the data migration implications of a project were analysed, assessed, understood and fully scoped ahead of time. The good news is that this oversight is relatively easy to rectify – for future initiatives at least. It’s simply a case of calling in the right experts sufficiently early in the transformation definition and planning process, so that the appropriate analyses can be performed.

More on our Content Migration Assessment


Give data migration its own RFP

In this context, and to improve their own agility and responsiveness, Research and Development (R&D) organizations have now set out two areas of renewed interest for strategic focus and investment. One is biobanking, as the demands around clinical trial sample storage soar. The other is the need to strengthen data assets, as the ability to apply these confidently and swiftly across all kinds of regulatory processes becomes crucial to speed to market.

The data imperative – which surrounds the integrity, quality and traceability of regulated product data – concerns all pharma organizations. The driver for change could be a regulatory information management (RIM) consolidation project following a merger; it might be an initiative to standardize on IDMP fields and vocabularies; or an attempt to bring new traceability to medical device or cosmetics manufacture. But too often, companies set the wheels in motion and start to implement a new business solution before they consider the work that might be necessary to vet and prepare data so that it can be migrated reliably into the target IT system and/or new data model.

Choosing the new system first could be putting the cart before the horse

In some cases, project owners assume that any matters relating to data assessment, preparation and migration will be taken care of by the new system vendor, and addressed as part of their proposal. It’s only when the analysis phase of the project begins, and the realization dawns that the incoming data is messy, conflicting or incomplete, that they begin to understand they have underestimated, and skimped on, this critical cornerstone of a successful delivery.

Instead of expecting software vendors (which typically lack the required depth of data experience) to provide for data assessment, cleaning and enhancement as vital preparation work ahead of any data migration, the only real way to do a proper job of it is to itemize it separately – in other words, break out this work with a separate request for information (RFI) or request for proposal (RFP).

Avoiding the tough decision of compromise vs bill shock

By separating out data-specific activity, companies will also save themselves from bill shock when vendors are forced to bring in specialist partners to rescue a project – at short notice, and with their own mark-up on the extra costs. If the parameters are known much earlier in the project cycle, the data preparation and migration work can be planned for more accurately and integrated more seamlessly into the overall deployment – with much less risk of the project overrunning or exceeding its budget.

It’s one thing to prioritize cost when sizing up vendors for a new system project, but if this introduces new risk because the required specialist skills and resources have not been allowed for, it is a false economy. Certainly, the work could end up costing a lot more and taking a lot longer if critical data preparations turn into last-minute firefighting. Far better to have the right skills lined up from the outset, with a clear remit that includes responsibility for servers, security and more during the data preparation and migration phase.

In 2022, a whole range of digital transformation drivers, including IDMP compliance preparations and improved traceability, will see new system implementation projects and associated data migration initiatives increase. To maximize success, it is definitely advisable to separate out your data requirement and prioritize it from the outset, so that any system project builds on solid foundations.

More on our Life Sciences Portfolio


A master plan: why master data management is key to successful migration projects

“Master data management is the method that an organization uses to define and manage its critical data in order to achieve a single source of truth across the enterprise.”

The importance of master data cannot be overstated. Master data represents the most critical data for operations within an organization or function. It is data that can be trusted, is unlikely to change, has been verified as correct and error free, meets compliance requirements, is complete and consistent, is common to all stakeholders, and is crucial to the business’ operations.

The term “master data” is often applied to databases of business-critical information, such as customer information files, product information files and so on. This is true: these types of databases are usually the authoritative source of the master data values. That master data is typically used by other applications, such as content management systems, which must be kept in sync with changes in the authoritative source.

The use of master data in a content migration project can be hindered by inadequate resources, a lack of standards, failure to implement an internal governance process, poor planning, a change of direction partway through a project, and the absence of a business owner or champion who understands the complexities of master data.

In one example, errors occurred early in the migration phase because the master data requirements and master data interdependencies were unclear. That meant the team responsible for the migration had to unexpectedly address a lot of master data issues. Adding to the problems faced was the fact that the organization was in the process of developing and changing naming convention standards for drug products, so the data in the system and the data used for mapping were different. Finally, further problems were encountered as the company decided on a last-minute major change to its approach to handling application master data. This occurred because the business users failed to understand how the original approach would affect their user community. As a result, the project was delayed, and huge effort had to be expended to tackle the issues.

These problems can be averted with the right approach and the right product.

Five lessons learned on how to mitigate migration issues

  1. Plan early. A successful content migration project needs to evaluate master data requirements at the planning stage to ensure a complete set of the right data is made ready for migration. If master data is missing, incorrect or not available at the right time, it can lead to delays and increased costs. It may even result in projects being cancelled or suspended.
  2. Develop a cross-functional team. Often, anything to do with data management is seen as an IT issue. However, it’s important that a cross-functional migration team, comprising IT and business stakeholders, works together to determine the master data required for migration. It is essential that one or more business representatives take ownership of the master data aspect of the project.
  3. Bring in the resources needed. Many organizations don’t understand how master data affects the application or migration process. That’s because many simply don’t have the internal resources or expertise to address master data management as an integral part of a migration project.
  4. Spend time on business analysis. Another issue is that few organizations have a business analyst function sitting between the business and IT. Spending the time upfront on analysis of the master data needed for the migration and comparing it to master data in the existing system can prevent project delays and disruptions.
  5. Always consider master data context in determining which master data to use in a content migration. If the documents are being used for a specific part of the business, it makes sense to incorporate only the master data relevant to the project and end users. For example, if manufacturing documents are being migrated, the master data should be relevant to manufacturing users. In this case, internal drug product names are probably more appropriate than drug product trade names (see the sketch after this list).
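To make point 5 concrete, here is a small, hypothetical sketch of selecting only the master data relevant to the migration’s business context. The field names, the domains and the preference for internal product names in manufacturing are illustrative assumptions, not a fixed rule.

```python
# Hypothetical master data entries for drug products.
master_data = [
    {"internal_name": "FME-0417", "trade_name": "Examplicin", "domains": {"manufacturing", "quality"}},
    {"internal_name": "FME-0562", "trade_name": "Samplodar", "domains": {"regulatory"}},
]

def master_values_for(domain: str, prefer_internal: bool = True) -> list:
    """Return the master data values relevant to one business domain."""
    field = "internal_name" if prefer_internal else "trade_name"
    return [entry[field] for entry in master_data if domain in entry["domains"]]

# Migrating manufacturing documents: use internal product names as master data.
print(master_values_for("manufacturing"))                      # ['FME-0417']
# A regulatory migration might prefer trade names instead.
print(master_values_for("regulatory", prefer_internal=False))  # ['Samplodar']
```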

Experience has shown that making master data management a key element of any migration project vastly improves its chance of success. Master data, developed and supported through a collaborative process, should be the bedrock of any migration project.

More on our Content Migration Services

Quality in migration projects – “a life necessity” in the Life Sciences industry: Interview with Markus Schneider

Jens Dahl:
Markus, you have been working in the area of ECM in the Life Science and Pharmaceutical industry for more than 20 years. You know many platforms, applications, client environments and of course, the processes in that industry by heart. I think one could call you one of the top specialists in this area. What makes this industry so special?

Markus Schneider:
Mainly the regulatory requirements; only the aviation and nuclear industries are subject to similarly strict quality and security regulations. This is obviously in everyone’s best interest. To understand why these frameworks exist, you have to keep in mind what could happen if there were errors in the approval or manufacturing processes.

Jens Dahl:
So, it can be said that the industry is so sensitive because it is about people’s lives and health. That is why processes and quality assurance during the introduction of IT systems are strongly regulated and supervised, and, if irregularities occur, severely sanctioned. Consequently, these possible sanctions carry high risks for the enterprise itself. The strong regulations, and the risks in case of violation, naturally apply to any kind of content migration.
In your experience, what is the biggest challenge in these kinds of content migration projects?

Markus Schneider:
The biggest challenges are ensuring data quality and the validation process itself.

For example, if submission-relevant data is to be migrated (data that must be reported to the regulatory authorities), the requirement is often that the data must be migrated 100% error-free. In these migration projects, the effort and complexity depend crucially on the data quality in the legacy system. The actual data quality, however, often cannot be estimated correctly at the start of the project. In general, the client assumes a very high level of data quality, but this is often not the case. The reasons for data errors in the legacy system can be very diverse, but that is a separate topic in itself. The fact is, however, that these errors must be identified and corrected during migration. Data migration is therefore not just about copying data from A to B. The challenge lies in cleansing the existing data and, for example, mapping it to controlled value lists in the target system. Since, as a rule, the legacy systems are also validated, the necessary corrections often cannot be made there; in other words, the correction must be carried out during the migration. We have therefore developed procedures and approaches in our team to be able to master these challenges.
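As a simplified illustration of what correcting data “during the migration” can look like, the sketch below maps legacy dosage-form values onto a controlled value list of the target system and reports anything it cannot resolve instead of silently dropping it. The value lists and rules are invented for this example; in migration-center this kind of logic is expressed through configurable transformation rules rather than hand-written code.

```python
# Controlled value list assumed to exist in the target system.
CONTROLLED_DOSAGE_FORMS = {"Tablet", "Capsule", "Oral solution"}

# Rule table built during data analysis: legacy spelling -> controlled value.
MAPPING_RULES = {
    "tab": "Tablet",
    "tabl.": "Tablet",
    "caps": "Capsule",
    "oral sol.": "Oral solution",
}

def map_to_controlled(value: str):
    """Return (mapped_value, issue). Unresolvable values become findings, not gaps."""
    candidate = MAPPING_RULES.get(value.strip().lower(), value.strip().title())
    if candidate in CONTROLLED_DOSAGE_FORMS:
        return candidate, None
    return None, f"unresolved value: {value!r}"

legacy_values = ["tab", "Tabl.", "Capsule", "oral sol.", "lyophilisate"]
for raw in legacy_values:
    mapped, issue = map_to_controlled(raw)
    print(raw, "->", mapped if mapped else issue)
```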

The second aspect is the validation process. Our customers are of course very skilled and experienced in implementing validation processes in the IT environment; however, their experience is most often limited to changes to existing systems or processes and the introduction of new applications. Data migrations cannot be validated in the same way as the introduction of a new software application. There are elementary differences that must be considered as early as the validation plan. It is our task to advise our clients in this area and to propose solutions that offer a reasonable cost/benefit ratio.

Jens Dahl:
Would you say certain aspects of content migration projects are regularly misjudged or underestimated?

Markus Schneider:
As the developer of the migration-center, we at fme are repeatedly asked by clients to provide our assessment of their failed migration projects, and I have already prepared several analyses of failed migrations. Two aspects are particularly noticeable here:

Incorrect selection of the migration tool and/or an incorrect migration approach
Once it is clear that data in the legacy system needs to be adapted to fit controlled value lists in the target system, you need a rule-based tool with which these values can be transformed. Static mapping tables or customized scripts won’t do the job, because users will continue working in the legacy system right up until the day of the productive migration. It is therefore likely that new data sets will appear which cannot be processed correctly by your planned migration approach, so you will need to adapt the implementation and follow up with additional tests.

In the meantime, users keep creating more and more data, and you end up in an endless re-engineering cycle without getting any closer to the initial goal.
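A small, hypothetical contrast of the two approaches: a static lookup table only knows the values that existed when it was written, whereas a rule can still resolve entries that users create later in the legacy system. The document-type naming pattern below is invented purely for illustration.

```python
import re

# Static mapping table, frozen at analysis time.
STATIC_TABLE = {"SOP-QA-001": "SOP", "WI-PROD-007": "Work Instruction"}

# Rule-based alternative: derive the target type from the naming pattern.
RULES = [
    (re.compile(r"^SOP-"), "SOP"),
    (re.compile(r"^WI-"), "Work Instruction"),
]

def classify(doc_id: str):
    """Return the target document type, or None so the gap surfaces as a finding."""
    for pattern, target_type in RULES:
        if pattern.match(doc_id):
            return target_type
    return None

# A document created after the table was built:
new_doc = "SOP-QA-042"
print(STATIC_TABLE.get(new_doc))  # None – the static table cannot place it
print(classify(new_doc))          # 'SOP' – the rule still resolves it
```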

Incorrect estimation of data quality in the legacy system
As I have already described, this is a particularly important point. If you assume from the beginning of the project that the data in the legacy system corresponds 100% to the object model, you often end up in a situation where the calculated project budget is insufficient and the project milestones cannot be reached as planned. This is then often compensated for by compromises in data quality, which regularly leads to failure during the qualification phase because the acceptance criteria cannot be fulfilled, resulting in not getting approval to proceed with the migration to the productive system.

Jens Dahl:
So what’s your advice, or rather, what conclusions do clients draw from these projects?

Markus Schneider:
In many cases, the client underestimates its own efforts and contributions in a migration project. There are dozens, if not hundreds, of detailed decisions to be made within the scope of a data migration. We at fme can make sure the migration runs smoothly from a technical point of view, and of course we can bring in our migration services experience and expertise to provide the information needed for a sound decision base. What we cannot do is take business- and content-related decisions for them. This remains the responsibility of the client. For example, when matching data values from legacy systems to the controlled value lists of the EMA (European Medicines Agency), the customer must decide which values are to be assigned and how. We will of course ensure that these decisions are technically implemented correctly and error-free during the migration. My experience is that the more realistically a customer estimates its own efforts and creates the necessary organizational basis, the better and faster the project will run.

Jens Dahl:
Well, I would say having high-quality processes, methods and tools is something that is desirable independently of the industry. Could you imagine applying these methods in other industries as well, while reducing the high costs caused by the additional documentation and regulatory requirements with something more pragmatic?

Markus Schneider:
There are two things that need to be clearly separated here. The quality of a migration in validated environments is not guaranteed by a high documentation effort. Exactly the opposite is the case: only if the quality is very high can the documentation effort in a validated environment be kept within a reasonable frame. Everybody who has already undertaken projects in validated environments knows what it means to document unexpected deviations and explain them in a deviation analysis. That’s why the migration approach we developed aims especially for high quality and an optimal migration rate. This also means that the costs for a migration project do not drop drastically simply by reducing the documentation effort.

The procedures, the implementation standards and the continuous, automated validation of data quality enable us to plan and execute migrations accurately and in a precisely controlled manner.

In the pharmaceutical industry the additional mandatory documentation is an added expense, but as this is not obligatory to the same extent in other industries these costs could be considerably reduced.

The assumption that migrations in regulated environments are always more expensive is not necessarily true. The question is always what causes the high costs in an individual migration project, and whether the underlying costs are actually comparable.

Here is a simple example:

Supplier A calculates a project budget of €100,000 for the migration of 2 million documents and achieves a migration rate of 98%.
Supplier B calculates €130,000 for the same migration but achieves a migration rate of 100%.

From the project point of view the first offer might seem more attractive, but from the business perspective, the second offer is definitely the better one. Choosing supplier A could turn out to be very expensive for the company, because its employees have to migrate the missing 2% manually. This will take up a significant amount of their time, during which they cannot concentrate on their daily business. This issue is quite often underestimated or neglected by clients.
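To make the comparison tangible, here is the arithmetic behind the example. The manual effort per document and the internal cost rate are assumed figures for illustration only; the interview does not state them.

```python
documents = 2_000_000
offer_a = 100_000            # € quoted by supplier A (98 % migration rate)
offer_b = 130_000            # € quoted by supplier B (100 % migration rate)

# Documents left over for manual migration with supplier A: 2 % of the volume.
leftover_docs = documents * 2 // 100                # 40,000 documents

# Assumed internal effort and cost rate – illustrative figures only.
minutes_per_doc = 5
internal_rate_per_hour = 60                         # €

manual_hours = leftover_docs * minutes_per_doc / 60
manual_cost = leftover_docs * minutes_per_doc * internal_rate_per_hour / 60

print(leftover_docs)                                # 40000 documents to handle by hand
print(round(manual_hours))                          # ~3333 hours of business users' time
print(offer_a + manual_cost)                        # 300000.0 € effective cost vs. 130,000 € for B
```

Under these assumptions the “cheaper” offer ends up costing more than twice as much once the hidden internal effort is counted.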

More about our migration projects and services
More about Life Sciences at fme US

This interview was conducted by Jens Dahl, leader of the competence center Migration Services at fme AG. It was originally recorded in German and translated into English.

Getting an Edge with Your Box Migration So It’s Efficient, Fast and Focused

It’s all about how to transition data fast and securely with minimal downtime or disruption to the organization and employees. But that’s much easier said than done in most environments – especially when you start thinking about the scale of content to be migrated and what’s technically required to make content “discoverable” through search so users can locate their documents quickly.

If you’ve been following our blog series, you know about our migration-center and how it was designed to overcome many of these challenges. We’re now taking our expertise in migrations to help companies restructure and migrate their enterprise content to the Box Cloud through our upgraded version of the Box connector. Our mantra? Make it easy. Make it painless. The migration-center can connect to and scan documents from more than 140 migration paths to migrate large volumes of content from any system to Box. You’re thinking “wow”, but what exactly does that mean? It means the migration-center, with its powerful transformation rules, takes care of where and how documents end up in the Box Cloud when they move over.

The Ins and Outs of Migrating to Box

Data migration is arguably the most overlooked part of a change initiative. That’s because it’s labor intensive, repetitive and requires meticulous attention to every minute detail. And it assumes that the data you have to migrate is clean.

Here is what you need to plan for in order to migrate to Box Cloud and take full advantage of its functionality. Consider the following (a minimal planning sketch follows this list):

  • Metadata: Metadata helps categorize content and makes it easier to retrieve. You’ll have custom data associated with your files that needs to be imported, so you’ll want to be sure your metadata is accurate and up to date. If you don’t have metadata defined now, the migration is a good opportunity to define it during import.
  • Tags: Tags give users the ability to mark, sort and easily search for related files. Very often, the originating content does not have enough custom metadata specific to how an individual user would like to search. You may want to tag the content with keywords based on how the user community would like to search.
  • Comments: Comments are often discounted when you migrate from one system to another, which means you lose the trail of changes when you omit them. Plan to include the comments in the migration to Box.
  • Collaborator roles: In the Box Cloud, you have full control over permissions. You can set permissions for users and groups on files and folders, similar to an access control list. You can either bring in the same permission set that the source system has, or apply different permission sets defined in Box.
  • Versions: The source system might have multiple versions of documents in different lifecycle states. Decide which versions, or which states of the documents, you want to migrate. Sometimes you may not want to bring over all the drafts, just the important content, to the Box Cloud.
  • Tasks: There might be documents in pending workflows, with some users holding tasks in their inbox. Consider creating tasks for those documents while importing them, so that users find the tasks in the inbox of the target system once migrated.
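The sketch below pulls these points together into a hypothetical per-document migration plan: which metadata, tags, comments, versions and permissions travel to Box. All field names and the simple role carry-over are assumptions made for illustration; in practice migration-center expresses such decisions as transformation rules rather than code.

```python
from dataclasses import dataclass

@dataclass
class SourceDocument:
    """Simplified view of a document pulled from the legacy system."""
    name: str
    metadata: dict
    comments: list
    versions: list   # e.g. ["0.1", "0.2", "1.0"]
    acl: dict        # user/group -> role in the source system

@dataclass
class BoxImportItem:
    """What we intend to create in Box for one source document."""
    name: str
    metadata: dict
    tags: list
    comments: list
    versions: list
    collaborators: dict

def plan_item(doc: SourceDocument, keep_drafts: bool = False) -> BoxImportItem:
    """Decide what travels to Box for one document."""
    # Versions: keep only approved versions (x.0) unless drafts are wanted.
    versions = doc.versions if keep_drafts else [v for v in doc.versions if v.endswith(".0")]
    # Tags: derive a search keyword from existing metadata.
    tags = [doc.metadata.get("department", "unclassified")]
    # Collaborator roles: here a simple carry-over of the source ACL.
    collaborators = dict(doc.acl)
    return BoxImportItem(doc.name, doc.metadata, tags, doc.comments, versions, collaborators)

doc = SourceDocument(
    name="Batch Record 2021-17.pdf",
    metadata={"department": "manufacturing", "product": "FME-0417"},
    comments=["Reviewed by QA"],
    versions=["0.1", "0.2", "1.0"],
    acl={"qa-group": "viewer", "prod-group": "editor"},
)
print(plan_item(doc))
```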

Many organizations moving to Box Cloud plan for all of the benefits of an enterprise-class content management platform that offers a single place to manage and share content. What’s not to like? It’s easy to deploy, it has a graphical user interface (which always gets people hooked), and it sure beats most platforms in terms of collaboration and document versioning, which is essential for compliance and internal tracking. But excitement can quickly turn to despair if organizations don’t plan in advance to take full advantage of what Box offers and have a way to get their data into the system fast.

more on migrating to box

BoxWorks – From the perspective of a Life Sciences Consultant

Let me start with a quick summary of the highlights:

Box Feed (available as Beta)
Box Feed gives you a stream of the activities happening in the shared folders the user can access. This is a nice collaboration function, as you see your coworkers’ content directly in the stream. This has been missing for a while, and Box is now delivering a first iteration with the possibility for you to comment on the documents.

Activity Streams and Recommended apps (2019)
With this feature, Box now tracks the way content flows within other cloud systems, so that co-workers can follow all the steps that have happened to a document along the way. For example, if you add a Box document to a certain account in Salesforce and afterwards sign it with DocuSign, Box users will see these actions as activities, with the possibility of directly accessing the content or the associated activity within these applications. Integrations with custom applications are possible here as well.

Box Automation (2019)
Box Automation is a workflow/rules engine that will allow business users to automate content processes directly within the Box platform. There are a lot of possibilities with this new functionality, from simple workflow approvals to slightly more complex processes. Certain events can trigger the flow of your content, and you can select from a number of possible actions. There are currently some mockups available but no beta version yet.

Box for GSuite (available as Beta)
Box is one of the first cloud providers to integrate with the three big content creation/editing platforms. It has deep integration with Microsoft Office 365 and Apple iWork and, with this latest addition, with G Suite as well, allowing users to pick their favorite tool for content editing.

Box Skills / Box Custom Skills
Box Skills allows you to analyze and process content files. It consists of a UI part to display metadata, and it also allows service providers to create custom skills. Custom Skills can extract information from audio, image and video files. An SDK will be made available by the end of the year which will allow you to extract and pull in additional information as skill data that can be used for metadata and content retrieval.

Box Shield (later in 2018)
Box improves its security platform even further with the new Box Shield. For example, it allows you to prevent downloads of content with a certain classification, so that only a defined set of users can access the document. It can also detect downloads of a large number of folders and documents at once and generate an alarm, or detect a login attempt from another country.

Other Product enhancements
Box has integrated the ability to activate 2FA for external users. This is very important for larger companies to ensure enhanced security.

These are all great services and enhancements and create a global collaboration and work platform, inter-connecting activities and systems of daily importance to the users.

But how does that fit into core Life Sciences processes?

At the conference there was a separate track with several Life Sciences sessions, which offered a good glimpse of the current situation.

One of the big items was the GxP Validation package for Box that was announced at the beginning of the year (Box GxP Validation). This service package turns any Box instance into a validated platform which can host validated content management applications. Whereas non-regulated processes were supported before, this now additionally allows regulated processes to be covered.

More specifically, this package covers the following aspects:

  • Audit – Quality Management System: Box QMS Documentation and SOPs built on GAMP5 and ISO9001 standards
  • Validate – Validation Accelerator Pack (VAP): Validation lifecycle documentation and tools to make the Box instance GxP-compliant
  • Maintain – Always-on Testing: Daily automated testing reports on nearly 150 tests of Box functionality at the API layer; creates daily reports and artifacts for customers and audits

This definitely provides a good starting point for deeper use of the platform in the Life Sciences industry, but it will only be of value in combination with robust regulated document management functionality. In this regard, the message from Box is clear: Box provides a hosted core platform; industry use cases, and how to add business value through extended functionality and applications, are in the hands of the customer and of Box technology partners.

In comparison with OpenText Documentum’s content management capabilities there are still some gaps, such as the management of document renditions (specifically PDF formats) or the management of virtual documents and relationships. But with the API and web services, the spectrum of possible implementations (and “workarounds” where needed) is large and highly extensible.

Additional benefits can be achieved with smaller effort by simply adding Box on to the core platforms: as a publishing platform for Effective documents; as a collaboration platform for In-Progress documents, internally but also with external partners (taking advantage of integrations with core business applications); or by using the Skills framework to classify documents before processing them within your content management platform (thereby relieving your business users of some of these classification tasks). Several of the vendors at the conference who are involved with the Life Sciences industry discussed and demoed these kinds of integrations to increase the value of their existing applications.

Overall, Box offers a great set of functionality and is creating a widely integrated work environment for users – a foundation for the “Future of Work”. It does not replace a feature-rich regulated Content Management platform such as OpenText Documentum at this point, but does offer a great platform for integrating business applications.

We are excited to be able to support our clients who are leveraging this platform. Our first focus area is content migrations, and we are happy to announce that an extension of our Box Importer for migration-center will be available in Q4, with newly added key functionality such as custom metadata, versioning and security support.