Garbage In, Garbage Out: How Poor Data Quality Clogs Machine Learning Training Pipelines

What is common to the success stories of businesses as diverse as Amazon, Airbnb, and Kakao Bank? The answer is data, and leadership that was relentless in its pursuit of good data quality. In the digital age, good quality data is a key differentiator – an invaluable asset that gives organizations an edge over competitors who have not been as dogged about data quality. Those competitors are burdened with substandard, scattered, duplicate, and inconsistent data that weighs them down more heavily than iron shackles. In the world businesses are operating in today, the divide is not between big and small organizations but between organizations that have invested in improving their data quality and those that have not. 

A single rotten apple spoils the barrel!    

We have all heard the story of how a single rotten apple spoils a barrel. It is much the same story when it comes to data. Unclean, unrefined, and flawed data does more harm than good. Gartner estimates that poor quality data costs an organization $15 million per year. Though survey after survey talks about monetary losses, unclean or unrefined data impacts more than the bottom line – it prevents businesses from deriving actionable insights from their data, leads to poor-quality decisions, and drives dissatisfaction among all the people who matter – partners, vendors, and regulatory authorities. We have also heard of several instances where poor data quality quickly snowballed into a major issue, such as a money-laundering scam that led to loss of reputation as well.

Today, when a majority of organizations are leveraging the power of AI and machine learning tools and investing millions to stay ahead of the curve, bad data can be the reason the expected ROI is never met. Organizations pour money into AI and ML tools, yet the return is constrained by bad quality data.

Bad data hurts the American economy  

The impact of bad data on the American economy is not a trickle-down leak; rather, it is a gigantic one that is hard to plug. Collectively, the cost of bad decisions made from flawed data runs into the billions. Ollie East, Director, Advanced Analytics and Data Engineering at the public accounting and consulting firm Baker Tilly, says that bad data costs American businesses about $3 trillion annually and breeds bad decisions born of data that is simply incorrect, unclean, and ungoverned. 

Banks and FIs are no exception to the rule. In fact, because of privacy and regulatory requirements, they stand to lose more from bad data. Of the zebibytes of data (including dark data) in existence organization-wide today, organizations are capitalizing on only a minuscule percentage. With a bit of effort and strategic planning, banks and FIs can ensure that they do not lose business, revenue, and clientele on account of poor data quality. Further, the phenomenal success of new-age technologies like AI and machine learning has changed the rules of the game, enabling banks and FIs to fish value even from dark data – provided they take a planned approach to data standardization, consistency, and verification, and ensure that data is streamlined for repeated use. Organizations must also account for the new data that enters their workflows and pipelines and put a suitable mechanism in place to keep it clean and standardized.

To reiterate, why lose the competitive advantage? Here’s a look at how organizations – banks and FIs – can invoke the power of cleaner, structured data to make their processes crisper, leaner, and undeniably more efficient.

Step 1: Pre-processing data – making data good for downstream processes        

Pre-processing of data is the first step in the journey towards cleaner and refined data. Considering that not many organizations today can claim that their data quality meets expectations – the Harvard Business Review states that “only 3% of companies’ data meets basic quality standards” – pre-processing of data is critical for the following reasons: 

  • It identifies what is wrong with the organization’s data. What are the core issues? 
  • As the data is likely to be used again and again in workflows, processes, and systems enterprise-wide, good quality data with the right encryption minimizes conflicts and other such discrepancies. 
  • As most organizations are likely to be using some kind of AI and ML for the processes involving this data, it is better to get it in shape to reap the maximum benefits. 

Garbage In Garbage Out – The true potential of AI and ML can be leveraged only when data quality is good 

Today, data scientists and analysts spend more time pre-processing data for quality (fine-tuning it) than analyzing it for business and strategic insights. This iterative pre-processing, though extremely time-consuming, is important because if organizations feed bad or poor-quality unrefined data into an AI model, it will spew out garbage. Garbage in results in garbage out. To leverage the true potential of AI and ML, the data fed into the machine-learning pipeline downstream must be of high quality. 

There are, of course, other substantial benefits as well. One, when the data is cleaned at the point of capture or during entry, banks and FIs have a cleaner database for future use. For example, by preventing the entry of duplicates at the point of capture (via manual or automated means), organizations are spared menial and repetitive rework. It is also relatively easy to build the training model once the data is refined and streamlined. And when banks and FIs have a more dependable AI pipeline (thanks to cleaner data), they can gain valuable insights that give them a strategic advantage.
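A point-of-capture duplicate check can be as simple as normalizing each incoming record and comparing it against what is already stored. The sketch below is illustrative only – the record fields (`name`, `email`) and the normalization rules are assumptions, not a prescribed schema:

```python
import re

def normalize(record):
    """Normalize a record so trivial variations (case, extra spaces)
    don't slip past the duplicate check."""
    return (
        re.sub(r"\s+", " ", record["name"]).strip().lower(),
        record["email"].strip().lower(),
    )

def is_duplicate(new_record, existing_records):
    """Reject a record at the point of capture if a normalized match exists."""
    return normalize(new_record) in {normalize(r) for r in existing_records}

existing = [{"name": "Jane  Doe", "email": "JANE@EXAMPLE.COM"}]
print(is_duplicate({"name": "jane doe", "email": "jane@example.com"}, existing))  # True
```

In practice, production systems would match on many more attributes and use fuzzy matching, but the principle is the same: block the duplicate before it enters the database.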

Carrying out data quality checks 

To ensure that their data is up-to-date and foolproof, organizations can apply several levels of checks or quality tests. The simplest is quick-fact checking of data against a universally known truth – for example, the age field in a dataset cannot have a negative value, nor can the name field be null. However, a quick-fact check is only a basic check: it tests the data but not the metadata, which is the source of extremely valuable information such as the origin of the data, its creator, and so on. For a comprehensive test of data quality, therefore, a holistic or historical analysis of datasets must be carried out, where organizations test individual data items for authenticity or compare them with historical records for validation.
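A quick-fact check of the kind described above can be sketched in a few lines. The field names and bounds here are illustrative assumptions (e.g. an upper age bound of 130), not a standard:

```python
def quick_fact_check(record):
    """Check a record against universally known truths.
    Returns the list of violations; an empty list means the record passes."""
    errors = []
    if not record.get("name"):                      # name must not be null/empty
        errors.append("name must not be null")
    age = record.get("age")
    if age is None or not 0 <= age <= 130:          # age cannot be negative
        errors.append("age must be between 0 and 130")
    return errors

print(quick_fact_check({"name": "Asha", "age": 34}))  # passes: []
print(quick_fact_check({"name": "", "age": -5}))      # violates both rules
```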

Manual testing: Here, staff manually verify values for data types, character lengths, formats, etc. Manual verification of data is not desirable, as it is exceedingly time-consuming and highly error-prone. Alternatives include open-source projects and, in some cases, coded solutions built in-house, but neither is as popular as automated data quality testing tools.

Automated data quality testing tools: Using advanced algorithms, these tools invariably make it easier for organizations to test data quality in a fraction of the time that manual effort takes (using data matching techniques). However, as reiterated earlier, machines are as good as the training they receive. If unclean, flawed data is poured into the training pipeline, it clogs the machine and prevents it from giving the desired results.   

The machines have to be taught like humans to understand and manipulate data so that exceptions can be raised and only clean filtered data remain in the dataset. Organizations can gain intelligence from their data either through rules-based engines or machine learning systems. 

1. Rules-based systems: Rules-based systems work on a set of strict rules that specify what follows “if” a certain criterion is or is not met. Rules-based data quality testing tools allow organizations to validate datasets against custom-defined data quality requirements. Rules-based systems require less effort and are also less risky – false positives are not a concern. It is often asked whether rules-based tools and processes are slowly becoming antiquated as banks and FIs deal with an explosion of data. Probably not. They are still a long way from going out of fashion. It still makes sense to use the rules-based approach where the risk of false positives is too high, and hence only rules that ensure 100 percent accuracy can be implemented. 
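A minimal rules-based validator pairs each custom-defined requirement with a predicate over a record. The rules below are hypothetical examples for illustration:

```python
# Each rule pairs a human-readable description with an "if" predicate.
RULES = [
    ("age must be non-negative",
     lambda r: r["age"] >= 0),
    ("currency must be a known ISO code",
     lambda r: r["currency"] in {"USD", "EUR", "GBP", "INR"}),
    ("account id must be 10 digits",
     lambda r: r["account_id"].isdigit() and len(r["account_id"]) == 10),
]

def validate(record, rules=RULES):
    """Return the descriptions of every rule the record fails."""
    return [desc for desc, passes in rules if not passes(record)]

good = {"age": 30, "currency": "USD", "account_id": "1234567890"}
bad  = {"age": -1, "currency": "ZZZ", "account_id": "12AB"}
print(validate(good))  # []
print(validate(bad))   # all three rules fail
```

Because every rule is explicit, a record flagged by this engine is flagged for a reason a human can read back, which is exactly why rules work well where false positives cannot be tolerated.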

2. Machine learning systems: A machine learning system simulates human intelligence. It learns from the data it is given (the training model). Like a child learning from a parent, it picks up the good, the bad, and the ugly. Hence, businesses must be extremely careful at the outset: they cannot expect optimum results if they are not careful about the quality of the data used for training. When it comes to learning capacity and potential, however, ML-based systems are effectively unlimited.

Though there are several ways for a machine to learn, supervised learning is the first step. Every time new data is incorporated into the datasets, the machine learns. This element of continuous learning means that, in time, it requires minimal human interference – which is good, as banks and FIs would prefer to engage their manpower in far more critical tasks. As the machine interprets and categorizes data using its historical antecedents, it becomes much smarter and considerably more capable than humans at this task. 
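The continuous-learning loop described above can be illustrated with a deliberately tiny model: it learns a plausible range for a numeric field from human-verified records, flags values outside that range as exceptions, and keeps learning as confirmed data arrives. This is a toy sketch of the idea, not a production ML system; the transaction amounts and the 10% tolerance margin are assumptions:

```python
class ContinuousQualityLearner:
    """Learn plausible bounds for a numeric field from confirmed-good
    records; flag out-of-range values as exceptions for a human supervisor."""

    def __init__(self, margin=0.1):
        self.low = None
        self.high = None
        self.margin = margin          # tolerance around the observed range

    def learn(self, value):
        """Fold one verified value into the learned range."""
        self.low = value if self.low is None else min(self.low, value)
        self.high = value if self.high is None else max(self.high, value)

    def is_exception(self, value):
        """True when the value falls outside the learned (padded) range."""
        if self.low is None:
            return True               # nothing learned yet: escalate to a human
        span = (self.high - self.low) or 1
        return not (self.low - self.margin * span
                    <= value <=
                    self.high + self.margin * span)

learner = ContinuousQualityLearner()
for amount in [120.0, 95.5, 240.0, 180.0]:   # historical, human-verified values
    learner.learn(amount)
print(learner.is_exception(5000.0))  # True  -> raise an exception for review
print(learner.is_exception(150.0))   # False -> clean, no intervention needed
```

Each time a supervisor confirms a flagged value as legitimate, it can be passed back through `learn()`, which is the "minimal human interference over time" property in miniature.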

In the realm of dark data 

Every day, banks and FIs generate, process, and store humongous amounts of data or information assets. Unfortunately, much of this data (nearly 80%) remains in the dark: banks and FIs rarely tap into it for business insights or monetization. However, machine learning systems can help organizations unearth value from dark data with minimal effort. Learning, in this case, begins with making observations on the data, finding patterns, and eventually using them to make good strategic decisions – all based on historical evidence or previous examples. The system alerts the supervisor about an exception, then processes that information and learns from it – that is what continuous learning delivers over the long term. 

Data quality is a burning issue for most organizations 

“By 2022, 60% of organizations will leverage machine-learning-enabled data quality technology to reduce manual tasks for data quality improvement.” Gartner  

At Magic FinServ, we believe that high-quality data is what drives top- and bottom-line growth. Data quality issues disrupt processes and escalate costs, as they call for investment in re-engineering, database processing, customized data scrubbing, and more to get the data in shape. 

Organizations certainly wouldn’t want that, as they are already running short of time. Knowing that manual testing of data quality is not an option – it is expensive and time-consuming – it is cost-effective and strategically sound to rely on a partner like Magic FinServ with years of expertise.

Ensuring quality data – the Magic FinServ way

Magic FinServ’s strategy to ensure high-quality data is centered around its key pillars or capabilities – people, in-depth knowledge of financial services and capital markets, robust partnerships (with the best-of-breed), and a unique knowledge center (in India) for development, implementation, upgrades, testing, and support. These capabilities address the key challenges enterprises face today – data quality and spiraling data management costs – with a cost-effective data governance strategy and a well-defined roadmap for enhancing data quality. 

Spinning magic with AI and ML: Magic FinServ has machine-learning-based tools to optimize operational costs by using AI to automate exception management and decision making. We can deliver savings of 30% – 70% in most cases. As a leading digital technology services company for the financial services industry, we bring a rare combination of capital markets domain knowledge and new-age technology skills, enabling leading banks and FinTechs to accelerate their growth.  

Cutting costs with cloud management services: We help organizations manage infrastructure costs, offering end-to-end services to migrate (from on-premise to the cloud), support, and optimize your cloud environment.

Calling the experts: We can bring in business analysts, product owners, technology architects, data scientists, and process consultants at short notice. Their insight into reference data – including asset classes, entities, benchmarks, corporate actions, and pricing – brings value to the organization. Our consultants are well-versed in technology: apart from traditional programming environments like Java and the Microsoft stack, they know data management technologies and databases like MongoDB, Redis Cache, MySQL, Oracle, Prometheus, RocksDB, Postgres, and MS SQL Server. 

Partnerships with the best: And last but not least, the strength of our partnerships with the best in the industry gives us an enviable edge. We have not only tied up with multiple reference data providers to optimize costs and ensure quality, but have also partnered with reputed organizations dealing with complex and intractable environments and multiple domains, covering hundreds of thousands of data sources, to help our clients create a robust data governance strategy and execution plan.

That is how we contain costs and ensure that data quality is top-notch. So why suffer losses due to poor data quality? 

Connect with us today by writing to us at mail@magicfinserv.com.

The hedge fund industry is witnessing an unprecedented boom (CNBC) – a record high that has the market in a tizzy. Barclay Hedge reveals that hedge funds made more than $552.1 billion in trading profits alone. At the same time, AUM swelled by 42% in the past 12 months, indicating a resurgence of investor trust despite the turbulent times the industry had weathered earlier.  

It is evident that the rebound in the economy and government stimulus packages contributed significantly to the strong backdrop and increased investor confidence that we are witnessing today. But we cannot afford to overlook the role technology has played in allaying investor fears and enabling the industry to reach a “record high.”  

The turbulence that markets witnessed last year due to the pandemic was reason enough for many hedge funds to change gears – from manual processes to intelligent automation. But even earlier, changes in the economy and new regulations had put pressure on IT teams to explore options for ensuring strategic growth while maintaining compliance. Hedge funds that remained committed to adopting technology were able to shake off the monotony and chaos of antiquated processes, even as others scrambled to come to terms with the post-pandemic world order and approached technology outsourcing vendors to meet their remote working needs. Cloud computing, coupled with AI, ML, and blockchain, has disrupted the world of capital markets immensely. These new technologies have not only streamlined services but ensured transparency and cost-effectiveness as well. And that’s what investors wanted – transparency, trust, standardization, and accountability.  

Now or never – the future is cloud 

The future is in the cloud. With its “virtually unlimited storage capacity, scalability and compute facility that is available on-demand,” it offers a huge advantage to hedge funds, institutional asset managers, and fund administrators who have been grappling with the data problem. John Kain, Head of Business and Market Development, Banking & Capital Markets, Amazon Web Services (AWS) Financial Services, says that within four years of joining AWS, he has seen a significant increase in the volume of data being placed in the cloud. He also mentions that the sophistication of cloud-based tools used by fund managers has amplified, indicating fund managers’ confidence in the cloud to tackle the sheer scale of data used in making everyday investment decisions. Today, the top drivers for financial institutions’ cloud usage are:  

  • Reducing costs by shifting from CapEx to OpEx: The burden of maintaining legacy architecture and overhead costs is resolved when you move to the cloud. 
  • Amplifying the speed of technology deployment: With the cloud, updates are almost instantaneous. 
  • Cutting legacy maintenance costs and simplifying IT management and support: Maintenance and support become the vendor’s responsibility. 
  • Inducing nimbleness and scalability: It is incredibly easy to add space, storage, and RAM without waiting for the lengthy paperwork that comes with infrastructure deployment. 
  • Ensuring business continuity with disaster recovery: FIs can ensure business as usual even during critical times, as the cloud comes equipped with disaster recovery. 

This blog discusses what has fundamentally been the biggest disrupter of the decade – the cloud – and its benefits. Apart from the private cloud, there are also public and hybrid cloud models, and financial institutions must plan before shifting from on-prem to cloud. Many choose a SaaS-based approach: SaaS platforms on the cloud are more fruitful than migrating legacy platforms to the cloud, as they deliver immediate business results. For a short or medium period, some satellite applications might go to the cloud before the organization goes full-on SaaS – all facilitated by DevOps practices, which enable faster changes and fewer errors. 

So it adds strategic value to have a partnership with a third-party vendor experienced in handling cloud transformation journeys, specifically for FIs.   

Choosing your Cloud – Public, Private, Hybrid  

Public cloud – open and affordable 

The public cloud infrastructures like Azure, AWS, and Google Cloud offer highly compelling incentives and advantages for hedge funds and asset management firms, including small firms like family offices, thus leveling the playing field immensely. Flexibility and ease of deployment are persuasive drivers when it comes to choosing the public cloud model. In addition to this, the costs of this model are readily acceptable to even small players.  

The most popular public cloud offerings for financial institutions include ancillary systems like cloud-resident office suites such as Microsoft Office 365, customer relationship management (CRM) systems like Salesforce, market research systems, and HR systems. 

Limitations of public cloud:  

Despite some of these obvious advantages, some big financial institutions remain unwilling to outsource their core banking structures and much of their mission-critical systems into the cloud, where there have been some highly publicized security and data breaches in the past. 

The concern arises from financial institutions’ fiduciary responsibilities to their customers. If any financial or sensitive data gets leaked or compromised, the institutions face significant liabilities resulting from identity theft, fraud, and other malicious acts. However, this doesn’t mean that large financial institutions aren’t invested in public cloud solutions. They are pursuing significant engagements with public clouds, but in areas that promote collaboration among employees and departments and help them reduce internal IT costs.  

Apart from security, scalability was another primary concern. File sharing tools and services do not scale well because of rising costs, and as a firm grows, it requires more than file sharing among a small group of people. In due time, a growing firm needs CRM, OMS, accounting tools, etc., which file sharing tools cannot accommodate.  

Private cloud – In-built disaster recovery and high performance  

The private cloud has been the go-to option for financial and investment firms requiring business-class IT infrastructure. With its inherent security, privacy, and performance, it provides a seamless experience. In addition, a private cloud allows the firm to exercise greater control over network traffic in terms of security, service quality, and availability. In most cases, the private cloud is operated professionally by a service provider focused solely on controlling, managing, and maintaining the network to satisfy business requirements and compliance directives. Thus, businesses benefit from seasoned industry professionals who live and breathe private financial IT. 

If security and high performance matter most, then the private cloud is best. You do not have to invest separately in disaster recovery with a private cloud, as it is already built into the cloud offering. 

Hybrid cloud – a mix of private and public 

Hedge funds, Asset managers, and other investment firms need not take an either/or approach to their IT infrastructures. Hybrid clouds – defined as a composition of two or more clouds that remain unique entities but are bound together – are the most popular choice today. Through a hybrid cloud solution that blends many of the public and private cloud’s most compelling characteristics and features, firms can utilize a unique, flexible, and scalable platform that serves a wide variety of the firm’s needs while keeping all regulatory compliance and security measures intact. In addition to the combined benefits, the beauty of the hybrid cloud is that it supports a slow transition, too, as risk is mitigated, compliance requisites are understood, and budgets get approved. As per the Hedge Fund Journal, “this is where working with a trusted cloud provider can add value to the process and ensure the benefits of hybrid cloud are realized.” 

Any discussion of the hybrid cloud would be incomplete without mentioning APIs and API orchestrators like Postman, which are gaining ground for facilitating app-to-app conversations, and DevOps orchestrators, given that the next-gen development environment on the cloud is underpinned by tools such as Terraform, Ansible, and Octopus. These automate the “Integrated Pod” structure – from rapid-fire spin up/down of infrastructure, to testing out code, to automated code deployment and integration – saving time and effort.     

Why cloud? 

More control and less chaos with cloud  

Cloud technology has unequivocally changed the way hedge funds run their operations today. It is not an exaggeration to say that the public cloud, led by the growth of Amazon Web Services and Microsoft Azure, has been a game-changer and has arguably allowed small hedge funds to compete on a level playing field. With its capability to support front-, middle-, and back-office functions – from business applications to data management solutions and accounting systems – the cloud is one of the most powerful assets for the 10,000-odd hedge funds spanning the globe today, as demand for seamless, scalable, and efficient IT solutions grows exponentially. With the cloud, organizations have more control over their processes, and data management and storage become less of a concern.     

Innovation begins with cloud   

The advantages of cloud-based solutions are many and go beyond cost efficiency and access to highly scalable storage and computing power. The most significant benefit the cloud offers hedge funds is that it quickly opens the doors to new opportunities. Take the example of Northern Trust, which uses its novel cloud-based platform to update client systems 20 or more times a month, even as competitors struggle with quarterly or annual update cycles.   

Elaborating on how the cloud-based platform mitigates risks, Melanie Pickett, Head of Front Office Solutions at Northern Trust, says, “That de-risks the releases for our clients – because we’re not releasing huge chunks of code and hoping nothing goes wrong – and it also makes us able to iterate very quickly because we’re not waiting until the next quarter or next year to add new features or make other changes.” 

Leveling the playing field with cloud-based SaaS   

In today’s digital world, the one with the next big idea leads the race. Examples like Kakao Bank of South Korea, which onboarded millions of customers in a week thanks to its extremely powerful platform, have proven that small can be powerful. As Ranjit Bawa, principal and U.S. technology cloud leader with Deloitte Consulting LLP, says, “Innovation can’t be mandated, but innovative teams can be empowered with tools that let them test the waters on their own.” And the cloud is one such medium.   

To quote Bawa, “the cloud democratizes the ability to test great ideas and bring them to life.” And by doing so, it levels the playing field for emerging managers, who are high on ideas and innovation. Though they are constrained by bandwidth and headcount, cloud platforms give them tremendous opportunity to test new ideas.    

Cloud as a part of Business Continuity Plan 

Last year, small and big hedge funds suffered due to an unforeseen crisis – the coronavirus pandemic. Nobody had expected to be in the midst of a situation where remote working would be the only option. Firms that had invested in the cloud infrastructure could wriggle out of the crisis relatively unscathed, but others were forced to rethink their strategy. The days of over-reliance on manual labor were over. As a part of the business continuity plan (BCP), firms were forced to either implement a cloud-based infrastructure or work with a technology vendor like Magic FinServ to meet their IT and software needs.    

The future, however, is multi-cloud 

Paradoxically, over-reliance on separate cloud environments has led to silos again. So we have data developers, IT teams, cloud architects, and security teams with diametrically opposite business and technology requirements working in silos. Matthew J. Morgan, VP Marketing, VMware Cloud, reports in Forbes that organizations are no longer relying on a single-vendor IT environment for the cloud. Instead, the typical practice (in all the organizations he has worked with) is to rely on a heterogeneous mix of cloud providers – which necessitates a rethink of the “cloud strategy to ensure cohesiveness,” as the proliferation of separate cloud environments has created new silos in the IT organization. 

Hence the need for a multi-cloud strategy. Morgan, in his Forbes article, states that “multi-cloud strategy and quick implementation was and continues to be a priority for the advancement of the business and the cohesiveness and security of the technology.” 

That multi-cloud is the future leaves no one in doubt. We are already witnessing organizations relying on a mix of public cloud providers (AWS and Azure) working together to resolve specific business needs. Businesses also no longer see high value in running private data centers, so more and more FIs are publicly talking about adopting a multi-cloud approach. This matters from a regulatory perspective as well, as firms will not benefit from relying solely on one cloud provider. 

Magic FinServ – your trusted cloud transformation partner 

As a trusted cloud partner, we service FIs including investment banks, hedge funds, fund administrators, and asset managers. While the journey might seem daunting at first, with a partner like Magic FinServ – an expert in the assessment, design, build, migration, and management of cloud for leading financial institutions – you can be assured of the desired results.  

We deploy new-age technologies like AI and Machine Learning to reduce the time-to-market, add security layers using the Infra-as-a-code approach, diminish system redundancy, and continuously narrow down the cloud deployment and monitoring costs. Our Opensource framework approach enables agility and cloud-agnostic development as well.  

We have worked with Tier 1 investment banks, top-tier hedge funds with up to $10B in AUM, fast-growing SaaS companies in fintech and insuretech, and blockchain enterprise platforms. We cater to organizations of various sizes across the globe, serviced out of our offices in New York and Delhi.  

If you are looking for a cloud service provider specialized in financial services, Magic FinServ is your answer. You can book a consultation by writing to us at mail@magicfinserv.com. You can also download our ebook for more information.  

The Banking and Financial Services Sector and the fintechs supporting them have not been unaffected by the winds of change sweeping across the business landscape. But unlike the past, technology has proved to be an equalizer. Size is no longer a necessary condition for success. Today we have Challenger Organizations, typically SMEs and large institutions alike, competing on a level playing field, and whosoever simplifies their processes or automates first gains an edge. Examples of how Intuit, Square, and similar Challenger Organizations redefined the meaning of Customer experience are proof enough. Automation, elimination of repetitive manual tasks, and consolidation of redundant activities into fewer steps have played a crucial role in enhancing Straight Through Processing (STP). 

Straight Through Processing has hitherto been addressed by optimizing underlying applications, eliminating data silos, and integrating applications better. While process automation through RPA and similar technologies has helped optimize downstream processes, manual effort remained significant because disparate data sources could not be consolidated and integrated. It is this final boundary that is now being breached through the application of emerging technologies. With holistic end-to-end straight-through processing (STP), banks and FIs have taken a quantum leap forward: what would once have taken days now takes minutes. STP removes the need for manual intervention. The only time human intervention is “ideally” required is during “exception handling” or “exceptions processing” – when the system sees something unusual and raises a red flag. A human annotator then makes the necessary changes.  

Taming the white whales with STP

STP implementation is ideal for processes that involve a lot of repetitive work. The costs of system integration might dissuade many smaller players from considering it. However, like Captain Ahab’s relentless pursuit of the white whale Moby Dick, digital transformation experts have relentlessly argued the case for STP (without similarly calamitous consequences) in some of the most tiresome and time-consuming processes, like KYC, loans processing, foreign exchange transaction handling, and accounts payable, amongst others. Organizations that must meet quality SLAs as part of a business agreement have much to lose if they do not innovate on the technology front. Humans are more liable to make errors – attaching a wrong document, classifying a document incorrectly (highly probable if there are a hundred-odd classifications to choose from), or simply feeding data incorrectly into the system. When that happens, the likelihood of missing SLAs (and of client dissatisfaction) is high.

Secondly, banks and FIs that manually administer processes like KYC have no time for value-added activity (like sales, customer retention, and customer experience) as they are busy meeting deadlines. As manual labor is both slow and expensive, a considerable backlog builds up. Take the KYC process: a McKinsey report states that banks generally employ about 10 percent of their workforce in financial-crime-related activities. That is a lot in terms of labor cost. The report also indicates that KYC reviews were often the costliest activities handled by the bank.

With STP, banks and the financial services sector can eliminate a lot of paperwork, many unnecessary checkpoints (which translate into unnecessary headcount), and manual data entry, while ensuring that SLAs are met and that invoices, KYC, onboarding, accounts payable and receivable, and other document-oriented processes are conducted quickly, cost-efficiently, and with a smaller margin of error. Organizations can thereby ensure higher levels of transparency and trust and a good customer experience.

Cost of quality   

Let’s be honest; even the best human keyers and classifiers make more errors than machines. While an error rate of 2 to 5% might not seem much, if we apply the 1-10-100 rule for the “cost of quality” to a million documents being classified and extracted, 5% makes a huge difference, translating into a lot of work that requires human intervention. The rule states that for every error that could have been prevented at a unit cost, rectifying it after the fact costs 10 times more, and leaving it unattended is costlier still – 100 times more expensive than doing it right the first time.
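The arithmetic of the 1-10-100 rule can be sketched in a few lines. This is only an illustration: the $1 base prevention cost per record is a hypothetical figure, not a benchmark from the text.

```python
# Illustrative sketch of the 1-10-100 "cost of quality" rule applied to
# one million documents with a 5% error rate. The $1 prevention cost
# per record is a hypothetical figure chosen for illustration.
DOCS = 1_000_000
ERROR_RATE = 0.05                         # 5% of records carry an error
PREVENTION_COST = 1                       # getting it right the first time
CORRECTION_COST = 10 * PREVENTION_COST    # fixing a caught error costs 10x
FAILURE_COST = 100 * PREVENTION_COST      # an unattended error costs 100x

errors = int(DOCS * ERROR_RATE)           # 50,000 flawed records
cost_if_corrected = errors * CORRECTION_COST
cost_if_ignored = errors * FAILURE_COST
print(errors, cost_if_corrected, cost_if_ignored)
# prints: 50000 500000 5000000
```

Even under these toy assumptions, a "small" 5% error rate turns into a seven-figure exposure when errors are left unattended.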

Machines, however, are more capable. A McKinsey report states, “Assuming that standardization and coding of rules are performed correctly, quality can be improved significantly (by a range of 15 to 40 percent, experience indicates). Manual errors are reduced, and the identification and documentation of risks are improved. Rework loops can be shortened as well, as ‘first time right’ ratios and regulatory targets are met more quickly.”

And now comes the really tricky part, since 100 percent accuracy is still unthinkable even for machines. When we are talking about documents and their relevant data fields, an accuracy of 99% at the character level does not ensure STP. If documents need validation from human supervisors due to high error rates, zero STP will ensue. Here we need something more robust than RPA. With machine learning (ML) and advanced capture, it is possible to increase accuracy and validate data using advanced rules. We need a system that constantly adjusts and optimizes: every time it encounters a variance or an anomaly, it adapts (taking help from the human-in-the-loop) and improves after each iteration.
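Why 99% character-level accuracy is not enough becomes clear when the errors compound over a whole field or document. Assuming (for illustration) that character errors are independent, the probability that a span of text survives with zero errors shrinks geometrically with its length:

```python
# Probability that an entire field/document is extracted with zero
# character errors, given 99% per-character accuracy and (an assumed)
# independence between character errors. Lengths are illustrative.
char_accuracy = 0.99

def clean_probability(n_chars: int) -> float:
    """Chance that all n_chars are read correctly."""
    return char_accuracy ** n_chars

for n in (20, 100, 500):
    print(n, round(clean_probability(n), 4))
```

At 20 characters, roughly four documents in five come through clean; at 500 characters, fewer than one in a hundred do, which is why character-level accuracy alone cannot guarantee straight-through processing.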

We would also have to take into account the variance in straight-through processing between structured and unstructured documents. While a standardized document such as a 10-K enables higher levels of STP, semi-structured documents such as invoices and unstructured documents such as notes and agreements yield lower levels of STP.

Simplifying banking processes

Today the imperative for banks and FIs to simplify their processes is huge: future growth depends on the ease with which they can conduct their business. There is no escaping that! Reiterating the need for simplification, Hessel Veerbek, partner, strategy, KPMG Australia, writes about “how some banks and insurers have replaced key elements of their core systems and consolidated their ancillary systems to rationalize their IT estate, modernize their capabilities, reduce costs and, at the same time, provide the capabilities to adapt and evolve their business models to secure future growth.”

Banks need to simplify operations with a core banking system that handles processes like loan processing and accounts payable end-to-end. At the heart of a simplified and automated banking architecture is end-to-end STP, which can be a complex undertaking; but as leading banks have shown, it is worth the trouble because it boosts efficiency. The challenges to STP adoption lie largely in the mindset of organizations, as “complex processes, high-risk customers and non-standard accounts” are still excluded from its purview. Organizations typically consider STP only for low-risk tasks, such as KYC for low-risk accounts. However, as the McKinsey report suggests, if applied with dexterity and foresight, STP can eventually be enabled for the high-risk segment of customers as well. Here, a simple rules-based approach will not suffice, and organizations will need to rely on data science to create a system that ensures reliable output.

Augmenting human labor  – when machine learning tools perform the task and optimize it as well

The question is not really about how STP reduces the need for manual intervention – it is about how it augments human skills. The time it takes a human to classify and extract information from documents is disproportionate to the gains. Considering that almost 100% of data entry tasks can be automated and results obtained in a fraction of the time, it makes sense to invest in tools that ensure end-to-end automation. With STP, banks and Fintechs can not only eliminate the need for manual keying and classification of data; in time, sophisticated machine learning tools can also eliminate the need to verify that data manually.

Getting to zero-touch! 

For true STP, we want an error rate low enough that human intervention is not required. The STP rate is the percentage of documents that go through the system with zero touch, i.e., no human contact. If the error rate (or, in KYC, the volume of adverse media hits) is high, every document has to be reviewed. Hence organizations must work on increasing accuracy by leveraging the power of AI and ML tools.
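The zero-touch measure above reduces to simple arithmetic. A minimal sketch, with hypothetical sample volumes:

```python
# STP (zero-touch) rate: the share of documents that flow through the
# system with no human intervention. The sample numbers are hypothetical.
def stp_rate(total_docs: int, touched_docs: int) -> float:
    """Fraction of documents processed with zero human touches."""
    return (total_docs - touched_docs) / total_docs

print(stp_rate(10_000, 1_200))  # prints: 0.88 -> 88% straight-through
```

Tracking this single ratio over time is a straightforward way to quantify progress toward zero-touch processing.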

If we are talking about cost efficiency, software that integrates easily with organization-wide legacy systems is also a prerequisite.

The automation success story is not only about STP. 

It is important to remember that the automation success story does not depend on STP alone; other factors like investment costs, capital performance, cycle time, ROI, and headcount reduction also matter. While “customer satisfaction” and “experience” are good to have, Net Present Value (NPV), cost efficiency, and headcount reductions matter a lot. After all, leaner, nimbler, and more efficient operations are what most organizations are after.

While considering STP, it is also essential to do some homework regarding the investment costs, the complexity of the process (the number of data elements that must be extracted, variance in documents, etc.), the cycle time, and the expected headcount reduction.

Experience tells us that the shift from highly manual processes to STP is not easy. It requires considerable patience and commitment, as it takes time to reach the desired levels of accuracy. The real test of STP success for any process is determining, with a high degree of precision, whether a task has been executed accurately. A high rate of error or human intervention results in zero STP.

Regardless of the challenges underlined earlier, STP remains a significant milestone in any organization’s journey towards automation. Banks and FIs that have successfully implemented STP have reaped many visible benefits. With Straight Through Processing, banks and FIs can choose to re-direct their efforts towards customer experience and retention, as they now have the time and bandwidth. When banks and FIs automate invoices and payments, they pave the way for a happier customer and employee experience. 

The question today is not whether STP is the ultimate test for automation progression; it is whether organizations can afford to do without STP, considering the astronomical costs of processing files and increased competition. Magic FinServ, with its years of experience serving a diverse clientele comprising some of the top American banks and Fintechs, is well acquainted with the opportunities and risks associated with process optimization and simplification using AI and ML. If organizations are not careful, the costs could escalate disproportionately and disrupt the drive towards digital transformation. Magic FinServ helps you navigate uncharted waters by leveraging our understanding of the financial services business to re-engineer existing applications, design new platforms, and validate machine learning solutions to suit your business needs. To explore our solutions, reach out to us at mail@magicfinserv.com

Enterprises have increasingly realized that they must implement AI to succeed as digital natives are fast outpacing the ones relying on monolithic architectures. However, lack of synchronization between downstream and upstream elements, failure to percolate the AI value and culture in the organization’s internal dynamics, unrealistic business goals, and lack of vision often means that the AI projects either get stuck in a rut or fail to achieve the desired outcomes. What seemed like a sure winner in the beginning soon becomes an albatross around one’s neck.

Mitigating the pitfalls with a well-drawn and comprehensive AI roadmap aligned to company needs  

According to a Databricks report, only one in three AI and predictive analytics projects is successful across enterprises. Most AI projects are time-consuming – it can take six months to go from concept to production. Most executives admit that inconsistencies in AI adoption and implementation stem from inconsistent data sets, silos, and a lack of coordination between IT and management and between data engineers and data scientists. Then there is the human element to take into account as well. Reluctance to invest, lack of foresight, and failure to make cultural changes are as responsible for falling short of AI targets as the technical aspects enumerated earlier.

This blog will consider both the technical and the human elements vital for conducting a successful AI journey. To mitigate any disappointment that could accrue later, enterprises must assess the risk appetite, ensure early wins, get the data strategy in place, drive real-time strategic actions, implement a model and framework that resonates with the organization’s philosophy while keeping in mind the human angle – ensuring responsible AI by minimizing bias.

Calculating the risk appetite – how far is the organization willing to go?

Whether the aim is to enhance customer experience or increase productivity, organizations must be willing to do some soul-searching and find out what they are seeking. What risks are they prepared to take? What is the future state of readiness and the AI maturity level? And how optimistic are things at the ground level?

From the utilitarian perspective, investing in a completely new paradigm of skills and resources that might or might not result in immediate ROI is debatable. However, calamities of a global scale like COVID-19 demand an increased level of preparedness. Businesses that cannot scale up quickly can become obsolete; therefore, building core competencies with AI makes sense. Automating processes mitigates the challenges of an unforeseeable future when operations cannot rely on manual effort alone. So even if it takes time to reach fruition, and not all projects translate into the desired dividends, it is a risk many organizations willingly undertake.

There is a lot at stake for the leadership as well. Once AI is implemented and organizations start to rely on AI/ML increasingly, the risks compound. Any miscalculation or misstep in the initial stages of AI/ML adoption could cause grievous damage to the business’s reputation and prospects. Therefore, leadership must gauge AI/ML risks.

Importance of early wins – focusing on production rather than experimentation

Early wins are essential: they elicit hope across an organization. Let us illustrate this with an example from the healthcare sector – the ‘moon shot’ project. Launched in 2013 at the MD Anderson Cancer Center, the project’s objective was to diagnose and recommend treatment plans for certain forms of cancer using IBM’s Watson cognitive system. But as the costs spiraled, the project was put on hold. By 2017, “moon shot” had accumulated costs amounting to $62 million without being tested on patients – enough to put the management on tenterhooks. But around the same time, other, less ambitious projects using cognitive intelligence were showing remarkable results. Used for simple day-to-day activities like determining whether a patient needed help with bill payments or making reservations, AI drove marketing and customer experience while relieving back-office care managers from the daily grind. MD Anderson has since remained committed to the use of AI.

Most often, it makes sense to start with process optimization cases. When a business achieves an efficiency gain of even one percent, or avoids downtime, it saves dollars – not counting the costs of workforce and machinery. It is relatively easy to calculate where and how cost savings can be ensured in existing business cases, instead of exploring opportunities to drive new revenue, as illustrated by the MD Anderson Cancer Center case study. Because we already know how the processes operate and where the drawbacks are, it is easier to determine areas where AI and ML can be baked in for easy wins. The data is also in a state of preparedness and requires less effort.

In the end, the organization has to show results. It cannot experiment willy-nilly; it is the business impact that it is after. Hence the concept of “productionizing” takes center stage. While high-tech and glamorous projects look good, these are best bracketed as “aspirational.” Instead, the low-hanging fruit that enables easy gains should be targeted first.

The leadership has a huge responsibility, and to prioritize production, they must work in tandem with IT. Both should pursue the same identifiable business goals for business impact.

Ensuring that a sound data strategy is in place – data is where the opportunity lies!

If AI applications process data a gazillion times faster than humans, it is because of trained data models; without them, AI apps are ordinary software running on conventional code. It is these data models – trained to carry out a range of complex activities and embedding NLP, computer vision, etc. – that make AI super-proficient. As a result, the application or system can decipher relevant text, extract data from images, generate natural language, and carry out a whole gamut of activities seamlessly. If AI is the machinery, data is its heart.

Optimizing the data pool

Data is the quintessential nail for want of which all the effort that goes into drafting an operating model for data and AI comes to naught. Data is the prime mover when it comes to devising an AI roadmap. For data to be an asset, it must be “findable, accessible, interoperable, and reusable.” If it exists in silos, data ceases to be an asset. Nor is it helpful if it exists in different formats; it is then a source of dubiety and must be cleaned and formatted first. Without a unique identifier (UID) attached, data can create confusion and overwrites. What the AI machinery needs is clean, formatted, and structured data that can easily be baked into existing systems. Data that is built once and used in many use cases is fundamental to the concept of productized data assets.

It pays to undertake data due diligence, or an exploratory data analysis (EDA): find out where data exists, who owns it, how it can be accessed, its linkages to other data, how it can be retrieved, etc., before drawing out the roadmap.

The kind of data defines the kind of machine learning model that can be applied. For supervised machine learning, data and labels are essential so that the algorithm can draw inferences about the patterns in the labels; unsupervised learning comes in when the data has no labels; and transfer learning applies when what an existing machine learning model has learned is used to build a new use case.
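This mapping from data characteristics to learning paradigm can be captured as a simple rule of thumb. The sketch below is purely illustrative and is no substitute for proper exploratory data analysis:

```python
# Toy rule of thumb mapping data characteristics to a learning paradigm,
# as described above. Function and flag names are illustrative.
def learning_paradigm(has_labels: bool, has_pretrained_model: bool) -> str:
    """Pick a learning paradigm from two coarse data characteristics."""
    if has_pretrained_model:
        return "transfer learning"   # reuse what an existing model learned
    return "supervised" if has_labels else "unsupervised"

print(learning_paradigm(True, False))    # prints: supervised
print(learning_paradigm(False, False))   # prints: unsupervised
print(learning_paradigm(False, True))    # prints: transfer learning
```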

Once the data has been extracted, it must be validated, analyzed, optimized, and enriched by integrating it with external data sources, such as those online or in social media, before being fed into the data pipeline – a kind of extract, transform, and load. Done manually, this could take ages and still be biased and error-prone.

Drawing the data opportunity matrix to align business goals with data

Once the existing data has been sorted, determine how it can be optimized for the business by integrating it with data from external sources. For this purpose, an opportunity matrix, also known as the Ansoff matrix, comes in handy. A two-by-two matrix that maps new and current business against internal and external data subsets, it aids the strategic planning process and helps executives and business leaders understand where they are in terms of data and how they would like to proceed.
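The two-by-two structure described above can be sketched as a small lookup table. The quadrant labels and suggested plays are illustrative placeholders, not prescriptions from the text:

```python
# Sketch of a data opportunity (Ansoff-style) matrix: business scope
# crossed with data source. Quadrant plays are illustrative examples.
opportunity_matrix = {
    ("current business", "internal data"): "optimize existing processes",
    ("current business", "external data"): "enrich and benchmark",
    ("new business", "internal data"): "repurpose existing data assets",
    ("new business", "external data"): "explore new revenue streams",
}

for (business, data), play in opportunity_matrix.items():
    print(f"{business:16s} x {data:13s} -> {play}")
```

Filling in each quadrant with concrete initiatives gives leaders a one-page view of where data investment should go first.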

Driving real-time strategic actions for maximum business impact using AI: Leadership matters 

Real-time strategic actions are important. Millennial banks and financial institutions, for example, must keep pace with customer expectations or face the consequences. By making the KYC process less painstaking with AI, banks and FinTechs can reap unexpected dividends. Done manually, KYC is time-consuming; by the time it is complete, the customer is frustrated. When AI and machine learning capabilities are applied to existing processes, organizations reduce manual effort and errors substantially, and the costs of conducting KYC fall as well. The biggest gain, however, is the customer experience that rebounds once the timelines (and human interaction) are reduced. That is like having the cake and eating it too!

SAAS, on-prem, open-source code – finding out what is best!

If it is efficiency and customer experience that an enterprise is after, SaaS works best. Hosted and maintained by a third party, it frees the business from hassles. However, if one wants complete control over data and must adhere to multiple compliance requirements, it is not a great idea. On-prem, on the other hand, offers more transparency and is suitable for back-end operations in a fintech company, fast-tracking processes such as reconciliations and AML/KYC. Though SaaS is feasible for organizations looking for quality and ease of application, open-source code produces better software; it also gives control and makes the organization feel empowered.

Conclusion: AI is not a simple plug and play 

AI is not a simple plug-and-play. It is a paradigm shift and not everyone gets it right the first time. Multiple iterations are involved as models do not always give the desired returns. There are challenges like the diminishing value of data which would require organizations to broaden their scope and consider a wider data subset for maximizing accuracy.  

Notwithstanding the challenges, AI is a proven game-changer. From simplifying back-office operations to adding value to day-to-day activities, there is a lot that AI can deliver. Expectations, however, would have to be set beforehand. The transition from near-term value to closing in on long-term strategic goals would require foresight and a comprehensive AI roadmap. For more information on how your organization could use AI to drive a successful business strategy, write to us at  mail@magicfinserv.com to arrange a conversation with our AI Experts.     

“Worldwide end-user spending on public cloud services is forecast to grow 18.4% in 2021 to total $304.9 billion, up from $257.5 billion in 2020.” Gartner

Though indispensable for millennial businesses, cloud and SaaS applications have increased the complexity of user lifecycle management manifold. User provisioning and de-provisioning and tracking user IDs and logins have emerged as the new pain points for IT as organizations innovate and migrate to the cloud. In the changing business landscape, automatic provisioning has emerged as a viable option for identity and user management.

Resolving identity and access concerns

Identity and access management (IAM) is the way organizations define users’ rights to access and use organization-wide resources. There have been several developments over the last couple of decades in resolving identity and access concerns in the cloud.

The Security Assertions Markup Language (SAML) protocol enables the IT admin to set up single sign-on (SSO) for resources like email, JIRA, and CRM, so that when users log in once, they can use the same set of credentials to log in to other services. However, app provisioning – the process of automatically creating user identities and roles in the cloud – remained a concern. Even today, many IT teams register users manually, but it is a time-consuming and expensive process: highly undesirable when the actual need is speed. The Just-in-Time (JIT) methodology and the System for Cross-domain Identity Management (SCIM) protocol usher in a new paradigm for identity management, regulating the way organizations generate and delete identities. In this blog, we will highlight how JIT and SCIM have redefined identity and access management (IAM). We will also focus on cloud directory services and how they reimagine the future of IAM.

  1. Just-in-Time (JIT) provisioning

There are many methodologies for managing user lifecycles in web apps; one of them is JIT, or Just-in-Time. In simple terms, Just-in-Time (JIT) provisioning enables organizations to grant access so that only entitled users can enter the system, access resources, and perform specific tasks. The user in this case can be human or non-human, and policies govern the kind of access they are entitled to.

How it works    

JIT provisioning automates the creation of user accounts for cloud applications. It is a methodology that extends the SAML protocol to transfer user attributes (for example, of new employees joining an organization) from a central identity provider to applications such as Salesforce or JIRA. Rather than creating a new user within each application and approving its app access, an IT admin can create new users and authorize their app access from the central directory. When a user logs into an app for the first time, the account is automatically created in the federated application. This level of automation was not possible before JIT, when each account had to be created manually by an IT administrator or manager.
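The create-on-first-login flow can be simulated in a few lines. This is a simplified sketch, not a real SAML integration; the attribute names (`email`, `displayName`, `role`) are illustrative, not any vendor's schema:

```python
# Simplified simulation of JIT provisioning: on a user's first SSO login,
# the application creates the account from the asserted SAML attributes.
# Attribute names and the in-memory store are illustrative.
app_accounts: dict[str, dict] = {}

def jit_login(saml_attributes: dict) -> dict:
    """Provision the app account on first login if it does not yet exist."""
    email = saml_attributes["email"]
    if email not in app_accounts:          # first login -> provision
        app_accounts[email] = {
            "displayName": saml_attributes["displayName"],
            "role": saml_attributes.get("role", "member"),
        }
    return app_accounts[email]             # later logins reuse the account

jit_login({"email": "ada@example.com", "displayName": "Ada", "role": "analyst"})
print(app_accounts)   # account was auto-created on first login
```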

  2. System for Cross-domain Identity Management (SCIM)

SCIM is the standard protocol for cross-domain identity management. As IT today is expected to perform like a magician – juggling several balls in the air and ensuring that none falls – SCIM has become exceedingly important, as it simplifies IAM.

SCIM defines both a protocol and a schema for IAM. The protocol defines how user data is relayed across systems, while the schema, or identity profile, defines the entity, which could be human or non-human. An API-driven identity management protocol, SCIM standardizes identities between identity providers and service providers by using HTTP verbs.
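Concretely, a SCIM identity is a JSON resource, and the lifecycle operations map onto standard HTTP verbs against a `/Users` endpoint (per RFC 7643/7644). The user attributes below are a minimal sketch of the core schema:

```python
# A minimal SCIM 2.0 User resource (core schema, RFC 7643) and the
# HTTP verbs the protocol (RFC 7644) maps onto lifecycle operations.
import json

scim_user = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "userName": "ada@example.com",
    "name": {"givenName": "Ada", "familyName": "Lovelace"},
    "active": True,
}

# POST   /Users        -> create (onboard)
# GET    /Users/{id}   -> read
# PATCH  /Users/{id}   -> update attributes (e.g. set active to false)
# DELETE /Users/{id}   -> deprovision (offboard)
print(json.dumps(scim_user, indent=2))
```

Because every identity provider and service provider speaks this same resource format and verb mapping, account management stops being a per-application integration problem.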

Evolution of SCIM

The first version of SCIM was released in 2011 by the SCIM standard working group. As the new paradigm of identity and access management backed by the Internet Engineering Task Force (IETF), with contributions from Salesforce, Google, and others, SCIM transformed the way enterprises build and manage user accounts in web and business applications. The SCIM specification defines a “common user schema” that enables movement into and out of apps.

Why SCIM? 

Next level of automation: SCIM’s relevance in the user life cycle management of B2B SaaS applications is enormous.   

Frees IT from the shackles of tedious and repetitive work: Admins can build new users (in the central directory) with SCIM. Through ongoing sync, they can automate both onboarding and offboarding of users/employees from apps. SCIM frees the IT team from the burden of having to process repetitive user requests. It is possible to sync changes such as passwords and attribute data. 

Let us consider the scenario where an employee leaves the organization, or is on contract and the contract has expired. The SCIM protocol ensures that deleting the account from the central directory also deletes the corresponding identities from the apps. This level of automation was not possible with JIT; with SCIM, organizations achieve the next level of automation.
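The offboarding sync described above can be illustrated with a toy in-memory model. The directory, account store, and function names are all hypothetical; a real deployment would issue SCIM DELETE requests to each app instead:

```python
# Toy model of SCIM-style offboarding sync: any app account whose
# identity no longer exists in the central directory is deleted.
# All names and stores here are illustrative.
central_directory = {"ada@example.com", "bob@example.com"}
app_accounts = {
    "ada@example.com": {"active": True},
    "bob@example.com": {"active": True},
    "eve@example.com": {"active": True},   # removed from the directory
}

def sync_deprovision(directory: set, accounts: dict) -> None:
    """Delete app accounts with no matching identity in the directory."""
    for user in list(accounts):            # copy keys: we mutate the dict
        if user not in directory:
            del accounts[user]

sync_deprovision(central_directory, app_accounts)
print(sorted(app_accounts))   # eve's account is gone, no manual cleanup
```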

  3. Cloud Directory Services

Cloud directory services are another category of IAM solutions that has gained a fair amount of traction recently. Earlier, most organizations were on-prem, and Microsoft Active Directory fulfilled their IAM needs. The IT environment, in contrast, has changed dramatically in the last decade: users are more mobile, security is a significant concern, and web applications are de facto. The shift from AD to directory-as-a-service is therefore a natural progression in tune with the changing requirements, and a viable choice for organizations. Platform-agnostic, cloud-based, diversified, and supporting a wide variety of protocols like SAML, it serves the purpose of modern organizations. These directories store information about devices, users, and groups; IT administrators can use them to simplify their workload and extend access to information and resources.

Platform-agnostic schema: As an HTTP-based protocol that handles identities in multi-domain scenarios, SCIM defines the future of IAM. Organizations are not required to replace existing user management systems, as SCIM acts as a standard interface on top. SCIM specifies a platform-agnostic schema and extension model for users, groups, and other resource types in JSON format (defined in RFC 7643).

Ideal for SaaS: Ideal for SaaS-based apps as it allows administrators to use authoritative identities, thereby streamlining the account management process.

Organizations using internal applications and external SaaS applications are keen to reduce onboarding/offboarding effort and costs. A cloud directory service helps simplify processes while allowing organizations to provision users to other tools such as applications, networks, and file servers.

It is also a good idea for cloud directory service vendors like Okta, JumpCloud, OneLogin, and Azure AD to opt for SCIM. They benefit from SCIM adoption, as it makes managing identities in cloud-based applications easier than before. All they need to do is accept the protocol, and seamless integration of identities with resources, privileges, and applications is facilitated. Providers can help organizations manage the user lifecycle with supported SCIM applications or SCIM-interfaced IDPs (identity providers).

How JIT and SCIM differ

As explained earlier, SCIM is the next level of automation. SCIM automates provisioning, de-provisioning, and ongoing management, while JIT automates only account creation. Organizations need to deprovision users when they leave the organization or move to a different role; JIT does not provide that facility – while the user credentials stop working, the account is not deprovisioned. With SCIM, app access is automatically deleted.

Though JIT is more common, and more organizations are going forward with JIT implementations, SCIM is gaining ground. Several cloud directory service providers, realizing its tremendous potential, have accepted the protocol. SCIM, they recognize, is the future of IAM.

Benefits of SCIM Provisioning

  1. Standardization of provisioning

Every type of client environment is handled and supported by the SCIM protocol. SCIM protocol supports Windows, AWS, G Suite, Office 365, web apps, Macs, and Linux. Whether on-premise or in the cloud, SCIM is ideal for organizations desiring seamless integration of applications and identities. 

  2. Centralization of identity

An enterprise can have a single source of truth, i.e., a common IDP (identity provider), and communicate with the organization’s applications and vendor applications over the SCIM protocol to manage access.

  3. Automation of onboarding and offboarding

Admins no longer need to create and delete user accounts in different applications manually. It saves time and reduces human errors. 

  4. Ease of compliance

As there is less manual intervention, compliance standards are higher. Enterprises can control user access without depending on SaaS providers. Employee onboarding and offboarding can be a massive effort if conducted manually; when employees onboard or offboard frequently, the corresponding risk of a data breach is high. Also, as an employee’s profile changes during their tenure, compliance can be at risk if access is not managed correctly. With SCIM, all the scenarios described above can be handled transparently in one place.

  5. More comprehensive SSO management

SCIM complements existing SSO protocols like SAML. User authentication, authorization, and application launch from a single point are taken care of with SAML. Though JIT user provisioning with SAML helps with provisioning, it does not cover complete user lifecycle management. With a SCIM and SAML combination, SSO along with user management across domains can be easily managed.

SCIM is hard to ignore

Modern enterprises cannot deny the importance of the SCIM protocol. According to the latest Request for Comments – a publication from the Internet Society (ISOC) and associated bodies like the Internet Engineering Task Force (IETF) – “SCIM intends to reduce the cost and complexity of user management operations by providing a common user schema, an extension model, and a service protocol defined by this document.” Beyond simplifying IAM and enabling users to move in and out of the cloud without causing the IT admin needless worry, SCIM-compliant apps can also avail themselves of pre-existing code and tools.
At Magic FinServ, we realize that the most significant benefit SCIM brings to clients is that it enables them to own their data and identities. It helps IT prioritize essential functions instead of getting lost in the mire of tracking identities and user access. Magic FinServ is committed to ensuring that our clients keep pace with the latest developments in technology. Visit our cloud transformation section to know more.

2020-2021 marked a new epoch in the history of business. For the first time, a massive percentage of the workforce was working from home. While employees struggled to cope with the limitations of working virtually, artificial intelligence (AI) emerged as a reliable partner for enterprises worldwide. With AI, enterprises were assured that business processes were not disrupted due to the scarcity of labor and resources.  

Now that the worst seems over, there are more reasons than ever to invest in AI. AI was an infallible ally for many organizations in 2020. It helped them meet deadlines and streamline internal operations while eliminating wasteful expenditure, and it helped them cope with burgeoning workloads. The impact AI had on employee productivity was significant: by unfettering staff in back and middle offices from the cycle of mundane, repetitive, and tiresome tasks, AI enabled the workforce to engage in high-value tasks.

So even as employees return to the office in the coming days, many organizations will continue to amplify their AI efforts. Wayne Butterfield, director of ISG Automation, a unit of global technology research and advisory firm ISG, attributes this new phenomenon to the powerful impact AI had last year. He says, “As the grip of the pandemic continues to affect the ability of the enterprise to operate, AI in many guises will become increasingly important as businesses seek to understand their COVID-affected data sets and continue to automate day-to-day tasks.”

Indeed, in the banking and financial sector, the benefits driven by AI in the past year were monumental. It ensured frictionless interactions, cut repetitive work by half, and significantly reduced error, bias, and false positives that stem from human fallibility. What organizations got was a leaner, more streamlined, and more efficient operation. So there is no question that the value driven by AI in domains like finance and banking, which rely heavily on processes, will only continue to grow in the years to come.

Setting pace for innovation and change

The pandemic has redefined digital. With enterprises becoming more digitally connected than ever before, it is AI that helps them stay operational. As a report from Insider indicates, there will be significant savings in middle-, back-, and front-office operations if AI is incorporated. Automation of middle-office tasks alone can lead to savings of $70 billion by 2025. The total expected cost savings from AI applications is estimated at $447 billion by 2023, of which the front and middle office will account for $416 billion.

That AI will set the pace for innovation and change in the banking and financial services sector is all but guaranteed. The shift towards digital had started earlier; the pandemic only accelerated the pace. Here are some of the key areas where FinTechs and banks are using AI:

  • Document processing
  • Invoice processing
  • Cyber security
  • Onboarding/KYC

Document processing with AI 

Enterprises today are sitting on a data goldmine that comes from sources as diverse as enterprise applications, public/private data sets, and social media. However, data in its raw form is of little use. Data, whether it arrives as text, PDFs, or spreadsheets, has to be classified, segregated, summarized, and converted into machine-readable formats (JSON, etc.) before it can be of use to the organization.

Earlier, image-recognition technologies such as OCR were used for document processing. However, their scope is limited, given that organizations deal with enormous amounts of data in diverse formats, including print and handwriting, not all of which OCR can recognize. Document-processing platforms have a distinct advantage over traditional recognition technologies such as OCR and ICR. The system is first trained on data sets, and a core knowledge base is created. Over time the knowledge base expands, and the tool develops the ability to self-learn and recognize content and documents. This is achieved through a feedback (re-training) loop under human supervision. Realizing that artificial intelligence, machine learning, natural language processing, and computer vision can play a pivotal role in document processing, organizations are increasingly relying on them to enhance the efficiency of front- and back-office processes.
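The classify-then-convert step described above can be sketched in a few lines. This is a toy illustration only: the categories, keywords, and function names are hypothetical, and a production platform would use a trained ML/NLP model rather than keyword counts.

```python
import json

# Hypothetical document categories and keywords that hint at them.
# A real document-processing platform learns these from training data.
CATEGORY_KEYWORDS = {
    "invoice": ["invoice", "amount due", "bill to"],
    "kyc": ["passport", "proof of address", "date of birth"],
    "contract": ["agreement", "party", "hereinafter"],
}

def classify(text: str) -> str:
    """Pick the category whose keywords appear most often in the text."""
    text_lower = text.lower()
    scores = {
        cat: sum(text_lower.count(kw) for kw in kws)
        for cat, kws in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def to_machine_readable(doc_id: str, text: str) -> str:
    """Wrap raw text plus its predicted category into JSON for downstream systems."""
    record = {"id": doc_id, "category": classify(text), "text": text}
    return json.dumps(record)

print(to_machine_readable("doc-1", "Invoice #42: amount due $1,200. Bill to ACME."))
```

The JSON output is what downstream processes (indexing, workflow routing, analytics) consume, which is the "machine-readable format" the paragraph refers to.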

Invoice Processing and AI

Covid-19 has intensified the need for automated Accounts Payable processes. Organizations that were relying on manual and legacy systems for invoice processing were caught off-guard as employees were forced to work from home. Ensuring timely delivery on payment approvals became a challenge due to archaic legacy practices and an increasing number of constraints. There was also the question of visibility into outstanding payments. All this led to chaos in invoice processing, with frayed tempers and missed deadlines.

A major chunk of invoice processing is data entry. Finance and accounts personnel sift through data that comes from sources such as fax, paper, and e-mail. A study of 1,000 US workers confirmed that no one likes data entry: a whopping 70 percent of employees said they would be happy if data entry and other such mundane tasks were automated. With automated invoice processing, it is possible to capture invoices from multiple channels, identify and extract data (header and lines) using validation rules, and, in time, with little human supervision, become proficient at identifying the relevant information. It can also handle matching and coding. Magic FinServ’s machine learning algorithm, for example, determined the correct General Ledger code to tag an invoice against the appropriate charge code and, using RPA, inserted the code on the invoice.
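The header-extraction step can be sketched with simple patterns. The field names and regexes below are illustrative assumptions, not any vendor's actual rules; real systems learn vendor-specific templates or use an ML extraction model:

```python
import re

# Illustrative extraction rules for a few common invoice header fields.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*([\w-]+)", re.I),
    "invoice_date": re.compile(r"Date\s*[:\-]?\s*(\d{4}-\d{2}-\d{2})", re.I),
    "total": re.compile(r"Total\s*[:\-]?\s*\$?([\d,]+\.\d{2})", re.I),
}

def extract_header(text: str) -> dict:
    """Extract header fields; missing fields are flagged for human review."""
    fields, needs_review = {}, []
    for name, pattern in FIELD_PATTERNS.items():
        m = pattern.search(text)
        if m:
            fields[name] = m.group(1)
        else:
            needs_review.append(name)
    fields["needs_review"] = needs_review
    return fields

sample = "Invoice # INV-1043\nDate: 2021-03-15\nTotal: $2,450.00"
print(extract_header(sample))
```

Routing the `needs_review` fields to a human is the feedback loop through which such systems improve over time.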

Banks and other financial services stand to gain a lot by automating invoice processing. 

  • By automating invoice processing with artificial intelligence, organizations can make it easier for the finance staff and back-office team to concentrate on cash-generating processes instead of entering data, a typical administrative function.
  • Automating the accounts payable process for instance, can help the finance teams focus on tasks that generate growth and opportunities. 
  • An automated invoice processing provides enhanced visibility into payments and approvals.
  • It speeds up the invoice processing cycle considerably; as a result, there are no irate vendors.
  • It makes it easier to search and retrieve invoices.      

Cyber Security and AI

Cybersecurity has become a prime concern with enterprises’ increasing preference for cloud and virtualization. Cybersecurity concerns grew graver during Covid-19 as the workforce, including software development teams, started working from home. With third parties and vendors involved in many processes as well, it became imperative for organizations to exercise extreme caution while working in virtualized environments. Experience has taught us that data breaches spell disaster for an organization’s reputation. We need look no further than Panera Bread and Uber to see how code left unsecured in haste can alter the rules of the game. Hence the greater impetus for the shift-left narrative, where security is driven into the DevOps lifecycle instead of bolted on as an afterthought. The best recourse is to implement an AI-driven DevOps solution. With AI baked into the development lifecycle, organizations can accelerate delivery in the present and adapt to changes in the future with ease.

Onboarding/KYC and AI

One of the biggest challenges for banks is customer onboarding and KYC. In the course of KYC or onboarding, banks have to handle thousands, sometimes even millions, of documents. And if that were not enough, they also have to take account of exhaust data and multiple compliance and regulatory standards. No wonder, then, that banks and financial institutions often fall short of meeting deadlines. Last year, as the Covid-19 crisis loomed large, it was tools powered by AI and machine learning that helped accelerate paperwork processes. These digitize documents and extract data from them. And as the tools evolve over time, they make it easier for the organization to extract insights.

Let us take the example of one prominent InsurTech company that approached Magic FinServ to resolve its KYC challenges. The company wanted to reduce the time taken for conducting KYC and improve the SLAs for rolling out new policies. Magic’s “soft template” based solution, augmented by artificial intelligence, delivered the results they wanted and earned them customer confidence and appreciation.

Tipping point

Though banks and financial institutions were already inclining towards AI to make their processes robust, the tipping point was the pandemic. The pandemic made many realize that it was now or never. This is evident from a report by the management solutions provider OneStream, which observed that the use of AI tools like machine learning had jumped from about 20% of enterprises in 2020 to nearly 60% in 2021. Surprisingly, analytics firms like FICO and Corinium found that a majority of top executives (upwards of 65%) do not know how AI works.

At Magic FinServ, our endeavor is to ensure that this knowledge percolates enterprise-wide. Our implementation journey therefore starts with a workshop wherein our team of AI engineers showcases the work they have done and then engages in an insightful session to identify areas of opportunity as well as the deterrents. Thereafter comes the discovery phase, where our team develops a prototype. Once the customer gives the go-ahead, confident in our ability to meet expectations, we implement the AI model and integrate it with the existing business environment. A successful implementation is not the end of the journey: we keep identifying new areas of opportunity so that true automation at scale can be achieved.

Catering to Banks and FinTechs: Magic FinServ’s unique AI optimization framework    

At Magic FinServ, we have a unique AI Optimization framework that utilizes structured and unstructured data to build tailored solutions that reduce the need for human intervention. Our methodology, powered by AI, ML, NLP, and computer vision, provides 70% efficiency in front- and middle-office platforms and processes. Many of our AI applications for Tier 1 investment banks, FinTechs, asset managers, hedge funds, and InsurTech companies have driven bottom- and top-line dividends for the businesses in question. We ensure that our custom-built applications integrate seamlessly into existing systems and adhere to all regulatory compliance measures while preserving agility.

For some time now, asset managers have been looking at ways to net greater profits by optimizing back-office operations. The clamor to convert back-office from a “cost-center” to a “profit center” is not recent. But it has increased with the growth of passive investment and regulatory controls. Moreover, as investment fees decline, asset managers look for ways to stay competitive. 

Back-office is where AI and ML can drive massive business impact. 

For most financial organizations considering a technology upgrade, the back office is where they should start. Whether it is reconciliation, daily checkout, or counterparty processing, back-office processes are the “low-hanging fruit” where AI and ML can be embedded within existing architecture/tools without much hassle. The investment costs are reasonably low, and financial organizations are generally assured of an ROI if they choose an appropriate third-party vendor with expertise in handling such transitions.

Tasks in the back-office that AI can replace

AI can best be applied to tasks that are manual, voluminous, repetitive, and require constant analysis and feedback. This makes back-office operations/processes a safe bet for AI, ML, and NLP implementation. 

The amount of work that goes behind the scenes in the back office is exhaustive, never-ending, and cumbersome. Back-office operatives are aided in their endeavors by core accounting platforms. Accounting platforms, however, provide the back-office operator with information and data only. Analysis of data is primarily a manual activity in many organizations. As a result, the staff is generally stretched and has no time to add value. Silos further impede process efficiency, and customer satisfaction suffers as the front, back, and middle offices are unable to work in tandem.  

While there is no substitute for human intelligence, the dividends that accrue when AI is adopted are considerable. Efficiency gains and downtime reduction boost employee and organizational morale while driving revenue upstream.

This blog will consider a few use cases from the back-office where AI and ML can play a significant role, focusing on instances where Magic FinServ was instrumental in facilitating the transition from manual to AI with substantial benefits.  

KYC: Ensuring greater customer satisfaction 

Data that exists in silos is one of the biggest challenges in fast-tracking KYC. Unfortunately, it is also a prime reason behind poor customer experience. The KYC process, when done manually, is long and tedious and involves chasing clients time and again for information.

With Magic DeepSight’s™ machine learning capabilities, asset managers and other financial institutions can reduce this manual effort by up to 70% and accomplish the task with higher speed and lower error rate, thereby reducing cost. Magic DeepSight™ utilizes its “soft template” based solution to eliminate labor-intensive tasks. It has enabled several organizations to reduce the time taken for KYC and overall improve SLAs for new client onboarding.  

Reconciliation: Ensuring quicker resolution

As back-office operations are required to handle exceptions quickly and accurately, manual effort needs to be supplemented by something more concrete and robust. Though traditional tools carry out reconciliation, many organizations still resort to spreadsheets and manual processes, and hence inconsistencies abound. As a result, most organizations manually reconcile anywhere between 3% and 10% of volume daily.

So at Magic FinServ, we designed a solution that can be embedded/incorporated on top of an existing reconciliation solution. This novel method reduces manual intervention by over 95% using artificial intelligence. This fast-tracks the reconciliation process dramatically, ensures quicker time to completion, and makes the process less error-prone. Magic FinServ implemented this ‘continuously learning’ solution for a $250B AUM Asset Manager and reduced the trade breaks by over 95%.
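The matching logic at the heart of any reconciliation pass can be sketched as follows. This is a toy illustration, not Magic FinServ's solution: trade IDs, the amount field, and the tolerance are assumptions for the example.

```python
# Toy reconciliation pass: trades keyed by trade ID with a cash amount;
# amounts within a small tolerance are treated as matched.
TOLERANCE = 0.01

def reconcile(internal: dict, counterparty: dict) -> dict:
    matched, breaks = [], []
    for trade_id, amount in internal.items():
        other = counterparty.get(trade_id)
        if other is None:
            breaks.append((trade_id, "missing at counterparty"))
        elif abs(amount - other) > TOLERANCE:
            breaks.append((trade_id, f"amount mismatch: {amount} vs {other}"))
        else:
            matched.append(trade_id)
    # Trades the counterparty reports that we have no record of.
    for trade_id in counterparty.keys() - internal.keys():
        breaks.append((trade_id, "missing internally"))
    return {"matched": matched, "breaks": breaks}

ours = {"T1": 100.0, "T2": 250.5, "T3": 75.0}
theirs = {"T1": 100.0, "T2": 251.0, "T4": 10.0}
print(reconcile(ours, theirs))
```

An ML layer like the one described above sits on top of logic like this, learning from past analyst decisions which residual breaks can be auto-resolved and which genuinely need human attention.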

Fund Accounting: Ensuring efficiency and productivity 

Fund accounting can be made more efficient and productive with AI. Instead of going through tons of data in disparate formats, the back office can leverage AI to analyze information in income tax reports, Form K-1 tax reports, etc., in a fraction of the time taken manually and make it available for dissemination. For example, Magic FinServ’s Text Analytics Tool, based on distant supervision and semantic search, can summarize almost any unstructured financial data with additional training. For a Tier 1 investment bank’s research team that needed to fast-track and make its processes more efficient, we created an integrated NLP-based solution that automated summarizing the Risk Factors section of 10-K reports.

Invoice and Expense Automation: Eliminating the manual effort

Automated invoice processing is the answer for organizations that struggle with a never-ending backlog of invoices and expenses. An AI-integrated engine captures and extracts invoice and expense data in minutes. Data can be extracted from different channels without setting up new templates and rules. There is also the advantage of automated learning, facilitated by the AI engine’s self-learning and validation interface.

Magic FinServ used its sophisticated OCR library, built using machine learning, to eliminate the manual effort of uploading invoices to industry-standard invoice and expense management applications. Another machine learning algorithm determined the correct General Ledger code to tag the invoice against the appropriate charge code, and finally, RPA was used to insert the code on the invoice.

Streamlining corporate actions operations

Corporate actions are one of the classic use-cases for optimization using AI. Traditionally, most corporate actions have been done manually, even though they are low-value activities and can mostly be automated with suitable systems. However, whether it is managing an election process with multiple touchpoints or disseminating accurate and complete information to stakeholders and investment managers, the fallout of missing an event or misreporting can be considerable. One way to reduce the risk is to receive notifications from more than one source. But that would compound the back-office workload as they would have to record and reconcile multiple notifications. Hence the need for AI.

Magic FinServ’s AI solution streamlines several routine corporate action operations delivering superior quality. The AI system addresses inefficiencies by reading and scrubbing multiple documents to capture the corporate action from the point of announcement and create a golden copy of the corporate action announcement with ease and efficiency. This takes away the need for manual processing of corporate action announcements saving up to 70% of the effort. This effort can be routed to other high-risk and high-value tasks. 

Conclusion: 

Back-office automation drives enormous dividends. It improves customer satisfaction and efficiency, reduces error rates, and ensures compliance. Among the five technology trends for banks (for 2020 and beyond) identified in a Forrester report, the move towards “zero back offices” is the culmination of the increasing demand for process automation in the back office. As the McKinsey Global Institute puts it: “Thirty percent of tasks in a majority of occupations can be automated, and robotics is one way to do that. For large back offices with data-entry or other repetitive, low judgment, high-error-prone, or compliance-needy tasks, this is like a panacea.” For a long time, we have also known that most customer dissatisfaction results from inadequacies of the back office. As organizations get ready for the future, there is a greater need for synchronization between the back, middle, and front office. There is no doubt that AI, ML, and NLP will play an increasingly prominent role in the transition to the next level.

“85% of organizations include workload placement flexibility in their top five technology priorities – and a full 99% in their top 10.”

The pandemic has been an eye-opener. While organizations gravitated towards the cloud before the pandemic, they are more likely to opt for the cloud now as they realize the enormous benefits of data storage and processing in an environment unencumbered by legacy systems. The cloud facilitates the kind of flexibility that was unanticipated earlier. Other reasons behind the cloud’s popularity are as follows:  

  • Consolidates data in one place: Organizations no longer have to worry about managing data in on-prem data centers.
  • Self-service capability: This feature of the cloud enables organizations to monitor network storage, server uptime, etc., on their own.
  • Promotes agility: The monolithic model that companies were reliant on earlier was rigid. With the cloud, teams can collaborate from anywhere instead of on-prem.
  • Ensures data security: By modernizing infrastructure and adopting the best practices, organizations can protect their critical data from breaches.
  • Fosters innovation: One can test new ideas and see if they work. For example, the deployment team can conduct a quick POC and see if it meets the desired objectives.
  • Scalable: One can scale up and down as per the need of the hour. Operational agility ranks high in the list of CIO objectives.
  • High availability: Ensures anytime and anywhere access to tools, services, and data. In the event of a disaster, backup and recovery are easily enabled. Not so for on-prem data storage.
  • Affordable: Cloud services use the pay-per-use model. There is no upfront capital expenditure for hardware and software. Most organizations resort to the pay-as-you-go model and thereby ward off unnecessary expenditure.      

Migration strategies 

“Ninety percent of organizations believe a dynamically adjustable cloud storage solution will have a moderate to high impact on their overall cloud success.”

While most organizations are aware that they must move their workloads to the cloud, given the demands of the marketplace, they are not sure how to start. Every cloud migration is unique because each organization has its own priorities, application design, timelines, cost, and resource estimates to consider while pursuing a cloud strategy. Hence the need for a vendor that understands their requirements. After all, a digital native would pursue a cloud strategy completely differently from organizations that have complex structures and legacy systems to consider. Because their constraints and priorities differ, a one-size-fits-all approach does not work, especially for financial services organizations. The key is to execute the migration at a pace the organization is comfortable with instead of going full throttle.

This article has identified the three most important cloud migration strategies and the instances where these should be used.  

  1. Lift & Shift
  2. Refactor 
  3. Re-platform

Lift & Shift – for quick ROI

The Lift & Shift (Rehosting) strategy of cloud migration re-hosts the workload, i.e., the application “as-it-is” from the current hosting environment to a new cloud environment. The rehosting method is commonly used by organizations when they desire speedy migration with minimal disruption. 

Following are the main features of the rehosting approach: 

  • Super quick turnaround: This strategy is useful when tight deadlines are to be met. For example, when the current on-prem or hosting provider’s infrastructure is close to decommissioning/end of the contract, or when the business cannot afford prolonged downtime. Here, the popular approach is to re-host in the cloud and pursue app refactoring later to improve performance.  
  • Risk mitigation: Organizations must ensure the budget and mitigation plan account for the inherent risks. No issues may surface during the migration itself, but run-time issues might appear after going live. The mitigation in such instances could be as small as the ability to tweak or refactor as per need.
  • Tools of transformation: Lift & Shift can be performed with or without migration tools. A frequently employed example is packaging an application as an image and exporting it to a container or VM running on the public cloud using migration tools like VM Import or CloudEndure.

While choosing lift-and-shift, remember that quick turnaround comes at the cost of restricted use of features that make the cloud efficient. All cloud features cannot be utilized by simply re-hosting an application workload in the public cloud. 

Refactor – for future-readiness

Refactoring means modifying an existing application to leverage cloud capabilities. This migration strategy is suitable for reworking applications into cloud-native ones that utilize public cloud features like auto-scaling, serverless computing, containerization, etc.

We have provided here a few easy cloud feature adaptation examples where the refactoring approach is desirable:

  • Use “object storage services” such as AWS S3 or GCP Cloud Storage to download and upload files.
  • Auto-scaling the workload to add (or subtract) computational resources
  • Utilizing cloud-managed services like managed databases, for example, AWS Relational Database Service (RDS) and MongoDB Atlas.
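The first adaptation above, swapping local file writes for object storage, can be sketched as follows. The function and bucket names are hypothetical; the only API assumption is boto3's S3 `upload_file(Filename, Bucket, Key)` signature, and the client is injected so the code can also be exercised with a stub.

```python
# Sketch of refactoring a "write report to disk" step into "publish to
# object storage". The client is injected rather than created inside, so
# the same code works with a real boto3 S3 client or a test double.
def publish_report(client, bucket: str, local_path: str, key: str) -> str:
    """Upload a local file to object storage and return its object URI."""
    client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"

# Usage with boto3 would look like this (not executed here):
#   import boto3
#   uri = publish_report(boto3.client("s3"), "reports-bucket",
#                        "/tmp/daily.pdf", "2021/03/daily.pdf")
```

Injecting the client is itself a refactoring-friendly design choice: it keeps the cloud dependency at the edge of the code, which eases both testing and any later switch of providers.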

Distinguishing features of this kind of cloud migration, and what organizations should consider:

  • Risk mitigation: Examine the expense and capital invested, and appraise the costs of business interruptions due to the rewrite. Refactoring software is complex, as the development teams that wrote the original code may be busy with other projects.
  • Cost versus benefit: Weigh the advantages of the refactoring approach. Refactoring is best if benefits outweigh the costs and the migration is feasible for the organization considering the constraints defined earlier.
  • Refactor limited code: Due to these limitations, businesses usually refactor only a limited portion (about 10%) of their application portfolio.

Though the benefits of this approach – like disaster recovery and full cloud-native functionality – more than make up for the expense, businesses must nonetheless consider other dynamics. Another advantage of this approach is its compatibility with future requirements.

Re-platform – meeting the middle ground

To utilize the features of cloud infrastructure, re-platform migrations transfer assets to the cloud with a small amount of modification to the code or its deployment, for example, using a managed DB offering or adding automation-powered auto-scaling. Though slower than rehosting, re-platforming provides a middle ground between rehosting and refactoring, enabling workloads to benefit from basic cloud functionality.

Following are the main features of the re-platform approach:

  • Leverage cloud with limited cost and effort: If the feasibility study reveals that full refactoring is not practical but the organization still wants to leverage cloud benefits, re-platforming is the best approach.
  • Re-platform a portion of the workload: Due to constraints, companies opt to re-platform the 20-30% of the workload that can be easily transformed and can utilize cloud-native features.
  • Team composition: In such projects, cloud architecting and DevOps teams play a major role without depending heavily on development team/code changes. 
  • Leverage cloud features: Cloud features that can be leveraged are: auto-scaling, managed services of the database, caching, containers, etc. 

For an organization dealing with limitations of time, effort, and cost while desiring the benefits of the cloud, re-platforming is the ideal option. For example, for an e-commerce website employing a framework that is unsuitable for serverless architecture, re-platforming is a viable option.

Choosing the right migration approach secures long-term gains.

What we have outlined here are some of the most popular cloud migration strategies adopted by businesses today. There are other migration approaches, such as repurchasing, retaining, and retiring, which function as their names imply. In the retain (or hybrid) model, organizations keep certain components of the IT infrastructure “as-is” for security or compliance purposes. When certain applications become redundant, they are retired, i.e., turned off in the cloud. Further, organizations can also choose to drop their proprietary applications and purchase a cloud platform or service instead.

At Magic FinServ, we have a diverse team to deliver strategic cloud solutions. We begin with a thorough assessment of what is best for your business. 

Today, organizations have realized that they cannot work in silos anymore. That way of doing business became archaic long ago. As enterprises demand greater levels of flexibility and preparedness, the cloud becomes irreplaceable. It allows teams to work in a collaborative and agile environment while ensuring automatic backup and enhanced security. As experts in the field, Magic FinServ suggests that organizations approach migration with an application-centric perspective instead of an infrastructure-centric one to create an effective migration strategy. The migration plan must be resilient and support key future business goals. It must adhere to agile methodology and allow continuous feedback and improvement. Magic FinServ’s cloud team assists clients in shaping their cloud migration journey without losing sight of end goals, while ensuring business continuity.

If your organization is considering a complete/partial shift to the cloud, feel free to write to mail@magicfinserv.com to arrange a conversation with our Cloud Experts. 

A couple of years ago, Uber, the ride-sharing app, revealed that it had exposed the personal data of millions of users. The breach happened when an Uber developer left an AWS access key in a GitHub repository. (Scenarios such as these are common: in a rush to release code, developers unknowingly fail to protect secrets.) Hackers used this key to access files from Uber’s Amazon S3 datastore.

As organizations embrace the remote working model, security concerns have increased exponentially. This is especially problematic for the healthcare and financial sectors, which deal with confidential data. Leaders from the security domain warn of dire consequences if organizations do not shed their apathy about data security. Vikram Kunchala, US lead for Deloitte’s cyber cloud practice, warns that the attack surface (for hackers) has become much wider as organizations shift to cloud and remote working, and is no longer limited to the “four walls of the enterprise.” He insists that organizations must consider application security a top priority and look for ways to secure code, as the most significant attack vector is the application layer.

Hence a new paradigm with an ongoing focus on security – shifting left. 

Shifting left: Tools of Transformation. 

Our blog, DevSecOps: When DevOps’ Agile Meets Continuous Security, focused on the shift-left approach. Shifting left means integrating security early in the DevOps cycle instead of treating it as an afterthought. Though quick turnaround time and release of code are important, security is vital and cannot be omitted. In this blog, we will discuss how to transform the DevOps pipeline into a DevSecOps pipeline and the benefits enterprises can reap by making the transition.

At the heart of every successful transformation of the Software Development Life Cycle (SDLC) are the tools. These tools run at different stages of the SDLC and add value at each. While SAST, secret detection, and dependency scanning run through the create and build stages, DAST requires a running application and so applies from the test stage onward.

To provide an example, consider a pipeline with Jenkins as the CI/CD tool. For security assessment, possible open-source tools include Clair, OpenVAS, etc.

Static Application Security Testing (SAST) 

SAST works on static code and does not require finished or running software (unlike DAST). SAST identifies vulnerabilities and possible threats by analyzing the source code. It enforces coding best practices and security standards without executing the underlying code.
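In spirit, a SAST rule is a pattern matched against the parsed source, never the running program. Here is a toy illustration in Python (the rule set is our own invention for the example, not any particular tool's):

```python
import ast

# Toy static check in the spirit of SAST: walk the syntax tree of source
# code (without executing it) and flag calls to eval/exec, which are
# classic remote-code-injection sinks.
DANGEROUS_CALLS = {"eval", "exec"}

def find_dangerous_calls(source: str) -> list:
    """Return (line number, function name) for each flagged call."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                findings.append((node.lineno, node.func.id))
    return findings

snippet = "x = 1\nresult = eval(user_input)\n"
print(find_dangerous_calls(snippet))  # flags eval on line 2
```

Real SAST tools apply thousands of such rules, with data-flow analysis to cut false positives, but the principle of analyzing the code rather than running it is the same.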

It is easy to integrate SAST tools into the developer’s integrated development environment (IDE), such as Eclipse. Rules configured in the developer’s IDE – covering SQL injection, cross-site scripting (XSS), remote code injection, open redirect, the OWASP Top 10 – can help identify vulnerabilities and other issues early in the SDLC. In addition to IDE-based plugins, you can activate the SAST tool at the time of code commit. This allows collaboration as users review, comment, and iterate on the code changes.

We consider SonarQube, NodeJsScan, and GitGuardian to be among the best SAST tools for financial technology. Among the three, SonarQube has an undisputed advantage; it is considered the best automated code review tool on the market today. It has thousands of automated static code analysis rules that save time and enable efficiency. SonarQube also supports multiple languages, including a combination of modern and legacy languages, and it analyzes repository branches and reports to the tester directly in pull requests.

Other popular tools are Talisman and FindBugs. These help mitigate security threats; Talisman, for example, ensures that potential secrets and other sensitive information do not leave the developer’s workstation.

SAST tools must be trained or tuned (in configuration) to the use case. For optimal effectiveness, plan for a few iterations up front to remove false positives, irrelevant checks, etc., and move forward with zero high-severity issues.

Secret Detection

GitGuardian has revealed that it detected more than two million “secrets” in public GitHub repositories last year. 85% of the secrets were in the developers’ repositories which fell outside corporate control. Jeremy Thomas, the GitGuardian CEO, worries about the implications of the findings. He says, “what’s surprising is that a worrying number of these secrets leaked on developers’ personal public repositories are corporate secrets, not personal secrets.” 

Undoubtedly, the secrets that developers sometimes leave in their remote repositories are a significant security concern. API keys, database credentials, security certificates, passwords, etc., are sensitive information, and unintended access can cause untold damage.

Secret detection tools are ideal for resolving this issue. They prevent unintentional security lapses by scanning source code, logs, and other files to detect secrets left behind by the developer. GitGuardian is one of the best examples: it searches for evidence of secrets in developers’ repositories and stops hackers from using GitHub as a “backdoor to business.” From keys and database connection strings to SSL certificates, usernames, and passwords, GitGuardian covers 300 different types of secrets.
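A minimal sketch of how such a scanner works follows. The patterns below are illustrative only; production detectors such as GitGuardian’s are far more sophisticated and numerous.

```python
import re

# Illustrative patterns only -- real tools ship hundreds of tuned detectors.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_password": re.compile(r"password\s*=\s*['\"][^'\"]{8,}['\"]", re.IGNORECASE),
}

def find_secrets(text):
    """Return the names of secret types detected in the given text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

leaked = 'db_url = "postgres://app"\npassword = "s3cr3t-hunter2"\n'
print(find_secrets(leaked))  # ['generic_password']
```

Running a check like this over every file in a commit is the essence of commit-time secret detection.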

Organizations can also prevent leaks with vaults and pre-commit hooks.         

Vaults: Vaults are an alternative to using secrets directly in source code, making it unnecessary for developers to push secrets to the repository. Azure Key Vault, for example, can store keys and secrets and serve them whenever needed. Alternatively, Kubernetes Secrets can be used.
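The pattern a vault enables can be sketched as follows: source code refers to a secret only by name, and the value is injected at deploy time. Here the process environment stands in for a real vault client (such as the Azure Key Vault SDK), and the variable name is hypothetical.

```python
import os

def get_secret(name):
    """Fetch a secret by name (the environment stands in for a vault client).

    With a real vault, this call would go through the vault's SDK instead; the
    point is that no secret value ever appears in source control.
    """
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f"secret {name!r} is not configured")
    return value

# In practice the platform sets this at deploy time; we set it here for the demo.
os.environ["DB_PASSWORD"] = "injected-at-deploy-time"
print(get_secret("DB_PASSWORD"))
```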

Pre-Commit hooks: Secret detection tools can also be activated via pre-commit hooks, such as tools embedded in the developer’s IDE, to identify sensitive information like keys, passwords, tokens, and SSH keys before the code ever leaves the workstation.

Dependency Scanning 

When left-pad, a popular npm module (a tiny code shortcut), was unpublished by an irate developer, many software projects for Netflix, Spotify, and other titans were affected. The developer wanted revenge after being told he could not name one of his packages “Kik,” as it was the name of a social network. The absence of a few lines of code could have created a major catastrophe if action had not been taken in time. npm decided to restore the package and hand it to a new owner. Though this arguably violated the principles of intellectual property, it was necessary to end the crisis.

It is beyond doubt that if libraries and components are not kept up to date, vulnerabilities creep in. Failure to check dependencies can have a domino effect: if one card falls, others fall as well. Hence the need for clarity and focus, because “components, such as libraries, frameworks, and other software modules, run with the same privileges as the application. If a vulnerable component is exploited, such an attack can facilitate serious data loss or server takeover. Applications and APIs using components with known vulnerabilities may undermine application defenses and enable various attacks and impacts.”

Dependency scanning identifies security vulnerabilities in dependencies and is vital for instilling security in the SDLC. For example, if your application uses an external (open-source) library known to be vulnerable, tools like Snyk and WhiteSource Bolt can detect the vulnerability and help fix it.
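The core comparison inside a dependency scanner can be sketched as follows. The package names and advisory entries are invented for illustration; real tools pull advisories from feeds such as CVE/NVD and ecosystem databases.

```python
# Toy advisory database: package -> first fixed version (as a tuple).
# Entirely hypothetical entries for illustration.
ADVISORIES = {
    "examplelib": (1, 4, 2),
    "oldparser": (2, 0, 0),
}

def parse_version(v):
    """Turn '1.3.9' into (1, 3, 9) so versions compare numerically."""
    return tuple(int(part) for part in v.split("."))

def scan_dependencies(pinned):
    """Return packages pinned below the first fixed version in the advisories."""
    return [
        pkg for pkg, version in pinned.items()
        if pkg in ADVISORIES and parse_version(version) < ADVISORIES[pkg]
    ]

print(scan_dependencies({"examplelib": "1.3.9", "oldparser": "2.1.0"}))
# ['examplelib']
```

A real scanner adds transitive dependency resolution and richer version semantics, but the flag-if-below-fixed-version check is the same idea.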

Dynamic Application Security Testing (DAST) 

DAST helps find vulnerabilities in running applications. It assists in identifying common security bugs such as SQL injection, cross-site scripting, and other OWASP Top 10 issues, and it can detect runtime problems that static analysis misses, such as authentication and server configuration issues, as well as vulnerabilities that only become apparent when a known user logs in.

OWASP ZAP is a full-featured, free, and open-source DAST tool that includes automated vulnerability scanning as well as tools to aid expert manual web app pen-testing. ZAP can recognize and exploit a large number of vulnerabilities.
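One small, testable piece of what a DAST scan does is classifying responses to injected probes. The sketch below is illustrative and not taken from ZAP: a scanner would send a payload such as `' OR '1'='1` to each parameter and then flag any response that leaks a database error.

```python
# Signature strings that commonly appear in responses when injected input
# reaches a SQL backend. Illustrative list, not any tool's actual rule set.
SQL_ERROR_SIGNATURES = (
    "you have an error in your sql syntax",
    "unclosed quotation mark",
    "sqlite3.operationalerror",
)

def looks_sql_injectable(response_body):
    """Heuristically flag a response that leaked a database error."""
    body = response_body.lower()
    return any(sig in body for sig in SQL_ERROR_SIGNATURES)

print(looks_sql_injectable("500: You have an error in your SQL syntax near ''1'"))  # True
print(looks_sql_injectable("200: login failed"))  # False
```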

Interactive Application Security Testing (IAST) – Works best in the QA environment.  

Known as “grey box” testing, Interactive Application Security Testing (IAST) examines the entire application and has an advantage over DAST and SAST: it can be scaled. Normally, an agent inside the test runtime environment implements IAST (for example, by instrumenting the Java Virtual Machine [JVM] or the .NET CLR); it watches for operations or attacks and detects flaws.

Acunetix is a good example of an IAST tool.

Runtime Application Self Protection (RASP)

Runtime Application Self Protection (RASP) is server-side protection that activates when an application launches. Tracking attacks in real time, RASP shields the application from malicious requests or actions by monitoring application behavior. RASP detects and mitigates attacks automatically, providing runtime protection; issues are reported immediately after mitigation for root-cause analysis and fixes.

An example of a RASP tool is Sqreen. Sqreen defends against all OWASP Top 10 security bugs, including SQL injection, XSS, and SSRF. It is effective thanks to its ability to use request execution logic to block attacks with fewer false positives, and it adapts to your application’s unique stack, requiring no redeployment or configuration inside your software, which makes setup straightforward.
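To make the idea concrete, here is a deliberately naive, hypothetical sketch of RASP-style filtering in Python: a request payload is inspected at runtime and blocked before the application logic runs. Real products like Sqreen analyze request execution logic rather than raw strings, so treat this only as an illustration of the interception point.

```python
import re

# Illustrative block list -- real RASP uses execution context, not just strings.
BLOCK_PATTERNS = [
    re.compile(r"(?i)<script\b"),           # reflected XSS attempt
    re.compile(r"(?i)\bunion\s+select\b"),  # classic SQL injection probe
]

def guard_request(payload):
    """Raise on suspicious payloads; pass clean ones through to the app."""
    for pattern in BLOCK_PATTERNS:
        if pattern.search(payload):
            raise PermissionError("request blocked by runtime self-protection")
    return payload

print(guard_request("name=alice"))  # passes through unchanged
```

In a real deployment, a check like this sits inside the request pipeline (middleware or instrumentation), so blocking happens without redeploying the application code.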

Infrastructure Scan  

These scans are performed on production and similar environments. They look for all possible vulnerabilities – running software, open ports, SSL configurations, etc. – to keep abreast of the latest vulnerabilities discovered and reported worldwide. Periodic scans are essential. Scanning tools use vulnerability databases such as Common Vulnerabilities and Exposures (CVE) and the U.S. National Vulnerability Database (NVD) to stay up to date. OpenVAS, Nessus, etc., are excellent infrastructure scanning tools.

With containers gaining popularity, container-specific tools are gaining prominence. Clair, for example, is a powerful open-source tool that scans containers and Docker images for potential security threats.

Cultural aspect 

Organizations must change culturally and ensure that developers and security analysts are on the same page. Certain tools empower developers and ensure that they play a critical role in instilling security. SAST in the DevSecOps pipeline, for example, equips developers with security knowledge and helps them catch bugs they might otherwise have missed.

Kunchala acknowledges that organizations that have defense built into their culture face less friction in handling application security than others. A cultural change is therefore as important as the technology.

Conclusion: Security cannot be ignored; it cannot be an afterthought

No single tool is perfect, no single tool can address every vulnerability, and no single tool applies to every stage of the SDLC. Tools must be chosen according to the stage of product development. For example, if a product is at the “functionality-ready” stage, it is advisable to focus on tools like IAST and RASP, though the cost of fixing issues at this stage will be high.

Hence the need to weave security into all stages of the SDLC. Care must also be taken to ensure that the tools complement each other, that there is no noise in communication, and that management and the security/development teams are in tandem on critical decisions.

This brings us to another key aspect for organizations keen on incorporating robust security practices: resources. Resource availability, and the value those resources add during the different stages of the SDLC, must be weighed against the investment costs.

The DevOps team at MagicFinserv works closely with the development and business teams to understand the risks and priorities. We are committed to furthering the goal of continuous security while ensuring stability, agility, efficiency, and cost savings.

To explore DevSecOps for your organization, please write to us at mail@magicfinserv.com.

Enterprise-level distributed/decentralized applications have become an integral part of any organization today and are designed and developed to be fault-tolerant to ensure availability and operability. However, despite the time and effort invested in creating a fault-tolerant application, no one can be 100% sure that it will bounce back with the desired nimbleness in the event of a failure. As the nature of failure can differ each time, developers have to design for all kinds of anticipated failure scenarios. From a broader perspective, failures can be of any of the four types mentioned below:

  1. Failure Type 1: Network Level Failures
  2. Failure Type 2: Infrastructure (System or Hardware Level) Failures
  3. Failure Type 3: Application Level Failure
  4. Failure Type 4: Component Level Failures

Resiliency Testing – Defining the 3-step process: 

Resiliency testing is critical for ensuring that applications perform as desired in real-life environments. Testing an application’s resiliency is also essential for ensuring quick recovery when unforeseen challenges arise.

The developer’s aim here is to build a robust application that can rebound with agility from all probable failures. Yet, due to the complex nature of such applications, unseen failures keep coming up in production. It has therefore become paramount for testers to continually verify the developed logic and establish the system’s resiliency against all such real-time failures.

Possible ways for testers to emulate real-time failures and check how resilient an application is against them

Resiliency testing is the methodology that helps mimic/emulate the various kinds of failures defined earlier. Before defining a resiliency-testing strategy for distributed and decentralized applications, the developer/tester determines a generic process for each identified failure.

Based on our experience with multiple customer engagements for resiliency testing, the following 3-step process must be followed before defining a resiliency strategy.

  1. Step 1: Identify all components, services, and any third-party libraries, tools, or utilities.
  2. Step 2: Identify the intended functionality of each component, service, library, tool, or utility.
  3. Step 3: Map the upstream and downstream interfaces and the expected results for each function and integration as per the acceptance criteria.
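The inventory produced by these three steps can be sketched as a simple data structure, which also makes it easy to spot coverage gaps later. The component names, functions, and interfaces below are hypothetical.

```python
# Hypothetical inventory from the 3-step process: each entry records a
# component, its intended functionality, and its upstream/downstream interfaces.
inventory = {
    "p2p-gateway": {
        "function": "routes transactions between nodes",
        "upstream": ["client-api"],
        "downstream": ["consensus", "ledger-db"],
    },
    "ledger-db": {
        "function": "persists committed transactions",
        "upstream": ["consensus"],
        "downstream": [],
    },
}

def coverage_gaps(inventory, tested):
    """Components that have no resiliency scenario mapped to them yet."""
    return sorted(set(inventory) - set(tested))

print(coverage_gaps(inventory, tested={"p2p-gateway"}))  # ['ledger-db']
```

Reconciling the inventory against the list of tested components, as above, is how a tester confirms nothing has been left untouched.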

As per the defined process, the tester has to collect all functional/non-functional requirements and acceptance criteria for the four failure types mentioned earlier. Once all the information is collected, it should be mapped with the 3-step process to lay down what is to be verified for each component/service. After mapping each failure using the 3-step process, we are ready to define a testing strategy and automate it to achieve accuracy while reducing execution time.

In our previous blog, we outlined the four ways to set up distributed/decentralized networks for the testing environment, explained the advantages and disadvantages of each approach, and described why we prefer to test such applications first in a containerized setup, followed by a cloud environment, over virtual machines, and then a physical device-based setup.

To know more about our Blockchain Testing solutions, read here

Three modes of Resiliency testing 

Each mode needs to be executed with controlled and uncontrolled wait times. 

Mode1: Controlled execution for forcefully restarting components/services

Execution of component restarts can be sequenced with a defined expected outcome. Generally, we run successful and failed transactions through the system and then verify that those transactions are reflected correctly in the overall system behavior. If possible, we can also assert the individual component/service responses for each transaction, based on the intended functionality of the restarted component/service. This kind of execution can be done with:

  • The defined fixed wait time duration for restarting
  • Randomly selecting the wait time interval.
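A minimal sketch of such a controlled restart schedule follows. The component names are hypothetical, and a real harness would invoke the orchestrator (e.g., a `docker restart`) where this sketch merely builds the plan.

```python
import random

def build_schedule(components, wait_seconds=None, seed=None):
    """Pair each component with a fixed or randomly chosen wait (in seconds)."""
    rng = random.Random(seed)
    return [
        (name, wait_seconds if wait_seconds is not None else rng.randint(1, 30))
        for name in components
    ]

# Fixed 10-second wait between restarts, in a defined sequence:
print(build_schedule(["consensus", "gateway", "ledger-db"], wait_seconds=10))
# Randomly selected wait intervals (seeded here only for repeatability):
print(build_schedule(["consensus", "gateway", "ledger-db"], seed=7))
```

Shuffling the component list before building the schedule turns this same sketch into the uncontrolled (Mode2) variant.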

Mode2: Uncontrolled execution (randomization for choosing component/service) for forcefully restarting components/services

Execution of a component restart can be triggered for a randomly selected component/service, with a defined expected outcome. Generally, we run successful and failed transactions through the system and then verify that those transactions are reflected correctly in the overall system behavior. If possible, we can also assert the individual component/service responses for each transaction, based on the intended functionality of the restarted component/service. This kind of execution can be done with:

  • The defined fixed wait time duration for restarting
  • Randomly selecting a wait time interval.

Mode3: Uncontrolled execution (randomization for choosing multiple components/services) for forcefully restarting components/services

Though this kind of test is the most realistic to perform, it involves a lot of complexity, depending on how the components/services are designed. If there are too many components/services, the number of test-scenario combinations increases exponentially. The tester should therefore design tests with the help of the system/application architecture, grouping components/services so that each group represents an entity within the system. Mode1 and Mode2 can then be executed for those groups.

Types of Failures

Network Level Failures

As distributed/decentralized applications use peer-to-peer networking to establish connections among nodes, we need details for each specific component/service on how it can be restarted, and we need to know how to verify the system’s behavior during the downtime and the subsequent restart. Let’s assume each node has one container responsible for setting up communication with the other available nodes; then the following verifications can be performed –

  1. During downtime, other nodes are not able to communicate with the down node.
  2. No cascading effect of the down node occurs to the rest of the nodes within the network.
  3. After restart and initialization of restarted component/service, other nodes can establish communication with the down node, and the down node can also process the transaction.
  4. The down node can also interact with other nodes within the system and route the transaction as expected.
  5. Data consistency can be verified.
  6. The system’s latency can also be captured before/after the restart to ensure no performance degradation is introduced to the system.
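The second verification above (no cascading effect) can be automated with a small graph check. In the sketch below, node names and the ring topology are hypothetical; the check confirms that the surviving nodes still form a connected network when one node is down.

```python
def still_connected(nodes, links, down):
    """True if all nodes other than `down` can still reach one another."""
    alive = set(nodes) - {down}
    if not alive:
        return True
    # Keep only links that do not touch the down node.
    live_links = [(a, b) for a, b in links if down not in (a, b)]
    # Depth-first traversal from an arbitrary surviving node.
    seen, stack = set(), [sorted(alive)[0]]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for a, b in live_links:
            if node == a:
                stack.append(b)
            elif node == b:
                stack.append(a)
    return seen == alive

ring = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n4", "n1")]
print(still_connected(["n1", "n2", "n3", "n4"], ring, down="n2"))  # True
```

The same helper, fed the real node topology, can be asserted after every forced restart in Modes 1-3.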

Infrastructure (System or Hardware Level) Failures

As the entire network runs using containerized techniques, we can emulate infrastructure failures with multiple strategies, such as:

  1. Taking the containerized application down or, if Docker is used, taking the Docker daemon process down.
  2. Imposing resource limits for memory, CPUs, etc., so low at the container level that they are quickly exhausted even under a mild load on the system.
  3. Overloading the system with a high number of transactions of various data sizes.

For each failure described above, we can verify whether the system as a whole still meets all functional and non-functional requirements.

Application Level Failure

Since a distributed application uses many containers, here we target stopping and starting only the specific container that holds the application logic. The critical aspect of restarting application containers is the timing of the stop and start relative to transaction processing. The three time-dependent stages for container stop/start are:

  1. Stage1: Stop the container before sending a transaction.
  2. Stage2: Stop the container after sending a transaction, with different time intervals, e.g., stopping the container immediately, after 1000 milliseconds, after 10 seconds, etc.
  3. Stage3: Stop the container while a transaction is being processed.

System behavior can be captured and asserted against functional and non-functional acceptance criteria for all the above three stages.

Component Level Failures

The tester should verify the remaining containers across all three modes and the three time-dependent stages. We can create as many scenarios for these containers as needed, depending on the following factors:

  1. The dependency of remaining containers on other critical containers.
  2. Intended functionality of the container and frequency of usage of those containers in most frequently used transactions.
  3. Stop and start for various time intervals (include all three stages to have more scenarios to target any fragile situation).
  4. The most fragile, unstable, or most frequently reported errors within the remaining containers.

While following the strategy defined above, the tester should always reconcile against the application under test to check whether any areas are still left uncovered. If any component, service, or third-party module, tool, or utility is untouched, we can design scenarios for it by combining the following factors:

  1. Testing modes
  2. Time interval stages 
  3. Execution mode, e.g., sequential and randomization of restarts
  4. Grouping of containers for stopping and restarting

Based on our defined approach and its implementation for multiple customers, we have prevented almost 60-70% of real-time resiliency-related issues. We also keep revising and upgrading the approach based on new experiences with new types of complicated distributed or decentralized applications and new failures, so that we can prevent real-time issues at an even more comprehensive level. To explore resiliency testing for your decentralized applications, please write to us at mail@magicfinserv.com.
