Resiliency testing for decentralized applications with containerized application

Enterprise-level distributed/decentralized applications have become an integral part of any organization today and are being designed and developed to be fault-tolerant to ensure availability and operability.  However, despite the time and efforts invested in creating a fault-tolerant application,  no one can be 100 %  sure that the application will be able to bounce back with the nimbleness desired in the event of failures. As the nature of failure can differ each time, developers have to design, considering all kinds of anticipated failures/scenarios. From a broader perspective, the failures can any of the four types  mentioned below: 

  1. Failure Type1: Network Level Failures
  2. Failure Type2: Infrastructure (System or Hardware Level) Failures
  3. Failure Type3: Application Level Failure
  4. Failure Type4: Component Level Failures 

Resiliency Testing – Defining the 3-step process: 

Resiliency testing is critical for ensuring that applications perform as desired in real-life environments. Testing an application’s resiliency is also essential for ensuring quick recovery in the event of unforeseen challenges arising.      

Here, the developer’s need is to develop a robust application that can rebound with agility for all probable failures. Due to the complex nature of the application, there are still unseen failures that keep coming up in production. Therefore, it has become paramount for testers to continually verify the developed logic to define the system’s resiliency for all such real-time failures. 

Possible ways for testers to emulate real-time failures and check how resilient an application is against such failures

Resiliency testing is the methodology that helps to mimic/emulate various kinds of failures, as defined earlier. The developer/tester determines a generic process for each failure identified earlier before defining a strategy for resiliency testing for distributed and decentralized applications. 

Based on our experience with multiple customer engagements for resiliency testing, the following  3-Step process must be followed before defining a resiliency strategy.

  1. Step-1: Identification of all components/services/any sort of third-party library or tool or utility. 
  2. Step-2: Identification of intended functionality for each components/ services/ library/ tool/ utility.
  3. Step-3: Build an upstream and downstream interface and expected result to function and integration as per acceptance criteria.

As per the defined process, the tester has to collect all functional/non-functional requirements and acceptance criteria for all the four failure types mentioned earlier. Once all information gets collected, it should be mapped with the 3-step process to lay down what is to be verified for each component/service. After mapping for each failure using the 3-Step process, we are ready to define a testing strategy and automate the same to achieve accuracy while reducing execution time. 

We elicited the four ways to define distributed/decentralized networks for the testing environment in our previous blog. This blog explains the advantages/disadvantages of each approach to set up applications in a test environment. It also describes why we prefer to first test such an application with containerized application followed by the cloud environment over virtual machines and then a physical device-based setup. 

To know more about our Blockchain Testing solutions, read here

Three modes of Resiliency testing 

Each mode needs to be executed with controlled and uncontrolled wait times. 

Mode1: Controlled execution for forcefully restarting components/services

Execution of “components restart” can be sequenced with a defined expected outcome. Generally, we flow successful and failed transactions followed by ensuring reflection of transactions on the system from overall system behavior. If possible, then we can assert the individual components/services response for the flowed transaction based on the intended functionality of restarted component/service. This kind of execution can be done with: 

  • The defined fixed wait time duration for restarting
  • Randomly selecting the wait time interval.
Mode2: Uncontrolled execution (randomization for choosing component/service) for forcefully restarting components/services

Execution of a component restart can be selected randomly with a defined outcome. Generally, we flow successful and failed transactions, followed by ensuring reflection of the transaction on the system from overall system behavior. If possible, then we can assert the individual components/services response for the flowed transaction based on the intended functionality of restarted component/service. This kind of execution can be done with: 

  • The defined fixed wait time duration for restarting
  • Randomly selecting a wait time interval.
Mode3: Uncontrolled execution (randomization for choosing multiple components/services) for forcefully restarting components/services

Though this kind of test is more realistic to be performed, it has a lot of complexity based on how the components/services are designed. If the number of components/services is too many, then the combination of test scenarios will increase exponentially. So the tester should create the test with the assistance of system/application architecture to make the group of components/services represent the entity within the system. Then Mode1 & Mode2 can be executed for those groups. 

Types of Failures

Network Level Failures

As distributed/decentralized application uses peer-to-peer networking to establish a connection among the nodes,  we need to get specific component/service detail on how it can be restarted. We also need to know how to verify the behavior during downtime and restarting the same. Let’s assume the system has one container within each node that is responsible for setting up communication with other available nodes; then the following verification can be performed – 

  1. During downtime, other nodes are not able to communicate with the down node.
  2. No cascading effect of the down node occurs to the rest of the nodes within the network.
  3. After restart and initialization of restarted component/service, other nodes can establish communication with the down node, and the down node can also process the transaction.
  4. The down node can also interact with other nodes within the system and route the transaction as expected.
  5. Data consistency can be verified.
  6. Thestem’s latency can also be captured before/after the restart to ensure performance degradation is introduced to the system.
Infrastructure (System or Hardware Level) Failures

As the entire network is being run through containerized techniques so to emulate infrastructure failure, we can use multiple strategies like: 

  1. By taking containerized application down or if Docker is being used, then taking docker daemon process down.
  2. By imposing a resource limit for memory, CPUs, etc., so low at the container level that it quickly gets exhausted with a mild load on the system.
  3. Overload the system with a high number of transactions with various sizes of data generated by the transaction.

We can verify if the system as a whole is meeting all functional and non-functional requirements with each failure described above.

Application Level Failure

As a distributed application uses a lot of containers to run the application, so we only target to stop and start a specific container having the application logic. The critical aspect for restarting application containers is the timing of stopping and starting the container to keep track of transaction processing. Three time-dependent stages for an application related to container stop and start are:

  1. Stage1: Stop the container before sending a transaction.
  2. Stage2: Stop the container after sending a transaction with different time intervals, e.g., stopping the container immediately, after 1000 milliseconds, 10 seconds, etc.
  3. Stage3: Stop the container when a transaction is in a processing stage.

System behavior can be captured and asserted against functional and non-functional acceptance criteria for all the above three stages.

Component Level Failures

The tester should verify the remaining containers for all three modes with three different stages with respect to time. We can create as many scenarios for these containers depending upon the following factors :

  1. The dependency of remaining containers on other critical containers.
  2. Intended functionality of the container and frequency of usage of those containers in most frequently used transactions.
  3. Stop and start for various time intervals (include all three stages to have more scenarios to target any fragile situation).
  4. Most fragile or unstable or mostly reported error within remaining containers.

By following the above-defined strategy for resiliency, the tester should always reconcile the application under test to check whether any areas are still left to be covered or not. If there is any component/service/third-party module or tool or utility that is untouched, then we can design scenarios by combining the following factors: 

  1. Testing modes
  2. Time interval stages 
  3. Execution mode, e.g., sequential and randomization of restarts
  4. Grouping of containers for stopping and restarting

Based on our defined approach followed by the implementation for multiple customers, we have prevented almost 60-70% of real-time issues related to resiliency. We also keep revising and upgrading our approach based on new experiences with new types of complicated distributed or decentralized applications and new failures so we can increase the prevention of real-time issues at a comprehensive level. To explore resiliency testing for your decentralized applications, please write to us at mail@magicfinserv.com.

DevSecOps: When DevOps’ Agile Meets Continuous Security

The business landscape today is extremely unpredictable. The number of applications that are hosted on disparate cloud environments or on-prem has proliferated exponentially, and hence there is a growing need for swifter detection of discrepancies (compliance and security-related) in the IT infrastructure. Continuous security during the development and deployment of software is critical as there is no forewarning when and where a breach could happen. As organizations evolve, there is always a need for greater adherence to security and compliance measures.

Earlier, software updates were fewer. Security, then, was not a pressing concern and it was standard to conduct security checks late in the software development lifecycle. However, the times have changed. Frequent software updates imply that codes are changed frequently as well. In turn, this poses unimaginable risks (if care is not taken) and as there are changes in attack surfaces and risk profiles. So, can organizations afford to be slack about security? 

The answer is no. Security is not optional anymore, it is a fundamental requirement and must be ingrained at the granular level and hence the concept of continuous security. To arrest any flaw or breach or inconsistency in design (before it too late). Organizations must check different aspects of security periodically. Whether the check happens after a predefined time or in real-time depends upon the need of the business. Security checks can be manual or automated; it can be a review of configuration parameters on one hand and constant activity monitoring on the other.  

Defining Continuous Security 

Constant activity monitoring became de facto with the rise of parameter security. And when that happened, operations started using systems like IDS, IPS, WAF, and real-time threat detection systems. But this kind of security approach tended to take account of security monitoring involving operations or infrastructure teams. The continuous security paradigm made it possible for organizations to ensure greater levels of security. The continuous security model relies on organizational processes, approvals, and periodic manual checks to monitor the different kinds of hardware and software involved in operations.

Why DevSecOps 

“In 2018, Panera Bread confirmed to Fox News that it had resolved a data breach. However, by then it was too late as the personal information including name, email, last four digits of customer credit card number had been leaked through the website. Interestingly, Panera Bread was first alerted to the issue by security researcher Dylan Houlihan. According to KrebsOnSecurity 37 million accounts were likely to be impacted.” 

As organizations realized the importance of continuous security, the need for making it an extension of the DevOps process arose. Organizations desiring streamlined operations adopt DevOps as a means to shorten the systems development life cycle and ensure continuous delivery with high software quality.  

As DevOps, Cloud, and Virtualization gained prominence, agility and flexibility became the new axioms of development. But existing security and compliance processes that involved multiple levels of stakeholder engagement, and associated manual checks and approvals were time-consuming and tedious. A barrier to the development of a truly nimble enterprise.

We also know that as the number of people involved (stakeholders) increases, it takes greater effort to keep the business streamlined and agile. Despite that, stakeholders are integral to the DevOps process as they are responsible for the speed of delivery and quality of the application. Another barrier arises as a result of the bias and error inherent in manual security and compliance checks.    

Businesses must give due consideration to security best practices while ensuring the speed of delivery, flexibility, and agility as continuous changes in software during  DevDops are risky. But when security is integrated into DevOps’s continuous delivery loop, the security risks are minimized significantly. And so the natural extension of the concept of DevOps to DevSecOps. In the scheme of things, DevSecOps is where agile and continuous security meet.  

Ingraining Continuous Security in DevOps

While earlier, security was incorporated at the end of the software development lifecycle through manual/automated reviews, DevSecOps ensures that changes are incorporated at every stage. In doing so, loopholes that exist in code are revealed early. A quick reconciliation or remediation ensures better lead times and delivery outcomes.

Traditionally, instead of running security and compliance checks in parallel, security was taken care of after the application life cycle was complete. Though in recent years, developers have taken to writing safe code and following security best practices for developing applications, even today enterprises have not assimilated security in the continuous delivery process., Security assessments, PEN testing, vulnerability assessment, etc., are not covered in the DevOps cycle. As a result, the objective of “software, safer, sooner” is not achieved.     

SecDevOPs’ biggest asset is its inclusivity. It addresses security at every layer. All stakeholders are involved as well at the very beginning of the application’s lifecycle. It is a continuous process. Here, the security teams use all the tools and automation done by DevOps in conjunction with security teams.

Advantage of DevSecOps

DevSecOps Security is Built-In

DevSecOps runs on a very simple premise. Ensuring application and infrastructure security from the very beginning. Automating security tools and processes is integral to this approach as it is dependent on the speed of delivery that takes a hit whenever repeated or recurring low-complexity tasks are allocated to manual labor. Security scans and audits are onerous and time-consuming if done manually. 

However effective the DevOps team may be with automation and tools, its success depends upon integrating the work of security and audit teams within the development lifecycle. The sooner done, the better. As data breaches become common and the costs of remediating them are exorbitant, it becomes crucial to employ security experts at every stage of the software development life cycle instead of relegating them to gatekeeping activity.        

“DevSecOps is security within the app life cycle. Security is addressed at every stage”

DevSecOps Solution to Compliance Concern

With more access comes a greater threat. As applications moved to the cloud and DevOps became the much-sought means for streamlining operations, there were concerns about breaches. As third-party vendors were accessible to many of the internal processes, it became necessary to delineate access and ensure greater compliance. With the DevSecOps approach, all the fears were repudiated. It was evident that DevOps had no adverse effect. Instead, it ensured compliance. It is now more important to focus on how DevOps is implemented. How to balance automation of compliance adherence with minimal disruption to the business.  

Seven Salient Features of the DevSecOps Approach 

    Promote the philosophy “Security is everyone’s concern”

Develop security capability within teams and work with domain experts. Security teams work with DevOps to automate the security process. DevSecOps operatives work with security teams and integrate security as part of the delivery pipeline. Development teams and testing teams are trained on security so that they can focus on security to be as important as functionality.

❖     Address security bugs early.

Find and fix security bugs and vulnerabilities as early as possible in the Software Development Lifecycle (SDLC). This is done by automated scans and automated security testing, integrated with CI/CD pipeline. This requires a shift left approach in the delivery pipeline – the development and testing teams fix the issues as soon as it arises and then moves onto the next stage of the cycle. Right after addressing the concern. 

❖     Integrate all security software centrally

Integrating all security software (which includes code analysis tools, automated tests, vulnerability scans, etc.,) at a central location – accessible to all stakeholders. Since it is not viable to address multiple concerns at the same time. As it is a bit too much work in the early stages of the project, teams must prioritize. Priority must be accorded based on potential threats and known exploits. Doing this would help utilize the test results more effectively. 

❖     Continuously measure and shrink the attack surface.

Going beyond perimeter security by implementing continuous vulnerability scans and automated security tests minimizes the attack surface. Issues and threats are addressed before they can be exploited.

❖      Automation to reduce effort and increase accuracy.  

Agility and accuracy in security risk mitigation are dependent on the ability of the DevOps team to automate. This reduces the manual effort and associated errors that arise due to ingrained bias and other factors. The choice of tools used by the team is important as it should support automation. For obvious reasons, organizations prefer open-source tools as they are flexible and can be modified.  

  ❖    Automation in change management 

The push for automation has resulted in teams (involved application development and deployment) defining a set of rules for decision making. Increased availability of automation tools and machine learning gave greater impetus to change management automation. Only exceptional cases require manual intervention, thus decreasing the turnaround time.

❖     Ensures 24 x 7 compliance and reporting 

Compliance no longer remains a manual and cumbersome work to be done at certain times in the software life cycle. DevSecOps enables using automation to monitor compliance continuously and alert when the possible risk of breach happens. Compliance reporting often considered as an overhead, and time-intensive activity is now readily available. Thus, a system can be in a constant state of compliance.

DevSecOps – ensuring agility and security

The ever-increasing complexity in multi-cloud and on-premise and the highly distributed nature of DevOps operations (teams are spread across different zones) are driving organizations to ensure that continuous security is one of the pillars of the operational processes. In the evolving business landscape in the COVID-19 era, DevSecOps drives a culture of change. One, where security is no longer a standalone function and security teams work in tandem with development and testing teams to ensure that continuous deployment meets continuous security.     

As a leading technology company for financial services, Magic FinServ enables clients to scale to the next level of growth at optimal costs while ensuring adherence to security and compliance standards. Partnering with clients, in their application development and deployment journey, we establish secure practices from Day 0 to implement SecDevOps practices. From continuous feedback loops to regular code audits, all are performed in a standardized manner to ensure consistency. 

To explore DevSecOps for your organization, please write to us at mail@magicfinserv.com.

Banks and FinTechs: Enriching CX with Frictionless KYC, using AI.

A Forrester Report suggests that by 2030, banking would be invisible, connected, insights-driven, and purposeful. ‘Trust’ will be key for building the industry in the future.  

But how do banks and FinTechs enable an excellent customer experience (CX) that translates into “trust” when the onboarding experience itself is time-consuming and prone to error. The disengagement is clear from industry reports. 85% of corporates complained that the KYC experience was poor. Worse, 12% of corporate customers changed banks due to the “poor” customer experience.

Losing a customer is disastrous because the investment and effort that goes into the process are immense. Both KYC and Customer Lifecycle Management (CLM) are expensive and time-consuming. Banks could employ hundreds of staff for a high-risk client for procuring, analyzing, and validating documents. Thomson Reuters reports that, on average, banks use 307 employees for KYC. They spend $40 million (on average) to onboard new clients. When a customer defects due to poor customer engagement, it is a double whammy for the bank. It loses a client and has to work harder to cover the costs of the investment made. Industry reports indicate that new customer acquisition is five times costly than retaining an existing one. 

The same scenario is applicable for financial companies, which must be very careful about who they take in as clients. As a result, FinTechs struggle with greater demand for customer-centricity while fending competition from challengers. By investing in digital transformation initiatives like digital KYC, many challenger banks and FinTechs deliver exceptional CX outcomes and gain a foothold. 

Today Commercial Banks and FinTechs cannot afford to overlook regulatory measures, anti-terrorism, anti-money laundering (AML) standards, and legislation, violations of which would incur hefty fines and lead to reputational damage. The essence of KYC is to create a robust, transparent, and up-to-date profile of the customer. Banks and FinTechs investigate the source of their wealth, ownership of accounts, and how they manage their assets. Scandals like Wirecard have a domino effect, and so banks must flag off inconsistencies in real-time. As a result, banks and FinTechs have teamed up with digital transformation partners and are using emerging technologies AI, ML, and NLP to make their operations frictionless and customer-centric. 

Decoding existing paint-points and examining the need for a comprehensive data extraction tool to facilitate seamless KYC

Long time-to-revenue results in poor CX

Customer disengagement in the financial sector is common. Every year, financial companies lose revenue due to poor CX. Here the prime culprit for customer dissatisfaction is the prolonged time-to-revenue. High-risk clients average 90-120 days for KYC and onboarding. 

The two pain points are – poor data management and traditional methods for extracting data from documents (predominantly manual). Banking c-suite executives concede that poor data management arising due to silos and centralized architecture is responsible for high time-to-revenue.  

The rise of exhaust data 

Traditionally, KYC involved checks on data sources such as ownership documents, stakeholder documents, and the social security/ identity checks of every corporate employee. But today, the KYC/investigation is incomplete without verification of exhaust data. And in the evolving business landscape, it is exigent that FinTech and banks take exhaust data into account. 

Emerging technologies like AI, ML, and NLP make onboarding and Client Lifecycle Management (CLM) transparent and robust. With an end-to-end CLM solution, banks and FinTech can benefit from an API-first ecosystem that supports a managed-by-exception approach. An API-first ecosystem that supports an exception management approach is ideal for medium to low-risk clients. Data management tools that can extract data from complex documents and read like humans elevate the CX and save banks precious time and money. 

Sheer volume of paperwork prolongs onboarding. 

The amount of paperwork accompanying the onboarding and KYC process is humongous. When it comes to business or institutional accounts, banks must verify every person’s existence on the payroll. Apart from social security and identity checks, ultimate beneficial owners (UBO), and politically exposed persons (PEP), banks would have to cross-examine documents related to the organization’s structure. Verifying the ownership of the organization and the beneficiaries’ check adds to the complexity. After that, corroborating data with media checks and undertaking corporate analysis to develop a risk profile. With this kind of paperwork involved, KYC could take days. 

However, as this is a low-complexity task, it is profitable to invest in AI. Instead of employing teams to extract and verify data, banks and FinTechs can use data extraction and comprehension tools (powered with AI and enabled with machine learning) to accelerate paperwork processes. These tools digitize documents and extract data from structured and unstructured documents, and as the tool evolves with time, it detects and learns from document patterns. ML and NLP have that advantage over legacy systems – learning from iterations.   

Walking the tightrope (between compliance and quick TOI)

Over the years, the kind of regulatory framework that America has adopted to mitigate financial crimes has become highly complex. There are multiple checks at multiple levels, and enterprise-wide compliance is desired. Running a KYC engages both back and front office operations. With changing regulations, Banks and FinTechs must ensure that KYC policies and processes are up-to-date. Ensuring that customers meet their KYC obligations across jurisdictions is time-consuming and prolonged if done manually. Hence, an AI-enabled tool is needed to speed up processes and provide a 360-degree view and assess the risk exposure. 

In 2001, the Patriot Act came into existence to counter terrorist and money laundering activities. KYC became mandatory. In 2018, the U.S. Financial Crimes Enforcement Network (FinCEN) incorporated a new requirement for banks. They had to verify the “identity of natural persons of legal entity customers who own, control, and profit from companies when those organizations open accounts.” Hefty fines are levied if banks fail to execute due diligence as mandated.

If they are to rely on manual efforts alone, banks and FinTechs will find it challenging to ensure CX and quick time-to-revenue while adhering to regulations. To accelerate the pace of operations, they need tools that can parse through data with greater accuracy and reliance than the human brain. And also can learn from processes.  

No time for perpetual KYC as banks struggle with basic KYC

For most low and medium-risk customers, a straight-through-processing (STF) of data would be ideal. It reduces errors and time to revenue. Client Lifecycle Management is essential in today’s business environment as it involves ensuring customers are compliant through all stages and events in their lifecycle with their financial institution. That would include raking through exhaust data and traditional data from time to time to identify gaps. 

A powerful document extraction and comprehension tool is therefore no longer an option but a prime requirement.  

Document extraction and comprehension tool: how it works 

Document digitization: IDP begins with document digitization. Documents that are not in digital format are scanned. 

OCR: Next step is to read the text. OCR does the job. Many organizations use multiple OCRS for accuracy. 

NLP: Recognition of text follows the reading of the text. With NLP, words, sentences, and paragraphs are provided a meaning. NLP uses sentiment analysis, part of speech tagging, and making it easier to draw a relation. 

Classification of documents: Manual categorization of documents is another lengthy process that is tackled by IDP’s classification engine. Here machine learning (ML) tools are employed to recognize the kinds of documents and feed them to the system.  

Extraction: The penultimate step in IDP is data extraction. It consists of labeling all expected information within a document and extracting specific data elements like dates, names, numbers, etc.

Data Validation: Once the data has been extracted, it is combined and pre-defined validation rules based on AI check for accuracy and flag off errors, improving the quality of extracted data.     

Integration/Release: Once the data has been validated/checked, the documents and images are exported to business processes or workflows. 

The future is automation!

The future is automation. An enriched customer experience begins with automation. To win customer trust, commercial banks and FinTechs must ensure regulation compliance, improve CX, reduce the costs by incorporating AI and ML and ensure a swifter onboarding process. In the future, banks and FinTechs that improvise their digital transformation initiatives and enable faster and smoother onboarding and customer lifecycle management will facilitate deeper customer engagement. They would have gained an edge. Others would struggle in an unrelenting business landscape.

True, there is no single standard for KYC in the banking and FinTech industry. The industry is as vast as the number of players. There are challengers/start-ups and decades-old financial institutions that coexist. However, there is no question that data-driven KYC powered by AI, ML brings greater efficiency and drives customer satisfaction. 

A tool like Magic DeepSight™ is a one-stop solution for comprehensive data extraction, transformation, and delivery from a wide range of unstructured data sources. Going beyond data extraction, Magic DeepSight™ leverages AI, ML, and NLP technologies to drive exceptional results for banks and FinTechs. It is a complete solution as it integrates with other technologies such as API, RPA, smart contract, etc., to ensure frictionless KYC and onboarding. That is what the millennial banks and FinTechs need.  

Assessing your firm’s Cloud Maturity

Burdened by silos and big and bulky infrastructure, the financial services sector seeks a change that brings agility and competitiveness. Even smaller financial firms are dictated by a need to cut costs and stand out. 

“The widespread, sudden disruptions caused by the COVID situation have highlighted the value of having as agile and adaptable a cloud infrastructure as you can — especially as we see companies around the world expedite investments in the cloud to enable faster change in moments of uncertainty and disruption like we faced in 2020.” Daniel Newman 

Embracing cloud in 2021

The pandemic has been the meanest disrupter of the decade. Many banks went into crisis mode and were forced to rethink their options and scale up to ensure greater levels of digital transformation. How quickly these were able to scale up to meet the customer’s demands became a critical asset in the new normal. 

With technology stacks evolving at lightning speeds and application architecture replaced with private, public, hybrid, or multi-cloud, the financial services sector can no longer resist the lure of the cloud.  Cloud has become synonymous with efficiency, customer-centricity, and scalability.  

Moreover, most financial institutions have realized that the ROI for investment in the cloud is phenomenal. The returns that a financial firm may get in 5 years are enormous. As a result, financial firms’ investment in the cloud market is expected to grow at a CAGR of 24.4% to $29.47 billion by 2021. The critical levers for this phenomenal growth would be business agility, market focus, and customer management.               

Unfortunately, while cloud adoption seems inevitable, many financial industry businesses are still grappling with the idea and wondering how to go about it efficiently. The smaller firms are relative newcomers in terms of cloud adoption. The industry had been so heavily regulated that privacy and fear of data leaks almost prevented the financial institutions from moving to the cloud. The most significant need is trust and reliability as migration to the cloud involves transferring highly secure and protected data. Therefore, the firms need a partner with expertise in the financial services industry to securely envision a transition to the cloud in the most seamless manner possible.  

Identifying your organization’s cloud maturity level     

The first step towards an efficient move to the cloud is identifying your organization’s cloud maturity level. Maturity and adoption assessment is essential as there are benefits and risks involved with short-and long-term impacts. Rushing headlong into uncharted waters will not serve the purpose. Establishing the cloud maturity stage accelerates the firm’s cloud journey by dramatically reducing the migration process’s risks and sets the right expectations to align organizational goals accordingly.

Progressing from none to optimized, presented below are the levels in terms of maturity. Magic FinServ uses these stages to assess a firm’s existing cloud state and then outlines a comprehensive roadmap that is entirely in sync with the firm’s overall business strategy. 

STAGE 1: PROVISIONAL

Provisional is the beginner stage. At this stage, the organization relies mainly on big and bulky infrastructure hosted internally. There is little or no flexibility and agility. At the most, the organization or enterprise has two or three data centers spread across a country or spanning a few continents. The LOBs are hard hit as there is no flexibility and interoperability. Siloed culture is also a significant deterrent in the decision-making process. 

The process for application development ranges from waterfall to basic forms of agile. The monolithic architecture/three-tier architecture hinders flexibility in the applications themselves. The hardware platforms are typically a mix of proprietary & open UNIX variants (HP UX, Solaris, Linux, etc.) to Windows.

There is a great deal of chaos in the provisional stage. Here the critical requirement is assessing and analyzing the business environment to develop an outline first. The need is to ensure that the organization gains confidence and realizes what it needs for fruitful cloud implementation. There should be a strong sense of ownership and direction to lead the organization into the cloud, away from the siloed culture. The enterprise should also develop insights on how they will further their cloud journey.

STAGE 2: VIRTUALIZATION 

In this next stage of the cloud maturity model, server virtualization is heavily deployed across the board. Though here again, the infrastructure is hosted internally, there is increasing reliance on the public cloud. 

The primary challenges that organizations face in this stage of cloud readiness are related to proprietary virtualization costs. LOBs may consider accelerating movement to Linux-based virtualization running on commodity servers to stay cost-competitive. However, despite the best efforts, system administration skills and costs associated with migration remain a significant bottleneck.

STAGE 3: CLOUD READY 

At this significant cloud adoption stage, applications are prepared for a cloud environment, in the public or private cloud as part of the portfolio rationalization exercise. 

The cloud migration approaches are primarily four types   

  • Rehosting: It is the most straightforward approach to cloud migration and as the name implies consists of lifting and shifting applications, virtual machines from the existing environment to the public cloud. When a lift-and-shift approach is employed, businesses are assured of minimum disruption, less upfront cost, and quick turn-around-time (this is the fastest cloud migration approach). But there are several drawbacks as well – there is no learning curve for cloud applications. Performance is not enhanced as there is no change in code. It is only moved from the data center to the cloud.        
  • Replatforming: Optimize lift and shift or move to another cloud from the existing cloud. Apart from what is done in the standard lift-and-shift, it involves optimization of the operating system (OS), changes in API, and middleware upgrade.   
  • Refactoring/Replacing: Here, the primary need is to make the product better and hence developers re-architect legacy systems to build cloud-native systems from scratch.    

The typical concerns at this cloud adoption stage are quantitative such as the economics related to infrastructure costs, developer/admin training, and interoperability costs. Firms or organizations are also interested in knowing the ROI and when it will finally break-even.

At this stage, an analysis of the organization’s risk appetite is carried out. With the help of a clear-cut strategy, firms can stay ahead of the competition as well. 

STAGE 4: CLOUD OPTIMIZED

Enterprises in this stage of cloud adoption realize that cloud-based delivery of IT services (applications or servers or storage or application stacks for developers) will be its end objective. They have the advantage of rapidly maturing cloud-based delivery models (IaaS and SaaS) and are increasingly deploying cloud-native architecture strategies and design across critical technical domains.

In firms with this level of maturity, cloud-native ways of developing applications are de facto. As cloud-native applications need to be architected, designed, developed, packaged, delivered, and managed based on a deep understanding of cloud computing frameworks, the need is for optimization throughout the ecosystem. The applications are designed for scalability, resiliency, and incremental enhance-ability from the get-go. Depending on the application, supporting tenets include IaaS deployment & management and container orchestration alongside Cloud DevOps. 

Conclusion

Cloud adoption has brought the immense benefits of reduced Capex Spend, lowered complexity in IT management, and improved security and agility across firms. The financial services sector has also increasingly adopted the cloud. Despite the initial apprehensions in terms of security and data breaches, an overwhelming 92% of banks are either already making effective use of the cloud or planning to make further investments in 2021/22, as evident from a report by the Culture of Innovation Index, recently published by ACI Worldwide and Ovum.  

While cloud adoption is the new norm, doing it effectively starts with identifying where the firm is currently and how long the journey is to be ‘cloud-native.’ 

Magic FinServ’s view of Cloud Adoption for Financial Firms

Magic FinServ understands the importance of a practical cloud roadmap. It strategizes and enables firms to understand what it is that they need. We are committed to finding the right fitment according to the financial firm’s business.

While in recent times, the preference is for a multi-vendor hybrid cloud strategy. With our cloud assessment and remediation services tailored specifically for financial institutions, we thoroughly understand the specialized needs of the capital market. Our team comprises capital-market domain expert cloud architects who assess, build, design, migrate cloud solutions tailored just for capital market players, and are in total compliance with the industry’s complex regulatory requirements.

At Magic FinServ, the journey begins with assessing maturity in terms of technical and non-technical capabilities. Magic has developed a comprehensive 128 point assessment that measures your organization’s critical aspects of cloud and organizational readiness. We understand the operational, security, and confidentiality demands of the buy-side industry and advise your firm on the best course of action. 

Magic FinServ helps demystify the cloud migration journey for firms and then continually improve the environment stability with the advanced Cloud DevOps offering, including SecDevOps. Our highly lauded 24/7 Production support is unique as it is based on adhering to SLAs at each stage of the journey. The SLAs are met across the solution and not just one area, and proper reporting is done to prevent any compliance-related issues. To explore how your organization can realize optimum cloud benefits across various stages of the cloud adoption journey, reach out to us at mail@magicfinserv.com or Contact Us.

Data on-boarding from unstructured sources: Bridging a critical gap in leveraging industry platforms

Ingesting Unstructured data into other Platforms

Industry specific Products / Platforms like the ERP for specific functions and processes have contributed immensely to enhancing efficiency and productivity. SI partners and end-users have focused on integrating these platforms with existing workflows through a combination of customization/configuring of these platforms and re-engineering existing workflows. Data Onboarding is a critical activity however it has been restricted to integrating the platforms with the existing ecosystem. A key element that is very often ignored is integrating Unstructured Data sources in the Data Onboarding process.

Most enterprise-grade products and platforms require a comprehensive utility that can extract and process a wide set of unstructured documents, data sources and ingest the output into a defined set of fields spread across several internal and third-party applications on behalf of their clients. You are likely extracting and ingesting this data manually today, but an automated utility could be a key differentiator that reduces time, effort and errors from this extraction process. 

Customers have often equated use of OCR technologies as solutions to these problems, however OCR suffers from quality and efficiency issues thereby requiring manual efforts. More importantly OCR extracts the entire document and not just the relevant Data Elements, thereby adding significant noise to the process. And finally, the task of ingesting this data into the relevant fields in the applications / platforms is still manual.

When it comes to widely used and “customizable” case management platforms for Fincrime applications, CRM platforms, or client on-boarding/KYC platforms, there is a vast universe of unstructured data that requires processing outside of the platform in order for the workflow to be useful. Automating manual extraction of critical data elements from unstructured sources with the help of an intelligent data ingestion utility enables users to repurpose critical resources tasked with repetitive offline data processing.

Your data ingestion utility can be a “bolt on” or a simple API that is exposed to your platform. While the document and data sets may vary, as long as there is a well-defined list of applications and fields that are required to be populated, there is a tremendous opportunity to accelerate every facet of client lifecycle management. There are several benefits to both “a point solution” which automates extraction of a well-defined document type/format as well as a more complex, machine learning based utility for a widely defined format of the same document type. 

Implementing Data Ingestion

An intelligent pre and post processing data ingestion can be implemented in 4 stages, each stage increasing in complexity and value extracted from your enterprise platform:

Stage 1 
  • Automate the extraction of standard templatized documents. This is beneficial for KYC and AML teams that are handling large volumes of standard identification documents or tax filings which do not vary significantly. 
Stage 2 
  • Manual identification and automated extraction of data elements. In this stage, end users of an enterprise platform can highlight and annotate critical data elements which an intelligent data extraction utility should be able to extract for ingestion into a target application or specified output format. 
Stage 3
  • Automated identification and extraction as a point solution for specific document types and formats.
Stage 4
  • Using stage 1-3 as a foundation, your platform may benefit from a generic automated utility which uses machine learning to fully automate extraction and increase flexibility of handling changing document formats. 

You may choose to trifurcate your unstructured document inputs into “simple, medium, and complex” tiers as you develop a cost-benefit analysis to test the outcomes of an automated extraction utility at each of the aforementioned stages. 

Key considerations for an effective Data Ingestion Utility:

  • Your partner should have the domain expertise to help identify the critical data elements that would be helpful to your business and end users 
  • Flexibility to handle new document types, add or subtract critical data elements and support your desired output formats in a cloud or on-premise environment of your choice
  • Scalability & Speed
  • Intelligent upfront classification of required documents that contain the critical data elements your end users are seeking
  • Thought leadership that supports you to consider the upstream and downstream connectivity of your business process

An Automated solution for Blockchain Infrastructure Testing

This current blog is part three in the series of blogs on DLT infrastructure testing. 

While in the first blog, we covered all aspects of infrastructure testing for decentralized applications built on the blockchain or distributed ledger platforms and the Magic FinServ approach. In the second blog, we have addressed why customers must make infrastructure testing an integral part of the QA process. 

In this third blog of the series, we address another issue of critical importance – automation. Automation is an essential requirement in any organization today when disruptive forces are sweeping across domains. And as a McKinsey report indicates – “Automation can transform testing and quality control because the increased capacity it provides allows a company to move from spot checks to 100 percent quality control, which reduces the error rate to nearly zero.” 

Infrastructure testing -A critical requirement

While the importance of infrastructure testing cannot be denied, four attributes make it extremely complicated from the tester’s perspective. These are peer-to-peer networking(P2P), consensus algorithms, role-based nodes along with the permission for each node (only for private networks), and lastly, state and transactional data consistency under high load along with resiliency of nodes. 

To know more about these in detail,  you can check the links provided below, which lead to the first and second in the series of blogs:       

Infrastructure Testing for Decentralized Applications built on Blockchain or Distributed Ledger Platform

Why is Infrastructure Testing important for Decentralized Applications built on any Blockchain or DLT

From these blogs, it becomes evident that though infrastructure testing is an essential requirement for any decentralized application, it is also a time-consuming task. Most of the supported features for such applications require different configurations/arrangements of nodes meaning different network topologies for each feature. There is a high possibility that one feature may be tested with some number of nodes. However, for a proper test fix or enhancement of any sort, a different number of nodes from what was designed earlier is needed. 

Developing a comprehensive test strategy

As far as test strategies are concerned, most often deployed one utilizes docker-based containers to copy different network topologies with minimal changes. However, defining docker-based containers ( a.k.a. docker service) with different numbers is also a highly time-consuming activity. The addition of a single new container, depending upon the number of nodes, usually takes a couple of hours to set up Docker-based containers to create different network topologies. It is not only tedious but too complicated. 

One also must take into account the cloud. Most organizations now require infrastructure-testing to be carried out on cloud platforms to mimic the closure environment that they would be using in real-time, as closely as possible. However, setting up one node on any existing cloud service could easily take two to three hours, even with automated ways to spin off machines. Therefore to ensure quicker results, the option at hand is automation.

Automating the untested – how to get started

Today almost every organization/enterprise uses Agile methodology for product development and an automated way (with CI/CD) to create builds daily. Functional testing can be automated and integrated within CI/CD easily, but it is not so with non-functional testing like Infrastructure, Performance, Security, Resiliency, Load testing, etc. These are not easily integrated with CI/CD. Even if these are integrated within CI/CD, non-functional testing does not provide the kind of results organizations desire. 

When it comes to the question of manual  non-functional testing, it is rather tedious. Since there are frequent builds that have to be tested (for non-functional areas like infrastructure), manually setting up a different network topology is not viable. It takes a lot of time and is highly error-prone.  Non-functional testing of Blockchain (other than infrastructure) relies on node level rather than the network level; therefore for tests related to Performance, Security, Resiliency (all of which come under non-functional testing) are performed on standard network topologies. Thereby indicates that infrastructure testing directly relates to network topologies, whereas other non-functional testing processes mentioned earlier are impacted on a case-to-case basis.

For infrastructure testing, organizations must carry out the following activities to define the network topology:

  • Impact analysis of all changes related to the four significant factors listed earlier
  • If any of the four factors is impacted, then defining network topologies for each scenario
  • Set up of nodes for all probable network topologies
  • Creation of network for each network topology
  • Execution of functional/non-functional testing on each network topology to ensure that all network topologies are working as per the acceptance criteria

Impact analysis of changes 

To define the required numbers of network topology, organizations must first identify what all changes are to be done and how those changes would be impacted by peer-to-peer (P2P) networking logic, consensus algorithms logic, permissioning handler logic, or data/transaction consistency logic. If the impact is apparent, then organizations must define network topology. This activity is the most time-consuming task of all as one has to understand all the changes. 

Another critical task for organizations is to perform impact analysis for all changes and find out whether the four major factors have been impacted or not. The easiest way to process this task would be to get developers to register this information with meaningful keywords that can automate impact analysis. With proper automation in place, organizations can use impact analysis to determine whether existing network topologies can be used or a new one has to be created.

Defining network topologies: 

Once impact analysis is done and it is decided that new network topologies must be created to account for changes, then the next requirement is to define all network topologies.

For instance, if an organization reports an issue related to the functioning of nodes. Whenever there is an even number of consensus nodes within the network, then consensus seems to get stuck or takes longer than usual. To resolve the problem, developers work out the logic. In case the network does not have an even number of consensus nodes, then the need is to either convert one existing node to a consensus node or add the new one to the network. Either way, the network topology will be changed from one that exists. 

With proper automation in place, it is possible to keep the registry of all existing QA network topologies. Once the required network topology is fed in, it should provide data with pertinent information whether a new node is to be created or any existing network can be utilized after modifying a  number of nodes. Manually performing this task could take hours and sometimes even days if the organization has a long list of network topologies in their QA environments.

Setting up nodes for required network topology: 

There are two possibilities here, either modify the existing node or create a new node to have a new network topology. In either case, nodes will have to be set up manually. Again this will take time and require the engagement of someone who understands QA environments from an administrative perspective to set up nodes. Hence, it will increase the time taken and create dependency on new groups to coordinate for node setup without automation.

Creation of Network Topology: 

After setting up the required number of nodes, a network is created based on the network initialization process. If multiple network topologies have to be tested with several scenarios, then for each network topology, the following activities have to be performed:

  • Cleaning all involved nodes if existing network nodes is used
  • Initialization of network
  • Allowing for stabilization of the nodes for all components/services
  • Execution of functional scenarios
  • Destroying the network to free the nodes

As all the above activities are required to be completed for each network topology so without automation, this will consume a lot of time and make testing highly error-prone. Most of the time, network topologies use nodes that have overlap with other network topologies; hence missing any of the activities underlined earlier will result in inconsistency on the other running network. Experience suggests that cleaning the nodes is a highly error-prone activity within a shared environment of various network topologies. It becomes tough to discriminate why the errors are ensuing. Whether those are actual bugs to be reported or some nodes are now being used for two or more networks responsible for the error since clean up (of nodes) has not been done correctly. Without proper automation, all these activities will take significant time and raise a false alarm for the issues that have popped up due to some human error.

Execution of functional and non-functional tests: 

Functional tests must be executed without fail, whereas non-functional tests are always subject to the changes being made. These (non-functional tests) become essential if there is performance improvement or fix required for any security vulnerability. Even in case of any exceptional fix that hurts performance, this is required. 

Functional tests are implicitly covered in the network creation phase, and almost every organization focuses (or gives priority) on automating functional testing. Non-functional testing has always been the least priority for most; however, this becomes very tedious if required to be performed on multiple network topologies. It is rare to run non-functional testing for all network topologies as it has the very least dependency on different network topologies. Most of the time, non-functional testing is at node level rather than dependent upon different network topologies. 

Conclusion

In its Hype Cycle report for Blockchain Business for 2019, Gartner predicts that within five to 10 years, Blockchain will have a transformational impact across industries. According to David Furlonger, distinguished research vice-president at Gartner, permissioned ledgers in several key areas in banking and investment services will witness increased focus. In light of the uptick in interest from banking and investment services CIOs seeking to improve decades-old operations and processes, automation is desirable for driving ROI and efficiency in Blockchain incorporation and its automation.

Automated testing enables the developers to easily and quickly check new apps and updates for errors, defects, and other weaknesses. Infrastructure testing is one such area that organizations must automate as soon as possible if they desire to build robust decentralized applications. 

Magic FinServ’s automated test methodology is unique, and we have the relevant expertise to drive automation for testing Blockchain Infrastructure. We have had success with several clients who built financial products on blockchain platforms. 

To explore automated testing for blockchain infrastructure, write to us at mail@magicfinserv.com