Has Digitalization Led To The Problem Of Unstructured Data?

Before we look at how digitalization has contributed to unstructured data, we need to understand what the terms digitization and unstructured data mean.

Digitization: The process of converting information into a digital format. In this format, information is organized into discrete units of data (called bits) that can be separately addressed (usually in multiple-bit groups called bytes). (Source: https://whatis.techtarget.com/definition/digitization)

Unstructured Data: Information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy but may also contain data such as dates, numbers, and facts. This results in irregularities and ambiguities that make it harder for traditional programs to interpret than data stored in fielded form in databases or annotated (semantically tagged) in documents. (Source: https://en.wikipedia.org/wiki/Unstructured_data)

To connect the two, I begin with an observation: technology evolves every day, and the desire to digitalize everything around us is gaining momentum as well.

However, we have not stopped to ask whether this process will solve our problems or lead to a bigger one that is common to all current verticals and to those of the future.

If we think about this carefully, we realize that instead of creating a solution for the digital world or digitized economy, we have paved the way for data that is unstructured or, at best, semi-structured (quasi-structured), and this pile of unstructured data is growing day by day.

This raises the question of which factors contribute to the pile of unstructured data. Some of them are listed below:

  1. The rapid growth of the Internet, which has led to a data explosion and massive information generation.
  2. Data that is digitalized and given only some structure.
  3. Free availability and easy access to various tools that help in the digitization of data.

The other crucial aspect of unstructured data is how we manage it.

Some facts about the unstructured data problem that underline how serious it is:

  • According to projections from Gartner, white-collar workers will spend anywhere from 30 to 40 percent of their time this year managing documents, up from 20 percent of their time in 1997
  • Merrill Lynch estimates that more than 85 percent of all business information exists as unstructured data – commonly appearing in e-mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations, and Web pages

(Source – http://soquelgroup.com/wp-content/uploads/2010/01/dmreview_0203_problem.pdf)

  • Nearly 80% of enterprises have very little visibility into what’s happening across their unstructured data, let alone how to manage it.

(Source – https://www.forbes.com/sites/forbestechcouncil/2017/06/05/the-big-unstructured-data-problem/2/#5d1cf31660e0)

Is there a solution to this?

To answer this question, I would say that data (information) is power in today's world, and unstructured data is tremendous power because its potential is still untapped. When that potential is realized effectively and judiciously, it can turn fortunes for organizations.

Organizations and business houses that manage to extract meaning from this chaotic mess will be well positioned to gain a competitive advantage over their peers.

Areas to focus on when addressing the unstructured data problem are:

  1. Raise awareness of it.
  2. Identify and locate it in the organization.
  3. Ensure information is searchable.
  4. Make the content context-friendly and search-friendly.
  5. Build intelligent content.
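As a rough illustration of what "searchable" can mean for free-text content, the sketch below builds a tiny inverted index over a few documents and answers keyword queries against it. It is a minimal sketch, not a description of any Magic FinServ product: the function names and the sample documents are invented for this example, and a real system would add stemming, ranking, and metadata.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents (invented for illustration only).
documents = {
    "email-1": "Client asked about the Q3 research report on bonds.",
    "memo-7": "Call center notes: client complaint about trade settlement.",
    "report-3": "Research report: bond market outlook.",
}
index = build_index(documents)
print(search(index, "research report"))
```

Because matching is on exact tokens, a query for "bonds" finds only the e-mail and not the report that says "bond". Gaps like this are what context-friendly and search-friendly content is meant to close.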

The good news is that we, at Magic, recognized the scale of this challenge some time ago and have designed a set of offerings specifically to solve the unstructured and semi-structured data problem for the financial services industry.

Magic FinServ focuses on four primary data entities that financial services firms regularly deal with:

Market Information – Research reports, news, business and financial journals, and websites providing market information generate massive amounts of unstructured data. Magic FinServ provides products and services that tag metadata and extract valuable, accurate information to help our clients make timely and informed decisions.

Trade – Trading generates structured data; however, there is huge potential to optimize operations and automate decisions. Magic FinServ has created tools that use machine learning and NLP to automate several process areas, such as trade reconciliation, to improve the quality of decision making and reduce effort. We estimate that effort can be reduced by almost 33% in almost every business process in this space.

Reference data – Reference data is structured and standardized; however, it tends to generate many exceptions that require proactive management. Organizations spend millions every year to run reference data operations. Magic FinServ uses machine learning tools to help the operations team reduce the effort spent on exception management, improve the quality of decision making, and create a clean audit trail.

Client/Employee data – Organizations often do not realize how much sensitive client data resides on desktops and laptops. Recent regulations such as GDPR now make it mandatory to address this risk. Most of this data is semi-structured and resides in Excel files, Word documents, and PDFs. Magic FinServ offers products and services that help organizations identify the extent of this risk and then take remedial action.

Reference Data And Its Management

Reference data is an important asset in a financial firm. Following the recent crisis in global markets, regulatory changes, and the explosion of derivative and structured products, the need for valuable market and reference data has become a central focus for financial institutions. Accurate data is the key element of any financial transaction, and faulty data is a major component of operational risk.

Reference data used in financial transactions can be classified as static or dynamic:

  • Static Data: Data elements that have unalterable characteristics, such as financial instrument data, indexes, legal entities/counterparties, markets, and exchanges.
  • Dynamic Data: Variable data such as closing and historical prices and corporate actions.

Reference data is stored and used across the front-office, middle-office, and back-office systems of financial institutions. Throughout the transaction life cycle, reference data is used to interact with various systems and applications, both internal and external. Problems caused by faulty reference data persist, and they increase operational risk and cost.

To reduce data-related risks and issues and to contain cost, financial institutions are looking at innovative solutions to improve data management efficiency. Centralization, standardization, and automation of the data management process are key to achieving this goal.

Industry Challenges

  • Poor data quality; lack of global standards; presence of data silos; multiple data sources leading to inefficiency in the whole data governance process.
  • Data duplication and redundancy across various business functions.
  • Lack of data governance policies.
  • Lack of standardized data definitions.
  • Time-consuming data source onboarding process.
  • Inconsistent data leading to poor reporting and management.
  • High manual intervention in the data capture and validation process.

Poor data quality is leading to:

[Chart: effects of poor data quality]

Solution

  • Deploy a centralized reference data management system and create a data management framework.
  • Create a golden copy of the reference data received from the various sources within the organization, accessible to all business functions.
  • Update the data daily or in real time at this single point.
  • Validate data in a single place before distributing it to the relevant business functions.
  • Resolve data exceptions centrally to avoid issues in downstream systems.
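The golden-copy idea in the list above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the source names (`vendor_a`, `vendor_b`, `internal`), their precedence order, and the field names are hypothetical, and "the highest-priority source wins" is just one possible merge rule. Conflicts and gaps are collected as exceptions so they can be resolved centrally before distribution.

```python
# Hypothetical source precedence: earlier names win when sources disagree.
SOURCE_PRIORITY = ["vendor_a", "vendor_b", "internal"]


def build_golden_record(records_by_source, fields):
    """Merge per-source records for one instrument into a golden record.

    Returns (golden, exceptions). For each field, the value from the
    highest-priority source that supplies it is kept. A disagreement
    between sources, or a field missing everywhere, is logged as an
    exception for central review.
    """
    golden, exceptions = {}, []
    for field in fields:
        # Collect the non-empty values for this field in priority order.
        values = [
            (source, records_by_source[source][field])
            for source in SOURCE_PRIORITY
            if source in records_by_source
            and records_by_source[source].get(field) not in (None, "")
        ]
        if not values:
            exceptions.append((field, "missing in all sources"))
            continue
        golden[field] = values[0][1]
        if len({value for _, value in values}) > 1:
            exceptions.append((field, f"conflict: {values}"))
    return golden, exceptions


# Sample input (illustrative values only).
records = {
    "vendor_a": {"isin": "US0378331005", "currency": "USD", "exchange": ""},
    "vendor_b": {"isin": "US0378331005", "currency": "usd", "exchange": "XNAS"},
}
golden, exceptions = build_golden_record(
    records, ["isin", "currency", "exchange", "issuer"]
)
print(golden)
print(exceptions)
```

In this run the currency conflict ("USD" versus "usd") and the missing issuer are both reported, while the exchange is filled from the second source because the first left it empty.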

Benefits

  • Improved process efficiency through centralized data management.
  • Reduced operational and data management costs.
  • More control over data quality and change management.
  • Reduced turnaround time for new business needs and new regulatory requirements.
  • Early detection and resolution of potential data issues.

Enterprise Data Management In Capital market

Reference data is the data used to classify other data in any enterprise. Reference data is used within every enterprise application, across back-end systems through front-end applications. Reference data is commonly stored in the form of code tables or lookup tables, such as country codes, state codes, and gender codes.
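A minimal sketch of reference data acting as a lookup table: a small code table classifies transactional records by country. The table contents, the record layout, and the function name are assumptions made for this example.

```python
# Hypothetical lookup table: ISO 3166-1 alpha-2 country codes to names.
COUNTRY_CODES = {"US": "United States", "GB": "United Kingdom", "IN": "India"}


def classify_by_country(records):
    """Group record ids by country name, using the lookup table.

    Records whose code is not in the table are collected under "UNKNOWN"
    so they can be reviewed rather than silently dropped.
    """
    grouped = {}
    for record in records:
        name = COUNTRY_CODES.get(record["country_code"], "UNKNOWN")
        grouped.setdefault(name, []).append(record["id"])
    return grouped


# Sample records (invented for illustration).
accounts = [
    {"id": 1, "country_code": "US"},
    {"id": 2, "country_code": "IN"},
    {"id": 3, "country_code": "US"},
    {"id": 4, "country_code": "ZZ"},
]
print(classify_by_country(accounts))
```

Every application that shares the same code table classifies the same record the same way, which is why inconsistent copies of such tables cause problems downstream.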

In capital markets, reference data is the backbone of all financial institutions, banks, and investment management companies. It is stored and used in front-office, middle-office, and back-office systems. A financial transaction uses reference data when interacting with other associated systems and applications. Reference data is also used in price discovery for financial instruments.

Reference data is primarily classified into two types –

  • Static Data – Financial instruments and their attributes, specifications, identifiers (CUSIP, ISIN, SEDOL, RIC), exchange symbol, exchange or market traded on (MIC), regulatory conditions, tax jurisdiction, trade counterparties, and the various entities involved in a financial transaction.
  • Dynamic Data – Corporate actions and event-driven changes, closing prices, business calendar data, credit ratings, etc.
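Identifiers such as the ISIN carry a built-in check digit, which gives a cheap first validation rule for static data. The sketch below checks an ISIN's format and check digit: letters are expanded to numbers (A=10 through Z=35) and the Luhn algorithm is applied to the resulting digit string. The function name is invented for this example, and the check only shows that an identifier is well-formed, not that the security exists.

```python
import re

# Two letters (country prefix), nine alphanumerics, one numeric check digit.
ISIN_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{9}[0-9]")


def is_valid_isin(isin):
    """Check the format and check digit of a 12-character ISIN."""
    if not ISIN_PATTERN.fullmatch(isin):
        return False
    # Expand letters to two-digit numbers (A=10 ... Z=35); digits stay as-is.
    digits = "".join(str(int(ch, 36)) for ch in isin)
    total = 0
    # Luhn: walking from the right, double every second digit.
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


print(is_valid_isin("US0378331005"))
print(is_valid_isin("US0378331006"))
```

A check like this catches typing and transmission errors at the point of acquisition, before a malformed identifier can propagate to downstream systems.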

Market Data

Market data is price and trade-related data for a financial instrument reported by the stock exchange. Market data allows traders and investors to know the latest price and see historical trends for instruments such as equities, fixed-income products, derivatives, and currencies.

Legal Entity data

The 2008 market crisis exposed severe gaps in measuring credit and market risk. Financial institutions face a hard challenge in identifying the complex corporate structure of security issuers and of the other counterparties and entities involved in their business. Institutions must be able to roll up, assess, and disclose their aggregate exposure to all entities across all asset classes and transactions. Legal entity data is the key building block here: it helps a financial institution know all the parties it deals with and manage the associated risk.

Regulations such as the Foreign Account Tax Compliance Act (FATCA) and MiFID II require absolutely clear identification of all the entities associated with a security. The Legal Entity Identifier (LEI) plays a vital role in performing such due diligence.

EDM workflow

  • Data Acquisition – Data is acquired from leading data providers such as Bloomberg, Reuters, IDC, and Standard & Poor's.
  • Data Processing – Data normalization and transformation rules are applied, and validation processes clean the data.
  • Golden Copy creation – Cleaned and validated data is transformed into more trusted Golden Copy data through further processing.
  • Data Maintenance – Manual intervention, where necessary, handles the exceptions that cannot be handled automatically.
  • Distribution/Publishing – Golden Copy data is published to consumer applications such as asset management, portfolio management, wealth management, compliance, risk, and regulatory applications, and to other business intelligence platforms for analytics.
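The workflow stages above can be sketched as a small pipeline. This is a simplified illustration, not a real EDM implementation: the feed is an in-memory list standing in for a vendor file, the field names and validation rules are assumptions, and the prices are invented. Records that pass validation go into the golden copy keyed by identifier; the rest land in an exception queue for manual maintenance.

```python
def normalize(record):
    """Trim whitespace, upper-case identifiers, and parse the price."""
    cleaned = {
        "isin": record.get("isin", "").strip().upper(),
        "currency": record.get("currency", "").strip().upper(),
    }
    try:
        cleaned["price"] = float(record.get("price", ""))
    except (TypeError, ValueError):
        cleaned["price"] = None
    return cleaned


def validate(record):
    """Return a list of rule violations (empty when the record is clean)."""
    problems = []
    if len(record["isin"]) != 12:
        problems.append("isin must be 12 characters")
    if len(record["currency"]) != 3:
        problems.append("currency must be a 3-letter code")
    if record["price"] is None or record["price"] <= 0:
        problems.append("price must be a positive number")
    return problems


def run_pipeline(raw_feed):
    """Process raw records and return (golden_copy, exception_queue)."""
    golden_copy, exception_queue = {}, []
    for raw in raw_feed:
        record = normalize(raw)
        problems = validate(record)
        if problems:
            exception_queue.append((raw, problems))  # for manual maintenance
        else:
            golden_copy[record["isin"]] = record  # keyed for distribution
    return golden_copy, exception_queue


# Hypothetical raw feed (values invented for illustration).
raw_feed = [
    {"isin": " us0378331005 ", "currency": "usd", "price": "189.50"},
    {"isin": "GB0002634946", "currency": "GBP", "price": "n/a"},
]
golden_copy, exception_queue = run_pipeline(raw_feed)
print(golden_copy)
print(exception_queue)
```

Keeping the raw record alongside its list of problems in the exception queue gives the operations team the context needed to correct the data at the source rather than patching it downstream.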

Importance of an efficient EDM system

The financial industry's fast-changing regulatory and business requirements, poor data quality, and competition all demand a high-quality, centralized data management system across the firm.

In the current market situation, companies must be able to process customer requests and execute trading requests quickly, identify holdings and positions, assess and adjust risk levels, maximize operational efficiency and control, and optimize cost, all while meeting regulatory and compliance needs in a timely fashion.

An efficient EDM system enables the business to –

  • Establish a centralized database management system
  • Reduce manual work
  • Decrease operational risk
  • Lower data sourcing costs
  • Gain a better view of data
  • Meet governance and auditing needs
  • Get a better overview of risk management
  • Define tailor-made user rights
  • Support analytics and data-driven decisions

Challenges to overcome

  • Data quality & data accuracy.
  • Siloed and disparate data across firms, which makes it difficult to get a consolidated view of risk exposure.
  • Data lineage.
  • Keeping costs low in such a fast-changing financial market.
  • Ability to quickly process customer requests, accurately price holdings, assess and adjust risk levels accordingly.
  • The complexity of the latest national and international regulations.