Enterprise Data Management

Before we get into the details of how digitization has contributed to unstructured data, we first need to understand what is meant by the terms Digitization and Unstructured Data.

Digitization is the process of converting information into a digital format. In this format, information is organized into discrete units of data (called bits) that can be separately addressed (usually in multiple-bit groups called bytes).

Unstructured data is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy but may also contain data such as dates, numbers, and facts. The resulting irregularities and ambiguities make it harder to interpret with traditional programs than data stored in fielded form in databases or annotated (semantically tagged) in documents. (Source: https://en.wikipedia.org/wiki/Unstructured_data)

To connect the two, I start with the observation that technology evolves every day and, alongside it, the desire to digitize everything around us keeps gaining momentum.

However, we rarely stop to ask whether this process will solve our problems or instead create a bigger one, common to all current verticals and to the new verticals of the future.

If we think this through, we realize that instead of creating a solution for the digital world or digitized economy, we have actually paved the way for data to become unstructured, or at best semi-structured, and this pile of unstructured data is growing day by day.

The obvious question is: what factors are contributing to the unstructured data pile? Some of them are listed below:

  1. The rapid growth of the Internet, leading to a data explosion and massive information generation.
  2. Data that has been digitized but given only partial structure.
  3. Free availability of, and easy access to, tools that help in the digitization of data.

The other crucial question around unstructured data is how we manage it.

Some insights and facts that underline how serious the unstructured data problem is:

  • According to projections from Gartner, white-collar workers will spend anywhere from 30 to 40 percent of their time this year managing documents, up from 20 percent of their time in 1997
  • Merrill Lynch estimates that more than 85 percent of all business information exists as unstructured data – commonly appearing in e-mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations, and Web pages

(Source – http://soquelgroup.com/wp-content/uploads/2010/01/dmreview_0203_problem.pdf)

  • Nearly 80% of enterprises have very little visibility into what’s happening across their unstructured data, let alone how to manage it.

(Source – https://www.forbes.com/sites/forbestechcouncil/2017/06/05/the-big-unstructured-data-problem/2/#5d1cf31660e0)

Is there a solution to this?

To answer that question: data (information) in today’s world is power, and unstructured data is tremendous power, because its potential is still largely untapped; realized effectively and judiciously, it can turn fortunes for organizations.

Organizations and business houses that manage to extract meaning from this chaotic mess will be well positioned to gain a competitive edge over their peer group.

Areas to focus on when addressing the unstructured data problem:

  1. Raising awareness around it.
  2. Identifying and locating it across the organization.
  3. Ensuring the information is searchable.
  4. Making the content context- and search-friendly.
  5. Building intelligent content.

The good news is that we at Magic recognized the scale of this challenge some time back and have designed a set of offerings specifically built to solve the unstructured and semi-structured data problem for the financial services industry.

Magic FinServ focuses on four primary data entities that financial services firms regularly deal with:

Market Information – Research reports, news, business and financial journals, and websites providing market information generate massive amounts of unstructured data. Magic FinServ provides products and services to tag metadata and extract valuable, accurate information to help our clients make timely, accurate, and informed decisions.
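As a simplified illustration of what metadata tagging can look like, the sketch below pulls tickers, dates, and percentage figures out of a market-text snippet using regular expressions. The patterns and the snippet are assumptions made for this example; a production tagger would rely on trained NER models rather than regexes.

```python
import re

# Hypothetical patterns for illustration only; an all-caps word pattern will
# also match acronyms such as CEO or GDP, which a real tagger would filter.
TICKER = re.compile(r"\b[A-Z]{2,5}\b")
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
PERCENT = re.compile(r"\b\d+(?:\.\d+)?%")

def tag_snippet(text: str) -> dict:
    """Attach simple metadata tags to a piece of market text."""
    return {
        "tickers": TICKER.findall(text),
        "dates": DATE.findall(text),
        "percentages": PERCENT.findall(text),
    }

snippet = "MSFT guidance raised on 2024-01-30; cloud revenue grew 17.6% year over year."
print(tag_snippet(snippet))
# {'tickers': ['MSFT'], 'dates': ['2024-01-30'], 'percentages': ['17.6%']}
```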

Trade – Trading generates structured data; however, there is huge potential to optimize operations and automate decisions. Magic FinServ has created tools, using Machine Learning and NLP, to automate several process areas, such as trade reconciliations, to improve the quality of decision making and reduce effort. We estimate that effort can be reduced by roughly a third in almost every business process in this space.
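A reconciliation engine essentially matches two versions of the same trades and flags the breaks. The hedged sketch below does this with pandas on made-up records; the column names, matching key, and price tolerance are assumptions, and real reconciliations match on several fields with fuzzy or ML-assisted rules.

```python
import pandas as pd

# Hypothetical internal and counterparty trade records.
internal = pd.DataFrame({
    "trade_id": ["T1", "T2", "T3"],
    "isin": ["US0378331005", "US5949181045", "US02079K3059"],
    "quantity": [100, 250, 80],
    "price": [171.20, 402.50, 151.10],
})
counterparty = pd.DataFrame({
    "trade_id": ["T1", "T2", "T4"],
    "isin": ["US0378331005", "US5949181045", "US0231351067"],
    "quantity": [100, 200, 60],
    "price": [171.20, 402.50, 178.90],
})

# Outer-join on trade_id so trades missing on either side still show up.
merged = internal.merge(counterparty, on="trade_id", how="outer",
                        suffixes=("_int", "_cpty"), indicator=True)

# Breaks: trades present on only one side, or matched trades whose
# quantity or price disagree beyond a tolerance.
unmatched = merged[merged["_merge"] != "both"]
matched = merged[merged["_merge"] == "both"]
breaks = matched[
    (matched["quantity_int"] != matched["quantity_cpty"])
    | ((matched["price_int"] - matched["price_cpty"]).abs() > 0.01)
]

print(unmatched[["trade_id", "_merge"]])
print(breaks[["trade_id", "quantity_int", "quantity_cpty"]])
```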

Reference data – Reference data is structured and standardized; however, it tends to generate many exceptions that require proactive management. Organizations spend millions every year running reference data operations. Magic FinServ uses Machine Learning tools to help operations teams reduce the effort spent on exception management, improve the quality of decision making, and create a clean audit trail.
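Exception management usually begins with validation rules that flag suspect records for review; machine learning can then learn from how analysts resolve them. The sketch below shows only the rule-based flagging step, on made-up data with hypothetical fields and rules.

```python
import pandas as pd

# Hypothetical security-master extract; fields and rules are illustrative only.
securities = pd.DataFrame({
    "isin": ["US0378331005", "US594918104", "XS0123456789", None],
    "currency": ["USD", "USD", "EUR", "GBP"],
    "maturity": [None, None, "2019-06-30", None],
    "asset_class": ["Equity", "Equity", "Bond", "Equity"],
})

def flag_exceptions(row) -> list:
    """Return the validation rules that a record violates."""
    issues = []
    if not row["isin"] or len(str(row["isin"])) != 12:
        issues.append("ISIN missing or malformed")
    if row["asset_class"] == "Bond" and row["maturity"] is None:
        issues.append("Bond missing maturity date")
    return issues

# Records with at least one issue are routed to the exception queue.
securities["exceptions"] = securities.apply(flag_exceptions, axis=1)
print(securities[securities["exceptions"].str.len() > 0])
```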

Client/Employee data – Organizations often do not realize how much client-sensitive data resides on desktops and laptops. Recent regulations such as GDPR now make it binding to address this risk. Most of this data is semi-structured and resides in Excel sheets, Word documents, and PDFs. Magic FinServ offers products and services that help organizations identify the quantum of this risk and then take remedial action.
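One way to size this risk is to sweep file shares and endpoints for patterns that look like personal or account data. The sketch below is a minimal, assumption-laden version of such a sweep: it scans plain-text files with a handful of regex patterns, whereas a real scanner would parse Excel, Word, and PDF content and cover far more identifier types.

```python
import re
from pathlib import Path

# Hypothetical patterns; real tooling covers many more identifiers.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def scan_directory(root: str) -> dict:
    """Count potential client-sensitive matches per file under `root`."""
    findings = {}
    for path in Path(root).rglob("*.txt"):   # extend to .xlsx/.docx/.pdf via parsers
        text = path.read_text(errors="ignore")
        hits = {name: len(rx.findall(text)) for name, rx in PATTERNS.items()}
        if any(hits.values()):
            findings[str(path)] = hits
    return findings

if __name__ == "__main__":
    print(scan_directory("."))
```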

People often confuse the terms visual analytics and visual representation, taking both to mean the same thing: presenting a set of data in graphs that look good to the naked eye. Ask an analyst, however, and they will tell you that visual representation and visual analytics are two different arts.

Visual representation is used to present data that has already been analyzed. The representation directly shows the output of that analysis and does little to drive the decision; the decision is essentially already known, since the analytics have already been performed on the data.

Visual analytics, on the other hand, is an integrated approach that combines visualization, human factors, and data analysis. It allows humans to interact directly with the tool to produce insights and transform raw data into actionable knowledge that supports decision- and policy-making. Off-the-shelf tools can produce representations, but interactive visual analytics visualizations are custom-built. Visual analytics capitalizes on the combined strengths of human and machine analysis (computer graphics, machine learning) to provide a tool where either one alone has fallen short.

The Process

The enormous amount of data comes with many quality issues, since the data is of different types and comes from various sources. In fact, the focus is now shifting from structured data toward semi-structured and unstructured data. Visual analytics combines the visual and cognitive intelligence of human analysts, such as pattern recognition or semantic interpretation, with machine intelligence, such as data transformation or rendering, to perform analytic tasks iteratively.

The first step involves the integration and cleansing of this heterogeneous data. The second step involves the extraction of valuable data from the raw data. Next comes the most important part: developing a user interface, based on human knowledge, that drives the analysis and uses artificial intelligence in a feedback loop to help reach a conclusion and, eventually, a decision.

If the methods used to reach a conclusion are not sound, the decisions emerging from the analysis will not be fruitful. Visual analytics takes a leap here by providing methods and user interfaces to examine those procedures through the feedback loop.

In general, the following paradigm is used to process the data:

Analyze First – Show the Important – Zoom, Filter and Analyze Further – Details on Demand (from: Keim D. A., Mansmann F., Schneidewind J., Thomas J., Ziegler H.: Visual Analytics: Scope and Challenges. Visual Data Mining, 2008, p. 82.)
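To make the paradigm concrete, the toy sketch below (with made-up trade data and hypothetical column names) aggregates first, surfaces only the largest contributors, and then drills into one of them on demand; in a real visual analytics tool these steps would be interactive rather than scripted.

```python
import pandas as pd

# Hypothetical trade-volume data used only to illustrate the paradigm.
trades = pd.DataFrame({
    "desk":     ["Equities", "Equities", "Rates", "Credit", "Credit", "Credit"],
    "ticker":   ["AAPL", "MSFT", "UST10Y", "XYZ 2027", "XYZ 2027", "ABC 2030"],
    "notional": [5.0, 3.5, 12.0, 2.0, 1.5, 4.0],   # in millions
})

# Analyze first: aggregate the full dataset.
by_desk = trades.groupby("desk")["notional"].sum().sort_values(ascending=False)

# Show the important: only the largest desks are displayed up front.
print(by_desk.head(2))

# Zoom, filter, analyze further / details on demand: drill into one desk.
def details(desk: str) -> pd.DataFrame:
    return trades[trades["desk"] == desk].sort_values("notional", ascending=False)

print(details("Credit"))
```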

Areas of Application

Visual analytics can be used in many domains. Its more prominent uses are seen in:

  1. Financial Analysis
  2. Physics and Astronomy
  3. Environment and Climate Change
  4. Retail Industry
  5. Network Security
  6. Document analysis
  7. Molecular Biology

The greatest challenge of today’s era is handling massive data collections from different sources. This data can run into thousands of terabytes, or even petabytes and exabytes. Most of it is in semi-structured or unstructured form, which makes it very difficult for a human alone, or a computer algorithm alone, to analyze.

For example, in the financial industry a lot of data (mostly unstructured) is generated daily, and many qualitative and quantitative measures can be observed through it. Making sense of this data is complex because of the numerous sources and the amount of ever-changing incoming data. Automated text analysis can be coupled with human interaction and domain knowledge to analyze this enormous amount of data and reduce the noise within the datasets. Analyzing stock behavior based on news, and its relation to world events, is one of the prominent behavioral-science application areas. Tracking the buy-sell mechanism of stocks, including options trading in which the temporal context plays an important role, can provide insight into future trends. By combining interaction with a visual mapping of automatically processed world events, the system can support the user in analyzing the ever-increasing text corpus.
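The automated half of that loop can start as simply as scoring headlines against word lists before an analyst reviews the results. The sketch below is a deliberately naive, lexicon-based scorer on invented headlines; production systems would use trained NLP models and analyst feedback rather than a fixed dictionary.

```python
# Toy word lists and headlines, made up for illustration only.
POSITIVE = {"beats", "upgrade", "growth", "record", "raises"}
NEGATIVE = {"misses", "downgrade", "lawsuit", "recall", "cuts"}

def score(headline: str) -> int:
    """Crude sentiment score: positive hits minus negative hits."""
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

headlines = [
    "Company A beats estimates and raises full-year guidance",
    "Regulator opens lawsuit against Company B after product recall",
]
for h in headlines:
    print(score(h), h)   # prints 2 for the first headline, -2 for the second
```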

Another example where visual analytics can be fruitful is monitoring the flow of information between the various systems used by financial firms. These products are very specific to the domain and perform specific tasks within the organization, but they require input data to work. This data flows between different products (from the same vendor or different vendors) through integration files. Because of these integration dependencies, it can become cumbersome for an organization to replace an old system with a new one. Visual analytics tools can show the current state of the flow and help detect the changes that would be required when replacing an old system with a new one. They can also help identify which systems would be impacted most, based on the volume and type of data being integrated, reducing errors and minimizing administrative and development expenses.

Visual analytics tools and techniques create an interactive view of data that reveals the patterns within it, enabling users to draw conclusions. At Magic FinServ, we deliver intelligence and insights from data and strengthen decision making. Magic’s data services team can create more value for your organization by improving decision making through various innovative tools and approaches.

Magic also partners with top data-solution vendors to ensure that your business gets the solution that fits your requirements. In this way we combine technical expertise with business-domain expertise to deliver greater value to your business. Contact us today and our team will be happy to speak with you about any queries.

Reference data is an important asset in a financial firm. Due to the recent crisis in global markets, regulatory changes, and the explosion of derivative and structured products, the need for reliable market and reference data has become a central focus for financial institutions. Accurate information is the key element of any financial transaction, and faulty data is a major component of operational risk.

Reference data used in financial transactions can be classified as static or dynamic:

  • Static Data: Data elements with unalterable characteristics, such as financial instrument data, indexes, legal entities/counterparties, and markets and exchanges.
  • Dynamic Data: Variable data such as closing and historical prices and corporate actions.

Reference data is stored and used across the front-office, middle-office, and back-office systems of financial institutions. Over a transaction life cycle, reference data is used to interact with various systems and applications, internally and externally. Problems related to faulty reference data continue to exist, and they lead to increased operational risk and cost.

To reduce data-related risks and issues and contain cost, financial institutions are looking at innovative solutions to improve data management efficiency. Centralization, standardization, and automation of the data management process are key to achieving this goal.

Industry Challenges

  • Poor data quality; lack of global standards; presence of data silos; multiple data sources leading to inefficiency in the whole data governance process.
  • Data duplication and redundancy across various business functions.
  • Lack of data governing policies.
  • Lack of standardized data definition.
  • Time-consuming data source onboarding process.
  • Inconsistent data leading to poor reporting and management.
  • High manual intervention in data capturing and validation process.

Poor data quality leads to increased operational risk, higher costs, and unreliable reporting and decision making.

Solution

  • Deploy a centralized reference data management system and create a data management framework.
  • Create a golden copy of the reference data received from various sources, accessible by all business functions within the organization (a minimal sketch of this consolidation follows this list).
  • Update the data daily or in real time at this single point.
  • Validate data in a single place before distributing it to the relevant business functions.
  • Resolve data exceptions centrally to avoid issues in downstream systems.
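The golden-copy idea can be reduced to a simple rule: for each attribute, take the first trustworthy value according to a source-priority order. The sketch below shows that rule on invented vendor records; the vendor names, fields, and priority order are assumptions, and real implementations add field-level rules, survivorship logic, and audit trails.

```python
# Highest-priority source first; names are assumptions for illustration.
SOURCE_PRIORITY = ["bloomberg", "reuters", "idc"]

vendor_records = {
    "bloomberg": {"isin": "US0378331005", "name": "Apple Inc", "country": None},
    "reuters":   {"isin": "US0378331005", "name": "APPLE INC", "country": "US"},
    "idc":       {"isin": "US0378331005", "name": None,        "country": "US"},
}

def golden_copy(records: dict) -> dict:
    """Pick, for each attribute, the first non-empty value by source priority."""
    fields = {f for rec in records.values() for f in rec}
    golden = {}
    for field in fields:
        for source in SOURCE_PRIORITY:
            value = records.get(source, {}).get(field)
            if value not in (None, ""):
                golden[field] = value
                break
    return golden

print(golden_copy(vendor_records))
# {'isin': 'US0378331005', 'name': 'Apple Inc', 'country': 'US'}  (key order may vary)
```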

Benefits

  • Improved process efficiency through centralized data management.
  • Reduced operational and data management cost.
  • More control over data quality and change management.
  • Reduced turnaround time for new business needs and new regulatory requirements.
  • Early detection and resolution of potential data issues.

Reference data is the data used to classify other data in an enterprise. It is used within every enterprise application, from back-end systems through front-end applications. Reference data is commonly stored in the form of code tables or lookup tables, such as country codes, state codes, and gender codes.

Reference data in the capital markets is the backbone of all financial institutions, banks, and investment management companies. It is stored and used in front-office, middle-office, and back-office systems. A financial transaction uses reference data when interacting with other associated systems and applications. Reference data is also used in price discovery for financial instruments.

Reference data is primarily classified into two types –

  • Static Data – Financial instruments and their attributes, specifications, identifiers (CUSIP, ISIN, SEDOL, RIC), exchange symbol, exchange or market traded on (MIC), regulatory conditions, tax jurisdiction, trade counterparties, and the various entities involved in a financial transaction.
  • Dynamic Data – Corporate actions and event-driven changes, closing prices, business calendar data, credit ratings, etc. (a simple data-model sketch of the two classes follows this list).
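One way to picture the split is as two record types: an immutable one for static attributes and a mutable one for values that change over time. The sketch below uses Python dataclasses with hypothetical field names and purely illustrative values; it is not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass(frozen=True)                 # static attributes do not change
class StaticInstrumentData:
    isin: str
    cusip: str
    sedol: str
    exchange_mic: str                   # ISO 10383 market identifier code
    tax_jurisdiction: str

@dataclass                              # dynamic attributes are updated over time
class DynamicInstrumentData:
    isin: str
    closing_prices: dict = field(default_factory=dict)   # date -> price
    credit_rating: Optional[str] = None
    corporate_actions: list = field(default_factory=list)

static_rec = StaticInstrumentData("US0378331005", "037833100", "2046251", "XNAS", "US")
dynamic_rec = DynamicInstrumentData(static_rec.isin)
dynamic_rec.closing_prices[date(2024, 1, 30)] = 188.04   # illustrative price
dynamic_rec.credit_rating = "AA+"                        # illustrative rating
print(static_rec, dynamic_rec, sep="\n")
```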

Market Data

Market data is price and trade-related data for a financial instrument reported by the stock exchange. Market data allows traders and investors to know the latest price and see historical trends for instruments such as equities, fixed-income products, derivatives, and currencies.

Legal Entity data

The 2008 market crisis exposed severe gaps in measuring credit and market risk. Financial institutions face a hard challenge in identifying the complex corporate structures of security issuers and of the other counterparties and entities involved in their business. Institutions must be able to roll up, assess, and disclose their aggregate exposure to all entities across all asset classes and transactions. Legal entity data is the key building block that helps a financial institution know all the parties it is dealing with and manage the associated risk.

Regulations such as the Foreign Account Tax Compliance Act (FATCA) and MiFID II require absolutely clear identification of all the entities associated with a security. The Legal Entity Identifier (LEI) plays a vital role in performing such due diligence.
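An LEI (ISO 17442) is a 20-character alphanumeric code whose last two characters are check digits computed with the ISO 7064 MOD 97-10 scheme, much like an IBAN. The hedged sketch below validates only that structure, not whether the code is actually registered; the example string is a publicly listed LEI (believed to be GLEIF's own) used purely for illustration.

```python
def lei_is_valid(lei: str) -> bool:
    """Check the length and ISO 7064 MOD 97-10 check digits of an LEI."""
    lei = lei.strip().upper()
    if len(lei) != 20 or not lei.isalnum():
        return False
    # Letters are replaced by two-digit numbers (A=10 ... Z=35), then the
    # resulting integer must be congruent to 1 modulo 97.
    digits = "".join(str(int(ch, 36)) for ch in lei)
    return int(digits) % 97 == 1

print(lei_is_valid("506700GE1G29325QX363"))   # True: structurally valid LEI
```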

EDM workflow

  • Data Acquisition – Data is acquired from leading data providers such as Bloomberg, Reuters, IDC, Standard & Poor’s, etc.
  • Data Processing – Data normalization and transformation rules are applied, and validation processes clean the data.
  • Golden Copy creation – Cleaned and validated data is transformed, through further processing, into more trusted Golden Copy data.
  • Data Maintenance – Manual intervention, where necessary, to handle exceptions that cannot be resolved automatically.
  • Distribution/Publishing – Golden Copy data is published to consumer applications such as Asset Management, Portfolio Management, Wealth Management, Compliance, Risk and Regulatory applications, and other Business Intelligence platforms for analytics (a minimal end-to-end sketch of the pipeline follows this list).
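Strung together, the workflow is a small pipeline of composable steps. The sketch below wires up acquisition, processing, golden-copy creation, and publishing on invented feeds; the vendor names, field names, and rules are assumptions, and the manual-maintenance step is left out for brevity.

```python
# Hypothetical raw feeds standing in for provider deliveries.
RAW_FEEDS = {
    "bloomberg": [{"ISIN": "US0378331005", "Ccy": "usd", "Name": "Apple Inc"}],
    "reuters":   [{"ISIN": "US0378331005", "Ccy": "USD", "Name": "APPLE INC"}],
}

def acquire() -> dict:
    """Data Acquisition: pull raw records from each provider feed."""
    return RAW_FEEDS

def process(feeds: dict) -> list:
    """Data Processing: normalize and validate records from all sources."""
    records = []
    for source, rows in feeds.items():
        for row in rows:
            rec = {"source": source, "isin": row["ISIN"],
                   "currency": row["Ccy"].upper(), "name": row["Name"].title()}
            if len(rec["isin"]) == 12:          # simple validation rule
                records.append(rec)
    return records

def golden_copy(records: list) -> dict:
    """Golden Copy creation: one trusted record per ISIN (first source wins)."""
    golden = {}
    for rec in records:
        golden.setdefault(rec["isin"], rec)
    return golden

def publish(golden: dict) -> None:
    """Distribution/Publishing: hand the golden records to consumer systems."""
    for isin, rec in golden.items():
        print(f"publishing {isin}: {rec['name']} ({rec['currency']})")

publish(golden_copy(process(acquire())))
```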

Importance of an efficient EDM system

The fast-changing regulatory and business requirements of the financial industry, poor data quality, and competition all demand a high-quality, centralized data management system across the firm.

In the current market situation, companies must be able to quickly process customer requests, execute trades, identify holdings and positions, assess and adjust risk levels, maximize operational efficiency and control, and optimize cost, all while meeting regulatory and compliance needs in a timely fashion.

An efficient EDM system enables the business to –

  • Establish a centralized database management system
  • Reduce manual work
  • Decrease operational risk
  • Lower data sourcing costs
  • Gain a better view of its data
  • Meet governance and auditing needs
  • Get a better overview of risk management
  • Provide tailor-made user rights
  • Make analytics- and data-driven decisions

Challenges to overcome

  • Data quality and data accuracy.
  • Siloed and disparate data across the firm, making it difficult to obtain a consolidated view of risk exposure.
  • Data lineage.
  • Keeping costs low in a fast-changing financial market.
  • The ability to quickly process customer requests, accurately price holdings, and assess and adjust risk levels accordingly.
  • The complexity of the latest national and international regulations.
