Metadata management is a science

Meta analytics is the new model for enabling complete data and process oversight.

By Mervyn Mooi, Director of Knowledge Integration Dynamics (KID), which represents the ICT services arm of the Thesele Group.

Data governance is crucial, and is embedded in most well-run enterprises. But most organisations are not yet realising the full potential of their management and governance measures, because they are not yet linking data management to governance and to compliance.


Data management differs from governance. Data management refers to planning, building and running data capabilities. Governance relates to monitoring, evaluating and directing those capabilities, and to assuring efficiencies through governance “place-holders” or gates, which are entrenched in system and project management life-cycles.

Governance monitors, assures and directs data management practices not only in the execution of processes and business activities, but also in achieving efficiencies, for example in project management and system development life-cycles.

Moving to the next level

Most governance happens at a purely technical and operational level, but to elevate governance to support high-level compliance, organisations need to link rules, regulations, policies and guidelines to the actual processes and people at operational level. Compliance is set to become ever-more challenging as organisations deal with growing volumes of data across an expanded landscape of processes.


I advocate that governance not only be addressed at technical/operational (data management) levels, but also be linked to compliance, which carries risk and drives the organisation’s strategy. Major South African enterprises are starting to realise that linking governance to compliance could support the audit process and deliver multiple business benefits at the same time.

Recently, I highlighted how data stewards were stepping up their focus on mapping governance, risk management and compliance rules to actual processes, looking to the management of metadata to provide audit trails and evidence of compliance.

Traditionally, these audit trails have been hard to come by. Auditors – many of them with a limited technical background – had to assess reams of documents and request interviews with IT to track the linkages from legislation and guidelines to actual processes. In most cases, the processes linked to are purely technical in nature.

From a regulatory compliance point of view, traditional models do not provide direct links from a particular clause in legislation or best-practice guidelines to the data it governs – illustrating where the data resides, who uses it and how it is managed in light of the requirements of that clause. Auditors, however, need enterprises to prove lineage and articulate governance in the context of compliance.

Establishing the linkages

While enterprises typically say they are aware they could potentially link data management to governance to compliance, most do not undertake such exercises, possibly because they don’t have a mandate to do so, because they believe the tools to enable this are complex and costly, or simply because they believe the process will be too time-consuming.

Using a sound methodology, mapping a process to legislation or guidelines is a once-off exercise that can take as little as two to three hours. In the typical organisation, with around 1 000 processes, it could take less than a year to map all of them.

The organisation then gains the ability to track its processes without having to rely on elaborate business process management tools: the mappings can be captured in Excel, stored in any relational database and interrogated for insights – where are the propensities, affinities, gaps and manual processes, and, more importantly, which accords are they mapped to?

Mapping data is stored with timestamps and current-version indicators, so if a process changes over time, or a rule, control or validation changes, this information is captured, indicating when it happened and where it was initiated. At the press of a button, the organisation is then able to demonstrate the exact lineage, drill down to any process within the system, and indicate where the concentration of effort lies, and where rules, conditions and checks are done within processes.
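
As an illustration of how simply such a mapping register can be kept, the sketch below is a minimal Python/SQLite example; the table, column and process names are hypothetical, not a prescribed schema. It stores each process-to-accord mapping with a timestamp and a current-version flag, so that a change to a rule or control creates a new version rather than overwriting history, and lineage can be queried at any time.

    import sqlite3
    from datetime import datetime, timezone

    conn = sqlite3.connect("governance_mappings.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS process_accord_map (
            process_name   TEXT,                -- operational or business process
            accord         TEXT,                -- legislation clause, policy or guideline
            rule_or_check  TEXT,                -- control/validation applied in the process
            mapped_by      TEXT,                -- who captured or changed the mapping
            mapped_at      TEXT,                -- timestamp of this version
            is_current     INTEGER DEFAULT 1    -- current-version indicator
        )
    """)

    def record_mapping(process_name, accord, rule_or_check, mapped_by):
        """Supersede any current mapping for the process/accord pair and insert the new version."""
        now = datetime.now(timezone.utc).isoformat()
        conn.execute(
            "UPDATE process_accord_map SET is_current = 0 "
            "WHERE process_name = ? AND accord = ? AND is_current = 1",
            (process_name, accord),
        )
        conn.execute(
            "INSERT INTO process_accord_map VALUES (?, ?, ?, ?, ?, 1)",
            (process_name, accord, rule_or_check, mapped_by, now),
        )
        conn.commit()

    # Hypothetical example: map a customer onboarding process to a POPIA consent clause.
    record_mapping("customer_onboarding", "POPIA s11 (consent)",
                   "consent flag validated at capture", "data steward")

    # Lineage at the press of a button: every version of the mapping, newest first.
    for row in conn.execute(
        "SELECT mapped_at, rule_or_check, is_current FROM process_accord_map "
        "WHERE process_name = ? ORDER BY mapped_at DESC", ("customer_onboarding",)
    ):
        print(row)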

Additionally, the organisation can attach risk weights at process or accord level, helping to shape strategy and gauge strategy execution.

Not only does this mapping give enterprises clear linkages between policies or regulations and processes, it also gives new visibility into inefficiencies, the people and divisions involved in each process, and more – helping to enhance efficiency and support overall organisational strategy.

With governance and compliance mandatory, it’s high time organisations moved to support governance and compliance evidence, and make the auditing process simpler and more effective.


Thesele takes 40% stake in Knowledge Integration Dynamics (Pty) Ltd. (KID)

Thesele Group (Thesele) has bought a 40% stake in Knowledge Integration Dynamics (KID), marking the investment holding company’s first foray into the ICT space, and making KID South Africa’s largest black-owned, focused data management solutions company.

The multi-million rand deal, which came into effect last month, makes KID a majority black-owned entity with a Level 4 BBBEE rating, with several of its subsidiaries now 100% black-owned as well as BBBEE Level 1 rated companies.

KID co-founder and MD Aubrey van Aswegen says the investment marks the start of a new growth phase for the data management specialists. “KID has grown fairly organically over its 20-year history, but we are now approaching a point where fuelling the same pace of growth will demand a more aggressive expansion phase and possibly strategic acquisitions. Our new partnership with Thesele Group will support this growth strategy,” he says.


Thesele, founded in 2005 by Sello Moloko and Thabo Leeuw, has a diverse investment portfolio across financial services, logistics, manufacturing and automotive industries. Thesele recently announced its acquisition of a 35% stake in South African water and wastewater solutions provider Talbot & Talbot. The KID acquisition is in line with Thesele’s long-term investment approach in existing and emerging growth sectors, says Thesele Executive Director Oliver Petersen.

Van Aswegen says KID had been in the market for a suitable BEE partner for some time. “We were looking for a suitable investor to not only improve our scorecard, but to play an active role in business development for us and bolster our growth aspirations,” he says. Thesele’s track record, networks and reputation in the investment community, along with its ethical approach to business, aligned with KID’s own culture and business model. The partnership will not be a ’passive’ one, he says. Thesele will work closely with KID to support mutually beneficial growth.

For Thesele, the investment in KID leverages several synergies, including the fact that “both entities have long operated in the financial services sector,” says Petersen. “Both groups also have the view that data and data management is a key growth area, with a wide range of opportunities in areas such as big data, the Internet of Things, automation, robotics and Artificial Intelligence.”

“This is a key milestone – not only for KID as a company, but also for its stakeholders, including staff and customers,” Van Aswegen says. “It will facilitate growth for us, and we look forward to Thesele growing their exposure to the ICT space using KID as the platform.”

About Thesele Group

https://www.thesele.co.za/pages/about-us

 

 

Meta Analytics (MA) – the new model for enabling complete data and process oversight

The new science of Meta Analytics has been formalised to enable broad oversight of data and processes, with key objectives of supporting governance thereof, proving compliance, achieving alignment and leveraging efficiencies.

By Mervyn Mooi, director at Knowledge Integration Dynamics (KID)

As governance and compliance become increasingly top-of-mind issues for data stewards and their enterprises alike, the challenge of mapping governance, risk management and compliance (GRC) rules to the actual data and processes has come to the fore.


Where once companies tended to focus on the content – the data itself – rather than its containers – the metadata – management of metadata is now becoming a key focus. Metadata, covering factors such as data and process context, design and specifications, and execution information, might accurately be described as the ‘information about data’. Metadata management has become a science in itself. However, in South Africa, systems analysts, database administrators and systems administrators still tend to interrogate metadata at a fairly basic, technical level for operational purposes.

Mapping for GRC

Linking metadata to GRC rules – which are abstracted or prescribed from the organisation’s PPSGs (Policies, Principles, Procedures, Standards, Regulations and Guidelines) – has become increasingly important, since it allows metadata to be mapped directly to organisational capabilities, services, processes, data objects, work-flows, service/business units and individuals. In doing so, it provides a clear view of the business and operational architectural landscapes and data life-cycles. Furthermore, the mappings link in confirmatory communiques and audit trails as evidence of compliance, action or conformance to the rules (PPSGs).

Typically, the processes and data within computer application systems are designed, built and mapped based on functional and information requirements, often without considering vertical lineage to business processes, work-flows or services, mapping to PPSG (GRC) rules, inclusion of risk management factors, or linking to architectural models and capabilities. These are usually managed separately, by a different competency team and a different set of tools, e.g. Business Process Management or Data Modelling tools. The operational processes that result from this situation are often disjointed, manual and not aligned to the PPSGs. There is evidence of mappings being done for audit purposes, but on a small-scale, ad-hoc basis; the practice is not sustained and usually does not link directly to the business and operational roles of the individuals on the ground who are tasked with operating in alignment with particular PPSGs.

When called on to produce evidence of compliance or conformance to the PPSGs, business units, departments and IT must often rush to map processes and surface evidence that the rules were executed on the data, in order to prove compliance with, for example, POPI, FICA, other legislation or internal standards. This process is time-consuming and challenging, and even though a department can produce mapping and reconciliation reports, it can seldom show exactly which actions were taken where, which GRC rules they align with, and the risk of not applying them.

The evolution of MA

GRC mapper tools tend to have limited capabilities, simply linking an accord, condition or service. Moving beyond these rudimentary capabilities is becoming increasingly important as businesses depend more heavily on the quality of their data and the economy of their processing, and as GRC compliance and conformance become crucial.

In recent years, KID has evolved solutions and methodologies that encompass a “marriage” (convergence) between governance and metadata management, enabling the proving of compliance and delivering a complete oversight of the business and operational landscapes.

Formalising the solution under the banner of Meta Analytics (MA), we can now link the respective metadata to all applicable compliance requirements.

The mapping process can be a lengthy one, but fortunately it is a once-off exercise, with updates thereafter as the landscapes change or improve. The process identifies the PPSGs in scope and the steps thereof – the steps constitute RCCSs (rules, conditions, checks, controls, constraints, technical standards) or actions, as they should be applied in the lineage of the services and/or system processes and the roles involved. The RCCSs are plotted against a data management life-cycle (DMLC) for the data being processed and linked to organisational capabilities. This gives cross-sectional views of the RCCSs against PPSGs, services, processes, DMLCs and architectural components, e.g. data models.
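
For illustration only, here is a minimal Python sketch of what such mapping records might look like – all rule, service and role names are invented, and this is not KID’s actual MA model – grouping RCCSs into the cross-sectional views described above:

    from collections import defaultdict

    # Each record plots one RCCS (rule, condition, check, control, constraint or technical
    # standard) against the PPSG it derives from, the service and process it applies to,
    # the DMLC stage at which it is applied, and the role responsible for it.
    mappings = [
        {"rccs": "verify customer identity", "ppsg": "FICA s21",
         "service": "customer onboarding", "process": "capture customer record",
         "dmlc_stage": "create", "role": "onboarding clerk"},
        {"rccs": "mask account numbers in extracts", "ppsg": "POPIA s19 (security safeguards)",
         "service": "reporting", "process": "generate monthly statement extract",
         "dmlc_stage": "use", "role": "BI analyst"},
    ]

    def cross_section(records, axis):
        """Group RCCSs by any axis: ppsg, service, process, dmlc_stage or role."""
        view = defaultdict(list)
        for r in records:
            view[r[axis]].append(r["rccs"])
        return dict(view)

    print(cross_section(mappings, "ppsg"))         # RCCSs per PPSG clause
    print(cross_section(mappings, "dmlc_stage"))   # RCCSs per data life-cycle stage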

The advantages of MA

The MA methodology gives enterprises insight into gaps and dispositions within the landscape and into data life-cycles: how and where GRC (PPSG) rules are articulated, where processes are duplicated or overlap, who is involved in which processes and when rules (actions) are executed – and it enables risk analysis. All of this supports efficiencies (e.g. where services or processes could be merged) and proves compliance.

MA shows the affinities of roles and activities and indicates automated or manual processes. It also allows enterprises to attach risk weights to PPSGs, services or individual processes, and to attach processes to competency teams or architectural capabilities. When linked to actual metadata – such as system execution logs, e-mail communiques or signed documents – MA delivers evidence that all necessary steps (rules) are being executed.

From a data governance point of view, data stewards and analysts will use MA to determine, at any point in time, who is using what data and in which processes; or they could gain insights such as the last execution date of a process or the date of the last confirmation e-mail. This surfaces gaps in compliance and supports the identification of risks associated with not applying rules in line with the guidelines. It also allows for the identification of dispositions and mavericks, and supports investigations.
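
As a rough sketch of the kind of question a data steward might ask of such mappings – the record layout and values below are invented for illustration, assuming execution metadata such as last-run dates has been linked in:

    from datetime import date

    # Illustrative mapping records enriched with execution metadata (e.g. from system logs).
    usage = [
        {"data_object": "customer master", "process": "monthly billing run",
         "used_by": "billing team", "last_executed": date(2019, 5, 31)},
        {"data_object": "customer master", "process": "marketing segmentation",
         "used_by": "campaign analyst", "last_executed": date(2019, 3, 14)},
    ]

    def who_uses(data_object, records):
        """Who is using this data, in which processes, and when did each process last run?"""
        return [(r["used_by"], r["process"], r["last_executed"])
                for r in records if r["data_object"] == data_object]

    for user, process, last_run in who_uses("customer master", usage):
        print(f"{user} uses it in '{process}' (last executed {last_run})")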

With MA, ad-hoc mapping on demand becomes a thing of the past: if a Chief Data Officer wanted to see a synopsis of all automated and manual processes or gauge compliance risk, they could use these mappings to view the environment in a single step. MA enables data stewards or business unit managers to interrogate exactly what happens to their customer data during its life-cycle. MA supports back-end efficiency, enhances customer experience, averts compliance risk and – of course – enables proactive oversight of the entire environment. As a welcome by-product, MA also surfaces gaps, dispositions and misalignments between the operations and business environments, which is good for efficiency and complementary to change and project management.

MA brings a new approach to giving oversight for compliance and for surfacing gaps and inefficiencies – not just at a technical level, but also at a business level. It helps enhance both data and processes and delivers better data for better business.

 

Expand data horizons for greater analytics value

Attempting to find purposeful insights in data could be a futile exercise unless you look beyond the siloes.

With the mainstreaming of advanced data analytics technologies, companies risk becoming too dependent on the outputs they receive from analytics tools, which can serve up biased results unless solid data analytics models are applied to the way in which the data is interrogated.

While data is your friend, and the only valid way for organisations to strategise based on fact, data analytics tools can only deliver the outputs they have been asked for. If the pool of data being analysed is too limited, or there is no end objective or purpose for using the results after the scientific methods have been applied to the data, then the whole exercise is virtually futile.


It is seldom enough to drill down into a limited data repository and base broad strategic decisions on the findings. In effect, this would be like a novelty manufacturer assessing only the pre-festive season sales and concluding that Christmas trees are a perennial best-seller. Common sense tells us this will not be the case, and that Christmas trees won’t sell at all in January. But in the case of more complex products and services, trends and markets are not as easy to predict. This is where analytics comes in. Crucially, analytics must look beyond specific domain insights and seek a broader view for a more objective insight.

Comparisons and correlations

A factory may deploy analytics to determine which products to focus on to increase profits, for example. But where the questioning is too narrow, the results will not support strategic growth goals. The company must qualify and complement the questioning with comparatives. It is not enough to assess which products are the biggest sellers – the factory also needs to determine what products are manufactured at the lowest cost, and which deliver the highest return. By bringing together more components and correlating the data on the lowest cost products, highest return products and top sellers, the factory is positioned to make better strategic decisions.
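
To make the idea concrete, here is a small Python sketch with invented figures, correlating three views – sales volume, unit cost and return – rather than ranking on sales alone:

    # Invented product metrics: units sold, unit cost and unit margin.
    products = {
        "christmas tree": {"units_sold": 9000, "unit_cost": 50.0, "unit_margin": 20.0},
        "garden gnome":   {"units_sold": 4000, "unit_cost": 30.0, "unit_margin": 50.0},
        "novelty mug":    {"units_sold": 7000, "unit_cost": 10.0, "unit_margin": 8.0},
    }

    # Ranking on sales alone picks the seasonal best-seller...
    by_sales = max(products, key=lambda p: products[p]["units_sold"])

    # ...while correlating sales with cost and return can point to a different strategic focus.
    by_total_return = max(products, key=lambda p: products[p]["units_sold"] * products[p]["unit_margin"])

    print("Top seller:", by_sales)
    print("Highest total return:", by_total_return)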

In South Africa, many companies do not approach analytics in this way. They have a set of specific insights they want, and once they find them, they stop there. In this siloed approach, the results are not correlated against a broader pool of data for more objective outcomes. This may be due in part to factors such as the time and cost required for ongoing comparison and correlation, but it is also due to a lack of maturity in the market.

In mature organisations, data sciences are applied to all possible angles/cues and information resources to produce insights that can be used to monetise or franchise the data. It is not just a case of finding unknown trends and insights – the discovery has to be purposeful as well.


Avoiding the data spaghetti junction

Mervyn Mooi.

 

Despite all their efforts and investments in data quality centres of excellence, some enterprises are still grappling with data quality issues, and this at a time when data is more important than ever before.

The effects of poor data quality are felt throughout the enterprise, impacting everything from operations to customer experience, costing companies an estimated $3 trillion a year in the US alone.

Data quality will become increasingly crucial as organisations seek to build on their data to benefit from advances in analytics (including big data), artificial intelligence and machine learning.

We find organisations unleashing agile disruptors into their databases without proper controls in place; business divisions failing to standardise their controls and definitions; and companies battling to reconcile data too late in the lifecycle, often resulting in a ‘spaghetti junction’ of siloed, duplicated and non-standardised data that cannot deliver on its potential business value for the company.

Controls at source

Data quality as a whole has improved in recent years, particularly in banks and financial services facing the pressures of compliance.

However, this improvement is largely on the wrong side of the fence – after the data has been captured. This may stem from challenges experienced decades ago, when validation of data being captured by thousands of clerks could slow down systems and result in customers having to wait in banks and stores while their details were captured.


But this practice has continued to this day in many organisations, which still qualify data after capture and so add unnecessary layers of resources for data cleaning.

Ensuring data quality should start with pre-emptive controls, with strict entry validation and verification rules, and data profiling of both structured and unstructured data.
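
A minimal Python sketch of what pre-emptive, at-capture validation might look like (the field names and rules are illustrative assumptions, not a prescribed standard):

    import re

    # Illustrative entry-validation rules applied before a record is accepted, not after.
    RULES = {
        "id_number":  lambda v: bool(re.fullmatch(r"\d{13}", v or "")),   # e.g. 13-digit SA ID
        "email":      lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v or "")),
        "full_name":  lambda v: bool(v and v.strip()),
    }

    def validate_at_capture(record):
        """Return the fields that fail validation; an empty list means the record may be stored."""
        return [field for field, check in RULES.items() if not check(record.get(field))]

    record = {"id_number": "800101123456", "email": "jane@example.com", "full_name": "Jane Dube"}
    errors = validate_at_capture(record)
    if errors:
        print("Reject at source, fix before capture:", errors)   # here: id_number is only 12 digits
    else:
        print("Record accepted")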

Controls at the integration layer

Standardisation is crucial in supporting data quality, but in many organisations different rules and definitions are applied to the same data, resulting in duplication and an inability to gain a clear view of the business and its customers.

For example, the definition of the data entity called a customer may differ from one bank department to another: for the retail division, the customer is an individual, while for the commercial division, the customer is a registered business, with the directors of the business also registered as customers. The bank will then have multiple versions of what a customer is, and when data is integrated, there will be multiple definitions and structures involved.

Commonality must be found in definitions, and common structures and rules applied to reduce this complexity. Relationships in the data must be understood, with data profiling applied to assess the quality of the data.
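
As a sketch of what a common structure might look like – the fields and types below are illustrative assumptions, not a reference model – both the retail individual and the commercial registered business can be expressed as one standard customer entity, with the relationship between a business and its directors kept explicit rather than re-captured under a different definition:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Customer:
        customer_id: str
        customer_type: str                    # "individual" or "registered_business"
        legal_name: str
        registration_or_id_number: str        # SA ID number or company registration number
        related_customer_ids: List[str] = field(default_factory=list)  # e.g. directors of a business

    # Retail view: the customer is an individual.
    jane = Customer("C001", "individual", "Jane Dube", "8001011234567")

    # Commercial view: the customer is a registered business, linked to its directors
    # rather than re-capturing them under a different definition of "customer".
    acme = Customer("C002", "registered_business", "Acme Traders (Pty) Ltd", "2005/123456/07",
                    related_customer_ids=["C001"])

    print(acme)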

Controls at the physical layer

Wherever a list of data exists, reference data should also be standardised across the organisation instead of using a myriad of conventions across various business units.

The next prerequisites for data quality are cleaning and data reconciliation. Incorrect, incomplete and corrupt records must be addressed, standardised conventions, definitions and rules applied, and a reconciliation must be done. What you put in must balance with what you take out. By using standardised reconciliation frameworks and processes, data quality and compliance are supported.
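
A minimal sketch, in Python, of a reconciliation check of the ‘what you put in must balance with what you take out’ kind (the counts are invented for illustration):

    def reconcile(source_count, loaded_count, rejected_count):
        """Input must balance with output: loaded plus explicitly rejected records must equal the source."""
        return {
            "source": source_count,
            "loaded": loaded_count,
            "rejected": rejected_count,
            "unaccounted_for": source_count - loaded_count - rejected_count,
            "balanced": source_count == loaded_count + rejected_count,
        }

    result = reconcile(source_count=10_000, loaded_count=9_950, rejected_count=45)
    print(result)   # 5 records unaccounted for -> investigate before signing off the load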

Controls at the presentation layer

On the front end, where data is consumed, there should be a common data portal and standard access controls, providing a controlled view into the data. While the consumption and application needs of each organisation vary, 99% of users do not need report-authoring capabilities, and those who do should not have the ability to manipulate data out of context or in an unprotected way.

With a common data portal and standardised access controls, data quality can be better protected.

Several practices also support data quality, starting with a thorough needs analysis and defining data rules and standards in line with business requirements and in compliance with legislation.

Architecture and design must be carefully planned, with an integration strategy adopted that takes into account existing designs and metadata. Development initiatives must adhere to data standards and business rules, and the correctness of metadata must be verified.

Effective testing must be employed to verify the accuracy of the test results and designs; and deployment must include monitoring, audit, reconciliation counts and other best practices.

With these controls and practices in place, the organisation achieves tight, well-governed and sustained data quality.

InfoFlow, Hortonworks to offer Hadoop skills courses

Veemal Kalanjee, MD of InfoFlow.

 

Local data management company InfoFlow has partnered with international software firm Hortonworks to provide enterprise Hadoop training courses to the South African market.

Hadoop is an open source software framework for storing data and running applications on clusters of commodity hardware.

While the global Hadoop market is expected to soar, with revenue reaching $84.6 billion by 2021, the sector is witnessing a severe lack of trained and talented technical experts globally.

Hortonworks develops and supports open source Hadoop data platform software. The California-based company says its enterprise Hadoop is in high demand in SA, but to date, Hadoop skills have been scarce and costly to acquire locally.

InfoFlow provides software, consulting and specialised services in business intelligence solutions, data warehousing and data integration.

The company will deliver localised expert resources and Hadoop training support programmes to a wide range of local companies across the financial services, retail, telecommunications and manufacturing sectors.

“There is huge demand in SA for enterprise Hadoop skills, with large enterprises having to fly expensive resources into the country to give their enterprise Hadoop projects guidance and structure,” says Veemal Kalanjee, MD of InfoFlow.

“Instead of moving existing skills around across various clients, Hortonworks wants to take a longer term approach by cross-skilling people through the training and leveraging the graduate programme run by InfoFlow.”

This partnership makes InfoFlow the only accredited Hortonworks training entity in Sub-Saharan Africa, adds Kalanjee.

The Hortonworks training will be added to InfoFlow’s broader portfolio of accredited Informatica Intelligent Data Platform graduate programmes across data management and data warehousing, governance, security, operations and data access.

The Hortonworks-InfoFlow partnership will bring to Johannesburg the only Hortonworks training and testing site in SA, according to the companies.

Local professionals will be able to attend classes focusing on a range of Hortonworks product training programmes at InfoFlow’s training centre in Fourways, Johannesburg.

The courses to be offered include: Hadoop 123; Essentials; Hadoop Admin Foundations; Hortonworks Data Platform and Developer Quick Start.

“There is currently no classroom-based training available on Hortonworks locally and if clients require this, the costs are too high. Having classroom-based training affords clients the ability to ask questions, interact on real-world challenges they are experiencing and apply the theory learnt in a lab environment, set up specifically for them.”

InfoFlow will have an accredited trainer early next year and will provide the instructor-led, classroom Hortonworks training at reduced rates, concludes Kalanjee.

Bots set to multi-task in SA’s insurance sector

Robotic process automation will make waves in the insurance market, offering cost savings, efficiencies and improved risk management.

 

Robotic process automation (RPA) is still relatively new to South Africa, with mainly the major players moving to deploy it to manage certain repetitive and manual processes.

But RPA presents significant promise in many sectors where manual processes delay operations and add costs in a price-sensitive market.

The insurance industry is one sector that stands to achieve multiple gains from deploying RPA: through intelligent automation, insurers can achieve more streamlined processes, improved customer service, lower overheads and reduced risk.

RPA is akin to deploying an army of workers, or bots, to automate processes in both customer-facing and internal functions. From managing invoices and onboarding new customers, to validating data, assessing risk and confirming the market value of insured items, RPA tools can replace human resources, delivering outputs faster and more accurately.

It takes over very mundane manual tasks, like downloading an e-mail attachment and copying it to a directory, or capturing data to a standardised template. By automating rules-based steps, companies can eliminate data entry and capture errors, and reduce the number of resources needed to complete these processes.
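
For example, the ‘download an e-mail attachment and copy it to a directory’ task can be scripted in a few lines. The sketch below uses only the Python standard library, and the file paths are invented for illustration:

    import email
    from email import policy
    from pathlib import Path

    def extract_attachments(eml_path, target_dir):
        """Parse a saved e-mail (.eml) and copy every attachment into the target directory."""
        target = Path(target_dir)
        target.mkdir(parents=True, exist_ok=True)
        with open(eml_path, "rb") as f:
            msg = email.message_from_binary_file(f, policy=policy.default)
        for part in msg.walk():
            if part.get_content_disposition() == "attachment" and part.get_filename():
                (target / part.get_filename()).write_bytes(part.get_payload(decode=True))

    # Hypothetical paths: a claim e-mail saved by the mail gateway, routed to a claims inbox folder.
    extract_attachments("incoming/claim_12345.eml", "claims_inbox/12345")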


In customer onboarding alone, where the process could cost hundreds of rands per customer, RPA supports both manual and self-service onboarding, and can then automatically check for blacklisting, confirm the market value of insured items and redirect the customer data to the correct service and finance departments.

Streamlining claims processes

The core value of RPA is realised by automating a process that follows logical, rule-based steps, as with a claims process. Once claim information is captured, there are defined steps that need to be followed to assess whether a claim is valid, and defined communication between the insurer and the claimant based on the information collated. By introducing automation at this step, the communication is streamlined, accurate and timeous.
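
A toy illustration of such rule-based steps, written in Python; the rules themselves are invented, purely to show the shape of the logic:

    # Invented, simplified claim rules applied in a fixed order once claim information is captured.
    def assess_claim(claim):
        steps = [
            ("policy active",        claim["policy_active"]),
            ("premiums up to date",  claim["premiums_paid"]),
            ("within cover limit",   claim["amount"] <= claim["cover_limit"]),
            ("documents attached",   claim["documents_complete"]),
        ]
        failed = [name for name, passed in steps if not passed]
        if failed:
            return "rejected", f"Claim cannot proceed; failed checks: {', '.join(failed)}"
        return "accepted", "Claim accepted for loss estimation; an assessor will be in contact."

    status, message = assess_claim({
        "policy_active": True, "premiums_paid": True,
        "amount": 15_000, "cover_limit": 50_000, "documents_complete": False,
    })
    print(status, "-", message)   # rejected: documents are missing, so the claimant is notified immediately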

Part of any claims process is the phase of estimating the loss. Traditionally, this is a manual process in which the claimant and the estimator/assessor hold numerous discussions to come to agreement on the value of the loss. With RPA, this can be streamlined by having the bot access vendor applications to assess the replacement value of the loss, which then forms the basis of the claim. This benefits both the insurer, for whom the process is shortened, and the claimant, for whom the estimation is objectively decided.

Imagine a claims process where the insurer receives an e-mail from a claimant with an attached claim form, images of the loss as well as proof of purchases of these items. The e-mail is scanned, attachments extracted and sent to the appropriate systems for either capturing or further processing with human intervention.

This is exactly what RPA achieves. The benefit is that a claim can start being assessed almost immediately, since all relevant information for processing is automatically captured in the correct systems, without human error or delay.

Not only is RPA efficient at extracting data from forms, it also provides the additional benefit of validating data on forms and, in some instances, correcting it. This mitigates problems further downstream in the claims process (due to incorrect data). It also helps mitigate the risk of fraud.

RPA has the ability to log all actions and reconcile stages within a process at a fine level of granularity. This is particularly important in the payment phase of claims processing, to ensure the correct amount is paid to the claimant. RPA prevents incorrect payments before they happen, instead of waiting for audit findings to report on them.
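
A simple sketch of that kind of pre-payment check and action log, in Python, with invented amounts and field names:

    import logging

    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    log = logging.getLogger("claims_bot")

    def authorise_payment(claim_id, approved_amount, requested_amount):
        """Block a payment that does not reconcile with the approved claim amount, and log the decision."""
        log.info("claim %s: payment requested %.2f, approved %.2f", claim_id, requested_amount, approved_amount)
        if requested_amount != approved_amount:
            log.info("claim %s: payment BLOCKED (mismatch of %.2f)", claim_id, requested_amount - approved_amount)
            return False
        log.info("claim %s: payment authorised", claim_id)
        return True

    authorise_payment("CLM-1001", approved_amount=12_500.00, requested_amount=12_500.00)   # authorised
    authorise_payment("CLM-1002", approved_amount=8_000.00,  requested_amount=8_750.00)    # blocked before payment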

In future, robots will also be used widely in the real-time review of social media streams to assess claims severity and reduce fraud. RPA will also receive and route advanced telematics data (including video imagery) that will be instantaneously captured during car accidents and downloaded from the cloud.

CX, integration benefits of RPA

One of the less acclaimed benefits of RPA (productivity and cost-saving being the most popular) is customer experience. Driving self-service within digital organisations is a priority, and allowing a claimant to register, manage their portfolio or submit a claim through an RPA-enabled app on their mobile device is one example of self-service. Not only does intelligent self-service improve the customer experience, it also drives down costs significantly.

Integration with other enabling technologies is one of the most important features of any RPA technology. Whether it is invoking a bot through an API, or being able to pass data gathered from a claim form to a downstream data-centric process, RPA technologies will have to integrate into existing systems and new AI-powered systems to prove the true value they can offer.


Veemal Kalanjee is MD of Infoflow, part of the Knowledge Integration Dynamics (KID) group. He has an extensive background in data management sciences, having graduated from Potchefstroom University with an MSc in computer science. He subsequently worked at KID for seven years in various roles within the data management space. Kalanjee later moved to Informatica SA as a senior pre-sales consultant, and recently moved back to the KID Group as MD of Infoflow, which focuses on data management technologies, in particular, Informatica.