Delivering Insights outside of the Business Data Lake aims to influence Enterprise Business Processes, either directly or indirectly (through people).
Relevant ways to deliver insights at the point of action within the existing Enterprise Landscape include:
Injecting Insights into an external Data Store
Sending messages to external Applications using their native APIs
Providing an API to extract an insight, or to invoke an analytic that derives the insight on the fly using data in the BDL
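The three delivery options listed above can be sketched as follows. This is a minimal illustration only: the in-memory dictionary and list stand in for an external Data Store and an external Application, and every name in it (`inject_into_store`, `send_to_application`, `get_insight`, `churn_analytic`, the sample insight) is a hypothetical introduced for the example, not part of the BDL definition.

```python
import json
from typing import Callable, Dict, List

# Hypothetical insight produced inside the BDL (all names are illustrative).
insight = {"customer_id": "C42", "churn_risk": 0.87}

# 1. Inject the insight into an external data store (a dict stands in for it).
external_store: Dict[str, dict] = {}

def inject_into_store(store: Dict[str, dict], item: dict) -> None:
    store[item["customer_id"]] = item

# 2. Send a message to an external application. `send` stands in for the
#    application's native API client.
crm_inbox: List[str] = []

def send_to_application(send: Callable[[str], None], item: dict) -> None:
    send(json.dumps(item, sort_keys=True))

# 3. Provide an API: return a precomputed insight, or invoke an analytic
#    on the fly using data held in the BDL.
precomputed: Dict[str, dict] = {"C42": insight}

def churn_analytic(interactions: List[float]) -> float:
    # Toy analytic: fewer recent interactions means higher risk.
    return round(1.0 / (1.0 + sum(interactions)), 2)

def get_insight(customer_id: str, bdl_activity: Dict[str, List[float]]) -> dict:
    if customer_id in precomputed:
        return precomputed[customer_id]
    return {"customer_id": customer_id,
            "churn_risk": churn_analytic(bdl_activity[customer_id])}

inject_into_store(external_store, insight)
send_to_application(crm_inbox.append, insight)
on_the_fly = get_insight("C7", {"C7": [1.0, 2.0]})
```

In a real deployment each stand-in would be replaced by the client of the target system; the point of the sketch is only that the same insight can be pushed, messaged, or served on request.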
Unified Data Management
Unified Data Management exploits metadata to manage the data lifecycle, data quality, and access policies, and to provide services for Master Data Management (MDM), Reference Data Management (RDM), and metadata management.
Master Data Management
Master Data represents a single source of common, basic business data objects that can be used by the BDL distillation and real-time analysis processing to verify, enrich and correlate data.
If Master Data Management practices and tools are already deployed in the Enterprise, the Business Data Lake should initially be built alongside them. This case typically illustrates how the BDL is deployed as a complement to other IT services.
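As an illustration of how Master Data can be used to verify and enrich incoming data, the sketch below looks up each event in a master customer table. The table contents, the field names, and the `enrich` function are assumptions made for the example; an existing MDM tool would normally provide the lookup.

```python
from typing import Dict, Optional

# Hypothetical master record set, keyed by the business identifier.
master_customers: Dict[str, dict] = {
    "C42": {"name": "Acme Corp", "segment": "Enterprise"},
}

def enrich(event: dict, master: Dict[str, dict]) -> Optional[dict]:
    """Verify the event against Master Data and enrich it when it matches."""
    record = master.get(event["customer_id"])
    if record is None:
        return None  # verification failed: unknown business object
    return {**event, **record}

enriched = enrich({"customer_id": "C42", "amount": 10}, master_customers)
rejected = enrich({"customer_id": "C99", "amount": 5}, master_customers)
```

Returning `None` for an unknown identifier is one possible design choice; a pipeline could equally route such events to a data quality queue.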
Reference Data Management
Reference Data contains authoritative lists of values or entities. These lists are generally massively re-used and widely “referenced” by other data or metadata. Country codes and Calendars constitute typical examples of Reference Data. They can also be considered as Master Data originating from internal or external standards organizations.
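A small sketch of how a Reference Data list might be "referenced" by other data: a country code found in a record is normalized and resolved against an authoritative list. Only a three-entry subset of ISO 3166-1 alpha-2 codes is shown, and the function name `resolve_country` is hypothetical.

```python
from typing import Optional

# Illustrative subset of ISO 3166-1 alpha-2 country codes.
COUNTRY_CODES = {
    "DE": "Germany",
    "FR": "France",
    "US": "United States of America",
}

def resolve_country(code: str) -> Optional[str]:
    """Return the country name for a code, or None if it is not in the list."""
    return COUNTRY_CODES.get(code.strip().upper())
```

For example, `resolve_country(" fr ")` resolves to the entry for `FR`, while a code absent from the list yields `None` and can be flagged as a data quality issue.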
Audit & Policy Management
The Business Data Lake should be implemented to accommodate audit controls (e.g. COBIT 5.0) and the centralized application of information policies for security and information governance, including provisioning, de-provisioning, access logs, data quality actions, authentication, authorization, encryption, filtering, log-ins, and single sign-on.
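The idea of applying policies centrally while keeping an audit trail can be sketched as below. This is a deliberately small model covering only provisioning, de-provisioning, authorization and access logging; the `PolicyEngine` class and its fields are assumptions for illustration and do not correspond to any specific product or to a COBIT control.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set

@dataclass
class PolicyEngine:
    # user -> set of datasets the user may read
    grants: Dict[str, Set[str]] = field(default_factory=dict)
    # every authorization decision is recorded for later audit
    access_log: List[dict] = field(default_factory=list)

    def provision(self, user: str, dataset: str) -> None:
        self.grants.setdefault(user, set()).add(dataset)

    def deprovision(self, user: str, dataset: str) -> None:
        self.grants.get(user, set()).discard(dataset)

    def authorize(self, user: str, dataset: str) -> bool:
        allowed = dataset in self.grants.get(user, set())
        self.access_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset,
            "allowed": allowed,
        })
        return allowed

engine = PolicyEngine()
engine.provision("alice", "sales_raw")
first = engine.authorize("alice", "sales_raw")    # granted
engine.deprovision("alice", "sales_raw")
second = engine.authorize("alice", "sales_raw")   # denied after de-provisioning
```

Because both granted and denied requests are logged in one place, the access log can serve as evidence for audit controls.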
Privacy and Protection
Data in a BDL implementation may come from numerous sources that reside in different jurisdictions, each with different privacy, retention and appropriate-use legislation. This is especially true in large multinational companies. Architects have to be aware of the legislation and ensure that the appropriate controls can be implemented in the BDL.
Information Security
Information Security shall be architected from the beginning, including the labeling and handling of data and the access to it over time (the sensitivity of data can vary over time; for example, a report to shareholders becomes common knowledge after release).
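One way to model a sensitivity label that varies over time is to attach an optional release date to the label, as in the sketch below. The `Label` class, the level names, and the dates are hypothetical and serve only to illustrate the shareholder report example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class Label:
    level: str                          # sensitivity before release
    public_from: Optional[date] = None  # date on which the data becomes public

    def effective_level(self, on: date) -> str:
        """Return the sensitivity that applies on the given day."""
        if self.public_from is not None and on >= self.public_from:
            return "public"
        return self.level

# A shareholder report is confidential until its release date.
report_label = Label(level="confidential", public_from=date(2024, 3, 1))
before = report_label.effective_level(date(2024, 2, 28))
after = report_label.effective_level(date(2024, 3, 1))
```

Evaluating the label at access time, rather than rewriting it on release, keeps the handling rule attached to the data from the moment it is ingested.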
Unified Operations
Unified Operations concerns the ability to provision, configure, monitor and manage the whole Business Data Lake from a single, unified environment that abstracts the distributed infrastructures and the multiple integrated services.
System Monitoring
System monitoring shall consolidate information from multiple levels, at least:
Infrastructures (disk, memory and network usage)
Operating System
Data storage
Processing workflows
The BDL itself can be used to get Insights from the logging data extracted from all the layers and services of the BDL.
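Consolidating monitoring information from the levels listed above can be sketched as a simple roll-up of samples into one view. The level names, metric names, and the choice to keep the worst (highest) value per metric are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# A sample is (level, metric, value), e.g. ("infrastructure", "disk_used_pct", 91.0).
Sample = Tuple[str, str, float]

def consolidate(samples: Iterable[Sample]) -> Dict[str, Dict[str, float]]:
    """Group samples by level and keep the highest value seen for each metric."""
    view: Dict[str, Dict[str, float]] = defaultdict(dict)
    for level, metric, value in samples:
        current = view[level].get(metric)
        view[level][metric] = value if current is None else max(current, value)
    return dict(view)

samples = [
    ("infrastructure", "disk_used_pct", 72.0),
    ("infrastructure", "disk_used_pct", 91.0),
    ("operating_system", "load_avg", 3.5),
    ("data_storage", "under_replicated_blocks", 4.0),
    ("processing_workflows", "failed_jobs", 1.0),
]
overview = consolidate(samples)
```

The same samples could also be ingested into the BDL itself, in line with the remark that the BDL can be used to derive Insights from its own logging data.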
System Management
The Business Data Lake System Management mainly consists of:
A resource manager for the provisioning of BDL Elastic Infrastructures. It also takes care of failures that can happen among the cluster nodes.
A workflow manager that executes Batch Processing Workflows.
The Resource Manager generally has control over the processing engines, so that the Business Data Lake is as scalable as possible.
System Management must take into account the diversity of Business Compartments, especially regarding the elasticity (or not) of the underlying infrastructures and the priorities for processing workflows.
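How a workflow manager might honor processing priorities across Business Compartments can be sketched with a priority queue. The `WorkflowQueue` class, the compartment names, and the convention that a lower number means higher priority are assumptions for illustration; elasticity of the underlying infrastructure is not modeled here.

```python
import heapq
from itertools import count
from typing import List, Tuple

class WorkflowQueue:
    """Orders batch workflows by priority, then by submission order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, str]] = []
        self._order = count()  # tie-breaker that preserves submission order

    def submit(self, compartment: str, workflow: str, priority: int) -> None:
        # Lower number = higher priority (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._order), compartment, workflow))

    def run_all(self) -> List[str]:
        executed = []
        while self._heap:
            _, _, compartment, workflow = heapq.heappop(self._heap)
            executed.append(f"{compartment}:{workflow}")
        return executed

queue = WorkflowQueue()
queue.submit("finance", "report", priority=2)
queue.submit("marketing", "scoring", priority=1)
queue.submit("finance", "close", priority=1)
execution_order = queue.run_all()
```

With these submissions, the two priority 1 workflows run in the order they were submitted, followed by the priority 2 workflow.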