In the 21st century, businesses have evolved. Innovation, optimization, and creativity are no longer considered advanced extras; they are what a business needs to survive and keep flourishing. The industry is moving from a "We make more" model to a "We love to share" model, so it is important to keep upgrading continuously while reducing the cost of running the business.
The 2020 pandemic taught industries that next-generation scale needs a diverse, distributed, yet connected workforce: one that comes from different demographics, with different skills, aptitudes, and capabilities, connected with the help of technology and delivering against the best possible targets.
Like an engine or a motor, this next-generation ecosystem needs fuel: the fuel that enables the business to make effective decisions, that enables the operations team to optimize operations and reduce their cost, and that enables the marketing team to drill down into how the product is used and improve the usability of the service or product.
Data is everywhere. Today, data is the most important investment, as well as the most important asset, that every business is focused on.
Every organization that generates and uses data for decision making is rethinking its data architecture. Businesses have come a long way in learning the art and importance of generating data and churning it into relevant, meaningful information. TOGAF 9.2 (The Open Group Architecture Framework) gives significant weight to Data Architecture, also called Information Architecture, as an essential part of the organization's IT strategy.
Organizations are moving towards affirmative, thought-driven decisions rather than reacting to events. Modern data-driven organizations anticipate business needs and market shifts, define the game plan, and work proactively to optimize resources for better outcomes.
Legacy data architecture cannot meet the business demand for agility, real-time data processing, consolidation of multiple sources of structured and unstructured data, security, and resilience. This makes it inevitable for organizations to assess and modernize their current data architecture capabilities and shift towards a cognitive, insight-driven enterprise.
At HooLiv, we are trying to bring data-driven decision making into almost every aspect of the business, be it operations, business development, room allocation, community, or product development. This article summarizes the key characteristics of a modern enterprise data architecture and can be used as a guide by organizations defining a data/information strategy for the enterprise.
Sales and business development are driven by digital websites, PWAs (Progressive Web Apps), quick searches, room recommendations, and other insights. Co-living operations are being optimized to reduce the time to acknowledge, operate, and resolve, as well as the cost of execution.
What is a Data Architecture? Imagine homes, bridges, and flyovers built without anyone designing the foundation. Imagine a scale of millions of transactions without the implementation team knowing how to build for that scale. Data/Information architecture creates the organization's blueprint for data environments while aligning with the organization's vision, its long-term and short-term goals, and the architecture scope.
Data architecture defines a standard set of protocols, tools, products, and frameworks used to manage data through its life cycle at the organization. It is the blueprint that defines the foundation architecture for information and elaborates it into the enterprise information architecture, defining the processes to create/capture, transform, utilize, archive, and disseminate the organization's data assets while delivering usable insights and value to business users.
A good data architecture starts from the foundation and identifies the actors involved with the data (originators, consumers, destroyers). It is designed from right to left, from data consumers back to data sources, and not the other way round.
Before the transformation or modernization journey to a modern enterprise architecture starts, it is essential to draft the current-state architecture, fundamentally covering Security Architecture, Information Architecture, Business Architecture, Technology Architecture, and Application Architecture. This is a must for understanding where the organization stands today and what it will take to drive it to the target state.
It takes effort to build architectures and platforms. In the past, data architectures were fairly simple, with straightforward strategies like data dumps and simple ETL tools, but the effort and technology required to build strategic data warehouses add to the challenge.
Modernization at any stage is an enormous effort and cost to the organization. Most such platforms take an army of resources to build, with a low or negative return in the short term but a valuable positive return in the long term. Therefore, it is important that foundational architectures are laid and translated into enterprise architectures in a phased manner, with a strategic vision and alignment.
A modern data architecture may still deliver a data warehouse or, beyond that, a data lake or a data forest: one that is flexible, scalable, adaptable, and agile. An enterprise data warehouse or a data lake is just one component of a robust modern enterprise data architecture. The new enterprise data architectures are living platforms, breathing on every event in the business ecosystem, continuously learning and adapting to what they learn. They are secure and governed, to ensure data safety and integrity to the greatest extent possible.
Let's talk about the important characteristics of this key aspect of Enterprise Architecture.
The simplest architecture is the best architecture. The lower the complexity, the easier it is to visualize the flow of information between components, and the easier the architecture is to maintain. Keeping the architecture simple despite diverse requirements is one of the most important tasks, and a mammoth one. Smaller organizations can use BI tools like Power BI, JasperReports, BIRT, or Pentaho, which have built-in data management capabilities, rather than going for big data and Hadoop systems. Organizations that need data from different sources, where structured and unstructured data is sourced, collated, and massaged to get insights, need diversified systems and more complex components.
To reduce the complexity of the systems, organizations should advocate a standard database analytics and management platform across the enterprise, while structuring MDM (Master Data Management), reducing data duplication across systems, and ensuring data integrity so that the outcomes derived from the data are correct.
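As a rough illustration of reducing duplication through master data, the sketch below merges customer records from two source systems into one master record per person. The `CustomerRecord` type, the source system names, and the choice of a normalized email address as the match key are all assumptions made for this example; real MDM tools use much richer matching rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    source: str      # hypothetical source system name
    email: str
    name: str = ""
    phone: str = ""


def match_key(record: CustomerRecord) -> str:
    """Normalize the email so trivially different spellings match."""
    return record.email.strip().lower()


def build_master(records: list[CustomerRecord]) -> dict[str, CustomerRecord]:
    """Merge records that share a match key into one master record.

    The first record seen wins; later records only fill in empty fields.
    """
    master: dict[str, CustomerRecord] = {}
    for rec in records:
        key = match_key(rec)
        current = master.get(key)
        if current is None:
            master[key] = CustomerRecord(rec.source, key, rec.name, rec.phone)
        else:
            master[key] = CustomerRecord(
                source=current.source,
                email=key,
                name=current.name or rec.name,
                phone=current.phone or rec.phone,
            )
    return master


if __name__ == "__main__":
    records = [
        CustomerRecord("booking_app", "Asha@Example.com ", name="Asha"),
        CustomerRecord("crm", "asha@example.com", phone="555-0100"),
        CustomerRecord("crm", "ravi@example.com", name="Ravi"),
    ]
    for key, rec in build_master(records).items():
        print(key, rec)
```

Every downstream report then reads from the master dictionary instead of from each source system, so a correction made once is visible everywhere.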
Nowadays, a lot of stress is laid on open APIs, which are like data for all. While the magic words are "Data for All", the architecture still has to ensure that data is accessed only by the right authority. For example, the Aadhaar APIs are open, but access to the Aadhaar UID data requires authorization and several registration steps.
A modern data architecture gives ready access to data only to authorized users. The boundaries are defined by the organization's information security policies: users get access only to the data they are authorized to access within the organizational access policies. The architecture is defined to comply with the global and local privacy regulations of the countries where the data is hosted, as well as the countries where the services are provided, e.g. the Health Insurance Portability and Accountability Act (HIPAA), the Personal Data Protection (PDP) Bill in India, and the General Data Protection Regulation (GDPR) of the European Union.
Some of the key controls adopted include masking Personally Identifiable Information (PII), encrypting data at rest and data in transit, and tracking and auditing access to the data with appropriate SIEM log tools.
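To make two of these controls concrete, here is a minimal sketch of PII masking and of an audit record for data access, using only the Python standard library. The function names, the masking rules, and the fields of the audit event are assumptions chosen for illustration. Encryption at rest and in transit is not shown, since it is usually delegated to the storage platform and to TLS.

```python
import hashlib
import hmac
import json
import re
from datetime import datetime, timezone

# Simplified email pattern, good enough for illustration only.
EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+)")


def mask_email(text: str) -> str:
    """Keep the first character and the domain, hide the rest of the local part."""
    return EMAIL_RE.sub(r"\1***@\2", text)


def mask_phone(phone: str) -> str:
    """Replace every digit except the last four with '*'."""
    total_digits = sum(ch.isdigit() for ch in phone)
    seen = 0
    out = []
    for ch in phone:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > total_digits - 4 else "*")
        else:
            out.append(ch)
    return "".join(out)


def pseudonymize(value: str, key: str) -> str:
    """Stable keyed token, so records can be joined without exposing the value."""
    digest = hmac.new(key.encode("utf-8"), value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def audit_event(user: str, dataset: str, action: str) -> str:
    """Build one JSON audit line that a log or SIEM pipeline could ingest."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "dataset": dataset,
        "action": action,
    })


if __name__ == "__main__":
    print(mask_email("Contact asha.k@example.com today"))
    print(mask_phone("+91 98765 43210"))
    print(pseudonymize("asha.k@example.com", key="demo-key"))
    print(audit_event("analyst_01", "resident_bookings", "read"))
```

Masking is applied before data reaches a reporting layer, while the pseudonymized token lets analysts count or join on a person without ever seeing the underlying identifier.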
Modern data architecture defines the data lifecycle and data authority: every data element has a source, an owner, a consumer, and a destroyer. With open access and data sharing coming into play, it is important to ensure data governance. Governance is the key to self-service. A modern data architecture defines access points for each type of user to ensure that users have access only to the information they need. For example, Amazon S3 bucket policies, together with S3 lifecycle rules, give organizations control over data access and data archival.
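As an illustration of such an access point, the sketch below builds an S3 bucket policy document as a Python dictionary and prints it as JSON. The bucket name, account ID, role ARN, and prefix are hypothetical placeholders. The policy grants one role read access to a single prefix and denies any request that does not use TLS; it is a starting point under those assumptions, not a complete governance setup.

```python
import json


def build_bucket_policy(bucket: str, reader_role_arn: str, prefix: str) -> dict:
    """Return a bucket policy: one role may read one prefix, plain HTTP is denied."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOfOnePrefix",
                "Effect": "Allow",
                "Principal": {"AWS": reader_role_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


if __name__ == "__main__":
    policy = build_bucket_policy(
        bucket="example-data-lake",                                  # hypothetical
        reader_role_arn="arn:aws:iam::123456789012:role/Analyst",    # hypothetical
        prefix="curated",
    )
    print(json.dumps(policy, indent=2))
```

Generating the policy from a function, rather than editing JSON by hand, keeps the access rules reviewable and repeatable across buckets.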
A modern data architecture is driven by the organization's architecture vision and the architecture scope defined during envisioning and architecture on-boarding. Rather than focusing on the technology required to extract, ingest, transform, and present information, a modern data architecture starts with the business users, the problem statements, and the flows, defines the gap, and plans the move from the current-state architecture to the future-state modern data architecture. A good data architecture continuously evolves to meet the new and changing information needs of stakeholders and the business.
Modern systems accommodate data from varied sources, structured and unstructured, e.g. hostel bookings coming from the mobile app, hostel bookings coming from the web, residents using different features, and streaming data such as community platform usage, messages, and events. In a modern data architecture, data flows like water through a tunnel. Enterprise data architecture helps maintain and manage the flow of the right information to business users at the right time by creating a series of interconnected, bidirectional data pipelines between the systems.
These modern data pipelines are built from master data, transactional data, materialized views, reference data stores, and logical relationships. These serve as building blocks, as we say in the Enterprise Architecture world, and are reused as and when required in different assets and components of the enterprise architecture to ensure a steady, focused flow of high-quality information to the business.
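The idea of reusable building blocks can be sketched as small pipeline steps that are composed per source. In the example below, the source names, the record fields, and the `ROOM_TYPES` reference table are hypothetical; the point is only that the same normalization and reference-data steps are reused for every pipeline.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Step = Callable[[Iterable[Record]], Iterator[Record]]

# Hypothetical reference data store, shared by every pipeline.
ROOM_TYPES = {"R1": "single", "R2": "twin-sharing"}


def normalize(source: str) -> Step:
    """Building block: bring raw booking records into one common shape."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        for rec in records:
            yield {
                "source": source,
                "resident": rec["resident"].strip().title(),
                "room_code": rec["room_code"].upper(),
            }
    return step


def enrich_with_reference(reference: dict) -> Step:
    """Building block: attach the room type from the reference data."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        for rec in records:
            yield {**rec, "room_type": reference.get(rec["room_code"], "unknown")}
    return step


def pipeline(*steps: Step) -> Step:
    """Chain steps so the output of one feeds the next."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        stream = iter(records)
        for s in steps:
            stream = s(stream)
        return stream
    return run


if __name__ == "__main__":
    mobile = pipeline(normalize("mobile_app"), enrich_with_reference(ROOM_TYPES))
    portal = pipeline(normalize("web_portal"), enrich_with_reference(ROOM_TYPES))
    print(list(mobile([{"resident": " asha k ", "room_code": "r1"}])))
    print(list(portal([{"resident": "ravi", "room_code": "r9"}])))
```

Because each step is a generator, records stream through one at a time, and adding a new source only means composing the existing blocks again.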
A modern data architecture transparently automates and continuously optimizes data life cycle management across infrastructures and formats. During implementation, the data sets are created and profiled with large volumes of past data to define the baseline. This process is called "metadata injection". Data catalogs are created, and models are trained with large sets of historical and factual data.
As data flows in, the data sets are tuned, most of the time self-tuned, except in extreme cases that need complex changes. This concept is called machine learning: the machine, i.e. the algorithms, is trained and tuned from time to time based on the data being ingested.
It is essential that data ingestion, profiling, tagging, and validation are automated to ensure a steady flow of data to the business in the right form and at the right time.
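A very small version of "profile the history, then validate what flows in" might look like the sketch below. It uses a plain statistical rule (distance from the historical mean in standard deviations) rather than a trained model, and the `Baseline` fields, thresholds, and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Optional, Sequence


@dataclass
class Baseline:
    mean: float
    stdev: float
    null_rate: float


def profile(values: Sequence[Optional[float]]) -> Baseline:
    """Summarize historical values of one column into a baseline."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot profile a column with no values")
    return Baseline(
        mean=fmean(present),
        stdev=pstdev(present),
        null_rate=1 - len(present) / len(values),
    )


def validate(
    batch: Sequence[Optional[float]],
    baseline: Baseline,
    z_limit: float = 3.0,
    null_slack: float = 0.05,
) -> list[str]:
    """Return a list of issues found in a new batch, compared to the baseline."""
    issues: list[str] = []
    nulls = sum(v is None for v in batch)
    if batch and nulls / len(batch) > baseline.null_rate + null_slack:
        issues.append("null rate above baseline")
    for i, v in enumerate(batch):
        if v is None or baseline.stdev == 0:
            continue
        if abs(v - baseline.mean) / baseline.stdev > z_limit:
            issues.append(f"row {i}: value {v} is an outlier")
    return issues


if __name__ == "__main__":
    history = [100.0, 110.0, 90.0, 105.0, 95.0, None]  # hypothetical daily bookings
    baseline = profile(history)
    print(validate([102.0, 500.0], baseline))
    print(validate([None, None, 100.0], baseline))
```

Re-running `profile` on a rolling window of recent data is one simple way to let the baseline tune itself as new data arrives.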
S.M.A.R.T - As we say, a modern data architecture must be SMART (Specific, Measurable, Actionable, Reliable, Time driven). A data architecture is more than just components, flows, and automations. The modern data architecture uses machine learning and artificial intelligence to build the data objects, connect the dots, and build the models that keep data flowing across components. It uses intelligence to learn, adjust, adapt, notify, and recommend. With use cases such as dynamic pricing, fraud detection, and user behavior analysis emerging from the use of insights, modern data architectures are built not just to provide insights but to be intelligent enough to create value.
S - Specific. The architecture is specific to, focused on, and driven by the business needs.
M - Measurable. The architecture must abide by the goals and KPIs laid down for it, and there must be a way to measure these while optimizing and adapting to changes in upcoming architecture cycles.
A - Actionable. The architecture must provide result-oriented insights that the business can use to make decisions.
R - Reliable. The data architecture becomes a strong foundation and a source of truth that the business can rely on for important decisions.
T - Time driven. Now this is important. The data architecture must be able to automate and present the right information to business users at the right time. Of course, what will you do with information from 5 years back when you have to look at the data and decide on the strategy now?
Modern data architectures are team driven. The TOGAF Enterprise Architecture standard lays down the organization capability matrix, and hence the responsibilities of each stakeholder involved in the architecture. A modern data architecture splits the responsibility for acquiring and transforming data and for maintaining the architecture. The technology team does the heavy lifting of moving data between systems and maintaining the flows, whereas the business users take over the analysis. Modern business units have data scientists who lead data preparation and the data catalog and use them to create and power data-driven business applications.
Modern data architectures must be flexible. With every organization expanding across different sub-systems and businesses, the modern data architecture must be able to support a multiplicity of business needs. The architecture must support multiple types of business users; symmetric and asymmetric data loads; different refresh rates, query operations, and volumes; different physical architectures (e.g. on premises, public cloud, private cloud, hybrid); and varied data processing engines (relational, OLAP, MapReduce, SQL, etc.).
It should also adapt to handle new data sources as and when they arise, while continuing to support existing enterprise data stores, i.e. data warehouses, data lakes, and other data targets.
The modern data architecture must be resilient, with appropriate high availability, disaster recovery, and backup/restore capabilities, depending, of course, on the business criticality class (BCC/BCI) of the application or data under consideration.
Modern data architectures run on huge server farms or on PaaS (Platform as a Service) offerings from top-tier cloud service providers, who offer ready-to-use high availability and disaster recovery solutions with geographically distributed data centers.
The modern data architecture is an essential arm of the modern enterprise architecture. This article touches on some of the key characteristics of modern data architectures that every architect should keep in mind while defining data transformation journeys, and it aims to help organizations understand and adopt the concept of a modern enterprise data architecture.