Data architecture from the 1970s until today

Get a comprehensive overview of the evolution of data architectures, from hierarchical and network databases to the data mesh, and learn why the data warehouse is still the most widely used architectural model and why there is no universal architecture that fits all organizations.

The evolution of data architecture has been driven by the growing importance of data in organizations. From traditional data warehouses to modern data fabric and data mesh approaches, each of these architectures has overcome specific challenges and opened up new opportunities.

The 70s: Hierarchical and network databases

In the 1970s, computer systems were dominated by centrally managed mainframes. Data was organized using hierarchical or network database models. These models offered different ways of structuring the data in a database: either in a hierarchical structure, which represents a parent-child relationship from one element to another, or in a network, which links many elements with one another.

The 80s: The client-server model

In the late 1980s and early 1990s, a new data architecture paradigm emerged with the advent of the client-server model. It marked a move away from centralized mainframe systems towards distributed systems in which responsibilities were divided between servers (providers of resources or services) and clients (consumers of those services). For databases, this meant that the database management system (DBMS) could run on a server while users and applications accessed the data from client computers. This approach revolutionized scalability and accessibility and simplified the management of growing data volumes and user numbers.

The 90s: Traditional Data Warehousing

In the 1990s, the concept of data warehousing fundamentally changed how companies approached the storage and analysis of data. At its core, a data warehouse is a large, centralized repository for data from various sources. The architecture uses a three-tier structure: the data source layer, the data warehouse layer, and the front-end client layer. ETL processes (Extract, Transform, Load) pull data from the different operational databases, convert it into a consistent format, and load it into the data warehouse. There, the data is typically stored in a relational database and organized according to an OLAP cube model (Online Analytical Processing), which allows complex analytical and ad-hoc queries.
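The ETL flow described above is easy to see in miniature. The following sketch uses Python with two in-memory SQLite databases standing in for an operational source system and the warehouse; the `orders` and `fact_orders` tables and their columns are hypothetical examples, not taken from any real system.

```python
import sqlite3

# Stand-ins for the operational source database and the data warehouse.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

# Operational data, often in source-specific formats (amounts stored as text).
source.execute("CREATE TABLE orders (id INTEGER, amount TEXT, ts TEXT)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, "19.90", "2024-01-03"), (2, "5.00", "2024-01-04")])

# Warehouse fact table with a consistent, analysis-ready schema.
warehouse.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, amount REAL, order_date TEXT)")

# Extract: pull raw rows from the operational source.
rows = source.execute("SELECT id, amount, ts FROM orders").fetchall()

# Transform: convert into a consistent format (amounts as numbers).
clean = [(order_id, float(amount), ts) for order_id, amount, ts in rows]

# Load: write the conformed rows into the warehouse fact table.
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", clean)
warehouse.commit()

print(warehouse.execute("SELECT * FROM fact_orders").fetchall())
```

In practice each step is far more involved (scheduling, incremental loads, dimensional modeling), but the extract-transform-load sequence stays the same.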
The 2000s: Big Data and Hadoop

In the 2000s, the proliferation of the internet, social media, and IoT devices led to a drastic increase in data volume, variety, and velocity, giving rise to what is known as "Big Data." Traditional data warehouses could no longer effectively handle these heterogeneous, large volumes of data generated at high speed. The open-source framework Hadoop, which emerged in 2005, revolutionized data architecture. It was designed specifically for processing massive amounts of data on computer clusters. The framework introduced the concept of distributed storage and processing: data was no longer confined to a single storage location but could be stored and processed across multiple nodes.

The 2010s: Cloud and Data Lake Architectures

In the 2010s, cloud computing emerged as a new paradigm, providing scalable resources as a service over the internet. This development had a significant impact on data architecture and led to the creation of data lakes. Unlike traditional data warehouses, which ingest data with an ETL process (Extract, Transform, Load), data lakes employ an Extract-Load-Transform (ELT) process: data extracted from various sources is first loaded into cost-effective BLOB storage, transformed there, and only then transferred to a data warehouse that uses more expensive block storage.

The need to process large volumes of data in real time gave rise to the Lambda and Kappa architecture models. The Lambda architecture takes a hybrid approach, using both batch and stream processing to produce accurate and up-to-date insights. All incoming data is captured and stored as an append-only log, creating an immutable historical record. The architecture is divided into three layers: the batch layer, the speed layer, and the serving layer. In the Kappa architecture, all data is ingested and processed as a single unbounded stream of events. It consists of three main components: stream ingestion, stream processing, and long-term storage.
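To make the layering concrete, here is a minimal, illustrative sketch of the Lambda architecture in plain Python, assuming a simple page-view counting use case (the event shape and function names are hypothetical). Production systems would use dedicated engines instead, for example Hadoop or Spark for the batch layer and Kafka with Flink for the speed layer.

```python
from collections import defaultdict

master_log = []                    # immutable, append-only record of all raw events
batch_views = {}                   # precomputed views (batch layer output)
realtime_views = defaultdict(int)  # incremental views (speed layer output)

def ingest(event):
    """All incoming data is appended to the master log and fed to the speed layer."""
    master_log.append(event)             # immutable historical record
    realtime_views[event["page"]] += 1   # low-latency incremental update

def run_batch_layer():
    """Recompute views from the complete master log (slow but accurate)."""
    counts = defaultdict(int)
    for event in master_log:
        counts[event["page"]] += 1
    batch_views.update(counts)
    realtime_views.clear()  # speed layer now only covers data since this batch run

def query(page):
    """Serving layer: merge the batch view with the real-time view."""
    return batch_views.get(page, 0) + realtime_views.get(page, 0)

ingest({"page": "/home"})
ingest({"page": "/home"})
run_batch_layer()
ingest({"page": "/home"})   # arrives after the batch run
print(query("/home"))       # 3: batch view (2) + speed layer (1)
```

A Kappa implementation would drop the batch layer entirely and rebuild its views by replaying the long-term event log through the same stream processor.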
The 2020s: Data Lakehouse

Data lakehouses represent a new generation of data platforms: a data lakehouse combines the advantages of data lakes and data warehouses to store structured, semi-structured, and unstructured data in a unified data lake. This eliminates the need for separate data silos and allows data teams to perform analyses and derive insights directly from raw data, without having to move or duplicate it. The Medallion architecture, also known as the "multi-hop" architecture, is used for the logical organization of data in a lakehouse. Its goal is to progressively improve the structure and quality of the data as it flows through each layer of the architecture (Bronze – Silver – Gold).

The 2020s: Data Fabric

The data fabric represents the fourth generation of data platform architecture. Its goal is to make data available anytime and anywhere. A data fabric consists of a network of data platforms, such as data warehouses, data lakes, IoT/edge devices, and transactional databases, that interact with one another and are distributed across the enterprise's computing ecosystem. One node in the fabric can supply raw data to another node, which then performs analyses on it. These analyses can in turn be exposed as REST APIs within the fabric, allowing transactional systems to use them for decision-making. Data assets can be published in various categories, enabling an enterprise-wide data marketplace.

Future Concept: Data Mesh

Data mesh is an architectural concept for organizing data in large enterprises. Instead of being stored and managed centrally, data in a data mesh is decentralized: it remains within the individual domains or business areas, and mechanisms are introduced to enable access and exchange between these domains. A data mesh typically rests on four principles: domain orientation, self-service, data productization, and infrastructure automation. By implementing a data mesh, companies can respond more flexibly to change, because data management is tailored to the specific needs of the individual business areas, while the scalability and reusability of data increase at the same time.

Comparison of Data Architectures

Data Warehouse remains the most common Data Architecture Model

Although newer architectures such as data lakes and data meshes are gaining importance, data warehouses remain the most common data architecture today. They have established themselves as a proven method for centrally storing and analyzing large volumes of structured data, and companies value the reliability and stability they have demonstrated over the years. In addition, data warehouses are closely integrated with business intelligence (BI) and analytics tools, enabling seamless analysis of the stored data.

Another important aspect is the ability of data warehouses to store and process historical data efficiently. This allows companies to identify trends, patterns, and changes over time and to make informed decisions. The centralized storage and management of data in a data warehouse also supports high data quality and consistency, which is crucial for businesses. Modern data warehouse technologies additionally offer scalability options that let companies expand their infrastructure as data volumes grow.

Architecture Selection must be based on the Needs

There is no universal architecture that suits every use case and every company. Rather, the choice of the appropriate architecture is determined by a variety of factors, including current and future use cases, the diversity of the data landscape, and the technologies and platforms in use. Every organization has its own requirements and challenges that may call for a tailored architecture. It is therefore essential to develop an architecture that meets both current and future needs while remaining flexible enough to adapt to changing requirements.

DATA WAREHOUSE
Technology: DBMS
Platforms: On-prem or cloud
Data sources: Structured
Data integration: Batch
Data models: Dimensional, data vault
Data quality: Assured
Data governance: Centralized
Importance of metadata: Medium
Usage: Standard reports, ad-hoc analysis

DATA LAKE
Technology: Object stores
Platforms: Cloud
Data sources: All data
Data integration: Copy
Data models: Schema-less
Data quality: Unverified
Data governance: Undefined
Importance of metadata: Low
Usage: Data science

LAMBDA/KAPPA
Technology: Streaming
Platforms: On-prem and/or cloud
Data sources: Structured and semi-structured
Data integration: Stream and batch
Data models: Stream and modeled
Data quality: Monitoring of streams
Data governance: Minimally defined
Importance of metadata: Low to medium
Usage: AI-driven real-time analytics

DATA LAKEHOUSE
Technology: DBMS and object stores
Platforms: Cloud
Data sources: Hybrid
Data integration: Copy and batch
Data models: Hybrid
Data quality: Partially assured
Data governance: Centralized
Importance of metadata: Medium
Usage: Standard reports, ad-hoc analysis, data science

DATA FABRIC
Technology: Data virtualization
Platforms: On-prem and/or cloud
Data sources: Structured
Data integration: Virtual
Data models: Dimensional, data vault
Data quality: Monitoring
Data governance: Hybrid
Importance of metadata: High
Usage: Standard reports, ad-hoc analysis

DATA MESH
Technology: Various formats and data catalogs
Platforms: On-prem and/or cloud
Data sources: Hybrid
Data integration: Copy, batch, stream
Data models: Hybrid
Data quality: Decentralized
Data governance: Decentralized
Importance of metadata: High
Usage: Standard reports, ad-hoc analysis, data science, AI-driven real-time analytics

As experienced Modern Intelligence experts for holistic end-to-end Data Intelligence, we support companies in selecting and building a customized, future-proof data architecture. Get in touch!

Our data management services