数据库相关概念(一)之DB、DW和DBMS



  • 前言

    在学习数据库的时候发现自己对很多概念都不太了解,比如DBDWDBMSOLAPOLTP分别指什么;将图数据库系统(例如Neo4jDgraph等)和图计算系统(例如Spark下的Graphx)混为一谈等等。充分暴露了我基础知识十分匮乏这一事实,陷入“递归式”学习······虽然这只是一些概念,简单了解后直接学习相关数据库操作是完全没有问题的,但秉承着“看都看了不如透彻一点”以及“万一以后面试问到了呢”的想法,我还是整理记录一下。如果有不妥的地方欢迎指正。

    DB/Database/数据库

    Database is a systematic collection of data. A database contains information organized in columns, rows, and tables that is periodically indexed to make accessing relevant information more accessible.

    Many enterprises and organizations create and manage databases using a database management system (DBMS). Organizations most often use databases for online transaction processing (OLTP).

    Database software needs to provide easy access to information and fast querying so that transactions can be carried out efficiently. Databases are often referred to as operational systems, meaning they are used to process day-to-day transactions in an organization.

    Most databases use a normalized data structure. Data normalization means reorganizing data so that it contains no redundant data, and all related data items are stored together, with related data separated into multiple tables. Normalizing data ensures the database takes up minimal disk space while response times are maximized.

    关键词规范化的数据结构 无冗余数据 能方便地访问信息 提供快速查询 用于处理日常事务

    关系:使用DBMS(数据库管理系统)创建和管理DB(数据库),用DB进行OLTP(在线事务处理)

    通俗说法DB保留的是数据信息的最新状态,且只有这一个状态,无法看到历史状态。

    DW/Data Warehouse/数据仓库

    A data warehouse is a system that pulls together data from many different sources within an organization for reporting and analysis. The reports created from complex queries within a data warehouse are used to make business decisions.

    The primary focus of a data warehouse is to provide a correlation between data from existing systems. Data warehouses are used for online analytical processing (OLAP), which uses complex queries to analyze rather than process transactions.

    The more normalized your data is, the more complex the queries needed to read the data because a single query combines data from many tables. The data in a data warehouse does not need to be organized for quick transactions. Therefore, data warehouses normally use a denormalized data structure. A denormalized data structure uses fewer tables because it groups data and doesn’t exclude data redundancies. Denormalization offers better performance when reading data for analytical purposes.

    关键词非规范化的数据结构(星型、雪花型) 不排除冗余数据 集合不同来源的数据 进行分析 支持复杂查询

    关系:用DW进行OLAP(在线分析处理)

    通俗说法:保存的是DB中的不同时间点的状态。对同一个数据信息,保留不同时间点的状态,就便于做统计分析。

    一、数据仓库的架构

    数据仓库的架构由三层组成。架构的底层是加载和存储数据的数据库服务器。中间层包括用于访问和分析数据的分析引擎。顶层是通过报告、分析和数据挖掘工具呈现结果的前端客户端。

    二、数据仓库的运作

    将数据整理成描述数据布局和类型(例如整数、数据字段或字符串)的 Schema。提取的数据将存储在 Schema 描述的各种表中。查询工具使用 Schema 来确定要访问和分析哪些数据表。

    DBMS/Database Management System/数据库管理系统

    Database Management System is a collection of programs which enables its users to create, read, update and delete data in a database. The DBMS essentially serves as an interface between the database and end users or application programs, ensuring that data is consistently organized and remains easily accessible.

    关键词对数据库写读改删 数据库与终端用户/应用之间的接口

    DBMS作为DB与终端用户之间的接口

    小结

    DWDB都是关系数据系统,但它们的构建目的不同。DW用于存储大量历史数据,并支持对所有数据进行快速、复杂的查询,通常使用在线分析处理(OLAP)。 DB用于存储当前事务(指导致数据创建、更新、删除或检索的事件),并允许对正在进行的业务中的事务进行快速访问,即在线事务处理(OLTP)。

    补充说明

    ETL/Extraction-Transformation-Loading用于完成DBDW的数据转存,这一块涉及到的内容较多我还没看...之后会补充上来。

    推荐阅读:The Difference Between a Data Warehouse and a Database



  • 可以看看关系数据库的ACID规则还有MySQL架构相关的博客,可能会对你想了解的知识有一个总览。



  • @ryougi 好的,谢谢学长!正在看Mysql和es6


 

Copyright © 2018 bbs.dian.org.cn All rights reserved.

与 Dian 的连接断开,我们正在尝试重连,请耐心等待