95702 Week 1 - Introduction to Distributed Systems

Author: QIFAN  Date: 2016-10-28

Distributed Systems for Information Systems Management (95702), Fall 2016, Carnegie Mellon University
Course link: http://www.andrew.cmu.edu/course/95-702/syllabus.html
Notes outline


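1 Basic Concepts

  • Distributed system: hardware or software components located on networked computers that communicate and coordinate only by passing messages.

  • Characteristics

    • Concurrency: programs run at the same time on different nodes, and the activities of two nodes proceed independently of each other.
    • No global clock: the clocks of different nodes are not synchronized to a single time.
    • Independent failures: the failure of one node is isolated from the others, and a crashed program may go unnoticed by the other nodes.
  • Main motivation: distributed systems are built primarily for resource sharing.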

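2 Examples of Distributed Systems

2.1 Web Search

The main task of search is to index the entire content of the World Wide Web. With tens of billions of web pages and more than ten billion distinct URLs already online, this is a very complex task. Google is the typical example.

2.2 Massively Multiplayer Online Games (MMOGs)

  • Some MMOGs use a centralized architecture with a single central server. This eases the problem of keeping replicas consistent, but it places heavy demands on that server, which may become overloaded.
  • Others distribute servers around the world. Such systems scale easily, place lower demands on each server, and respond faster to users in different regions, but keeping the servers synchronized is hard.
  • Most MMOGs combine the two approaches.

2.3 Financial Trading Systems

The event streams in these systems are high-volume and fast-moving, and they must be processed in real time.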

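3 Trends in Distributed Systems

3.1 The Modern Internet

The Internet is itself a very large distributed system.

  • Firewall: protects a system from unauthorized messages by filtering traffic.
  • Backbone: a network link with high transmission capacity, using satellite connections, fibre optic cables, and other high-bandwidth circuits.

3.2 Mobile Computing

With the spread of phones and other mobile devices, distributed systems are expected to support location-aware or context-aware computing.

3.3 Distributed Multimedia Systems

3.4 Distributed Computing as a Utility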

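4 Challenges

4.1 Heterogeneity

Different nodes may use different programming languages, or different representations for characters and data structures.

4.2 Openness

The openness of a system determines whether it can be extended and re-implemented in various ways.

4.3 Security

Security of information resources has three components: confidentiality (protection against disclosure to unauthorized parties), integrity (protection against alteration or corruption), and availability (protection against interference with access to the resource).

4.4 Scalability

A system is considered scalable if it remains effective when large numbers of resources and users are added.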

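4.5 Failure Handling

Common techniques include:

  • Detecting failures.
  • Masking failures: hide a failure or make it less severe, for example by retransmitting messages or keeping backup copies of resources.
  • Tolerating failures
  • Recovering from failures
  • Redundancy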
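
Masking a failure by retransmission can be sketched as below. This is only an illustration under assumptions: `TransientError`, `send_with_retry`, `make_flaky_send`, and the limit of three attempts are hypothetical names and values chosen for the example, and the "channel" is a local function rather than a real network.

```python
import itertools


class TransientError(Exception):
    """Stands in for a lost message or a timeout (hypothetical)."""


def send_with_retry(send, message, max_attempts=3):
    """Call send(message) until it succeeds or the attempts run out.

    Returns (reply, number_of_attempts_used).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message), attempt
        except TransientError as err:
            last_error = err  # failure detected; mask it by resending
    raise last_error  # the failure could not be masked, so report it


def make_flaky_send(failures):
    """Build a fake channel that drops the first `failures` messages."""
    counter = itertools.count()

    def flaky_send(message):
        if next(counter) < failures:
            raise TransientError("message lost")
        return f"ack:{message}"

    return flaky_send


# Two messages are lost, the third attempt gets through.
reply, attempts = send_with_retry(make_flaky_send(2), "hello")
```

The caller sees a normal reply even though two transmissions failed. When the channel keeps failing past the attempt limit, the error is raised instead of hidden, which is the point where detection or recovery has to take over.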

4.6 Concurrency

4.7 Transparency

4.8 Quality of Service

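5 The World Wide Web

The Web is built on three main components:

  1. HTML, the HyperText Markup Language.
  2. URL. Identifies which server holds a resource and which resource is being requested from that server.
  3. A client-server system architecture, using HTTP.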
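
How a URL names both the server and the resource can be seen by splitting one apart. The sketch below uses Python's standard `urllib.parse.urlsplit` on the course link given at the top of these notes; the variable names are chosen only for illustration.

```python
from urllib.parse import urlsplit

url = "http://www.andrew.cmu.edu/course/95-702/syllabus.html"
parts = urlsplit(url)

scheme = parts.scheme      # protocol used to talk to the server
server = parts.hostname    # which server holds the resource
resource = parts.path      # which resource is requested from that server
```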

REST: every resource on the Web has a unique URL and responds to the same set of operations.
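
The idea of a uniform set of operations can be sketched with a toy in-memory store. This is not a real web server: `ResourceStore`, its `handle` method, and the `/students/1` path are hypothetical, and only the status codes (200, 201, 204, 404, 405) are taken from HTTP.

```python
class ResourceStore:
    """Toy in-memory 'server': every resource is addressed by a path."""

    def __init__(self):
        self._resources = {}

    def handle(self, method, path, body=None):
        """Apply one of the uniform operations; return (status, body)."""
        if method == "GET":
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "PUT":
            created = path not in self._resources
            self._resources[path] = body
            return (201 if created else 200), body
        if method == "DELETE":
            if path not in self._resources:
                return 404, None
            del self._resources[path]
            return 204, None
        return 405, None  # operation outside the uniform set


store = ResourceStore()
first_put = store.handle("PUT", "/students/1", {"name": "Ada"})
fetched = store.handle("GET", "/students/1")
deleted = store.handle("DELETE", "/students/1")
missing = store.handle("GET", "/students/1")
```

The same three verbs work on any path, so a client needs no resource-specific interface, only the URL.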

Basic Intro

Fundamental Characteristics of Distributed Systems

  • components located on networked computers and executed concurrently
  • components communicate and coordinate only by passing messages
  • time differs on each system (there is no global clock)

Challenges in Constructing DS

  • heterogeneity of components may hinder interoperability
  • security:
    Wires may be destroyed on purpose
  • scalability:
    We may need to scale the system up as demand grows
  • failure handling
  • concurrency:
    How to handle several put operations at the same time?
  • openness:
    Can the system be extended or modified?
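
One common answer to the concurrency question above is to make each put atomic with a lock. The sketch below uses threads inside a single process as a stand-in for concurrent clients; `SharedStore` and `run_concurrent_puts` are hypothetical names, and a real distributed store would need coordination across machines rather than a local lock.

```python
import threading


class SharedStore:
    """Key-value store whose put() is safe to call from many threads."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, amount):
        # The read-modify-write must be atomic, so it runs under the lock.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + amount

    def get(self, key):
        with self._lock:
            return self._data.get(key, 0)


def run_concurrent_puts(n_threads=8, puts_per_thread=1000):
    """Start several writers at once and return the final counter value."""
    store = SharedStore()

    def worker():
        for _ in range(puts_per_thread):
            store.put("counter", 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.get("counter")


total = run_concurrent_puts()
```

Because every update holds the lock, no increment is lost and the total equals the number of puts issued.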

Models

architecture: the structure of a system, specified in terms of its separate components and their interrelationships.

Architectural Elements

  • communicating entities
  • communication paradigms
  • roles and responsibilities of communicating entities
  • placement of communicating entities

Who are communicating?

  • system level: processes, threads or simply nodes
  • problem level: objects, components, web services

  • asynchronous systems: the client makes a call and continues with other work without waiting; the system may provide a means for delivering the response later

  • synchronous systems: the client makes a call, then blocks and waits for the response
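
The difference between the two call styles can be sketched with Python's standard `concurrent.futures`. Here `remote_call` is a hypothetical stand-in for a remote operation (it only sleeps briefly to mimic latency), and the returned `Future` plays the role of the "means for a response".

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_call(x):
    """Stand-in for a remote operation (hypothetical); sleeps to mimic latency."""
    time.sleep(0.05)
    return x * 2


def synchronous_client(x):
    # The caller blocks here until the response is available.
    return remote_call(x)


def asynchronous_client(executor, x):
    # submit() returns at once; the Future is the handle for the response.
    future = executor.submit(remote_call, x)
    other_work = "did other work while waiting"
    return other_work, future.result()  # collect the response later


sync_result = synchronous_client(21)
with ThreadPoolExecutor(max_workers=1) as executor:
    note, async_result = asynchronous_client(executor, 21)
```

Both clients end up with the same value; what differs is whether the caller is free to do something else between issuing the call and reading the response.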

Communication paradigms

  • Coupling:
    • in time: both parties exist during the interaction
    • in space: parties know who they are interacting with
  • Interprocess communication: lower level, often used to build higher-level abstractions. Examples: TCP sockets, UDP sockets, multicast sockets. Coupled in time.
  • Remote invocation: higher-level abstractions, a two-way exchange with a remote operation, procedure, or method. Examples: RPC, RMI, HTTP, DCOM, CORBA. Coupled in time and space.
  • Indirect communication: not tightly coupled, and involving a third party
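
A small sketch may help contrast direct and indirect communication. It runs in one process and uses only the standard library: `socket.socketpair()` gives two connected endpoints that must both exist during the exchange, while a `queue.Queue` acts as the third party that holds messages until someone collects them. The function names and message contents are hypothetical.

```python
import queue
import socket


def direct_exchange(payload):
    """Direct communication: both endpoints exist during the exchange."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        return receiver.recv(1024)
    finally:
        sender.close()
        receiver.close()


mailbox = queue.Queue()  # the third party between sender and receiver


def publish(message):
    """Sender: hands the message to the queue and returns immediately."""
    mailbox.put(message)


def consume_all():
    """Receiver: may start long after the sender has finished."""
    received = []
    while True:
        try:
            received.append(mailbox.get_nowait())
        except queue.Empty:
            return received


direct = direct_exchange(b"ping")

publish("order-1")
publish("order-2")
# The sender is done before any receiver runs: uncoupled in time.
messages = consume_all()
```

The publisher never names a receiver and never waits for one, which is what "not tightly coupled" means in both space and time.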

Roles and responsibilities for entities

  • to perform a useful activity
  • one may act as a client and another as a server
    • Request/Response
    • Request/Acknowledge: when the server receives a request, it returns an acknowledgement that the request was delivered instead of the response itself
      • Request/Acknowledge/Poll: allows clients to poll for results at their leisure
      • Request/Acknowledge/Callback: the server calls back to the client as soon as the request has been handled
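
The acknowledge, poll, and callback variants can be sketched with a toy server that hands out a ticket as the acknowledgement and does the work later. Everything here is hypothetical (`JobServer`, its methods, and the "work" of upper-casing a string), and the deferred processing is triggered by hand rather than by a background worker.

```python
import uuid


class JobServer:
    """Toy server that acknowledges requests and finishes them later."""

    def __init__(self):
        self._pending = {}   # ticket -> (payload, callback)
        self._results = {}   # ticket -> result

    def request(self, payload, callback=None):
        """Accept a request and return an acknowledgement ticket."""
        ticket = str(uuid.uuid4())
        self._pending[ticket] = (payload, callback)
        return ticket

    def poll(self, ticket):
        """Request/Acknowledge/Poll: None means 'not ready yet'."""
        return self._results.get(ticket)

    def process_all(self):
        """Do the deferred work; notify clients that left a callback."""
        for ticket, (payload, callback) in list(self._pending.items()):
            result = payload.upper()
            self._results[ticket] = result
            if callback is not None:
                callback(ticket, result)  # Request/Acknowledge/Callback
        self._pending.clear()


server = JobServer()
ticket = server.request("report")
before = server.poll(ticket)          # acknowledged, but no result yet

notified = []
server.request("invoice", callback=lambda t, r: notified.append(r))
server.process_all()
after = server.poll(ticket)           # the polling client fetches its result
```

Polling leaves the timing to the client, while the callback variant pushes the result out as soon as it exists.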

Placement of entities

  • entities can be placed on single or multiple machines
  • data may be cached and services replicated
  • mobile code such as applets and JavaScript
  • mobile agents or worms

Architectural Patterns

  • layered architecture: organizes services vertically into layers of abstraction:

    • applications and services layer
    • middleware layer
    • operating system layer
    • computer and network hardware layer
  • tiered architecture

    • always applied to the applications and services layer
    • may be logically divided into presentation logic, business logic, and data logic
    • main driver: to promote separation of concerns
    • Examples:
      • two-tier: client/server architecture, with business logic and UI on the client and the data logic layer on the server
      • three-tier: the logical division corresponds to physical machines and processes, as in Google Maps
  • proxy pattern: the client makes calls on a local object (the proxy) that has the same interface as the remote object; the proxy hides the communication. Analogy: a check or bank draft is a proxy for funds in an account
  • brokerage pattern: a system that consists of multiple remote objects which interact synchronously or asynchronously.
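
The proxy pattern described above can be sketched as follows. This is a single-process illustration under assumptions: `RemoteAccount`, `FakeTransport`, and `AccountProxy` are hypothetical classes, and the "network" is just a method call that carries JSON text, standing in for a real connection.

```python
import json


class RemoteAccount:
    """The 'remote object'; in a real system it lives on another machine."""

    def __init__(self, balance):
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount
        return self._balance


class FakeTransport:
    """Stands in for the network: carries JSON text to the remote object."""

    def __init__(self, target):
        self._target = target

    def send(self, request_text):
        request = json.loads(request_text)
        method = getattr(self._target, request["method"])
        return json.dumps({"result": method(*request["args"])})


class AccountProxy:
    """Local object with the same interface as RemoteAccount."""

    def __init__(self, transport):
        self._transport = transport

    def deposit(self, amount):
        # Marshal the call, ship it, then unmarshal the reply.
        request_text = json.dumps({"method": "deposit", "args": [amount]})
        return json.loads(self._transport.send(request_text))["result"]


account = AccountProxy(FakeTransport(RemoteAccount(100)))
new_balance = account.deposit(25)
```

The client code calls `deposit` exactly as it would on a local account; the marshalling and transport stay hidden inside the proxy.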

Transparency

Goal: To raise the level of abstraction

  • access transparency: enables local and remote resources to be accessed using identical operations
  • location transparency: enables resources to be accessed without knowledge of their location
  • concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them
  • replication transparency: enables multiple instances of resources to be used without knowledge of the replicas by users or application programmers
  • failure transparency: allows users to complete their tasks despite the failure of hardware or software components
  • mobility transparency: allows resources and clients to move within a system without affecting the operation of users or programs
  • performance transparency: allows the system to be reconfigured to improve performance as loads vary
  • scalability transparency: allows the system to expand in scale without changing its structure or algorithms
