A Thorough Introduction to Distributed Systems

by Stanislav Kozlovski

What is a Distributed System and why is it so complicated?

With the ever-growing technological expansion of the world, distributed systems are becoming more and more widespread. They are a vast and complex field of study in computer science.

This article aims to introduce you to distributed systems in a basic manner, showing you a glimpse of the different categories of such systems while not diving deep into the details.

What is a distributed system?

In its simplest definition, a distributed system is a group of computers working together so as to appear as a single computer to the end-user.

These machines have a shared state, operate concurrently and can fail independently without affecting the whole system’s uptime.

I propose we incrementally work through an example of distributing a system so that you can get a better sense of it all:

Let’s go with a database! Traditional databases are stored on the filesystem of one single machine; whenever you want to fetch or insert information, you talk to that machine directly.

For us to distribute this database system, we’d need to have it run on multiple machines at the same time. The user must be able to talk to whichever machine they choose and should not be able to tell that they are not talking to a single machine — if they insert a record into node #1, node #3 must be able to return that record.

Why distribute a system?

Systems are always distributed by necessity. The truth of the matter is — managing distributed systems is a complex topic chock-full of pitfalls and landmines. It is a headache to deploy, maintain and debug distributed systems, so why go there at all?

What a distributed system enables you to do is scale horizontally. Going back to our previous example of the single database server, the only way to handle more traffic would be to upgrade the hardware the database is running on. This is called scaling vertically.

Scaling vertically is all well and good while you can, but after a certain point you will see that even the best hardware is not sufficient for enough traffic, not to mention impractical to host.

Scaling horizontally simply means adding more computers rather than upgrading the hardware of a single one.

It is significantly cheaper than vertical scaling after a certain threshold, but that is not the main reason to prefer it.

Vertical scaling can only bump your performance up to the latest hardware’s capabilities. These capabilities prove to be insufficient for technological companies with moderate to big workloads.

The best thing about horizontal scaling is that you have no cap on how much you can scale — whenever performance degrades you simply add another machine, potentially without limit.

Easy scaling is not the only benefit you get from distributed systems. Fault tolerance and low latency are equally important.

Fault Tolerance — a cluster of ten machines across two data centers is inherently more fault-tolerant than a single machine. Even if one data center catches on fire, your application would still work.

Low Latency — The time for a network packet to travel the world is physically bounded by the speed of light. For example, the shortest possible round-trip time for a request (that is, back and forth) in a fiber-optic cable between New York and Sydney is around 160ms. Distributed systems allow you to have a node in both cities, allowing traffic to hit the node that is closest to it.
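
To put a number on that bound, here is the back-of-the-envelope arithmetic (the distance and fiber speed below are rough assumptions, not measured figures):

```python
# Rough lower bound on New York <-> Sydney round-trip time. Both figures are
# assumptions: ~16,000 km one-way, light in fiber at ~200,000 km/s (about 2/3 c).
distance_km = 16_000
fiber_speed_km_s = 200_000

one_way_ms = distance_km / fiber_speed_km_s * 1000
round_trip_ms = 2 * one_way_ms
print(f"one-way: {one_way_ms:.0f}ms, round-trip: {round_trip_ms:.0f}ms")
# one-way: 80ms, round-trip: 160ms, before any routing or processing delay
```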

For a distributed system to work, though, you need the software running on those machines to be specifically designed for running on multiple computers at the same time and handling the problems that come along with it. This turns out to be no easy feat.

Scaling our database

Imagine that our web application got insanely popular. Imagine also that our database started getting twice as many queries per second as it can handle. Your application would immediately start to decline in performance and this would get noticed by your users.

Let’s work together and make our database scale to meet our high demands.

In a typical web application you normally read information much more frequently than you insert new information or modify old information.

There is a way to increase read performance and that is by the so-called Primary-Replica Replication strategy. Here, you create two new database servers which sync up with the main one. The catch is that you can only read from these new instances.

Whenever you insert or modify information — you talk to the primary database. It, in turn, asynchronously informs the replicas of the change and they save it as well.
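
Here is a minimal sketch of that flow in Python, with the asynchronous propagation made explicit as a replication log (all names are invented for illustration). Note how a read issued before replication catches up returns nothing, which is exactly the pitfall discussed below:

```python
import random

class PrimaryReplicaDB:
    """Toy primary-replica setup: writes hit the primary, reads hit replicas,
    and replication is asynchronous (modeled as an explicit log)."""

    def __init__(self, replica_count: int = 2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.replication_log = []          # changes not yet applied to replicas

    def write(self, key, value):
        self.primary[key] = value          # acknowledged right away
        self.replication_log.append((key, value))

    def replicate(self):
        """Runs in the background in a real system; explicit here."""
        for key, value in self.replication_log:
            for replica in self.replicas:
                replica[key] = value
        self.replication_log.clear()

    def read(self, key):
        return random.choice(self.replicas).get(key)   # reads never hit primary

db = PrimaryReplicaDB()
db.write("user:1", "Alice")
print(db.read("user:1"))   # None: the write has not propagated yet
db.replicate()
print(db.read("user:1"))   # 'Alice': the replicas caught up
```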

Congratulations, you can now execute 3x as many read queries! Isn’t this great?

Pitfall

Gotcha! We immediately lost the C in our relational database’s ACID guarantees, which stands for Consistency.

You see, there now exists a possibility in which we insert a new record into the database, immediately afterwards issue a read query for it and get nothing back, as if it didn’t exist!

Propagating the new information from the primary to the replica does not happen instantaneously. There actually exists a time window in which you can fetch stale information. If this were not the case, your write performance would suffer, as it would have to synchronously wait for the data to be propagated.

Distributed systems come with a handful of trade-offs. This particular issue is one you will have to live with if you want to adequately scale.

Continuing to Scale

Using the replica database approach, we can horizontally scale our read traffic up to some extent. That’s great, but we’ve hit a wall in regard to our write traffic — it’s still all in one server!

We’re not left with many options here. We simply need to split our write traffic into multiple servers as one is not able to handle it.

One way is to go with a multi-primary replication strategy. There, instead of replicas that you can only read from, you have multiple primary nodes which support reads and writes. Unfortunately, this gets complicated real quick as you now have the ability to create conflicts (e.g. insert two records with the same ID).

Let’s go with another technique called sharding (also called partitioning).

With sharding you split your server into multiple smaller servers, called shards. These shards all hold different records — you create a rule as to what kind of records go into which shard. It is very important to create the rule such that the data gets spread in a uniform way.

A possible approach to this is to define ranges according to some information about a record (e.g. users whose names start with A–D).

This sharding key should be chosen very carefully, as the load is not always equal across arbitrary columns (e.g. more people have a name starting with C than with Z). A single shard that receives more requests than others is called a hot spot and must be avoided. Once split up, re-sharding data becomes incredibly expensive and can cause significant downtime.
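
A client-side routing rule for this could look like the following sketch (the range boundaries are invented, and as noted above, naive letter ranges invite hot spots):

```python
# Illustrative client-side routing rule: range-based sharding on the first
# letter of the user's name. Boundaries are invented; note how skewed name
# distributions would turn one of these shards into a hot spot.
SHARDS = {
    "shard_0": set("ABCDEF"),
    "shard_1": set("GHIJKLM"),
    "shard_2": set("NOPQRSTUVWXYZ"),
}

def shard_for(name: str) -> str:
    first_letter = name[0].upper()
    for shard, letters in SHARDS.items():
        if first_letter in letters:
            return shard
    raise ValueError(f"no shard covers {name!r}")

print(shard_for("Catherine"))   # shard_0
print(shard_for("Zack"))        # shard_2
```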

To keep our example simple, assume our client (the Rails app) knows which database to use for each record. It is also worth noting that there are many strategies for sharding and this is a simple example to illustrate the concept.

We have won quite a lot right now — we can increase our write traffic N times, where N is the number of shards. This practically gives us almost no limit — imagine how fine-grained we can get with this partitioning.

Pitfall

Everything in software engineering is more or less a trade-off and this is no exception. Sharding is no simple feat and is best avoided until really needed.

We have now made queries by keys other than the partitioned key incredibly inefficient (they need to go through all of the shards). SQL JOIN queries are even worse and complex ones become practically unusable.

Decentralized vs Distributed

Before we go any further I’d like to make a distinction between the two terms.

Even though the words sound similar and may seem to mean the same thing logically, their difference makes a significant technological and political impact.

Decentralized is still distributed in the technical sense, but a decentralized system is not owned by one actor. No one company can own a decentralized system, otherwise it wouldn’t be decentralized anymore.

This means that most systems we will go over today can be thought of as distributed centralized systems — and that is what they’re made to be.

If you think about it — it is harder to create a decentralized system because then you need to handle the case where some of the participants are malicious. This is not the case with normal distributed systems, as you know you own all the nodes.

Note: This definition has been debated a lot and can be confused with others (peer-to-peer, federated). Regardless, what I gave you as a definition is what I feel is the most widely used now that blockchain and cryptocurrencies popularized the term.

Distributed System Categories

We are now going to go through a couple of distributed system categories and list their largest publicly-known production usage. Bear in mind that most such numbers shown are outdated and are most probably significantly bigger as of the time you are reading this.

Distributed Data Stores

Distributed Data Stores are most widely used and recognized as Distributed Databases. Most distributed databases are non-relational databases, limited to key-value semantics. They provide incredible performance and scalability at the cost of consistency or availability.

Known Scale — back in 2015

We cannot go into discussions of distributed data stores without first introducing the CAP Theorem.

CAP Theorem

Proven in 2002, the CAP theorem states that a distributed data store cannot simultaneously be consistent, available and partition tolerant.

Some quick definitions:

  • Consistency — What you read and write sequentially is what is expected (remember the gotcha with the database replication a few paragraphs ago?)

  • Availability — the whole system does not die — every non-failing node always returns a response.

  • Partition Tolerant — The system continues to function and uphold its consistency/availability guarantees in spite of network partitions

In reality, partition tolerance must be a given for any distributed data store. As mentioned in many places, you cannot have consistency and availability without partition tolerance.

Think about it: if you have two nodes which accept information and their connection dies — how are they both going to be available and simultaneously provide you with consistency? They have no way of knowing what the other node is doing, and as such can either go offline (unavailable) or work with stale information (inconsistent).

In the end you’re left to choose if you want your system to be strongly consistent or highly available under a network partition.

Practice shows that most applications value availability more. You do not necessarily always need strong consistency. Even then, that trade-off is not necessarily made because you need the 100% availability guarantee, but rather because network latency can be an issue when having to synchronize machines to achieve strong consistency. These and more factors make applications typically opt for solutions which offer high availability.

Such databases settle with the weakest consistency model — eventual consistency. This model guarantees that if no new updates are made to a given item, eventually all accesses to that item will return the latest updated value.
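
As a rough sketch of how replicas can converge, here is a last-write-wins merge, just one of several possible strategies (all names invented for illustration):

```python
# Each replica stores (value, timestamp) per key; exchanging state and keeping
# the newest write makes all replicas converge once writes stop.
def merge(local: dict, incoming: dict) -> None:
    """Last-write-wins: keep whichever value carries the newest timestamp."""
    for key, (value, ts) in incoming.items():
        if key not in local or local[key][1] < ts:
            local[key] = (value, ts)

node_a = {"profile": ("old bio", 1)}
node_b = {"profile": ("new bio", 2)}   # node_b saw a later update

merge(node_a, node_b)   # background anti-entropy exchange, both directions
merge(node_b, node_a)

assert node_a == node_b   # no new writes, so all accesses now agree
```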

Those systems provide BASE properties (as opposed to traditional databases’ ACID):

  • Basically Available — The system always returns a response

  • Soft state — The system could change over time, even during times of no input (due to eventual consistency)

  • Eventual consistency — In the absence of input, the data will spread to every node sooner or later — thus becoming consistent

Examples of such highly available distributed databases include Cassandra, which we discuss below. Of course, there are other data stores which prefer stronger consistency.

The CAP theorem is worthy of multiple articles on its own.

Cassandra

Cassandra, as mentioned above, is a distributed NoSQL database which prefers the AP properties out of the CAP, settling with eventual consistency. I must admit this may be a bit misleading, as Cassandra is highly configurable — you can make it provide strong consistency at the expense of availability as well, but that is not its common use case.

Cassandra uses consistent hashing to determine which nodes out of your cluster must manage the data you are passing in. You set a replication factor, which basically states to how many nodes you want to replicate your data.

When reading, you will read from those nodes only.
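
A bare-bones illustration of the idea, assuming a simple consistent-hashing ring without the virtual nodes a real cluster would use (node names and the hash choice are mine, not Cassandra's):

```python
import bisect
import hashlib

class HashRing:
    """Bare-bones consistent hashing ring. Illustrative only: real systems
    such as Cassandra add virtual nodes for smoother load balancing."""

    def __init__(self, nodes, replication_factor: int = 3):
        self.replication_factor = replication_factor
        self.ring = sorted((self._hash(node), node) for node in nodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def nodes_for(self, key: str):
        """Walk clockwise from the key's position, collecting N distinct nodes."""
        start = bisect.bisect(self.ring, (self._hash(key), ""))
        picked = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in picked:
                picked.append(node)
            if len(picked) == self.replication_factor:
                break
        return picked

ring = HashRing(["node-a", "node-b", "node-c", "node-d", "node-e"])
print(ring.nodes_for("user:42"))   # the replicas responsible for this key
```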

Cassandra is massively scalable, providing absurdly high write throughput.

Even though the benchmark behind this claim might be biased, and it appears to compare Cassandra to databases set to provide strong consistency (otherwise I can’t see why MongoDB would drop in performance when upgraded from 4 to 8 nodes), it should still show what a properly set up Cassandra cluster is capable of.

Regardless, in the distributed systems trade-off which enables horizontal scaling and incredibly high throughput, Cassandra does not provide some fundamental features of ACID databases — namely, transactions.

Consensus

Database transactions are tricky to implement in distributed systems as they require each node to agree on the right action to take (abort or commit). This is known as consensus and it is a fundamental problem in distributed systems.

Reaching the type of agreement needed for the “transaction commit” problem is straightforward if the participating processes and the network are completely reliable. However, real systems are subject to a number of possible faults, such as process crashes, network partitioning, and lost, distorted, or duplicated messages.
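
The textbook protocol for the reliable case is two-phase commit; here is a sketch under the assumption that nobody crashes (class and method names are invented). A coordinator failing between the two phases leaves everyone blocked, which hints at why real consensus is so much harder:

```python
# Sketch of two-phase commit (2PC), the classic protocol for the "transaction
# commit" problem. It is straightforward when nothing fails, and it is exactly
# what breaks once a node can crash mid-protocol.

class Participant:
    def prepare(self, txn: str) -> bool:
        """Phase 1 vote. A real node would first persist the txn to disk."""
        return True   # stand-in for local validation

    def commit(self, txn: str) -> None:
        print(f"participant committed {txn}")

    def abort(self, txn: str) -> None:
        print(f"participant aborted {txn}")

def two_phase_commit(txn: str, participants: list) -> None:
    # Phase 1: collect votes. A single "no" (or, in reality, a timeout) aborts.
    votes = [p.prepare(txn) for p in participants]
    # Phase 2: broadcast the unanimous decision to everyone.
    if all(votes):
        for p in participants:
            p.commit(txn)
    else:
        for p in participants:
            p.abort(txn)

two_phase_commit("txn-1", [Participant(), Participant()])
# If the coordinator dies between the phases, participants that voted "yes"
# are stuck holding locks, not knowing the outcome. That is the hard part.
```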

This poses an issue — it has been proven impossible to guarantee that a correct consensus is reached within a bounded time frame on a non-reliable network.

In practice, though, there are algorithms that reach consensus on a non-reliable network pretty quickly. Cassandra actually provides lightweight transactions through the use of the Paxos algorithm for distributed consensus.

Distributed Computing

Distributed computing is the key to the influx of Big Data processing we’ve seen in recent years. It is the technique of splitting an enormous task (e.g. aggregating 100 billion records), of which no single computer is capable of practically executing on its own, into many smaller tasks, each of which can fit into a single commodity machine. You split your huge task into many smaller ones, have them execute on many machines in parallel, aggregate the data appropriately and you have solved your initial problem. This approach again enables you to scale horizontally — when you have a bigger task, simply include more nodes in the calculation.

Known Scale — around 160,000 machines

An early innovator in this space was Google, which by necessity of their large amounts of data had to invent a new paradigm for distributed computation — MapReduce. They published a paper on it in 2004 and the open source community later created Apache Hadoop based on it.

MapReduce

MapReduce can be simply defined as two steps — mapping the data and reducing it to something meaningful.

Let’s get at it with an example again:

Say we are Medium and we stored our enormous information in a secondary distributed database for warehousing purposes. We want to fetch data representing the number of claps issued each day throughout April 2017 (a year ago).

This example is kept as short, clear and simple as possible, but imagine we are working with loads of data (e.g. analyzing billions of claps). We won’t be storing all of this information on one machine, obviously, and we won’t be analyzing all of this with one machine only. We also won’t be querying the production database but rather some “warehouse” database built specifically for low-priority offline jobs.

Each Map job is a separate node transforming as much data as it can. Each job traverses all of the data in the given storage node and maps it to a simple tuple of the date and the number one. Then, three intermediary steps (which nobody talks about) are done — Shuffle, Sort and Partition. They basically further arrange the data and delegate it to the appropriate reduce job. As we’re dealing with big data, we have each Reduce job separated to work on a single date only.
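
A single-process sketch of those steps for our claps example (the record layout is invented; on a real cluster each phase runs on many machines in parallel):

```python
from itertools import groupby
from operator import itemgetter

# Toy warehouse records standing in for billions of clap events.
claps = [
    {"date": "2017-04-01", "user": "a"},
    {"date": "2017-04-01", "user": "b"},
    {"date": "2017-04-02", "user": "c"},
]

def map_phase(records):
    for record in records:
        yield record["date"], 1           # emit (key, value) tuples

def shuffle_sort(pairs):
    """The intermediary steps: sort the tuples and group values by key."""
    for date, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield date, [count for _, count in group]

def reduce_phase(grouped):
    for date, counts in grouped:
        yield date, sum(counts)           # one reduce job per date

print(dict(reduce_phase(shuffle_sort(map_phase(claps)))))
# {'2017-04-01': 2, '2017-04-02': 1}
```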

This is a good paradigm and surprisingly enables you to do a lot with it — you can chain multiple MapReduce jobs, for example.

Better Techniques

MapReduce is somewhat legacy nowadays and brings some problems with it. Because it works in batches (jobs), a problem arises where if your job fails, you need to restart the whole thing. A 2-hour job failing can really slow down your whole data processing pipeline and you do not want that in the very least, especially in peak hours.

Another issue is the time you wait until you receive results. In real-time analytic systems (which all have big data and thus use distributed computing) it is important to have your latest crunched data be as fresh as possible and certainly not from a few hours ago.

As such, other architectures have emerged that address these issues — namely the Lambda Architecture (a mix of batch processing and stream processing) and the Kappa Architecture (only stream processing). These advances in the field have brought new tools enabling them.

Distributed File Systems

Distributed file systems can be thought of as distributed data stores. They’re the same thing as a concept — storing and accessing a large amount of data across a cluster of machines all appearing as one. They typically go hand in hand with Distributed Computing.

Known Scale —

Wikipedia defines the difference as being that distributed file systems allow files to be accessed using the same interfaces and semantics as local files, not through a custom API like the Cassandra Query Language (CQL).

HDFS

Hadoop Distributed File System (HDFS) is the distributed file system used for distributed computing via the Hadoop framework. Boasting widespread adoption, it is used to store and replicate large files (GB or TB in size) across many machines.

Its architecture consists mainly of NameNodes and DataNodes. NameNodes are responsible for keeping metadata about the cluster, like which node contains which file blocks. They act as coordinators for the network by figuring out where best to store and replicate files, tracking the system’s health. DataNodes simply store files and execute commands like replicating a file, writing a new one and others.

Unsurprisingly, HDFS is best used with Hadoop for computation as it provides data awareness to the computation jobs. Said jobs then get run on the nodes storing the data. This leverages data locality — it optimizes computations and reduces the amount of traffic over the network.

IPFS

IPFS is an exciting new peer-to-peer protocol/network for a distributed file system. It boasts a completely decentralized architecture with no single owner nor point of failure.

IPFS offers a naming system (similar to DNS) called IPNS and lets users easily access information. It stores files via historic versioning, similar to how Git does. This allows for accessing all of a file’s previous states.

It is still undergoing heavy development (v0.4 as of the time of writing) but has already seen projects interested in building on top of it.

Distributed Messaging

Messaging systems provide a central place for storage and propagation of messages/events inside your overall system. They allow you to decouple your application logic from directly talking with your other systems.

Known Scale —

Simply put, a messaging platform works in the following way:

A message is broadcast from the application which potentially creates it (called a producer), goes into the platform and is read by potentially multiple applications which are interested in it (called consumers).

If you need to save a certain event to a few places (e.g. user creation to database, warehouse, email sending service and whatever else you can come up with), a messaging platform is the cleanest way to spread that message.

Consumers can either pull information out of the brokers (pull model) or have the brokers push information directly into the consumers (push model).
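
A toy, in-memory sketch of the pull model (real brokers add persistence, partitioning and delivery guarantees; every name here is invented):

```python
from collections import defaultdict, deque

class Broker:
    """Toy in-memory broker using the pull model: producers and consumers
    never talk to each other directly, which is the whole decoupling point."""

    def __init__(self):
        self.topics = defaultdict(dict)   # topic -> {consumer: queue}

    def subscribe(self, topic: str, consumer: str) -> None:
        self.topics[topic][consumer] = deque()

    def publish(self, topic: str, message) -> None:
        for queue in self.topics[topic].values():
            queue.append(message)         # every subscriber gets its own copy

    def poll(self, topic: str, consumer: str):
        queue = self.topics[topic][consumer]
        return queue.popleft() if queue else None

broker = Broker()
broker.subscribe("user.created", "email-service")
broker.subscribe("user.created", "warehouse")
broker.publish("user.created", {"id": 1, "name": "Alice"})

print(broker.poll("user.created", "email-service"))   # each consumer reads
print(broker.poll("user.created", "warehouse"))       # at its own pace
```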

There are a couple of popular top-notch messaging platforms:

RabbitMQ — Message broker which allows finer-grained control of message trajectories via routing rules and other easily configurable settings. Can be called a smart broker, as it has a lot of logic in it and tightly keeps track of messages that pass through it. Provides settings for both AP and CP from CAP. Uses a push model for notifying the consumers.

Kafka — Message broker (and all-out platform) which is a bit lower level, as in it does not keep track of which messages have been read and does not allow for complex routing logic. This helps it achieve amazing performance. In my opinion, this is the biggest prospect in this space, with active development from the open-source community and support from the Confluent team. Kafka arguably has the most widespread use from top tech companies.

ActiveMQ — The oldest of the bunch, dating from 2004. Uses the JMS API, meaning it is geared towards Java EE applications. It got rewritten as ActiveMQ Artemis, which provides outstanding performance on par with Kafka.

Amazon SQS — A messaging service provided by AWS. Lets you quickly integrate it with existing applications and eliminates the need to handle your own infrastructure, which might be a big benefit, as systems like Kafka are notoriously tricky to set up. Amazon also offers two similar services — SNS and Amazon MQ, the latter of which is basically ActiveMQ but managed by Amazon.

Distributed Applications

If you roll up 5 Rails servers behind a single load balancer, all connected to one database, could you call that a distributed application? Recall my definition from up above:

A distributed system is a group of computers working together so as to appear as a single computer to the end-user. These machines have a shared state, operate concurrently and can fail independently without affecting the whole system’s uptime.

If you count the database as a shared state, you could argue that this can be classified as a distributed system — but you’d be wrong, as you’ve missed the “working together” part of the definition.

A system is distributed only if the nodes communicate with each other to coordinate their actions.

Therefore something like an application running its back-end code on a peer-to-peer network can better be classified as a distributed application. Regardless, this is all needless classification that serves no purpose but to illustrate how fussy we are about grouping things together.

Known Scale —

Erlang Virtual Machine

Erlang is a functional language that has great semantics for concurrency, distribution and fault-tolerance. The Erlang Virtual Machine itself handles the distribution of an Erlang application.

Its model works by having many isolated lightweight processes, all with the ability to talk to each other via a built-in system of message passing. This is called the actor model, and the Erlang OTP libraries can be thought of as a distributed actor framework (along the lines of Akka for the JVM).
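
A rough Python imitation of the idea: private state, a mailbox, and message passing as the only way in. This only mimics the model; it has none of Erlang's scheduling or fault tolerance:

```python
import queue
import threading
import time

class Actor:
    """A minimal actor: private state plus a mailbox. Messages are the only
    way to affect the state; the actor processes them one at a time."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.count = 0   # private state, never touched from outside
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, message):
        """Asynchronous, fire-and-forget message passing."""
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message == "increment":
                self.count += 1
            elif message == "report":
                print(f"count = {self.count}")

counter = Actor()
for _ in range(3):
    counter.send("increment")
counter.send("report")   # prints: count = 3
time.sleep(0.1)          # give the actor's thread time to drain the mailbox
```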

The model is what helps it achieve great concurrency rather simply — the processes are spread across the available cores of the system running them. Since this is indistinguishable from a network setting (apart from the ability to drop messages), Erlang’s VM can connect to other Erlang VMs running in the same data center or even on another continent. This swarm of virtual machines runs one single application and handles machine failures via takeover (another node gets scheduled to run).

In fact, the distributed layer of the language was added in order to provide fault tolerance. Software running on a single machine is always at risk of having that single machine dying and taking your application offline. Software running on many nodes allows easier hardware failure handling, provided the application was built with that in mind.

BitTorrent

BitTorrent is one of the most widely used protocols for transferring large files across the web via torrents. The main idea is to facilitate file transfer between different peers in the network without having to go through a main server.

Using a BitTorrent client, you connect to multiple computers across the world to download a file. When you open a .torrent file, you connect to a so-called tracker, which is a machine that acts as a coordinator. It helps with peer discovery, showing you the nodes in the network which have the file you want.

You have the notions of two types of user, a leecher and a seeder. A leecher is the user who is downloading a file and a seeder is the user who is uploading said file.

The funny thing about peer-to-peer networks is that you, as an ordinary user, have the ability to join and contribute to the network.

BitTorrent and its precursors (Napster, Gnutella) allow you to voluntarily host files and upload to other users who want them. The reason BitTorrent is so popular is that it was the first of its kind to provide incentives for contributing to the network. Freeriding, where a user would only download files, was an issue with the previous file sharing protocols.

BitTorrent solved freeriding to an extent by making seeders upload more to those who provide the best download rates. It works by incentivizing you to upload while downloading a file. Unfortunately, after you’re done, nothing is making you stay active in the network. This causes a lack of seeders in the network who have the full file, and as the protocol relies heavily on such users, solutions like private trackers came to fruition. Private trackers require you to be a member of a community (often invite-only) in order to participate in the distributed network.

After advancements in the field, trackerless torrents were invented. This was an upgrade to the BitTorrent protocol that did not rely on centralized trackers for gathering metadata and finding peers, but instead uses new algorithms. One such instance is Kademlia (used by the Mainline DHT), a distributed hash table (DHT) which allows you to find peers through other peers. In effect, each user performs a tracker’s duties.
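
The core trick, sketched with a toy 4-bit ID space (all IDs invented): nodes and keys share one ID space, and "closeness" is the XOR of the two IDs, so a lookup can be routed ever closer to the key:

```python
# Toy 4-bit ID space; real Kademlia uses much larger IDs and routing tables.
def xor_distance(a: int, b: int) -> int:
    """Nodes and keys share one ID space; 'closeness' is simply XOR."""
    return a ^ b

known_peers = [0b0001, 0b0110, 0b1011, 0b1100]   # peers this node knows about
key_id = 0b1010                                  # the content we are looking for

# Forward the lookup to the peers closest to the key; each hop repeats this
# with its own (closer) peers until the key's owners are found.
closest = sorted(known_peers, key=lambda peer: xor_distance(peer, key_id))
print([f"{peer:04b}" for peer in closest[:2]])   # ['1011', '1100']
```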

Distributed Ledgers

A distributed ledger can be thought of as an immutable, append-only database that is replicated, synchronized and shared across all nodes in the distributed network.

Known Scale —

They leverage the Event Sourcing pattern, allowing you to rebuild the ledger’s state at any time in its history.

Blockchain

Blockchain is the current underlying technology used for distributed ledgers and in fact marked their start. This latest and greatest innovation in the distributed space enabled the creation of the first ever truly distributed payment protocol — Bitcoin.

Blockchain is a distributed ledger carrying an ordered list of all transactions that ever occurred in its network. Transactions are grouped and stored in blocks. The whole blockchain is essentially a linked list of blocks (hence the name). Said blocks are computationally expensive to create and are tightly linked to each other through cryptography.

Simply said, each block contains a special hash (that starts with X amount of zeroes) of the current block’s contents (in the form of a Merkle Tree) plus the previous block’s hash. This hash requires a lot of CPU power to be produced because the only way to come up with it is through brute-force.

Miners are the nodes who try to compute the hash (via brute force). The miners all compete with each other over who can come up with a random string (called a nonce) which, when combined with the contents, produces the aforementioned hash. Once somebody finds the correct nonce, they broadcast it to the whole network. Said string is then verified by each node on its own and accepted into their chain.
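
A toy version of that brute-force search (assuming a hex-string target with 4 leading zeroes; real Bitcoin hashes a binary block header with double SHA-256):

```python
import hashlib
from itertools import count

def mine(contents: str, prev_hash: str, difficulty: int = 4):
    """Brute-force a nonce so the block's hash starts with `difficulty` zeroes."""
    for nonce in count():
        digest = hashlib.sha256(f"{prev_hash}{contents}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest          # found it; broadcast to the network

nonce, block_hash = mine("alice pays bob 1 coin", prev_hash="0" * 64)
print(nonce, block_hash)

# Verification is a single hash, and that asymmetry is the whole point:
header = f"{'0' * 64}alice pays bob 1 coin{nonce}"
assert hashlib.sha256(header.encode()).hexdigest() == block_hash
```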

This translates into a system where it is absurdly costly to modify the blockchain and absurdly easy to verify that it is not tampered with.

It is costly to change a block’s contents because that would produce a different hash. Remember that each subsequent block’s hash is dependent on it. If you were to change a transaction in the first block in the chain, you would change the Merkle Root. This would in turn change the block’s hash (most likely without the needed leading zeroes) — that would change block #2’s hash and so on and so on. This means you’d need to brute-force a new nonce for every block after the one you just modified.
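
A sketch of why tampering is detectable: each block stores the previous block's hash, so one change breaks every link after it (structure invented for illustration; nonces here are not actually mined):

```python
import hashlib

def block_hash(prev_hash: str, contents: str, nonce: int) -> str:
    return hashlib.sha256(f"{prev_hash}{contents}{nonce}".encode()).hexdigest()

def chain_is_valid(chain) -> bool:
    """Every block must reference the hash of the block right before it."""
    for prev, block in zip(chain, chain[1:]):
        if block["prev_hash"] != block_hash(**prev):
            return False
    return True

chain = [{"prev_hash": "0" * 64, "contents": "genesis", "nonce": 0}]
chain.append({"prev_hash": block_hash(**chain[0]), "contents": "tx: a->b", "nonce": 0})
chain.append({"prev_hash": block_hash(**chain[1]), "contents": "tx: b->c", "nonce": 0})

print(chain_is_valid(chain))            # True
chain[0]["contents"] = "tx: a->EVE"     # rewrite history in the first block...
print(chain_is_valid(chain))            # False: every later link now fails
```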

The network always trusts and replicates the longest valid chain. In order to cheat the system and eventually produce a longer chain you’d need more than 50% of the total CPU power used by all the nodes.

Blockchain can be thought of as a distributed mechanism for emergent consensus. Consensus is not achieved explicitly — there is no election or fixed moment when consensus occurs. Instead, consensus is an emergent product of the asynchronous interaction of thousands of independent nodes, all following protocol rules.

This unprecedented innovation has recently become a boom in the tech space, with people predicting it will mark the creation of Web 3.0. It is definitely the most exciting space in the software engineering world right now, filled with extremely challenging and interesting problems waiting to be solved.

Bitcoin

What previous distributed payment protocols lacked was a way to practically prevent the double-spending problem in real time, in a distributed manner. Research has produced interesting propositions[1], but Bitcoin was the first to implement a practical solution with clear advantages over others.

The double-spending problem states that an actor (e.g. Bob) cannot spend his single resource in two places. If Bob has $1, he should not be able to give it to both Alice and Zack — it is only one asset, it cannot be duplicated. It turns out it is really hard to truly achieve this guarantee in a distributed system. There are some interesting mitigation approaches predating blockchain, but they do not completely solve the problem in a practical way.

Double-spending is solved easily by Bitcoin, as only one block is added to the chain at a time. Double-spending is impossible within a single block, therefore even if two blocks are created at the same time — only one will come to be on the eventual longest chain.

Bitcoin relies on the difficulty of accumulating CPU power.

While in a voting system an attacker need only add nodes to the network (which is easy, as free access to the network is a design target), in a CPU power based scheme an attacker faces a physical limitation: getting access to more and more powerful hardware.

This is also the reason malicious groups of nodes need to control over 50% of the computational power of the network to actually carry out any successful attack. Less than that, and the rest of the network will create a longer blockchain faster.

Ethereum

Ethereum can be thought of as a programmable blockchain-based software platform. It has its own cryptocurrency (Ether) which fuels the deployment of smart contracts on its blockchain.

Smart contracts are a piece of code stored as a single transaction in the Ethereum blockchain. To run the code, all you have to do is issue a transaction with a smart contract as its destination. This in turn makes the miner nodes execute the code and whatever changes it incurs. The code is executed inside the Ethereum Virtual Machine.

Solidity, Ethereum’s native programming language, is what’s used to write smart contracts. It is a Turing-complete programming language which directly interfaces with the Ethereum blockchain, allowing you to query state like balances or other smart contract results. To prevent infinite loops, running the code requires some amount of Ether.

As the blockchain can be interpreted as a series of state changes, a lot of Distributed Applications have been built on top of Ethereum and similar platforms.

Further usages of distributed ledgers

Proof of Existence — A service to anonymously and securely store proof that a certain digital document existed at some point of time. Useful for ensuring document integrity, ownership and timestamping.

Decentralized Autonomous Organizations (DAOs) — organizations which use blockchain as a means of reaching consensus on the organization’s improvement propositions.

Decentralized Authentication — Store your identity on the blockchain, enabling you to use single sign-on (SSO) everywhere.

And many, many more. The distributed ledger technology really did open up endless possibilities. Some are most probably being invented as we speak!

Summary

In the short span of this article, we managed to define what a distributed system is, why you’d use one and go over each category a little. Some important things to remember are:

  • Distributed Systems are complex

  • They are chosen by necessity of scale and price

  • They are harder to work with

  • CAP Theorem — Consistency/Availability trade-off

  • They have 6 categories — data stores, computing, file systems, messaging systems, ledgers, applications

To be frank, we have barely touched the surface on distributed systems. I did not have the chance to thoroughly tackle and explain the field’s core problems, such as consensus and fault tolerance, in depth.

Caution

Let me leave you with a parting forewarning:

You must stay away from distributed systems as much as you can. The complexity overhead they incur with themselves is not worth the effort if you can avoid the problem by either solving it in a different way or with some other out-of-the-box solution.

[1] 25–27 June 2007 — a proposed solution in which each ‘coin’ can expire and is assigned a witness (validator) to it being spent.

Bit Gold, December 2005 — A high-level overview of a protocol extremely similar to Bitcoin’s. It is said this is the precursor to Bitcoin.

Further Distributed Systems Reading:

Designing Data-Intensive Applications, Martin Kleppmann — A great book that goes over everything in distributed systems and more.

Cloud Computing Specialization, University of Illinois, Coursera — A long series of courses (6) going over distributed system concepts and applications.

Jepsen — Blog explaining a lot of distributed technologies (ElasticSearch, Redis, MongoDB, etc.)

Thanks for taking the time to read through this long (~5600 words) article!

If, by any chance, you found this informative or thought it provided you with value, please make sure to give it as many claps as you believe it deserves and consider sharing with a friend who could use an introduction to this wonderful field of study.

~Stanislav Kozlovski

Update

I currently work at Confluent. Confluent is a Big Data company founded by the creators of Apache Kafka themselves! I am immensely grateful for the opportunity they have given me — I currently work on Kafka itself, which is beyond awesome! We at Confluent help shape the whole open-source Kafka ecosystem, including a new managed Kafka-as-a-service cloud offering.

We are hiring for a lot of positions (especially SRE/Software Engineers) in Europe and the USA! If you are interested in working on Kafka itself, looking for new opportunities or just plain curious, make sure to message me and I will share all the great perks that come from working in a Bay Area company.
