Day 2: Advanced Usage, Distribution

Day 1 introduced us to Redis as a data structure server. Today, we’ll build on that foundation by looking at some of the advanced functions provided by Redis, such as pipelining, the publish-subscribe model, system configuration, and replication. Beyond that, we’ll look at how to create a Redis cluster, store a lot of data quickly, and use an advanced technique involving Bloom filters.

A Simple Interface

At 80,000 lines of source code, Redis is a fairly small project compared to most databases. But beyond code size, it has a simple interface that accepts the very strings we have been writing in the console.

Telnet

We can interact without the command-line interface by streaming commands through TCP on our own via telnet, terminating each command with a carriage return and line feed (CRLF, or \r\n). Press Ctrl+] at any time to exit.

 $ telnet localhost 6379
 Trying 127.0.0.1...
 Connected to localhost.
 Escape character is '^]'.
 SET test hello
 +OK
 GET test
 $5
 hello
 SADD stest 1 99
 :2
 SMEMBERS stest
 *2
 $1
 1
 $2
 99

Here we see four Redis commands as inputs (you should be able to identify them quickly) and their corresponding outputs. We can see that our input is the same as we provided in the Redis console, but the console has cleaned up the responses a bit. To give a few examples:

  • Redis streams the OK status prefixed by a + sign.

  • Before it returned the string hello, it sent $5, which means “the following string is five characters.”

  • After we add two set items to the stest key, the number 2 is prefixed by : to represent an integer (two values were added successfully).

Finally, when we requested two items, the first line returned begins with an asterisk and the number 2—meaning there are two complex values about to be returned. The next two lines are just like the hello string but contain the string 1, followed by the string 99.
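That same plain-text protocol can be spoken from any language that can open a TCP socket. Here's a minimal Ruby sketch (not from the book's examples) that mirrors the telnet session above:

 require 'socket'

 # speak the raw Redis protocol over a plain TCP socket
 sock = TCPSocket.new('127.0.0.1', 6379)
 sock.write("SET test hello\r\n")   # commands end with CRLF
 puts sock.gets                     # => +OK
 sock.write("GET test\r\n")
 puts sock.gets                     # => $5 (length header)
 puts sock.gets                     # => hello
 sock.close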

Pipelining

We can also stream our own strings one at a time using the BSD netcat (nc) command, which is already installed on many Unix machines. With netcat, we must explicitly end a line with CRLF (telnet did this for us implicitly). We also sleep for a second after sending the ECHO command to give the Redis server time to respond before the connection closes. Some nc implementations have a -q option that makes the sleep unnecessary, but not all do, so feel free to try it.

 $ (echo -en "ECHO hello\r\n"; sleep 1) | nc localhost 6379
 $5
 hello

We can take advantage of this level of control by pipelining our commands, or streaming multiple commands in a single request.

 $ (echo -en "PING\r\nPING\r\nPING\r\n"; sleep 1) | nc localhost 6379
 +PONG
 +PONG
 +PONG

This can be far more efficient than pushing a single command at a time and is always worth considering where it makes sense, especially within transactions. Just be sure to end every command with \r\n, which is a required delimiter for the server.

Publish-Subscribe

On Day 1, we were able to implement a rudimentary blocking queue using the list datatype: we queued data that could be read by a blocking pop command. With that queue, we built a very basic publish-subscribe model. Any number of messages could be pushed onto the queue, and a single queue reader would pop messages as they became available. This is powerful but limited. Under many circumstances we want a slightly inverted behavior, where several subscribers read the announcements of a single publisher that sends each message to all of them, as shown in the following figure. Redis provides specialized publish-subscribe (or pub-sub) commands for this.

[Figure: Redis pub-sub: a single publisher sending each message to multiple subscribers]

Let’s improve on the commenting mechanism we made before using blocking lists, by allowing a user to post a comment to multiple subscribers (as opposed to just one). We start with a subscriber that listens on a key for messages (the key will act as a channel in pub-sub nomenclature). This will cause the CLI to output Reading messages... and then block while the subscriber listens for incoming messages.

 redis 127.0.0.1:6379> SUBSCRIBE comments
 Reading messages... (press Ctrl-C to quit)
 1) "subscribe"
 2) "comments"
 3) (integer) 1

Next, open a second console, start another redis-cli session, and SUBSCRIBE it to comments as well. With two subscribers listening, we can publish any string we want as a message to the comments channel. The PUBLISH command will return the integer 2, meaning two subscribers received it.

 redis 127.0.0.1:6379> PUBLISH comments "Check out this shortcoded site! 7wks"
 (integer) 2

Both of the subscribers will receive a multibulk reply (a list) of three items: the string “message,” the channel name, and the published message value.

 1) "message"
 2) "comments"
 3) "Check out this shortcoded site! 7wks"

When your clients no longer want to receive correspondence, they can execute the UNSUBSCRIBE comments command to disconnect from the comments channel or simply UNSUBSCRIBE alone to disconnect from all channels. However, note that in the redis-cli console you will have to press Ctrl+C to break the connection.
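The same pattern works from client libraries. Here's a minimal subscriber sketch using the redis Ruby gem (which we install later in this chapter for the data dump); run it in its own terminal and PUBLISH to comments from redis-cli as before:

 require 'redis'

 # subscribe blocks this connection until we unsubscribe
 redis = Redis.new(:host => "127.0.0.1", :port => 6379)

 redis.subscribe("comments") do |on|
   on.message do |channel, message|
     puts "#{channel}: #{message}"
     redis.unsubscribe if message == "exit"
   end
 end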

Server Info

Before getting into changing Redis’s system settings, it’s worth taking a quick look at the INFO command because changing settings values will alter some of these values as well. INFO outputs a list of server data, including version, process ID, memory used, and uptime.

 redis 127.0.0.1:6379> INFO
 # Server
 redis_version:3.2.8
 redis_git_sha1:00000000
 redis_git_dirty:0
 redis_build_id:b533f811ec736a0c
 redis_mode:standalone
 ...

You may want to revisit this command later in this chapter because it provides a useful snapshot of the server's global information and settings, including durability, memory fragmentation, and replication status.
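If you want to get at these values programmatically, most clients expose INFO directly; with the Ruby redis gem (a quick sketch), the reply comes back as a hash:

 require 'redis'

 redis = Redis.new(:host => "127.0.0.1", :port => 6379)
 info = redis.info                  # INFO output as a Ruby hash
 puts info["redis_version"]
 puts info["used_memory_human"]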

Redis Configuration

So far, we’ve only used Redis with its out-of-the-box configuration. But much of Redis’s power comes from its configurability, allowing you to tailor settings to your use case. The redis.conf file that comes with the distribution—found in /etc/redis on *nix systems or /usr/local/etc on Mac OS—is fairly self-explanatory, so we’re going to cover only a portion of the file. We’ll go through a few of the common settings in order.

 daemonize no
 port 6379
 loglevel verbose
 logfile stdout
 databases 16

By default, daemonize is set to no, which is why the server always starts up in the foreground. This is nice for testing but not very production friendly. Changing this value to yes will run the server in the background while writing the server's process ID to a pid file.

The next line sets the port number for this server, 6379 by default. Changing it can be especially useful when running multiple Redis servers on a single machine.

loglevel defaults to verbose, but it’s good to set it to notice or warning in production to cut down on the number of log events. logfile outputs to stdout (standard output, the console), but a filename is necessary if you run in daemonize mode.

databases sets the number of Redis databases available. We saw on Day 1 how to switch between databases with the SELECT command. If you plan to only ever use a single database namespace, it's not a bad idea to set this to 1 to prevent unwanted databases from being accidentally created.
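Many of these settings can also be inspected and adjusted on a live server with the CONFIG GET and CONFIG SET commands, though not every directive can be changed at runtime. A quick sketch:

 redis 127.0.0.1:6379> CONFIG GET loglevel
 1) "loglevel"
 2) "verbose"
 redis 127.0.0.1:6379> CONFIG SET loglevel notice
 OK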

Durability

Redis has a few persistence options. First is no persistence at all, which will simply keep all values in main memory. If you’re running a basic caching server, this is a reasonable choice since durability always increases latency.

One of the things that sets Redis apart from other fast-access caches like memcached[58] is its built-in support for storing values to disk. By default, key-value pairs are only occasionally saved. You can run the LASTSAVE command to get a Unix timestamp of the last time a Redis disk write succeeded, or you can read the last_save_time field from the server INFO output.
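For example, checking the timestamp from the CLI looks like this (the reply is a Unix timestamp, so your value will differ; this one is illustrative):

 redis 127.0.0.1:6379> LASTSAVE
 (integer) 1477748597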

You can force durability by executing the SAVE command (or BGSAVE, to asynchronously save in the background).

 redis 127.0.0.1:6379> SAVE

If you read the redis-server log, you will see lines similar to this:

 [46421] 10 Oct 19:11:50 * Background saving started by pid 52123
 [52123] 10 Oct 19:11:50 * DB saved on disk
 [46421] 10 Oct 19:11:50 * Background saving terminated with success

Another durability method is to alter the snapshotting settings in the configuration file.

Snapshotting

We can alter the rate of storage to disk by adding, removing, or altering one of the save fields. By default, there are three, prefixed by the save keyword followed by a time in seconds and a minimum number of keys that must change before a write to disk occurs. For example, to trigger a save every 5 minutes (300 seconds) if any keys change at all, you would write the following:

 save 300 1

The configuration ships with a good set of defaults. Together they mean: if 10,000 keys change, save within 60 seconds; if 10 keys change, save within 300 seconds; and if even a single key changes, save within 900 seconds (15 minutes).

 save 900 1
 save 300 10
 save 60 10000

You can add as many or as few save lines as necessary to specify precise thresholds.

Append-Only File

Redis is eventually durable by default, in that it asynchronously writes values to disk in intervals defined by our save settings, or it is forced to write by client-initiated commands. This is acceptable for a second-level cache or session server but is insufficient for storing data that you need to be durable, like financial data. If a Redis server crashes, our users might not appreciate having lost money.

Redis provides an append-only file (appendonly.aof) that keeps a record of all write commands. This is like the write-ahead logging we saw in Chapter 3, HBase. If the server crashes before a value is saved, it replays those commands on startup, restoring its state. The append-only file must be enabled by setting appendonly to yes in the redis.conf file.

 appendonly yes

Then we must decide how often commands are synced to the file. Setting always is the most durable, because every command is saved as it happens. It's also slow, which often negates the reason people have for using Redis. By default, everysec is enabled, which batches up commands and writes them once per second. This is a decent trade-off because it's fast enough, and in the worst case you'll lose only the last second of data. Finally, no is an option, which just lets the OS handle flushing. It can be fairly infrequent, and you're often better off skipping the append-only file altogether rather than choosing it.

 # appendfsync always
 appendfsync everysec
 # appendfsync no

Append-only has more detailed parameters, which may be worth reading about in the config file when you need to respond to specific production issues.

Security

Although Redis is not natively built to be a fully secure server, you may run across the requirepass setting and AUTH command in the Redis documentation. These can be safely ignored because they are merely a scheme for setting a plaintext password. Because a client can try nearly 100,000 passwords a second, it’s almost a moot point, beyond the fact that plaintext passwords are inherently unsafe anyway. If you want Redis security, you’re better off with a good firewall and SSH security.

Interestingly, Redis provides command-level security through obscurity, by allowing you to hide or suppress commands. Adding this to your config will rename the FLUSHALL command (remove all keys from the system) into some hard-to-guess value like c283d93ac9528f986023793b411e4ba2:

 rename-command FLUSHALL c283d93ac9528f986023793b411e4ba2

If you attempt to execute FLUSHALL against this server, you’ll be hit with an error. The secret command works instead.

 redis 127.0.0.1:6379> FLUSHALL
 (error) ERR unknown command 'FLUSHALL'
 redis 127.0.0.1:6379> c283d93ac9528f986023793b411e4ba2
 OK

Or better yet, we can disable the command entirely by setting it to a blank string in the configuration.

 rename-command FLUSHALL ""

You can set any number of commands to a blank string, allowing you a modicum of customization over your command environment.

Tweaking Parameters

There are several more advanced settings for things such as slow query logging, encoding details, latency tweaks, and importing external config files. Keep in mind, though, if you run across documentation about Redis virtual memory, simply ignore it; that feature was removed in version 2.6.

To aid in testing your server configuration, Redis provides an excellent benchmarking tool. It connects locally to port 6379 by default and issues 10,000 requests using 50 parallel clients. We can execute 100,000 requests with the -n argument.

 $ redis-benchmark -n 100000
 ====== PING (inline) ======
  100000 requests completed in 0.89 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1
 99.91% <= 1 milliseconds
 100.00% <= 1 milliseconds
 32808.40 requests per second
 ...

Other commands are tested as well, such as SADD and LRANGE, with the more complex commands generally taking more time.
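If your build of redis-benchmark supports them (check redis-benchmark --help for your version), the -t flag restricts the run to specific commands and -P issues pipelined requests, which ties in nicely with the pipelining we tried earlier. A sketch:

 $ redis-benchmark -t set,get -n 100000 -P 10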

Master-Slave Replication

As with other NoSQL databases we’ve seen (such as MongoDB and Neo4j), Redis supports master-slave replication. One server is the master by default if you don’t set it as a slave of anything. Data will be replicated to any number of slave servers.

Making slave servers is easy. We first need a copy of our redis.conf file.

 $ cp redis.conf redis-s1.conf

The file will remain largely the same but with the following changes:

 port 6380
 slaveof 127.0.0.1 6379

If all went according to plan, you should see something similar to the following in the slave server’s log when you start it:

 $ redis-server redis-s1.conf
 34648:S 28 Apr 06:42:22.496 * Connecting to MASTER 127.0.0.1:6379
 34648:S 28 Apr 06:42:22.496 * MASTER <-> SLAVE sync started
 34648:S 28 Apr 06:42:22.497 * Non blocking connect for SYNC fired the event.
 34648:S 28 Apr 06:42:22.497 * Master replied to PING, replication can...
 34648:S 28 Apr 06:42:22.497 * Partial resynchronization not possible...
 34648:S 28 Apr 06:42:22.497 * Full resync from master: 4829...1f88a68bc:1
 34648:S 28 Apr 06:42:22.549 * MASTER <-> SLAVE sync: receiving 76 bytes...
 34648:S 28 Apr 06:42:22.549 * MASTER <-> SLAVE sync: Flushing old data
 34648:S 28 Apr 06:42:22.549 * MASTER <-> SLAVE sync: Loading DB in memory
 34648:S 28 Apr 06:42:22.549 * MASTER <-> SLAVE sync: Finished with success

And you should see the string 1 slaves output in the master log. To confirm that replication is working, let's add something to the master server.

 redis 127.0.0.1:6379> SADD meetings "StarTrek Pastry Chefs" "LARPers Intl."

If we connect a command-line client to our slave, we should receive our meetings list.

 redis 127.0.0.1:6380> SMEMBERS meetings
 1) "StarTrek Pastry Chefs"
 2) "LARPers Intl."

In production, you’ll generally want to implement replication for availability or backup purposes and thus have Redis slaves on different machines.
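Incidentally, you don't have to edit a config file to create a slave: a running server can be pointed at a master (and promoted back) at runtime with the SLAVEOF command. A quick sketch against our second server:

 redis 127.0.0.1:6380> SLAVEOF 127.0.0.1 6379
 OK
 redis 127.0.0.1:6380> SLAVEOF NO ONE
 OK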

Data Dump

So far, we’ve talked a lot about how fast Redis is, but it’s hard to get a feel for it without playing with a bit more data.

Let’s insert a large dummy dataset into our Redis server. You can keep the slave running if you like, but a laptop or desktop might run quicker if you’re running just a single master server. We’re going to autogenerate a list of keys and values of arbitrary size, where the keys will be key1, key2, and so on, while the values will be value1, and so on.

You’ll first need to install the redis Ruby gem.

 $ gem install redis

There are several ways to go about inserting a large dataset, and they get progressively faster but more complex.

The simplest method is to iterate through a list of data and execute SET for each value using the standard redis-rb client. In our case, we don't really care what the data looks like, as we just want to look at performance, so we'll insert our generated key-value pairs one at a time.

 require 'redis'
 #%w{hiredis redis/connection/hiredis}.each{|r| require r}

 # the number of set operations to perform will be defined as a CLI arg
 TOTAL_NUMBER_OF_ENTRIES = ARGV[0].to_i

 $redis = Redis.new(:host => "127.0.0.1", :port => 6379)
 $redis.flushall
 count, start = 0, Time.now

 (1..TOTAL_NUMBER_OF_ENTRIES).each do |n|
   count += 1
   key = "key#{n}"
   value = "value#{n}"

   $redis.set(key, value)

   # stop iterating when we reach the specified number
   break if count >= TOTAL_NUMBER_OF_ENTRIES
 end
 puts "#{count} items in #{Time.now - start} seconds"

Run the script, specifying the number of SET operations. Feel free to experiment with lower or higher numbers. Let’s start with 100,000.

 $ ruby data_dump.rb 100000
 100000 items in 5.851211 seconds

If you want to speed up insertion—and are not running JRuby—you can optionally install the hiredis gem. It’s a C driver that is considerably faster than the native Ruby driver. Just uncomment the %w{hiredis redis/connection/hiredis}.each{|r| require r} statement at the top in order to load the driver and then re-run the script. You may not see a large improvement for this type of CPU-bound operation, but we highly recommend hiredis for production Ruby use.

You will, however, see a big improvement with pipelined operations. Here we batch 1,000 lines at a time and pipeline their insertion, which can cut the runtime by a factor of five or more.

 require 'redis'
 #%w{hiredis redis/connection/hiredis}.each{|r| require r}

 TOTAL_NUMBER_OF_ENTRIES = ARGV[0].to_i
 BATCH_SIZE = 1000

 # perform a single pipelined update for each batch of numbers
 def flush(batch)
   $redis.pipelined do
     batch.each do |n|
       key, value = "key#{n}", "value#{n}"
       $redis.set(key, value)
     end
   end
   batch.clear
 end

 $redis = Redis.new(:host => "127.0.0.1", :port => 6379)
 $redis.flushall

 batch = []
 count, start = 0, Time.now
 (1..TOTAL_NUMBER_OF_ENTRIES).each do |n|
   count += 1

   # push integers into an array
   batch << n

   # watch this number fluctuate between 1 and 1000
   puts "Batch size: #{batch.length}"

   # if the array grows to BATCH_SIZE, flush it
   if batch.size == BATCH_SIZE
     flush(batch)
   end

   break if count >= TOTAL_NUMBER_OF_ENTRIES
 end
 # flush any remaining values
 flush(batch)

 puts "#{count} items in #{Time.now - start} seconds"

Running the pipelined version shows the improvement:

 $ ruby data_dump_pipelined.rb 100000
 100000 items in 1.061089 seconds

Pipelining reduces the number of network round trips required, but building up each batch has some overhead of its own, so you should experiment with different batch sizes when pipelining in production. For now, try increasing the number of items and re-running the script with the hiredis gem loaded for an even more dramatic performance increase.

Redis Cluster

Beyond simple replication, many Redis clients provide an interface for building a simple ad hoc distributed Redis cluster. The Ruby client supports a consistent-hashing managed cluster.

To get started building out a managed cluster, we need another server. Unlike the master-slave setup, both of our servers will take the master (default) configuration. Copy the redis.conf file, change the copy's port to 6380, and start both servers. That's all that's required on the server side.

 require 'redis'
 require 'redis/distributed'

 TOTAL_NUMBER_OF_ENTRIES = ARGV[0].to_i

 $redis = Redis::Distributed.new([
   "redis://localhost:6379/",
   "redis://localhost:6380/"
 ])
 $redis.flushall
 count, start = 0, Time.now

 (1..TOTAL_NUMBER_OF_ENTRIES).each do |n|
   count += 1

   key = "key#{n}"
   value = "value#{n}"

   $redis.set(key, value)

   break if count >= TOTAL_NUMBER_OF_ENTRIES
 end
 puts "#{count} items in #{Time.now - start} seconds"

Bridging between two or more servers requires only some minor changes to our existing data dump client. First, we need to require the redis/distributed file from the redis gem.

 require 'redis/distributed'

Then replace the Redis client with Redis::Distributed and pass in an array of server URIs. Each URI requires the redis scheme, server (localhost in our case), and port.

 $redis = Redis::Distributed.new([
   "redis://localhost:6379/",
   "redis://localhost:6380/"
 ])

Running the client is the same as before.

 $ ruby data_dump_cluster.rb 100000
 100000 items in 6.614907 seconds

We do see a performance decrease here because a lot more work is being done by the client, since it handles computing which keys are stored on which servers. You can validate that keys are stored on separate servers by attempting to retrieve the same key from each server through the CLI.

 $ redis-cli -p 6379 --raw GET key537
 $ redis-cli -p 6380 --raw GET key537

Only one of the two servers will hold key537 and return value537; the other will return nothing. But as long as you retrieve keys through the same Redis::Distributed configuration, the client will fetch each value from the correct server.
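For instance, a quick sketch (assuming the same two local servers and the keys from the run above) shows the distributed client resolving a key no matter which server physically holds it:

 require 'redis'
 require 'redis/distributed'

 # same server list as data_dump_cluster.rb
 redis = Redis::Distributed.new([
   "redis://localhost:6379/",
   "redis://localhost:6380/"
 ])

 # the client hashes the key to the right server for us
 puts redis.get("key537")   # => "value537"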

Bloom Filters

A good way to improve the performance of just about any data retrieval system is to simply never perform queries that you know are doomed to fail and find no data. If you know that you dropped your car keys in your house, it’s senseless to scour your neighbor’s house in search of them. You should start your search operation at home, maybe starting with the couch cushions. What matters is that you can safely exclude some search avenues that you know will be fruitless.

A Bloom filter enables you to do something similar with database queries. These filters are probabilistic data structures that check for the nonexistence of an item in a set, first covered in Compression and Bloom Filters. Although Bloom filters can return false positives, they cannot return a false negative. This is very useful when you need to quickly discover whether a value does not exist in a system. If only they made Bloom filters for lost car keys!

Bloom filters succeed at discovering nonexistence by converting a value to a very sparse sequence of bits and comparing that to a union of every value’s bits. In other words, when a new value is added, it is OR’d against the current Bloom filter bit sequence. When you want to check whether the value is already in the system, you perform an AND against the Bloom filter’s sequence. If the value has any true bits that aren’t also true in the Bloom filter’s corresponding buckets, then the value was never added. In other words, this value is definitely not in the Bloom filter. The following figure provides a graphic representation of this concept.

[Figure: A Bloom filter ORs each new value's bits into the filter and ANDs a lookup's bits against it]

To get started writing our own Bloom filter, we need to install a new gem:

 $ gem install bloomfilter-rb

Ruby wunderkind Ilya Grigorik created this Redis-backed Bloom filter, but the concepts are transferable to any language. Let’s have a look at a script that looks a bit like our previous data dump script but with a few key differences.

For our example here, we’ll download a text file containing the entire text of Moby Dick from Project Gutenberg[59] and assemble a list of all words in the text (including a lot of repeat words such as “the” and “a”). Then, we’ll loop through each word, check if it’s already in our Bloom filter, and insert it into the filter if it isn’t there already.

 require 'bloomfilter-rb'

 bloomfilter = BloomFilter::Redis.new(:size => 1000000)
 bloomfilter.clear

 # we'll read the file data and strip out all the non-word material
 text_data = File.read(ARGV[0])
 clean_text = text_data.gsub(/\n/, ' ').gsub(/[,-.;'?"()!*]/, '')

 clean_text.split(' ').each do |word|
   word = word.downcase

   next if bloomfilter.include?(word)
   puts word
   bloomfilter.insert(word)
 end

 puts "Total number of words: #{clean_text.split(' ').length}"
 puts "Number of words in filter: #{bloomfilter.size}"

Let’s download the text using cURL and run the script:

 $ curl -o moby-dick.txt https://www.gutenberg.org/files/2701/old/moby10b.txt
 $ ruby bloom_filter.rb moby-dick.txt > output.txt

Open up output.txt and scroll through the contents. The file lists each word at the moment it was first encountered, before it had been added to the filter. At the top of the list, you'll find a lot of common words like the, a, and but. At the bottom of the list, you'll see the word "orphan," the very last word in the Epilogue, which explains why it hadn't been added to the filter until the very end! Some other fairly esoteric words toward the end include "ixion," "sheathed," "dirgelike," and "intermixingly."

What essentially happened here is that frequently used words showed up (and were added to the filter) early, whereas less common words, or words used only once, appeared much later. The upside of this approach is the ability to detect duplicate words. The downside is that a few false positives will seep through: the Bloom filter may flag a word as already seen even though we have never encountered it before. This is why, in a real-world use case, you would perform a secondary check for values the filter claims to contain, such as a slower query against a system of record; presuming a large enough filter size, which is computable,[60] that should happen only a small percentage of the time.
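The standard estimate is worth keeping handy: for a filter with m bits, k hash functions, and n inserted items, the false positive rate is roughly (1 - e^(-kn/m))^k. A quick sketch in Ruby (the hash count and word count here are illustrative assumptions, not bloomfilter-rb's actual defaults):

 # rough false-positive estimate for a Bloom filter
 def false_positive_rate(m, k, n)
   (1 - Math.exp(-k.to_f * n / m)) ** k
 end

 # 1,000,000 bits, 3 hashes, ~20,000 distinct words (illustrative numbers)
 puts false_positive_rate(1_000_000, 3, 20_000)   # => about 0.0002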

SETBIT and GETBIT

As mentioned earlier, Bloom filters function by flipping certain bits in a sparse binary field. The Redis Bloom filter implementation we just used takes advantage of two Redis commands that perform just such actions: SETBIT and GETBIT.

Like all Redis commands, SETBIT is fairly descriptive. The command sets a single bit (either 1 or 0) at a certain location in a bit sequence, starting from zero. It’s a common use case for high-performance multivariate flagging—it’s faster to flip a few bits than write a set of descriptive strings.

If we want to keep track of the toppings on a hamburger, we can assign each type of topping to a bit position, such as ketchup = 0, mustard = 1, onion = 2, lettuce = 3. So, a hamburger with only mustard and onion could be represented as 0110 and set in the command line:

 redis 127.0.0.1:6379> SETBIT my_burger 1 1
 (integer) 0
 redis 127.0.0.1:6379> SETBIT my_burger 2 1
 (integer) 0

Later, a process can check whether my burger should have lettuce or mustard. If zero is returned, the answer is false—one if true.

 redis 127.0.0.1:6379> GETBIT my_burger 3
 (integer) 0
 redis 127.0.0.1:6379> GETBIT my_burger 1
 (integer) 1

The Bloom filter implementation takes advantage of this behavior by hashing a value as a multibit value. It calls SETBIT X 1 for each on position in an insert (where X is the bit position) and verifies existence by calling GETBIT X on the include? method—returning false if any GETBIT position returns 0.
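To make that concrete, here is a toy sketch of the same idea using redis-rb's setbit and getbit directly. This is an illustration, not bloomfilter-rb's actual code; the CRC-seeded hash functions and key name are assumptions for demonstration only.

 require 'redis'
 require 'zlib'

 SIZE = 1_000_000
 $redis = Redis.new(:host => "127.0.0.1", :port => 6379)
 $redis.del("toy_filter")

 # derive a few bit positions from a value (illustrative hash functions)
 def positions(value)
   (0..2).map { |seed| Zlib.crc32(value, seed) % SIZE }
 end

 # insert: turn on every bit position for the value
 def insert(value)
   positions(value).each { |pos| $redis.setbit("toy_filter", pos, 1) }
 end

 # lookup: the value may be present only if every bit is already on
 def present?(value)
   positions(value).all? { |pos| $redis.getbit("toy_filter", pos) == 1 }
 end

 insert("whale")
 puts present?("whale")   # => true
 puts present?("ahab")    # => false (almost certainly)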

Bloom filters are excellent for reducing unnecessary traffic to a slower underlying system, be it a slower database, limited resource, or network request. If you have a slower database of IP addresses and you want to track all new users to your site, you can use a Bloom filter to first check whether the IP address exists in your system. If the Bloom filter returns false, you know the IP address has yet to be added and can respond accordingly. If the Bloom filter returns true, this IP address may or may not exist on the back end and requires a secondary lookup to be sure. This is why computing the correct size is important—a well-sized Bloom filter can reduce (but not eliminate) the error rate or the likelihood of a false positive.

Day 2 Wrap-Up

Today we rounded out our Redis investigation by moving beyond simple operations into squeezing every last bit of speed out of an already very fast system. Redis provides for fast and flexible data structure storage and simple manipulations as we saw in Day 1, but it’s equally adept at more complex behaviors by way of built-in publish-subscribe functions and bit operations. It’s also highly configurable, with many durability and replication settings that conform to whatever your needs may be. Finally, Redis also supports some nice third-party enhancements, such as Bloom filters and clustering.

This concludes major operations for the Redis data structure store. Tomorrow we’re going to do something a bit different, by using Redis as the cornerstone of a polyglot persistence setup along with CouchDB and Neo4j.

Day 2 Homework

Find

  1. Find out what messaging patterns are, and discover how many Redis can implement.

  2. Read some documentation on Sentinel,[61] a system used to manage high-availability Redis clusters.

Do

  1. Run the data dump script with all snapshotting and the append-only file turned off. Then try running with appendfsync set to always, noting the speed difference.

  2. Using your favorite programming language’s web framework, try to build a simple URL-shortening service backed by Redis with an input box for the URL and a simple redirect based on the URL. Back it up with a Redis master-slave replicated cluster across multiple nodes as your back end.
