DSD connecting to memory, file, or database

DSD_Memory provides a streaming interface to a matrix or a dataframe held in memory. One useful application of DSD_Memory is replaying previously recorded stream data.

Let us look at a small R example:

> random.stream <- DSD_Gaussians(k = 2, d = 4, mu = rbind(c(200, 77, 20, 750), c(196, 80, 16, 790)))
> data.un <- get_points(random.stream, n = 2000)
> head(data.un)
        X1       X2       X3       X4
1 199.9456 77.09095 19.98670 750.1129
2 195.9696 80.10380 16.01115 789.9727
3 196.0109 79.95394 16.08678 790.1042
4 199.9882 77.03385 20.00825 750.0069
5 200.0485 76.84687 19.94311 750.0130
6 200.0462 76.94537 20.00657 750.0701
>

We used DSD_Gaussians to generate random four-dimensional records from two clusters, and stored 2,000 of them in a dataframe called data.un.

Let us create a DSD_Memory object:

> replayer <- DSD_Memory(data.un, k = NA)
> replayer
Memory Stream Interface
Class: DSD_Memory, DSD_R, DSD_data.frame, DSD
With NA clusters in 4 dimensions
Contains 2000 data points - currently at position 1 - loop is FALSE
>

We have created a memory stream interface to our dataframe, data.un. We can now replay this dataframe as we like:

> get_points(replayer, n=5)
        X1       X2       X3       X4
1 199.9456 77.09095 19.98670 750.1129
2 195.9696 80.10380 16.01115 789.9727
3 196.0109 79.95394 16.08678 790.1042
4 199.9882 77.03385 20.00825 750.0069
5 200.0485 76.84687 19.94311 750.0130
> replayer
Memory Stream Interface
Class: DSD_Memory, DSD_R, DSD_data.frame, DSD
With NA clusters in 4 dimensions
Contains 2000 data points - currently at position 6 - loop is FALSE
>

Using get_points, we extracted five points. The interface has advanced to position 6, which is the next point to be read.

Let us go ahead and rewind it:

> reset_stream(replayer, pos = 2)
> replayer
Memory Stream Interface
Class: DSD_Memory, DSD_R, DSD_data.frame, DSD
With NA clusters in 4 dimensions
Contains 2000 data points - currently at position 2 - loop is FALSE

The stream is now back at position 2, so the next call to get_points will start from the second record.
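
The printed summary reports "loop is FALSE", which suggests the stream can also be asked to start over automatically instead of being rewound by hand. The following is a minimal sketch of that idea, assuming the stream package is installed and that DSD_Memory accepts a loop argument as the summary implies; old.data and looping are illustrative names that do not appear in the example above.

```r
library(stream)

# Small in-memory dataframe standing in for previously recorded stream data
old.data <- data.frame(X1 = c(1, 2, 3), X2 = c(10, 20, 30))

# loop = TRUE asks the stream to start over once the stored points run out
# (assumption: this argument behaves as the "loop is FALSE" summary suggests)
looping <- DSD_Memory(old.data, k = NA, loop = TRUE)

# Request more points than are stored; the stream should wrap around
pts <- get_points(looping, n = 5)
pts
```

With loop left at FALSE, the same request would run past the end of the three stored points, which is why reset_stream is needed in that case.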

DSD_ReadCSV reads a CSV file line by line, which helps when a very large file cannot be loaded into memory in its entirety.
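
As a rough sketch of how this might look, the following writes a small temporary CSV file as a stand-in for a large one and reads it back in batches. It assumes the stream package is installed; the file contents and the names csv.file, csv.stream, first.batch, and second.batch are made up for illustration.

```r
library(stream)

# Write a small CSV file to a temporary location (stand-in for a large file)
csv.file <- tempfile(fileext = ".csv")
write.csv(data.frame(X1 = 1:10, X2 = (1:10) * 2), csv.file,
          row.names = FALSE, quote = FALSE)

# Open the file as a stream; header = TRUE because the first line holds names
csv.stream <- DSD_ReadCSV(csv.file, header = TRUE)

# Pull the points in small batches instead of loading the whole file
first.batch  <- get_points(csv.stream, n = 4)
second.batch <- get_points(csv.stream, n = 4)

# Close the underlying file connection and remove the temporary file
close_stream(csv.stream)
unlink(csv.file)
```

Each call to get_points continues from where the previous one stopped, so only the requested batch needs to be held in memory at any time.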

DSD_ReadDB can be used with an open query to a database, that is, a result set that has been sent to the database but not yet fully fetched.
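
A minimal sketch of the database case follows, assuming the stream, DBI, and RSQLite packages are installed. An in-memory SQLite database stands in for a real database server, and the table name readings and the variable names are hypothetical.

```r
library(stream)
library(DBI)

# In-memory SQLite database as a stand-in for a real database server
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "readings", data.frame(X1 = 1:10, X2 = (1:10) * 2))

# Send the query without fetching; the open result set feeds the stream
res <- dbSendQuery(con, "SELECT X1, X2 FROM readings")
db.stream <- DSD_ReadDB(res, k = NA)

# Each call fetches the next n rows from the open result set
db.batch <- get_points(db.stream, n = 3)

# Release the result set and the connection when done
dbClearResult(res)
dbDisconnect(con)
```

The query result stays on the database side until get_points asks for more rows, so the full table never has to be loaded at once.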
