Category Archives: Spark Streaming

DStream

DStream stands for discretized stream, the abstraction Spark Streaming uses to represent a continuous stream of data. Internally, a DStream is a sequence of RDDs. Each RDD in a DStream contains the data from one time interval. Any operation applied to a DStream translates to operations on the underlying RDDs.
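To make the "sequence of RDDs" idea concrete, here is a minimal sketch in plain Python. It is not the Spark API: each inner list stands in for one RDD (one interval's data), and `dstream_map` is a hypothetical helper that shows how a single stream-level operation becomes one operation per batch. The sample values are made up.

```python
from typing import Callable, List

# One Batch stands in for one RDD: the data received during one interval.
Batch = List[int]


def dstream_map(batches: List[Batch], fn: Callable[[int], int]) -> List[Batch]:
    """Apply fn to every element, one batch at a time.

    Mirrors how a DStream operation is carried out on each underlying RDD.
    """
    return [[fn(x) for x in batch] for batch in batches]


# Three intervals' worth of data (hypothetical values).
stream = [[1, 2], [3], [4, 5, 6]]
doubled = dstream_map(stream, lambda x: x * 2)
print(doubled)
```

The batch boundaries are preserved in the result, which is the point: the transformation never sees the stream as a whole, only one interval at a time.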

Window Operations

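Window operations let you apply transformations over a sliding window of data. They take two parameters: windowLength, the duration of the window, and slideInterval, the interval at which the window operation is performed. Both must be multiples of the batch interval of the source DStream.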
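The sliding behaviour can be sketched in plain Python as well. This is only a model under simplifying assumptions, not Spark's implementation: `windowed` is a hypothetical helper, and `window_length` and `slide_interval` are counted in batches rather than in durations. Early windows that have fewer than `window_length` batches available simply contain what has arrived so far.

```python
from typing import List

Batch = List[int]  # stands in for one RDD (one batch interval of data)


def windowed(batches: List[Batch], window_length: int, slide_interval: int) -> List[Batch]:
    """Emit one merged window every slide_interval batches.

    Each window holds the data of the last window_length batches
    (or fewer, if the stream has not produced that many yet).
    """
    windows = []
    for end in range(slide_interval, len(batches) + 1, slide_interval):
        start = max(0, end - window_length)
        # Merge the batches that fall inside the current window.
        merged = [x for batch in batches[start:end] for x in batch]
        windows.append(merged)
    return windows


# Six batch intervals, one element each (hypothetical values).
stream = [[1], [2], [3], [4], [5], [6]]
windows = windowed(stream, window_length=3, slide_interval=2)
print(windows)
```

With a window of 3 batches sliding by 2, consecutive windows overlap by one batch, so some elements are processed in two windows.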