Package org.apache.heron.streamlet
Interface Streamlet<R>
-
- All Superinterfaces:
StreamletBase<R>
- All Known Subinterfaces:
KVStreamlet<K,V>
- All Known Implementing Classes:
CountByKeyAndWindowStreamlet, CountByKeyStreamlet, CustomStreamlet, FilterStreamlet, FlatMapStreamlet, GeneralReduceByKeyAndWindowStreamlet, GeneralReduceByKeyStreamlet, JoinStreamlet, KeyByStreamlet, KVStreamletShadow, MapStreamlet, ReduceByKeyAndWindowStreamlet, ReduceByKeyStreamlet, RemapStreamlet, SourceStreamlet, SplitStreamlet, SpoutStreamlet, StreamletImpl, StreamletShadow, SupplierStreamlet, TransformStreamlet, UnionStreamlet
@Evolving public interface Streamlet<R> extends StreamletBase<R>
A Streamlet is a (potentially unbounded) ordered collection of tuples. Streamlets originate from pub/sub systems (such as Pulsar or Kafka), from static data (such as CSV files or HDFS files), or from any other source. They are also created by transforming existing Streamlets with operations such as map and flatMap.

Besides the tuples, a Streamlet has the following properties associated with it:
a) name: a user-assigned or system-generated name used to refer to the streamlet.
b) nPartitions: the number of partitions that the streamlet is composed of. The ordering of the tuples in a Streamlet is therefore defined only among the tuples within a partition, which allows the system to distribute the partitions to different nodes across the cluster.

A number of transformations (map, flatMap, and so on) can be applied to Streamlets. Each transformation operates on every tuple of the Streamlet and produces a new Streamlet. One can think of a transformation as attaching itself to the stream and processing each tuple as it goes by. The parallelism of any operator is thus implicitly determined by the number of partitions of the stream it operates on. If a particular transformation needs to operate at a different parallelism, repartition the Streamlet before applying the transformation.
-
-
Method Summary
Modifier and Type, Method
Description

<T> Streamlet<T> applyOperator(IStreamletOperator<R,T> operator)
Returns a new Streamlet by applying the operator to each element of this streamlet.

<T> Streamlet<T> applyOperator(IStreamletOperator<R,T> operator, StreamGrouping grouper)
Returns a new Streamlet by applying the operator to each element of this streamlet.

List<Streamlet<R>> clone(int numClones)
Clones the current Streamlet.

StreamletBase<R> consume(SerializableConsumer<R> consumer)
Applies the consumer function to every element of the stream. This is a sink function: it does not produce a Streamlet for further transformations.

<K> KVStreamlet<K,Long> countByKey(SerializableFunction<R,K> keyExtractor)
Returns a new stream of (key, count) pairs by counting the tuples in this stream for each key.

<K> KVStreamlet<KeyedWindow<K>,Long> countByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg)
Returns a new stream of (keyed window, count) pairs by counting the tuples in this stream over a window for each key.

Streamlet<R> filter(SerializablePredicate<R> filterFn)
Returns a new Streamlet by applying filterFn to each element of this streamlet and including only those elements that satisfy filterFn.

<T> Streamlet<T> flatMap(SerializableFunction<R,? extends Iterable<? extends T>> flatMapFn)
Returns a new Streamlet by applying flatMapFn to each element of this Streamlet and flattening the result.

String getName()
Gets the name of the Streamlet.

int getNumPartitions()
Gets the number of partitions of this Streamlet.

String getStreamId()
Gets the stream id of this Streamlet.

<K,S,T> KVStreamlet<KeyedWindow<K>,T> join(Streamlet<S> other, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, JoinType joinType, SerializableBiFunction<R,S,? extends T> joinFunction)
Returns a new KVStreamlet by joining this streamlet with the 'other' streamlet.

<K,S,T> KVStreamlet<KeyedWindow<K>,T> join(Streamlet<S> other, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, SerializableBiFunction<R,S,? extends T> joinFunction)
Returns a new Streamlet by inner joining this streamlet with the 'other' streamlet.

<K> KVStreamlet<K,R> keyBy(SerializableFunction<R,K> keyExtractor)
Returns a new KVStreamlet<K,R> by applying the key extractor to each element of this Streamlet.

<K,V> KVStreamlet<K,V> keyBy(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,V> valueExtractor)
Returns a new KVStreamlet<K,V> by applying the key and value extractors to each element of this Streamlet.

StreamletBase<R> log()
Logs every element of the streamlet using the String.valueOf function. This is one of the sink functions: it does not produce a Streamlet for further transformations.

<T> Streamlet<T> map(SerializableFunction<R,? extends T> mapFn)
Returns a new Streamlet by applying mapFn to each element of this Streamlet.

<K,T> KVStreamlet<K,T> reduceByKey(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,T> valueExtractor, SerializableBinaryOperator<T> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet and applying reduceFn to those tuples.

<K,T> KVStreamlet<K,T> reduceByKey(SerializableFunction<R,K> keyExtractor, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet and applying reduceFn to those tuples.

<K,V> KVStreamlet<KeyedWindow<K>,V> reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,V> valueExtractor, WindowConfig windowCfg, SerializableBinaryOperator<V> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet over a window defined by windowCfg and applying reduceFn to those tuples.

<K,T> KVStreamlet<KeyedWindow<K>,T> reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet over a window defined by windowCfg and applying reduceFn to those tuples.

Streamlet<R> repartition(int numPartitions)
Same as filter(filterFn).setNumPartitions(numPartitions), where filterFn is the identity filter that accepts every element.

Streamlet<R> repartition(int numPartitions, SerializableBiFunction<R,Integer,List<Integer>> partitionFn)
A more general version of repartition in which the user determines which partitions each tuple should go to.

Streamlet<R> setName(String sName)
Sets the name of the BaseStreamlet.

Streamlet<R> setNumPartitions(int numPartitions)
Sets the number of partitions of the streamlet.

Streamlet<R> split(Map<String,SerializablePredicate<R>> splitFns)
Returns multiple streams by splitting the incoming stream.

StreamletBase<R> toSink(Sink<R> sink)
Applies the sink's put function to every element of the stream. This is a sink function: it does not produce a Streamlet for further transformations.

<T> Streamlet<T> transform(SerializableTransformer<R,? extends T> serializableTransformer)
Returns a new Streamlet by applying the transformFunction to each element of this streamlet.

Streamlet<R> union(Streamlet<? extends R> other)
Returns a new Streamlet that is the union of this and the 'other' streamlet.

Streamlet<R> withStream(String streamId)
Sets the id of the stream to be used by the child nodes.
-
-
-
Method Detail
-
setName
Streamlet<R> setName(String sName)
Sets the name of the BaseStreamlet.
- Specified by:
setName in interface StreamletBase<R>
- Parameters:
sName - The name given by the user for this BaseStreamlet
- Returns:
The Streamlet with the changed name
-
getName
String getName()
Gets the name of the Streamlet.
- Specified by:
getName in interface StreamletBase<R>
- Returns:
The name of the Streamlet
-
setNumPartitions
Streamlet<R> setNumPartitions(int numPartitions)
Sets the number of partitions of the streamlet.
- Specified by:
setNumPartitions in interface StreamletBase<R>
- Parameters:
numPartitions - The user-assigned number of partitions
- Returns:
The Streamlet with the changed number of partitions
-
getNumPartitions
int getNumPartitions()
Gets the number of partitions of this Streamlet.
- Specified by:
getNumPartitions in interface StreamletBase<R>
- Returns:
The number of partitions of this Streamlet
-
withStream
Streamlet<R> withStream(String streamId)
Sets the id of the stream to be used by the child nodes. Usage (assuming source is a Streamlet object with two output streams, stream1 and stream2):
    source.withStream("stream1").filter(...).log();
    source.withStream("stream2").filter(...).log();
- Parameters:
streamId - The specified stream id
- Returns:
The Streamlet with the changed stream id
-
getStreamId
String getStreamId()
Gets the stream id of this Streamlet.
- Returns:
The stream id of this Streamlet
-
map
<T> Streamlet<T> map(SerializableFunction<R,? extends T> mapFn)
Returns a new Streamlet by applying mapFn to each element of this Streamlet.
- Parameters:
mapFn - The map function that should be applied to each element
-
flatMap
<T> Streamlet<T> flatMap(SerializableFunction<R,? extends Iterable<? extends T>> flatMapFn)
Returns a new Streamlet by applying flatMapFn to each element of this Streamlet and flattening the result.
- Parameters:
flatMapFn - The flatMap function that should be applied to each element
-
filter
Streamlet<R> filter(SerializablePredicate<R> filterFn)
Returns a new Streamlet by applying filterFn to each element of this streamlet and including only those elements that satisfy filterFn.
- Parameters:
filterFn - The filter function that should be applied to each element
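The three element-wise operations described above (map, flatMap, filter) can be pictured with their java.util.stream namesakes. The following is a minimal sketch on a finite list using only the JDK; it is not Heron code, and the class and method names are made up for illustration. One difference to keep in mind: the flatMapFn of a Streamlet returns an Iterable, whereas java.util.stream expects a Stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Plain-JDK model of flatMap, filter and map on a finite list; not Heron code.
class ElementWiseSketch {
  static List<Integer> longWordLengths(List<String> sentences) {
    return sentences.stream()
        // flatMap: each sentence yields zero or more words, flattened into one stream
        .flatMap(sentence -> Arrays.stream(sentence.split(" ")))
        // filter: keep only the words that satisfy the predicate
        .filter(word -> word.length() > 2)
        // map: exactly one output element per input element
        .map(String::length)
        .collect(Collectors.toList());
  }
}
```

Calling `ElementWiseSketch.longWordLengths` with the sentences "a streamlet of tuples" and "to be mapped" keeps the words streamlet, tuples and mapped and returns their lengths.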
-
repartition
Streamlet<R> repartition(int numPartitions)
Same as filter(filterFn).setNumPartitions(numPartitions), where filterFn is the identity filter that accepts every element.
-
repartition
Streamlet<R> repartition(int numPartitions, SerializableBiFunction<R,Integer,List<Integer>> partitionFn)
A more general version of repartition in which the user determines which partitions each tuple should go to. For each element of the current streamlet, the user-supplied partitionFn is invoked with the element as the first argument and the number of partitions of the downstream streamlet as the second argument. The partitionFn should return zero or more unique numbers between 0 and numPartitions to indicate which partitions the element should be routed to.
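To make the contract of partitionFn concrete, here is a minimal sketch of such a function using only the JDK. `BiFunction` stands in for `SerializableBiFunction<R,Integer,List<Integer>>`, the class name and the routing rules are invented for illustration, and the sketch assumes that partition indices are zero-based and smaller than the partition count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;

// Plain-JDK stand-in for a partitionFn; not Heron code.
class PartitionFnSketch {
  // Assumption: valid partition indices are 0 .. numPartitions - 1.
  static final BiFunction<String, Integer, List<Integer>> PARTITION_FN =
      (element, numPartitions) -> {
        if (element.isEmpty()) {
          return Collections.emptyList();         // zero targets: routed to no partition
        }
        if (element.startsWith("*")) {
          List<Integer> all = new ArrayList<>();  // several targets: every partition
          for (int i = 0; i < numPartitions; i++) {
            all.add(i);
          }
          return all;
        }
        // one target, derived from the hash so that equal elements share a partition
        return Collections.singletonList(Math.floorMod(element.hashCode(), numPartitions));
      };

  static List<Integer> route(String element, int numPartitions) {
    return PARTITION_FN.apply(element, numPartitions);
  }
}
```

`Math.floorMod` is used instead of `%` because `hashCode()` can be negative, and a negative index would not name a partition.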
-
clone
List<Streamlet<R>> clone(int numClones)
Clones the current Streamlet. It returns a list of numClones Streamlets, each of which contains all the tuples of the current Streamlet.
- Parameters:
numClones - The number of clones to create
-
join
<K,S,T> KVStreamlet<KeyedWindow<K>,T> join(Streamlet<S> other, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, SerializableBiFunction<R,S,? extends T> joinFunction)
Returns a new Streamlet by inner joining this streamlet with the 'other' streamlet. The join is done over elements accumulated over a time window defined by windowCfg. Elements are matched by comparing the key that thisKeyExtractor produces for this streamlet with the key that otherKeyExtractor produces for the other streamlet. The joinFunction is applied to each matching pair.
- Parameters:
other - The Streamlet that we are joining with
thisKeyExtractor - The function applied to a tuple of this streamlet to get the key
otherKeyExtractor - The function applied to a tuple of the other streamlet to get the key
windowCfg - A specification of the windowing strategy to use. Typical windowing strategies are sliding windows and tumbling windows
joinFunction - The join function that needs to be applied
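The matching rule of the inner join can be sketched for the contents of a single window with plain JDK types. This is only a model, not Heron code: `Function` and `BiFunction` stand in for the serializable function types, the class name is invented, and the result here is a plain list of joined values rather than a KVStreamlet keyed by KeyedWindow<K>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-JDK model of an inner join over the contents of ONE window; not Heron code.
class InnerJoinSketch {
  static <R, S, K, T> List<T> joinWindow(
      List<R> thisWindow, List<S> otherWindow,
      Function<R, K> thisKeyExtractor, Function<S, K> otherKeyExtractor,
      BiFunction<R, S, T> joinFunction) {
    List<T> joined = new ArrayList<>();
    for (R left : thisWindow) {
      for (S right : otherWindow) {
        // a pair matches when both extractors produce equal keys
        if (thisKeyExtractor.apply(left).equals(otherKeyExtractor.apply(right))) {
          joined.add(joinFunction.apply(left, right));
        }
      }
    }
    return joined;
  }
}
```

Elements whose key has no counterpart on the other side produce no output, which is what makes this an inner join.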
-
join
<K,S,T> KVStreamlet<KeyedWindow<K>,T> join(Streamlet<S> other, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, JoinType joinType, SerializableBiFunction<R,S,? extends T> joinFunction)
Returns a new KVStreamlet by joining this streamlet with the 'other' streamlet. The type of join is declared by the joinType parameter; for the types of joins, see JoinType. The join is done over elements accumulated over a time window defined by windowCfg. Elements are matched by comparing the key that thisKeyExtractor produces for this streamlet with the key that otherKeyExtractor produces for the other streamlet. The joinFunction is applied to each matching pair.
- Parameters:
other - The Streamlet that we are joining with
thisKeyExtractor - The function applied to a tuple of this streamlet to get the key
otherKeyExtractor - The function applied to a tuple of the other streamlet to get the key
windowCfg - A specification of the windowing strategy to use. Typical windowing strategies are sliding windows and tumbling windows
joinType - The type of join. For the options, see JoinType
joinFunction - The join function that needs to be applied
-
reduceByKey
<K,T> KVStreamlet<K,T> reduceByKey(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,T> valueExtractor, SerializableBinaryOperator<T> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet and applying reduceFn to those tuples.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
valueExtractor - The function applied to a tuple of this streamlet to extract the value to be reduced
reduceFn - The reduce function that you want to apply to all the values of a key
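The roles of keyExtractor, valueExtractor and reduceFn can be modelled on a finite list with `Map.merge`. This is a plain-JDK sketch, not Heron code: the class name is invented, the JDK functional interfaces stand in for the serializable ones, and the map shows only the state after all tuples have been seen, whereas a streamlet has no such end point.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Plain-JDK model of reduceByKey with a value extractor; not Heron code.
class ReduceByKeySketch {
  static <R, K, T> Map<K, T> reduceByKey(
      List<R> tuples, Function<R, K> keyExtractor,
      Function<R, T> valueExtractor, BinaryOperator<T> reduceFn) {
    Map<K, T> result = new LinkedHashMap<>();
    for (R tuple : tuples) {
      // merge stores the first value of a key and combines later values with reduceFn
      result.merge(keyExtractor.apply(tuple), valueExtractor.apply(tuple), reduceFn);
    }
    return result;
  }
}
```

Because reduceFn combines two values of the same type T, the first value seen for a key serves as the starting point and no identity element is needed.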
-
reduceByKey
<K,T> KVStreamlet<K,T> reduceByKey(SerializableFunction<R,K> keyExtractor, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet and applying reduceFn to those tuples.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
identity - The identity element, which is the initial value for each key
reduceFn - The reduce function that you want to apply to all the values of a key
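In this overload the reduceFn receives the partial result and the whole tuple, so the result type T may differ from the tuple type R. A minimal plain-JDK sketch of that fold on a finite list follows; it is not Heron code, and the class name is invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-JDK model of reduceByKey with an identity element; not Heron code.
class ReduceWithIdentitySketch {
  static <R, K, T> Map<K, T> reduceByKey(
      List<R> tuples, Function<R, K> keyExtractor,
      T identity, BiFunction<T, R, T> reduceFn) {
    Map<K, T> result = new LinkedHashMap<>();
    for (R tuple : tuples) {
      K key = keyExtractor.apply(tuple);
      // every key starts from identity; reduceFn folds the whole tuple into the partial result
      T partial = result.getOrDefault(key, identity);
      result.put(key, reduceFn.apply(partial, tuple));
    }
    return result;
  }
}
```

For example, with words as tuples, the first letter as key, an identity of 0 and a reduceFn that adds the word length, the result maps each first letter to the total length of its words.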
-
reduceByKeyAndWindow
<K,V> KVStreamlet<KeyedWindow<K>,V> reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,V> valueExtractor, WindowConfig windowCfg, SerializableBinaryOperator<V> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet over a window defined by windowCfg and applying reduceFn to those tuples.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
valueExtractor - The function applied to a tuple of this streamlet to extract the value to be reduced
windowCfg - A specification of the windowing strategy to use. Typical windowing strategies are sliding windows and tumbling windows
reduceFn - The reduce function that you want to apply to all the values of a key
-
reduceByKeyAndWindow
<K,T> KVStreamlet<KeyedWindow<K>,T> reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Returns a new Streamlet by accumulating the tuples of this streamlet over a window defined by windowCfg and applying reduceFn to those tuples. For each window, the value identity is used as the initial value. All the matching tuples are reduced using reduceFn, starting from this initial value.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
windowCfg - A specification of the windowing strategy to use. Typical windowing strategies are sliding windows and tumbling windows
identity - The identity element, which is both the initial value inside the reduction window and the default result if there are no elements in the window
reduceFn - The reduce function, which takes two parameters: a partial result of the reduction and the next element of the stream. It returns a new partial result
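The per-window restart from identity can be illustrated with a plain-JDK model of a tumbling window that closes after a fixed number of tuples. This is a sketch under stated assumptions, not Heron code: the class name and the windowSize parameter are invented, WindowConfig is not used, and a trailing partial window is emitted here simply to keep the model short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-JDK model of reduceByKeyAndWindow over tumbling count windows; not Heron code.
class WindowedReduceSketch {
  // Returns one key -> result map per window of windowSize consecutive tuples.
  static <R, K, T> List<Map<K, T>> reduceByKeyAndWindow(
      List<R> tuples, int windowSize, Function<R, K> keyExtractor,
      T identity, BiFunction<T, R, T> reduceFn) {
    List<Map<K, T>> perWindow = new ArrayList<>();
    for (int start = 0; start < tuples.size(); start += windowSize) {
      int end = Math.min(start + windowSize, tuples.size());
      Map<K, T> result = new LinkedHashMap<>();
      for (R tuple : tuples.subList(start, end)) {
        K key = keyExtractor.apply(tuple);
        // the reduction restarts from identity in every window
        result.put(key, reduceFn.apply(result.getOrDefault(key, identity), tuple));
      }
      perWindow.add(result);
    }
    return perWindow;
  }
}
```

With the tuples a, b, a, a, a window size of 2, an identity of 0 and a reduceFn that adds 1, the first window counts a and b once each and the second window counts a twice; nothing carries over between windows.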
-
union
Streamlet<R> union(Streamlet<? extends R> other)
Returns a new Streamlet that is the union of this and the 'other' streamlet. The new streamlet contains the tuples of both Streamlets.
-
transform
<T> Streamlet<T> transform(SerializableTransformer<R,? extends T> serializableTransformer)
Returns a new Streamlet by applying the transformFunction to each element of this streamlet. Before the transformFunction starts cycling over the Streamlet, the open function is called. This allows the transformFunction to do any kind of initialization, loading, and so on.
- Type Parameters:
T - The return type of the transform
- Parameters:
serializableTransformer - The transformation function to be applied
- Returns:
Streamlet containing the output of the transformFunction
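The lifecycle described above (initialize once, then process every element) can be sketched with a small stand-in interface. Everything in this block is hypothetical: `TransformerSketch`, its `open` and `transform` methods, and the emit callback are invented for illustration, and the actual SerializableTransformer interface may use different method names and arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a transformer with an initialization hook; not Heron code.
interface TransformerSketch<I, O> {
  void open();                                  // called once, before any element
  void transform(I element, Consumer<O> emit);  // called once per element
}

class PrefixTransformer implements TransformerSketch<String, String> {
  private String prefix;                        // state prepared in open()

  @Override
  public void open() {
    prefix = "event-";                          // stands in for loading a resource
  }

  @Override
  public void transform(String element, Consumer<String> emit) {
    emit.accept(prefix + element);
  }
}

class TransformSketch {
  // Drives a transformer in the order the description gives: open first, then every element.
  static <I, O> List<O> run(TransformerSketch<I, O> transformer, List<I> input) {
    List<O> output = new ArrayList<>();
    transformer.open();
    for (I element : input) {
      transformer.transform(element, output::add);
    }
    return output;
  }
}
```

Keeping the setup in a separate hook means expensive state is built once per transformer instance rather than once per tuple.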
-
applyOperator
<T> Streamlet<T> applyOperator(IStreamletOperator<R,T> operator)
Returns a new Streamlet by applying the operator to each element of this streamlet.
- Type Parameters:
T - The return type of the operator
- Parameters:
operator - The operator to be applied
- Returns:
Streamlet containing the output of the operation
-
applyOperator
<T> Streamlet<T> applyOperator(IStreamletOperator<R,T> operator, StreamGrouping grouper)
Returns a new Streamlet by applying the operator to each element of this streamlet.
- Type Parameters:
T - The return type of the operator
- Parameters:
operator - The operator to be applied
grouper - The grouper to be applied with the operator
- Returns:
Streamlet containing the output of the operation
-
split
Streamlet<R> split(Map<String,SerializablePredicate<R>> splitFns)
Returns multiple streams by splitting the incoming stream.
- Parameters:
splitFns - The split functions that test whether a tuple should be emitted into each stream. Note that a tuple may have zero or multiple target stream ids
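The note about zero or multiple target stream ids follows from the fact that every predicate in splitFns is tested independently. A minimal plain-JDK sketch of that evaluation is shown below; it is not Heron code, `Predicate` stands in for `SerializablePredicate`, and the class name, stream ids and predicates are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-JDK model of how split functions select target streams; not Heron code.
class SplitSketch {
  // Returns the ids of all streams whose predicate accepts the tuple.
  static <R> List<String> targetStreams(Map<String, Predicate<R>> splitFns, R tuple) {
    List<String> targets = new ArrayList<>();
    for (Map.Entry<String, Predicate<R>> entry : splitFns.entrySet()) {
      if (entry.getValue().test(tuple)) {
        targets.add(entry.getKey());
      }
    }
    return targets;
  }

  // Example split functions keyed by stream id.
  static Map<String, Predicate<Integer>> exampleSplitFns() {
    Map<String, Predicate<Integer>> splitFns = new LinkedHashMap<>();
    splitFns.put("even", n -> n % 2 == 0);
    splitFns.put("large", n -> n > 100);
    return splitFns;
  }
}
```

With these example predicates, 200 goes to both streams, 4 only to "even", and 3 to neither.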
-
keyBy
<K> KVStreamlet<K,R> keyBy(SerializableFunction<R,K> keyExtractor)
Returns a new KVStreamlet<K,R> by applying the key extractor to each element of this Streamlet.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
-
keyBy
<K,V> KVStreamlet<K,V> keyBy(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,V> valueExtractor)
Returns a new KVStreamlet<K,V> by applying the key and value extractors to each element of this Streamlet.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
valueExtractor - The function applied to a tuple of this streamlet to extract the value
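Unlike the reduce and count operations, keyBy only reshapes each tuple into a key-value pair and aggregates nothing. The following plain-JDK sketch models this on a finite list, with `Map.Entry` standing in for the key-value tuples of a KVStreamlet; it is not Heron code, and the class name is invented.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-JDK model of keyBy with key and value extractors; not Heron code.
class KeyBySketch {
  static <R, K, V> List<Map.Entry<K, V>> keyBy(
      List<R> tuples, Function<R, K> keyExtractor, Function<R, V> valueExtractor) {
    List<Map.Entry<K, V>> pairs = new ArrayList<>();
    for (R tuple : tuples) {
      // one (key, value) pair per input tuple; nothing is aggregated
      pairs.add(new SimpleEntry<>(keyExtractor.apply(tuple), valueExtractor.apply(tuple)));
    }
    return pairs;
  }
}
```

The single-argument overload corresponds to passing the identity function as valueExtractor, so that the value is the tuple itself, which matches its KVStreamlet<K,R> return type.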
-
countByKey
<K> KVStreamlet<K,Long> countByKey(SerializableFunction<R,K> keyExtractor)
Returns a new stream of (key, count) pairs by counting the tuples in this stream for each key.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
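On a finite list, counting by key corresponds to grouping with a counting collector. The sketch below uses only the JDK and is not Heron code; the class name is invented, and the map shows the counts after all tuples have been seen.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Plain-JDK model of countByKey on a finite list; not Heron code.
class CountByKeySketch {
  static <R, K> Map<K, Long> countByKey(List<R> tuples, Function<R, K> keyExtractor) {
    // group the tuples by extracted key and count each group; the counts are Long,
    // matching the KVStreamlet<K,Long> result type
    return tuples.stream()
        .collect(Collectors.groupingBy(keyExtractor, Collectors.counting()));
  }
}
```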
-
countByKeyAndWindow
<K> KVStreamlet<KeyedWindow<K>,Long> countByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg)
Returns a new stream of (keyed window, count) pairs by counting the tuples in this stream over a window for each key.
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
windowCfg - A specification of the windowing strategy to use. Typical windowing strategies are sliding windows and tumbling windows
-
log
StreamletBase<R> log()
Logs every element of the streamlet using the String.valueOf function. This is one of the sink functions: it does not produce a Streamlet for further transformations.
-
consume
StreamletBase<R> consume(SerializableConsumer<R> consumer)
Applies the consumer function to every element of the stream. This is a sink function: it does not produce a Streamlet for further transformations.
- Parameters:
consumer - The user-supplied consumer function that is invoked for each element of this streamlet
-
toSink
StreamletBase<R> toSink(Sink<R> sink)
Applies the sink's put function to every element of the stream. This is a sink function: it does not produce a Streamlet for further transformations.
- Parameters:
sink - The Sink whose put method consumes each element of this streamlet
-
-