Package org.apache.heron.streamlet.impl
Class StreamletImpl<R>
- java.lang.Object
-
- org.apache.heron.streamlet.impl.StreamletBaseImpl<R>
-
- org.apache.heron.streamlet.impl.StreamletImpl<R>
-
- All Implemented Interfaces:
Streamlet<R>, StreamletBase<R>
- Direct Known Subclasses:
CountByKeyAndWindowStreamlet, CountByKeyStreamlet, CustomStreamlet, FilterStreamlet, FlatMapStreamlet, GeneralReduceByKeyAndWindowStreamlet, GeneralReduceByKeyStreamlet, JoinStreamlet, KeyByStreamlet, MapStreamlet, ReduceByKeyAndWindowStreamlet, ReduceByKeyStreamlet, RemapStreamlet, SourceStreamlet, SplitStreamlet, SpoutStreamlet, StreamletShadow, SupplierStreamlet, TransformStreamlet, UnionStreamlet
public abstract class StreamletImpl<R> extends StreamletBaseImpl<R> implements Streamlet<R>
A Streamlet is a (potentially unbounded) ordered collection of tuples. Streamlets originate from pub/sub systems (such as Pulsar or Kafka), from static data (such as CSV files or HDFS files), or from any other source. They are also created by transforming existing Streamlets using operations such as map and flatMap. Besides the tuples, a Streamlet has the following properties associated with it:
a) name: a user-assigned or system-generated name used to refer to the streamlet.
b) nPartitions: the number of partitions that the streamlet is composed of. The ordering of tuples in a Streamlet is defined only with respect to the tuples within a partition, which allows the system to distribute the partitions to different nodes across the cluster.
A number of transformations (such as map and flatMap) can be applied to Streamlets. Each transformation operates on every tuple of the Streamlet and produces a new Streamlet. One can think of a transformation as attaching itself to the stream and processing each tuple as it goes by. Thus the parallelism of any operator is implicitly determined by the number of partitions of the stream that it operates on. If a particular transformation needs to operate at a different parallelism, one can repartition the Streamlet before applying the transformation.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.heron.streamlet.impl.StreamletBaseImpl
StreamletBaseImpl.StreamletNamePrefix
-
-
Field Summary
-
Fields inherited from class org.apache.heron.streamlet.impl.StreamletBaseImpl
name, nPartitions
-
-
Constructor Summary
Constructors (Modifier, Constructor, Description):
protected
StreamletImpl()
Only used by the implementors
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Each entry below gives Modifier and Type, Method, and Description on separate lines.
-
<T> Streamlet<T>
applyOperator(IStreamletOperator<R,T> operator)
Returns a new Streamlet by applying the operator on each element of this streamlet.
-
<T> Streamlet<T>
applyOperator(IStreamletOperator<R,T> operator, StreamGrouping grouper)
Returns a new Streamlet by applying the operator on each element of this streamlet.
-
List<Streamlet<R>>
clone(int numClones)
Clones the current Streamlet.
-
StreamletBase<R>
consume(SerializableConsumer<R> consumer)
Applies the consumer function for every element of this streamlet.
-
<K> KVStreamlet<K,Long>
countByKey(SerializableFunction<R,K> keyExtractor)
Returns a new stream of (key, count) pairs by counting tuples in this stream on each key.
-
<K> KVStreamlet<KeyedWindow<K>,Long>
countByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg)
Returns a new stream of (key, count) pairs by counting tuples over a window in this stream on each key.
-
Streamlet<R>
filter(SerializablePredicate<R> filterFn)
Return a new Streamlet by applying the filterFn on each element of this streamlet and including only those elements that satisfy the filterFn.
-
<T> Streamlet<T>
flatMap(SerializableFunction<R,? extends Iterable<? extends T>> flatMapFn)
Return a new Streamlet by applying flatMapFn to each element of this Streamlet and flattening the result.
-
protected Set<String>
getAvailableStreamIds()
Get the available stream ids in the Streamlet.
-
String
getStreamId()
Gets the stream id of this Streamlet.
-
<K,S,T> KVStreamlet<KeyedWindow<K>,T>
join(Streamlet<S> otherStreamlet, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, JoinType joinType, SerializableBiFunction<R,S,? extends T> joinFunction)
Return a new KVStreamlet by joining ‘this’ streamlet with ‘other’ streamlet.
-
<K,S,T> KVStreamlet<KeyedWindow<K>,T>
join(Streamlet<S> otherStreamlet, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, SerializableBiFunction<R,S,? extends T> joinFunction)
Return a new Streamlet by inner joining ‘this’ streamlet with ‘other’ streamlet.
-
<K> KVStreamlet<K,R>
keyBy(SerializableFunction<R,K> keyExtractor)
Return a new KVStreamlet<K,R> by applying key extractor to each element of this Streamlet.
-
<K,V> KVStreamlet<K,V>
keyBy(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,V> valueExtractor)
Return a new KVStreamlet<K,V> by applying key and value extractor to each element of this Streamlet.
-
StreamletBase<R>
log()
Logs every element of the streamlet using the String.valueOf function. Note that LogStreamlet is an empty streamlet.
-
<T> Streamlet<T>
map(SerializableFunction<R,? extends T> mapFn)
Return a new Streamlet by applying mapFn to each element of this Streamlet.
-
<K,T> KVStreamlet<K,T>
reduceByKey(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,T> valueExtractor, SerializableBinaryOperator<T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet and applying reduceFn on those tuples.
-
<K,T> KVStreamlet<K,T>
reduceByKey(SerializableFunction<R,K> keyExtractor, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet and applying reduceFn on those tuples.
-
<K,T> KVStreamlet<KeyedWindow<K>,T>
reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,T> valueExtractor, WindowConfig windowCfg, SerializableBinaryOperator<T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet over a Window defined by windowCfg and applying reduceFn on those tuples.
-
<K,T> KVStreamlet<KeyedWindow<K>,T>
reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet over a Window defined by windowCfg and applying reduceFn on those tuples.
-
Streamlet<R>
repartition(int numPartitions)
Same as filter(Identity).setNumPartitions(numPartitions)
-
Streamlet<R>
repartition(int numPartitions, SerializableBiFunction<R,Integer,List<Integer>> partitionFn)
A more generalized version of repartition where a user can determine which partitions any particular tuple should go to.
-
Streamlet<R>
setName(String sName)
Sets the name of the Streamlet.
-
Streamlet<R>
setNumPartitions(int numPartitions)
Sets the number of partitions of the streamlet.
-
Streamlet<R>
split(Map<String,SerializablePredicate<R>> splitFns)
Returns multiple streams by splitting incoming stream.
-
StreamletBase<R>
toSink(Sink<R> sink)
Uses the sink to consume every element of this streamlet.
-
<T> Streamlet<T>
transform(SerializableTransformer<R,? extends T> serializableTransformer)
Returns a new Streamlet by applying the transformFunction on each element of this streamlet.
-
Streamlet<R>
union(Streamlet<? extends R> otherStreamlet)
Returns a new Streamlet that is the union of this and the ‘other’ streamlet.
-
Streamlet<R>
withStream(String streamId)
Set the id of the stream to be used by the children nodes.
-
Methods inherited from class org.apache.heron.streamlet.impl.StreamletBaseImpl
addChild, build, doBuild, getChildren, getName, getNumPartitions, isBuilt, isFullyBuilt, setDefaultNameIfNone
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.heron.streamlet.Streamlet
getName, getNumPartitions
-
-
-
-
Method Detail
-
setName
public Streamlet<R> setName(String sName)
Sets the name of the Streamlet.
- Specified by:
setName in interface Streamlet<R>
- Specified by:
setName in interface StreamletBase<R>
- Overrides:
setName in class StreamletBaseImpl<R>
- Parameters:
sName - The name given by the user for this streamlet
- Returns:
Returns back the Streamlet with changed name
-
setNumPartitions
public Streamlet<R> setNumPartitions(int numPartitions)
Sets the number of partitions of the streamlet
- Specified by:
setNumPartitions in interface Streamlet<R>
- Specified by:
setNumPartitions in interface StreamletBase<R>
- Overrides:
setNumPartitions in class StreamletBaseImpl<R>
- Parameters:
numPartitions - The user assigned number of partitions
- Returns:
Returns back the Streamlet with changed number of partitions
-
withStream
public Streamlet<R> withStream(String streamId)
Set the id of the stream to be used by the children nodes.
Usage (assuming source is a Streamlet object with two output streams: stream1 and stream2):
source.withStream("stream1").filter(...).log();
source.withStream("stream2").filter(...).log();
- Specified by:
withStream in interface Streamlet<R>
- Parameters:
streamId - The specified stream id
- Returns:
Returns back the Streamlet with changed stream id
-
getAvailableStreamIds
protected Set<String> getAvailableStreamIds()
Get the available stream ids in the Streamlet. For most Streamlets, there is only one internal stream id, therefore the function returns a set of one single stream id.
- Returns:
Returns a set of one single stream id.
-
getStreamId
public String getStreamId()
Gets the stream id of this Streamlet.
- Specified by:
getStreamId in interface Streamlet<R>
- Returns:
the stream id of this Streamlet
-
map
public <T> Streamlet<T> map(SerializableFunction<R,? extends T> mapFn)
Return a new Streamlet by applying mapFn to each element of this Streamlet
-
flatMap
public <T> Streamlet<T> flatMap(SerializableFunction<R,? extends Iterable<? extends T>> flatMapFn)
Return a new Streamlet by applying flatMapFn to each element of this Streamlet and flattening the result
-
filter
public Streamlet<R> filter(SerializablePredicate<R> filterFn)
Return a new Streamlet by applying the filterFn on each element of this streamlet and including only those elements that satisfy the filterFn
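The per-tuple behaviour of map, flatMap and filter can be illustrated on an in-memory list. The following is a minimal sketch that uses java.util.stream as a stand-in; the class ElementwiseSketch and its method names are hypothetical and are not part of the Heron Streamlet API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in: models the per-tuple semantics on a finite list,
// using java.util.stream rather than the Heron Streamlet API.
final class ElementwiseSketch {
  // map: exactly one output tuple per input tuple.
  static List<Integer> lengths(List<String> lines) {
    return lines.stream().map(String::length).collect(Collectors.toList());
  }

  // flatMap: each input tuple yields several outputs, which are flattened
  // into a single result.
  static List<String> words(List<String> lines) {
    return lines.stream()
        .flatMap(line -> Arrays.stream(line.split(" ")))
        .collect(Collectors.toList());
  }

  // filter: keeps only the tuples that satisfy the predicate.
  static List<String> longWords(List<String> words, int minLength) {
    return words.stream()
        .filter(w -> w.length() >= minLength)
        .collect(Collectors.toList());
  }
}
```

With a Streamlet the same lambdas would be passed to map, flatMap and filter, and the result would be a new Streamlet rather than a List.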
-
repartition
public Streamlet<R> repartition(int numPartitions)
Same as filter(Identity).setNumPartitions(numPartitions)
- Specified by:
repartition in interface Streamlet<R>
-
repartition
public Streamlet<R> repartition(int numPartitions, SerializableBiFunction<R,Integer,List<Integer>> partitionFn)
A more generalized version of repartition where a user can determine which partitions any particular tuple should go to
- Specified by:
repartition in interface Streamlet<R>
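A partition function of the shape accepted by this overload might look like the following sketch. It assumes that the Integer argument is the number of target partitions and that the returned list holds zero-based partition indices; neither detail is stated on this page. The class PartitionFnSketch is hypothetical, and java.util.function.BiFunction stands in for SerializableBiFunction.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch of a hash-based partition function.
final class PartitionFnSketch {
  // Assumption: the second argument is the number of target partitions and
  // the result is a list of zero-based partition indices for the tuple.
  static final BiFunction<String, Integer, List<Integer>> BY_HASH =
      (tuple, numPartitions) ->
          Collections.singletonList(Math.floorMod(tuple.hashCode(), numPartitions));
}
```

Math.floorMod is used instead of % so that a negative hash code still yields a non-negative index. Equal tuples always map to the same partition.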
-
clone
public List<Streamlet<R>> clone(int numClones)
Clones the current Streamlet. It returns a list of numClones Streamlets where each Streamlet contains all the tuples of the current Streamlet
-
join
public <K,S,T> KVStreamlet<KeyedWindow<K>,T> join(Streamlet<S> otherStreamlet, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, SerializableBiFunction<R,S,? extends T> joinFunction)
Return a new Streamlet by inner joining ‘this’ streamlet with ‘other’ streamlet. The join is done over elements accumulated over a time window defined by windowCfg. The elements are compared using the thisKeyExtractor for this streamlet with the otherKeyExtractor for the other streamlet. On each matching pair, the joinFunction is applied.
- Specified by:
join in interface Streamlet<R>
- Parameters:
otherStreamlet - The Streamlet that we are joining with.
thisKeyExtractor - The function applied to a tuple of this streamlet to get the key
otherKeyExtractor - The function applied to a tuple of the other streamlet to get the key
windowCfg - This is a specification of what kind of windowing strategy you would like to have. Typical windowing strategies are sliding windows and tumbling windows
joinFunction - The join function that needs to be applied
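The matching rule described above (compare extracted keys, apply joinFunction on each matching pair) can be sketched for the contents of a single window. This is a plain nested-loop model over two lists, not Heron's implementation; the class InnerJoinSketch is hypothetical and the windowing itself is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical model of an inner join over the tuples of one window.
final class InnerJoinSketch {
  static <R, S, K, T> List<T> join(
      List<R> left, List<S> right,
      Function<R, K> leftKey, Function<S, K> rightKey,
      BiFunction<R, S, T> joinFn) {
    List<T> out = new ArrayList<>();
    for (R l : left) {
      for (S r : right) {
        // Every pair whose extracted keys are equal is passed to joinFn.
        if (leftKey.apply(l).equals(rightKey.apply(r))) {
          out.add(joinFn.apply(l, r));
        }
      }
    }
    return out;
  }
}
```

Tuples whose key has no counterpart on the other side produce no output, which is what distinguishes the inner join from the outer variants.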
-
join
public <K,S,T> KVStreamlet<KeyedWindow<K>,T> join(Streamlet<S> otherStreamlet, SerializableFunction<R,K> thisKeyExtractor, SerializableFunction<S,K> otherKeyExtractor, WindowConfig windowCfg, JoinType joinType, SerializableBiFunction<R,S,? extends T> joinFunction)
Return a new KVStreamlet by joining ‘this’ streamlet with ‘other’ streamlet. The type of joining is declared by the joinType parameter. The join is done over elements accumulated over a time window defined by windowCfg. The elements are compared using the thisKeyExtractor for this streamlet with the otherKeyExtractor for the other streamlet. On each matching pair, the joinFunction is applied. Types of joins: JoinType
- Specified by:
join in interface Streamlet<R>
- Parameters:
otherStreamlet - The Streamlet that we are joining with.
thisKeyExtractor - The function applied to a tuple of this streamlet to get the key
otherKeyExtractor - The function applied to a tuple of the other streamlet to get the key
windowCfg - This is a specification of what kind of windowing strategy you would like to have. Typical windowing strategies are sliding windows and tumbling windows
joinType - Type of Join. Options: JoinType
joinFunction - The join function that needs to be applied
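How a join type can change the output is sketched below for one window. The enum Kind and the class JoinTypeSketch are hypothetical, and the handling of unmatched tuples (calling the join function with null for the missing side) is an assumption made for illustration; consult JoinType for the options that actually exist.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical model comparing an inner join with a left outer join.
final class JoinTypeSketch {
  // Hypothetical enum for this sketch only; it is not Heron's JoinType.
  enum Kind { INNER, OUTER_LEFT }

  static <R, S, K, T> List<T> join(
      List<R> left, List<S> right,
      Function<R, K> leftKey, Function<S, K> rightKey,
      Kind kind, BiFunction<R, S, T> joinFn) {
    List<T> out = new ArrayList<>();
    for (R l : left) {
      boolean matched = false;
      for (S r : right) {
        if (Objects.equals(leftKey.apply(l), rightKey.apply(r))) {
          out.add(joinFn.apply(l, r));
          matched = true;
        }
      }
      // Assumption: a left outer join still emits unmatched left tuples,
      // passing null for the missing right side.
      if (!matched && kind == Kind.OUTER_LEFT) {
        out.add(joinFn.apply(l, null));
      }
    }
    return out;
  }
}
```

Under this assumption a join function used with an outer join has to tolerate a null argument.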
-
reduceByKey
public <K,T> KVStreamlet<K,T> reduceByKey(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,T> valueExtractor, SerializableBinaryOperator<T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet and applying reduceFn on those tuples.
- Specified by:
reduceByKey in interface Streamlet<R>
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
valueExtractor - The function applied to a tuple of this streamlet to extract the value to be reduced on
reduceFn - The reduce function that you want to apply to all the values of a key.
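The roles of keyExtractor, valueExtractor and reduceFn can be sketched over a finite list. The class ReduceByKeySketch is hypothetical; it returns only the final accumulated value per key, whereas how and when a streaming implementation emits intermediate results is not described on this page.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical model of a keyed reduction over a finite list of tuples.
final class ReduceByKeySketch {
  static <R, K, T> Map<K, T> reduceByKey(
      List<R> tuples, Function<R, K> keyExtractor,
      Function<R, T> valueExtractor, BinaryOperator<T> reduceFn) {
    Map<K, T> acc = new LinkedHashMap<>();
    for (R t : tuples) {
      // The first value seen for a key is stored as-is; every later value
      // is combined with the stored one through reduceFn.
      acc.merge(keyExtractor.apply(t), valueExtractor.apply(t), reduceFn);
    }
    return acc;
  }
}
```

Because both arguments of reduceFn have the same type T, this overload suits reductions such as sums, minima or maxima of the extracted values.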
-
reduceByKey
public <K,T> KVStreamlet<K,T> reduceByKey(SerializableFunction<R,K> keyExtractor, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet and applying reduceFn on those tuples.
- Specified by:
reduceByKey in interface Streamlet<R>
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
identity - The identity element is the initial value for each key
reduceFn - The reduce function that you want to apply to all the values of a key.
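In this overload the reduce function takes a partial result of type T and a raw tuple of type R, so it behaves like a fold that starts from identity for each key. A minimal sketch over a finite list follows; the class FoldByKeySketch is hypothetical and models only the final value per key.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical model of a keyed fold that starts from an identity value.
final class FoldByKeySketch {
  static <R, K, T> Map<K, T> reduceByKey(
      List<R> tuples, Function<R, K> keyExtractor,
      T identity, BiFunction<T, R, ? extends T> reduceFn) {
    Map<K, T> acc = new LinkedHashMap<>();
    for (R t : tuples) {
      K key = keyExtractor.apply(t);
      // A key seen for the first time starts from the identity value.
      T partial = acc.getOrDefault(key, identity);
      acc.put(key, reduceFn.apply(partial, t));
    }
    return acc;
  }
}
```

Unlike the overload with a valueExtractor, the result type T does not have to match anything extracted from the tuple, for example when summing word lengths into an Integer.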
-
reduceByKeyAndWindow
public <K,T> KVStreamlet<KeyedWindow<K>,T> reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,T> valueExtractor, WindowConfig windowCfg, SerializableBinaryOperator<T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet over a Window defined by windowCfg and applying reduceFn on those tuples.
- Specified by:
reduceByKeyAndWindow in interface Streamlet<R>
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
valueExtractor - The function applied to a tuple of this streamlet to extract the value to be reduced on
windowCfg - This is a specification of what kind of windowing strategy you would like to have. Typical windowing strategies are sliding windows and tumbling windows
reduceFn - The reduce function that you want to apply to all the values of a key.
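The effect of windowing on a keyed reduction can be sketched with a count-based tumbling window, in which consecutive groups of windowSize tuples are reduced independently. The class WindowedReduceSketch and the windowSize parameter are hypothetical and stand in for whatever windowCfg specifies; the sketch also reduces a trailing partial window, which a real windowing strategy may handle differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical model: one key-to-value map per tumbling window.
final class WindowedReduceSketch {
  static <R, K, T> List<Map<K, T>> reduceByKeyAndWindow(
      List<R> tuples, int windowSize, Function<R, K> keyExtractor,
      Function<R, T> valueExtractor, BinaryOperator<T> reduceFn) {
    List<Map<K, T>> windows = new ArrayList<>();
    for (int start = 0; start < tuples.size(); start += windowSize) {
      int end = Math.min(start + windowSize, tuples.size());
      // State is not carried over: each window starts with an empty map.
      Map<K, T> acc = new LinkedHashMap<>();
      for (R t : tuples.subList(start, end)) {
        acc.merge(keyExtractor.apply(t), valueExtractor.apply(t), reduceFn);
      }
      windows.add(acc);
    }
    return windows;
  }
}
```

Each map in the result corresponds to one window, which mirrors the KeyedWindow<K> key of the returned KVStreamlet: a result is identified by both its key and its window.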
-
reduceByKeyAndWindow
public <K,T> KVStreamlet<KeyedWindow<K>,T> reduceByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg, T identity, SerializableBiFunction<T,R,? extends T> reduceFn)
Return a new Streamlet accumulating tuples of this streamlet over a Window defined by windowCfg and applying reduceFn on those tuples. For each window, the value identity is used as an initial value. All the matching tuples are reduced using reduceFn starting from this initial value.
- Specified by:
reduceByKeyAndWindow in interface Streamlet<R>
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
windowCfg - This is a specification of what kind of windowing strategy you would like to have. Typical windowing strategies are sliding windows and tumbling windows
identity - The identity element is both the initial value inside the reduction window and the default result if there are no elements in the window
reduceFn - The reduce function takes two parameters: a partial result of the reduction and the next element of the stream. It returns a new partial result.
-
union
public Streamlet<R> union(Streamlet<? extends R> otherStreamlet)
Returns a new Streamlet that is the union of this and the ‘other’ streamlet. Essentially the new streamlet will contain tuples belonging to both Streamlets
-
log
public StreamletBase<R> log()
Logs every element of the streamlet using the String.valueOf function. Note that LogStreamlet is an empty streamlet, that is, a streamlet that does not contain any tuples. Thus this function returns a StreamletBase<R> rather than a Streamlet<R>.
-
consume
public StreamletBase<R> consume(SerializableConsumer<R> consumer)
Applies the consumer function for every element of this streamlet
-
toSink
public StreamletBase<R> toSink(Sink<R> sink)
Uses the sink to consume every element of this streamlet
-
transform
public <T> Streamlet<T> transform(SerializableTransformer<R,? extends T> serializableTransformer)
Returns a new Streamlet by applying the transformFunction on each element of this streamlet. Before starting to cycle the transformFunction over the Streamlet, the open function is called. This allows the transformFunction to do any kind of initialization, loading, etc.
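The lifecycle described above (initialize once, then process each tuple) can be sketched with a small stand-in interface. The interface Transformer, its method names and the class TransformSketch are hypothetical and do not reproduce the SerializableTransformer API; the emit callback, which lets one input produce zero or more outputs, is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of a transformer with a one-time initialization step.
final class TransformSketch {
  // Hypothetical interface for this sketch; not the SerializableTransformer API.
  interface Transformer<I, O> {
    void open();                               // one-time initialization
    void transform(I tuple, Consumer<O> emit); // assumed to emit zero or more outputs
  }

  static <I, O> List<O> run(List<I> tuples, Transformer<I, O> transformer) {
    List<O> out = new ArrayList<>();
    transformer.open(); // called once, before any tuple is processed
    for (I t : tuples) {
      transformer.transform(t, out::add);
    }
    return out;
  }

  // Example transformer: prepares a prefix in open() and applies it per tuple.
  static final class Prefixer implements Transformer<String, String> {
    private String prefix;

    @Override
    public void open() {
      prefix = "id-";
    }

    @Override
    public void transform(String tuple, Consumer<String> emit) {
      emit.accept(prefix + tuple);
    }
  }
}
```

Keeping initialization separate from per-tuple work means that expensive setup, such as loading a lookup table, is done once instead of once per tuple.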
-
applyOperator
public <T> Streamlet<T> applyOperator(IStreamletOperator<R,T> operator)
Returns a new Streamlet by applying the operator on each element of this streamlet.
- Specified by:
applyOperator in interface Streamlet<R>
- Type Parameters:
T - The return type of the transform
- Parameters:
operator - The operator to be applied
- Returns:
Streamlet containing the output of the operation
-
applyOperator
public <T> Streamlet<T> applyOperator(IStreamletOperator<R,T> operator, StreamGrouping grouper)
Returns a new Streamlet by applying the operator on each element of this streamlet.
- Specified by:
applyOperator in interface Streamlet<R>
- Type Parameters:
T - The return type of the transform
- Parameters:
operator - The operator to be applied
grouper - The grouper to be applied with the operator
- Returns:
Streamlet containing the output of the operation
-
split
public Streamlet<R> split(Map<String,SerializablePredicate<R>> splitFns)
Returns multiple streams by splitting incoming stream.
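One way to read the splitFns map is as a set of named predicates, where each tuple is routed to every stream id whose predicate accepts it. The sketch below models that routing decision for a single tuple. The class SplitSketch is hypothetical, java.util.function.Predicate stands in for SerializablePredicate, and the routing rule itself is an assumption based on the note on this page that there can be 0 or multiple target stream ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of how split predicates select target streams.
final class SplitSketch {
  // Returns the ids of every stream whose predicate accepts the tuple.
  // The result may be empty or may contain several ids.
  static <R> List<String> targetStreams(Map<String, Predicate<R>> splitFns, R tuple) {
    List<String> ids = new ArrayList<>();
    for (Map.Entry<String, Predicate<R>> e : splitFns.entrySet()) {
      if (e.getValue().test(tuple)) {
        ids.add(e.getKey());
      }
    }
    return ids;
  }
}
```

The keys of the map are the stream ids that a downstream consumer would select with withStream(streamId).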
-
keyBy
public <K> KVStreamlet<K,R> keyBy(SerializableFunction<R,K> keyExtractor)
Return a new KVStreamlet<K,R> by applying key extractor to each element of this Streamlet
-
keyBy
public <K,V> KVStreamlet<K,V> keyBy(SerializableFunction<R,K> keyExtractor, SerializableFunction<R,V> valueExtractor)
Return a new KVStreamlet<K,V> by applying key and value extractor to each element of this Streamlet
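The two keyBy overloads differ only in what becomes the value of each pair. A minimal sketch of the two-extractor form over a finite list follows; the class KeyBySketch is hypothetical and uses Map.Entry as a stand-in for the key-value pairs of a KVStreamlet.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model: turns each tuple into one (key, value) pair.
final class KeyBySketch {
  static <R, K, V> List<Map.Entry<K, V>> keyBy(
      List<R> tuples, Function<R, K> keyExtractor, Function<R, V> valueExtractor) {
    List<Map.Entry<K, V>> out = new ArrayList<>();
    for (R t : tuples) {
      out.add(new SimpleImmutableEntry<>(keyExtractor.apply(t), valueExtractor.apply(t)));
    }
    return out;
  }
}
```

The single-argument overload returns KVStreamlet<K,R>, which corresponds to passing the identity function as the value extractor in this sketch.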
-
countByKey
public <K> KVStreamlet<K,Long> countByKey(SerializableFunction<R,K> keyExtractor)
Returns a new stream of (key, count) pairs by counting tuples in this stream on each key.
- Specified by:
countByKey in interface Streamlet<R>
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
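Counting by key is a keyed reduction in which every tuple contributes 1. The sketch below shows this over a finite list; the class CountByKeySketch is hypothetical and returns only the final count per key.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of per-key counting over a finite list of tuples.
final class CountByKeySketch {
  static <R, K> Map<K, Long> countByKey(List<R> tuples, Function<R, K> keyExtractor) {
    Map<K, Long> counts = new LinkedHashMap<>();
    for (R t : tuples) {
      // Each tuple adds 1 to the running count of its key.
      counts.merge(keyExtractor.apply(t), 1L, Long::sum);
    }
    return counts;
  }
}
```

The Long values match the KVStreamlet<K,Long> return type of countByKey.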
-
countByKeyAndWindow
public <K> KVStreamlet<KeyedWindow<K>,Long> countByKeyAndWindow(SerializableFunction<R,K> keyExtractor, WindowConfig windowCfg)
Returns a new stream of (key, count) pairs by counting tuples over a window in this stream on each key.
- Specified by:
countByKeyAndWindow in interface Streamlet<R>
- Parameters:
keyExtractor - The function applied to a tuple of this streamlet to get the key
windowCfg - This is a specification of what kind of windowing strategy you would like to have. Typical windowing strategies are sliding windows and tumbling windows. Note that there could be 0 or multiple target stream ids
-
-