Streamz: Python pipelines to manage continuous streams of data

Continuous data streams arise in many applications like the following:

  1. Log processing from web servers
  2. Scientific instrument data like telemetry or image processing pipelines
  3. Financial time series
  4. Machine learning pipelines for real-time and on-line learning

Sometimes these pipelines are very simple, with a linear sequence of processing steps.

And sometimes these pipelines are more complex, involving branching, look-back periods, feedback into earlier stages, and more.

Streamz aims to be simple in simple cases while remaining powerful enough to let you define custom pipelines for your application.

Why not Python generator expressions?

Python users often manage continuous sequences of data with iterators or generator expressions.

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# f stands for any per-element processing function
sequence = (f(n) for n in fib())
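
To make the generator approach concrete, here is a minimal runnable sketch. The function f is a hypothetical processing step (squaring) chosen only for illustration, and itertools.islice is used to take a finite slice of the otherwise infinite sequence.

```python
from itertools import islice


def fib():
    """Yield the Fibonacci numbers forever."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def f(n):
    # Hypothetical processing step, assumed here only for illustration.
    return n * n


# Lazy pipeline: nothing is computed until values are requested.
sequence = (f(n) for n in fib())

# Pull the first six results out of the infinite stream.
first_six = list(islice(sequence, 6))
print(first_six)
```

The consumer drives the computation here: values flow only when something pulls from sequence.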

However, iterators become challenging when you want to fork them or control the flow of data. Typically people rely on tools like itertools.tee and zip.

import itertools

# x is any iterator; f and g are per-element processing functions
x1, x2 = itertools.tee(x, 2)
y1 = map(f, x1)
y2 = map(g, x2)

However, this quickly becomes cumbersome, especially when building complex pipelines.
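
As a rough illustration of the fork-and-recombine pattern, the sketch below splits one iterator into two branches with itertools.tee and joins them again with zip. The source x and the functions f and g are hypothetical stand-ins, not part of any library.

```python
import itertools


def f(n):
    # Hypothetical branch 1: increment each element.
    return n + 1


def g(n):
    # Hypothetical branch 2: scale each element.
    return n * 10


# Assumed source: any iterator would do; a small range keeps the output short.
x = iter(range(5))

# Fork the single iterator into two independent iterators.
# After this call, x itself should no longer be consumed directly.
x1, x2 = itertools.tee(x, 2)
y1 = map(f, x1)
y2 = map(g, x2)

# Recombine the branches; zip advances both in lockstep.
pairs = list(zip(y1, y2))
print(pairs)
```

Even in this small case the wiring has to be done by hand, and tee must buffer any items that one branch has consumed but the other has not, so a branch that runs far ahead can hold a lot of data in memory.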
