Shaoshan Liu is currently chairman and co-founder of PerceptIn, working on developing the next-generation robotics platform.
At PerceptIn, we build cutting-edge surveillance robots with object recognition capabilities enabled by our proprietary machine learning algorithms and powered by our state-of-the-art cloud architecture. Our robots patrol a user’s property intelligently and send an alert when any abnormality is detected.
In addition to on-demand video streaming and alerting, our robots can recognize objects and play back footage of any object the user searches for. For example, in a household with pets, the robot recognizes whenever the pet is in the frame, and the user can search the stored videos for every instance in which the pet was captured, allowing homeowners to see what their pet is up to while they are away.
The large volume of video streaming data, combined with customers' need for 24-hour control and remote access to the robot and its video data, drives the need for a performant and scalable cloud architecture. This blog post explores how PerceptIn designed and implemented a cloud architecture with Alluxio distributed storage software as the key enabling technology to support user requirements. More importantly, we hope to demonstrate how Alluxio delivers high throughput, low latency, and a unified namespace to support these emerging cloud architectures.
Dawei Sun is currently with Tsinghua University and PerceptIn, working on deep learning, cloud infrastructure, autonomous robots, and embedded systems.
Building a Cloud Architecture for Robotics
The rise of robotics applications demands new cloud architectures that deliver high throughput and low latency. The key challenges include:
- Customers demand real-time responses: it is critical that users have access to an on-demand video feed and are notified of any abnormality in real time. The cloud solution must therefore deliver high throughput and low latency for writing and retrieving video feeds.
- Large-scale data is distributed across disparate storage systems: on-demand video streaming generates enormous amounts of data, and not all data is created equal. Depending on the age of the data and whether it is being used by our custom applications (such as our on-demand video streaming application and our object detection application), data is stored across disparate storage systems to optimize resource allocation efficiency.
These requirements imply that we need a storage engine that can handle an enormous amount of incoming data destined for different storage systems, while also providing high throughput and low latency for writing and retrieving video feeds.
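To make the placement idea concrete, here is a minimal sketch of how data age and application use could decide where a video segment lives. The tier names, the `VideoSegment` type, and the 24-hour and 30-day thresholds are hypothetical values chosen for illustration only; they are not PerceptIn's actual policy.

```python
from dataclasses import dataclass


@dataclass
class VideoSegment:
    age_hours: float      # time since the segment was recorded
    in_use_by_app: bool   # e.g. being streamed or scanned by object detection


def choose_tier(segment: VideoSegment,
                hot_window_hours: float = 24.0,          # hypothetical threshold
                warm_window_hours: float = 24.0 * 30) -> str:  # hypothetical threshold
    """Pick a storage tier from the segment's age and whether an application uses it."""
    # Data that an application is actively using, or that is very recent, stays hot.
    if segment.in_use_by_app or segment.age_hours <= hot_window_hours:
        return "hot"
    # Older, idle data moves to cheaper storage.
    if segment.age_hours <= warm_window_hours:
        return "warm"
    return "cold"
```

For example, a week-old segment that no application is reading would land in the "warm" tier under these assumed thresholds, while the same segment would stay "hot" if the object detection application were still scanning it.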
PerceptIn Powers the Robotics Cloud with Alluxio
To fulfill the above requirements, we have designed and implemented a cloud architecture:
PerceptIn Video Streaming Cloud Architecture
We use Alluxio as the storage layer, which provides high performance and unification across disparate storage systems, both on-premises and in the cloud. With Alluxio, our business analytics, object recognition, query engines, and key-value store can all interact at memory speed with data stored in Amazon Web Services Simple Storage Service (S3), Ceph, and the Hadoop Distributed File System (HDFS).
To understand how Alluxio helps in this scenario, let us first compare the write throughput of Alluxio with that of a local disk. This throughput is critical because it determines how fast we can write a video feed to storage. If the throughput is too low, the storage layer may become the bottleneck of the whole multimedia data pipeline. As shown in the figure below, Alluxio easily achieves more than 650 MB/s of write throughput, whereas the native file system achieves only 120 MB/s.
Alluxio thus delivers at least a fivefold improvement in this case, because writing to Alluxio's in-memory storage avoids the hard disk I/O bottleneck.
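A comparison like this can be approximated with a simple write probe. The sketch below is not the benchmark used for the numbers above; it only shows one way to time sequential writes to a directory and to compute the ratio between two measured rates. It assumes the storage under test is reachable as an ordinary directory (for Alluxio, that would mean a POSIX-style mount, which is an assumption here), and the function names are hypothetical.

```python
import os
import time


def measure_write_throughput(directory: str, total_mb: int = 64, chunk_mb: int = 4) -> float:
    """Write roughly total_mb of data into `directory` and return the rate in MB/s."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    n_chunks = total_mb // chunk_mb
    path = os.path.join(directory, "throughput_probe.bin")
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_chunks):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push buffered data to the device
    elapsed = time.perf_counter() - start
    os.remove(path)
    return (n_chunks * chunk_mb) / elapsed


def speedup(fast_mb_s: float, slow_mb_s: float) -> float:
    """Ratio between two throughput measurements."""
    return fast_mb_s / slow_mb_s
```

With the figures quoted above, `speedup(650, 120)` is about 5.4, which is consistent with the "at least fivefold" claim.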
We then compared video retrieval latency. With Alluxio, a video can be retrieved within 500 ms. When the video is stored on remote machines, however, the latency can be as high as 20 seconds. Using Alluxio to buffer “hot” video data can therefore reduce retrieval latency by as much as 40-fold, which is critical to the user experience.
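The idea of buffering hot data can be illustrated with a small least-recently-used cache placed in front of a slow remote fetch. This is a conceptual sketch only and does not reflect how Alluxio implements caching internally; `HotVideoCache`, `fetch_remote`, and `expected_latency_s` are hypothetical names, and the default latencies are simply the 500 ms and 20 s figures quoted above.

```python
from collections import OrderedDict
from typing import Callable


class HotVideoCache:
    """Keep the most recently requested videos in memory, evicting the least recent."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity: int):
        self._fetch_remote = fetch_remote  # slow path, e.g. a read from remote storage
        self._capacity = capacity          # maximum number of videos held in memory
        self._videos: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, video_id: str) -> bytes:
        if video_id in self._videos:
            self._videos.move_to_end(video_id)  # mark as most recently used
            self.hits += 1
            return self._videos[video_id]
        self.misses += 1
        data = self._fetch_remote(video_id)
        self._videos[video_id] = data
        if len(self._videos) > self._capacity:
            self._videos.popitem(last=False)    # evict the least recently used entry
        return data


def expected_latency_s(hit_ratio: float,
                       hit_latency_s: float = 0.5,
                       miss_latency_s: float = 20.0) -> float:
    """Average retrieval latency for a given fraction of requests served from memory."""
    return hit_ratio * hit_latency_s + (1.0 - hit_ratio) * miss_latency_s
```

The 40-fold figure corresponds to the best case in which every request is a hit (20 s / 0.5 s). At a lower hit ratio the average gain shrinks, which is why keeping the genuinely hot videos in memory matters.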
In addition, different users require different storage systems beneath Alluxio: some use Ceph, some use HDFS, and others use S3. Without Alluxio, we would have to manage multiple interfaces, one for each under storage system. With Alluxio’s unified namespace, we maintain a single interface while still enjoying the benefits of the different underlying storage systems.
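The benefit of a single interface can be sketched as a mount table that routes each path to whichever backend owns its prefix. The code below is a toy model of the concept, not Alluxio's API: `UnifiedNamespace`, `UnderStore`, and `InMemoryStore` are hypothetical, and the in-memory store merely stands in for systems such as Ceph, HDFS, or S3.

```python
from typing import Dict, Protocol, Tuple


class UnderStore(Protocol):
    """Minimal interface every backing store must offer (hypothetical)."""
    def read(self, key: str) -> bytes: ...
    def write(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a real backend, used only for illustration."""

    def __init__(self, name: str):
        self.name = name
        self._objects: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data


class UnifiedNamespace:
    """Route paths to backends by longest matching mount prefix."""

    def __init__(self) -> None:
        self._mounts: Dict[str, UnderStore] = {}

    def mount(self, prefix: str, store: UnderStore) -> None:
        self._mounts[prefix.rstrip("/") + "/"] = store

    def _resolve(self, path: str) -> Tuple[UnderStore, str]:
        # Check longer prefixes first so nested mounts take precedence.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self._mounts[prefix], path[len(prefix):]
        raise FileNotFoundError(f"no mount covers {path}")

    def read(self, path: str) -> bytes:
        store, key = self._resolve(path)
        return store.read(key)

    def write(self, path: str, data: bytes) -> None:
        store, key = self._resolve(path)
        store.write(key, data)
```

Application code then calls only `read` and `write` on the namespace; which backend holds a given path becomes a matter of mount configuration rather than application logic.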
The rise of robotics cloud architectures imposes many new requirements, which translate into requirements for the storage layer. The storage layer needs to support heterogeneous persistent storage systems behind a unified interface to ease development and management. It also needs to provide high throughput and low latency to enable faster time to insight. Alluxio fulfills these needs, and we therefore chose it as the default storage engine for our cloud infrastructure.