HStreaming Library offers a variety of configurable, ready-made job templates and data stream connectors for running real-time analytics on both HStreaming Cloud and HStreaming Enterprise. Sign in to explore the library, which includes templates for real-time social media analytics, web log processing, and network traffic analysis. Check back frequently, as we are continuously extending the library.
- Real-time Twitter analytics on public data feeds (spritzer)
- Real-time netflow packet inspection, classification, and analysis
- Real-time analytics on Apache Web Server log files
HStreaming currently comes with a number of default transport connectors:
- HTTP client
- HTTP server
- TCP client
- TCP server
- UDP unicast client
- UDP unicast server
- UDP multicast client
- HDFS
- Amazon S3
HStreaming currently supports the following streaming data formats on top of the transport connectors:
- Plain text
- JSON
- Netflow
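The layering of formats on top of transports can be pictured as two steps: the transport delivers raw bytes, and the format layer cuts those bytes into records and decodes each one. The following Python sketch illustrates that idea for newline-delimited JSON. It is not HStreaming code; the function name `iter_records` and the sample chunks are hypothetical and chosen only for illustration.

```python
import json


def iter_records(chunks, delimiter=b"\n"):
    """Yield complete, delimiter-terminated records from an iterable of byte chunks.

    Hypothetical helper: the chunks stand in for whatever a transport
    (TCP, HTTP, UDP, ...) hands over, in arbitrary sizes.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            record, sep, rest = buffer.partition(delimiter)
            if not sep:
                break  # no complete record yet; wait for more bytes
            buffer = rest
            if record.strip():
                yield record  # skip empty lines between records
    if buffer.strip():
        yield buffer  # trailing record without a final delimiter


# Sample transport output: the second record is split across two chunks.
chunks = [b'{"a": 1}\n{"b"', b': 2}\n', b'{"c": 3}']

# Format layer: decode each framed record as JSON.
records = [json.loads(raw) for raw in iter_records(chunks)]
```

Keeping framing separate from decoding is what would let the same delimiter logic serve plain text as well as JSON, with only the per-record decoding step changing.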
The HStreaming library provides a set of Pig and native Hadoop examples to help you get familiar with its stream processing capabilities. The following Pig script is the classic Hadoop word-count example, using HStreaming's JSON connector to read from a Twitter stream:
A = load 'http://user:pass@stream.twitter.com/1/statuses/sample.json' using HStreamJson('\n');
B = foreach A generate flatten(TOKENIZE((chararray)($0#'text'))) as word;
C = group B by word;
D = foreach C generate COUNT(B) as count, group as word;
store D into 'http://localhost//' using HStream();
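To make the dataflow of the script concrete, here is a minimal plain-Python sketch of the same steps (load JSON records, tokenize the `text` field, group by word, count) over a small in-memory sample. It uses neither Pig nor HStreaming; the function name `word_count` and the sample records are hypothetical, and splitting on whitespace only approximates Pig's `TOKENIZE`, which also splits on some punctuation.

```python
import json
from collections import Counter


def word_count(json_lines):
    """Count words in the 'text' field of newline-delimited JSON records."""
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        text = record.get("text")
        if not isinstance(text, str):
            continue  # skip records that carry no text field
        counts.update(text.split())  # tokenize, then group and count
    return counts


# Hypothetical sample standing in for the incoming stream.
sample = [
    '{"text": "hello stream hello"}',
    '{"delete": {"status": {"id": 1}}}',
    '{"text": "stream processing"}',
]

counts = word_count(sample)
for word, n in sorted(counts.items()):
    print(n, word)
```

Unlike this batch sketch, which counts a finite list once, a streaming job runs over a continuous, unbounded input.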