HStreaming Library

HStreaming Library offers a variety of configurable, ready-made job templates and data stream connectors for running real-time analytics on both HStreaming Cloud and HStreaming Enterprise. Sign in to explore the library, which includes real-time analytics on social media, real-time web log processing, and analytics of network traffic data. Check back frequently, as we are continuously extending the library.

Selection of Available Scripts
  • Real-time Twitter analytics on public data feeds (spritzer)
  • Real-time NetFlow packet inspection, classification, and analysis
  • Real-time analytics on Apache Web Server log files

Transport Connectors

HStreaming currently ships with the following default transport connectors:

  • HTTP client
  • HTTP server
  • TCP client
  • TCP server
  • UDP unicast client
  • UDP unicast server
  • UDP multicast client
  • HDFS
  • Amazon S3

Data Formats for Streaming

HStreaming currently supports the following streaming data formats on top of the transport connectors:

  • Plain text
  • JSON
  • NetFlow
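
To illustrate what "a data format on top of a transport connector" can mean in practice, here is a minimal sketch in plain Python. It is not HStreaming code: the function names `split_records`, `decode_plain_text`, and `decode_json` are hypothetical, and the newline delimiter is an assumption. The transport delivers bytes in arbitrarily sized chunks, and the format layer reassembles those bytes into records and decodes each one. A binary format such as NetFlow would need a length-based parser instead and is not covered here.

```python
import json
from typing import Iterable, Iterator


def split_records(chunks: Iterable[bytes], delimiter: bytes = b"\n") -> Iterator[bytes]:
    """Reassemble delimiter-terminated records from arbitrarily sized chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A single chunk may complete zero, one, or several records.
        while delimiter in buffer:
            record, buffer = buffer.split(delimiter, 1)
            if record:
                yield record
    if buffer:
        # Trailing record that was not followed by a delimiter.
        yield buffer


def decode_plain_text(record: bytes) -> str:
    """Plain-text format: each record is one UTF-8 line."""
    return record.decode("utf-8")


def decode_json(record: bytes):
    """JSON format: each record is one UTF-8 encoded JSON document."""
    return json.loads(record.decode("utf-8"))


# The same framing logic serves both formats; only the decoder changes.
json_chunks = [b'{"text": "hel', b'lo"}\n{"text"', b': "world"}\n']
json_records = [decode_json(r) for r in split_records(json_chunks)]

text_chunks = [b"GET /index", b".html\nGET /about\n"]
text_records = [decode_plain_text(r) for r in split_records(text_chunks)]
```

Keeping framing separate from decoding means the same record splitter could sit behind any byte-oriented transport, such as a TCP or HTTP connection.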

Stream processing made easy with Apache Pig for HStreaming

The HStreaming library provides a set of Pig and native Hadoop examples to familiarize you with its stream processing capabilities. The following is the classic Hadoop word count, written in Pig, using HStreaming's JSON connector to read a Twitter stream:

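-- Load the Twitter sample stream as JSON records
A = load 'http://user:pass@stream.twitter.com/1/statuses/sample.json' using HStreamJson('\n');
-- Split the text of each tweet into individual words
B = foreach A generate flatten(TOKENIZE((chararray) ($0#'text'))) as word;
-- Group identical words and count the occurrences in each group
C = group B by word;
D = foreach C generate COUNT(B) as count, group as word;
store D into 'http://localhost//' using HStream();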
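
For readers unfamiliar with Pig, the logic of the script can be sketched in plain Python. This is only an illustration of the tokenize, group, and count steps, not HStreaming code: the function name `word_count` and the sample records are made up, the input is assumed to be one JSON document per line, and whitespace splitting only approximates Pig's `TOKENIZE`. It also counts over a finite input, whereas a streaming job runs over an unbounded feed.

```python
import json
from collections import Counter
from typing import Iterable


def word_count(lines: Iterable[str]) -> Counter:
    """Count words in the 'text' field of newline-delimited JSON records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        text = record.get("text")
        if not isinstance(text, str):
            continue  # skip records without a text field
        # Tokenize on whitespace and add each word to its running count.
        counts.update(text.split())
    return counts


# Made-up sample input standing in for a stream of tweets.
sample = [
    '{"text": "streams of data"}',
    '{"text": "data in motion"}',
    '{"delete": {"status": {"id": 1}}}',
]
counts = word_count(sample)
```

The `Counter` plays the role of the `group ... by word` and `COUNT` steps, keeping one running total per distinct word.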