Storm and Spark Streaming




Monday, January 5, 2015
Here is a quick look at the differences and similarities between Apache Storm and Apache Spark Streaming.
 Storm:

  • Well-known for distributed real-time computation
  • Task Parallel
  • Workflows are expressed as directed acyclic graphs ("topologies")
  • Runs until shut down by the user
  • Does not natively run on top of Hadoop
  • Implemented in JVM-based languages (Clojure and Java)
    • Supports Scala
  • Processing model
    • Event-stream processing (one tuple at a time)
    • Micro-batching (through the Trident API)
  • Configurable delivery: at most once, at least once, or exactly once (the last through Trident)
  • Near real-time analytics
    • Well suited to data normalization and ETL
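
To make the delivery options above a bit more concrete, here is a minimal Python sketch. It is not Storm code: `deliver`, `dedupe`, and the outcome labels (`ok`, `lost`, `ack_lost`) are made-up names for illustration, and the flaky consumer is simulated with a fixed list of outcomes. The only point is that without retries events can be lost (at most once), with retries they can be duplicated (at least once), and de-duplicating on top of retries gives an exactly-once result.

```python
def deliver(events, outcomes, retry):
    """Send events to a simulated consumer (hypothetical helper, not a Storm API).

    outcomes lists what happens on each attempt, in order:
      "ok"       - event processed and acknowledged
      "lost"     - event never reached the consumer
      "ack_lost" - event processed, but the acknowledgement was lost
    retry=False models at-most-once, retry=True models at-least-once.
    """
    received = []
    attempts = iter(outcomes)
    for event in events:
        while True:
            outcome = next(attempts)
            if outcome != "lost":
                received.append(event)      # the consumer processed the event
            if outcome == "ok" or not retry:
                break                       # acknowledged, or we never retry
    return received


def dedupe(received):
    """Drop repeated events, keeping first occurrences in order.

    Assumes each event value is unique, so it can act as its own ID.
    """
    seen = set()
    unique = []
    for event in received:
        if event not in seen:
            seen.add(event)
            unique.append(event)
    return unique


events = ["a", "b", "c"]
at_most_once = deliver(events, ["ok", "lost", "ack_lost"], retry=False)
at_least_once = deliver(events, ["ok", "lost", "ok", "ack_lost", "ok"], retry=True)
exactly_once = dedupe(at_least_once)

print(at_most_once)   # ['a', 'c']            "b" was lost
print(at_least_once)  # ['a', 'b', 'c', 'c']  "c" was replayed
print(exactly_once)   # ['a', 'b', 'c']
```

The trade-off is the usual one: retries cost extra work and need some form of ID tracking downstream, which is why the guarantee is worth keeping configurable.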

Spark Streaming:
  • Data Parallel
  • Hadoop friendly
    • Does not need Hadoop for its operation
  • Runs until shut down by the user
  • Implemented in a JVM-based language (Scala)
    • Supports Scala and Java
  • Processing model
    • Micro-batching: the stream is cut into small time-based batches (DStreams) rather than processed event by event
  • Exactly once delivery
  • Near real-time analytics
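
Micro-batching, which both lists mention, can be sketched in a few lines of plain Python. This is not the Spark Streaming API: `micro_batches`, `count_words`, the sample events, and the one-second interval are all assumptions made for illustration. The sketch groups timestamped events into fixed-length intervals and then runs the same computation on each batch, which is the data-parallel flavor of the model.

```python
from collections import Counter, defaultdict


def micro_batches(events, batch_interval):
    """Group (timestamp, value) pairs into batches of batch_interval seconds.

    Hypothetical helper: intervals with no events are simply skipped here.
    """
    batches = defaultdict(list)
    for timestamp, value in events:
        batches[int(timestamp // batch_interval)].append(value)
    return [batches[key] for key in sorted(batches)]


def count_words(batch):
    """The per-batch computation: count occurrences of each value."""
    return Counter(batch)


# Made-up sample stream: (arrival time in seconds, word)
events = [(0.2, "storm"), (0.7, "spark"), (1.1, "spark"), (1.9, "spark"), (2.5, "storm")]

batches = micro_batches(events, batch_interval=1.0)
counts = [count_words(batch) for batch in batches]

print(batches)  # [['storm', 'spark'], ['spark', 'spark'], ['storm']]
```

An event-at-a-time system would instead call the processing function once per event as it arrives, which lowers latency but gives up the chance to amortize work over a whole batch.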
 

Copyright © 2015 • [Deprecated] visit: firooz.us/blog