Let's take a quick look at the differences and similarities between Apache Storm and Apache Spark Streaming.
Storm:
- Well-known for distributed real-time computation
- Task Parallel
- Workflows are expressed as directed acyclic graphs ("topologies"); see the sketch after this list
- Runs until shut down by the user
- Does not natively run on top of Hadoop
- Implemented in JVM-based languages
- Supports Scala
- Processing model: event-stream processing (one event at a time); micro-batching is available via the Trident API
- Configurable delivery semantics: at most once, at least once, and exactly once (with Trident)
- Near real-time analytics
- Perfect for data normalization and ETL
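
To make the "topology" idea concrete, here is a minimal sketch of a Storm topology that splits sentences into words, assuming the Storm 2.x Java API called from Scala. The class and component names (`SentenceSpout`, `SplitBolt`, `WordTopology`, `"word-topology"`) are hypothetical placeholders for illustration, not part of Storm itself.

```scala
import java.util.{Map => JMap}

import org.apache.storm.{Config, LocalCluster}
import org.apache.storm.spout.SpoutOutputCollector
import org.apache.storm.task.TopologyContext
import org.apache.storm.topology.{BasicOutputCollector, OutputFieldsDeclarer, TopologyBuilder}
import org.apache.storm.topology.base.{BaseBasicBolt, BaseRichSpout}
import org.apache.storm.tuple.{Fields, Tuple, Values}

import scala.util.Random

// Hypothetical spout that emits random sentences, for illustration only.
class SentenceSpout extends BaseRichSpout {
  private var collector: SpoutOutputCollector = _
  private val sentences = Array("the quick brown fox", "jumped over the lazy dog")

  override def open(conf: JMap[String, AnyRef], ctx: TopologyContext,
                    out: SpoutOutputCollector): Unit = collector = out

  override def nextTuple(): Unit = {
    Thread.sleep(100) // throttle the demo source
    collector.emit(new Values(sentences(Random.nextInt(sentences.length))))
  }

  override def declareOutputFields(d: OutputFieldsDeclarer): Unit =
    d.declare(new Fields("sentence"))
}

// Hypothetical bolt that splits each sentence into words, one tuple per word.
class SplitBolt extends BaseBasicBolt {
  override def execute(t: Tuple, out: BasicOutputCollector): Unit =
    t.getStringByField("sentence").split(" ").foreach(w => out.emit(new Values(w)))

  override def declareOutputFields(d: OutputFieldsDeclarer): Unit =
    d.declare(new Fields("word"))
}

object WordTopology {
  def main(args: Array[String]): Unit = {
    // Wire spout and bolt into a DAG (a "topology") and run it on an
    // in-process cluster; it keeps running until explicitly shut down.
    val builder = new TopologyBuilder()
    builder.setSpout("sentences", new SentenceSpout)
    builder.setBolt("split", new SplitBolt).shuffleGrouping("sentences")

    val cluster = new LocalCluster()
    cluster.submitTopology("word-topology", new Config(), builder.createTopology())
  }
}
```

Note how the spout pushes tuples downstream one at a time, which is the event-at-a-time model described above, and how the topology keeps running until it is killed.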
Spark Streaming:
- Data Parallel
- Hadoop-friendly
- Does not need Hadoop to run
- Runs until shut down by the user
- Implemented in JVM-based languages
- Supports Scala
- Processing model: micro-batching (the stream is divided into small batches, called DStreams) rather than true event-at-a-time stream processing; see the sketch after this list
- Exactly-once delivery semantics
- Near real-time analytics
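
For comparison, here is a minimal Spark Streaming sketch of the micro-batching model. The `local[2]` master, the socket source on `localhost:9999`, and the `SocketWordCount` object name are assumptions made purely for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SocketWordCount {
  def main(args: Array[String]): Unit = {
    // Micro-batching: the incoming stream is chopped into 1-second batches
    // (DStreams), and each batch is processed as a small Spark job.
    val conf = new SparkConf().setMaster("local[2]").setAppName("SocketWordCount")
    val ssc  = new StreamingContext(conf, Seconds(1))

    // Assumed source for the demo: a text stream on localhost:9999
    // (e.g. started with `nc -lk 9999`).
    val lines  = ssc.socketTextStream("localhost", 9999)
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()

    ssc.start()             // start receiving and processing data
    ssc.awaitTermination()  // runs until shut down by the user
  }
}
```

Because each batch is only processed once its interval closes, results arrive with at least the batch-interval latency, which is why Spark Streaming is described as near real-time rather than event-at-a-time.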