
Apache Spark Reviews

Vendor: Apache
4.2 out of 5
Apache Spark mindshare
As of November 2025, Apache Spark's mindshare in the Hadoop category stands at 17.1%, down from 18.1% a year earlier, according to calculations based on PeerSpot user engagement data.
Hadoop Market Share Distribution
Product | Market Share (%)
Apache Spark | 17.1%
Cloudera Distribution for Hadoop | 19.1%
HPE Data Fabric | 14.6%
Other | 49.2%
PeerResearch reports based on Apache Spark reviews
Type | Title | Date
Category | Hadoop | Nov 25, 2025
Product | Reviews, tips, and advice from real users | Nov 25, 2025
Comparison | Apache Spark vs Cloudera Distribution for Hadoop | Nov 25, 2025
Comparison | Apache Spark vs Amazon EMR | Nov 25, 2025
Comparison | Apache Spark vs HPE Data Fabric | Nov 25, 2025
Suggested products
Title | Rating | Mindshare | Recommending | Interviews
Spring Boot | 4.2 | N/A | 95% | 41
Jakarta EE | 3.7 | N/A | 66% | 3
Key learnings from peers
Last updated Nov 21, 2025
Valuable Features | Room for Improvement | ROI | Pricing | Popular Use Cases | Service and Support | Deployment | Scalability | Stability
Review data by company size
By reviewers
Company Size | Count
Small Business | 25
Midsize Enterprise | 13
Large Enterprise | 25

By visitors reading reviews
Company Size | Count
Small Business | 115
Midsize Enterprise | 32
Large Enterprise | 353
Top industries
By visitors reading reviews
Financial Services Firm | 26%
Computer Software Company | 11%
Manufacturing Company | 7%
Comms Service Provider | 6%
University | 5%
Government | 5%
Retailer | 5%
Insurance Company | 5%
Educational Organization | 4%
Healthcare Company | 4%
Construction Company | 2%
Real Estate/Law Firm | 2%
Outsourcing Company | 2%
Non Profit | 2%
Performing Arts | 2%
Media Company | 1%
Legal Firm | 1%
Recreational Facilities/Services Company | 1%
Hospitality Company | 1%
Pharma/Biotech Company | 1%
Consumer Goods Company | 1%
Transportation Company | 1%
Renewables & Environment Company | 1%
Energy/Utilities Company | 1%
 
Apache Spark Reviews Summary
Author info | Rating | Review Summary
Data Architect at Devtech | 4.5 | I’ve used Apache Spark for four years, mainly for data integration and access. Its in-memory processing and open-source flexibility suit my needs, despite some stability issues. I prefer it over commercial tools like Informatica due to cost and adaptability.
Data Engineer at a tech company with 10,001+ employees | 5.0 | I use Apache Spark for real-time data processing and transformation across multiple sources like CRM and Siebel. It's reliable, fast, and improves our decision-making, though I see future needs for better integration with emerging cloud solutions.
Senior Developer at Infosys | 3.5 | No summary available
Senior Software Architect at USEReady | 4.0 | No summary available
Sr Manager at a transportation company with 10,001+ employees | 4.5 | I use Apache Spark for real-time data processing and ETL tasks. It offers unparalleled features but faces limitations due to its in-memory implementation. Despite improvements in version 3.0, reducing costs and addressing memory issues would enhance it further.
Data Scientist at a financial services firm with 10,001+ employees | 4.5 | I primarily use Apache Spark for data processing tasks involving large datasets, appreciating its ease of use and portability. While it's efficient for both small and large datasets, the lack of support for geospatial data is a limitation.
Data engineer at Cocos pt | 4.5 | We use Apache Spark primarily for Spark SQL and occasionally Spark Streaming, processing data from sources like SAP and Azure Data Warehouse. Its in-memory processing significantly outperforms Hadoop, offering faster data handling and enhanced query optimization.
Head of Data at an energy/utilities company with 51-200 employees | 4.0 | Apache Spark reduced operational costs by 50%. Although it supports parallel processing, it needs improvements in scalability and user-friendliness. Working with datasets isn't as straightforward as with Pandas, though it's flexible and functional.