Hadoop is a data storage and processing system. It is scalable, fault-tolerant and distributed. Hadoop is able to store any kind of data in its native format and to perform a wide variety of analyses and transformations on that data. Hadoop stores terabytes, and even petabytes, of data inexpensively. It is robust and reliable and handles hardware and system failures automatically, without losing data or interrupting data analyses.
“In a nutshell, this is what Hadoop provides: a reliable, scalable platform for storage and
“Apache Hadoop is an open-source, reliable, scalable, cost-effective, flexible
distributed file system solution to store millions of large files meant for batch
processing (large streaming reads/writes).“