Difference Between Distributed Database and Hadoop
A distributed database is a database that consists of two or more files located on different computers or sites either on the same network or on an entirely different network. It represents multiple interconnected databases spread out across several sites connected by a network.
Hadoop is a collection of open-source software utilities that are used to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.
Distributed Database vs Hadoop:
|1. Distributed database is a database that runs and stores data across multiple computers, as opposed to doing everything on a single machine.
|1. Hadoop is a collection of open-source software that connects many computers to solve problems involving a large amount of data and computation.
|2. It stores structured data
|2. It stores structured, semi-structured and unstructured data.
|3. It has vertical scalability.
|3. It has horizontal scalability.
|4. It stores average amount of data
|4. It stores a large amount of data
|5. It's throughput is higher
|5. It's throughput is lower