Application introduction:
Alluxio is a highly fault-tolerant distributed file system that lets files be shared reliably, at memory speed, across cluster frameworks such as Spark and MapReduce. By leveraging lineage information and using memory aggressively, Alluxio is claimed to reach a throughput more than 300 times higher than that of HDFS. Alluxio caches files in memory, so that different jobs, queries, and frameworks can access the cached files at memory speed.
Application product features:
Alluxio sits between traditional big data storage systems and big data computing frameworks (such as Spark and Hadoop MapReduce).
In the big data stack, the lowest layer consists of distributed storage systems such as Amazon S3 and Apache HDFS, while the layer above it consists of distributed computing frameworks and applications such as Spark, MapReduce, HBase, and Flink.
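To make the layering concrete, the following is a minimal sketch of a memory layer placed between a computing job and a slower persistent store. It is a toy model, not Alluxio code: the class names UnderStorage and MemoryCacheLayer and the path used here are hypothetical, and the only behavior shown is that a file is fetched from the store once and then served from memory.

```python
class UnderStorage:
    """Hypothetical stand-in for a persistent store such as HDFS or S3."""

    def __init__(self, files):
        self.files = dict(files)
        self.reads = 0  # counts how often the slow store is actually hit

    def read(self, path):
        self.reads += 1
        return self.files[path]


class MemoryCacheLayer:
    """Hypothetical in-memory layer that sits between jobs and the store."""

    def __init__(self, under_storage):
        self.under_storage = under_storage
        self.cache = {}  # path -> file contents held in memory

    def read(self, path):
        # On a miss, load the file from the under storage and keep it in memory.
        if path not in self.cache:
            self.cache[path] = self.under_storage.read(path)
        return self.cache[path]


store = UnderStorage({"/logs/day1": b"rows"})
layer = MemoryCacheLayer(store)

first = layer.read("/logs/day1")   # served by the under storage, then cached
second = layer.read("/logs/day1")  # served from memory
```

In this sketch two different jobs calling layer.read on the same path would share the cached copy, which is the effect the description above attributes to Alluxio.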
About Alluxio:
Like other big data frameworks such as HDFS, HBase, and Spark, Alluxio uses a master-slave architecture. Its master node, the Master, manages the global file system metadata, such as the file system tree. Its slave nodes, the Workers, each manage the data storage service of their own node. Alluxio also has a component called the Client, which gives users a unified file access interface. When an application needs to access Alluxio, it first communicates with the Master through the Client to obtain the metadata of the file, and then communicates with the corresponding Worker to perform the actual file access operations. All Workers periodically send heartbeats to the Master to keep the file system metadata up to date and to make sure the Master knows they are alive and serving normally in the cluster. The Master never actively initiates communication with other components; it communicates with them only by responding to their requests. This is consistent with the design of distributed systems such as HDFS and HBase.
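The interaction described above can be sketched as a small in-process simulation. This is an illustrative model under stated assumptions, not Alluxio's real API: the classes Master, Worker, and Client, their method names, the 10-second timeout, and the explicit "now" timestamps are all hypothetical. It shows only the pattern from the text: Workers push heartbeats, the Master only answers requests, and the Client asks the Master for metadata before reading from a Worker.

```python
class Master:
    """Holds global metadata; it only responds to requests, never initiates them."""

    def __init__(self):
        self.file_locations = {}  # path -> id of the worker storing it
        self.last_heartbeat = {}  # worker id -> time of last heartbeat

    def handle_heartbeat(self, worker_id, stored_paths, now):
        # A heartbeat both proves liveness and refreshes file metadata.
        self.last_heartbeat[worker_id] = now
        for path in stored_paths:
            self.file_locations[path] = worker_id

    def get_file_location(self, path):
        return self.file_locations[path]

    def live_workers(self, now, timeout=10.0):
        # A worker counts as alive if it reported within the timeout.
        return sorted(w for w, t in self.last_heartbeat.items()
                      if now - t <= timeout)


class Worker:
    """Manages the data stored on its own node."""

    def __init__(self, worker_id, master):
        self.worker_id = worker_id
        self.master = master
        self.blocks = {}  # path -> data held by this worker

    def store(self, path, data):
        self.blocks[path] = data

    def read(self, path):
        return self.blocks[path]

    def send_heartbeat(self, now):
        self.master.handle_heartbeat(self.worker_id, list(self.blocks), now)


class Client:
    """Unified access point: metadata from the Master, data from a Worker."""

    def __init__(self, master, workers):
        self.master = master
        self.workers = workers  # worker id -> Worker

    def read_file(self, path):
        worker_id = self.master.get_file_location(path)  # step 1: metadata
        return self.workers[worker_id].read(path)        # step 2: data


master = Master()
w1 = Worker("worker-1", master)
w2 = Worker("worker-2", master)
w1.store("/data/a.txt", b"hello")
w1.send_heartbeat(now=0.0)
w2.send_heartbeat(now=0.0)

client = Client(master, {"worker-1": w1, "worker-2": w2})
data = client.read_file("/data/a.txt")
```

Passing the time in as a parameter instead of reading a clock keeps the sketch deterministic; a real system would run the heartbeat on a timer and over the network.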