Thursday, September 6, 2018

What are Edge Nodes or Gateway Nodes in Hadoop?

Edge nodes are the interface between the Hadoop cluster and the outside network. Hadoop users connect to an edge node to store files in the cluster. Because it acts as a gateway to the cluster, an edge node is sometimes called a gateway node.
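
As a rough illustration of storing a file from an edge node, the sketch below wraps the standard `hdfs dfs -put` command in Python. It assumes the Hadoop client is installed and configured on the edge node. The function names and the paths in the usage note are hypothetical.

```python
import subprocess
from typing import List


def build_put_command(local_path: str, hdfs_dir: str) -> List[str]:
    """Return the argument list that copies a local file into an HDFS directory."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_dir]


def upload(local_path: str, hdfs_dir: str) -> None:
    """Run the copy. Assumes the `hdfs` client is on this machine's PATH."""
    # check=True raises CalledProcessError if the hdfs command exits non-zero.
    subprocess.run(build_put_command(local_path, hdfs_dir), check=True)
```

For example, `upload("/tmp/sales.csv", "/user/alice/landing")` would be run on the edge node itself, so the cluster nodes never need to accept direct connections from the user's machine.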

Commonly, edge nodes are used to run cluster administration tools and client applications. Edge nodes are kept separate from the cluster nodes that run HDFS, MapReduce, and other components, mainly to keep the cluster's computing resources isolated from the outside world.

Edge nodes running within the cluster allow centralized management of the Hadoop configuration for all cluster nodes, which reduces the effort cluster administrators need to spend updating config files.
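
To make the idea of managed configuration concrete, here is a minimal sketch that reads a Hadoop-style configuration file (for example `core-site.xml`) and returns its properties, which could help confirm what an edge node's client is pointed at. The function name and the sample hostname are assumptions for illustration. Only the `<configuration>/<property>/<name>/<value>` layout and the `fs.defaultFS` property come from Hadoop itself.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def parse_hadoop_config(xml_text: str) -> Dict[str, str]:
    """Parse Hadoop configuration XML into a {name: value} dictionary."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # A missing <value> element is stored as an empty string.
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample; the hostname and port are placeholders.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""
```

Calling `parse_hadoop_config(SAMPLE_CORE_SITE)["fs.defaultFS"]` returns the file system URI the client would use. Comparing that value across machines is one simple way to spot a config file that has drifted.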

Hadoop itself provides only limited security, even if your Hadoop cluster operates in a LAN or WAN behind a security firewall. You may want to consider a cluster-specific firewall to better protect the cluster's non-public data.