Wednesday, 24 April 2019

Configuration files for Hadoop ecosystem components


    • Hive:
      • Hive usually inherits its HDFS and YARN configuration from the Hadoop configuration files discussed earlier (core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml). Properties specific to Hive, however, are maintained in the hive-site.xml configuration file, typically located in /etc/hive/conf.
      • The most significant settings in hive-site.xml are the connection details for the shared Hive metastore.
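As an illustration of the format, here is a minimal Python sketch that reads a hive-site.xml-style document and picks out the metastore connection settings. The embedded XML, host names, and port numbers are placeholders made up for this example; hive.metastore.uris and javax.jdo.option.ConnectionURL are standard Hive property names, but which ones appear in a given cluster depends on how the metastore is deployed.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml content; hosts and ports are placeholders.
HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://db.example.com:3306/metastore</value>
  </property>
</configuration>
"""


def read_hadoop_xml(text):
    """Return the name/value pairs of a Hadoop-style XML configuration as a dict."""
    root = ET.fromstring(text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


hive_conf = read_hadoop_xml(HIVE_SITE)

# Keep only the settings that describe how to reach the metastore.
metastore_settings = {
    name: value
    for name, value in hive_conf.items()
    if name.startswith(("hive.metastore.", "javax.jdo."))
}

for name, value in metastore_settings.items():
    print(f"{name} = {value}")
```

To inspect a real file, the same function could be fed the contents of /etc/hive/conf/hive-site.xml instead of the embedded string.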

    • Pig:
      • As with Hive, most HDFS- and YARN-specific configuration details are sourced from the Hadoop configuration on the host, which is typically a client node with the Pig and Hadoop client libraries installed. Additional Pig-specific properties are kept in the pig.properties file, which is in /etc/pig/conf on most distributions.
      • Note that the pig.properties file, unlike many of the other Hadoop configuration files, is not an XML document.
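To show how pig.properties differs from the XML files, here is a rough Python sketch that parses key=value lines of the kind found in a Java properties file. The sample content and its values are invented for illustration, and the parser is deliberately simplified: it ignores line continuations, escape sequences, and the ':' separator that the full properties format also allows.

```python
# Hypothetical pig.properties content; the values are placeholders.
PIG_PROPERTIES = """\
# Lines starting with '#' are comments
exectype=mapreduce
pig.logfile=/tmp/pig-err.log
pig.cachedbag.memusage=0.2
"""


def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


pig_conf = read_properties(PIG_PROPERTIES)
for key, value in pig_conf.items():
    print(f"{key} -> {value}")
```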

    • Spark:
      • Spark configuration properties are set through the spark-defaults.conf file located in $SPARK_HOME/conf. This configuration file is read by Spark applications and daemons upon startup.
      • The spark-defaults.conf file, like pig.properties, is not in the standard Hadoop XML configuration format.
      • Spark configuration properties can also be set programmatically in your driver code using the SparkConf object.
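The following Python sketch shows roughly what a spark-defaults.conf file looks like and how its entries map to key/value pairs. The sample settings are placeholders, and the parser assumes the common layout of a key followed by whitespace and a value; it is not a full reimplementation of how Spark itself loads the file.

```python
# Hypothetical spark-defaults.conf content; the values are placeholders.
SPARK_DEFAULTS = """\
# Default properties applied at application startup
spark.master            yarn
spark.executor.memory   4g
spark.eventLog.enabled  true
"""


def read_spark_defaults(text):
    """Parse 'key<whitespace>value' lines, skipping blank lines and comments."""
    conf = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first run of whitespace: the key, then the rest as the value.
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1] if len(parts) > 1 else ""
    return conf


spark_conf = read_spark_defaults(SPARK_DEFAULTS)
for key, value in spark_conf.items():
    print(f"{key} = {value}")

# With PySpark installed, the same pairs could be applied in driver code, e.g.:
#   from pyspark import SparkConf
#   conf = SparkConf().setAll(list(spark_conf.items()))
```

Properties set directly on a SparkConf object take precedence over the values in spark-defaults.conf, so the file is best treated as a set of defaults.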

    • HBase:
      • HBase configuration is typically stored in /etc/hbase/conf, and the primary configuration file is hbase-site.xml. It governs the behavior of HBase and is read by both the HMaster and the RegionServers.
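Since hbase-site.xml follows the same Hadoop XML layout, a small Python sketch can show how a set of properties is rendered into that format. The property names below are standard HBase settings, but the host names, port, and values are placeholders chosen for this example, and the helper to_hadoop_xml is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def to_hadoop_xml(props):
    """Render a dict as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical HBase settings; hosts and paths are placeholders.
hbase_props = {
    "hbase.rootdir": "hdfs://namenode.example.com:8020/hbase",
    "hbase.zookeeper.quorum": "zk1.example.com,zk2.example.com,zk3.example.com",
    "hbase.cluster.distributed": "true",
}

hbase_site_xml = to_hadoop_xml(hbase_props)
print(hbase_site_xml)
```

Because the HMaster and every RegionServer read the same file, generating it from a single source like this is one way to keep the copies on each host consistent.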

