Winutils.exe was apparently created as a stopgap measure to allow Hadoop to 'work' on Windows platforms, because the NativeIO libraries are not implemented there.
**Table of Contents** [TOCM]

# Prerequisites for Hadoop Suite (HDFS, Yarn, MapReduce)

- **Java**: Install the Java SE Development Kit (x64), set `C:\PROGRA~1\Java\jdk1.8.0_131` as the `JAVA_HOME` environment variable, and append `%JAVA_HOME%\bin` to `PATH`. You should be able to see Java on the console by executing `java -version`. Output for my system is as shown:

  ```
  C:\>java -version
  java version "1.8.0_131"
  Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
  Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
  ```

- **Cygwin**: Install using `setup-x86_64.exe` into a folder like `C:\APPs\Cygwin`, set the `CYGWIN_PATH` environment variable to `C:\APPs\Cygwin`, and append `%CYGWIN_PATH%\bin` to `PATH`.
  You should be able to run Cygwin (_Linux-like_) commands from the command prompt, e.g. `bzip2.exe --help`.

- **Maven**: Get the Apache binary and extract `apache-maven-3.5.0-bin.tar.gz` to `C:\APPs\ApacheSuite\Maven`. Set up two environment variables, `M2_HOME` and `MAVEN_HOME`, with the value `C:\APPs\ApacheSuite\Maven`, and append `%M2_HOME%\bin` to `PATH`. You should be able to execute `mvn --version` from the command prompt.
  Its output is shown below:

  ```
  C:\>mvn --version
  Apache Maven 3.5.0 (ff8f5eaf65f6095cf426; 2017-04-04T01:09:06+05:30)
  Maven home: C:\APPs\ApacheSuite\Maven\bin
  Java version: 1.8.0_131, vendor: Oracle Corporation
  Java home: C:\PROGRA~1\Java\jdk1.8.0_131\jre
  Default locale: en_US, platform encoding: Cp1252
  OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
  ```

# Installation & Using Hadoop Suite (HDFS, Yarn, MapReduce)

Get the Hadoop Releases (2.8.0) binary and extract it to the folder `C:\APPs\ApacheSuite\Hadoop`.
Set the environment variable `HADOOP_HOME` to the binary folder `C:\APPs\ApacheSuite\Hadoop` and append `%HADOOP_HOME%\bin` to `PATH`.

## Additional Environment Variables

- Add `%HADOOP_HOME%\sbin` (i.e. `C:\APPs\ApacheSuite\Hadoop\sbin`) to the `PATH` environment variable. This will enlist all commands like `start-all.cmd`, `stop-all.cmd`, `start-dfs.cmd`, etc.
- Also add two environment variables named `HADOOP_CONF_DIR` and `YARN_CONF_DIR` with the value `%HADOOP_HOME%\etc\hadoop` for accessibility to the Hadoop configuration files.
- Add `%HADOOP_HOME%\etc\hadoop` to the `PATH` environment variable for accessibility to the `hadoop-env.cmd` file.

**_Now carry out the following configurations and actions:_**

## HDFS Configuration

_**HDFS is a distributed file system that provides high-throughput access to application data.**_

In the `etc/hadoop/core-site.xml` file, enter the HDFS configuration and set it to listen on `localhost:9000`. The configuration is as follows:

```
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://0.0.0.0:9000</value>
  </property>
</configuration>
```

## Map-Reduce & Yarn Configuration

_**Yarn is a framework for job scheduling and cluster resource management. Map-Reduce is a yarn-based system for parallel processing of large data sets.**_

Copy `etc/hadoop/mapred-site.xml.template` to `etc/hadoop/mapred-site.xml` and enter the following configuration:

```
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

Add the following configuration to the file `etc/hadoop/yarn-site.xml`:

```
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```

## Namenode & DataNode Configuration

Add the following configuration to the `etc/hadoop/hdfs-site.xml` file.
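For orientation, a minimal single-node `hdfs-site.xml` typically sets the replication factor to 1 and points the NameNode and DataNode at local storage directories. The sketch below is illustrative only; the directory paths are assumptions following this guide's folder layout, not prescribed values:

```
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- illustrative path, adjust to your layout -->
    <name>dfs.namenode.name.dir</name>
    <value>file:///C:/APPs/ApacheSuite/Hadoop/data/namenode</value>
  </property>
  <property>
    <!-- illustrative path, adjust to your layout -->
    <name>dfs.datanode.data.dir</name>
    <value>file:///C:/APPs/ApacheSuite/Hadoop/data/datanode</value>
  </property>
</configuration>
```

On Windows, `file:///`-style URIs with forward slashes are generally the safest way to express local paths in these values.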