
Atittit: A summary of using the HDFS (Hadoop) big data file system from Java

 

Contents

1. Operate on the file system
2. HDFS, a remote distributed file service similar to NFS or FTP
3. Starting the HDFS service: start-dfs.cmd
3.1. Configure core-site.xml
3.2. Start
3.3. Code
4. Summary of problems
4.1. Starting the HDFS service: Windows reports it cannot find hadoop
4.2. D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format
4.3. Error: permission denied for file://
4.4. java.io.IOException: NameNode is not formatted
4.5. org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory C:\tmp\hadoop-Administrator\dfs\name is in an inconsistent state: storage directory does not exist or is not accessible.
4.6. Unsafe link
4.7. Unknown host / java.net.ConnectException: Connection refused: no further information
5. Theory
5.1. Creating a directory
5.2. Writing a file
6. References

 

 

Several kinds of operations on a file system

 

  1. Operate on the file system
    1. Directory operations: create, delete, update, query
    2. I/O operations on remote files
    3. File upload and download (copying files between local and remote; see the sketch below)
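A minimal upload/download sketch (not from the original post; it assumes a FileSystem object fs obtained as in section 3.3, and the local and remote paths are made-up examples):

// Upload: copy a local file into the target file system.
fs.copyFromLocalFile(new Path("D:/local/data.txt"), new Path("/remote/data.txt"));
// Download: copy a remote file back to the local disk.
fs.copyToLocalFile(new Path("/remote/data.txt"), new Path("D:/local/copy.txt"));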

 

Specific operations

  1. Obtain the HDFS FileSystem object from configuration (there are three ways in all; two are covered here)
    1. Method 1: read the local configuration files directly
      Usually used when Hadoop is installed locally and can be accessed directly; you only need to specify in the configuration that the file system to operate on is HDFS. Any HDFS parameter can also be set on the conf object in code, and values set there take priority over the configuration files.
    2. Method 2: specify a URI and create the FileSystem from it
      Usually used when there is no local Hadoop installation but the cluster can be reached through a URI. You must supply the access URI of the Hadoop NameNode, the Hadoop user name, and a Configuration object (the remote Hadoop configuration is then read automatically); a sketch follows below.
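A minimal sketch of method 2, not from the original post: the NameNode address hdfs://huabingood01:9000 is taken from the core-site.xml shown below, and the user name "hadoop" is an assumption; substitute your own.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class RemoteFsSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect by explicit URI and user; the remote cluster's
        // configuration is used for everything not set locally.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://huabingood01:9000"), conf, "hadoop");
        System.out.println(fs.getUri()); // verify which file system we got
        fs.close();
    }
}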
  2. HDFS, a remote distributed file service similar to NFS or FTP
  3. Starting the HDFS service: start-dfs.cmd
    3.1. Configure core-site.xml

 

D:\haddop\hadoop-3.1.1\etc\hadoop>core-site.xml

<configuration>
 <property>
  <name>fs.default.name</name>
  <value>hdfs://huabingood01:9000</value>
 </property>
</configuration>

(fs.default.name is the deprecated older name of fs.defaultFS; both set the default file system URI.)

 

    3.2. Start
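Start the HDFS daemons (NameNode and DataNode); the same command appears again in the problem notes in section 4:

%HADOOP_PREFIX%\sbin\start-dfs.cmd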

 

    3.3. Code

package hdfsHadoopUse;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class hdfsHadoopClass {

    public static void main(String[] args) throws IOException {
        String pathToCreate = "/firstDirS09/secdirS09";
        hdfsHadoopClass hdfsHadoopClass = new hdfsHadoopClass();
        FileSystem fs = hdfsHadoopClass.getHadoopFileSystem();
        hdfsHadoopClass.myCreatePath(fs, pathToCreate);
        System.out.println("--f");
    }

    /**
     * Obtain an HDFS FileSystem object from configuration.
     * There are two methods:
     *  1. Use conf to read the local configuration files directly and create the FileSystem.
     *  2. Mostly for when there is no local Hadoop installation but remote access is possible:
     *     use a given URI and user name to read the remote configuration, then create the FileSystem.
     * @return FileSystem
     * @throws IOException
     */
    public FileSystem getHadoopFileSystem() throws IOException {
        FileSystem fs = null;
        Configuration conf = null;

        // Method 1: the configuration files (core-site.xml, hdfs-site.xml) are available locally.
        // The HDFS access URI must be specified here.
        conf = new Configuration();
        // The default file system must be set; any other parameter may also be set,
        // and values set here take the highest priority.
        conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");
        conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

        // Create the FileSystem object from the configuration.
        fs = FileSystem.get(conf);
        return fs;
    }

    /**
     * Like mkdir -p in the shell, this creates missing parent directories as needed.
     * As with Java I/O, operations are performed on Path objects; but here Path is
     * the HDFS class org.apache.hadoop.fs.Path.
     * @param fs
     * @return true if the directory was created
     * @throws IOException
     */
    public boolean myCreatePath(FileSystem fs, String pathToCreate) throws IOException {
        boolean b = false;
        // String pathToCreate = "/hyw/test/huabingood/hyw";
        Path path = new Path(pathToCreate);
        try {
            // Even if the path already exists, mkdirs still succeeds.
            b = fs.mkdirs(path);
        } finally {
            fs.close();
        }
        return b;
    }
}
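A quick verification sketch, not in the original post. Note that myCreatePath closes the FileSystem in its finally block, so a fresh one has to be opened to list the result:

// Reopen the file system and list the directory that was just created.
FileSystem fs2 = new hdfsHadoopClass().getHadoopFileSystem();
for (org.apache.hadoop.fs.FileStatus st : fs2.listStatus(new Path("/firstDirS09"))) {
    System.out.println(st.getPath()); // expect .../firstDirS09/secdirS09
}
fs2.close();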

 

  4. Summary of problems

 

Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: huabingood01

The HDFS service needs to be started:

%HADOOP_PREFIX%\sbin\start-dfs.cmd

    4.1. Starting the HDFS service: Windows reports it cannot find hadoop

start-dfs.cmd

start "Apache Hadoop Distribution" hadoop namenode

start "Apache Hadoop Distribution" hadoop datanode

The NameNode needs to be created (formatted) first:

 D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

The "hadoop" here should refer to the bin\hadoop.cmd command; add D:\haddop\hadoop-3.1.1\bin to the PATH environment variable so Windows can find it.

    4.2. D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

D:\haddop\hadoop-3.1.1\sbin> D:\haddop\hadoop-3.1.1\bin\hdfs namenode -format

2018-10-28 07:02:54,801 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = hmNotePC/192.168.1.101

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 3.1.1

STARTUP_MSG:   classpath = D:\haddop\hadoop-3.1.1\etc\hadoop;D:\haddop\hadoop-3.1.1\share\had

STARTUP_MSG:   build = https://github.com/apache/hadoop -r 2b9a8c1d3a2caf1e733d57f346af3ff0d5ba529c; compiled by 'leftnoteasy' on 2018-08-02T04:26Z

STARTUP_MSG:   java = 1.8.0_31

************************************************************/

2018-10-28 07:02:54,854 INFO namenode.NameNode: createNameNode [-format]

Formatting using clusterid: CID-ecf4351a-e57c-411b-8ef3-2198981bc44b

2018-10-28 07:02:56,060 INFO namenode.FSEditLog: Edit logging is async:true

2018-10-28 07:02:56,090 INFO namenode.FSNamesystem: KeyProvider: null

2018-10-28 07:02:56,092 INFO namenode.FSNamesystem: fsLock is fair: true

2018-10-28 07:02:56,100 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false

2018-10-28 07:02:56,119 INFO namenode.FSNamesystem: fsOwner             = Administrator (auth:SIMPLE)

2018-10-28 07:02:56,120 INFO namenode.FSNamesystem: supergroup          = supergroup

2018-10-28 07:02:56,120 INFO namenode.FSNamesystem: isPermissionEnabled = true

2018-10-28 07:02:56,121 INFO namenode.FSNamesystem: HA Enabled: false

2018-10-28 07:02:56,203 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling

2018-10-28 07:02:56,229 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000

2018-10-28 07:02:56,229 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true

2018-10-28 07:02:56,238 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000

2018-10-28 07:02:56,239 INFO blockmanagement.BlockManager: The block deletion will start around 2018 十月 28 07:02:56

2018-10-28 07:02:56,243 INFO util.GSet: Computing capacity for map BlocksMap

2018-10-28 07:02:56,243 INFO util.GSet: VM type       = 64-bit

2018-10-28 07:02:56,248 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB

2018-10-28 07:02:56,251 INFO util.GSet: capacity      = 2^21 = 2097152 entries

2018-10-28 07:02:56,269 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false

2018-10-28 07:02:56,362 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS

2018-10-28 07:02:56,362 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033

2018-10-28 07:02:56,363 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0

2018-10-28 07:02:56,364 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000

2018-10-28 07:02:56,364 INFO blockmanagement.BlockManager: defaultReplication         = 3

2018-10-28 07:02:56,365 INFO blockmanagement.BlockManager: maxReplication             = 512

2018-10-28 07:02:56,366 INFO blockmanagement.BlockManager: minReplication             = 1

2018-10-28 07:02:56,366 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2

2018-10-28 07:02:56,367 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms

2018-10-28 07:02:56,368 INFO blockmanagement.BlockManager: encryptDataTransfer        = false

2018-10-28 07:02:56,368 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000

2018-10-28 07:02:56,425 INFO util.GSet: Computing capacity for map INodeMap

2018-10-28 07:02:56,426 INFO util.GSet: VM type       = 64-bit

2018-10-28 07:02:56,426 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB

2018-10-28 07:02:56,427 INFO util.GSet: capacity      = 2^20 = 1048576 entries

2018-10-28 07:02:56,428 INFO namenode.FSDirectory: ACLs enabled? false

2018-10-28 07:02:56,429 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true

2018-10-28 07:02:56,429 INFO namenode.FSDirectory: XAttrs enabled? true

2018-10-28 07:02:56,430 INFO namenode.NameNode: Caching file names occurring more than 10 times

2018-10-28 07:02:56,440 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRo

 

2018-10-28 07:02:56,444 INFO snapshot.SnapshotManager: SkipList is disabled

2018-10-28 07:02:56,452 INFO util.GSet: Computing capacity for map cachedBlocks

2018-10-28 07:02:56,452 INFO util.GSet: VM type       = 64-bit

2018-10-28 07:02:56,453 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB

2018-10-28 07:02:56,453 INFO util.GSet: capacity      = 2^18 = 262144 entries

2018-10-28 07:02:56,467 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10

2018-10-28 07:02:56,468 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10

2018-10-28 07:02:56,469 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25

2018-10-28 07:02:56,475 INFO namenode.FSNamesystem: Retry cache on namenode is enabled

2018-10-28 07:02:56,476 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis

2018-10-28 07:02:56,481 INFO util.GSet: Computing capacity for map NameNodeRetryCache

2018-10-28 07:02:56,482 INFO util.GSet: VM type       = 64-bit

2018-10-28 07:02:56,482 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB

2018-10-28 07:02:56,483 INFO util.GSet: capacity      = 2^15 = 32768 entries

2018-10-28 07:02:56,527 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1079199093-192.168.1.101-1540681376517

2018-10-28 07:02:56,547 INFO common.Storage: Storage directory \tmp\hadoop-Administrator\dfs\name has been successfully formatted.

2018-10-28 07:02:56,580 INFO namenode.FSImageFormatProtobuf: Saving image file \tmp\hadoop-Administrator\dfs\name\current\fsimage.ckpt_0000000000000000000 us

2018-10-28 07:02:56,717 INFO namenode.FSImageFormatProtobuf: Image file \tmp\hadoop-Administrator\dfs\name\current\fsimage.ckpt_0000000000000000000 of size 3

2018-10-28 07:02:56,741 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

2018-10-28 07:02:56,756 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at hmNotePC/192.168.1.101

************************************************************/

    4.3. Error: permission denied for file://

The fix is to set the default file system to an hdfs:// URI in core-site.xml:

D:\haddop\hadoop-3.1.1\etc\hadoop>core-site.xml

<configuration>
 <property>
  <name>fs.default.name</name>
  <value>hdfs://huabingood01:9000</value>
 </property>
</configuration>

 

    4.4. java.io.IOException: NameNode is not formatted

 

Comment: if accessing localhost:50070 fails, the NameNode failed to start; check the NameNode startup log. The fix is to format the NameNode first, as in section 4.2.

 

    4.5. org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory C:\tmp\hadoop-Administrator\dfs\name is in an inconsistent state: storage directory does not exist or is not accessible.

        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:376)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(

This is typically resolved by formatting the NameNode again (section 4.2), which recreates the storage directory.

 

    4.6. Unsafe link

Hadoop.dll and winutils.exe may need to be copied into the Windows system directory (or the Hadoop bin directory); a reboot may be needed.

    4.7. Unknown host / java.net.ConnectException: Connection refused: no further information

 

D:\haddop\hadoop-3.1.1\etc\hadoop>core-site.xml

<configuration>
 <property>
  <name>fs.default.name</name>
  <value>hdfs://huabingood01:9000</value>
 </property>
</configuration>

    conf = new Configuration();
    // The default file system must be set; other parameters are optional,
    // and values set here take the highest priority.
    conf.set("fs.defaultFS", "hdfs://0.0.0.0:19000");
    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

The URL in the config file and the URL in the code have to be the same; here they disagree (hdfs://huabingood01:9000 vs hdfs://0.0.0.0:19000), which is what produces the unknown-host and connection-refused errors.
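A minimal sketch of the fix, assuming the NameNode really listens at the address given in core-site.xml: use one and the same address in both places.

Configuration conf = new Configuration();
// Must match fs.default.name / fs.defaultFS in core-site.xml,
// i.e. the address the NameNode actually listens on.
conf.set("fs.defaultFS", "hdfs://huabingood01:9000");
FileSystem fs = FileSystem.get(conf);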

 

 

  5. Theory

 

C:\tmp\hadoop-Administrator>tree

Folder PATH listing for volume p1sys
Volume serial number is A87E-7AB4
C:.
├─dfs
│  ├─data
│  └─name
└─nm-local-dir
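For reference: dfs\name holds the NameNode metadata (the directory formatted in section 4.2), dfs\data holds the DataNode block storage, and nm-local-dir is the YARN NodeManager's local working directory.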

 

    5.1. Creating a directory

 

 String pathToCreate = "/firstDirS09/secdirS09";
 hdfsHadoopClass.myCreatePath(fs, pathToCreate);

The directory actually gets created on the local drive:

D:\firstDirS09\secdirS09

 

    5.2. Writing a file

 

// Write a file
FSDataOutputStream out = fs.create(new Path("/file1S09.txt"));
out.writeUTF("attilax bazai");
out.close();

 

 

D:\file1S09.txt

D:\.file1S09.txt.crc

(The .crc sidecar is written by Hadoop's checksummed local file system; together with the D:\ paths above, it shows the code was actually writing to the local file system, file:///, rather than to HDFS.)
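A matching read-back sketch, not in the original post, assuming the same fs object is still open:

// Read the file back; readUTF() pairs with the writeUTF() call above.
FSDataInputStream in = fs.open(new Path("/file1S09.txt"));
System.out.println(in.readUTF()); // prints "attilax bazai"
in.close();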

 

  6. References

使用javaAPI操作hdfs - huabingood - 博客园 (Using the Java API to operate HDFS, huabingood, cnblogs)

 

Original article: https://www.cnblogs.com/attilax/p/15197507.html