"Attack on Big Data" tutorial series: Hadoop big data basics

Reader submission 773 2025-04-03

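Preface

1. Accessing HDFS over HTTP

2. HDFS components and their roles

3. Data blocks in HDFS

4. Operating HDFS with the Java API

5. Things to note when developing HDFS applications in Java

6. The role of the DataNode heartbeat mechanism

7. The roles of the EditLog and FsImage in the NameNode

8. How the SecondaryNameNode lightens the NameNode's load

9. How can the NameNode be scaled out?

10. Commands for creating file snapshots (backups)

11. Balancing data

12. Safe mode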

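Preface

After more than a year spent busy with Java web business development, I have forgotten most of what I knew about big data, so I am writing this big data tutorial series to pick it back up.

Downloading, installing, starting, and stopping Hadoop and HDFS are not covered here; if you need help with those, see my earlier blog posts.

We assume a three-node master/slave Hadoop cluster is already set up: one master and two slaves.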

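1. Accessing HDFS over HTTP

Add the following configuration to hdfs-site.xml, then restart HDFS:

<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>

This makes HDFS accessible over HTTP.

For example, to query the file error.txt under /user/hadoop-twq/cmd (LISTSTATUS returns the file's status, OPEN returns its content):

http://master:50070/webhdfs/v1/user/hadoop-twq/cmd/error.txt?op=LISTSTATUS

http://master:50070/webhdfs/v1/user/hadoop-twq/cmd/error.txt?op=OPEN

The supported op values are listed at:

http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/WebHDFS.html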
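The URL pattern used above can also be assembled in code. The following is a minimal sketch that uses only the JDK; the class WebHdfsUrl and its methods are illustrative names, not part of Hadoop, and the host, port, and path are the ones from the examples above. It does not URL-encode special characters in the path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Illustrative helper (not a Hadoop class) for building WebHDFS REST URLs. */
final class WebHdfsUrl {

    /** Builds http://<host>:<port>/webhdfs/v1<absolutePath>?op=<op>. */
    static String build(String host, int port, String absolutePath, String op) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("HDFS path must be absolute: " + absolutePath);
        }
        return "http://" + host + ":" + port + "/webhdfs/v1" + absolutePath + "?op=" + op;
    }

    /** Sends a GET request and returns the response body as text; needs a reachable cluster. */
    static String get(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        String path = "/user/hadoop-twq/cmd/error.txt";
        // Only prints the URLs; call get(...) on them when the cluster is running
        System.out.println(build("master", 50070, path, "LISTSTATUS"));
        System.out.println(build("master", 50070, path, "OPEN"));
    }
}
```

Keeping URL construction separate from the network call makes it possible to check the URL without a running cluster.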

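2. HDFS components and their roles

3. Data blocks in HDFS

Default block size: 128 MB

To set the block size to 256 MB (256 × 1024 × 1024 = 268435456 bytes), add the following to ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml (in Hadoop 2.x and later, dfs.block.size is a deprecated alias of dfs.blocksize):

<property>
    <name>dfs.block.size</name>
    <value>268435456</value>
</property>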
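The arithmetic behind the block size value can be checked with a short sketch. The class BlockMath and the 600 MB file size are made up for illustration and are not part of the Hadoop API.

```java
/** Illustrative arithmetic only; not part of the Hadoop API. */
final class BlockMath {

    /** Converts megabytes (1 MB = 1024 * 1024 bytes) to bytes, the unit the setting expects. */
    static long megabytesToBytes(long megabytes) {
        return megabytes * 1024L * 1024L;
    }

    /** Number of blocks a file is split into: fileBytes / blockBytes, rounded up. */
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long blockBytes = megabytesToBytes(256);
        System.out.println(blockBytes);                                     // 268435456
        // A hypothetical 600 MB file needs two full blocks plus one partial block
        System.out.println(blockCount(megabytesToBytes(600), blockBytes));  // 3
    }
}
```

The last block of a file only takes up as much disk space as the data it actually holds, so a partial block does not waste a full 256 MB.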

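The default replication factor for data blocks is 3.

Setting the replication factor

To set the replication factor for a specific file:

hadoop fs -setrep 2 /user/hadoop-twq/cmd/big_file.txt

The cluster-wide default replication factor is set directly in hdfs-site.xml with the dfs.replication property.

Data blocks are stored as files on the local disks of the machines that run the DataNodes.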

4. Operating HDFS with the Java API

Writing data to a file with the Java API

package com.dzx.hadoopdemo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.net.URI;
import java.nio.charset.StandardCharsets;

/**
 * Writes a short string to a file on HDFS.
 *
 * @author duanzhaoxu
 * @date 2020-12-17 17:35:37
 */
public class HdfsWrite {
    public static void main(String[] args) throws Exception {
        String content = "this is a example";
        String dest = "hdfs://master:9999/user/hadoop-twq/cmd/java_writer.txt";
        Configuration configuration = new Configuration();
        FileSystem fileSystem = FileSystem.get(URI.create(dest), configuration);
        // create() overwrites an existing file by default;
        // try-with-resources closes the stream even if write() fails
        try (FSDataOutputStream out = fileSystem.create(new Path(dest))) {
            out.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }
}

Reading a file with the Java API

package com.dzx.hadoopdemo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/**
 * Reads a text file from HDFS and prints it line by line.
 *
 * @author duanzhaoxu
 * @date 2020-12-17 17:35:37
 */
public class HdfsRead {
    public static void main(String[] args) throws Exception {
        String dest = "hdfs://master:9999/user/hadoop-twq/cmd/java_writer.txt";
        Configuration configuration = new Configuration();
        FileSystem fileSystem = FileSystem.get(URI.create(dest), configuration);
        // try-with-resources closes the reader first, then the HDFS input stream
        try (FSDataInputStream in = fileSystem.open(new Path(dest));
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            // Assign inside the condition so the line that was read is the line that is printed
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

Getting file status information with the Java API

package com.dzx.hadoopdemo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

import java.io.IOException;
import java.net.URI;

/**
 * Prints file status information, then creates and deletes directories and files.
 *
 * @author duanzhaoxu
 * @date 2020-12-17 17:35:37
 */
public class HdfsFileStatus {
    public static void main(String[] args) throws Exception {
        String dest = "hdfs://master:9999/user/hadoop-twq/cmd/java_writer.txt";
        Configuration configuration = new Configuration();
        FileSystem fileSystem = FileSystem.get(URI.create("hdfs://master:9999/"), configuration);

        // Get the status of a single file
        printStatus(fileSystem.getFileStatus(new Path(dest)));

        // Get the status of every entry in a directory
        for (FileStatus status : fileSystem.listStatus(new Path("hdfs://master:9999/user/hadoop-twq/cmd"))) {
            printStatus(status);
        }

        // Create a directory
        fileSystem.mkdirs(new Path("hdfs://master:9999/user/hadoop-twq/cmd/java"));
        // Create a directory with the permission rwx--x---
        fileSystem.mkdirs(new Path("hdfs://master:9999/user/hadoop-twq/cmd/temp"),
                new FsPermission(FsAction.ALL, FsAction.EXECUTE, FsAction.NONE));
        // Delete a single file (non-recursive)
        fileSystem.delete(new Path("hdfs://master:9999/user/hadoop-twq/cmd/java/1.txt"), false);
        // Delete a directory and everything in it (recursive)
        fileSystem.delete(new Path("hdfs://master:9999/user/hadoop-twq/cmd/java"), true);
    }

    /** Prints the attributes of one file or directory. */
    private static void printStatus(FileStatus status) throws IOException {
        System.out.println(status.getPath());
        System.out.println(status.getAccessTime());
        System.out.println(status.getBlockSize());
        System.out.println(status.getGroup());
        System.out.println(status.getLen());
        System.out.println(status.getModificationTime());
        System.out.println(status.getOwner());
        System.out.println(status.getPermission());
        System.out.println(status.getReplication());
        // getSymlink() throws IOException for anything that is not a symbolic link
        if (status.isSymlink()) {
            System.out.println(status.getSymlink());
        }
    }
}
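The permission passed to mkdirs above, FsPermission(FsAction.ALL, FsAction.EXECUTE, FsAction.NONE), grants rwx to the owner, --x to the group, and nothing to others. As a rough illustration of how that symbolic form maps to the octal notation used by chmod, here is a small JDK-only sketch; PermissionBits is a made-up helper, and real code would rely on Hadoop's own FsPermission class.

```java
/** Illustrative only; Hadoop's FsPermission class is what real code should use. */
final class PermissionBits {

    /** Converts a 9-character symbolic mode such as "rwx--x---" to its numeric value. */
    static int toOctal(String symbolic) {
        if (symbolic.length() != 9) {
            throw new IllegalArgumentException("Expected 9 characters, got: " + symbolic);
        }
        int mode = 0;
        for (int i = 0; i < symbolic.length(); i++) {
            mode <<= 1;                      // make room for the next permission bit
            if (symbolic.charAt(i) != '-') { // any letter means the bit is set
                mode |= 1;
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        // owner = rwx (7), group = --x (1), other = --- (0)
        System.out.println(Integer.toOctalString(toOctal("rwx--x---"))); // 710
    }
}
```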

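5. Things to note when developing HDFS applications in Java

// Put core-site.xml in the resources directory (so that it is on the classpath);
// the HDFS host and port are then read from it automatically.
// Use an absolute path: a relative path is resolved against the user's HDFS working directory.
String dest = "/user/hadoop-twq/cmd/java_writer.txt";
Configuration configuration = new Configuration();
FileSystem fileSystem = FileSystem.get(configuration);
FileStatus fileStatus = fileSystem.getFileStatus(new Path(dest));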

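6. The role of the DataNode heartbeat mechanism

7. The roles of the EditLog and FsImage in the NameNode

8. How the SecondaryNameNode lightens the NameNode's load

9. How can the NameNode be scaled out?

Add the required configuration to hdfs-site.xml on the master node.

Check the clusterId of the master node.

Copy hdfs-site.xml from master to slave1 and slave2.

Once the three nodes form a cluster, a Java API client no longer knows which NameNode's ip:port to specify, so we need to configure viewFs.

First comment out the fs.defaultFS entry in core-site.xml, then add the following (fs.default.name is the older, deprecated name of the same setting):

<property>
    <name>fs.default.name</name>
    <value>viewfs://my-cluster</value>
</property>

Then add a mountTable.xml file. The mount table maps paths to NameNodes, which in effect spreads the metadata that a single NameNode would manage across several NameNode nodes.

Sync the modified configuration files to slave1 and slave2, then restart the HDFS cluster.

After the restart, the same generic request works from any node, for example:

hadoop fs -ls viewfs://my-cluster/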
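To make the idea of a mount table concrete, the following is a simplified, JDK-only sketch of prefix-based path resolution. It is a conceptual illustration, not ViewFs's actual implementation; the class MountTableSketch, the mount points /user and /data, and the slave1 address are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Conceptual sketch of a mount table; not the ViewFs implementation. */
final class MountTableSketch {

    // mount point -> target URI on a specific NameNode
    private final Map<String, String> mounts = new LinkedHashMap<>();

    void addLink(String mountPoint, String targetUri) {
        mounts.put(mountPoint, targetUri);
    }

    /** Maps a client-side path to the URI on the NameNode that owns its mount point. */
    String resolve(String path) {
        String best = null;
        for (String mountPoint : mounts.keySet()) {
            boolean matches = path.equals(mountPoint) || path.startsWith(mountPoint + "/");
            // Prefer the longest matching mount point
            if (matches && (best == null || mountPoint.length() > best.length())) {
                best = mountPoint;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No mount point for " + path);
        }
        return mounts.get(best) + path.substring(best.length());
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.addLink("/user", "hdfs://master:9999/user");   // hypothetical mapping
        table.addLink("/data", "hdfs://slave1:9999/data");   // hypothetical mapping
        System.out.println(table.resolve("/user/hadoop-twq/cmd/error.txt"));
        System.out.println(table.resolve("/data/logs"));
    }
}
```

The client only ever sees paths such as /user/... and /data/...; which NameNode serves them is decided by the table.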

10. Commands for creating file snapshots (backups)

Allow snapshots to be created for a given directory:

hadoop dfsadmin -allowSnapshot /user/hadoop-twq/data

Create a snapshot:

hadoop fs -createSnapshot /user/hadoop-twq/data data-20180317-snapshot

List the files in the snapshot:

hadoop fs -ls /user/hadoop-twq/data/.snapshot/data-20180317-snapshot
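The snapshot name above embeds a date, and the snapshot is then reachable under the directory's .snapshot subdirectory. A small JDK-only sketch of that naming scheme follows; SnapshotPaths is a made-up helper for illustration, and the "prefix-date-snapshot" pattern is simply inferred from the example name.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Illustrative helper for snapshot names and paths; not part of Hadoop. */
final class SnapshotPaths {

    /** Builds a name such as data-20180317-snapshot from a prefix and a date. */
    static String snapshotName(String prefix, LocalDate date) {
        return prefix + "-" + date.format(DateTimeFormatter.BASIC_ISO_DATE) + "-snapshot";
    }

    /** Snapshots of a directory are exposed under <directory>/.snapshot/<name>. */
    static String snapshotPath(String directory, String snapshotName) {
        return directory + "/.snapshot/" + snapshotName;
    }

    public static void main(String[] args) {
        String name = snapshotName("data", LocalDate.of(2018, 3, 17));
        System.out.println(name);                                        // data-20180317-snapshot
        System.out.println(snapshotPath("/user/hadoop-twq/data", name));
    }
}
```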

Other snapshot-related commands

11. Balancing data

When an HDFS cluster is expanded, the newly added nodes inevitably hold relatively little data at first. To redistribute the data more evenly, use the hdfs balancer command.
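The balancer takes a -threshold percentage (10 by default) and treats a DataNode as balanced when its disk utilization is within that many percentage points of the cluster-wide utilization. The sketch below illustrates only that check, using JDK classes; BalanceCheck, the node slave3, and all usage figures are invented for the example, and the real balancer does considerably more.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrates the threshold check only; not the actual balancer logic. */
final class BalanceCheck {

    /**
     * Returns the nodes whose utilization (used / capacity, in percent) differs from
     * the cluster-wide utilization by more than thresholdPercent.
     * Each map value is {usedBytes, capacityBytes}.
     */
    static List<String> unbalancedNodes(Map<String, long[]> usedAndCapacity, double thresholdPercent) {
        long totalUsed = 0;
        long totalCapacity = 0;
        for (long[] uc : usedAndCapacity.values()) {
            totalUsed += uc[0];
            totalCapacity += uc[1];
        }
        double clusterPercent = 100.0 * totalUsed / totalCapacity;

        List<String> result = new ArrayList<>();
        for (Map.Entry<String, long[]> entry : usedAndCapacity.entrySet()) {
            double nodePercent = 100.0 * entry.getValue()[0] / entry.getValue()[1];
            if (Math.abs(nodePercent - clusterPercent) > thresholdPercent) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, long[]> nodes = new LinkedHashMap<>();
        nodes.put("slave1", new long[] {45, 100});
        nodes.put("slave2", new long[] {40, 100});
        nodes.put("slave3", new long[] {5, 100});   // hypothetical newly added node
        // Cluster utilization is 90 / 300 = 30%, so slave1 (45%) and slave3 (5%) are out of range
        System.out.println(unbalancedNodes(nodes, 10.0)); // [slave1, slave3]
    }
}
```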

12. Safe mode

While safe mode is on, files and directories cannot be created or deleted; they can only be viewed.

hadoop dfsadmin -safemode get

Safe mode is OFF

hadoop dfsadmin -safemode enter

Safe mode is ON

hadoop dfsadmin -safemode leave

Safe mode is OFF
