Hadoop Quick Start — Chapter 2: Distributed Clusters (Section 4: Setting Up the Development Environment)
Adding the dependencies:
You can start by installing the Big Data Tools plugin.
Restart the IDE after the installation finishes.
A personal suggestion: switch the plugin/package mirror to a domestic one first. I did not, ran the update directly, and the download still had not finished after several rounds of Lianliankan.
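Before writing any HDFS code it is worth confirming that the Hadoop libraries are actually on the classpath. A minimal sketch, assuming the org.apache.hadoop:hadoop-client artifact has been added to the project (for example via Maven); the class name is just for illustration:

import org.apache.hadoop.util.VersionInfo;

public class VersionCheck {
    public static void main(String[] args) {
        // Prints the version of the Hadoop libraries found on the classpath;
        // a NoClassDefFoundError here means the dependency was not added correctly.
        System.out.println("Hadoop version: " + VersionInfo.getVersion());
    }
}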
Create a test class:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        System.out.println(action.conf);
        System.out.println(action.fs);
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Output:
Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml
org.apache.hadoop.fs.LocalFileSystem@43195e57
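The LocalFileSystem in the output means fs.defaultFS is not configured, so all of the following examples operate on the local disk. To run them against a real cluster, the NameNode address and user can be passed to FileSystem.get explicitly. A minimal sketch, where hdfs://node1:9000 and root are placeholders for your own NameNode URI (the fs.defaultFS value from core-site.xml) and HDFS user:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HdfsConnect {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Replace the URI and the user name with your cluster's values.
        FileSystem fs = FileSystem.get(new URI("hdfs://node1:9000"), conf, "root");
        System.out.println(fs); // should print a DistributedFileSystem instance
        fs.close();
    }
}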
File operations:
mkdirs: create a directory
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            // Create the directory /data/infos/
            boolean isf = action.fs.mkdirs(new Path("/data/infos/"));
            System.out.println(isf ? "Created successfully" : "Creation failed");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The directory is created under the root of the C: drive, because with no fs.defaultFS configured the FileSystem obtained above is a LocalFileSystem, so /data/infos/ resolves to the local disk.
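To confirm where the directory actually ended up, exists and getFileStatus can be checked right after the mkdirs call. A small sketch reusing the same path (the class name is only illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckDir {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path dir = new Path("/data/infos/");
        if (fs.exists(dir)) {
            // getFileStatus() shows the fully qualified path as the
            // (local) file system resolves it.
            System.out.println(fs.getFileStatus(dir).getPath());
        } else {
            System.out.println("Directory does not exist yet");
        }
        fs.close();
    }
}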
copyFromLocalFile: copy a file to the server (simulated locally here)
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            // Copy the local file D:/info.txt into /data/infos
            action.fs.copyFromLocalFile(new Path("D:/info.txt"), new Path("/data/infos"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Local result:
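The opposite direction, downloading a file from the (simulated) cluster back to the local disk, uses copyToLocalFile. A sketch, assuming a local target directory D:/backup exists:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Download {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Copy /data/infos/info.txt down to the local directory D:/backup
        fs.copyToLocalFile(new Path("/data/infos/info.txt"), new Path("D:/backup"));
        fs.close();
    }
}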
Renaming a file with rename:
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            SimpleDateFormat format = new SimpleDateFormat("yyyy_MM_dd");
            Date now = new Date();
            // Rename info.txt to the current date, e.g. 2022_04_19.txt
            boolean isf = action.fs.rename(new Path("/data/infos/info.txt"),
                    new Path("/data/infos/" + format.format(now) + ".txt"));
            System.out.println(isf ? "Renamed successfully" : "Rename failed");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Local result:
Deleting a file with deleteOnExit (the path is only marked for deletion and is removed when the FileSystem is closed, not immediately):
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            // Mark the file for deletion when the FileSystem is closed
            boolean isf = action.fs.deleteOnExit(new Path("/data/infos/2022_04_19.txt"));
            System.out.println(isf ? "Deleted successfully" : "Delete failed");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
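If the file should disappear immediately rather than when the FileSystem is closed, delete(path, recursive) can be used instead. A sketch with the same path; the second argument enables recursive deletion when the path is a directory:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteNow {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // delete() removes the path right away; false = do not recurse.
        boolean isf = fs.delete(new Path("/data/infos/2022_04_19.txt"), false);
        System.out.println(isf ? "Deleted" : "Delete failed");
        fs.close();
    }
}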
Viewing directory information:
Create some test files:
Iterate over all files under /data/:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.fs.permission.FsPermission;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            // Iterate over all files under /data/ (recursively)
            RemoteIterator<LocatedFileStatus> iterator = action.fs.listFiles(new Path("/data/"), true);
            while (iterator.hasNext()) {
                LocatedFileStatus file = iterator.next();
                FsPermission permission = file.getPermission();
                System.out.println(file.getPath().getName() + "\t" + file.getLen() + "\t" + permission);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Iterating over files and directories with listStatus:
Code:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            // List the direct children of /data/infos/ (not recursive)
            FileStatus[] fileStatuses = action.fs.listStatus(new Path("/data/infos/"));
            for (FileStatus file : fileStatuses) {
                if (file.isFile()) {
                    System.out.println("File: " + file.getPath().getName());
                } else {
                    System.out.println("Directory: " + file.getPath().getName());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Result:
Getting information about all DataNodes (nothing is shown when running locally on Windows):
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class Action {
    public static void main(String[] args) {
        Action action = new Action();
        action.init();
        try {
            // Requires a connection to a real HDFS cluster: with the default
            // (local) configuration fs is a LocalFileSystem and this cast fails,
            // which is why nothing shows up when running locally on Windows.
            DistributedFileSystem distributedFileSystem = (DistributedFileSystem) action.fs;
            DatanodeInfo[] datanodeInfos = distributedFileSystem.getDataNodeStats();
            for (DatanodeInfo datanodeInfo : datanodeInfos) {
                System.out.println(datanodeInfo.getHostName());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    Configuration conf = null;
    FileSystem fs = null;

    public void init() {
        conf = new Configuration();
        try {
            fs = FileSystem.get(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
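Overall capacity information can also be read through the generic FileSystem API without casting to DistributedFileSystem. A sketch using getStatus():

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class SpaceInfo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Total, used and remaining space of the file system, in bytes.
        FsStatus status = fs.getStatus();
        System.out.println("capacity : " + status.getCapacity());
        System.out.println("used     : " + status.getUsed());
        System.out.println("remaining: " + status.getRemaining());
        fs.close();
    }
}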
Design characteristics of HDFS
Stores very large files
Streaming data access
Runs on commodity hardware
Not suited to low-latency data access
Not suited to large numbers of small files
No efficient support for multiple concurrent writers or arbitrary in-place file modifications