Experiment: Running a C++ Program from Hadoop

Suppose we have a C++ program, boss.exe, invoked as follows (the first argument is the input file, the second the output file):

./boss.exe ADDRESS_BOOK_FILE NEW_ADDRESS_BOOK_FILE

We now need to launch boss.exe from Hadoop's Map function. Its input and output files both live in HDFS, with paths in one of these forms:

hdfs://127.0.0.1:8020/user/donal/address1.txt or hdfs:///user/donal/address1.txt

The approach inside the Map function:

1. Copy the input file from HDFS to the local filesystem

hadoop fs -copyToLocal /user/donal/address1.txt /tmp/address1.txt

2. Run the C++ program

./boss.exe /tmp/address1.txt /tmp/address2.txt

3. Copy the result file back to HDFS

hadoop fs -copyFromLocal /tmp/address2.txt /user/donal/address2.txt

The full code is as follows:

Map.java

import java.util.Arrays;

class Map {
    // Launches an external command, drains its stdout/stderr, and returns
    // the child's exit code (-1 if the launch itself failed).
    public static int RunProcess(String[] args) {
        int exitcode = -1;
        System.out.println(Arrays.toString(args));
        try {
            Runtime runtime = Runtime.getRuntime();
            final Process process = runtime.exec(args);
            // Drain stderr and stdout on separate threads so the child
            // cannot block on a full pipe buffer.
            new StreamGobbler(process.getErrorStream(), "ERROR").start();
            new StreamGobbler(process.getInputStream(), "OUTPUT").start();
            process.getOutputStream().close();
            exitcode = process.waitFor();
        } catch (Throwable t) {
            t.printStackTrace();
        }
        return exitcode;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: Map ADDRESS_BOOK_FILE NEW_ADDRESS_BOOK_FILE");
            System.exit(-1);
        }
        String inFileName = args[0];
        String outFileName = args[1];
        String localInFileName = "";
        String localOutFileName = "";
        try {
            if (args[0].startsWith("hdfs://")) {
                // Strip the "hdfs://" scheme and any host:port authority,
                // keeping the absolute path. Searching for '/' from index 7
                // handles both hdfs:///path and hdfs://host:port/path.
                inFileName = args[0].substring(args[0].indexOf('/', 7));
                localInFileName = "/tmp/" + inFileName.substring(inFileName.lastIndexOf('/') + 1);
                // Copy the input file from HDFS to the local filesystem.
                RunProcess(new String[]{"/bin/sh", "-c",
                        "/usr/lib/hadoop/bin/hadoop fs -copyToLocal " + inFileName + " " + localInFileName});
            }
            if (args[1].startsWith("hdfs://")) {
                outFileName = args[1].substring(args[1].indexOf('/', 7));
                localOutFileName = "/tmp/" + outFileName.substring(outFileName.lastIndexOf('/') + 1);
            }
            // Run the C++ program against the local copies.
            String[] commandArgs = {"./boss.exe", localInFileName, localOutFileName};
            int exitcode = RunProcess(commandArgs);
            if (args[1].startsWith("hdfs://")) {
                // Copy the result file back to HDFS.
                RunProcess(new String[]{"/bin/sh", "-c",
                        "/usr/lib/hadoop/bin/hadoop fs -copyFromLocal " + localOutFileName + " " + outFileName});
            }
            System.out.println("finish:" + exitcode);
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
}
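
For a quick test outside of MapReduce (assuming boss.exe sits in the working directory and the hadoop binary is at the path hard-coded above), the class could be compiled and invoked like this:

javac Map.java StreamGobbler.java
java Map hdfs:///user/donal/address1.txt hdfs:///user/donal/address2.txt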

StreamGobbler.java:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Reads a child process's stream on its own thread and echoes each line,
// prefixed with a tag such as "ERROR" or "OUTPUT".
class StreamGobbler extends Thread {
    private final InputStream is;
    private final String type;

    StreamGobbler(InputStream is, String type) {
        this.is = is;
        this.type = type;
    }

    public void run() {
        try {
            BufferedReader br = new BufferedReader(new InputStreamReader(is));
            String line;
            while ((line = br.readLine()) != null)
                System.out.println(type + ">" + line);
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
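
On Java 7 and later, ProcessBuilder can achieve the same effect without dedicated gobbler threads. A minimal sketch, not from the original post (RunWithProcessBuilder is a hypothetical name):

import java.io.IOException;

class RunWithProcessBuilder {
    // Runs a command and returns its exit code. inheritIO() connects the
    // child's stdin/stdout/stderr directly to this JVM's streams, so no
    // reader threads are needed to keep the pipes from filling up.
    static int run(String... cmd) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.inheritIO();
        return pb.start().waitFor();
    }
}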


Notes:

1. The HDFS copies could also be implemented with the Java API; this example shells out to the hadoop command instead (see the sketch after these notes).

2. StreamGobbler.java exists to print the child process's ErrorStream and InputStream output.
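
As note 1 mentions, the same copies can go through the HDFS Java API. A minimal sketch, assuming the Hadoop client jars are on the classpath and reusing the example paths from above (HdfsCopy is a hypothetical class name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCopy {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path src = new Path("hdfs:///user/donal/address1.txt");
        // Resolve the FileSystem implementation from the URI scheme.
        FileSystem fs = src.getFileSystem(conf);
        // HDFS -> local, same effect as hadoop fs -copyToLocal
        fs.copyToLocalFile(src, new Path("/tmp/address1.txt"));
        // local -> HDFS, same effect as hadoop fs -copyFromLocal
        fs.copyFromLocalFile(new Path("/tmp/address2.txt"),
                             new Path("hdfs:///user/donal/address2.txt"));
    }
}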



Original post: https://www.cnblogs.com/Donal/p/2387873.html