MapReduce中ArrayWritable 使用指南
MapReduce是一种编程模型,用于大规模数据集的并行运算。概念Map(映射)和Reduce(归约)和他们的主要思想,都是从函数式编程语言里借来的,还有从矢量编程语言里借来的特性。他极大地方
在编写MapReduce程序时,Map和Reduce之间传递的数据需要是ArrayList类型的,在调试运行时遇到了这样的一个错误:
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>()
经查询官网API文档后发现这样的一段话:
A Writable for arrays containing instances of a class. The elements of this writable must all be instances of the same class. If this writable will be the input for a Reducer, you will need to create a subclass that sets the value to be of the proper type. For example: public class IntArrayWritable extends ArrayWritable { public IntArrayWritable() { super(IntWritable.class); } }
原来是要自己实现一个ArrayWritable类的派生类,使用时只要实现两个构造函数即可
public static class TextArrayWritable extends ArrayWritable { public TextArrayWritable() { super(Text.class); } public TextArrayWritable(String[] strings) { super(Text.class); Text[] texts = new Text[strings.length]; for (int i = 0; i < strings.length; i++) { texts[i] = new Text(strings[i]); } set(texts); } }
精彩图集
精彩文章