hbase-user mailing list archives

From Ranjini Rathinam <ranjinibe...@gmail.com>
Subject XML to TEXT
Date Wed, 08 Jan 2014 09:23:56 GMT
Hi,

As suggested, I tried the code, but result.txt contains only the header.
Nothing else is written.

After debugging, I found that parsing produces no value.

The problem is in the lines marked with asterisks below. When I added a
System.out call there, no value was printed.

        String xmlContent = value.toString();

        InputStream is = new ByteArrayInputStream(xmlContent.getBytes());
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder;
        try {
            builder = factory.newDocumentBuilder();
            *Document doc = builder.parse(is);*

            *String ed = doc.getDocumentElement().getNodeName();*
            out.write(ed.getBytes());
            DTMNodeList list = (DTMNodeList) getNode("/Company/Employee", doc, XPathConstants.NODESET);


When I print:

out.write(xmlContent.getBytes()) :- the whole XML is printed.
out.write(ed.getBytes()) :- nothing is printed.

I also added a System.out call for list, and nothing was printed.

Please suggest where I am going wrong and help me fix this.

Thanks in advance.

I have attached my code. Please review.


Mapper class:-

public class XmlTextMapper extends Mapper<LongWritable, Text, Text, Text> {
    private static final XPathFactory xpathFactory = XPathFactory.newInstance();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String resultFileName = "/user/task/Sales/result.txt";

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(resultFileName), conf);
        FSDataOutputStream out = fs.create(new Path(resultFileName));
        InputStream resultIS = new ByteArrayInputStream(new byte[0]);
        String header = "id,name\n";
        out.write(header.getBytes());
        String xmlContent = value.toString();

        InputStream is = new ByteArrayInputStream(xmlContent.getBytes());
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder;
        try {
            builder = factory.newDocumentBuilder();
            Document doc = builder.parse(is);

            String ed = doc.getDocumentElement().getNodeName();
            out.write(ed.getBytes());
            DTMNodeList list = (DTMNodeList) getNode("/Company/Employee", doc, XPathConstants.NODESET);
            int size = list.getLength();
            for (int i = 0; i < size; i++) {
                Node node = list.item(i);
                String line = "";
                NodeList nodeList = node.getChildNodes();
                int childNumber = nodeList.getLength();
                for (int j = 0; j < childNumber; j++) {
                    line += nodeList.item(j).getTextContent() + ",";
                }
                if (line.endsWith(","))
                    line = line.substring(0, line.length() - 1);
                line += "\n";
                out.write(line.getBytes());
            }
        } catch (ParserConfigurationException e) {
             e.printStackTrace();
        } catch (SAXException e) {
             e.printStackTrace();
        } catch (XPathExpressionException e) {
             e.printStackTrace();
        }
        IOUtils.copyBytes(resultIS, out, 4096, true);
        out.close();
    }
    public static Object getNode(String xpathStr, Node node, QName returnType)
            throws XPathExpressionException {
        XPath xpath = xpathFactory.newXPath();
        return xpath.evaluate(xpathStr, node, returnType);
    }
}
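
A note on the mapper above: the driver does not set an input format, so Hadoop's default TextInputFormat applies and map() is called once per line of the input file, not once per document. If that is what happens here, a single line such as <Company> is not well-formed XML on its own, builder.parse(is) throws a SAXException, and the statements after it never run. The sketch below tries to reproduce that outside Hadoop using only the JDK parser. The class name ParseCheck and the helper rootNameOrNull are hypothetical names chosen for illustration, not part of the attached code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class: compares parsing a whole document with parsing one line of it.
class ParseCheck {

    // Returns the root element name, or null if the input is not well-formed XML.
    public static String rootNameOrNull(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getNodeName();
        } catch (SAXException e) {
            // The parser rejects fragments such as a lone opening tag.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: this is what the mapper would need to receive in one call.
        String wholeDocument = "<Company><Employee><id>100</id></Employee></Company>";
        // Assumption: this is what a line-oriented input format passes per call.
        String singleLine = "<Company>";

        System.out.println("whole document: " + rootNameOrNull(wholeDocument));
        System.out.println("single line:    " + rootNameOrNull(singleLine));
    }
}
```

If the single-line case comes back null while the whole document parses, the task's stderr log should show the matching SAXException stack trace from e.printStackTrace().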



Main class
public class MainXml {
    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();

        if (args.length != 2) {
            System.err.println("Usage: XMLtoText <input path> <output path>");
            System.exit(-1);
        }

        String output = "/user/task/Sales/";
        Job job = new Job(conf, "XML to Text");
        job.setJarByClass(MainXml.class);
        // job.setJobName("XML to Text");

        FileInputFormat.addInputPath(job, new Path(args[0]));

        // FileOutputFormat.setOutputPath(job, new Path(args[1]));
        Path outPath = new Path(output);
        FileOutputFormat.setOutputPath(job, outPath);
        FileSystem dfs = FileSystem.get(outPath.toUri(), conf);
        if (dfs.exists(outPath)) {
            dfs.delete(outPath, true);
        }
        job.setMapperClass(XmlTextMapper.class);

        job.setNumReduceTasks(0);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}



My xml file

<Company>
<Employee>
<id>100</id>
<ename>ranjini</ename>
<dept>IT1</dept>
<sal>123456</sal>
<location>nextlevel1</location>
<Address>
<Home>Chennai1</Home>
<Office>Navallur1</Office>
</Address>
</Employee>
<Employee>
<id>1001</id>
<ename>ranjinikumar</ename>
<dept>IT</dept>
<sal>1234516</sal>
<location>nextlevel</location>
<Address>
<Home>Chennai</Home>
<Office>Navallur</Office>
</Address>
</Employee>
</Company>
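
For reference, here is a minimal sketch of the conversion this file is meant to go through, run against a complete document outside Hadoop. It assumes the "id,name" header refers to the <id> and <ename> elements, and it casts the XPath result to the standard org.w3c.dom.NodeList interface instead of the JDK-internal DTMNodeList. The class name EmployeeCsvSketch and the method toCsvLines are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical standalone sketch: turns each <Employee> into one "id,name" line.
class EmployeeCsvSketch {

    public static List<String> toCsvLines(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // XPathConstants.NODESET maps to the standard NodeList interface.
        NodeList employees = (NodeList) xpath.evaluate(
                "/Company/Employee", doc, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < employees.getLength(); i++) {
            // Select the fields by name relative to the current <Employee>,
            // so whitespace text nodes between child elements are not picked up.
            String id = xpath.evaluate("id", employees.item(i));
            String name = xpath.evaluate("ename", employees.item(i));
            lines.add(id + "," + name);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Company>\n"
                + "<Employee>\n<id>100</id>\n<ename>ranjini</ename>\n<dept>IT1</dept>\n</Employee>\n"
                + "<Employee>\n<id>1001</id>\n<ename>ranjinikumar</ename>\n<dept>IT</dept>\n</Employee>\n"
                + "</Company>";
        System.out.println("id,name");
        for (String line : toCsvLines(xml)) {
            System.out.println(line);
        }
    }
}
```

Selecting fields by name also avoids a side effect of looping over getChildNodes(): with the indented file above, that loop visits the newline text nodes between elements as well, so the joined line would contain extra commas and line breaks.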


Thanks in advance
Ranjini. R
