sqoop-user mailing list archives

From Douglas Spadotto <dougspado...@gmail.com>
Subject Re: Read export counters on command-line
Date Thu, 08 Dec 2016 15:17:48 GMT
Hi Rickard,

Great suggestion, thanks a lot!

I'll try this and compare it with the quick-and-dirty approach I have
written so far, which pulls this info from the Sqoop command's output log.
Your suggestion is much more elegant.
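
For reference, the quick-and-dirty log approach can be sketched roughly as below. This is only a sketch under assumptions: it presumes the captured Sqoop output contains a line ending in "Exported <N> records." and a job ID of the form job_<digits>_<digits>; the function names are made up for illustration.

```python
import re

# Assumed shape of the summary line in the Sqoop export output,
# e.g. "... INFO mapreduce.ExportJobBase: Exported 22574506 records."
EXPORTED_RE = re.compile(r"Exported (\d+) records\.")

# Hadoop job IDs look like job_1480501523856_12395.
JOB_ID_RE = re.compile(r"job_\d+_\d+")


def exported_records(log_text):
    """Return the exported row count found in the log text, or None."""
    match = EXPORTED_RE.search(log_text)
    return int(match.group(1)) if match else None


def first_job_id(log_text):
    """Return the first Hadoop job ID mentioned in the log text, or None."""
    match = JOB_ID_RE.search(log_text)
    return match.group(0) if match else None
```

The job ID picked up this way is the piece needed for a JobHistory lookup.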

Regards,

Douglas

On Thu, Dec 8, 2016 at 12:55 PM, Rickard Cardell <rickard.cardell@klarna.com> wrote:

> Hi,
> We are doing a similar thing, but a job ID is required. We fetch all job
> stats from the REST API of the JobHistory server and push them to an ELK
> cluster. We can then graph all kinds of stuff :) But for the use case you
> describe, it might be enough to curl the JobHistory server.
>
> The counter/metric that you are looking for is
> org.apache.hadoop.mapreduce.TaskCounter.MAP_OUTPUT_RECORDS.totalCounterValue.
>
> So if you have the Hadoop job ID, you could then fetch the information
> from the REST API of the JobHistory server:
> <https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html>
>
> e.g.
> curl http://localhost:19888/ws/v1/history/mapreduce/jobs/job_1480501523856_12395/counters | python -m json.tool
>
> {
>     "jobCounters": {
>         "counterGroup": [
>             {
>                 "counter": [
>                     {
>                         "mapCounterValue": 22574506,
>                         "name": "MAP_INPUT_RECORDS",
>                         "reduceCounterValue": 0,
>                         "totalCounterValue": 22574506
>                     },
>                     {
>                         "mapCounterValue": 22574506,
>                         "name": "MAP_OUTPUT_RECORDS",
>                         "reduceCounterValue": 0,
>                         "totalCounterValue": 22574506
>                     },
>                     ...
>                 ],
>                 "counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter"
>             }
>         ],
>         "id": "job_1480501523856_12395"
>     }
> }
>
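
The same lookup can be sketched in Python instead of curl. This is a minimal sketch, not a tested integration: the function names are hypothetical, the host and port are taken from the curl example above, and it assumes the server returns the JSON layout shown in the sample response.

```python
import json
from urllib.request import Request, urlopen

TASK_COUNTER_GROUP = "org.apache.hadoop.mapreduce.TaskCounter"


def fetch_job_counters(history_url, job_id):
    """Fetch the counters document for one job from the JobHistory server."""
    url = f"{history_url}/ws/v1/history/mapreduce/jobs/{job_id}/counters"
    # Ask explicitly for JSON, since the API can also serve XML.
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


def total_counter_value(payload, group_name, counter_name):
    """Return totalCounterValue for the named counter, or None if absent."""
    for group in payload["jobCounters"]["counterGroup"]:
        if group["counterGroupName"] != group_name:
            continue
        for counter in group["counter"]:
            if counter["name"] == counter_name:
                return counter["totalCounterValue"]
    return None


# Example use (requires a reachable JobHistory server):
# payload = fetch_job_counters("http://localhost:19888", "job_1480501523856_12395")
# print(total_counter_value(payload, TASK_COUNTER_GROUP, "MAP_OUTPUT_RECORDS"))
```

Keeping the parsing separate from the HTTP call makes it possible to check the extraction against a saved response without a running cluster.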
>
> //Rickard
>
> 2016-11-08 21:01 GMT+01:00 Douglas Spadotto <dougspadotto@gmail.com>:
>
>> Hello everyone,
>>
>> Is there a way for me to read the number of rows that were exported by a
>> Sqoop command without having to parse the command output?
>>
>> I tried displaying the environment variables after an execution and
>> didn't find anything meaningful.
>>
>> I saw this is doable when you call Sqoop from Java code, but couldn't
>> find anything on the command-line.
>>
>> Do I have to parse logs to get that information? Another idea was to write
>> a custom validator that only makes the statistics available in a more
>> readable format (env. variable, file, etc.).
>>
>> Thanks in advance,
>>
>> Douglas
>>
>> -----
>> Frodo: "I wish none of this had happened."
>> Gandalf: "So do all who live to see such times, but that is not for them
>> to decide. All we have to decide is what to do with the time that is given
>> to us."
>> -- Lord of the Rings: The Fellowship of the Ring (2001)
>>
>
>
>
> --
>
> *Rickard Cardell*
> System Developer
> Datavault Core, Bill (former Odin)
> +46 701 612 644
>
> Klarna AB
> Sveavägen 46, 111 34 Stockholm
> Tel: +46 8 120 120 00
> Reg no: 556737-0431
> klarna.com
>
>


-- 
Visit: http://canseidesercowboy.wordpress.com/
Follow: @dougspadotto or @excowboys
