spot-user mailing list archives

From Terry Healy <the...@bnl.gov>
Subject Another OA snag
Date Wed, 13 Sep 2017 17:40:44 GMT
So having installed the missing requirements, I'm moving along slowly 
towards getting OA to run.

When I run:     python2.7 start_oa.py -d 20170911 -t flow -l 3000

Here is the error trace received:

2017-09-13 13:22:37,885 - OA - INFO - -------------------- STARTING OA ---------------------
2017-09-13 13:22:37,885 - OA - INFO - Validating input parameter values
2017-09-13 13:22:38,273 - OA.Flow - INFO - Creating folder structure for OA (data and ipynb)
2017-09-13 13:22:38,273 - OA.Flow - INFO - Cleaning data from previous executions for the day

Debug1: /user/spot/flow/hive/oa/path/y=2017/m=9/d=11

Traceback (most recent call last):
  File "start_oa.py", line 82, in <module>
    main()
  File "start_oa.py", line 40, in main
    start_oa(args)
  File "start_oa.py", line 55, in start_oa
    oa_process.start()
  File "/home/spot/apache-spot-1.0-incubating/spot-oa/oa/flow/flow_oa.py", line 82, in start
    self._clear_previous_executions()
  File "/home/spot/apache-spot-1.0-incubating/spot-oa/oa/flow/flow_oa.py", line 110, in _clear_previous_executions
    HDFSClient.delete_folder("{0}/{1}/hive/oa/{2}/y={3}/m={4}/d={5}".format(HUSER,self._table_name,path,yr,int(mn),int(dy)),user="impala")
  File "../api/resources/hdfs_client.py", line 62, in delete_folder
    client.delete(hdfs_file,recursive=True)
  File "/usr/lib/python2.7/site-packages/hdfs/client.py", line 850, in delete
    return self._delete(hdfs_path, recursive=recursive).json()['boolean']
  File "/usr/lib/python2.7/site-packages/hdfs/client.py", line 112, in api_handler
    raise err
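
The trace above stops at "raise err" without showing the error text itself. One way to surface it is to wrap the delete call and print the exception. This is a minimal sketch, not code from Spot: the name delete_folder_verbose is hypothetical, and it only assumes a client object with the delete(hdfs_path, recursive=...) method seen in the trace.

```python
# Hypothetical wrapper, not part of Apache Spot or the hdfs package.
# Assumes `client` exposes delete(hdfs_path, recursive=...) as in the trace.
def delete_folder_verbose(client, hdfs_path):
    """Delete hdfs_path recursively and print the reason if it fails."""
    try:
        return client.delete(hdfs_path, recursive=True)
    except Exception as err:
        # Print the path together with the error text that the bare
        # traceback does not show.
        print("delete failed for {0}: {1}".format(hdfs_path, err))
        return False
```

The printed message should help tell apart a path problem from, for example, a permissions problem for the user passed to delete_folder.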


----------------------------------------------------------------------------------------

The code in flow_oa.py reads as follows (my crude attempt at debugging is 
the print statement, with "flow" hardcoded here):

         print "Debug1: {0}/{1}/hive/oa/flow/y={2}/m={3}/d={4}".format(HUSER,self._table_name,yr,int(mn),int(dy))
         for path in table_schema:
             HDFSClient.delete_folder("{0}/{1}/hive/oa/{2}/y={3}/m={4}/d={5}".format(HUSER,self._table_name,path,yr,int(mn),int(dy)),user="impala")


The actual data path in HDFS is (i.e. no "oa" in the path):

     /user/spot/flow/hive/y=2017/m=9/d=11/[hours here]


But these directories also exist, and they seem to be what it is after:

     /user/spot/flow/hive/oa/['suspicious', 'edge', 'chords', 
'threat_investigation', 'timeline', 'storyboard', 'summary']

** Note that writing the month here as "9" rather than "09" is required by 
my installation.
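
To compare what the script tries to delete with what actually exists in HDFS, the format string quoted above can be expanded outside of OA. This is a minimal sketch under assumptions: HUSER is taken to be /user/spot and the table name to be flow (both inferred from the Debug1 output), the date is assumed to be split by position from the -d argument, and the name oa_partition_paths is hypothetical and not part of Spot.

```python
# Hypothetical helper, not part of Apache Spot: expands the same format
# string used in _clear_previous_executions so the paths can be inspected.
OA_SUBFOLDERS = ["suspicious", "edge", "chords", "threat_investigation",
                 "timeline", "storyboard", "summary"]


def oa_partition_paths(huser, table_name, date_str, subfolders=OA_SUBFOLDERS):
    """Return one OA partition path per subfolder for a YYYYMMDD date."""
    # Assumption: the date is split by position, as in "20170911".
    yr, mn, dy = date_str[0:4], date_str[4:6], date_str[6:8]
    template = "{0}/{1}/hive/oa/{2}/y={3}/m={4}/d={5}"
    # int() drops the leading zero, so "09" becomes 9 in the path.
    return [template.format(huser, table_name, sub, yr, int(mn), int(dy))
            for sub in subfolders]


if __name__ == "__main__":
    for p in oa_partition_paths("/user/spot", "flow", "20170911"):
        print(p)
```

Each printed path can then be checked by hand with "hdfs dfs -ls" to see which of them exist for that day.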

What do I have out of sync here?

Thanks,
Terry
