mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: mahout quickstart-kmeans script sequencefile parameter
Date Thu, 03 Jun 2010 23:55:02 GMT
Yes, the options have changed a bit recently and that script evidently 
did not get updated yet. We are working to make all the algorithm 
command lines more uniform and still have a ways to go to accomplish 
that goal.

-w should now be -ow, which causes the output directory to be overwritten.
-x (--maxIter) is also required, though perhaps it should not be. Do you 
really want kmeans to run forever?
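
As a rough sketch, the last line of the script might look like this with both changes applied. The flags are only the ones mentioned in this thread, the input path is the tfidf-vectors one from the working command quoted below, and the MAHOUT variable and the dry-run fallback are assumptions added for illustration. I have not checked this against current trunk, so compare it with the driver's help output first.

```shell
#!/bin/sh
# Sketch only: -ow replaces the old -w, and -x is the short form of --maxIter.
# MAHOUT is a hypothetical override for the launcher path used in the script.
MAHOUT="${MAHOUT:-../bin/mahout}"

# Build the argument list once so it can be either executed or printed.
set -- kmeans \
  -i ./work/reuters-out-seqdir-sparse/tfidf-vectors/ \
  -c ./work/clusters \
  -o ./work/reuters-kmeans \
  -k 20 \
  -ow \
  -x 20

if [ -x "$MAHOUT" ]; then
  "$MAHOUT" "$@"
else
  # Launcher not found: show the command instead of running it.
  echo "dry run: $MAHOUT $*"
fi
```

The iteration cap of 20 is simply the value from the --maxIter workaround; pick whatever limit suits your data.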

If you run the driver with incorrect arguments, does it not print out 
the help information for you?
Jeff


On 6/3/10 2:58 PM, Tommy Chheng wrote:
>  Thanks Drew,
> I started a new EC2 instance with the mahout trunk and got it working. 
> There is a problem with the last line though.
>
> The last line in the script gave an error:
> ../bin/mahout kmeans -i 
> ./work/reuters-out-seqdir-sparse/tfidf/vectors/ -c ./work/clusters -o 
> ./work/reuters-kmeans -k 20 -w
>
> org.apache.commons.cli2.OptionException: Unexpected -w while 
> processing Options
>
> Removing -w and adding --maxIter fixes it.
> ../bin/mahout kmeans -i 
> ./work/reuters-out-seqdir-sparse/tfidf-vectors/ -c ./work/clusters -o 
> ./work/reuters-kmeans -k 20 --maxIter 20
>
> I added a comment to
> https://issues.apache.org/jira/browse/MAHOUT-390
>
> @tommychheng
> Programmer and UC Irvine Graduate Student
> Find a great grad school based on research interests: 
> http://gradschoolnow.com
>
>
> On 6/2/10 8:27 PM, Drew Farris wrote:
>> Very strange:
>>
>> drew@skirnir:~/mahout/svn-trunk$ svn info
>> Path: .
>> URL: https://svn.apache.org/repos/asf/mahout/trunk
>> Repository Root: https://svn.apache.org/repos/asf
>> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
>> Revision: 950859
>> [...]
>> drew@skirnir:~/mahout/svn-trunk$ ./bin/mahout seqdirectory -i
>> ./work/reuters-out -o ./work/reuters-out-seqdir -c UTF-8
>> no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
>> [..]
>> drew@skirnir:~/mahout/svn-trunk$ ls ./work/reuters-out-seqdir
>> chunk-0
>>
>> To be absolutely certain nothing old is lurking in your target directories,
>> try 'mvn clean install' to rebuild and see if your results differ. If you
>> prefer, you can skip test execution with 'mvn clean install -DskipTests=true'.
>>
>> If that doesn't work, run 'mvn -v' and post the results; that might
>> provide some clues.
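
The update-and-rebuild steps suggested in this thread could be scripted roughly as follows. This is a sketch, assuming it is run from the root of the trunk checkout; the `run` and `rebuild` helpers and the DRY_RUN variable are hypothetical names introduced here, and by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: DRY_RUN=1 (the default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a single command, depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Update the checkout, rebuild without running tests, then report the Maven version.
rebuild() {
  run svn update &&
  run mvn clean install -DskipTests=true &&
  run mvn -v
}

rebuild
```

Chaining the steps with && stops at the first failure, so the Maven version is only reported after a successful build.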
>>
>> - Drew
>>
>> On Tue, Jun 1, 2010 at 9:39 PM, Tommy Chheng <tommy.chheng@gmail.com> wrote:
>>
>>> I updated from svn and ran mvn install, but I am still getting a
>>> command-line parsing error on the seqdirectory command.
>>> $svn info
>>> Path: .
>>> URL: http://svn.apache.org/repos/asf/mahout/trunk
>>> Repository Root: http://svn.apache.org/repos/asf
>>> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
>>> Revision: 950329
>>> Node Kind: directory
>>> Schedule: normal
>>> Last Changed Author: srowen
>>> Last Changed Rev: 950049
>>> Last Changed Date: 2010-06-01 05:55:49 -0700 (Tue, 01 Jun 2010)
>>>
>>>
>>> $./bin/mahout seqdirectory -i ./work/reuters-out/ -o
>>> ./work/reuters-out-seqdir -c UTF-8
>>> no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
>>> Exception in thread "main" org.apache.commons.cli2.OptionException: Unexpected -i while processing Options
>>>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:99)
>>>         at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:205)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>>>
>>> @tommychheng
>>> Programmer and UC Irvine Graduate Student
>>> Find a great grad school based on research interests:
>>> http://gradschoolnow.com
>>>
>>> On 6/1/10 12:43 PM, Grant Ingersoll wrote:
>>>
>>>> Can you try doing an SVN update and then "mvn install" and then run 
>>>> again?
>>>>
>>>> On May 31, 2010, at 12:28 PM, Tommy Chheng wrote:
>>>>
>>>>> Hi,
>>>>> I'm using the quickstart-kmeans.sh script from
>>>>> https://issues.apache.org/jira/browse/MAHOUT-390 to run the example
>>>>> kmeans. I'm on mahout trunk.
>>>>>
>>>>> It fails on the SequenceFile generation step:
>>>>> $./bin/mahout seqdirectory -i ./work/reuters-out/ -o
>>>>> ./work/reuters-out-seqdir -c UTF-8
>>>>> no HADOOP_CONF_DIR or HADOOP_HOME set, running locally
>>>>> Exception in thread "main" org.apache.commons.cli2.OptionException: Unexpected -i while processing Options
>>>>>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:99)
>>>>>         at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:205)
>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>>>>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>>>>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>>>>>
>>>>> Alternatively, I tried ./bin/mahout seqdirectory --input
>>>>> ./work/reuters-out/ -o ./work/reuters-out-seqdir -c UTF-8 but I get the
>>>>> same "Unexpected" error, this time for --input.
>>>>>
>>>>>
>>>>> -- 
>>>>>
>>>>> @tommychheng
>>>>> Programmer and UC Irvine Graduate Student
>>>>> Find a great grad school based on research interests:
>>>>> http://gradschoolnow.com
>>>>>
>>>>>
>

