hama-user mailing list archives

From Leonidas Fegaras <fega...@cse.uta.edu>
Subject Re: [ANNOUNCEMENT] A query system for BSP processing
Date Thu, 30 Aug 2012 14:40:58 GMT
Yes, sure. I have fixed the bug with the repeat stopping condition, but
I have only tested pagerank on my small cluster. I still need to fix
the k-means clustering (it's a special case because you improve a
fixed number of points).
Leonidas

On Aug 30, 2012, at 9:02 AM, Edward J. Yoon wrote:

> Shall we work together?
>
> On Fri, Aug 24, 2012 at 9:01 PM, Leonidas Fegaras  
> <fegaras@cse.uta.edu> wrote:
>> Thank you very much for your interest and for testing my system.
>> It seems that my release was premature: It worked for some random data
>> but didn't for some others. It's a minor logical error that I will try
>> to fix in the next few days. The problem is with the stopping condition
>> of the repeat expression that calculates the new pagerank from the old.
>> It must stop only when ALL peers reach the specified precision. This is
>> done by having those peers that need to continue send a message to the
>> others to continue. It seems that now when all peers agree at the same
>> time, the program works fine. But if one finishes sooner, instead of
>> continuing the repeat loop, it runs away to the next BSP step that
>> follows the repeat, then exits prematurely and the system hangs. The
>> casting errors are due to the run-away peers executing the wrong BSP
>> steps and reading the wrong messages. Queries without repeat though are
>> OK.
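
The stopping rule described above can be illustrated with a small
single-process simulation. This is only a sketch under assumptions: it is
not MRQL or Hama code, and the names run_repeat, needs_more and limits are
hypothetical. Each peer that has not reached the precision sends a
"continue" vote to every peer; after the barrier, a peer leaves the loop
only if its inbox is empty, so all peers leave in the same superstep.

```python
from typing import Callable, List


def run_repeat(num_peers: int,
               needs_more: Callable[[int, int], bool],
               max_steps: int = 100) -> List[int]:
    """Simulate a BSP repeat loop; return the superstep at which each peer exits.

    needs_more(peer, step) is a hypothetical callback that reports whether
    the peer's local result has NOT yet reached the requested precision.
    A peer that never exits within max_steps keeps the value -1.
    """
    exit_step = [-1] * num_peers
    for step in range(max_steps):
        # Before the barrier: every peer that is not done sends a
        # "continue" vote to every peer, including itself.
        inbox: List[List[int]] = [[] for _ in range(num_peers)]
        for peer in range(num_peers):
            if needs_more(peer, step):
                for target in range(num_peers):
                    inbox[target].append(peer)
        # After the barrier: all inboxes hold the same votes, so every
        # peer makes the same decision and none runs ahead of the others.
        for peer in range(num_peers):
            if not inbox[peer]:
                exit_step[peer] = step
        if all(s >= 0 for s in exit_step):
            break
    return exit_step


# Hypothetical example: peer i converges locally after limits[i] steps.
limits = [2, 5, 3]
print(run_repeat(3, lambda peer, step: step < limits[peer]))  # [5, 5, 5]
```

In this sketch, peers 0 and 2 converge early but keep looping until peer 1
is done, which is the behaviour the fix is meant to restore.
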
>> By the way, I had a problem exchanging large amounts of data during
>> sync (I discussed this with Thomas). My solution was to break a BSP
>> superstep into multiple substeps so that each substep can handle a max
>> number of messages. Of course my program has to collect all messages in
>> a vector in memory. When the vector is too big, it is spilled to a
>> local file. This moved the problem from the Hama side to my side and
>> allowed me to handle larger data, especially in joins. I think this
>> problem of exchanging large amounts of data during a superstep is
>> currently a weakness of Hama.
>> Leonidas
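
The substep idea in the message above might look roughly like the
following. It is a sketch under assumptions, not the MRQL implementation:
substeps, SpillingBuffer, max_per_substep and max_in_memory are hypothetical
names, and pickle plus a temporary file stand in for whatever serialization
and local storage the real system uses.

```python
import pickle
import tempfile
from typing import Any, Iterable, Iterator, List


def substeps(messages: List[Any], max_per_substep: int) -> Iterator[List[Any]]:
    """Split one superstep's outgoing messages into bounded batches."""
    for start in range(0, len(messages), max_per_substep):
        yield messages[start:start + max_per_substep]


class SpillingBuffer:
    """Collect received messages in memory; spill to a local temporary
    file whenever the in-memory list grows past max_in_memory."""

    def __init__(self, max_in_memory: int) -> None:
        self.max_in_memory = max_in_memory
        self.memory: List[Any] = []
        self.spill = tempfile.TemporaryFile()  # binary read/write file
        self.spilled = 0                       # number of messages on disk

    def add_all(self, batch: Iterable[Any]) -> None:
        for msg in batch:
            self.memory.append(msg)
            if len(self.memory) > self.max_in_memory:
                self._flush()

    def _flush(self) -> None:
        self.spill.seek(0, 2)  # always append at the end of the file
        for msg in self.memory:
            pickle.dump(msg, self.spill)
        self.spilled += len(self.memory)
        self.memory.clear()

    def __iter__(self) -> Iterator[Any]:
        # Replay spilled messages first, then those still in memory,
        # which preserves arrival order.
        self.spill.seek(0)
        for _ in range(self.spilled):
            yield pickle.load(self.spill)
        yield from self.memory


# Hypothetical usage: 10 messages sent in substeps of at most 4 each.
buf = SpillingBuffer(max_in_memory=3)
for batch in substeps(list(range(10)), max_per_substep=4):
    buf.add_all(batch)  # in a real BSP job, one sync per batch
print(buf.spilled, list(buf))
```

The design choice here matches the one described in the message: the
message-passing layer only ever sees bounded batches, and the cost of
holding the full data set is moved to the receiving program.
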
>>
>>
>>
>> On 08/24/2012 04:15 AM, Thomas Jungblut wrote:
>>>
>>> BTW, should we feature this on our website?
>>>
>>> 2012/8/24 Thomas Jungblut <thomas.jungblut@gmail.com>
>>>
>>>> Hi Leonidas!
>>>>
>>>> I have to admit that I knew what was going on (and had to keep
>>>> silent), but I have to say: Thank you very much!
>>>> This will help many people write BSPs in an easier way.
>>>>
>>>> Of course this is not as fast as native BSP code; Hive and Pig suffer
>>>> from the same problem in MR.
>>>> But it gives people the opportunity to develop faster and get their
>>>> code into production with just a minor time expense.
>>>>
>>>> And I think that we will gladly help you improve the BSP part of
>>>> your framework. At least I would ;)
>>>>
>>>> Thanks!
>>>>
>>>> 2012/8/24 Edward J. Yoon <edwardyoon@apache.org>
>>>>
>>>>> Here are a few of my test results on Oracle BDA (40G/s InfiniBand
>>>>> network).
>>>>>
>>>>> It seems slower than our PageRank example.
>>>>>
>>>>> P.S. There are some errors, so I couldn't test at large scale.
>>>>> (java.lang.ClassCastException: hadoop.mrql.MR_int cannot be cast to
>>>>> hadoop.mrql.Inv and java.lang.Error: Cannot clear a non-materialized
>>>>> sequence ..., etc.)
>>>>>
>>>>>
>>>>>
>>>>> == 100K nodes and 1M edges ==
>>>>>
>>>>> *** Using 10 BSP tasks (out of a max 10). Each task will handle  
>>>>> about
>>>>> 2383611 bytes of input data.
>>>>>
>>>>> Run time: 30.384 secs
>>>>>
>>>>> *** Using 20 BSP tasks (out of a max 20). Each task will handle  
>>>>> about
>>>>> 1191805 bytes of input data.
>>>>>
>>>>> Run time: 24.412 secs
>>>>>
>>>>> On Fri, Aug 24, 2012 at 9:36 AM, Edward J. Yoon <edwardyoon@apache.org>
>>>>> wrote:
>>>>>>
>>>>>> Wow, very interesting. I'm going to install and test on my large
>>>>>> cluster.
>>>>>>
>>>>>> On Fri, Aug 24, 2012 at 4:41 AM, Leonidas Fegaras <fegaras@cse.uta.edu>
>>>>>> wrote:
>>>>>>>
>>>>>>> Dear Hama users,
>>>>>>> I am pleased to announce that the MRQL query processing system can
>>>>>>> now evaluate SQL-like queries on a Hama cluster. MRQL is available
>>>>>>> at:
>>>>>>>
>>>>>>> http://lambda.uta.edu/mrql/
>>>>>>>
>>>>>>> MRQL (the Map-Reduce Query Language) is an SQL-like query language
>>>>>>> for large-scale, distributed data analysis. MRQL is powerful enough
>>>>>>> to express most common data analysis tasks over many different
>>>>>>> kinds of raw data, including hierarchical data and nested
>>>>>>> collections, such as XML data. MRQL can run in two modes: in MR
>>>>>>> (Map-Reduce) mode using Apache Hadoop and in BSP (Bulk Synchronous
>>>>>>> Parallel) mode using Apache Hama. Both modes use Apache's HDFS to
>>>>>>> read and write their data.
>>>>>>>
>>>>>>> Note that the BSP mode is currently experimental (not fine-tuned
>>>>>>> yet) and lacks any fault tolerance (if an error occurs, the entire
>>>>>>> job must be restarted). Due to our limited resources, MRQL has only
>>>>>>> been tested on a small cluster (7 nodes/28 cores). We compared the
>>>>>>> BSP mode with the MR mode by evaluating a pagerank query over a
>>>>>>> small graph (100K nodes, 1M edges) and found that BSP mode is about
>>>>>>> 4.5 times faster than the MR mode. Please let me know if you'd like
>>>>>>> to contribute to this project by testing MRQL on a larger cluster.
>>>>>>> Best regards,
>>>>>>> Leonidas Fegaras
>>>>>>> University of Texas at Arlington
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards, Edward J. Yoon
>>>>>> @eddieyoon
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>>>>
>>>>
>>> .
>>>
>>
>
>
>
> -- 
> Best Regards, Edward J. Yoon
> @eddieyoon

