lucene-solr-user mailing list archives

From Joel Bernstein <joels...@gmail.com>
Subject Re: Solr streaming innerJoin doesn't return rows
Date Thu, 02 Nov 2017 18:49:06 GMT
The joins are MapReduce joins which require shuffling of entire result
sets. This means you need to use the /export handler to make them work.
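
As a minimal sketch of what that looks like in practice, the Python below only assembles the expression text; it does not send anything to Solr. In a search() expression the handler is selected with qt="/export". Without it, search() reads from the default /select handler, which returns only the first page of results, so the two sides of a join can fail to overlap even when matching documents exist. The helper names, the reduced field lists, the q="*:*" on both sides, and the localhost URL are assumptions made for illustration; the collection and field names come from the quoted message. Note that /export requires docValues on every field named in fl and sort, and whether these fields have them is not known from the thread.

```python
from urllib.parse import urlencode


def export_search(collection, q, fl, sort):
    """Build a search() expression that reads the full sorted result set via /export."""
    return (
        f'search({collection}, q="{q}", fl="{fl}", '
        f'sort="{sort}", qt="/export")'
    )


def inner_join(left, right, on):
    """Build an innerJoin(); both inputs must already be sorted on the join key."""
    return f'innerJoin({left}, {right}, on="{on}")'


# Field lists are shortened here for readability (assumption).
material = export_search(
    "sial-catalog-material-171030", "*:*",
    "id_record_spec,display_en_name", "id_record_spec asc",
)
product = export_search(
    "test-catalog-product-170724", "*:*",
    "id_record_spec,id_s", "id_record_spec asc",
)
expr = inner_join(material, product, on="id_record_spec")

# Hypothetical host and port; the /stream handler takes the expression in `expr`.
url = (
    "http://localhost:8983/solr/sial-catalog-material-171030/stream?"
    + urlencode({"expr": expr})
)
print(expr)
```

The printed expression can be pasted into the Stream panel of the Admin Console for testing.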

The joins in general are designed to be done in parallel on large clusters.
You won't be able to get good performance with large joins on a single node
or even a small cluster.
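
A parallel join of this kind can be sketched as follows, again only building the expression text. Each search() adds partitionKeys on the join key so that matching tuples are shuffled to the same worker, and parallel() wraps the join with a worker count and the sort used to merge worker output. The worker collection name "workers", the worker count of 4, the shortened field lists, and the helper names are all assumptions for illustration.

```python
def partitioned_search(collection, q, fl, key):
    """search() over /export, sorted and partitioned on the join key."""
    return (
        f'search({collection}, q="{q}", fl="{fl}", sort="{key} asc", '
        f'qt="/export", partitionKeys="{key}")'
    )


def parallel_inner_join(worker_collection, left, right, key, workers):
    """Wrap an innerJoin in parallel() so each worker joins one partition."""
    join = f'innerJoin({left}, {right}, on="{key}")'
    return (
        f'parallel({worker_collection}, {join}, '
        f'workers="{workers}", sort="{key} asc")'
    )


KEY = "id_record_spec"
left = partitioned_search(
    "sial-catalog-material-171030", "*:*", f"{KEY},display_en_name", KEY
)
right = partitioned_search(
    "test-catalog-product-170724", "*:*", f"{KEY},id_s", KEY
)
# "workers" is a hypothetical collection whose nodes run the join.
expr = parallel_inner_join("workers", left, right, KEY, 4)
print(expr)
```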

So you'll really need to think about how the joins are designed and whether
they fit your use case.

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Nov 2, 2017 at 2:25 PM, Webster Homer <webster.homer@sial.com>
wrote:

> I'm using Solr 6.2.0. I am trying to understand how the streaming api
> works.
>
> In 6.2, simple expressions seem to behave well, but I am having a problem
> making the joins work. I don't see errors, but I don't see data either.
>
> Using the Solr Admin Console for testing, this query works:
> search(test-catalog-product-170724,
>   defType="edismax", q="T1503SIGMA", qf="id_record_spec", mm="2<-12%",
>   fl="id_record_spec,id_s",
>   sort="id_record_spec asc")
>
> As does this:
> search(sial-catalog-material-171030,
>   defType="edismax", q="T1503SIGMA", qf="id_record_spec",
>   fl="id_record_spec,stream_en_s_pri_name,display_cas_number,display_package_size,key_erp_material_number,display_material_qty,display_formula_weight,display_material_uom,key_brand,display_en_name",
>   sort="id_record_spec asc")
>
> And this works:
> innerJoin(
>   search(sial-catalog-material-171030,
>     defType="edismax", q="T1503SIGMA", qf="id_record_spec",
>     fl="id_record_spec,stream_en_s_pri_name,display_cas_number,display_package_size,key_erp_material_number,display_material_qty,display_formula_weight,display_material_uom,key_brand,display_en_name",
>     sort="id_record_spec asc"),
>   search(test-catalog-product-170724,
>     defType="edismax", q="T1503SIGMA", qf="id_record_spec", mm="2<-12%",
>     fl="id_record_spec,id_s",
>     sort="id_record_spec asc"),
>   on="id_record_spec"
> )
>
> This one doesn't throw an error, but it also doesn't return anything:
> innerJoin(
>   search(sial-catalog-material-171030, q=*:*,
>     fl="id_record_spec,stream_en_s_pri_name,display_cas_number,display_package_size,key_erp_material_number,display_material_qty,display_formula_weight,display_material_uom,key_brand,display_en_name",
>     sort="id_record_spec asc"),
>   search(test-catalog-product-170724,
>     defType="edismax", q="T1503SIGMA", qf="id_record_spec", mm="2<-12%",
>     fl="id_record_spec,id_s",
>     sort="id_record_spec asc"),
>   on="id_record_spec"
> )
>
> Do we have to explicitly provide the same query to both searches in the
> join? I see examples in the documentation that look like my last join.
>
> I also see the same behavior with this:
> hashJoin(
>   search(test-catalog-product-170724,
>     defType="edismax", q="T1503SIGMA", qf="id_record_spec", mm="2<-12%",
>     fl="id_record_spec,id_s",
>     sort="id_record_spec asc"),
>   hashed=search(sial-catalog-material-171030, q=*:*,
>     fl="id_record_spec,stream_en_s_pri_name,display_cas_number,display_package_size,key_erp_material_number,display_material_qty,display_formula_weight,display_material_uom,key_brand,display_en_name",
>     sort="id_record_spec asc"),
>   on="id_record_spec"
> )
>
> No errors, but no data either. There is data, so what am I doing wrong? I
> suspect some user error but am at a loss to understand what it is.
>
> Thanks
