manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Query in Sharepoint connector
Date Wed, 09 Jul 2014 18:51:15 GMT
Hi Ameya,

Can you include the Solr [INFO] log entry for one of these indexing
actions?  I want to see if last_modified is set incorrectly there.  If you
are running these documents through Solr Cell, it may well be Tika that is
providing the last_modified date, not ManifoldCF.

Karl



On Wed, Jul 9, 2014 at 2:37 PM, Ameya Aware <ameya.aware@gmail.com> wrote:

> Hi Karl,
>
> Please find below screenshot.
>
> [image: Inline image 1]
>
> looks like indexing date comes out to be good.
>
> Also, just for your reference pasting my Solr screenshot as well.
>
> [image: Inline image 2]
>
>
> Thanks,
> Ameya
>
>
>
> On Wed, Jul 9, 2014 at 2:20 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> On second look, it all looks fine.
>>
>> The last thing to check is to look at what is getting set as data.
>> Around line 1996 in SharePointRepository, there is this code:
>>
>>
>> >>>>>>
>>                 if (modifiedDate != null)
>>                   data.setModifiedDate(modifiedDate);
>>                 if (createdDate != null)
>>                   data.setCreatedDate(createdDate);
>> <<<<<<
>>
>> Can you add this line:
>>
>> >>>>>>
>>                System.out.println("Indexing modified date: "+modifiedDate);
>> <<<<<<
>>
>> ... and recrawl?
>>
>> If that works, we'll have to start looking at Solr.
>>
>> Thanks,
>> Karl
>>
>>
>>
>> On Wed, Jul 9, 2014 at 2:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Actually, looking at your screen shot, it is harder to see for sure
>>> since there are multiple threads active.  So it may well be that there is
>>> no issue with the parsing.  Let me see if I can confirm that.
>>>
>>> Karl
>>>
>>>
>>>
>>> On Wed, Jul 9, 2014 at 2:05 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> https://issues.apache.org/jira
>>>>
>>>>
>>>>
>>>> On Wed, Jul 9, 2014 at 2:04 PM, Ameya Aware <ameya.aware@gmail.com>
>>>> wrote:
>>>>
>>>>> how do i open the ticket?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Ameya
>>>>>
>>>>>
>>>>> On Wed, Jul 9, 2014 at 2:03 PM, Karl Wright <daddywri@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> bq. How is that coming good?
>>>>>>
>>>>>> No idea.  It may be a bug in the SimpleDateFormat class pertaining to
>>>>>> only specific dates.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 9, 2014 at 2:01 PM, Ameya Aware <ameya.aware@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Ok.
>>>>>>>
>>>>>>> But then the same thing should happen with the created date also,
>>>>>>> shouldn't it?
>>>>>>>
>>>>>>> How is that coming good?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ameya
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 9, 2014 at 1:56 PM, Karl Wright <daddywri@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> This shows clearly that the parsing is not doing the expected
>>>>>>>> thing.  It's not clear why, since it's a pretty straight usage of
>>>>>>>> SimpleDateFormat, but that is what is going wrong.
>>>>>>>>
>>>>>>>> Please open a ticket for us to look at this.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jul 9, 2014 at 1:53 PM, Ameya Aware <ameya.aware@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Karl,
>>>>>>>>>
>>>>>>>>> Please find screenshot below to show modified date values as date
>>>>>>>>> object and string as well.
>>>>>>>>>
>>>>>>>>> [image: Inline image 1]
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ameya
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jul 9, 2014 at 12:32 PM, Karl Wright <daddywri@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ameya,
>>>>>>>>>>
>>>>>>>>>> The ability to parse Microsoft's special 8601 dates is in fact
>>>>>>>>>> already there.  So what might be happening is a timezone issue, since the
>>>>>>>>>> timezone is not being explicitly set during parsing.  Printing the value of
>>>>>>>>>> modifiedDateValue will show us if that is indeed the problem.
>>>>>>>>>>
>>>>>>>>>> Karl
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 9, 2014 at 12:19 PM, Karl Wright <daddywri@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Ameya,
>>>>>>>>>>>
>>>>>>>>>>> Try printing "modifiedDateValue", in addition to printing
>>>>>>>>>>> "modifiedDate".  The parsed form is a date object, not a string.
>>>>>>>>>>>
>>>>>>>>>>> Karl
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 9, 2014 at 11:20 AM, Ameya Aware <ameya.aware@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ok.
>>>>>>>>>>>>
>>>>>>>>>>>> But created date for all files is coming good.
>>>>>>>>>>>> Also,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>                 if (modifyDate != null)
>>>>>>>>>>>>                 {
>>>>>>>>>>>>                   // Item has a modified date, so we presume it exists
>>>>>>>>>>>>                   Date modifiedDateValue = DateParser.parseISO8601Date(modifiedDate);
>>>>>>>>>>>>                   Date createdDateValue = DateParser.parseISO8601Date(createdDate);
>>>>>>>>>>>>
>>>>>>>>>>>>                   System.out.println("Modified date string is: '"+modifiedDate+"'");
>>>>>>>>>>>>                   System.out.println("Modify Date:" + modifyDate);
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> <<<<<<<<
>>>>>>>>>>>>
>>>>>>>>>>>> Above code prints out correct date even after parsing.
>>>>>>>>>>>> So is the issue coming after this step??
>>>>>>>>>>>>
>>>>>>>>>>>> I am using Sharepoint 2010.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Ameya
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jul 9, 2014 at 11:10 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Ok, well SharePoint claims these dates are ISO8601 dates, but
>>>>>>>>>>>>> they are clearly not in this case.  Here are the tests for 8601 dates in
>>>>>>>>>>>>> the MCF core code:
>>>>>>>>>>>>>
>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>     Date d = DateParser.parseISO8601Date("96-11-15T01:32:33.344GMT");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>>     d = DateParser.parseISO8601Date("2012-11-15T01:32:33.344Z");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>>     d = DateParser.parseISO8601Date("2012-11-15T01:32:33Z");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>>     d = DateParser.parseISO8601Date("2012-11-15T01:32:33+0100");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>>     d = DateParser.parseISO8601Date("2012-11-15T01:32:33-03:00");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>>     d = DateParser.parseISO8601Date("2012-11-15T01:32:33GMT-03:00");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>>     d = DateParser.parseISO8601Date("2012-11-15T01:32:33.001-04:00");
>>>>>>>>>>>>>     assertNotNull(d);
>>>>>>>>>>>>> <<<<<<
>>>>>>>>>>>>>
>>>>>>>>>>>>> You will note that there is supposed to be a "T" and a
>>>>>>>>>>>>> timezone in an ISO-8601 date.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What version of SharePoint are you using, and what are the
>>>>>>>>>>>>> locale settings for the server that your SharePoint is running on?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 11:06 AM, Ameya Aware <ameya.aware@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please find below screenshot for dates.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Modify Date i added on my own.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [image: Inline image 1]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Ameya
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 11:03 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Ameya,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The SharePoint connector parses the date.  Can you send me
>>>>>>>>>>>>>>> some EXAMPLES of the dates coming back so that I can be sure they will
>>>>>>>>>>>>>>> parse correctly?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 10:59 AM, Ameya Aware <ameya.aware@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I ran the job and at this point values coming for modified
>>>>>>>>>>>>>>>> date are correct.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ameya
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 10:25 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Ameya,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I will provide instructions for how I'd like you to
>>>>>>>>>>>>>>>>> research this.  I don't suggest running under eclipse for this research.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Around line 1047 in SharePointRepository.java, there is
>>>>>>>>>>>>>>>>> this code:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>                 String modifiedDate = values.get("Modified");
>>>>>>>>>>>>>>>>>                 String createdDate = values.get("Created");
>>>>>>>>>>>>>>>>>                 String guid = values.get("GUID");
>>>>>>>>>>>>>>>>>                 String modifyDate = values.get("Last_x0020_Modified");
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> <<<<<
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please add this line:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>                System.out.println("Modified date string is: '"+modifiedDate+"'");
>>>>>>>>>>>>>>>>> <<<<<
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please run the job and send me some examples of the
>>>>>>>>>>>>>>>>> modified date string.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 10:13 AM, Ameya Aware <ameya.aware@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am finding it difficult to debug the code. Whatever
>>>>>>>>>>>>>>>>>> changes I make, I just build from scratch and check whether the changes took effect.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can you help me with how I can debug this code? (I am
>>>>>>>>>>>>>>>>>> using the Eclipse IDE.)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Ameya
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 10:08 AM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Ameya,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Other users have had no trouble with this attribute in
>>>>>>>>>>>>>>>>>>> the past.  SharePoint, though, has been known to use non-ISO-8601-format
>>>>>>>>>>>>>>>>>>> dates in some cases.  I wonder if this is one of those cases?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In order to determine this, you will need to edit the
>>>>>>>>>>>>>>>>>>> code for the SharePoint connector and add debugging output.  Are you in a
>>>>>>>>>>>>>>>>>>> position to do that?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jul 9, 2014 at 9:51 AM, Ameya Aware <ameya.aware@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am seeing Shared documents from Sharepoint.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Ameya
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2014 at 5:52 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Ameya,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> There is no difference in treatment between created
>>>>>>>>>>>>>>>>>>>>> date and modified date that I can find in the connector.  Can you tell me
>>>>>>>>>>>>>>>>>>>>> what kind of SharePoint entity you are seeing this on?  Eg documents, list
>>>>>>>>>>>>>>>>>>>>> items, attachments?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Sent from my Windows Phone
>>>>>>>>>>>>>>>>>>>>> ------------------------------
>>>>>>>>>>>>>>>>>>>>> From: Ameya Aware
>>>>>>>>>>>>>>>>>>>>> Sent: 7/8/2014 3:41 PM
>>>>>>>>>>>>>>>>>>>>> To: Karl Wright
>>>>>>>>>>>>>>>>>>>>> Subject: Re: Query in Sharepoint connector
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I did not get you properly. Please see below if it
>>>>>>>>>>>>>>>>>>>>> satisfies your query.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Last modified is a date field in Sharepoint. When I run the
>>>>>>>>>>>>>>>>>>>>> job and send metadata to Solr, the date sent to Solr is far
>>>>>>>>>>>>>>>>>>>>> different from the one in Sharepoint.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please let me know if you need any more details.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Ameya
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2014 at 3:35 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> What does this field look like in SharePoint?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Sent from my Windows Phone
>>>>>>>>>>>>>>>>>>>>>> From: Ameya Aware
>>>>>>>>>>>>>>>>>>>>>> Sent: 7/8/2014 1:50 PM
>>>>>>>>>>>>>>>>>>>>>> To: dev@manifoldcf.apache.org
>>>>>>>>>>>>>>>>>>>>>> Subject: Query in Sharepoint connector
>>>>>>>>>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Last_modified metadata sent from Sharepoint to Solr
>>>>>>>>>>>>>>>>>>>>>> is not giving correct values.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Even last_modified is showing a lesser value than the
>>>>>>>>>>>>>>>>>>>>>> created_by date. (Created_by date is coming good.)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Is this a bug?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Ameya
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
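
Editor's sketch (not part of the original thread): Karl's hypothesis that "the timezone is not being explicitly set during parsing" can be illustrated with plain java.text.SimpleDateFormat. This is only a minimal sketch under stated assumptions. It is not ManifoldCF's DateParser, the class and method names (DateZoneSketch, parseIn) are hypothetical, and the sample string is invented, since the actual SharePoint values appeared only in screenshots that the archive did not preserve. It shows that a timestamp with no zone designator maps to different instants depending on which zone the formatter assumes.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration class; not part of ManifoldCF.
class DateZoneSketch {

    // Parse a zone-less timestamp, interpreting it in the given time zone.
    // Without the setTimeZone call, SimpleDateFormat would silently fall
    // back to the JVM's default zone.
    static Date parseIn(String text, String zoneId) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        fmt.setLenient(false);
        try {
            return fmt.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Invented sample value with no "T" and no zone designator.
        String raw = "2014-07-09 14:37:00";

        Date asUtc = parseIn(raw, "UTC");
        Date asNewYork = parseIn(raw, "America/New_York");

        // Same text, two different instants: the gap is the zone offset.
        long diffHours =
            (asNewYork.getTime() - asUtc.getTime()) / (60L * 60L * 1000L);
        System.out.println("Offset between interpretations (hours): " + diffHours);
    }
}
```

If the crawler and the indexing side assume different zones for such a string, the stored date shifts by the offset. That would account for a shift of a few hours, so it may or may not explain a value that is "far different".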
