manifoldcf-user mailing list archives

From Michael Cizmar <michael.ciz...@mcplusa.com>
Subject Re: URL Mapping
Date Thu, 28 May 2020 16:47:27 GMT
The "!ut" followed by a bunch of session information comes from WebSphere Portal.  Some information
about it here:
https://books.google.com/books?id=bqAXnpmj5LwC&pg=PA180&lpg=PA180&dq=%22!ut%22+session+variables+websphere#v=onepage&q=%22!ut%22%20session%20variables%20websphere&f=false

I'll look at making a change to the web crawler to support this, like the BV and ASP.NET session handling.

________________________________
From: Karl Wright <daddywri@gmail.com>
Sent: Thursday, May 28, 2020 11:41 AM
To: user@manifoldcf.apache.org <user@manifoldcf.apache.org>
Subject: Re: URL Mapping

Hi,

There are provisions in the URL canonicalization part of the world for removal of session
information from the URL.  It only knows about some kinds of widely used sessions: Java app
server sessions, for example, Broadvision sessions, etc.  If you can convince me that your
session information is (a) uniquely identifiable, and (b) commonly used, the proper approach
is to incorporate session removal in this framework.  Please let me know.
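A minimal sketch of what such session removal could look like for this case, written in Python for illustration only. It is not ManifoldCF's canonicalization code. The function name `strip_portal_state` and the choice of "/!ut/" as the identifying marker are assumptions based on the sample URL in this thread.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the portal state always starts at the path segment "!ut".
PORTAL_STATE_MARKER = "/!ut/"


def strip_portal_state(url: str) -> str:
    """Return the URL with everything from the '!ut' path segment onward removed.

    URLs that do not contain the marker are returned unchanged.
    """
    parts = urlsplit(url)
    idx = parts.path.find(PORTAL_STATE_MARKER)
    if idx == -1:
        return url
    # Keep the path up to and including the slash that precedes "!ut".
    return urlunsplit(parts._replace(path=parts.path[: idx + 1]))
```

Only the path is touched, so the scheme, host, port, and any query string are carried over as they were.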

Karl


On Thu, May 28, 2020 at 12:11 PM Michael Cizmar <michael.cizmar@mcplusa.com> wrote:
I've got a really long URL with a bunch of unnecessary session query string parameters.  I've
been trying unsuccessfully to map it to the same URL without the session.

An example of the URL is below.  I thought I could do this:

url map regular expression:

(.*)\/!ut

replacement configuration:


[cid:1725c275aadcb971f161]

So the goal would be for the URL to become:
http://localhost:8080/mcplusa/myportal/agents/portal/quoteenroll/digs%20-%20quoting%20%20enrollment%20(individual)/

But the url gets rejected.

Sample Crawl Url

http://localhost:8080/mcplusa/myportal/agents/portal/quoteenroll/digs%20-%20quoting%20%20enrollment%20(individual)/!ut/p/a1/rZHLTsMwEEV_hS6yjDx5OWZpdRFImzYCAYk3lZM6D5TYSWoqPh8HFu2GQhHejEeae-aOLmIoQ0zyY1tz3SrJu7lneLfdBtTxI1iRhzsMFEfrpZ_6AFFoBnIzAN88Cj_pXxBDrJR60A3KeS2kvimV1KZaMKhJ886C8U1pIeSkOtNM3Pz5QewO3IJG9WIGDGW7RzkB7hZFIWxyyx3bL8LAJo6L7QoELitMPAH7r4WXLefmpvBkOoqfiTHth6vYTRxIAT1eufMy8D74Z2DqXg2Mf5Fz-zqOjJq05nzeNcr-FpchuVOyTGpjkOvGbmWlUHYmQtmZCGWfoqF_6omHq83G5gUBL-iOa0oXiw9FOxLu/dl5/d5/L0lJS2FZcHBpbW1LYVlwcGltbVlwcGchIS9vSHd3QUFBSXdpRUFJSkRBQ1VZaUVJVTVCZ09DbFFBQUlBQVNvU0FyUnFBQURBQWF0QXdMTzlRQUFFQUJ3WWVBR0tTQUFDa0k1Z21HU3dTaXJTQUFDZ0s5ZzBIUS80SmlHcGhxRWFoR29ScUVhbEdwaC9aNl9PTzVBMTRHMEs4Ukg2MEE2R0xDNFA0MDBHNy9hZ2VudCBjb250ZW50JTBwb3J0YWwlMHF1b3RlZW5yb2xsJTBkaWdzIC0gcXVvdGluZyAgZW5yb2xsbWVudCAoaW5kaXZpZHVhbCkvZjQ0YmEyOWUtODQwOC00YjFlLTg4MzktMTFlMjI4NDgxYTVhL2RpZ3MgLSBxdW90aW5nICBlbnJvbGxtZW50IChpbmRpdmlkdWFsKQ
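As a rough illustration of how the expression `(.*)\/!ut` behaves in a generic regex engine, the sketch below uses Python's `re` module on a shortened stand-in for the sample URL. How ManifoldCF itself applies a URL map expression (whole-URL match versus substring match) is not established in this thread, and Python's `\1` replacement syntax is not the mapping syntax used by ManifoldCF, so treat this only as a way to reason about the pattern.

```python
import re

# Shortened, hypothetical stand-in for the sample crawl URL.
url = "http://localhost:8080/portal/page/!ut/p/a1/abc"

# The pattern from the message stops matching right after "!ut",
# so a substitution leaves the rest of the session data in place.
partial = re.sub(r"(.*)/!ut", r"\1/", url)

# Anchoring the pattern and consuming the tail replaces the whole URL.
full = re.sub(r"^(.*?)/!ut/.*$", r"\1/", url)

print(partial)
print(full)
```

The second pattern differs in two ways: it is anchored at both ends, and the trailing `.*` consumes everything after "!ut", so nothing of the session data survives the replacement.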
