incubator-droids-dev mailing list archives

From "Thorsten Scherler (JIRA)" <>
Subject [jira] Commented: (DROIDS-54) Make LinkTask supports arbitrary data by extends HashMap, and consider to refactor Task, Link, and LinkTask
Date Thu, 18 Jun 2009 10:49:07 GMT


Thorsten Scherler commented on DROIDS-54:

I agree with simplifying Link, Task and LinkTask, since the extra information that we need for a task
can now be stored in the map.

> Make LinkTask supports arbitrary data by extends HashMap, and consider to refactor Task,
Link, and LinkTask
> -----------------------------------------------------------------------------------------------------------
>                 Key: DROIDS-54
>                 URL:
>             Project: Droids
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 0.01
>            Reporter: Mingfai Ma
> refer to the initial idea at:
> The current implementation of LinkTask
> {code}
> public class LinkTask implements Link, Serializable
> {
>   private Date started;
>   private final int depth;
>   private final URI uri;
>   private final Link from;
>   private Date lastModifedDate;
>   private Collection<URI> linksTo;
>   private String anchorText;
>   private int weight;
>   // ... remaining members omitted
> }
> {code}
> Suggested change:
> {code}
> public class LinkTask extends HashMap<String, Serializable> 
> or
> public class LinkTask extends HashMap<String, Serializable> implements Link
> {code}
> The minimum required attributes are:
>  - final ? id,
>    - mainly to have a minimum-size value to use as the hash key and to store in memory or a data grid for
lookup, e.g. as a history to avoid duplicate fetching. Refer to DROIDS-53.
>  - final String url
>    - the original String representation of the URL (preferred), or a representation
with the encoded string (which does not seem good).
>    - the url is the original one provided by the user at construction. Two different urls may
refer to the same url, e.g. and, it's up to the
user to decide whether they should be normalized. (and they could use the URL/LinkNormalizer in
> The other fields are basically optional.
>   - started/taskDate: if the queue uses it for sorting, then it's useful; otherwise, it's
just for logging.
>   - "weight" is another example of a field that not all implementations may need.
>   - "linksTo", a.k.a. outLinks, is also optional to attach to the LinkTask. An implementation
may extract the outlinks and put them in the queue directly without storing the outlinks in the
>   - "from", a.k.a. referrer, should not store the Link reference, as it will affect GC.
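The attribute list above could be sketched roughly as follows. This is only an illustration of the proposal, not code from Droids: the class name MapLinkTask, the key constants, the String type chosen for id (the issue leaves the type open as "?"), and the helper methods are all assumptions made for the example.

```java
import java.io.Serializable;
import java.util.HashMap;

/**
 * Hypothetical map-backed task: only id and url are mandatory,
 * every other attribute is an optional map entry.
 */
class MapLinkTask extends HashMap<String, Serializable> {
    private static final long serialVersionUID = 1L;

    // Key names are illustrative assumptions, not taken from Droids.
    static final String DEPTH = "depth";
    static final String WEIGHT = "weight";
    static final String FROM = "from";

    private final String id;   // compact key for history lookups (type is an assumption)
    private final String url;  // original string exactly as supplied by the caller

    MapLinkTask(String id, String url) {
        this.id = id;
        this.url = url;
    }

    String getId() { return id; }

    String getUrl() { return url; }

    /** Records the referrer as a plain URL string instead of an object reference. */
    MapLinkTask from(MapLinkTask referrer) {
        put(FROM, referrer.getUrl());
        return this;
    }

    /** Reads an optional integer attribute, falling back when the key is absent. */
    int getInt(String key, int fallback) {
        Serializable value = get(key);
        return (value instanceof Integer) ? (Integer) value : fallback;
    }
}
```

Storing the referrer as a string means a task does not keep its parent task reachable, which is the GC concern raised in the last bullet. One caveat of this design: HashMap.equals compares map entries only, so id and url would not take part in equality unless equals and hashCode are overridden.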

> btw, should we also simplify Link, Task and LinkTask?  if we use a Map, it's very generic
already. Link and Task could be different concepts if we need to use them separately.
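The id bullet in the quoted issue mentions using a small key as a fetch history to avoid duplicate fetching. A minimal sketch of that idea follows. The issue does not say how the id is produced, so deriving it as a SHA-1 hex digest of the URL string is purely an assumption here, as are the FetchHistory class and its method names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory history keyed by a fixed-length task id. */
class FetchHistory {
    private final Set<String> seenIds = new HashSet<>();

    /** Derives a 40-character hex id from a URL string (assumption: SHA-1). */
    static String idFor(String url) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(url.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so each byte prints as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Returns true the first time an id is seen and false for repeats. */
    boolean markIfNew(String id) {
        return seenIds.add(id);
    }
}
```

The sketch does no normalization, so two spellings of the same address produce different ids. That matches the issue's point that normalizing is left to the user before the task is constructed.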

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
