ctakes-dev mailing list archives

From Karthik Sarma <ksa...@ksarma.com>
Subject Re: files vs strings in collection reader
Date Tue, 07 May 2013 19:25:05 GMT
Hmm, without having actually reviewed the code in cTAKES (I'm not on my
work computer), my understanding of the "correct" way of doing this is to
use the listFiles method on the directory File to get an array of Files;
this should be implemented natively by the JVM and could be faster than
individual initialization.
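
A minimal sketch of the listFiles approach described above. The class and method names (DirectoryListing, listAll) are hypothetical and are not taken from the cTAKES code; the only real API used is java.io.File.listFiles(), which returns null when the path is not a directory or cannot be read.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only, not part of cTAKES.
final class DirectoryListing {

    // Returns every entry directly inside dir in one call (no recursion).
    // The order of the entries is not guaranteed by the JDK.
    static List<File> listAll(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            // listFiles() signals "not a directory" or an I/O error with null
            throw new IllegalArgumentException("Not a readable directory: " + dir);
        }
        return Arrays.asList(entries);
    }
}
```

Whether this is actually faster than the existing initialization for ~50k files would need to be measured; the sketch only shows the call pattern.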

Karthik Sarma
UCLA Medical Scientist Training Program Class of 20??
Member, UCLA Medical Imaging & Informatics Lab
Member, CA Delegation to the House of Delegates of the American Medical
gchat: ksarma@gmail.com
linkedin: www.linkedin.com/in/ksarma

On Tue, May 7, 2013 at 12:17 PM, Tim Miller <
timothy.miller@childrens.harvard.edu> wrote:

> The FilesInDirectoryCollectionReader creates an arraylist of
> java.io.File objects when it is initialized. For large datasets (~50k
> files) this is substantial time overhead and probably memory as well. Seems
> like it would be more efficient to use Strings instead of Files there and
> just open the File object when getNext() is called. It is pretty easy to
> implement, any downside to making this switch?
> Tim
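
For comparison, a rough sketch of the String-based alternative proposed in the quoted message: keep only the file names and build each File when it is requested. LazyFileIterator is a hypothetical stand-in for the reader's getNext() logic, not the actual FilesInDirectoryCollectionReader; it relies only on java.io.File.list(), which returns the entry names as a String[] (or null on failure).

```java
import java.io.File;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only, not the cTAKES collection reader.
final class LazyFileIterator implements Iterator<File> {
    private final File directory;
    private final String[] names; // only names are held between calls
    private int index = 0;

    LazyFileIterator(File directory) {
        String[] listed = directory.list();
        if (listed == null) {
            // list() returns null for a non-directory or on an I/O error
            throw new IllegalArgumentException("Not a readable directory: " + directory);
        }
        this.directory = directory;
        this.names = listed;
    }

    @Override
    public boolean hasNext() {
        return index < names.length;
    }

    @Override
    public File next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // The File object is created here, when it is requested
        return new File(directory, names[index++]);
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}
```

This defers File construction to the point of use; any time or memory savings over holding a list of File objects are an assumption that would have to be confirmed by profiling.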
