8000 Choose only one IOPlugin implementation · Issue #4 · scijava/scijava-plugins-io-table · GitHub
[go: up one dir, main page]

Skip to content
This repository was archived by the owner on Sep 28, 2022. It is now read-only.

Choose only one IOPlugin implementation #4

Closed
ctrueden opened this issue Dec 12, 2018 · 9 comments
Closed

Choose only one IOPlugin implementation #4

ctrueden opened this issue Dec 12, 2018 · 9 comments
Assignees

Comments

@ctrueden
Copy link
Member
ctrueden commented Dec 12, 2018

Right now, we have both DefaultTableIOPlugin and CommonsCSVTableIOPlugin. The default one wins, because it has a higher priority. The commons CSV one is never used.

One of these plugins should be deleted. But which one? Let's make a decision. The commons CSV implementation is substantially shorter and hence probably more maintainable. But which one is faster? Is the commons-csv dependency acceptable? Is either one more correct than the other, covering more cases? Etc.

@stelfrich
Copy link
Member
stelfrich commented Dec 13, 2018

Based on the benchmarks below, I'd suggest to keep DefaultTableIOPlugin. Since most people will use it to write CSVs instead of reading, the performance gain is significant.

Benchmark                                                           Mode  Cnt    Score   Error  Units
DefaultTableIOPluginBenchmark.openLargeWithCommonsCSVTableIOPlugin  avgt   20   73.279 ± 0.744  ms/op
DefaultTableIOPluginBenchmark.openLargeWithDefaultTableIOPlugin     avgt   20  108.049 ± 2.475  ms/op
TableIOPluginsSaveBenchmark.writeLargeWithCommonsCSVTableIOPlugin   avgt   10  248.536 ± 8.554  ms/op
TableIOPluginsSaveBenchmark.writeLargeWithDefaultTableIOPlugin      avgt   10  169.832 ± 4.247  ms/op

To get the best of both worlds, we could use the reading implementation from CommonsCSVTableIOPlugin. What do you think @ctrueden?

@ctrueden
Copy link
Member Author

I don't have a strong opinion. Whatever you think is a good balance of performance and maintainability works for me. Especially since according to the POM of this project, you and @imagejan are in charge here, not me. 😜

@imagejan
Copy link
Member

I wonder whether CommonsCSVTableIOPlugin would work at all for the use cases of @PetrBainar in imagej-server, as it assumes source to be a file path:

try (FileReader fileReader = new FileReader(new File(source));
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);)

So we might want to stick with DefaultTableIOPlugin for now, or adapt the other one to use generic SciJava Locations as well.

@ctrueden
Copy link
Member Author

I just noticed that scijava-table also has org.scijava.table.TableLoader, which is yet another loader implementation. Should we delete the TableLoader? The only thing that uses it is the DefaultLUTService in imagej-common.

@stelfrich
Copy link
Member

So we might want to stick with DefaultTableIOPlugin for now, or adapt the other one to use generic SciJava Locations as well.

I just realized that my interpretation of the benchmark results was wrong. The CommonsCSV-based implementation is faster at reading large(ish) file. Before making any decision here, I'll take a look at the performance for smaller files (just to be on the safe side). If we decide to stick with the CommonsCSV implementation, we will have to look into migrating it to the Locations - thanks for noticing @imagejan!

I just noticed that scijava-table also has org.scijava.table.TableLoader, which is yet another loader implementation.

I can do some benchmarks for this implementation as well - for the sake of completeness. If we decide to remove it, we will have to add another dependency to imagej-common. Would you be OK with that, @ctrueden?

@ctrueden
Copy link
Member Author
ctrueden commented Jan 8, 2019

we will have to add another dependency to imagej-common.

Or! We could change the DefaultLUTService to use IOService and just punt if it can't read the thing as expected. This sort of dependency avoidance is one of the benefits of IOService, after all.

Regarding the benchmarks, I would advise not worrying too much about performance of small files. The scalability is what matters most here, in both time and space. I'll admit there is a point where (e.g. if you are looping reads of millions of tiny CSV files) horrible performance of small files could be an issue. But let's not prematurely optimize, either.

@imagejan
Copy link
Member
imagejan commented Jan 9, 2019

Thanks @stelfrich and @ctrueden for improving this component! I suggest to release scijava-plugins-io-table-0.2.0 soon and also add it to imagej/imagej as a runtime dependency, so that it gets included by default on the update site.

If there are no objections, I'll release a new version within the next few days.

@stelfrich
Copy link
Member
stelfrich commented Jan 9, 2019

@imagejan I haven't had time yet to look into @ctrueden's suggestion for DefaultLUTService, but plan to have a clearer picture by the end of the week. If we don't have to change API to make this work, I am fine with releasing a version..

@stelfrich
Copy link
Member

@imagejan From my perspective, we are good to release a version! 👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
0