I was really confused by the output I got from a Files.list stream:
16:25:29.032 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.031 [main] INFO StressTest - compute from parallel file with collection 16:25:29.032 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - 
compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.038 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection 16:25:29.032 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel file with collection 16:25:29.039 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel file with collection 16:25:29.037 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.037 [main] INFO StressTest - compute from parallel file with collection 16:25:29.039 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection 16:25:29.039 [main] INFO StressTest - parallel compute for file names with collect 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.042 [main] INFO StressTest - compute from parallel string with collection 16:25:29.043 
[main] INFO StressTest - compute from parallel string with collection 16:25:29.043 [main] INFO StressTest - compute from parallel string with collection 16:25:29.043 [main] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel string with collection 16:25:29.042 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel string 
with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel string with collection 16:25:29.043 [main] INFO StressTest - compute from parallel string with collection 16:25:29.043 [main] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection 16:25:29.043 [main] INFO StressTest - parallel compute for file names 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.045 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.048 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 
16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.049 [main] INFO StressTest - compute from parallel string 16:25:29.050 [main] INFO StressTest - compute from parallel string 16:25:29.050 [main] INFO StressTest - compute from parallel string 16:25:29.050 [main] INFO StressTest - compute from parallel string 16:25:29.050 [main] INFO StressTest - parallel compute for files 16:25:29.051 [main] INFO StressTest - compute from parallel file 16:25:29.051 [main] INFO StressTest - compute from parallel file 16:25:29.051 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.052 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO 
StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.053 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file 16:25:29.054 [main] INFO StressTest - compute from parallel file Process finished with exit code 0
This output corresponds to the following code:
// 1. Parallel Files.list, collected to a List, then parallelStream() on the List
try {
    Files.list(Paths.get("...."))
        .parallel()
        .filter(Files::isRegularFile)
        .map(Path::toFile)
        .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
        .collect(Collectors.toList())
        .parallelStream()
        .forEach(s -> {
            log.info("compute from parallel file with collection");
        });
} catch (IOException e) {
    e.printStackTrace();
}

log.info("parallel compute for file names with collect");
// 2. Same as 1, but mapped to file names before collecting
try {
    Files.list(Paths.get("...."))
        .parallel()
        .filter(Files::isRegularFile)
        .map(Path::toFile)
        .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
        .map(file -> file.getName())
        .collect(Collectors.toList())
        .parallelStream()
        .forEach(s -> {
            log.info("compute from parallel string with collection");
        });
} catch (IOException e) {
    e.printStackTrace();
}

log.info("parallel compute for file names");
// 3. Parallel Files.list mapped to file names, no intermediate collect
try {
    Files.list(Paths.get("...."))
        .parallel()
        .filter(Files::isRegularFile)
        .map(Path::toFile)
        .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
        .map(file -> file.getName())
        .forEach(s -> {
            log.info("compute from parallel string");
        });
} catch (IOException e) {
    e.printStackTrace();
}

log.info("parallel compute for files");
// 4. Parallel Files.list on File objects, no intermediate collect
try {
    Files.list(Paths.get("...."))
        .parallel()
        .filter(Files::isRegularFile)
        .map(Path::toFile)
        .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
        .forEach(s -> {
            log.info("compute from parallel file");
        });
} catch (IOException e) {
    e.printStackTrace();
}
So calling parallel() directly on the Files.list stream results in a single thread processing all the files (~50 files).
Only when the results are first collected into a list and a new parallel stream is started from that list does the work get split across the common pool.
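The collect-first workaround can be pulled into a small helper. This is only a minimal sketch: the class name ListThenParallel, the method listMatching, and the "model-" prefix are hypothetical and not part of the original code, and the try-with-resources is my addition because the Files.list documentation asks for the stream to be closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ListThenParallel {
    // Hypothetical helper: read the directory listing sequentially into a List,
    // so that a later parallelStream() starts from a source with a known size.
    static List<Path> listMatching(Path dir, String prefix) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().startsWith(prefix))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Throwaway directory with a few files, just for the demonstration.
        Path dir = Files.createTempDirectory("list-demo");
        for (int k = 0; k < 8; k++) {
            Files.createFile(dir.resolve("model-" + k + ".txt"));
        }
        Files.createFile(dir.resolve("other.txt"));

        // The parallel part runs over the List, not over the Files.list stream.
        listMatching(dir, "model-").parallelStream()
                .forEach(p -> System.out.println(
                        Thread.currentThread().getName() + " " + p.getFileName()));
    }
}
```

Which threads pick up which files will vary from run to run, so the thread names printed by main are not something to rely on.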
After a lot of investigation and research, it turns out the JDK has a not very well crafted implementation of parallel splitting for this kind of stream:
source: http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-July/034539.html
So basically, when the stream's Spliterator is split, the iterator-backed part keeps reporting an estimated size of Long.MAX_VALUE:
                  IteratorSpliterator (est. MAX_VALUE elements)
                      |                    |
ArraySpliterator (est. 1024 elements)   IteratorSpliterator (est. MAX_VALUE elements)
                                           |        |
                           /---------------/        |
                           |                        |
ArraySpliterator (est. 2048 elements)   IteratorSpliterator (est. MAX_VALUE elements)
                                           |        |
                           /---------------/        |
                           |                        |
ArraySpliterator (est. 3072 elements)   IteratorSpliterator (est. MAX_VALUE elements)
                                           |        |
                           /---------------/        |
                           |                        |
ArraySpliterator (est. 856 elements)    IteratorSpliterator (est. MAX_VALUE elements)
                                                    |
                                        (split returns null: refuses to split anymore)
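The MAX_VALUE estimate matters because the stream framework appears to decide how far to split by comparing each piece's estimated size against a threshold derived from the root estimate. The sketch below is an assumption on my part: it mirrors my reading of the internal AbstractTask.suggestTargetSize in JDK 8 (estimate divided by four times the common pool parallelism), which is not public API and may differ between versions. The names LeafThresholdSketch and suggestedLeafSize are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;

class LeafThresholdSketch {
    // Assumption: approximates the JDK 8 internal rule
    // "target leaf size = size estimate / (parallelism * 4), at least 1".
    static long suggestedLeafSize(long sizeEstimate, int parallelism) {
        long leafTarget = (long) Math.max(1, parallelism) << 2; // parallelism * 4
        long est = sizeEstimate / leafTarget;
        return est > 0L ? est : 1L;
    }

    public static void main(String[] args) {
        int parallelism = ForkJoinPool.getCommonPoolParallelism();

        // Root reports Long.MAX_VALUE: the threshold is astronomically large,
        // so a batch of ~50 elements is already "small enough" and stays whole.
        System.out.println(suggestedLeafSize(Long.MAX_VALUE, parallelism));

        // Root reports its real size (for example a List of 50 elements):
        // the threshold is small, so the source keeps getting split.
        System.out.println(suggestedLeafSize(50, parallelism));
    }
}
```

With three workers, as in the log above, a real size of 50 gives a leaf size of 4 under this rule, while Long.MAX_VALUE gives a threshold far larger than any directory listing. If this reading is right, it would explain why the collected list fans out across the pool while the Files.list stream does not.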
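Each ArraySpliterator batch is filled by this loop in IteratorSpliterator.trySplit, which keeps pulling from the iterator until n elements have been read or the iterator runs out:
do { a[j] = i.next(); } while (++j < n && i.hasNext());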
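The batching can be observed from the outside with the public Spliterators.spliteratorUnknownSize factory, which wraps a plain iterator the same way. A minimal sketch, where BatchSizeDemo and batchSizes are hypothetical names and the 7000-element source is chosen only because it matches the sizes in the diagram:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;

class BatchSizeDemo {
    // Split an unknown-size, iterator-backed spliterator until it refuses,
    // recording the size of each array batch that gets split off.
    static List<Long> batchSizes(int elementCount) {
        Iterator<Integer> source = IntStream.range(0, elementCount).iterator();
        Spliterator<Integer> rest = Spliterators.spliteratorUnknownSize(source, 0);
        List<Long> sizes = new ArrayList<>();
        Spliterator<Integer> prefix;
        while ((prefix = rest.trySplit()) != null) {
            sizes.add(prefix.estimateSize());
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(batchSizes(7000)); // batches as in the diagram
        System.out.println(batchSizes(50));   // a small directory: one batch
    }
}
```

On the JDKs I have looked at, 7000 elements come out as batches of 1024, 2048, 3072 and 856, and 50 elements come out as a single batch of 50. That single batch is the whole directory listing in the ~50 files case, which fits with everything landing on one thread.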