git blame only gives the last commit that touched each line. However, there are times when what's really needed is the list of commits behind one particular line of code, and git log works out here:
git log -S"private List<LocalDate> ...;" src/main/java/...java
It doesn't give an exact preview of the merge result. However, it does come close, and most importantly it shows any new commits on the other branch made since the current branch and the other branch diverged:
git diff ...the_other_branch
The modern Spring Framework is now able to do dependency injection for properties declared on an abstract class.
For example:
public abstract class AbstractCacheReader<T extends AbstractMessage> implements ModelCacheReader<T> {

    T models;

    @Value("${....cache.deployment.directory}")
    protected String CACHE_DEPLOYMENT_DIRECTORY;

    ....
}
with the concrete class
@Component
@Slf4j
@Setter
public class ABCCacheReader extends AbstractCacheReader<ABC.ABCModels> implements ModelCacheReader<ABC.ABCModels> {

    public ABCCacheReader() {
        ..
        models = ABC.ABCModels.newBuilder().build();
    }

    ....
}
the CACHE_DEPLOYMENT_DIRECTORY field is properly wired onto the ABCCacheReader bean.
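For reference, here is a minimal self-contained sketch of the same pattern; the class names and the property key are hypothetical, not taken from the code above:

import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on newer Spring
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

// The abstract superclass declares the injected property.
abstract class AbstractDirectoryReader {

    @Value("${app.cache.deployment.directory}") // hypothetical property key
    protected String cacheDeploymentDirectory;
}

// The concrete subclass is the Spring bean; the inherited field gets wired.
@Component
class SampleDirectoryReader extends AbstractDirectoryReader {

    @PostConstruct
    void logDirectory() {
        // the inherited field is already populated at this point
        System.out.println("cache dir = " + cacheDeploymentDirectory);
    }
}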
After several `git diff` passes across different versions, it turned out I was missing a JVM option when launching,

-Dspring.profiles.active=dev

which resulted in the Spring Boot application exiting silently.
Quite impressed by protobuf's performance: the built-in merge implementation is even better than the streaming approach I used earlier:
models = models.toBuilder().clear().mergeFrom(
        Files.list(..).filter(Files::isRegularFile)
                .map(Path::toFile)
                .filter(file -> ...)
                .collect(Collectors.toList())
                .parallelStream()
                .map(file -> ..)
                // read into protobuf
                .reduce((m1, m2) -> m1.toBuilder().mergeFrom(m2).build())
                .orElseGet(() -> Stress.StressModels.getDefaultInstance())
).build();
models.getModelsList().parallelStream().forEach(...)
This performs even better than
models = Files.list(Paths.get(..)).filter(Files::isRegularFile)
.map(Path::toFile)
.filter(file -> ..)
.collect(Collectors.toList())
.parallelStream()
.map(file -> ..)
.flatMap(m -> m.getModelsList().stream());
models.parallel().forEach(...)
I was really confused by the output from the Files.list stream:
16:25:29.031 [main] INFO StressTest - compute from parallel file with collection
16:25:29.032 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel file with collection
16:25:29.032 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel file with collection
16:25:29.032 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel file with collection
... (repeated across worker-1, worker-2, worker-3 and main)
16:25:29.039 [main] INFO StressTest - parallel compute for file names with collect
16:25:29.042 [main] INFO StressTest - compute from parallel string with collection
16:25:29.042 [ForkJoinPool.commonPool-worker-1] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-2] INFO StressTest - compute from parallel string with collection
16:25:29.043 [ForkJoinPool.commonPool-worker-3] INFO StressTest - compute from parallel string with collection
... (repeated across the common pool workers and main)
16:25:29.043 [main] INFO StressTest - parallel compute for file names
16:25:29.045 [main] INFO StressTest - compute from parallel string
16:25:29.045 [main] INFO StressTest - compute from parallel string
... (repeated, all on [main])
16:25:29.050 [main] INFO StressTest - parallel compute for files
16:25:29.051 [main] INFO StressTest - compute from parallel file
16:25:29.051 [main] INFO StressTest - compute from parallel file
... (repeated, all on [main])
16:25:29.054 [main] INFO StressTest - compute from parallel file

Process finished with exit code 0
corresponding to the code:
try {
    Files.list(Paths.get("...."))
            .parallel()
            .filter(Files::isRegularFile)
            .map(Path::toFile)
            .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
            .collect(Collectors.toList())
            .parallelStream()
            .forEach(s -> {
                log.info("compute from parallel file with collection");
            });
} catch (IOException e) {
    e.printStackTrace();
}
log.info("parallel compute for file names with collect");

try {
    Files.list(Paths.get("...."))
            .parallel()
            .filter(Files::isRegularFile)
            .map(Path::toFile)
            .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
            .map(file -> file.getName())
            .collect(Collectors.toList())
            .parallelStream()
            .forEach(s -> {
                log.info("compute from parallel string with collection");
            });
} catch (IOException e) {
    e.printStackTrace();
}
log.info("parallel compute for file names");

try {
    Files.list(Paths.get("...."))
            .parallel()
            .filter(Files::isRegularFile)
            .map(Path::toFile)
            .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
            .map(file -> file.getName())
            .forEach(s -> {
                log.info("compute from parallel string");
            });
} catch (IOException e) {
    e.printStackTrace();
}
log.info("parallel compute for files");

try {
    Files.list(Paths.get("...."))
            .parallel()
            .filter(Files::isRegularFile)
            .map(Path::toFile)
            .filter(file -> file.getName().startsWith(Constants.AB_MODEL))
            .forEach(s -> {
                log.info("compute from parallel file");
            });
} catch (IOException e) {
    e.printStackTrace();
}
So the parallel() on the stream returned by Files.list results in a single thread processing all of the files (~50 files).
Only when there is a collect(...) followed by parallelStream() does the work get split across the common pool.
After a lot of investigation and research, it turns out the JDK has a rather poorly crafted implementation of parallel() for this kind of stream:
source: http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-July/034539.html
So basically, when the stream's spliterator is split, the size estimate used is Long.MAX_VALUE (unknown size), and the splits degrade like this:
                                       IteratorSpliterator (est. MAX_VALUE elements)
                                           |                    |
ArraySpliterator (est. 1024 elements)   IteratorSpliterator (est. MAX_VALUE elements)
                                            |        |
                   /---------------/        |
                   |                        |
ArraySpliterator (est. 2048 elements)   IteratorSpliterator (est. MAX_VALUE elements)
                                            |        |
                   /---------------/        |
                   |                        |
ArraySpliterator (est. 3072 elements)   IteratorSpliterator (est. MAX_VALUE elements)
                                            |        |
                   /---------------/        |
                   |                        |
ArraySpliterator (est. 856 elements)    IteratorSpliterator (est. MAX_VALUE elements)
                                                     |
                                (split returns null: refuses to split anymore)
do { a[j] = i.next(); } while (++j < n && i.hasNext());
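(The do/while above is the batch-copy loop inside java.util.Spliterators.IteratorSpliterator#trySplit, which is why each split only peels off a bounded array.)

As a workaround, collecting the paths into a List first gives the stream a known size, so parallelStream() can actually split the work across the common pool, which is exactly what the test with collect(...) showed. A minimal sketch, assuming a hypothetical directory:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ParallelFilesExample {

    public static void main(String[] args) throws IOException {
        // Collect into a List first: a List spliterator reports an exact size,
        // so parallelStream() can split the work properly.
        List<Path> paths;
        try (Stream<Path> stream = Files.list(Paths.get("/tmp/models"))) { // hypothetical directory
            paths = stream.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        paths.parallelStream()
             .forEach(p -> System.out.println(
                     Thread.currentThread().getName() + " -> " + p.getFileName()));
    }
}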
I spent quite some time getting the datasource configuration to be applied automatically from YAML; it turns out this can be achieved through the @ConfigurationProperties annotation.
So with a configuration in yaml,
spring:
  profiles: dev
  datasource:
    hikari:
      auto-commit: true
      connection-timeout: 30000
      maximum-pool-size: 20
      url: jdbc:sqlserver://..
      username:
      password:
and a bean configuration
@Bean(name = "RODataSource")
@ConfigurationProperties("spring.datasource.hikari")
public DataSource getDataSource() {
    HikariDataSource dataSource = DataSourceBuilder.create()
            .type(HikariDataSource.class)
            .url(url)
            .username(username).password(pwd)
            .driverClassName(driver)
            .build();
    return dataSource;
}
@ConfigurationProperties is able to reflect on the returned bean and apply the corresponding properties onto it (auto-commit, pool size and timeout values, for example).
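To round it out, a hedged sketch of how the named bean could then be consumed elsewhere; the repository class, table and query are hypothetical:

import javax.sql.DataSource;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

// Hypothetical consumer: inject the data source by its bean name.
@Repository
public class ReadOnlyRepository {

    private final JdbcTemplate jdbcTemplate;

    public ReadOnlyRepository(@Qualifier("RODataSource") DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    public int countRows(String table) {
        // simple sanity query against the injected, property-bound pool
        return jdbcTemplate.queryForObject("select count(*) from " + table, Integer.class);
    }
}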
While a gRPC server is serving a stream of responses, if the responses are sent concurrently it looks like it will fall into
INFO: Transport failed
java.lang.IllegalStateException: Stream 3 sent too many headers EOS: false
    at io.grpc.netty.shaded.io.netty.handler.codec.http2.DefaultHttp2ConnectionEncoder.validateHeadersSentState(DefaultHttp2ConnectionEncoder.java:157)
    at io.grpc.netty.shaded.io.netty.handler.codec.http2.DefaultHttp2ConnectionEncoder.writeHeaders0(DefaultHttp2ConnectionEncoder.java:230)
    at io.grpc.netty.shaded.io.netty.handler.codec.http2.DefaultHttp2ConnectionEncoder.writeHeaders(DefaultHttp2ConnectionEncoder.java:150)
    at io.grpc.netty.shaded.io.netty.handler.codec.http2.DecoratingHttp2FrameWriter.writeHeaders(DecoratingHttp2FrameWriter.java:45)
    at io.grpc.netty.shaded.io.grpc.netty.NettyServerHandler.sendResponseHeaders(NettyServerHandler.java:707)
    at io.grpc.netty.shaded.io.grpc.netty.NettyServerHandler.write(NettyServerHandler.java:626)
The solution is to stream in serial instead:
..
//.parallel() // disable the parallel stream
.mapToObj(value -> ...newBuilder().setMessage(value)....build())
.forEach(reply -> responseObserver.onNext(reply));
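For context, a minimal sketch of a server-streaming handler that keeps the onNext calls on a single thread; the service, request and reply types are hypothetical stand-ins for the generated gRPC classes:

import io.grpc.stub.StreamObserver;
import java.util.stream.IntStream;

// ReplyServiceGrpc, ReplyRequest and Reply are hypothetical generated classes.
public class ReplyService extends ReplyServiceGrpc.ReplyServiceImplBase {

    @Override
    public void listReplies(ReplyRequest request, StreamObserver<Reply> responseObserver) {
        // Keep this sequential: a StreamObserver is not safe for concurrent calls,
        // and parallel onNext calls can trigger the "sent too many headers" failure above.
        IntStream.rangeClosed(1, request.getCount())
                .mapToObj(i -> Reply.newBuilder().setMessage("reply " + i).build())
                .forEach(responseObserver::onNext);
        responseObserver.onCompleted();
    }
}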
Basically, it's the same class (same fully-qualified class name) being loaded by different class loaders. This is a constraint check the JVM has implemented since Java 1.2, so that a permission granted to a class from one classloader does not, by default, extend to the "same" class loaded by another classloader.
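A small sketch illustrating the identity rule behind this (the jar path and class name are hypothetical): the same fully-qualified class loaded by two loaders produces two distinct Class objects.

import java.net.URL;
import java.net.URLClassLoader;

public class ClassIdentityDemo {

    public static void main(String[] args) throws Exception {
        URL jar = new URL("file:/tmp/demo.jar"); // hypothetical jar containing com.example.Foo

        // Two independent loaders, each loading the same class bytes.
        // A null parent stops them delegating to the application class loader.
        try (URLClassLoader loaderA = new URLClassLoader(new URL[]{jar}, null);
             URLClassLoader loaderB = new URLClassLoader(new URL[]{jar}, null)) {

            Class<?> fooA = loaderA.loadClass("com.example.Foo");
            Class<?> fooB = loaderB.loadClass("com.example.Foo");

            System.out.println(fooA.getName().equals(fooB.getName())); // true: same name
            System.out.println(fooA == fooB);                          // false: different identity
            System.out.println(fooA.isAssignableFrom(fooB));           // false: not interchangeable
        }
    }
}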
There was an issue when running the Parquet writer on Windows:
Change File Mode By Mask error (5): Access is denied.
It turns out this is because Parquet uses the Hadoop filesystem to access the file, which in turn requires permissions on the /tmp/hive folder.
The solution is to run
winutils.exe chmod -R 777 C:\tmp\hive
Note, however, that the drive letter should be the same one where HADOOP_HOME / the winutils binary is located.