gc on old gen

i have a large app which currently runs with 600GB max heap (-Xmx). the app now processes at a controlled rate (one run every half an hour) to avoid an OOM.

each run consumes and processes >3.2 million kafka messages in less than 5 minutes (normally 1 or 2 minutes).

even after a lot of tuning, when i look at the heap, the memory footprint keeps bumping up. even though the eden space sees frequent (minor) GCs during the <5 minute processing window, the old gen keeps growing gradually.

(heap usage charts: whole heap, Eden, old gen)

this really seems like a memory leak.

however, after another thorough check of the code, it looks like the collections of objects that are no longer used do get dereferenced.

so if the code is right, then it looks like the gc might not be doing its job on the old gen.

so i then triggered a manual GC, which brought the old gen right back down.

after spending some time looking into this in detail, it turns out the default threshold to trigger a gc on the old gen is >40% heap occupancy. and this is definitely not the ideal setup for this application: it sits idle for 25 minutes out of every 30-minute interval, yet because the gc waits on a fixed occupancy ratio to trigger, the memory is mostly just wasted.

around 120GB was wasted in this case before the manual gc.

so it turns out there was already a proposal to tune exactly this, JEP 346, back in 2018.

http://openjdk.java.net/jeps/346

and for now, until running on a JDK where that JEP is implemented (it shipped with JDK 12), leveraging a periodic GC is a much needed practice instead of leaving it to the gc algorithm alone. on JDK 12+, the JEP provides these options:

-XX:G1PeriodicGCInterval=600000 -XX:G1PeriodicGCSystemLoadThreshold=LOAD 
     -XX:-G1PeriodicGCInvokesConcurrent
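
if the application is still on a JDK without those options, a rough sketch of the same idea is to schedule an explicit GC during the idle window (the 10-minute interval below is illustrative):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class PeriodicGcScheduler {

    // call once at application start-up
    public static void start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // every 10 minutes, ask the JVM for a collection during the idle window;
        // with G1, System.gc() runs a full (stop-the-world) GC unless
        // -XX:+ExplicitGCInvokesConcurrent is set
        scheduler.scheduleAtFixedRate(System::gc, 10, 10, TimeUnit.MINUTES);
    }
}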

full gc with the default setting, where it is only triggered above the >40% threshold:

jvm memory tuning

a big memory drainer is String objects.

with the object header, the hash field and the pointer to the backing char/byte array, a String object occupies a minimum of roughly 20 bytes (varies by java version) even for an empty string, before counting the array itself.

this can become an especially big problem if a large volume of messages (like millions of records) has to be parsed on a single jvm.

from java 8 onward, there are two ways to handle this (especially for situations where a large amount of data shares, for example, the same headers, like “portfolio”, “name”, “currency”; both the keys/attributes/properties and the values are likely to have a limited/constant number of variants):

  1. string intern, which is the approach from before java 8.

one caveat though: the jdk's default String.intern() is a native implementation (backed by the jvm string table) and can be slow.

an alternative to the native implementation is a map-based interner, which serves the same purpose and is faster, like:

public final class StringRepo extends ConcurrentHashMap<String, String> {
    public static final StringRepo repo = new StringRepo();

    public String intern(String s) {
        if (s == null) {
            return null; // computeIfAbsent would throw an NPE on a null key
        }
        // map the string to itself: the first instance seen becomes the canonical copy,
        // without going through the native String.intern() table
        return computeIfAbsent(s, k -> k);
    }
}

  2. from java 8 (8u20), string deduplication can be used to get the gc's help in reducing the string memory footprint:

-XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics

with gc logging enabled, the deduplication statistics show up in the gc log.

https://github.com/FasterXML/jackson-core/issues/726

protobuf NPE

What language does this apply to?
Java

If it’s a proto syntax change, is it for proto2 or proto3?
proto3

If it’s about generated code change, what programming language?
Java

Describe the problem you are trying to solve.
For the message below,

message Position {
    string portfolio = 1;
}

The generated setter would be something like this

      public Builder setPortfolio(
          java.lang.String value) {
        if (value == null) {
         throw new NullPointerException();
        }
  
        portfolio_ = value;
        onChanged();
        return this;
      }

There is an NPE thrown within the method for a null value.

I think this is really an opinionated approach; it should instead be left to developers to decide whether to handle a null or throw an NPE.
There could be a Position message, for example, with many known optional fields which could be null. Developers are in a better position to decide how those fields should be set.

Describe the solution you’d like

The generated class should take the value to be set as-is. Something like

      public Builder setPortfolio(
          java.lang.String value) {
//        if (value == null) {
//         throw new NullPointerException();
//        }
  
        portfolio_ = value;
        onChanged();
        return this;
      }

Describe alternatives you’ve considered

Additional context

I guess the current “opinionated” approach could be due to a constraint of the protobuf wire format, where an int is used for the length of a length-delimited value. If not, introducing a negative length (-1) could tell whether the following value is really empty (0) or null (-1).

https://github.com/protocolbuffers/protobuf/issues/9207
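
in the meantime, a small null-safe wrapper on the caller side is an easy workaround; a sketch (the helper name is made up, `Position` refers to the message above):

import java.util.function.Consumer;

public final class ProtoSetters {

    // only invoke the generated setter when the value is non-null,
    // so the builder's NullPointerException is never triggered
    public static <T> void setIfPresent(T value, Consumer<T> setter) {
        if (value != null) {
            setter.accept(value);
        }
    }
}

// usage:
// Position.Builder builder = Position.newBuilder();
// ProtoSetters.setIfPresent(maybeNullPortfolio, builder::setPortfolio);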

thread safe sorted map

besides using Collections.synchronizedSortedMap to get a synchronized version of a `TreeMap`, another approach to get output ordered by key is to do the reverse: keep an ordinary thread-safe map and let a helper class do the sorting at output time.

for example, using Jackson's ObjectMapper:

//set the ordering
mapper.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);

then, to get the output or to transform into another object, use the `mapper`:

//output
mapper.writeValueAsString(pairs)

//transform
mapper.convertValue(pairs, Map.class)
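
putting the pieces together, a minimal self-contained sketch (the map contents are made up; assumes jackson-databind on the classpath):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;

public class SortedOutputExample {
    public static void main(String[] args) throws JsonProcessingException {
        // thread-safe, but unsorted, map
        Map<String, Integer> pairs = new ConcurrentHashMap<>();
        pairs.put("portfolio", 1);
        pairs.put("currency", 2);
        pairs.put("name", 3);

        ObjectMapper mapper = new ObjectMapper();
        // ask jackson to sort map entries by key when serializing
        mapper.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);

        // output: {"currency":2,"name":3,"portfolio":1}
        System.out.println(mapper.writeValueAsString(pairs));

        // transform: the resulting LinkedHashMap keeps the sorted key order
        Map<?, ?> sorted = mapper.convertValue(pairs, Map.class);
        System.out.println(sorted);
    }
}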

intellij compiled class

when I finished updating a class in IntelliJ, I was expecting the hot reload (spring devtools) to kick in; however, that didn't happen automatically.

after a further look, it seems that when a class is updated, IntelliJ keeps the updated class file (or maybe only the delta) in a different place, probably in memory.

the class file on the file system is not updated and still carries the old timestamp, hence the file watcher won't kick off the reload.

a rebuild of the project then writes out any updated classes; the timestamps get updated, and the reload follows.

spring kafka seek

I have a kafka consumer which, after a resume, picks up from the last committed offset and consumes sequentially from there, even with several calls to seekToEnd().

Logs

//before pause
2021-10-09 15:44:18,569 [fx-kafka-consumer-0-C-1] INFO  c.x.f.r.k.HarvestConsumer - check raw msg size 1 on [17] @ [38012024115]   
2021-10-09 15:44:28,573 [fx-kafka-consumer-0-C-1] INFO  c.x.f.r.k.HarvestConsumer - check raw msg size 1 on [17] @ [38012024116]   

//paused

//called seekToEnd before resume

//after resume
//called seekToEnd several times before and after resume
2021-10-09 15:45:13,603 [kafka-consumer-0-C-1] INFO  c.x.f.r.k.HarvestConsumer - check raw msg size 1 on [17] @ [38012024117]   
2021-10-09 15:45:53,610 [kafka-consumer-0-C-1] INFO  c.x.f.r.k.HarvestConsumer - check raw msg size 1 on [17] @ [38012024118]  
2021-10-09 15:46:03,612 [kafka-consumer-0-C-1] INFO  c.x.f.r.k.HarvestConsumer - check raw msg size 1 on [17] @ [38012024119]   
2021-10-09 15:46:13,613 [kafka-consumer-0-C-1] INFO  c.x.f.r.k.HarvestConsumer - check raw msg size 1 on [17] @ [38012024120] 

Turns out this was actually due to an issue with my implementation of the subclass extending AbstractConsumerSeekAware.

At consumer start-up, I did a manual seek in the overridden onPartitionsAssigned, and so far this had been working well.

However, now that there is a need to pause/resume the consumer, it turns out the overriding method skipped the super call that maintains the consumer's seek callback in the callbacks field.

So when seekToEnd() was invoked, it was really iterating through an empty map.
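
a sketch of the fix, assuming the listener extends AbstractConsumerSeekAware (the start-offset lookup is a made-up placeholder; the key point is the super call so the seek callback gets registered):

import java.util.Map;

import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.listener.AbstractConsumerSeekAware;

public class HarvestConsumer extends AbstractConsumerSeekAware {

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // let the base class register the callback first; skipping this leaves
        // the callbacks field empty, and a later seekToEnd() iterates over nothing
        super.onPartitionsAssigned(assignments, callback);

        // then do the manual seek as before (placeholder logic)
        assignments.forEach((tp, committed) ->
                callback.seek(tp.topic(), tp.partition(), resolveStartOffset(tp)));
    }

    private long resolveStartOffset(TopicPartition tp) {
        return 0L; // hypothetical: look up where this partition should really start
    }
}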

sql server parameter sniffing

i was using a PreparedStatement to load some data out of a sql server database, which had been working fine for quite some time. until recently, for one day's data, the query was not able to finish (it hangs on reading the response from the database):

      "stackTrace": [
        {
          "methodName": "socketRead0",
          "fileName": "SocketInputStream.java",
          "lineNumber": -2,
          "className": "java.net.SocketInputStream",
          "nativeMethod": true
        },
        {
          "methodName": "socketRead",
          "fileName": "SocketInputStream.java",
          "lineNumber": 116,
          "className": "java.net.SocketInputStream",
          "nativeMethod": false
        },
        {
          "methodName": "read",
          "fileName": "SocketInputStream.java",
          "lineNumber": 171,
          "className": "java.net.SocketInputStream",
          "nativeMethod": false
        },
        {
          "methodName": "read",
          "fileName": "SocketInputStream.java",
          "lineNumber": 141,
          "className": "java.net.SocketInputStream",
          "nativeMethod": false
        },
        {
          "methodName": "read",
          "fileName": "IOBuffer.java",
          "lineNumber": 2058,
          "className": "com.microsoft.sqlserver.jdbc.TDSChannel",
          "nativeMethod": false
        },
        {
          "methodName": "readPacket",
          "fileName": "IOBuffer.java",
          "lineNumber": 6617,
          "className": "com.microsoft.sqlserver.jdbc.TDSReader",
          "nativeMethod": false
        },
        {
          "methodName": "nextPacket",
          "fileName": "IOBuffer.java",
          "lineNumber": 6567,
          "className": "com.microsoft.sqlserver.jdbc.TDSReader",
          "nativeMethod": false
        },
        {
          "methodName": "ensurePayload",
          "fileName": "IOBuffer.java",
          "lineNumber": 6540,
          "className": "com.microsoft.sqlserver.jdbc.TDSReader",
          "nativeMethod": false
        },
        {
          "methodName": "skip",
          "fileName": "IOBuffer.java",
          "lineNumber": 7200,
          "className": "com.microsoft.sqlserver.jdbc.TDSReader",
          "nativeMethod": false
        },
        {
          "methodName": "skipValue",
          "fileName": "dtv.java",
          "lineNumber": 3362,
          "className": "com.microsoft.sqlserver.jdbc.ServerDTVImpl",
          "nativeMethod": false
        },
        {
          "methodName": "skipValue",
          "fileName": "dtv.java",
          "lineNumber": 162,
          "className": "com.microsoft.sqlserver.jdbc.DTV",
          "nativeMethod": false
        },
        {
          "methodName": "skipValue",
          "fileName": "Column.java",
          "lineNumber": 152,
          "className": "com.microsoft.sqlserver.jdbc.Column",
          "nativeMethod": false
        },
        {
          "methodName": "skipColumns",
          "fileName": "SQLServerResultSet.java",
          "lineNumber": 216,
          "className": "com.microsoft.sqlserver.jdbc.SQLServerResultSet",
          "nativeMethod": false
        },
        {
          "methodName": "loadColumn",
          "fileName": "SQLServerResultSet.java",
          "lineNumber": 770,
          "className": "com.microsoft.sqlserver.jdbc.SQLServerResultSet",
          "nativeMethod": false
        },
        {
          "methodName": "getterGetColumn",
          "fileName": "SQLServerResultSet.java",
          "lineNumber": 2036,
          "className": "com.microsoft.sqlserver.jdbc.SQLServerResultSet",
          "nativeMethod": false
        },
        {
          "methodName": "getValue",
          "fileName": "SQLServerResultSet.java",
          "lineNumber": 2054,
          "className": "com.microsoft.sqlserver.jdbc.SQLServerResultSet",
          "nativeMethod": false
        },
        {
          "methodName": "getValue",
          "fileName": "SQLServerResultSet.java",
          "lineNumber": 2040,
          "className": "com.microsoft.sqlserver.jdbc.SQLServerResultSet",
          "nativeMethod": false
        },
        {
          "methodName": "getString",
          "fileName": "SQLServerResultSet.java",
          "lineNumber": 2525,
          "className": "com.microsoft.sqlserver.jdbc.SQLServerResultSet",
          "nativeMethod": false
        },

the same query, however, works well when run directly in a sql tool, or when run using a plain java Statement.

it was only after I added a dummy where clause (“… and 1=1”) that the preparedstatement suddenly returned the result in time again.

in the beginning, i thought it was, strangely, the dummy clause itself which made the difference. turns out this was a problem with sql server parameter sniffing.

the dummy where clause worked only because sql server now sees it as a different query, and hence does not reuse the previously cached execution plan.

the same effect can be reproduced by adding

option (recompile)

to the query. this makes sql server compile a fresh plan for that execution instead of reusing the previously cached one, so even the original query, without the dummy where clause, is back to performing well again.
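
a minimal jdbc sketch of the recompile-hint workaround (connection details, table and column names, and the date are all hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RecompileHintExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://localhost;databaseName=marketdata;encrypt=false";
        // the OPTION (RECOMPILE) hint forces a fresh plan for this statement
        String sql = "SELECT portfolio, name, currency FROM positions "
                + "WHERE business_date = ? OPTION (RECOMPILE)";

        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setDate(1, java.sql.Date.valueOf("2021-10-08"));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("portfolio"));
                }
            }
        }
    }
}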

the database-scoped configuration (which includes the parameter sniffing setting) can be checked with

SELECT * FROM sys.database_scoped_configurations;

https://www.databasejournal.com/features/mssql/turning-off-parameter-sniffing-for-a-sql-server-database-by-default.html

==========================

looking further into this, the execution was stuck on the previously cached execution plan.

it seems the first plan (shown in blue) was not working out; eventually a new plan (in green) was created.

and the wait on the blue plan was HTDELETE. this relates to a change made in SQL Server 2014:

SQL Server 2014 now uses one shared hash table instead of per-thread copy. This provides the benefit of significantly lowering the amount of memory required to persist the hash table but, as you can imagine, the multiple threads depending on that single copy of the hash table must synchronize with each other before, for example, deallocating the hash table. To do so, those threads wait on the HTDELETE (Hash Table DELETE) wait type.

https://social.msdn.microsoft.com/Forums/en-US/8579f864-cbdc-49b9-be26-d47af61df56d/sql-server-2014-wait-info-htdelete?forum=sql14db

likely this is a bug in SQL Server, which I hit when running the program using parallelStream(), while stream() works.*

*There are multiple processes hitting the same query.


TimSort

when I was writing a comparator using a lambda, java threw out this exception:

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
        at java.util.TimSort.mergeLo(TimSort.java:777)
        at java.util.TimSort.mergeAt(TimSort.java:514)
        at java.util.TimSort.mergeCollapse(TimSort.java:441)
        at java.util.TimSort.sort(TimSort.java:245)
        at java.util.Arrays.sort(Arrays.java:1512)
        at java.util.ArrayList.sort(ArrayList.java:1464)
        at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:392)
        at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
        at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
        at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:500)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)

seems like `TimSort` has a safety check on the Comparator contract; however, it only catches a violation for certain input orderings.

so the broken comparator below only throws after running enough times: the cast `(int) (o2.val - o1.val)` truncates small differences to 0, making the comparator non-transitive (0.1 compares equal to 0.5, and 0.5 to 1.2, yet 0.1 and 1.2 compare as different), which violates the contract.

// requires junit, lombok (@Slf4j for log, @Data/@Builder for Domain), java.util and java.util.stream imports
@Slf4j
public class TimSortTest {

    @Test
    public void test3() {
        List<Domain> domains = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
//            domains.clear();
            domains.add(Domain.builder().val(0.1).name("1").build());
            domains.add(Domain.builder().val(0.5).name("2").build());
            domains.add(Domain.builder().val(1.2).name("3").build());
            domains.add(Domain.builder().val(2.1).name("4").build());
            Collections.shuffle(domains);
            log.info("{}th start sorting {}", i, domains);
            log.info("check sorted {}", domains.parallelStream().sorted((o1, o2) -> {
                        final int diff = (int) (o2.val - o1.val);
                        log.info("check {} - {} = {}", o1, o2, diff);
                        return diff;
                    }).collect(Collectors.toList())
            );
        }

    }
    @Test
    public void test4() {
        List<Domain> domains = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
//            domains.clear();
            domains.add(Domain.builder().val(0.1).name("1").build());
            domains.add(Domain.builder().val(0.5).name("2").build());
            domains.add(Domain.builder().val(1.2).name("3").build());
            domains.add(Domain.builder().val(2.1).name("4").build());
            Collections.shuffle(domains);
//            log.info("{}th start sorting {}", i, domains);
            log.info("check sorted {}", domains.parallelStream().sorted((o1, o2) -> Double.compare(o2.val, o1.val)).collect(Collectors.toList()).size()
            );
        }

    }
}

@Data
@Builder
class Domain {
    double val;
    String name;
}

kafka rebalance

i have set up a program to consume from a kafka topic.

it had been running fine until roughly 25% of the messages (out of 2-3 million rows) were consumed; from there, a lot of rebalances started happening.

i have tried to increase `max.poll.interval.ms`, to reduce `max.poll.records`, and to process the messages asynchronously. i have also tried setting up the `onPartitionsAssigned` and `onPartitionsRevoked` callbacks (the relevant consumer properties are sketched below). however, the rebalances kept happening, and the program kept getting stuck at ~25%.
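
for reference, a minimal sketch of the consumer properties involved (broker address, group id and the concrete values are illustrative; in this case the tuning alone did not fix the problem):

import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerTuning {

    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "harvest-consumer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // give the poll loop more slack before the broker considers the consumer gone
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000"); // allow up to 10 minutes between polls
        // and hand back fewer records per poll so each batch finishes quickly
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
        return props;
    }
}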

after a lot of investigation and trials, it turns out this was due to the memory footprint of the program.

as more and more messages were held on the heap (several snapshots x 25% x 2-3 million rows), the JVM got busy running GC, which eventually blocked the consumer from sending back its heartbeat in time.

filtering the messages down, or running this on a server with more headroom, would fix the issue.