Building a production-grade emotion detection API with Java, Quarkus, and ONNX — and why tokenization quietly decides whether AI systems tell the truth
Really solid breakdown here, especially the tokenization comparisons. It reminds me of a project where we spent weeks tuning the model only to discover the preprocessing was mangling multi-byte characters the whole time. You're spot on that the tokenizer is basically part of the model - people forget that the embedding space changes completely when you mess with the input encoding. The Quarkus demo makes it way easier to see why this stuff matters in production systems.
Hi Markus, I haven't read through the complete tutorial yet, but the intro alone is a great and well-balanced read... Thanks again for another interesting article. How you come up with all these ideas and produce articles of this quality at such short intervals is a mystery to me, but I do enjoy them. Thank you & best regards, Michael
Thanks 😊 it’s passion and Quarkus ;-)