Fury – Fast multi-language serialization framework powered by JIT and Zero-copy (github.com/alipay)
142 points by chaokunyang on Oct 8, 2023 | hide | past | favorite | 68 comments


The comparison to JDK serialization surprises me. No one should be using it for anything serious in production. Even the Chief Architect of the Java platform has called it a "horrible mistake". https://www.infoworld.com/article/3275924/oracle-plans-to-du...


Just to be clear, though, the mistake that Mark Reinhold refers to isn't the particular implementation of the JDK's core serialization, but the design that allows arbitrary objects to be deserialized while bypassing their constructors (and so their established invariants). Unfortunately, many other serialization libraries β€” faster or slower β€” are repeating the same mistake. I.e., any serialization library that can serialize any class that implements Serializable suffers from all the same flaws of core serialization.

Correct serialization can only be done for classes that are designed for it by having a well-known construction protocol, which are currently basic collections, enums, Strings, records, and classes that register specific serialization code.
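
To make that concrete, here's a small Java sketch (`Range` is a hypothetical example class, not from the JDK): a record's canonical constructor is exactly such a well-known construction protocol, so any deserializer that goes through it cannot bypass the invariant check.

```java
// A record's canonical constructor is its well-known construction protocol:
// a deserializer that reconstructs the record through it always re-runs the
// invariant check, so untrusted bytes can't produce an invalid Range.
public record Range(int lo, int hi) {
    public Range {
        if (lo > hi) throw new IllegalArgumentException("lo > hi");
    }
}
```

Compare this with `Serializable` classes, where the framework materializes fields directly and the constructor (and its checks) never runs.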


I see, totally agreed! If arbitrary objects can be serialized, deserialization introduces security issues too. For example, `constructor/equals/hashCode` may contain malicious code, which introduces deserialization risks.

Fury has put a lot of work into avoiding open dynamic deserialization risks.

But although `basic collections, enums, Strings, records, and classes that register specific serialization code` should be the only objects allowed for serialization, and Fury already serializes them through the construction protocol, so many applications rely on the existing serialization protocol's assumptions that we have to keep compatibility; otherwise most applications couldn't use Fury.


Why?

Zero copy state transfer is a viable and high performance alternative.

Security and integrity can (should?) be implemented at a different layer.


> Zero copy state transfer is a viable and high performance alternative.

Completely agreed, but also, adding my own emphasis there.

The technique has enough gotchas, edge cases, and additional security considerations that it should really be an alternative that people can opt for when they need it, and never be the default approach.


Yes, it should be an alternative. In Fury, we disabled it by default, and none of those benchmarks enable it. Most objects are value bound, not binary-data bound, so zero-copy won't give a big speedup there. But if the object graph contains many ByteBuffer/Tensor/DataFrame/ArrowTable instances, zero-copy is a big leap. See Python's pickle5 out-of-band serialization, which speeds up pandas/numpy, as an example.
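
A minimal Java sketch of the out-of-band idea (all names hypothetical, not Fury's actual API): large buffers are never copied into the main stream; the stream records only an index, and the buffers travel alongside it. This is essentially what pickle5's `buffer_callback` does in Python.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of out-of-band (zero-copy) buffer handling:
// large buffers are not copied into the serialized stream; the stream
// carries only an index, and the buffers themselves travel separately.
public class OutOfBandWriter {
    final ByteBuffer stream = ByteBuffer.allocate(1024); // in-band metadata
    final List<ByteBuffer> outOfBand = new ArrayList<>(); // sent as-is, no copy

    void writeBuffer(ByteBuffer large) {
        stream.putInt(outOfBand.size()); // placeholder: index of the buffer
        outOfBand.add(large);            // reference only, contents untouched
    }
}
```

The reader does the reverse: it resolves each index back to the corresponding out-of-band buffer, so multi-megabyte tensors never pass through the encoder at all.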


Not exactly. With Java serialization you can serialize native Java objects directly without defining a DSL and compiling a schema, which is much easier to use. But it also means deserialization needs to create the user-defined class, which may contain malicious code in `constructor/equals/hashCode`. So security can't be done at a different layer unless you are on an intranet where no attack can happen, and we can't ensure that.


We must realize there is always a tradeoff here. If you define a DSL for the serialized data and generate code like protobuf does, the security issues are much smaller. But that comes at a cost: protobuf-generated classes are not the domain classes, can't be used for domain-driven application development, and don't support circular references either. What Fury does is provide better performance along with better usability.


But the only reason core serialization is considered a "horrible mistake" has nothing to do with performance or with any aspect of the technical implementation. It is considered a mistake only because serialising arbitrary classes breaks invariants in a way that has serious correctness and security implications.


The point is that security and bypassing constructors are completely orthogonal, as serialisation by memcpy is a completely viable strategy and is no less secure than calling constructors on objects.


Whether or not they're orthogonal, it is that aspect that we talk about when we talk about the mistake of core serialization in the JDK. Also, whether calling constructors is necessary for security depends on many details. When we talk about fixing serialization in Java, we talk about a design that is independent of the format used for the serialized data. I understand that this is a different layer from the one you're interested in, but that is the layer where the mistake was made, and so that is the layer that requires fixing.


The premise of the sentence you cited is that it is necessary to call application code (constructors) to maintain security/integrity.

What I am saying is that it is NOT necessary to implement secure serialisation and deserialisation because you CAN (and should?) implement it at a different layer and transparently to the application code.


It is necessary to call application code to maintain integrity (i.e. the guarantee of global program invariants), if you want to design classes for serialization that is agnostic to serialized format and that accounts for cases where not both sides of the communication are trusted. If program objects can be constructed from bits that arrive from external input that might not be trusted, some code needs to run to ensure that the objects maintain their invariants. Of course, you could say that objects cannot maintain any of their own invariants (that is the approach taken by Zig, where, unlike in Java, it is impossible to explicitly define, at the language level, an object that represents only prime integers), but that requires a very different language design, one that has not yet been proven to scale well (and one that isn't taken by Java specifically).
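
The prime-integers example can be made concrete with a small Java sketch (`Prime` is a hypothetical class): the only way to obtain a `Prime` is through its constructor, so any deserializer that reconstructs one from untrusted bits must run the primality check, or the invariant is meaningless.

```java
// Sketch of an object whose invariant (primality) is established in the
// constructor. Bits arriving from untrusted input must pass through
// isPrime() to become a Prime; memcpy-style reconstruction would bypass it.
public final class Prime {
    private final int value;

    public Prime(int value) {
        if (!isPrime(value)) throw new IllegalArgumentException(value + " is not prime");
        this.value = value;
    }

    public int value() { return value; }

    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; (long) i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }
}
```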


> if you want to design classes for serialization that is agnostic to serialized format

You mean XML/JSON/binary etc? This is impossible :) - the only thing that comes close is ASN.1.

But serialisation can be easy: just don't do anything and let runtime handle that. Any compacting GC can be seen as serialiser/deserialiser (it does move object graph from one place to another) - and it does not need to run any application code to perform its task. Criu and any other live program ("object") migration methods are similar.

> and that accounts for cases where not both sides of the communication are trusted

I don't believe Java serialisation will ever be capable of handling deserialisation of untrusted input.


There are many Java serialization libraries; I compared Fury with jdk/kryo/fst/protostuff/protobuf/flatbuffers/thrift/msgpack/capnproto/avro/jackson/json. Fury is the fastest there too. See https://github.com/eishay/jvm-serializers/wiki for detailed benchmark results.

It's a hard choice to select one for the title, so I used JDK, which may not have been a good choice.

BTW, Fury supports JIT serialization for JDK 17 records, which is super fast compared to other serialization frameworks such as Kryo


There's also a chapter on JDK serialization in Bloch's "Effective Java" worth a read:

    Item 85: Prefer alternatives to Java serialization
    Item 86: Implement Serializable with great caution
    Item 87: Consider using a custom serialized form
    Item 88: Write readObject methods defensively
    Item 89: For instance control, prefer enum types to readResolve
    Item 90: Consider serialization proxies instead of serialized instances


Those patterns are all supported in Fury. And it seems Fury is the only framework besides the JDK itself that implements the JDK `writeObject/readObject/writeReplace/readResolve/readObjectNoData` methods. Other Java serialization libraries just ignore those methods and get incorrect results. But today I'd suggest avoiding JDK `writeObject/readObject`, since staying compatible with the JDK serialization API behaviour introduces performance and space overhead. You can register a custom serializer instead with fury.registerSerializer(xxx.class, XXXSerializer.class).

`writeReplace/readResolve` are a useful pattern and can be used when needed.
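
For reference, a minimal sketch of the serialization-proxy pattern those hooks enable (Effective Java, Item 90; `Period` is a hypothetical example class, using plain JDK serialization):

```java
import java.io.Serializable;

// writeReplace/readResolve serialization-proxy pattern: the proxy, not the
// Period itself, is what gets written; on deserialization the proxy rebuilds
// a real Period through the public constructor, so the invariant check runs.
public final class Period implements Serializable {
    private final long start, end;

    public Period(long start, long end) {
        if (start > end) throw new IllegalArgumentException("start > end");
        this.start = start;
        this.end = end;
    }

    private Object writeReplace() {
        return new Proxy(start, end);
    }

    private static final class Proxy implements Serializable {
        private final long start, end;
        Proxy(long start, long end) { this.start = start; this.end = end; }
        private Object readResolve() { return new Period(start, end); }
    }
}
```

Because `readResolve` funnels everything through the constructor, a tampered byte stream can't conjure up a `Period` with `start > end`.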



Fury is also the fastest jvm serialization framework in the https://github.com/eishay/jvm-serializers/wiki


Weird they aren't emphasizing competitiveness with Protobuf and Avro, which is what I'd look at long before java serialization if I wanted fast object serialization.


Protobuf/Avro don't support polymorphism or circular references, maybe that's one of the reasons? But even compared with protobuf, fury is 3.2x faster. When comparing with avro, fury is 5.3x faster


How can this be faster than Protobuf? Protobuf basically does a direct storage of bytestream to memory. No parsing needed because it precompiled the stubs. Theoretically it can’t be much faster.

And having an IDL is an advantage imho. Clear specs and interface design.


Not exactly. Protobuf has compression for strings/numbers; Fury uses a different compression protocol. Another thing is that protobuf-generated code is not efficient enough. Fury generates code at runtime, where it has more information about the JVM, so it can generate more efficient code, avoid many memory accesses and branches, and do more code optimization.
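
For context on what protobuf's number "compression" means: integers are varint-encoded, 7 payload bits per byte with the high bit marking continuation, so small values take 1 byte and a full 32-bit value takes up to 5. A small sketch of the scheme (not protobuf's actual code):

```java
import java.util.Arrays;

// Base-128 varint encoding as used by protobuf for integers: 7 payload
// bits per byte, high bit set on every byte except the last.
public class Varint {
    static byte[] encode(int value) {
        byte[] buf = new byte[5];               // 32 bits / 7 -> at most 5 bytes
        int i = 0;
        long v = value & 0xFFFFFFFFL;           // treat the int as unsigned
        while (v >= 0x80) {
            buf[i++] = (byte) ((v & 0x7F) | 0x80); // continuation bit set
            v >>>= 7;
        }
        buf[i++] = (byte) v;                    // final byte, high bit clear
        return Arrays.copyOf(buf, i);
    }
}
```

So the cost isn't parsing per se; it's the bit-twiddling per field, which is exactly where runtime-specialized code can shave off branches and memory accesses.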


Just imagine the project landscape you'd need to be in to consider starting this worthwhile, given all uncertainty of success and so on. Or a very large middle finger to YAGNI, but I doubt it's that.


But I want to say that Fury will be continuously maintained. I created it 4 years ago and open sourced it in 2023.07. I've maintained it with little credit for 4 years. Now that it's open-sourced, I will just put more effort into it.

But on the other hand, the success of fury does not depend on whether I put in enough effort but on whether I can build a thriving community that brings more people in to join us. I must admit that I am still learning in this area and have a long way to go.


Beware of burnout -- if companies take you at your word and expect continuous maintenance (for free), the load might be hard to bear after a while.

The best way to avoid it I know of is to delegate responsibility and find other people to maintain the library with you (and/or charge for development/encourage corporate sponsorships!)


Thanks for your kind words. I also want to find more people to join us to maintain the library. But I still don't know how to involve more people in the fury community.


This is a hard general problem -- in the end most people just can't/won't contribute. A lot of really "successful" open source is funded by large companies.

That said, maybe consider doing the best you can to encourage people to work on the issues that come up -- if an issue comes up, prep as much context as you can and hand it to the person if they're capable of committing code. And then, help people with their PRs once they have something up.

Also do things like post about your project in places Java developers might look, or getting into "Java weekly" style newsletters (they often have a "call for participation" at the bottom).

Oh and don't forget things like Hacktober (https://hacktoberfest.com/)

Good luck out there! Make sure to take care of yourself -- delivering value for free is not something many people do (thanks for even trying!), but doing it sustainably is harder than it looks, much better for the world long term to have people like you not burn out, even if a feature ships 6 months later (or never at all).


Thank you so much for your response and valuable insights! The suggestions about providing as much context as possible for issues, and helping people with their PRs, are very useful. I will apply them across the whole open-sourcing effort.

The suggestions about sharing the project in Java-related communities are also excellent. Things like Hacktober are also a fantastic opportunity to involve more contributors. Thanks for those suggestions.

Your words of encouragement and reminder to take care of myself are truly heartwarming. It means a lot to me that you acknowledge the effort I'm putting into fury for free. I will strive to find a balance that allows me to continue contributing in the long term.

Thank you once again for your kind words and support. It's people like you who make the open source community a wonderful place to be.


Good point! The success of a project is not only determined by what it can do, but more determined by what it chooses not to do. Only with clear and simple goals can we focus on what truly matters, while also attracting more developers to join our community.


I'm curious, why do you appear to generate and compile Java source code at runtime instead of generating bytecode directly or using `MethodHandle`s?


Good question! Part of the code is generated using `MethodHandle`, such as some field getters or constructors. But the overall code is generated as source code at runtime instead of bytecode. This is because the binary protocol is very complicated; generating bytecode directly would make troubleshooting more difficult, though it's technically possible. We have an IR abstraction, so it's possible to change it to generate bytecode.


The danger with generating source code is that you now depend on source rules which may have changed, never applied to your source language, or are made more complicated by class loading refs. I’d definitely go for byte code generation, and possibly using constant dynamic to handle any really tricky constants, over source code any day.


I would go for `MethodHandle`s and lambda metafactory over bytecode where possible. Much easier to avoid and debug bugs in the generation (better errors than those provided by Hotspot's verifier), and avoids needing to worry about constant dynamic at all. Of course, once you need to implement more than one method, you can no longer just use lambda metafactory.
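
A minimal sketch of the lambda-metafactory route described above: turning a `MethodHandle` for `String::length` into a `java.util.function.Function`, with no bytecode generation of our own (`MetafactoryDemo` is a hypothetical demo class):

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

// LambdaMetafactory spins up the interface implementation for us; all we
// supply is a MethodHandle pointing at the real logic.
public class MetafactoryDemo {
    @SuppressWarnings("unchecked")
    static Function<String, Integer> stringLength() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle impl = lookup.findVirtual(
                String.class, "length", MethodType.methodType(int.class));
        CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "apply",                                           // SAM method name
                MethodType.methodType(Function.class),             // factory signature
                MethodType.methodType(Object.class, Object.class), // erased SAM type
                impl,                                              // actual implementation
                MethodType.methodType(Integer.class, String.class)); // specialized type
        return (Function<String, Integer>) site.getTarget().invoke();
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(stringLength().apply("fury")); // prints 4
    }
}
```

The limitation mentioned above shows here too: the factory hands back exactly one functional-interface method, so anything needing several generated methods outgrows this approach.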

A while back I actually implemented some utilities for generating classes at runtime and "linking" constants into them using constant dynamic. One day I might clean it up and release it as a library.


Good point, I used `lambda metafactory` too in fury, but only for some `constructor/getters`. I implemented some Function utils to generate lambda functions, see https://github.com/alipay/fury/blob/main/java/fury-core/src/...

I'm not sure whether `MethodHandle` can express the most complicated code, since the serialization logic here is more complicated than even hand-written code.


Looking forward to your library


We used https://github.com/janino-compiler/janino to compile the generated code at runtime. It's stable and is the compiler used by spark/flink.

Janino generates the bytecode for fury's generated Java code.

I must agree that generating bytecode directly has its advantages: the abstraction is more low-level and thus more flexible, though more complicated to develop against.


Interesting, I work in the same company but have never heard of it


haha, maybe it's time to use it now. It's used widely in distributed systems at our company. I posted several blogs on our internal ata bbs; you can search for them


OOI, in a 'modern' app, what would you be using this kind of serialisation and deserialisation for? Some kind of complex RPC?


There are 3 kinds of serialization-bound scenarios: 1. RPC for large microservice applications, which are latency & throughput sensitive. The domain model can be huge, and serialization becomes the bottleneck. Some domain object graphs contain thousands of objects; serializing them with the JDK can take seconds, and the serialized result may be 50~100kb. Take Order as an example: imagine you need to process an order with 500 items.

2. Data transfer in bigdata distributed systems: bigdata systems handle a lot of data, which needs to be transferred between workers, so serialization can be the bottleneck there too. Spark RDD and Flink DataStream both have this bottleneck. They use binary formats such as arrow/tungsten to reduce serialization overhead, but those are limited and SQL-oriented, and can't express complex logic such as graph/event/domain models.

3. Task scheduling: imagine you have an MPP distributed system and need to schedule thousands of tasks in a process every sub-second. Serialization will be the bottleneck there too.


Yes and: Kryo was motivated by building multiplayer games.


Yes, games are another scenario; they're very latency sensitive, and Fury is very suitable there too. Actually the Java implementation has been adopted by some game developers. And there has always been demand within the community for Fury to support C#: https://github.com/alipay/fury/issues/686 . I don't have experience with C#, so it's not supported yet. I hope C# can be supported with the help of the community


Given it's a binary serialization framework, it should not be too difficult, because the domain is well-explored and numerous libraries exist in C# which address the same goals that Fury does.

More popular/newer examples are https://github.com/Cysharp/MemoryPack (which is similar to Fury with its own spec, C#-code first schema), https://github.com/MessagePack-CSharp/MessagePack-CSharp or even gRPC / Protobuf tooling https://github.com/grpc/grpc-dotnet


Glad to see those great libraries. Can't wait to make fury support C#. One of fury's biggest differences is that it supports circular/shared references and polymorphism, which sets it apart from many other libraries.


Interesting. I will be curious to see how it stacks up against Jackson Smile, both in terms of features and performance. Since it appears to support both polymorphism and back-references, it looks like it already has some advanced features.


Yeah, you are right. Fury supports polymorphism and circular/shared references, which are not supported by json/jackson. I did a benchmark against jackson before; here are the results: 1) Fury is 41.6x faster than jackson for Struct serialization 2) Fury is 65.6x faster than jackson for Struct deserialization 3) Fury is 9.4x faster than jackson for MediaContent serialization 4) Fury is 9.6x faster than jackson for MediaContent deserialization

see https://github.com/chaokunyang/fury-benchmarks for detailed benchmark code.


Another issue with json is that it's not storage efficient. For a float 0.3333333333333335, json needs 18 bytes, but fury needs only 4 bytes.


Two questions:

- Does Fury support versioning?

- What is source of truth for the schema when dealing with multiple languages and the same data?


Good questions. 1) Fury supports versioning. 2) The schema is included in the data and is used for deserializing/validating it. The schema is sent to the peer once per TCP connection.


No comparisons to SBE in the benchmarks?


SBE needs an XML definition for the Java bean and generates code from it. It would take a lot of work to implement multi-layer nested object serialization, so I haven't added it for now.


It should still be added. We are talking a lot about performance, and this is where SBE shines. Worth comparing if you want to show off the numbers.


No, I just mean it's not easy to use. With fury, you need one line of code for serialization: `fury.serialize(javaObject)`. With SBE, it takes much more coding: defining the schema XML, compiling code, and so on.


It's as much work as protobuf. How about Chronicle Wire, can you add that?


I wonder how this compares with MicroStream, which looked to have quite slick serialisation capabilities


I benchmarked against Microstream using JMH and the jvm-serializers data. Here are the results: 1) Fury is 43x faster in speed. 2) Fury's serialized binary size is 5x smaller.

Here is the code for reproduction: https://github.com/chaokunyang/fury-benchmarks#fury-vs-micro...


Oh nice!! I will certainly try this out. Much of my work these days is Java (Kotlin), C# and Python so a good serialization framework is always helpful.


How does Fury compare to static schema serialization frameworks that aren’t Java focused? For example, flat buffers, alkahest, protobuf etc.


Compared with protobuf, fury is 3.2x faster; compared with avro, 5.3x faster; compared with flatbuffers, 4.8x faster. See https://github.com/eishay/jvm-serializers/wiki for detailed benchmark data


* on the JVM

Right? Flatbuffers supposedly being slower than protobuf is suspect, since most benchmarks I've seen for it in C++/Rust show it outperforming, especially since it does zero-copy deserialization too.


Yes, on the JVM. I haven't tested it for native languages; in C++/Rust it should be faster since it doesn't compress data and it's zero-copy


can someone help a noob like me understand where i would use something like this? is this a way to get faster json parsing?


Json doesn't support polymorphism or circular/shared references; those scenarios are more suitable for fury. And json is a text protocol, which is inefficient compared to fury's binary protocol. Taking the jackson json lib as an example: 1) Fury is 41.6x faster than jackson for Struct serialization 2) Fury is 65.6x faster than jackson for Struct deserialization 3) Fury is 9.4x faster than jackson for MediaContent serialization 4) Fury is 9.6x faster than jackson for MediaContent deserialization

For HTTP REST APIs, json is more suitable. For RPC/bigdata/game/scheduling, fury may be better


Another issue with the json format is that it bloats the payload. For example, the int `1111122222` takes 10 bytes in json but only 4 bytes in fury; `0.3333333333333333` needs 18 bytes in json but only 4 bytes in fury.
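
The integer half of that size difference is easy to check (`PayloadSize` is a hypothetical demo class; the 4-byte figure assumes a plain fixed-width 32-bit encoding):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// JSON spells out an int digit by digit as text; a binary protocol
// writes the 32-bit value directly.
public class PayloadSize {
    static int jsonBytes(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8).length;
    }

    static int binaryBytes(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array().length;
    }

    public static void main(String[] args) {
        System.out.println(jsonBytes(1111122222));   // 10 bytes of digits
        System.out.println(binaryBytes(1111122222)); // 4 bytes fixed-width
    }
}
```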


Thank you chao!


Great that it's fast - but I'm not sure it would make all that much difference in Python code.


For Python the speedup won't be as huge; we get a 50%~100% speedup for dataclasses. We haven't put much effort into it yet; in future I believe the speedup will be larger.



