• 1 Post
  • 19 Comments
Joined 1 year ago
Cake day: July 3rd, 2023

  • Oh yeah, it’s actually pretty extensive and expressive. If you’re interested in this sort of stuff, it’s worth checking out the IR language reference a bit. Apparently you can even specify the garbage collection strategy on a per-function basis if you want to. They do, however, note the following: “Note that LLVM itself does not contain a garbage collector, this functionality is restricted to generating machine code which can interoperate with a collector provided externally” (source: https://llvm.org/docs/LangRef.html#garbage-collector-strategy-names )

    If you’re interested in this stuff, it’s definitely fun to work through part of that language reference document. It’s pretty approachable. After going through the first few chapters I had some fun writing some IR by hand for some toy programs.
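    For illustration, the per-function GC selection looks roughly like this in the IR (a minimal sketch; the function names are made up, but the strategy names are built-in ones listed in the language reference):

```llvm
; each function can name its own GC strategy via the "gc" attribute
define i32 @managed_fn() gc "shadow-stack" {
entry:
  ret i32 0
}

; another function in the same module can use a different strategy (or none)
define i32 @other_fn() gc "statepoint-example" {
entry:
  ret i32 0
}
```

    The attribute only tags the function for code generation; per the quote above, the actual collector still has to be supplied externally.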


  • I know they are used in Google’s Bigtable. All data there is stored in separate SSTables, and you can specify that a locality group should have Bloom filters generated for its SSTables. Apparently Cassandra has them too.

    Both are the same general application though, and you already mentioned databases.

    I did think about using them at some point for authentication purposes in a web service. The idea was to check for double uses of a refresh token. This way the user database would need only a small amount of extra storage to check for the reuse of a refresh token, and if you set the parameters accordingly, the false positives are kind of a benefit: users cannot refresh indefinitely and actually have to reauthenticate sometimes.

    Edit to add: I also read a paper recently that uses a data structure called a collage, closely related to Bloom filters, to perform in-network calculations in a sensor network. If I understand correctly, the basic idea is that every node in the network adds a bit to the data structure while it is in transit, so data from the entire network is aggregated. The result can then be fed to an ML classifier. (Source: Oostvogels, J., Michiels, S., & Hughes, D. (2022). One-Take: Gathering Distributed Sensor Data Through Dominant Symbols for Fast Classification.)
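    The refresh-token idea above could be sketched roughly like this (a minimal, hypothetical Bloom filter in Python; all names and parameters are made up for illustration):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hashed positions over a fixed-size bit array."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0  # bit array packed into a single int

    def _positions(self, item: str):
        # derive k independent positions by salting the hash input
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means "definitely never added"; True may be a false positive
        return all(self.bits >> pos & 1 for pos in self._positions(item))

# mark a refresh token as used; a later positive (even a false one)
# just forces the user to reauthenticate
used_tokens = BloomFilter()
used_tokens.add("refresh-token-abc")
print(used_tokens.might_contain("refresh-token-abc"))  # True
print(used_tokens.might_contain("refresh-token-xyz"))  # almost certainly False
```

    The nice property is that a false positive only ever forces an extra login, never a security hole, which is why the parameters can be tuned fairly loosely here.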


  • It does create a MITM vulnerability; the question is just whether it matters. With HTTPS, a third party can only see which host you’re connecting to, not the full URL. With HTTP, they can see exactly what data is transferred and can modify that data at will.

    So adding HTTPS here accomplishes:

    • hiding which exact page of the hacker’s dictionary you’re accessing
    • hiding the exact contents of the page
    • ensuring that this page doesn’t get modified in transit

    None of these are really an issue here, so using HTTP in this situation is fine. In general though, I’d consider not having HTTPS a bug for most sites, unless you’re extremely resource-constrained on either side of the connection and you’ve thought carefully about the security and privacy implications.