• 4 Posts
  • 146 Comments
Joined 3 years ago
Cake day: June 23rd, 2023






  • On September 19, Ruby Central, a nonprofit organization that manages RubyGems.org, a platform for sharing Ruby code and libraries, asserted control over several GitHub repositories for Ruby Gems as well as other critical Ruby open source projects that the rest of the Ruby development community relies on.

    Uhm, so how does this happen? If some people create Ruby gems and host them under their own GitHub accounts, how would Ruby Central suddenly assert control over them?





  • A lot of the time this comes down to user error.

    For example, very similar to your case, I knew someone who enabled CloudTrail and configured it to dump its logs to S3. Guess what? Writing to S3 also creates a CloudTrail event, which gets logged to S3, which CloudTrail logs again, and so on.

    Creating a loop like that can get you massive bills.






  • Many people believe that the ToS was added to make Mozilla legally able to train AIs on the collected data.

    “Don’t attribute to malice what is easily explained by incompetence”

    So yea, Mozilla wrote some terms that were ambiguous and could be interpreted in different ways, and ‘many people believed’ that they did this intentionally and had the worst intentions possible by their interpretation of the new ToS.

    Then Mozilla rewrote that ToS after seeing how people were interpreting the original ToS:
    https://www.theverge.com/news/622080/mozilla-revising-firefox-terms-of-use-data

    And yea, now ‘many people will believe’ that ‘Mozilla revised their decision to do this after the backlash’ - OR, it was never their intention and they just phrased it better after the confusion.

    People just want to get their pitchforks out and start drama at any possible opportunity without evidence of wrongdoing… Mozilla added stupid stuff to the ToS, ok yea fair enough - but if they actually did “steal user data” - this would be very easily detectable with Wireshark or something





  • Also some feedback, a bit more technical, since I was trying to see how it works, more of a suggestion I suppose

    It looks like you’re looping through the documents and asking it for known tags, right? ({str(db.current_library.tags)}.)

    I don’t know if I would do this through a chat completion and a chat response. There are dedicated tools for keyword-like matching, like embeddings. They’re a lot faster and probably way cheaper, since you pay barely anything for embeddings compared to chat tokens.

    So the common way to do something like this in AI would be to use Vectors and embeddings: https://platform.openai.com/docs/guides/embeddings

    So you’d ask for an embedding (a vector) for each of your tags first. Then you ask for an embedding of each document.

    Then you can do a nearest-neighbor search over the tag vectors and see how closely each tag matches the document.
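
A rough sketch of that nearest-neighbor step, assuming the embeddings have already been fetched. The tag names and three-dimensional vectors below are made-up toy values (real embeddings have hundreds or thousands of dimensions), and `cosine_similarity` and `nearest_tags` are hypothetical helper names, not part of any library:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_tags(doc_vec, tag_vecs, top_k=2):
    """Return the top_k (tag, score) pairs, most similar first."""
    scored = [(tag, cosine_similarity(doc_vec, vec)) for tag, vec in tag_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-ins for embeddings returned by an embeddings API (assumption).
tag_vectors = {
    "finance": [0.9, 0.1, 0.0],
    "cooking": [0.0, 0.2, 0.9],
    "travel": [0.1, 0.9, 0.2],
}
document_vector = [0.8, 0.3, 0.1]

matches = nearest_tags(document_vector, tag_vectors)
for tag, score in matches:
    print(f"{tag}: {score:.3f}")
```

The tag vectors only need to be computed once per library, so each new document costs a single embedding request plus some cheap arithmetic.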


  • I haven’t used json(b) in a Spring app, so I can’t say much about that.

    json vs jsonb depends on the use case. Inserting json is faster than inserting jsonb. For reads that search on specific JSON properties, jsonb is faster, because it is stored pre-parsed in a more optimized structure.

    In my experience, I don’t really like doing selects based on JSON properties. If I know I’ll be selecting on a certain property, I usually add an additional column next to the JSON column and insert that property there (at least in C#/.NET with EF). The frameworks don’t have that much support for selecting within JSON: you can do it, but proper columns are a lot better supported natively.
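
A minimal sketch of that "extra column next to the JSON" pattern. It uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs anywhere, and the `orders` table, the `status` property, and `insert_order` are hypothetical names chosen for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# "payload" holds the full JSON document; "status" duplicates the one
# property we expect to filter on, so it can be a plain indexed column.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, status TEXT NOT NULL, payload TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")

def insert_order(conn, payload):
    """Store the JSON document and copy its 'status' into the real column."""
    conn.execute(
        "INSERT INTO orders (status, payload) VALUES (?, ?)",
        (payload["status"], json.dumps(payload)),
    )

insert_order(conn, {"status": "shipped", "customer": "alice", "items": 3})
insert_order(conn, {"status": "pending", "customer": "bob", "items": 1})

# The filter touches only the ordinary column, never the JSON itself.
rows = conn.execute(
    "SELECT payload FROM orders WHERE status = ?", ("shipped",)
).fetchall()
shipped = [json.loads(row[0]) for row in rows]
print(shipped)
```

The trade-off is that the copied column has to be kept in sync whenever the JSON changes, which is easiest if every write goes through one function like the one above.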