Dr Cog

Dr Cog@mander.xyz · 10 months ago

They’ll get the picture in the fine letter so make sure you give them your best one-finger salute

Dr Cog@mander.xyz · 1 year ago

Definitely better than this outdated version. At this point, nobody uses Python 2 unless they want to.

Dr Cog@mander.xyz · 1 year ago

The work is not reproduced in its entirety. Simply using the work in its entirety is not a violation of copyright law, just as reading a book or watching a movie (even if pirated) is not a violation. The reproduction of that work is the violation, and LLMs simply do not store the works in their entirety nor are they capable of reproducing them.

Dr Cog@mander.xyz · 1 year ago

The argument is less that an LLM is a human and more that it is not a copyright violation to use material to train the LLM. By current legal definitions, it is fair use unless the material can be reproduced in its entirety (or at least, in some meaningful way).

Dr Cog@mander.xyz · 1 year ago

It’s only black box because nobody has the time (likely years to decades) to wade through the layers of a finished model to check every node and weight.

This is exactly correct, except you’re also not accounting for the insane amount of computational power that would be necessary to backtrack a single output of a single model. This is why it is a black box. It simply is not possible on a meaningful level.

So if math and computer science isn’t an exact science, what is?

Things that are reproducible with known inputs and outputs, allowing for all components to be studied and explained. As an example from my field: if you damage the dorsolateral prefrontal cortex in a fully grown adult, they will have the impulse control of a three-year-old. We know this because we have observed damage to this area in multiple individuals, and can measure the effects based on the severity of that damage.

In contrast, if you provide the same billion-parameter neural network identical inputs, you will not receive identical outputs.
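
One place this variability can come from is sampling at generation time. The following is a minimal sketch, not any particular model's decoding code: the `logits` values and the helper names (`softmax`, `sample_token`, `greedy_token`) are made up for illustration. It assumes the model draws its next token from a probability distribution rather than always taking the top-scoring one.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, rng, temperature=1.0):
    """Draw one token index at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def greedy_token(logits):
    """Always pick the highest-scoring token index."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical scores for three candidate tokens, held fixed across calls.
logits = [2.0, 1.5, 0.5]
rng = random.Random()  # deliberately unseeded

sampled = [sample_token(logits, rng) for _ in range(200)]
greedy = [greedy_token(logits) for _ in range(200)]

print("distinct sampled tokens:", sorted(set(sampled)))
print("distinct greedy tokens:", sorted(set(greedy)))
```

With the same `logits` every time, the sampled picks vary from call to call while the greedy picks do not. This sketch only covers sampling; other sources of run-to-run variation, such as floating-point operation ordering on parallel hardware, are not shown here.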

Dr Cog@mander.xyz · 1 year ago

Look, I understand why you think this. I thought this too when I was first beginning to learn machine learning and data science. But I’ve now been working with machine learning models, including neural networks, for nearly a decade, and the truth is that it is nearly impossible to track the path of an input to a given output in machine learning models other than regression-based and decision tree-based models.

There is an entire field of data science devoted to explaining how these models arrive at their conclusions. It’s called “explainable AI” or “xAI”, and I’ve published a few papers exploring the utility of its methods. The basic explanation for how they work is that we run hundreds of thousands of different models and then do statistical analysis to estimate why the models arrived at their conclusion. It isn’t an exact science, however.
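
To make that concrete, here is a rough sketch of one perturbation-based approach, permutation importance. It is not necessarily the method used in the papers mentioned above. The `black_box` function is a hypothetical stand-in for an opaque trained model, and the data is synthetic. The idea: shuffle one input feature many times, rerun the model each time, and average how much the error grows.

```python
import random
import statistics

def black_box(row):
    # Hypothetical stand-in for an opaque model; in practice we could not
    # read these coefficients off directly.
    return 3.0 * row[0] + 0.0 * row[1] + 0.5 * row[2]

def mean_squared_error(model, rows, targets):
    return statistics.fmean([(model(r) - t) ** 2 for r, t in zip(rows, targets)])

def permutation_importance(model, rows, targets, n_repeats=200, seed=0):
    """Estimate each feature's importance as the average increase in error
    when that feature's column is randomly shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the dataset with only feature j scrambled.
            perturbed = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, perturbed, targets) - baseline)
        importances.append(statistics.fmean(increases))
    return importances

# Synthetic data: 100 rows, 3 features, targets produced by the model itself.
data_rng = random.Random(42)
rows = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
targets = [black_box(r) for r in rows]

importances = permutation_importance(black_box, rows, targets)
for j, score in enumerate(importances):
    print(f"feature {j}: average error increase {score:.3f}")
```

The scores rank feature 0 well above feature 2, and feature 1 (which the stand-in model ignores) scores zero. Note that the result is a statistical estimate built from many reruns, not a trace of how any single output was computed.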

Dr Cog@mander.xyz · 1 year ago

You really don’t understand how these models work and you should learn about them before you make statements about them.

Machine learning models are, almost by definition, non-deterministic.

Dr Cog@mander.xyz · edit-2 1 year ago

Neither citation nor compensation is necessary for fair use, which is what occurs when an original work is used for its concepts but not reproduced.

Dr Cog@mander.xyz · 1 year ago

I agree. But that isn’t what AI is doing, because it doesn’t store the actual book and it isn’t possible to reproduce any part in a format that is recognizable as the original work.

Dr Cog@mander.xyz · 1 year ago

LOL

We understand less about how LLMs generate a single output than we do about the human brain. You clearly have no experience developing models.

Dr Cog@mander.xyz · 1 year ago

I don’t need to negotiate with Sarah Silverman if I’m handed her book by a friend, and neither should an AI.

Dr Cog@mander.xyz · 1 year ago

Tried to store documents in paint and it ruined the documents, 0/10

I’ll let you know how my file cabinet art show goes

Dr Cog@mander.xyz · 1 year ago

We do not federate with Meta

Dr Cog@mander.xyz · edit-2 1 year ago

That’s irrelevant. The plaintiff bought the FSD package, and his attorney (not prosecutor; I missed that this was a civil suit, not a criminal trial) will likely argue that it introduced confusion on the part of his client. It doesn’t matter that the FSD package wasn’t actually in use if the plaintiff believed it was (or, more importantly, if he believed it could do things that it could not, due to the confusing terminology).

Dr Cog@mander.xyz · 1 year ago

Sigh…

Jonathan Michaels, an attorney for the plaintiffs, in his opening statement at the trial in Riverside, California, said that when the 37-year-old Lee bought Tesla’s “full self-driving capability package” for $6,000 for his Model 3 in 2019, the system was in “beta,” meaning it was not yet ready for release.

RTFA

Dr Cog@mander.xyz · 1 year ago

The way it is marketed is not in line with its functionality. I expect the prosecution will claim the term “Full Self Driving” is confusing to consumers.

Dr Cog@mander.xyz · 1 year ago

The concept isn’t, I agree. But it isn’t a useful idea, either. There really doesn’t appear to be any benefit to using NFTs in any meaningful application, or at least nobody has pitched one that isn’t either a grift or a way to appear “trendy” by reinventing the wheel.

Dr Cog@mander.xyz · 1 year ago

No, they wouldn’t have. Because owning a link to a thing doesn’t mean anything, no matter what that thing is. They were only valuable because people didn’t understand NFTs and wanted to get rich quick.

Dr Cog@mander.xyz · 1 year ago

What exactly would this bot do as far as testing?

Dr Cog@mander.xyz · 1 year ago

I’m the director of technology for a neurology lab, where we collect patient health record data from a variety of disparate machines and modalities (e.g., MRI, EEG, physical functioning, and retinal scans). We’ve been using the open-source database software REDCap (basically a wrapper for MySQL that enables easy GUI-based data entry), but we are reaching the limits of what it can handle and need something that can scale with our growing database.

I have little experience in database management myself, but I am a competent programmer and feel comfortable learning whatever is needed (famous last words, I know).