• 0 Posts
  • 92 Comments
Joined 1 year ago
Cake day: August 27th, 2023


  • C is dangerous like your uncle who drinks and smokes. Y’wanna make a weedwhacker-powered skateboard? Bitchin’! Nail that fucker on there good, she’ll be right. Get a bunch of C folks together and they’ll avoid all the stupid easy ways to kill somebody, in service to building something properly dangerous. They’ll raise the stakes from “accident” to “disaster.” Whether or not it works, it’s gonna blow people away.

    C++ is dangerous like a quiet librarian who knows exactly which forbidden tomes you’re looking for. He and his… associates… will gladly share all the dark magic you know how to ask about. They’ll assure you, oh no no no, the power cosmic would never turn someone inside-out, without sufficient warning. They don’t question why a loving god would allow the powers you crave. They will show you which runes to carve, and then, they will hand you the knife.


  • I have to admit - my initial outrage over Copilot training on open-source code has vanished.

    Now that these networks are trained on literally anything they can grab, including extremely copyrighted movies… we’ve seen that they’re either thoroughly transformative soup, or else the worst compression and search tools you’ve ever seen. There’s not really a middle ground. The image models where people have teased out lookalike frames for Dune or whatever aren’t good at much else. The language models that try to answer questions as more than dream-sequence autocomplete poetry will confidently regurgitate dangerous nonsense because they’re immune to sarcasm.

    The comparisons to a human learning from code by reading it are half-right. There are systems that discern relevant information without copying specific examples. They’re just utterly terrible at applying that information. Frankly, so are the ones copying specific examples. Once again, we’ve advanced the state of “AI,” and the A went a lot further than the I.

    And I cannot get offended on Warner Brothers’ behalf if a bunch of their DVDs were sluiced into a model that can draw Superman. I don’t even care when people copy their movies wholesale. Extracting the essence of an iconic character from those movies is obviously a transformative use. If some program will emit “slow motion zoom on Superman slapping Elon Musk,” just from typing that, that’s cool as hell and I refuse to pretend otherwise. It’s far more interesting than whatever legal fictions both criminalized 1700s bootlegging and encouraged Walt Disney’s corpse to keep drawing.

    So consider the inverse:

    Someone trains a Copilot clone on a dataset including the leaked Windows source code.

    Do you expect these corporations to suddenly claim their thing is being infringed upon, in front of any judge with two working eyes?

    More importantly - do you think that stupid robot would be any help whatsoever to Wine developers? I don’t. These networks are good at patterns, not specifics. “Good” is being generous. If I wanted that illicit network to shamelessly clone Windows code, I expect the brace style would definitely match, the syntax might parse, and the actual program would do approximately dick.

    Neural networks feel like magic when hideously complex inputs have sparse approximate outputs. A zillion images could satisfy the request, “draw a cube.” Deep networks given a thousand human examples will discern some abstract concept of cube-ness… and also the fact you handed those thousand humans a blue pen. It’s simply not a good match for coding. Software development is largely about hideously complex outputs that satisfy sparse inputs in a very specific way. One line, one character, can screw things up in ways that feel incomprehensible. People have sneered about automation taking over coding since the punched-tape era, and there are damn good reasons it keeps taking their jobs instead of ours. We’re not doing it on purpose. We’re always trying to make our work take less work. We simply do not know how to tell the machine to do what we do with machines. And apparently - neither do the machines.






  • ‘This markup language isn’t even as capable as Habbo Hotel, but it counts anyway because I just called it a programming language.’

    There is a literal hierarchy of syntaxes, the Chomsky hierarchy, and each tier is recognized by a different category of machine. Programs require a Turing machine. Anything lesser - in a lower tier like pushdown automata or finite-state machines - doesn’t need a proper computer. So it’s not a program.
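For anyone who hasn't met that hierarchy, here is a rough Python sketch of the two lesser tiers. The function names and the toy languages are made up for illustration, not taken from anything above. The first recognizer only ever remembers which of two states it is in. The second needs a stack, which is exactly the extra machinery a pushdown automaton has and a finite-state machine lacks.

```python
def fsm_accepts_even_as(text: str) -> bool:
    """Finite-state machine: accepts strings of 'a' with even length.

    The only memory is the current state (0 or 1).
    """
    state = 0
    for ch in text:
        if ch != "a":
            return False
        state = 1 - state  # flip between the two states
    return state == 0


def pda_accepts_balanced(text: str) -> bool:
    """Pushdown-style recognizer: accepts balanced () and [] nesting.

    The stack holds every opener that has not been closed yet.
    """
    closers = {")": "(", "]": "["}
    stack = []
    for ch in text:
        if ch in ("(", "["):
            stack.append(ch)
        elif ch in closers:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != closers[ch]:
                return False
        else:
            return False
    return not stack
```

No fixed number of states can track arbitrarily deep nesting, so the bracket language is out of reach for the first kind of machine. Neither recognizer can loop forever or compute anything open-ended, which is the gap between these tiers and a Turing machine.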




  • Years ago I found myself explaining to Chinese Room dinguses - in a neural network, the part that does stuff is not the part written by humans.

    I’m not sure it’s meaningful to say this sort of AI has source. You can have open data sets. (Or rather you can be open about your data sets. I don’t give a shit if LLMs list a bunch of commercial book ISBNs.) But rebuilding a network isn’t exactly a matter of hitting “compile” and going out for coffee. It can take months, and the power output of a small city… and it still can’t be exact. There’s so much randomness involved in the process that it’d be iffy whether you get the same weights twice, even if you built everything around that goal.

    Saying “here’s the binary, do whatever” is honestly a lot better for neural networks than for code, because it’s not like the people who made it know how it works either.
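One concrete reason the rebuild "still can't be exact," sketched in Python. This assumes the training hardware does not guarantee the order in which numbers get added up, which is an assumption about parallel training in general and not a claim about any particular framework. Floating-point addition is not associative, so the same values summed in a different order can give a different total. The values and the `accumulate` helper are hypothetical.

```python
def accumulate(values):
    """Add floats strictly left to right, like a serial reduction."""
    total = 0.0
    for v in values:
        total += v
    return total


# Hypothetical "gradient" contributions: two huge values that cancel,
# plus one small value that can get rounded away.
grads = [1e16, 1.0, -1e16]

order_a = accumulate(grads)                           # 0.0: the 1.0 is lost
order_b = accumulate([grads[0], grads[2], grads[1]])  # 1.0: the 1.0 survives

print(order_a, order_b)
```

Same three numbers, two different sums. Scale that up to billions of additions per step, and tiny differences get fed back into every later step, on top of whatever randomness comes from initialization and data shuffling.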